Nowadays, compile-time branch prediction (with profiling) is usually better than 95% accurate, and run-time branch prediction (one cycle ahead) is about 97%, according to some manufacturers. So branching code actually suffers from this technique, as it makes the code larger, resulting in both a higher I-cache miss rate and pressure on the instruction bandwidth. It is no coincidence that the only surviving major architectures are those without delayed branches.
Perfect branch prediction recognizes that only about 2-3% of branches actually misbehave, though we are only able to tell which direction a branch will go well ahead of time about 90% of the time.
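Run-time predictors in the accuracy range mentioned above are commonly built from saturating counters; here is a minimal sketch of a classic 2-bit predictor (purely illustrative, not any particular manufacturer's design):

```python
# Minimal 2-bit saturating-counter branch predictor (illustrative sketch).
# States 0-1 predict not-taken, 2-3 predict taken; each outcome nudges
# the counter one step, so a single anomalous outcome doesn't flip the
# prediction immediately.

def simulate(outcomes, state=2):
    """Return the fraction of branch outcomes predicted correctly."""
    correct = 0
    for taken in outcomes:
        prediction = state >= 2  # upper two states predict "taken"
        if prediction == taken:
            correct += 1
        # move toward the observed outcome, saturating at 0 and 3
        state = min(3, state + 1) if taken else max(0, state - 1)
    return correct / len(outcomes)

# A loop-closing branch taken 9 times, then not taken once, repeated:
trace = ([True] * 9 + [False]) * 100
print(f"accuracy: {simulate(trace):.0%}")  # → accuracy: 90%
```

The single mispredict per loop exit is exactly the kind of residual error that keeps simple run-time predictors below the compile-time-with-profiling figure.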
Also, the fast-algorithm picture (as given in the Hennessy & Patterson textbook) seems a bit erroneous to me. When we are adding Multiplicand.Multiplier[0] + Multiplicand.Multiplier[1], wouldn't we need to shift Multiplicand.Multiplier[1] one bit to the left for addition alignment? The fast multiplier implementation given in the Carl Hamacher textbook explains it properly (where the entire hardware is given in detail). Actually, Multiplicand.Multiplier[0] gives a 32-bit number, the LSB of which forms Product[0], and the rest of the bits of Multiplicand.Multiplier[0], i.e. bits 1 to 31, are fed to the first-level adder, such that Multiplicand.Multiplier[0][1] is aligned with Multiplicand.Multiplier[1][0]. Please correct me if I am wrong.
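The alignment being described can be checked in software. A small sketch (my own notation, not the textbook's figure) that forms each partial product Multiplicand.Multiplier[i] and shifts it i bits left before summing:

```python
# Shift-and-add multiplication, making the partial-product alignment
# explicit: the partial product for multiplier bit i must be shifted
# i bits left before being summed - the alignment the comment argues
# the figure omits.

def multiply(multiplicand, multiplier, bits=32):
    product = 0
    for i in range(bits):
        if (multiplier >> i) & 1:
            # partial product Multiplicand.Multiplier[i], aligned by i
            product += multiplicand << i
    return product

print(multiply(6, 7))  # → 42
```

A tree of adders in hardware performs the same sums in parallel, but each input to the first-level adder must carry this same one-bit offset, consistent with the point made above.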
I do not think that the previous algorithm works for signed numbers (I mean negative numbers) in all cases. I guess it works for signed numbers only when the multiplier is positive; the multiplicand can be positive or negative. But while right-shifting the partial product formed at each step, we need to do an arithmetic right shift (instead of a simple logical right shift). Please correct me if I am wrong.
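That restriction can be demonstrated with a sketch of the sequential right-shift multiplier under those assumptions (non-negative multiplier, signed multiplicand, arithmetic shifts); the register layout and names here are my own, not a textbook's:

```python
# Sequential shift-right multiplication sketch: an accumulator holds
# the upper half of the product and the pair (acc, low) is shifted
# right one bit per step. With a negative multiplicand the shift must
# be arithmetic (sign-extending), as the comment says; a negative
# multiplier still needs extra handling (e.g. Booth recoding).

def mul_shift_right(multiplicand, multiplier, n=8):
    """Multiply a signed n-bit multiplicand by a non-negative
    n-bit multiplier using arithmetic right shifts."""
    assert 0 <= multiplier < (1 << n), "negative multiplier unsupported"
    acc = 0          # upper half of the product, kept signed
    low = multiplier  # lower half; its LSB selects each add
    for _ in range(n):
        if low & 1:
            acc += multiplicand
        # arithmetic right shift of the combined (acc, low) pair:
        low = ((acc & 1) << (n - 1)) | (low >> 1)
        acc >>= 1  # Python's >> on ints sign-extends
    return (acc << n) | low

print(mul_shift_right(-3, 5))  # → -15
```

Replacing `acc >>= 1` with a logical shift would destroy the sign bits of a negative accumulator, which is exactly the failure mode described above.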
If you have a network like this, the currents will not simply be added. In fact, you have a complex current divider with multiple voltage sources. The output current is the result of the superposition of all the current dividers, which depends on the resistor values, and it gets more and more complex as the input vector grows. Furthermore, the memristors change their values when a voltage is applied. How is it possible to get a consistent result?
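The superposition point can be illustrated with ordinary nodal analysis. A sketch with ideal resistors and arbitrarily chosen values, two source branches meeting at one output node with a load to ground:

```python
# Two voltage sources drive one output node through branch resistors,
# with a load RL to ground. Nodal analysis (equivalently, superposing
# the current dividers) gives
#   Vout = (V1*G1 + V2*G2) / (G1 + G2 + GL),   where G = 1/R,
# so each input's contribution depends on *all* the conductances, not
# just its own branch - the branch currents do not simply add into the
# load unless the output node is held at a virtual ground (which is
# why crossbar read-outs typically use a transimpedance stage).

def vout(branches, RL):
    """branches: list of (V, R) pairs meeting at the output node."""
    g_total = 1.0 / RL + sum(1.0 / R for _, R in branches)
    return sum(V / R for V, R in branches) / g_total

# Changing only R2 alters branch 1's effective contribution as well:
print(vout([(1.0, 1e3), (1.0, 2e3)], 1e3))
print(vout([(1.0, 1e3), (1.0, 4e3)], 1e3))
```

The two printed values differ even though branch 1 is untouched, which is the interaction the question is getting at.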
I came across this channel in 2015; I was pursuing my master's. I was impressed by your way of teaching and articulate explanations. Since then, I have been revisiting this channel whenever I need to recall Computer Organization concepts.
What happens if you have a heterogeneous multicore system with one processing element (say a DSP, for example) that has caches but doesn't participate in the MSI/MESI/MOESI protocol? I presume a read request will cause it to get the correct data from either main memory or another cache. But what if it wishes to modify that location and has, say, a write-back cache? How do the other processing elements know it's been modified? Must the DSP do something in software to alert the other PEs? Excellent tutorial - the best I've seen, in fact!
I have to say I love your British accent with a hint of Indian as well. It's a lot less thick than that of most Indian computer science teachers you can find, and a lot easier to understand.
Prof. Rajeev, thanks for the video and your explanation. I have one question: if a situation arises where, say, both processor P1 and P2 want to write to their copies of x at exactly the same time and issue an upgrade request simultaneously, how is this resolved? Thanks.