Kernel Grid | GPU Programming | Episode 2 

Simon Oz
4.2K subscribers
2.7K views

Published: Oct 20, 2024

Comments: 5
@dimanft6160 3 months ago
How does this have only 165 views, it's so good
@vastabyss6496 2 months ago
ikr! Even 3 weeks later, it's not even at 1k :(
@gowiththeflo59 1 month ago
This is a great series, thank you!
@bhavindhedhi 1 month ago
equations at 2:24 are incorrect
@Stefan-td1pw 3 months ago
Hi, I've been watching these videos in addition to reading Programming Massively Parallel Processors. My take on the exercise (for the sake of brevity, I will not include assigning memory or memcpy for now):

```c
// Kernel Function for Array Summing
__global__ void sumArrays_Kernel(float *A, float *B, float *C, float *D,
                                 int Width, int Height, int Depth) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;

    if (x < Width && y < Height && z < Depth) {
        // Defined as index as used twice in next line
        int index = x + y * Width + z * Width * Height;
        D[index] = A[index] + B[y * Width + x] + C[x];
    }
}

void sumArrays_Host(float *A, float *B, float *C, float *D, int X, int Y, int Z) {
    float *A_d, *B_d, *C_d, *D_d;
    // Malloc and Memcpy vars (i.e. A -> A_d)

    dim3 block(2, 2, 2); // I'm not massively sure on good sizing here
    dim3 grid((X + block.x - 1) / block.x,
              (Y + block.y - 1) / block.y,
              (Z + block.z - 1) / block.z);

    sumArrays_Kernel<<<grid, block>>>(A_d, B_d, C_d, D_d, X, Y, Z);
    // memcpy result back, and then free memory
}
```

The general idea is that each input array gets its own index, based on the logic you were mentioning earlier. The grid sizing rounds up so that every element is covered, and the bounds check in the kernel makes sure the extra threads stay in bounds.
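To check the indexing in the comment above without a GPU, a plain C reference can help. The following is only a sketch: it assumes the exercise is D[z][y][x] = A[z][y][x] + B[y][x] + C[x] with row-major storage, as the comment's kernel implies, and the names `ceil_div` and `sum_arrays_reference` are hypothetical helpers, not part of the video or the comment.

```c
#include <assert.h>

/* Ceiling division: how many blocks of size `block` are needed to cover
   `n` elements. This is the same (n + block - 1) / block expression used
   for each grid dimension in the comment above. */
int ceil_div(int n, int block) {
    return (n + block - 1) / block;
}

/* CPU reference (hypothetical helper), assuming row-major layout:
   A and D are depth x height x width, B is height x width, C has width
   elements. Each loop iteration does the work of one in-bounds GPU thread. */
void sum_arrays_reference(const float *A, const float *B, const float *C,
                          float *D, int width, int height, int depth) {
    for (int z = 0; z < depth; ++z) {
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                int index = x + y * width + z * width * height;
                D[index] = A[index] + B[y * width + x] + C[x];
            }
        }
    }
}
```

Comparing the kernel's output against `sum_arrays_reference` element by element is one way to confirm the three index expressions agree. `ceil_div(5, 2)` gives 3, so a dimension of 5 with a block size of 2 launches 6 threads along that axis, and the bounds check discards the extra one.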