
std::simd: How to Express Inherent Parallelism Efficiently Via Data-parallel Types - Matthias Kretz 

CppCon · 152K subscribers
17K views

Published: 27 Aug 2024

Comments: 16
@__hannibaalbarca__ 8 months ago
Std::SIMD finally.
@user-lm1ew4ep7k 8 months ago
yeah...
@scion911 8 months ago
I absolutely love this; for my current project/library it's a game changer for portability.
@Roibarkan 8 months ago
28:17 [slides 31-32] I think "old school C developers" would define Pixel as a union of a single uint32_t and a struct of 4 uint8_t, and use that union to simplify the reading/writing code. Such approaches are undefined in C++ (they break strict aliasing rules, I believe). I'm not sure that C-style state of mind should guide us when designing how C++ should do it. Perhaps we should allow std::simd for T's that are aggregates of same-type "vectorizable" member variables? Perhaps this is a generalization that could implicitly allow simd, mentioned at 48:22. Great talk, thanks Matthias!
@redram4574 3 months ago
very useful video
@dat_21 8 months ago
It's a cool concept, but in practice it will mean even more spoon-feeding the compiler to get the code you want.
@eclipse4419 8 months ago
Awesome!!
@Roibarkan 8 months ago
Great talk! It seems that exploiting ILP when using simd can be very beneficial. Will library/compiler vendors be allowed to "do it for us" - e.g. is the default size() of std::simd strictly mandated by the hardware, or will specific compiler/library vendors be allowed to choose a larger size() (perhaps based on compiler flags) to exploit ILP? Perhaps the ABI tag that was mentioned can support such desires.
@blacklion79 7 months ago
Intel's left hand: push SIMD into every language it can, including many mask-defined operations. Intel's right hand: won't give us ordinary people AVX-512 for 10 years.
@PaulJurczak 8 months ago
@4:00 I'm curious why fake_modify/fake_read instead of passing the initial x value as a parameter and returning the result.
@cranil 7 months ago
Because the compiler might simply remove the loop if you don't use the result later. And fake_modify comes first, I think, to keep the compiler from precomputing the result at compile time.
@Alexander_Sannikov 8 months ago
If you actually care about the performance of your data-parallel code, your PC has a special, massively powerful hardware component that's specifically designed to maximize the throughput of exactly this kind of task. It's called a GPU.
@MrHaggyy 8 months ago
Only a few systems that have SIMD also have a graphics processor. And if they have one, it's only as much as you need for graphics. Servers, industrial machines, cars, home and kitchen appliances, etc.
@ckjdinnj 8 months ago
Sending data to the GPU and reading back a result is also pretty slow, so for algorithms that rely on recursion or dynamic programming the GPU doesn't make for a great resource.
@Antagon666 1 month ago
Amdahl's law. GPU processing is only ever worth it when the compute time greatly outweighs the serial time (in this case the atrocious PCIe transfer times).