Arm vs RISC-V? Which One Is The Most Efficient?

130K views

Arm has been making power efficient processors for decades. RISC-V is relatively new and many parts of its specifications aren't even ratified, but that hasn't stopped chip designers from making RISC-V processors, including microcontrollers. Can RISC-V challenge Arm's power efficiency supremacy?
---
Let Me Explain T-shirt: teespring.com/...
Twitter: / garyexplains
Instagram: / garyexplains
#garyexplains

Published:

29 Sep 2024

Comments: 438

@Matthigast 2 years ago

2010 does indeed seem 23 years ago

@GaryExplains 2 years ago

🤦‍♂️😜 Darn. That was a stupid mistake! But I think you are right, it feels like soooo long ago!

@kurakuson 2 years ago

Apple's first iPad: April 2010

@ZDevelopers 2 years ago

@GaryExplains Future proofing the video, that's all

@TechPill_ 2 years ago

@ZDevelopers Yea that's what I was going to say

@NovasVilla 2 years ago

It's so sad to hear that 😢

@daniahmed 2 years ago

Gary, there was an article that I read about a week ago saying that Apple may be shifting away from ARM to RISC-V. Do you think Apple will switch to RISC-V or continue with ARM for the time being?

@GaryExplains 2 years ago

If we read the same article, it says that Apple is using RISC-V for some of its small co-processors, that is all. It is a good engineering choice: if it has to design bespoke hardware blocks then RISC-V is a workable solution.

@daniahmed 2 years ago

@GaryExplains Maybe, but that article had some text about a move to RISC-V that Apple might be considering. Moving to RISC-V would benefit Apple in the long term, as they wouldn't have to keep paying ARM royalties or whatever deal they have with ARM. What's your take on this?

@GaryExplains 2 years ago

No, that part was just pure speculation because otherwise it would be a boring article and no one would read it.

@daniahmed 2 years ago

@GaryExplains OK, thanks for clarifying.

@TheShorterboy 1 year ago

The difference may be down to the compiler; you would need to check the generated assembly with gcc -S

@Henrix1998 2 years ago

Blackpill costs 20-30€ here... Sad times

@oisnowy5368 2 years ago

If you leave out everything else and just concentrate on one thing, it's possible to find some little victory somewhere. Efficient is such a vague term. RISC-V is nice, great that some people make it and I hope they get further. But my first ARM chip was an ARM2, no intention of switching camps and I see no reason to. These ARM chips were made to be low power, yet still have some performance. They do what they have been designed to do. That is also a form of efficiency; nothing wasted on doing what they shouldn't.

@rajivpalayan7028 1 year ago

For measuring relative performance, it is wrong to do a per-MHz calculation. The only metric that should matter is the total time needed to run the same application on both processors. A more complicated ISA means clock speeds will be reduced (which gives better per-MHz performance), but that does not mean the processor is faster.
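
A minimal sketch of that distinction, assuming two entirely hypothetical chips (the cycle counts and clock speeds below are made up for illustration, not measurements from the video). It shows how a chip can win on a per-MHz score and still take longer to finish the same job:

```python
def runtime_seconds(cycles_needed: float, clock_mhz: float) -> float:
    """Wall-clock time to finish a workload that needs `cycles_needed` cycles."""
    return cycles_needed / (clock_mhz * 1_000_000)


def work_per_mhz(cycles_needed: float) -> float:
    """Clock-independent score: workloads completed per million clock cycles."""
    return 1_000_000 / cycles_needed


# Hypothetical chips: A needs fewer cycles but has a lower clock,
# B needs more cycles but runs at a higher clock.
chip_a = {"cycles": 80_000_000, "clock_mhz": 100}
chip_b = {"cycles": 100_000_000, "clock_mhz": 160}

score_a = work_per_mhz(chip_a["cycles"])
score_b = work_per_mhz(chip_b["cycles"])
time_a = runtime_seconds(chip_a["cycles"], chip_a["clock_mhz"])
time_b = runtime_seconds(chip_b["cycles"], chip_b["clock_mhz"])

print(f"A: {score_a:.4f} per MHz, {time_a:.3f} s total")
print(f"B: {score_b:.4f} per MHz, {time_b:.3f} s total")
```

With these assumed numbers chip A has the better per-MHz score while chip B finishes first, which is the situation the comment warns about.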

@marklewus5468 2 years ago

One more comment. Most processor manufacturers publish a spec called DMIPS/MHz, roughly millions of integer instructions per second per megahertz of clock speed. This allows you to do a clock-for-clock comparison between parts.
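
As a rough sketch of how such a figure is usually derived: a Dhrystone score (iterations per second) is divided by 1757, the score of the VAX 11/780 reference machine, to get DMIPS, and then by the clock in MHz. The example score and clock below are hypothetical, chosen only to make the arithmetic easy to follow:

```python
# Dhrystones per second achieved by the VAX 11/780, the nominal "1 MIPS" reference.
VAX_11_780_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second: float) -> float:
    """Dhrystone MIPS: the score relative to the VAX 11/780 reference."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dhrystones_per_second: float, clock_mhz: float) -> float:
    """Clock-normalised figure that vendors quote as DMIPS/MHz."""
    return dmips(dhrystones_per_second) / clock_mhz


# Hypothetical part: 219,625 Dhrystones per second at a 100 MHz clock.
example_score = 219_625
example_clock_mhz = 100

print(dmips(example_score))
print(dmips_per_mhz(example_score, example_clock_mhz))
```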

@ralfbaechle 1 year ago

Let me wind back the clock to the mid-80s to point you at the horrors of the Dhrystone benchmark, which back then was more or less the canonical benchmark for integer performance. Even in the best case Dhrystone results didn't represent real world performance very well. Dhrystone wasn't only ignoring fp math entirely; its results also got more and more comically absurd as architectures got more sophisticated (caches and out-of-order made a giant difference) and as compilers improved and started to "optimize away" parts of Dhrystone. The peak was reached when certain compilers started to recognize Dhrystone and applied Dhrystone-specific optimizations for almost arbitrary benchmark results - whatever marketing orders ;-) It gives me headaches to see that parts of the industry are still using DMIPS decades after it's been thoroughly proven to be rubbish. (It seems many folks don't know these days - the D in DMIPS stands for Dhrystone).

@marioprawirosudiro7301 1 year ago

@ralfbaechle Thank you for this informative comment. One learns something new every day.

@mnomadvfx 11 months ago

It's an artificial value though and not very helpful for real world performance comparisons. I know this because Qualcomm quoted quite a high DMIPS for their Krait CPU core back during the ARMv7-A generation, and it routinely got thrashed by the lower DMIPS rated Cortex-A9 based SoCs in actual performance.

@ralfbaechle 2 years ago

RISC-V, ARM and others like MIPS, which I have plenty of experience with, are just architectures; the chips you can buy are implementations of these architectures. The first thing to notice from a 30,000 ft perspective is that 64-bit ARM, MIPS and RISC-V are surprisingly similar. In the past CPU architects were more adventurous. These days there's no more bat crazy shit like segments (x86) or register windows (SPARC, and totally batshit crazy on IA-64). MIPS is an early but well designed RISC architecture; 64-bit ARM (which fortunately is rather dissimilar to the 32-bit ARM architecture) is surprisingly similar. Which is unsurprising, because one of the architects used to work for MIPS. And RISC-V was designed by the fathers of SPARC and MIPS. So are they all the same? Not quite, but 64-bit ARM and RISC-V benefit significantly from hindsight. Now, once you take things to the limit, things will be different. RISC-V's smaller footprint allows fitting more cores running at a higher clock rate on a die. It barely matters for the birdseed class of microcontrollers that's polluting most PCBs ;-) So for most uses architecture doesn't matter - software does. That's where ARM is very well supported, MIPS is well established and RISC-V is still catching up. That said, the folks behind RISC-V are smart and have impressed me with what they have achieved and in my discussions with them, so they're going to close that gap. Plus higher-end implementations are going to show up. Being a truly open architecture, however, the RISC-V market can be as confusing as an ant pile - or open source in general ;-)

@markhaus 1 year ago

RISC-V having open specs should make it easier for software to catch up than it was for Arm, no?

@ralfbaechle 1 year ago

@markhaus Yes and no. Availability of documentation greatly simplifies a port to a new architecture, especially when there is already a port to a similar architecture. It still remains a major undertaking in terms of man-hours required. Been there, done that. Three times 🙂 As far as software development is concerned, x86, MIPS, RISC-V and ARM are open enough to allow development of decent software. IA-64 was special in that its performance characteristics are ... complex. Without an NDA, or possibly even without being Intel, it was not possible to develop certain software, including high-end compilers. The level beyond that is licensing of the architecture itself, so that anybody who wants to can develop a core. RISC-V didn't really innovate there; there have been other such public domain or similarly unrestricted architectures before. But they were the first polished architecture with academic and industry acceptance, documentation and very liberal licensing on top. It's this mix (and probably a few more things on top) which made the rise of RISC-V possible.

@conorstewart2214 1 year ago

It's not just about the people who designed the RISC-V ISA, it is also about the people who implement it; they are the ones that actually determine the efficiency, clock speeds and similar.

@ralfbaechle 1 year ago

@markhaus Yes and no. Programming specs are open for most architectures, though the degree of detail varies. To take Intel as an example, Intel CPUs accept only a signed blob as microcode, and the microcode programming interface is not even documented. Probably few users care. More painful was Intel's attitude towards protecting IA-64. The Merced documentation covers four thick books printed on thin paper and is also available for download (1-click sign-away-your-soul acceptance of terms required ;-) but errata required an NDA, and certain very deep secrets were only available under the terms of a much stricter NDA - the most restrictive I've ever seen. Finally, some further aspects, such as deep details on performance aspects of the pipeline which are essential for the implementation of top-notch code generators and compilers, were not available outside of Intel at all. I don't want to single out Intel; I just picked them as an illustrative example. Companies vary in their degree of paranoia, protectiveness and openness, and corporate history and experiences are part of that. RISC-V may be an open architecture. That means the architecture is open. It does not mean an actual implementation is open. It is possible to implement a fully RISC-V-compliant processor under the terms of the RISC-V licensing conditions - great. Yet I can keep the implementation as closed as a traditional microprocessor implementation from companies such as Motorola, IBM, Intel, AMD, Hitachi, MIPS, ARM etc. The result may be something that executes RISC-V code just fine yet for certain aspects such as performance has to be treated as opaque, as a black box. As somebody who has ported Linux to MIPS, it's been occasionally helpful to have access to the folks who did all the mental heavy lifting and wrote the specs. With a RISC-V-compliant core one may or may not have the same kind of access for a particular project.

@ralfbaechle 1 year ago

@conorstewart2214 While this is correct, one should consider the architecture definition of any processor architecture as something that sets the absolute limits of what's possible. A good implementation can reach for near 100% of that; a bad one will stay well below. With my MIPS experience I looked at the size of early MIPS and RISC-V cores, which are somewhat comparable, and the RISC-V implementation is much smaller in terms of transistors / gates, which is an indication of how polished the architecture is. OK, RISC-V had the benefit of modern software tools to aid the implementation. Such comparisons across decades are bound to limp somewhat. An interesting aspect is how the RISC-V architecture is made up of several optional parts. Just to pick one example, an implementation does not need to have a multiplication or division instruction. They were looking at other architectures' pain points. MIPS was born as a super-fast RISC processor for super-mini computers, later workstations and servers. Nobody early on thought of embedded computing. Such omissions are hard to rectify later on in a clean manner. One point where RISC-V is brutally efficient is cost, due to the absence of licensing fees for the architecture itself. To some users that's the #1 aspect that matters.
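
To make the "no multiply instruction required" point concrete, here is a small illustrative sketch of the classic shift-and-add technique that software can fall back on when a core has no hardware multiplier. It is written in Python for readability and is not taken from any particular toolchain's runtime library; the function name is hypothetical:

```python
def software_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using only shifts, adds and bit tests."""
    result = 0
    while b:
        if b & 1:        # lowest bit of the multiplier is set:
            result += a  # add the current shifted copy of the multiplicand
        a <<= 1          # the next bit of b is worth twice as much
        b >>= 1          # move on to the next bit of the multiplier
    return result


print(software_multiply(6, 7))
```

The loop runs once per bit of the multiplier, so it costs many more cycles than a hardware multiply, which is the trade-off a designer accepts in exchange for a smaller core.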

@mementomori1868 1 year ago

It's not about performance only!!!! The biggest thing is that RISC-V is an OPEN SOURCE processor...

@GaryExplains 1 year ago

Really? You understand that only the document describing the instruction set is open source. What advantage does that give consumers?

@mementomori1868 1 year ago

@GaryExplains Please read (even on Google) why RISC-V and an open instruction set are so important.

@GaryExplains 1 year ago

@mementomori1868 😂 Or please watch my videos as I have several about RISC-V and what it really is.

@markwarburton8563 2 years ago

I was surprised that the now somewhat venerable Black Pill did so well in these tests against the newer upstarts, especially in power consumption and power efficiency. Thanks Gary!

@dekus80 1 year ago

It's not just "Black Pill" but the STM32F401 or STM32F411. Today it's one MCU on a black PCB, tomorrow another... And the F411 draws 12.7 mA at a 100 MHz core clock with peripherals disabled. Not all peripherals need to be on. I have doubts about the 20 mA measured in this video. The Chinese vendors have a lot of STM32 analogues. For example, the CH32V203 (an F103 clone with a RISC-V core) draws 8 mA at 144 MHz, and the CH32V30x (RISC-V core with FPU) draws 12 mA at 144 MHz. They even come in a TSSOP20 package. As an F103 clone it has CAN on board, which the F411 doesn't have. The 307 has Ethernet, and the 208 has Bluetooth + Ethernet and costs $2.2 in my local store. I have not been interested in buying STM32 for a long time. I only buy the Chinese STM32-like parts: CH32, HK32, AT32, GD32 and so on.

@jamesmcintyre2747 2 years ago

I appreciate that Gary is right here that RISC-V is not yet *as* efficient as ARM, but I'm very impressed that RISC-V is already *almost* as efficient as ARM: for the same processes being run it used 1.36 mWh compared to the equivalent ARM board's 1.31 mWh, and even compared to the *much* more established Pi Pico, it's only 8% less efficient (than the Pico). Obviously being almost 89% less efficient than the Blackpill isn't ideal for RISC-V, but this is still early days for it compared to ARM, and with so many fewer RISC-V processors produced vs ARM, I don't think you can expect it to be beating the leaders of the pack in ARM just yet. Maybe when there are as many models of RISC-V processor as ARM processors the leader will be ARM. Maybe with more time for tuning, the leader of the RISC-V pack will beat the leader of the ARM pack, even with fewer models out there. Encouraging stuff. Stating my bias: I want RISC-V to succeed, as I think open source is the way forward, and guarding "intellectual property" like dragons over gold is holding humanity back. Thanks for the interesting video, Gary!
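
A short sketch of the arithmetic behind the 1.36 mWh versus 1.31 mWh comparison quoted above. Only those two figures come from the comment; the helper names are illustrative. Note that "uses X% more energy" and "is Y% less efficient" are slightly different numbers, because one is relative to the baseline and the other to the candidate:

```python
def percent_more_energy(candidate_mwh: float, baseline_mwh: float) -> float:
    """How much more energy the candidate used for the same job, in percent."""
    return (candidate_mwh - baseline_mwh) / baseline_mwh * 100


def percent_less_efficient(candidate_mwh: float, baseline_mwh: float) -> float:
    """Efficiency is work per unit energy, so it scales with 1 / energy."""
    return (1 - baseline_mwh / candidate_mwh) * 100


def mwh_to_joules(mwh: float) -> float:
    """1 mWh = 0.001 W * 3600 s = 3.6 J."""
    return mwh * 3.6


riscv_mwh = 1.36  # figure quoted in the comment for the RISC-V board
arm_mwh = 1.31    # figure quoted in the comment for the equivalent ARM board

print(f"{percent_more_energy(riscv_mwh, arm_mwh):.2f}% more energy")
print(f"{percent_less_efficient(riscv_mwh, arm_mwh):.2f}% less efficient")
print(f"{mwh_to_joules(riscv_mwh):.3f} J vs {mwh_to_joules(arm_mwh):.3f} J")
```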

@kayakMike1000 2 years ago

You're in the realm of potential compiler optimizations... And which process node these chips are made of...

@xade8381 2 years ago

ARM & RISC-V are nearly the same age. Sadly, only ARM got attention at that time.

@TheWallReports 2 years ago

I agree. RISC-V is not there yet but made a very good showing as the new kid on the block. ARM has been at this game for decades. It is unrealistic to expect the new kid to outperform the veteran. ARM has been optimized over decades. RISC-V has to pay its dues to take the crown. I am a strong RISC-V advocate. I look at this as there being plenty of room for RISC-V to improve. The ground to cover to close the gap in some areas is not that great.

@BruceHoult 2 years ago

@xade8381 that's not correct. ARM started to be designed in 1983 and the first chips and boards were in 1986. ARM the company started in 1991, when there were already 100,000 ARM-based Archimedes PCs in use. RISC-V started to be designed at Berkeley in 2010 (27 years after ARM), the initial frozen spec was published in 2014, and the first board you could buy commercially from the first RISC-V company was in 2016 (30 years after ARM).

@TAP7a 2 years ago

@xade8381 RISC-V was still an educational tool for years though, with zero plans for reaching any sort of market. Whereas ARM was made from the very beginning as a commercial ISA, and is 30 years older to boot. Not very comparable.

@jacobrosen 2 years ago

A nice explanation as always. But I'm missing the sleep current for the different boards. It would be interesting to see how they perform compared to each other. It is more of a comparison between MCU brands than core architecture, but still! :D

@BruceHoult 2 years ago

Crazy to use only a single RISC-V board as representing a whole ISA. Obviously not all ARM cores or boards are created equal, and neither are all RISC-V cores or boards. Espressif doesn't even say in their datasheet what RISC-V core it uses. Crazy also not to include Sipeed Longan Nano ($4.80, 108 MHz, been around for three years), some Bouffalo lab BL602 board (similar price to ESP32s, we know it uses a SiFive core) or even extend the price limit a fraction to include a K210 board (dual core 400 MHz 64 bit) such as Maix Bit. Still, it is interesting to see that from the same chip/board manufacturer the RISC-V does in fact give better performance per MHz and per Watt than what they were using before. A really interesting test would be the Longan Nano (GD32VF103 clone of an STM32 but with a RISC-V core) vs either a GD32F103 (same manufacturer STM32 clone with a real licensed ARM core) and/or a real STM32F103.

@GreySectoid 1 year ago

When I studied computer science RISC-V was my favorite to program. Good to see they are now making a comeback.

@LokiScarletWasHere 2 years ago

It’s always strange being reminded that people think RISC-V is inherently more efficient than ARM. That’s not why people like the architecture. It’s an open standard, whereas ARM is proprietary. Anyone who can make a chip can make and innovate a RISC-V design, not the case with ARM. That being said, this was nice to see. I’m sure it has a lot of people blackpilled now.

@GaryExplains 2 years ago

"Anyone who can make a chip..." Really? I wouldn't even know where to start.

@LokiScarletWasHere 2 years ago

@GaryExplains I was referring to the legality, mostly. The architecture is open, you don’t need to pay for a license to make RISC-V chips.

@marklewus5468 2 years ago

Great work as always. Benchmarking is always a can of worms because it is as dependent on the application as it is on the processor. Do you need fast integer? Fast interrupt response? Floating point? DMA? If you used newer M3 and M4 parts they would have performed much better even in this integer-only test both with regard to processing speed and power consumption given that they’re built on *much* newer process nodes. And a recent STM32 M7 would’ve blown everything else out of the water.

@BruceHoult 2 years ago

Why are so many of the commenters here so obsessed with process node? It strikes me that many (not aimed at you in particular Mark, sorry) may just be reciting jargon without understanding it. Even a very old node such as 180nm is good enough for making a 300+ MHz chip (e.g. the SiFive FE-310 on many RISC-V microcontroller boards) which is plenty for anything in this test. Smaller process nodes do allow higher clock speeds, but if you're not USING that ability then they are not just a waste of money in the much more expensive design and manufacturing process, but they may actively be WORSE because of things such as higher leakage current when operated at low clock speeds or in low power sleep modes. It's also a complete waste when you're making a simple stand-alone chip such as a microcontroller with a small core and a small amount of SRAM because even with the old nodes you end up with the actual processor&memory being a tiny little square inside a huge bit of silicon with the I/O pin pads taking up 90% or 99% of the extremely expensive small process node chip area. The default assumption unless you're a real expert should be that the manufacturer has chosen the best process node to optimise what they want to achieve with their chip.

@adymode 2 years ago

We are familiar with needing to sample the test code many times to generate benchmark results which are not misleading, but it is also essential to sample different kinds of test code, to not be misled even by random compiler differences on each bit of code tested. With the performance results between the esp-c and the black pill coming within 1% of each other, that suggests the test was entirely memory bound on those systems and the systems share very similar memory systems. Multiple programs need to be benchmarked for a picture to emerge.
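
One common way to summarise results from several different test programs is the geometric mean of the per-benchmark speed ratios, so that no single program dominates the summary. The sketch below assumes that approach (it is not something the video is known to have done), and the benchmark names and timings are hypothetical:

```python
import math


def geometric_mean(values: list[float]) -> float:
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_speed(times_a: dict[str, float], times_b: dict[str, float]) -> float:
    """Geometric mean of time_b / time_a; above 1.0 means board A is faster overall."""
    ratios = [times_b[name] / times_a[name] for name in times_a]
    return geometric_mean(ratios)


# Hypothetical run times in seconds for three different kinds of test code.
board_a = {"primes": 2.0, "sort": 1.0, "crc": 4.0}
board_b = {"primes": 2.0, "sort": 2.0, "crc": 2.0}

print(relative_speed(board_a, board_b))
```

In this made-up data board A wins one test, loses one and ties one, so the summary comes out at parity even though any single test on its own would suggest a clear winner.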

@DFPercush 2 years ago

@BruceHoult In general I would think that a smaller feature size would mean less parasitic capacitance, but I didn't think about leakage current. Is that from quantum tunneling? I wonder where the sweet spot is for that. But there's also the matter of different topologies like FinFET and GAA, that might reduce the switching current. Mostly I think it's an economic decision. Everybody wants better speed and battery life, but how much are they willing to pay for it? For a computer that only runs a single program continuously, all you need is "good enough". Microcontrollers often have external power anyway. The main concern vis-a-vis power consumption is cooling.

@marcusk7855 2 years ago

Isn't the manufacturing process(how many nm) a major factor in power consumption?

@abdox86 2 years ago

I really love the Black Pill; actually I'm currently working on a project using it (STM32F411CE), so I'm glad to hear it did well in the benchmark. But Gary, I have a question: did you write the program for each board in assembly or C? If the answer is C, then what compiler did you use for each one? I hope I didn't throw up a lot of questions 😅😅. Amazing work man, thanks a lot for this benchmark and I hope to see more of them!!

@Tapajara 4 months ago

You should put "power efficient" in the title.

@robonator2945 9 months ago

Minor, but architecture can 100% be relevant for efficiency, speed, etc. Yes, a good x86 implementation can always sip power in comparison to a shit ARM implementation, but that doesn't mean implementation is all that matters. A slow algorithm on a supercomputer will outpace a fast algorithm on a microcontroller, but that doesn't mean picking the right algorithm doesn't matter, it just means that it's not the sole deciding factor. These architectures were invented to solve specific problems and to suggest that architecture is irrelevant is really just disingenuous. No, the differences won't be direct, but the architecture influences the implementation; different architectures lend themselves better or worse to different designs, and some designs are better in some functionality than others. Intel was *_far_* ahead of AMD for a good long while, but then AMD started going batshit and putting dozens of cores on their CPUs and now at the ultra-high performance end they're pretty unmatched. In single core they still lag a tiny bit IIRC, but in multicore it's real hard to beat 16, 32, 64, 128 separate cores. Speed isn't just an RPG stat, there is a *_lot_* of nuance and 'speed' is really just the composite of how fast it can go and how easily it can go that fast. If your chip is the fastest thing in the world, but it takes 500x more work to develop for, it'll never take off (outside of niche use cases of course). On the other hand, if your chip is 25% faster and a drop-in replacement, it'll spread like wildfire. One thing I really think RISC-V needs to work on is making sure that they go out of their way to make cross-compilation as easy as possible; that or invent a damn good emulation suite (but only Apple has really ever pulled off performant cross-architecture support AFAIK. I hear a few projects are getting pretty good, but I've never heard of one *_really_* bridging the gap outside of Apple). A new architecture just can't demand people spend time porting their software unless they have something *_really_* good to offer, and really RISC is more of just an incremental improvement than anything.

@fakecubed 6 months ago

Gonna take a while for the RISC-V manufacturers to figure out how to design really great chips with it, but there's no reason not to expect it will be roughly the same as ARM in the long run, just with an open ISA which is an absolute win on its own. Hobbyists who aren't trying to squeeze every last bit of performance and efficiency out of their projects should support RISC-V to help it along and encourage faster development. It's already outpacing ARM's development, which was already quite rapid.

@mrbigberd 2 months ago

The really interesting bit is RV32E with Zc* extensions. It essentially repurposes a bunch of the floating point compressed instructions, allowing 16-bit-only CPUs with half the registers. That'll be a tiny core.

@dead-claudia 1 month ago

@mrbigberd minor nit: they still have to implement the 32-bit instructions. there's no base isa that's compressed-only. there are a few hoops to jump through, but it's not hard for an individual to contribute. and i've been tempted to contribute for months now. (i have a number of ideas i'd like to fling their way already.)

@mrbigberd 1 month ago

@dead-claudia That's not strictly necessary as the 16-bit only format is Turing Complete. There are still 10-ish opcodes left and a couple of them could be broken down further to provide a few more 2-reg instructions. Most importantly, a 16-bit only extension would allow the use of the 11 top-level opcode space. This would increase total instruction space by 25% in 16-bit only designs and that would give enough space for a massive 64 2-register opcode space using the CA instruction format. That's enough to add in more branch instructions, A, B, M, CSR, Zicond, Zacas, etc. Going further, the E-series only has access to 16 registers. Reclaiming those bits for CR gives 2 extra instruction bits (4x as many instructions). CI doubles its available instruction space too. This would open a path to add a basic Vector/DSP extension too.
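
For readers unfamiliar with the "11 top-level opcode space" mentioned above: in the standard RISC-V encoding the two lowest bits of an instruction select its length. The values 00, 01 and 10 mark the three 16-bit compressed quadrants, while 11 marks a 32-bit (or longer) instruction. A minimal decoder sketch, with an illustrative function name, looks like this:

```python
def instruction_length_bytes(first_halfword: int) -> int:
    """Length of a RISC-V instruction, judged from its first 16 bits."""
    if first_halfword & 0b11 != 0b11:
        return 2  # quadrants 00, 01 and 10: a 16-bit compressed instruction
    if (first_halfword >> 2) & 0b111 != 0b111:
        return 4  # low bits 11, bits [4:2] not all ones: a standard 32-bit instruction
    raise ValueError("longer encodings (48 bits and up) are not handled in this sketch")


print(instruction_length_bytes(0x0001))  # c.nop, a compressed instruction
print(instruction_length_bytes(0x0013))  # low half of nop (addi x0, x0, 0)
```

A hypothetical 16-bit-only design, as proposed in the comment, would no longer need the 11 pattern to signal longer instructions, which is where the extra encoding space would come from.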

@IamTheHolypumpkin 2 years ago

I just bought my first RISC-V chip, an esp32-c3 from adafruit. Mostly bought it to learn RISC-V Assembly. Generally want to learn AVR, ARM and RISC-V Assembly.

@BruceHoult 2 years ago

Sounds like a great plan. All are good ISAs. If you have any questions the Reddit /r/asm forum is pretty good for any ISA, and /r/avr and /r/riscv are helpful too. Sadly, /r/arm seems dead and/or non-technical.

@repostor 2 years ago

Very interesting article. I have always wondered how RISC-V would compare to ARM. Do you have similar comparisons for enterprise chips too? Comparing RISC-V with x86 (Intel/AMD) and perhaps also including ARM?

@autohmae 2 years ago

I think 'process node' probably has a huge influence

@kayakMike1000 8 months ago

Well... There are certain extras in your core implementation that will make a difference; stuff like the different caches and the coherency mechanism, the branch predictor, CPU internal bus and the bus arbiters, there's just so many extra internals that are all abstracted away in complex logic. Some of that complex logic is just more appropriate to implement in another program, I think some of the CPU caches are governed by a whole other "management engine" that runs its own firmware to keep track of the bits in the cache....

@ryan258147 2 years ago

You also need to consider the code density. The firmware binary size is usually smaller using ARM Cortex compared to RISC-V or ESP32.

@mrrolandlawrence 2 years ago

There is also a compact version of Arm called Thumb which offers higher code density.

@dead-claudia 1 month ago

update from the future: the code density is starting to change in risc-v's favor as compressed instruction support is maturing.

@LogioTek 2 years ago

Useful test, but not a good test on the topic of CPU core efficiency, for several reasons: 1. likely system bus speed differences between these (system bus interfaces to on-chip SRAM) obfuscate differences in true CPU core performance/MHz/Watt unless you downclocked all of them to the lowest common denominator system bus speed, 2. differences in flash memory/prefetchers further obfuscate CPU core performance unless you ran the benchmark from RAM, and even then some, like M3/M4, could use dual buses (one for data and one for instructions), making it unfair, 3. finally, at least some of these were probably manufactured on different process nodes.

@GaryExplains 2 years ago

How would you suggest I resolve those issues?

@LogioTek 2 years ago

@GaryExplains Actually I didn't finish watching when I commented. I see you ran all of them at 1MHz later to level the playing field, and I assume the system bus was dropped to 1MHz also, and that's an important first step. I would run all of these CPU cores at the lowest common denominator system bus speed. The second step is to link the code to run out of SRAM instead of flash on all of them. That's probably the best you can do to isolate core performance efficiency.

@nitinj1234 2 years ago

Hi Gary, it would be really interesting to have you do an Intel Atom/E-core (Alder Lake/Gracemont) architectural deep dive video, and a comparison to Arm/RISC-V.

@MarquisDeSang 2 years ago

They are too far apart to be compared.

@MatrixJockey 2 years ago

e-cores aren't efficient whatsoever

@adriancoanda9227 1 year ago

Ah, it can't be compared. Atom is an x86 CPU, and it depends on how the cache and FSB are set. Arm chips usually operate at max 0.5 volt; Atom can go up to 2 volts on turbo, so it is definitely a different class of CPU. Ah, why not against an IA-64 CPU lol 😆 wanna see that race 😆

@MarquisDeSang 1 year ago

@adriancoanda9227 In the end what makes the difference between slow and fast is 99% software. I would win that race if I am the programmer: I would use inline assembly, lookup tables with pre-computed values, would not miss the caches thanks to a visibility list, local goto.... Software always wins.

@adriancoanda9227 1 year ago

@MarquisDeSang Not always. It still needs hardware to run on. I saw some remastered games meant to be played via the browser, targeting Chromebooks, and the launcher loaded 2 GB of data but used just one CPU core, so the loading screen took 10 minutes. I'd like to see you on a quantum PC; your thinking won't apply there, because it is not a digital CPU, it is analog and capable of insane parallel computing. A portable one without a transmission already exists, and it won't run any apps like you are used to 😉

@NNokia-jz6jb 11 months ago

6502 is the best.

@ArniesTech 2 years ago

Both are amazing and exciting alternatives to X86 💪🙏

@magfal 1 year ago

A huge factor for efficiency is compiler quality, which grows with age. The major design differences around efficiency are things like dark silicon for common tasks and SIMD engine implementation, plus caches.

@laci272 2 years ago

As I watch, I get questions, and as soon as they pop into my mind, Gary already responds to them. It's rare that a tech video is this well thought out and structured this well!

@gigihanmandarin 2 years ago

the one and only legendary Gary Explains.

2 years ago

Do you have thoughts about the potential of RISC-V? I was curious about any production difference, like applied node size.

@GaryExplains 2 years ago

I talk about RISC-V's potential in my RISC-V series.

2 years ago

@GaryExplains indeed you did! Quite a few as well ru-vid.com/group/PLxLxbi4e2mYFTkLsNYqWLrSQZtLB94wnY

@bsheldon2000 1 year ago

I would love to see the Xiao nRF52840 board or equivalent put to the test, as it runs at 64 MHz. This is the microcontroller used in a lot of smartwatches. It would also be interesting to see the boards already tested retested at lower clock speeds, if that option is available. I know some ESP32s can have the clock lowered. For pure power efficiency, I believe a lower clock speed tends to be more power efficient for the same work done, as power usage tends to go up on an exponential scale, whereas processing power for the same processor goes up linearly. If the amps are the same at 3.3 and 5 volts, then it is using an inefficient regulator to drop the voltage. Just curious, did you calculate the power efficiency using 3.3 or 5 volts? I am not a fan of any one architecture, as I just use whatever is better suited to the task. Of course having one that does it all would be nice and save having to learn all the differences, but now that assembly language is rarely used, it is not like having to learn an entirely new instruction set. By the way, if anyone gets a Xiao nRF52840 and the instructions say to double-click the button beside the USB-C port, the double-click speed needed is a bit slower than I was used to. It took me a lot of tries to get it right. Luckily someone mentioned doing a slow double-click somewhere.
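
The 3.3 V versus 5 V question matters because power is voltage times current, and energy is power times time. The sketch below assumes a hypothetical board that shows the same 20 mA at both the 5 V input and the 3.3 V rail, which is what a linear regulator would produce; the current, voltages and 60-second run time are illustrative, not values taken from the video:

```python
def power_mw(voltage_v: float, current_ma: float) -> float:
    """Electrical power in milliwatts: P = V * I (volts times milliamps)."""
    return voltage_v * current_ma


def energy_mwh(power_milliwatts: float, seconds: float) -> float:
    """Energy in milliwatt-hours for a run lasting `seconds`."""
    return power_milliwatts * seconds / 3600.0


# Hypothetical reading: the same 20 mA at the 5 V USB input and at the 3.3 V rail.
at_usb = power_mw(5.0, 20.0)        # power drawn from the 5 V supply
at_chip = power_mw(3.3, 20.0)       # power delivered to the chip at 3.3 V
regulator_loss = at_usb - at_chip   # dissipated as heat in the regulator

# Energy for a hypothetical 60-second benchmark run, measured at each point.
run_at_usb = energy_mwh(at_usb, 60)
run_at_chip = energy_mwh(at_chip, 60)

print(f"{at_usb:.1f} mW in, {at_chip:.1f} mW used, {regulator_loss:.1f} mW lost")
print(f"{run_at_usb:.3f} mWh vs {run_at_chip:.3f} mWh for the same run")
```

Under these assumptions roughly a third of the measured energy never reaches the chip, so an efficiency figure depends heavily on which voltage is used in the calculation.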

@Andrew-rc3vh 2 года назад

'Which one is more efficient', not most efficient. You use most if comparing more than two things. This is primary school English tuition!

@GaryExplains 2 года назад

Thanks Andrew, I am glad you found the video useful.

@chipcode5538 2 years ago

Gary, did you check the real clock speed of the RP2040? The maximum clock speed is 133 MHz but in the SDK it is set to 120 MHz because it is easier to get the correct clock for peripherals like the USB. Check SystemCoreClock in the SDK. Are you running the test from RAM or XIP? You will probably see a difference here.

@TheFerdi265 2 years ago

There is actually even more fun stuff here: The chip has 2 PLLs; it sets one to 48MHz for USB, and one to 125MHz for CPU and bus clock. 125 is also much more manageable to get useful clocks for other peripherals as you said. You can also push the pico MUCH further than what it is specced for. I have run complex programs with PIO and PWM at 300MHz just fine running from RAM, and ~250MHz when running from XIP.

@meister550 2 years ago

Too bad MIPS is dead

@perforongo9078 2 years ago

The measurement or the ISA?

@BruceHoult 2 years ago

MIPS recently announced a range of high performance RISC-V cores, including eVocore P8700 and I8500

@PeetHobby 2 years ago

That is a very slow M4. Is it a low-power version? Most M4s run at 168 MHz or 180 MHz or so. Edit: And the real power of the STM32 M4 is the FPU. Maybe you can do a floating-point test between the ESP32, RISC-V and the M4.

@riscy00 6 months ago

Low-power variants of M4 have reduced performance, especially in DMA and Bus Matrix as they have been simplified (in order to maximize battery saving and system power consumption under green something) compared to workhorse F4 with better parallel architecture in DMA and Bus-Matrix. The reason for M0+ from M0 is that interrupts have been proven to be too limiting in the past, where M0+ relieved some of this issue.

@psiah9889 1 year ago

As I see it: Arm's been around for a while. It's had an awful lot of work put into its efficiency, power, etc. over the decades. RISC-V is new, and there isn't a lot of money in perfectly optimizing it (yet). The fact that it is at all competitive now is a good sign for things to come, but it's gonna need more time, work, and support to be fully realized in this regard.

@daniellewis984 1 month ago

So, I've implemented both ARM and RISC-V architectures "on paper", and RISC-V is simpler in ways that pay. There are only about a dozen basic choices that even *can* be optimized out in the core ISA, yielding an architecturally pure ISA. My less favorite parts:
* The opcode, funct3, and funct7 not being unified in the decode step.
* The LU opcodes not mapping to a truth table means LU operations are not simple 4BD decodes, with add and mul being 2x4BD+1 decodes. AFAIK no commercially available ISA has ever achieved this, but it's been discussed widely in academic circles.
ARM, though, has for example the Java bit, which *halves* the available opcode range, and is AFAIK based on an earlier RISC platform with some commercial extensions. And sure, some of that makes it fast, but it's going to be less efficient per wire than a cleaner ISA. There are actually tons of details I don't like in it.

@andrewsutton6640 2 years ago

How do these compare with x86 chips, specifically in running programs that are designed for x86?

@GaryExplains 2 years ago

x86 chips can only run Arm and RISC-V programs using emulation. The opposite is also true.

@u9vata 2 years ago

I actually disagree that the architecture does not decide efficiency and only the implementation does. It is not so simple! You can say that one can implement both using steam engines and valves or silicon and make a HUGE difference, but you can also come up with instruction set architectures that are "easier" or harder to implement well on silicon. It is the same as saying that you basically cannot tell which programming language is faster because it is just a language, a specification, not an implementation. Yes, you can make a very dumb C compiler that makes C slower than Python, but for some reason all well-made C compilers are tons faster... Why? Because the specification was made that way (or ended up that way) so that it can be implemented more efficiently, both on average and at best effort! Trust me: I can come up with an instruction set architecture any day that can only be implemented garbage-slow, if you don't see what I am talking about or the parallel... Also, it would be good to have RISC-V SoC variants that do not have the integrated Wi-Fi. Being in sleep versus not existing at all likely makes a huge difference (if nothing else, it takes up space on the die which could be used for lower-consumption solutions, for example).

@GaryExplains 2 years ago

True, but we aren't dealing with a specification that you "can come up with", we are dealing with Armv6 and RISC-V. That is the context, not "u9vata's amazing ISA".

@u9vata 2 years ago

@@GaryExplains Sure. That is an extreme example, but it shows well that ISA specs, just like programming language specs, do have leverage over what implementations are possible. I expect you to know that, and don't get me wrong, this is all interesting data really worth measuring (I would even argue that if most makers literally use whole boards, whole-board consumption and efficiency is not bad to measure either, and makes it more real-life). I am only saying that it could also be misleading to think that the spec or ISA cannot affect the efficiency range of average, worst and best implementations. That is not as simple to measure as measuring a product, and I guess it is actually too early for that, but it is not just business randomness that, for example, all mobile devices use Arm rather than x86 and generally fare better. It is not just the implementation; the spec can be better suited for this or that.

@GaryExplains 2 years ago

The reason why I emphasize the implementation over the ISA, is because people think that RISC-V is magic and that the ISA itself will somehow solve problems of power and efficiency. That is nonsense. Since the context is two well designed and well defined ISAs then I think my statement stands and isn't misleading.

@u9vata 2 years ago

@@GaryExplains To be honest, I think it is too early to draw conclusions on which ISA fares better when it comes to efficiency. I am pretty sure there will be a difference, I'm just not yet sure in which direction. I have hopes, and that is all. So just as you emphasize the implementation over the ISA so that people don't think the ISA is the only factor, I want to emphasize that it is indeed a factor and a defining aspect, but at a different granularity: the implementation is tied more directly to the product; the ISA is tied more to the statistically relevant efficiency of a whole class of products. To be honest, I don't think we really contradict each other here.

@Chris-wf2lr 2 years ago

Why not use transistor count instead of energy used? There are too many variables. Assuming transistor numbers usually correlate with cost ultimately, it would show which architecture is more efficient for the theoretical cost of production (if they were made at the same fab, on the same node).

@GaryExplains 2 years ago

Transistor count doesn't correlate in any meaningful way. It won't help you decide what size battery to use etc. Power usage is the most important thing, everything else is just statistics.

@Navhkrin 3 months ago

Unfortunately, there is no way to quantify which ISA is more efficient based on random boards from different manufacturers. There are too many variables in this equation to derive any meaningful data from these tests. One would need to custom engineer their own hardware while keeping the CPU designs really close to each other to be able to accurately quantify this.

@muha0644 2 years ago

1:10 I did, RV32I in fact! although i had a hard drive failure so now it's abandoned...

@Serhii_Volchetskyi 4 months ago

Consider plotting these chips on a chart of Power_consumption vs Time_of_execution. By doing that, we will see the best overall chip.

@hasanagera 3 months ago

11:20 how can current stay the same? I have searched many and many voltage regulators. They all come with no load or quiescent current. If you don't use a voltage regulator, it must use less current. This is not complicated.

@ByteMeCompletely 10 months ago

I just created a NAS with a Raspberry Pi 4B and an external USB HDD. This would be a good application to verify an SBC is useful.

@vikaspoddar001 2 years ago

I guess, Gary, you really should put out a video series explaining the differences between ISAs, microarchitecture, process node etc. to the general public, as I have seen many people disagreeing with you on various issues. I think this video series would work as a prelude to the ARM vs RISC-V video. BTW, I also felt that I need some more help 😅😅😅😅 on this. Thank you

@ebuzertahakanat 1 year ago

There is no such thing as an efficient ISA. Efficiency comes from the physical design and layout and from which nm technology the chip is using. Ask compiler developers which one is the better ISA, because that's what matters. The ISA is the interface between the chip and the compiler; efficiency comes from the implementation, not the interface.

@maximus6884 2 years ago

Chinese technology is becoming superior

@kasperlhde7893 2 years ago

Interesting video :) I do not think it is enough to just power the 3.3 V rail, since there are other onboard electronics on the ESP32 board which also require power (the USB-to-serial converter). It could have been interesting to see it compared to the datasheet :)

@stealthinator00 1 year ago

I think that first-generation products are not going to be great anyway. Just wait; in the future you will get better ones as they get more experience refining the process. That is why early adopters always get screwed in the long run.

@dahlia695 4 months ago

How did you ensure that wifi and bluetooth radios did not affect the power measurements?

@gadlicht4627 2 years ago

It might be better to run multiple types of programs, because different workloads may draw different amounts of power.

@michaelkaercher 1 year ago

In general, the performance of RISC-V is not up to the standards of ARM. Full stop. But this battle does not stop today. ARM just announced that they will charge their customers in future based on the device price instead of for the IP. That will drive up research in the area of RISC-V. I expect RISC-V to become a contender in the mobile phone space (low end) in about 3 years and in the high-end market in 6-7 years.

@GaryExplains 1 year ago

ARM has not announced anything of the sort. You are repeating a rumor published by the FT.

@michaelkaercher 1 year ago

@@GaryExplains It came from Softbank, the owner of ARM. Let us wait and drink tea. Maybe it is a hoax.

@GaryExplains 1 year ago

Again, nothing official has been said by Softbank or Arm.

@michaelkaercher 1 year ago

Let us wait and drink tea. Btw. Enjoying most of your content. Great channel.

@lepidoptera9337 1 year ago

@@michaelkaercher I am waiting and drinking my tea while the attention trolls on RU-vid keep asking me for all the love they didn't get from their Moms. :-)

@jpjude68 2 years ago

Isn't power consumption also a function of speed though? i wouldn't be surprised if the microcontroller's power consumption is directly proportional to the speed

@GaryExplains 2 years ago

Of course it is proportional to clock frequency.

@tetraquark2402 1 year ago

Just spent three months learning the wrong instruction set. I'm a bit miffed about it

1 year ago

One detail that you missed is that the Pico and Pico W do not have a linear regulator; they have an on-board buck-boost switching power supply. Current consumption will not be constant; it will go up as voltage decreases.

@AndersHass 2 years ago

I do wonder how much current ran through them at the same clock speed.

@kayakMike1000 2 years ago

Hmmm ... Efficiency is largely dependent on the implementation and which extensions are used...

@GaryExplains 2 years ago

Did I not say that?

@kayakMike1000 8 months ago

@@GaryExplains Yeah, sometimes I type out my thoughts before I watch the whole video. You did great.

@Schutti73 1 year ago

I am waiting for a fullsize PC with RISC-V CPU.

@GaryExplains 1 year ago

Why? What will it give you that x86 or Arm don't do/have?

@Schutti73 1 year ago

@@GaryExplains A useful PC, instead of a developer board that cannot do my everyday work, with an open ISA AND an open source OS like Linux. x86_64 and the ARM cores are not free.

@GaryExplains 1 year ago

@@Schutti73 When you say free, what do you mean?

@TheEulerID 2 years ago

I think it is quite surprising that a 13-year-old design stands up so well. I would suspect that if the power-saving features of more modern ARM processor designs were to be exploited for a microcontroller SoC, then it might do better still. However, presumably the priority has switched to producing much more powerful, low-power architectures for use in servers, laptops and the like. Producing the ultimate in low-power microcontrollers is probably not a priority, as these things are rarely required to do heavy number crunching.

@broccoloodle 2 years ago

Back in uni, I still remember that the active power (total power - leakage current power) is proportional to the square of the frequency. Can we use it to extrapolate the power usage of the Pi to 160 or 240 MHz?

@GaryExplains 2 years ago

Or better still watch my previous video on this topic where I actually changed the clock speed of the Pico and measured the power usage.

@volodumurkalunyak4651 2 years ago

Active power is proportional to Vcore^2 * frequency. Not frequency squared, but just frequency multiplied by core voltage squared. You may get around frequency squared when cores are pushed harder than the above-mentioned microcontrollers (not as hard as full-boost latest Intel or AMD chips; the frequency still has to be supported by changing the core voltage).

@leonardosabino2002 1 year ago

@@volodumurkalunyak4651 The formula I remember from university is proportional to voltage and to frequency squared (P ∝ V * f^2).

@volodumurkalunyak4651 1 year ago

@@leonardosabino2002 i literally wrote the very same formula: Vcore^2 * frequency power is proportional to frequency and to voltage squared. Power scaling does also resemble frequency squared at some part of volt-frequency curve (probably 0,7 to 1V region for latest chips)

@leonardosabino2002 1 year ago

@@volodumurkalunyak4651 Not the same formula. Look again, it's the -frequency that's squared.- EDIT: I just looked up the formula, looks like voltage squared is correct. Sorry about that.
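
For reference, the relationship this thread settles on (switching power proportional to core voltage squared times clock frequency) can be sketched in a few lines of Python. This is only an illustration: the effective capacitance, voltages and frequencies are made-up placeholders, not measurements from the video, and the function name is hypothetical.

```python
def dynamic_power_watts(c_eff_farads, v_core, freq_hz, activity=1.0):
    """Approximate CMOS switching power: P = activity * C * V^2 * f.

    Leakage (static) power is deliberately ignored here.
    """
    return activity * c_eff_farads * v_core ** 2 * freq_hz


# Placeholder values, chosen only to show the scaling behaviour.
C_EFF = 1e-9  # assumed effective switched capacitance, in farads

base = dynamic_power_watts(C_EFF, v_core=1.1, freq_hz=125e6)
double_clock = dynamic_power_watts(C_EFF, v_core=1.1, freq_hz=250e6)
double_voltage = dynamic_power_watts(C_EFF, v_core=2.2, freq_hz=125e6)

print(f"doubling the clock:   x{double_clock / base:.2f}")
print(f"doubling the voltage: x{double_voltage / base:.2f}")
```

With the core voltage held fixed, power grows roughly in step with the clock. The "frequency squared" feel only shows up when the voltage has to be raised along with the frequency.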

@-Slade- 11 months ago

It's kinda wrong to average out the performance. The ESP32, ESP32-S2 and ESP32-C3 have an adjustable clock (80 MHz, 160 MHz and 240 MHz). The newer ESP32-S3 can go as low as 10 MHz. You can set the ESPs to 160 MHz to compare them to each other. You can also average the time it takes for a fixed set of operations, etc.

@GaryExplains 11 months ago

But the point is the power efficiency per MHz, which is what I showed. I don't think you understood the video.

@TheElectronicDilettante 1 year ago

Were the connectors taken into account? USB-C has transfer rates close to 10 Gbps while micro USB is pushing over 450 Mbps. Then, as far as power goes, USB-C can handle nearly an order of magnitude more power than micro USB, at 100 W. Just curious.

@GaryExplains 1 year ago

The test didn't use the USB ports.

@minecraftermad 4 months ago

Clock speed scaling is definitely not linear enough to fix afterwards. You should downclock all of them to the same speed if you want to compare at the same speed...

@GaryExplains 4 months ago

Clock speed scaling is linear on microcontrollers. They are in-order and deterministic. Plus I did actually change the clock speed on many of the units to check that, and it is.

@Bibbatron 2 years ago

Literally searched this a few days ago with all the news about RISC V Vs ARM. And there was no video. Thank you for this one.

@GaryExplains 2 years ago

What news are you referring to? Also, did you see this video of mine? ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-GyWyikB2hFs.html

@Bibbatron 2 years ago

@@GaryExplains Talking about an efficiency specific comparison.

@toorero 2 years ago

I would have loved to see more different benchmarks hitting different areas of the MPUs, since concluding based on one very specific crypto-benchmark not even using floats seems quite off to me...

@GaryExplains 2 years ago

LOL, other people complained when they thought I was using floats (as some MCU's don't have an FPU). I just can't win. RU-vid comments for the victory! 🤪

@BruceHoult 2 years ago

Outside of very specialised areas, almost no software uses floating point on desktop computers, let alone on microcontrollers! I've been programming professionally for 40 years and 99% of C programs I work on don't even have the word "float" or "double" in them. Gary's previous "Primes by division" benchmark was quite unrepresentative of normal programs, but this one sounds pretty good (I don't know if the actual source code is available?) so I for one applaud this change.

@Winnetou17 2 years ago

@@BruceHoult "almost no software uses floating point on desktop computers" u wot mate ? Browsers and games are "almost nothing" ? Though to be fair, I don't know much about other software, but I'd be surprised if these would be the only major ones. Still, I'd also say it's kind of irrelevant what desktop-level software use and then compare to what MCU-level software uses.

@BruceHoult 2 years ago

@@Winnetou17 "outside of very specialised areas". Games and browsers are specialised. A lot of people run them, it's true, but they constitute a very small proportion of the lines of code written or programmers employed.

@GaryExplains 2 years ago

Bruce, the code to Oceantoo is in my GitHub repo, there is also an accompanying video here on this channel.

@Andrew-rc3vh 1 year ago

ESP32 also has an ultra low power processor.

@etmax1 2 years ago

mA/MHz or mWh/MHz is a useful metric.

@GaryExplains 2 years ago

They shouldn't be too hard to calculate, I think all the data you need is presented.

@etmax1 2 years ago

@@GaryExplains absolutely, just saying that when comparing different chips they're a useful metric. I once did the OPS per MHz comparison for ARM to AVR/Microchip/MSP430 where the ops were x/+ - and I/O operations, it explains a lot of why all these architectures survive in the "32bit" world we now live in.

@GaryExplains 2 years ago

I agree. I will think about including that in any future videos. However, if you read the comments everyone has their preferred metric. People are asking for all kinds of variations. Some are even saying that per MHz anything isn't valid. It is a jungle out there!
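
The per-MHz metrics discussed in this thread are simple to compute once current, clock speed and run time are known. Below is a minimal Python sketch. The board names and all numbers are hypothetical placeholders, not figures from the video, and the run-time normalisation assumes performance scales linearly with clock speed.

```python
def ma_per_mhz(current_ma, clock_mhz):
    """Current draw normalised by clock speed (mA/MHz)."""
    return current_ma / clock_mhz


def mhz_seconds(runtime_s, clock_mhz):
    """Run time scaled to a notional 1 MHz clock.

    Assumes run time is inversely proportional to clock frequency.
    """
    return runtime_s * clock_mhz


# Hypothetical boards with made-up measurements, for illustration only.
boards = {
    "board_x": {"clock_mhz": 125, "current_ma": 25.0, "runtime_s": 20.0},
    "board_y": {"clock_mhz": 160, "current_ma": 40.0, "runtime_s": 15.0},
}

for name, b in boards.items():
    print(
        name,
        f"{ma_per_mhz(b['current_ma'], b['clock_mhz']):.3f} mA/MHz",
        f"{mhz_seconds(b['runtime_s'], b['clock_mhz']):.0f} MHz*s",
    )
```

Lower is better for both numbers: the first says how much current each MHz costs, the second how many clock cycles (in millions) the task needs.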

@ps3301 2 years ago

Risc v only has open source as their selling point.

@vampritt 1 year ago

When China thinks they can produce any Arm chips, they are wrong. RISC-V also contains proprietary American licensing things, from the basic ground up to its implementations. So tough luck, China.

@GaryExplains 1 year ago

What proprietary American licensing are you referring to? You mean in the microarchitecture?

@perpetualrabbit 1 month ago

You are comparing what against what exactly? For ARM, there are many different versions of the ISA (instruction set architecture): v7, v8, THUMB, THUMB2 being only the major families. Let's say you take the latest and greatest of these: that would be ARMv8 with Thumb 2 instructions. For RISC-V, the situation is more clear: there is the RV32I (32 bit) and RV64I (64 bit), with I = basic/integer, and extensions M (multiply/divide), A (atomic operations), F (floating point), D (double precision). Collectively IMAFD is called G. There are compressed instructions of the I set, called C. Then there is the V extension for "vector". Also there is the H extension for "hypervisor". I think that when comparing ISAs it would be fair to compare ARMv8+THUMB2 with RV64GCVH. Now of course, somewhat decent RISC-V boards are becoming available just about now, and efficient CPUs with ARMv8+THUMB2 are now on the verge of beating Intel/AMD in laptops and servers. So it is just not fair to compare the current implementations of both instruction set families. You can compare code size: RISC-V Linux executables are smaller than both x86_64 and ARMv8 for the programs I compared: ls, mv, cp, sshd, gzip. This is in contrast with what everybody claimed: C programs should be bigger when compiled to RISC-V machine language because it is RISC and the other two are CISC. Well, ARMv8 is technically RISC, I read, but compared to RISC-V the language is huge. However, code size is vanishingly small compared with data even on a Windows system. Still, RISC-V Linux has consistently about 10% to 20% smaller executables. You could also count the number of instructions executed for a certain task, say sorting an array, or compressing a file, or computing something scientific and massively parallel. Then you can compare the number of instructions used in RISC-V vector extensions against ARMv8 Thumb2 instructions. Still there is a caveat: the RISC-V V extension is vector length independent.
Newer chips can run the same binary more efficiently when they have a larger vector length. You can do normal performance benchmarks, but then you are comparing hardware implementations, not the ISAs.
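
The naming scheme described in this comment (a base width followed by single-letter extensions, with G as shorthand for IMAFD) can be sketched as a small Python parser. This is a simplified illustration that follows the comment's description only: it handles single-letter extensions, does not cover multi-letter extensions such as the Z-prefixed ones, and the function name is hypothetical.

```python
# Single-letter extensions as listed in the comment above.
EXTENSIONS = {
    "I": "base integer",
    "M": "integer multiply/divide",
    "A": "atomic operations",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
    "C": "compressed instructions",
    "V": "vector",
    "H": "hypervisor",
}


def parse_isa_name(name):
    """Split a name like 'RV64GCVH' into (bit width, list of extension letters).

    Simplification: 'G' is expanded to 'IMAFD' as described in the comment.
    """
    name = name.upper()
    if not name.startswith("RV"):
        raise ValueError(f"not a RISC-V ISA name: {name!r}")
    width, letters = name[2:4], name[4:]
    if width not in ("32", "64"):
        raise ValueError(f"unsupported base width in {name!r}")
    letters = letters.replace("G", "IMAFD")
    unknown = [ch for ch in letters if ch not in EXTENSIONS]
    if unknown:
        raise ValueError(f"unknown extension letters: {unknown}")
    return int(width), list(letters)


width, exts = parse_isa_name("RV64GCVH")
print(width, [f"{e} ({EXTENSIONS[e]})" for e in exts])
```

So RV64GCVH reads as a 64-bit base with integer, multiply/divide, atomics, single and double floating point, compressed, vector and hypervisor support.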

@GaryExplains 1 month ago

Why do I get the feeling that you didn't even watch the video? 😭

@perpetualrabbit 1 month ago

@@GaryExplains You are partly right. I actually did watch it before but I more or less forgot. I now watched it again. My issue remains though: if you are comparing the efficiency of one ISA to another ISA, that is really hard, I think. It depends on the quality of your assembly program if you are programming that directly. Or if writing C, it depends on the quality of the compiler. The compilers for RISC-V may not be as mature as those for other archs, especially for critical fast code using the vector instructions. So you can count cycles, for instance, and see in how many cycles each arch can get a certain task done. Still not really fair: CISC can presumably do more in fewer cycles, although x86_64 instructions can take tens of cycles and RISC-V takes 1 cycle for most instructions and maybe 3-4 for difficult ones. Anyway, I have always wanted to start writing assembly, but always found the ISAs way too complicated, including the various ARM ISAs. My last real experience was with the 6502 (C64 days), and I only tried it when those days were almost over. But now there is this new promising ISA that is simple enough for me to learn assembly from scratch. So I am excited for it and I want the platform to succeed. I have a Milk-V Mars on my desk but have not been able to boot it from an eMMC card yet. I also have a Milk-V Jupiter on order which has the vector RVV1.0 extension. And I have pre-ordered four of the Milk-V Oasis boards with the sg2380 chipset. I have tried some assembly in a RISC-V qemu machine running Ubuntu, and that works surprisingly well. Anyway, how would you go about comparing the relative efficiency of two ISA families? Can it be done?

@Shrek_Holmes 9 months ago

Frequency scaling with power usage isn't linear, it's exponential. It's better to have all of them at the same clock frequency.

@GaryExplains 9 months ago

While I agree that it isn't necessarily linear, as far as I know that is only if the voltage changes with the frequency. In my testing I didn't only use extrapolation, I did clock them (where possible) at the same freq and the results correlated with my extrapolations.

@youcantata 2 years ago

You should have compared an ESP32-C (RISC-V) series chip with an STM32 WB (ARM M4) series chip. They are direct competitors in function and price. Chips from the 2010s are too old. I am not impressed with the ESP32-C (RISC-V). I cannot find any reason to use a RISC-V based controller like the ESP32-C in my future projects. ARM-based controllers are still the way to go in performance, efficiency, and cost, not to mention ease of development.

@bjarnenilsson80 1 year ago

Well, is it really fair to compare a first-gen dev kit (for RISC-V) to what is, I assume, a set of products that has had years of optimisation?

@GaryExplains 1 year ago

In fact that is the whole point.

@rursus8354 1 year ago

Good video. 13:59: Board A uses 20mA·26s = 0.52 Coulomb = 3.2448·10¹⁸ electrons to accomplish the task, and Board B uses 51mA·18s = 0.918 Coulomb = 5.72832·10¹⁸ electrons, so Board A uses only ~57% of the electrons that Board B uses. Therefore A is more efficient.
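
The arithmetic in this comment is easy to check with a short Python sketch. The currents and run times are the ones quoted in the comment; the 3.3 V supply used for the energy figure is an assumption (the same voltage for both boards), and the function names are only illustrative. One coulomb corresponds to about 6.24·10¹⁸ electrons.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one electron, in coulombs


def charge_coulombs(current_ma, seconds):
    """Charge moved by a constant current: Q = I * t."""
    return current_ma / 1000 * seconds


def electron_count(coulombs):
    """Number of electrons that carry the given charge."""
    return coulombs / ELEMENTARY_CHARGE_C


def energy_mwh(current_ma, volts, seconds):
    """Energy in milliwatt-hours at a constant voltage: E = V * I * t."""
    return current_ma * volts * seconds / 3600


SUPPLY_V = 3.3  # assumed supply voltage, identical for both boards

q_a = charge_coulombs(20, 26)  # Board A: 20 mA for 26 s
q_b = charge_coulombs(51, 18)  # Board B: 51 mA for 18 s

print(f"Board A: {q_a:.3f} C, {electron_count(q_a):.3e} electrons, "
      f"{energy_mwh(20, SUPPLY_V, 26):.3f} mWh")
print(f"Board B: {q_b:.3f} C, {electron_count(q_b):.3e} electrons, "
      f"{energy_mwh(51, SUPPLY_V, 18):.3f} mWh")
print(f"A uses {q_a / q_b:.0%} of the charge B uses")
```

Because both boards are assumed to run from the same voltage, the ratio of charge is also the ratio of energy, which is why comparing coulombs works here.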

@nateb1804 2 years ago

The silicon fab processor node tech used to make the chips plays a huge role in their efficiency. It would be good to include fab node info in the comparison data.

@GaryExplains 2 years ago

Indeed, it is something I will note for future videos. As for this video the key is that the Arm Cortex-M4 is using 90nm and the RISC-V ESP32-C3 is on 40nm, which makes the performance of the RISC-V processor even worse.

@nateb1804 2 years ago

@@GaryExplains Wow that's very telling. Thanks Gary!

@geoemm 2 years ago

The area of the chip should also be a criterion.

@DataSmithy 2 years ago

What do you mean by efficiency? There's performance efficiency, and then there's energy efficiency.

@GaryExplains 2 years ago

Did you watch the video?

@minecraftermad 4 months ago

you should also list the process node for the processor, it also really affects efficiency.

@GaryExplains 4 months ago

Yes it does, and what is shocking is that the Arm chips were on the older process nodes, making the RISC-V even worse.

@borbetomagus 2 years ago

Hopefully you look into purchasing the DeepComputing/Xcalibyte ROMA RISC-V laptop (or a related RISC-V laptop or desktop) for a future video, but much more refinement will probably be necessary for it to reach its full potential.

@ElectronicFanArm 12 days ago

YouTube streamers now want me to use the ESP32, but I want to use the Pico, or if I want more performance, I use the RPi Zero.

@GaryExplains 12 days ago

Use whatever you want. 🤷‍♂️

@ElectronicFanArm 12 days ago

@@GaryExplains thanks, excellent topics

@BrianKelsay 1 year ago

Not sure if this is a valid question, but here goes. Based on these clock speeds, could one of these chips act as a processor in a micro DOS or Windows environment? I am thinking of a kiosk that runs a corporate webpage and allows customer data entry or order entry on-site. Or a tiny web book, or a tablet just for the web, or an ebook reader where it's mostly text. I know that the Pi, which is more powerful and has a video decoder, is slow at video and graphics. Just thinking that if not much computing power was needed, you could pair it with a mid-power graphics chip for running the display and decoding video streams. Then maybe you get TVs with minor computing and networking power. Or is this how they are making smart TVs?

@adriancoanda9227 1 year ago

Arm is a RISC chip. Also, RISC stands for reduced instruction set. Actually, you would need to have the same motherboard with a socket mount in order to exclude other factors in the testing, but even then the fastest chip was at 240 MHz. I don't see where those can be of use, maybe in remote controls; elsewhere they are too slow. Or use them in an insane 999999999999x cluster, but then you will need damn fast cluster management running within the firmware.

@ole.petersen 1 year ago

But shouldn't the power (aka V*I) be more relevant than the current (I)?

@GaryExplains 1 year ago

Of course. That is why I present mWh towards the end. But when V is constant then I is important.

@El.Duder-ino 2 years ago

3:21 22/23 years ago? R u sure Gary?😂🤣🤣🤣 Anyway Gary, well done comparison, thx!

@oidpolar6302 2 years ago

It's never been about efficiency, was always about pcore license dependency

@GaryExplains 2 years ago

So the Raspberry Pi Pico is expensive at $4? 🤔

@rogerdeutsch5883 4 months ago

Fantastic succinct but thorough coverage. I can tell a lot of work went into this great video. Subscribed!

@Zhaymoor 1 year ago

great video, thank you

@AndersHass 2 years ago

But which is more efficient in Minecraft, RISC-V or ARM? lol. Still, it is an important point that what is used to handle the instruction sets matters way more than the instruction sets themselves.

@fjgaston 2 years ago

It would be interesting to know also the idle power consumption, it would give an idea of how the boards would behave when powered with a battery.

@justinhall7819 2 years ago

I was just thinking the current measurements aren't very useful because of all the extra stuff on a lot of those boards. Plus the esp32 are not known for low power. You would have to compare active current with the idle current of each board.

@tails4e 2 years ago

Yes the delta power should show the true cpu energy used for the benchmark, maybe Gary can follow up?

@GaryExplains 2 years ago

The tricky thing with a delta number is that a CPU can never actually be idle. Even doing nothing is still looping and reading instructions waiting to no longer be "idle". To help in this situation there are two general solutions. 1. Lower the clock frequency and the voltage. This is something that smartphones and laptops do. 2. Put the CPU to sleep, this is a feature MCUs tend to have and it is similar to 1 but not dynamic.

@tails4e 2 years ago

@@GaryExplains Thanks for replying. The motivation for the delta is to see the difference between the dynamic power consumption of the CPU architectures. I take the point that the CPU is never really idle, but in the case of MCUs, the cores at least should be idle, or running no-ops. I think the data would be interesting nevertheless. Idle power in itself would be interesting, so all 3 data points tell a story: idle, full load, and 'full load - idle'. It's quite surprising that a 22-year-old design/process can still beat a 2-year-old one.

@GaryExplains 2 years ago

I will look into this more and see if it is interesting enough for a follow up video...

@angeldude101 1 year ago

You mentioned that your encryption algorithm doesn't use floating point or integer division, but does use bit manipulation. I'll ask if it also uses integer multiplication, because multiplication by default comes in the same extension as division, but was also made available on its own as Zmmul. Bit manipulation instructions beyond basic bitwise logic are also their own extension, B, and its parts. Did the RISC-V processors used support these extensions, and if so, did you tell the compiler to use them when compiling your code?

@schizoidman9459 2 years ago

It's not that surprising that the winners are the ones with faster clocks, especially when in dual-core configurations the second core is just left idle (why?). This test was designed to benefit single-core, higher-clocked processors. Multicore processors are known to deliver better performance at much lower clock frequencies and are consequently more energy efficient. I'm having a very hard time understanding the motivation here. However, the performance is obviously never about a difference in ISAs, especially if all of them are RISC architectures. Put some CISC ISAs in the mix and you will see huge differences, though.
Also, chips that run at faster clocks generally consume more energy. That's the whole point of multicore architectures: high performance at low frequencies. The whole point of ARM processors and their use in mobile devices is exactly that. No surprise here either, since this is obvious. The only surprise is seeing a processor running at 72 MHz consume more than one at 160 MHz. I think this must come from the fact that you are measuring power at the board level, not at the processor itself; otherwise we would see this reversed.
Now about RISC-V. There is no way RISC-V processors can compete at any level with the ARM processors that have been used far and wide in smartphones. RISC-V processors are the new kids on the block and are running far behind. RISC-V still lacks the support needed to be better than ARM processors. But there is a huge advantage to RISC-V that cannot be measured for now: it's potentially much cheaper to produce RISC-V than ARM processors, since it is an open and free ISA. However, we cannot see the price advantage yet because they are still not produced in high volume. Volume production is everything in chip prices. But we can expect to see much cheaper RISC-V processors in the future, to the point of beating ARM processors on price.
I think that's where RISC-V will position itself as a competitive ISA.
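
To make the board-level point in the comment above concrete, here is a minimal sketch of how board power and chip power can rank two devices differently. Every number and name in it is a made-up illustration, not a measurement from the video, and the "overhead" figure (regulator, LEDs, USB bridge and so on) is an assumption that would have to be measured on a real board.

```python
def energy_mj(avg_power_mw: float, run_time_s: float) -> float:
    """Energy in millijoules: average power (mW) times run time (s)."""
    return avg_power_mw * run_time_s


def chip_power_mw(board_power_mw: float, board_overhead_mw: float) -> float:
    """Estimate chip-only power by subtracting the rest of the board."""
    return board_power_mw - board_overhead_mw


if __name__ == "__main__":
    # Hypothetical boards: name, board power (mW),
    # assumed non-CPU overhead (mW), run time for the task (s).
    boards = [
        ("board A, 72 MHz", 150.0, 100.0, 2.0),
        ("board B, 160 MHz", 120.0, 40.0, 1.0),
    ]
    for name, board_mw, overhead_mw, time_s in boards:
        chip_mw = chip_power_mw(board_mw, overhead_mw)
        print(
            name,
            "| board energy:", energy_mj(board_mw, time_s), "mJ",
            "| chip-only estimate:", energy_mj(chip_mw, time_s), "mJ",
        )
```

With these invented figures the 72 MHz board draws more power than the 160 MHz board, while the chip-only estimate is the other way round, which is the kind of reversal the comment speculates about.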

@GaryExplains 2 years ago

I am planning a dual core follow up video. Also the devices with a higher clock speed didn't "win".

@schizoidman9459 2 years ago

@@GaryExplains: Thanks for your comment, Gary. You are probably referring to the hypothetical comparison in which the processors were all running at 1 MHz. You know that just multiplying the time by the clock frequency is not a very accurate performance indicator. I am looking forward to seeing the comparison between dual-core and single-core processors. For the kind of comparison you are doing (very repetitive and compute-intensive tasks), you would generally be better off with dual cores. However, that's not always true. As I stated in a comment on another video, modern architectures have lots of intrinsic parallelism (which translates into several instructions executed per cycle) that simply doesn't work when you impose atomic execution to synchronize threads. That benefits single cores more than multicores. In my estimate, to start seeing clear-cut better performance from multicore you need at least 8 cores, unless you don't use atomic operations. That's the reason smartphones have dedicated cores for certain activities: that way you don't need synchronization. The advantage of these configurations is simplicity, since you don't need load balancing. But the problem is that most cores will sit idle if their corresponding activities are not taking place.

@GaryExplains 2 years ago

Except for the raw performance test (i.e. how many ms to complete the task), none of the higher clock speed microcontrollers won. As for the clock frequency, in my previous video I actually changed the clock speed, and while performance isn't perfectly linear it is quite close, certainly close enough to make meaningful comparisons.
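
The clock normalization discussed in this exchange can be sketched as follows. This is a minimal illustration that assumes run time scales linearly with clock speed, which the reply above describes as close but not perfect; the function names and sample timings are hypothetical and not taken from the video.

```python
def cycles_used(time_ms: float, clock_mhz: float) -> float:
    """Clock cycles spent on the task: 1 ms at 1 MHz is 1000 cycles."""
    return time_ms * clock_mhz * 1000.0


def normalized_time_ms(time_ms: float, clock_mhz: float,
                       target_mhz: float = 1.0) -> float:
    """Estimated run time at target_mhz, assuming linear scaling."""
    return time_ms * clock_mhz / target_mhz


if __name__ == "__main__":
    # Hypothetical results: name, measured time (ms), clock (MHz).
    results = [
        ("MCU A", 1000.0, 72.0),
        ("MCU B", 500.0, 160.0),
    ]
    for name, time_ms, clock_mhz in results:
        print(
            name,
            "| raw:", time_ms, "ms",
            "| at 1 MHz:", normalized_time_ms(time_ms, clock_mhz), "ms",
            "| cycles:", cycles_used(time_ms, clock_mhz),
        )
```

In this made-up case MCU B wins on raw time, but MCU A needs fewer cycles for the same work, so it comes out ahead once the clock is normalized.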

@schizoidman9459 2 years ago

@@GaryExplains: Thanks. I didn't see your previous video, so I just assumed you multiplied the frequency by the time. It seems I will have to watch this video again to understand what you mean by "raw performance". I probably overlooked that. Sorry.

@GaryExplains 2 years ago

Also, I don't think MCUs have much in the way of ILP, and certainly not out of order execution.

@georgeh6856 2 years ago

This is good for RISC-V. It is comparing ARM which has been around (and refined) for decades with RISC-V which is quite new.

@peterbates4696 1 year ago

The sp20.1 cortex chips are 1.7 times s.I.p. than the M.C.U. Chips even at vv3 dash 9

@TheLouKou 2 years ago

Gary, please, you're killing me! It's ESPRESSIF, there is no X in there! XD

@todayonthebench 1 year ago

A decent video. And yes, instruction set architectures largely don't impact power efficiency. The hardware implementation impacts efficiency far more. But there are nuances at the ISA level that set limits on actual implementations of the ISA, be it limits on minimum transistor count, power efficiency, peak clock speed, etc. Sometimes one has to trade one aspect for another. As an example, a resource-efficient architecture using few transistors will generally not offer all that great peak performance, while a more peak-performance-oriented ISA will tend to be hard to build with few resources. Power efficiency is meanwhile largely decoupled from this view of complexity, since power efficiency is more about how well a given piece of software can make use of the architecture provided. It is oftentimes better for efficiency to have dedicated instructions for complex tasks, but which tasks to choose is a debatable subject in itself. If one throws in everything but the kitchen sink, then it is often far from trivial to make an efficient hardware implementation of it in practice. In short, designing an ISA is all about compromises to reach a prespecified goal, and then making a good hardware implementation of it along the way. Then it is up to the market to find or make applicable software for it.