Assembly Language Misconceptions 

Creel
97K subscribers
104K views

Support What's a Creel? on Patreon: / whatsacreel
Office merch store: whats-a-creel-3.creator-sprin...
FaceBook: / whatsacreel
In this video we look at some misconceptions about Assembly language. Apologies for the sound in this one (and possibly the next too!), the main audio was not usable so I have used the camera mic instead.
Instruction listings by Agner Fog: www.agner.org/optimize/instru...
Software used to make this vid:
Blender: www.blender.org/
Audacity: www.audacityteam.org/
OBS: obsproject.com/
Davinci Resolve 16: www.blackmagicdesign.com/prod...
OpenOffice: www.openoffice.org/
Gimp: www.gimp.org/

Published: 31 May 2024

Comments: 582
@mheermance 3 years ago
I learned to program in the 80s when compilers stunk, and it was a piece of cake to beat them with hand coded assembly. As a result many projects were written in assembler to run on older and newer hardware. The advent of efficient compilers was a godsend, and for work I was glad to see it sidelined. But for fun I still code in assembly because building high level features like lambda functions or garbage collectors from the ground up teaches you a great deal.
@sallylauper8222 3 years ago
Yeah, I thought it was really interesting that he said that today, to write faster assembly, you have to know all the tricks of the compilers.
@SunMasterXIV 3 years ago
I used Lattice C (and 68k assembly) on the Amiga in the 80s, and I thought it was pretty good. But the way modern compilers are able to optimize the code is sometimes amazing. It doesn't help hand-tailored assembly that so many x64 CPU variations are available, with instruction execution times that vary between them.
@AURORAFIELDS 3 years ago
68000 is a good example of why C compilers are not good for everything. A lot of the efficient code relies on passing arguments via registers, while C relies on stack frames. Memory access on the 68000 is really slow, so automatically C will be slow too.
@mheermance 3 years ago
@@AURORAFIELDS true, but many C compilers implement fast call linkage. They pass by registers and the called function saves on the stack if it calls another function.
@Ehal256 3 years ago
@@mheermance finding a compiler that does that for the 68k nowadays however, is quite difficult. GCC doesn't, and while llvm recently added support, I doubt it does either. Maybe something from the 80s, but I'd rather code things by hand when performance is really important.
@randyscorner9434 3 years ago
With current compiler technology there is one area where the move to assembly provides massive advantages. That is when you can vectorize the code to fully use the SSE and MMX extensions. For one routine, unrolling the loop 1 time fit the register set, allowed 8 wide vector calculations and increased the overall performance of a high end electronic piano by 12X. This was sufficient to move the program off a new Mac to a RPI3. The load went from 40% of the CPU on the Mac to 9% of the CPU on the RPI3 with just one thread. Getting to this point with a high level programming language requires a different compiler and coupling that to C or C++ is much harder than doing the 60 assembly instructions by hand. It's all about how badly one or two routines dominate the runtime. It's often the case that these "hotspots" can get extra love and show major performance improvement. Of course, the best optimization would be to stop using Python as production code.....
@thomasmaughan4798 2 years ago
"Of course, the best optimization would be to stop using Python as production code" LOL 🙂
@FM-tq2gs 1 year ago
Newbie question: why can't compilers do that kind of optimization? Will they be able to one day?
@Mr8lacklp 1 year ago
@@FM-tq2gs They will be able to do it sometimes in the future, but there are really two problems here. One is that the compiler can only do an optimization if it can prove that it won't change the behavior of the program for any value it might possibly see, and it simply doesn't have all the information, as all it sees is the source code. You might, for example, have a number that represents the day of the week, so *you* know it's never going to be greater than seven, but the compiler can't know that, so it can't apply any optimizations that assume that the number won't be greater than seven. So there are some optimizations you can do that are literally impossible for a compiler, no matter how advanced. The other problem is that both finding an optimization and proving that it doesn't change the behavior of the code are very difficult, and not things computers can do in general. This is where compilers are steadily getting better, but it's very possible that there are some optimizations that will just never be worth the longer compile times or the effort of implementing them.
@FM-tq2gs 1 year ago
@@Mr8lacklp thank you for the explanation!
@robegatt 10 months ago
@@Mr8lacklp Yeah, that is why some programming languages are better than others... a Pascal compiler could easily do what you said in the first example.
@spacewolfjr 3 years ago
I work in CyberSecurity and end up using assembly a lot when reverse engineering / disassembling malware, it's an essential skill for that kind of work
@shanehebert396 3 years ago
Well... you have to since I doubt the malware writers are going to give you the source and all you have is the executable ;)
@tappineapple3381 3 years ago
Did you go to college? If so what did you major in? I am currently a junior in high school and I would like to further learn about reverse engineering and getting better with stuff like IDA and reclass. Any advice?
@y2ksw1 3 years ago
Agreed.
@y2ksw1 3 years ago
@@tappineapple3381 I suggest disassembling viruses. Most of them are brilliant examples of engineering, and most of them are made by true masters of the art. The next step I suggest is to make your own operating system. If you master this step, you will have no problem solving all the other problems you may come across.
@tappineapple3381 3 years ago
@@y2ksw1 Thank you! I have been following the tutorials on guided hacking, and I have very much enjoyed reversing video games, so I feel like malware would be the next best step. Now, making an operating system scares me.
@wingunder 3 years ago
"If you can help yourself, try not to write a virus." 😂😂😂 You should put this quote on a t-shirt. Your sense of humor is simply wicked 👍
@OpenGL4ever 10 months ago
I love that line. And the background to that is, if you can do that, you don't need to write a virus. You will also find a well-paid job without having to drift into the criminal corner to make a lot of money.
@ChiliTomatoNoodle 3 years ago
Really good information quality and density here. This guy knows his stuff.
@WhatsACreel 3 years ago
Means a lot brus! You are a legend, Chili :)
@classicnosh 3 years ago
@@WhatsACreel - He's not wrong. I learned Pascal and C wasn't really taught in my school since Pascal was considered "academic". Assembly was also easier in those days since the microcomputers were much smaller and it was possible to really understand the memory map. Nowadays, the philosophy is very different. The rule of thumb is, don't try to outsmart the compiler. ;)
@tootaashraf1 2 years ago
The c++ guy
@Andoxico 1 year ago
ayy it's papa Chili
@craigmhall 10 months ago
I rarely write in assembly any more, but it's good to know for:
- debugging release / optimized code
- studying the generated assembly and finding ways to tweak the source code to generate better assembly
- generally understanding how the machine works, what is expensive and what is not
10 months ago
This! I personally write asm only as a hobby for microcontrollers, where cycle-level timing is sometimes required (the rest of the time C suffices), but I read it a lot more as disassembled code for the reasons you mentioned.
@lgrantcdg 3 years ago
Excellent talk! In the 1970s at General Motors Research Labs, they ran an experiment with a PLI-based computer graphics system. They recoded a few high-usage routines in assembly language. The system got faster. Then they recoded them in PLI and the system got even faster. Then they recoded them in assembly language again, and it got faster still. It turned out that each time they recoded the routines, they improved the algorithm, and that made much more of a difference than which language they used.
@guillermoleon0216 3 years ago
First Assembly I ever learned was for the Z80 and I absolutely loved it! I don't use it at work but getting to know it taught me a lot about how computers work.
@kevinjensen3056 3 years ago
Been programming in assembly and C since '79. Assembly is still widely used in my field of embedded programming, but I haven't needed to resort to it for years. The code density that an expert on the CPU can achieve in assembly is incredible. Still, most of what you've said is correct for most complex CPUs, but some comments are a little inaccurate for embedded processors today. Most MCU core instructions are still atomic, but the problem of multithreaded read/write race conditions still applies when the data size is less than the bus width. This sort of issue appears in most interview tests for embedded programmers. You really should do a lecture on race conditions at the sub-instruction level (as you just did), the instruction level, the thread level, the O/S level and even beyond. Liked your lecture on radix sort. Never tried that one before. Keep up the good work.
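One way to read the race described above is as a lost update: changing a single byte inside a wider memory word takes a read-modify-write of the whole word, so two tasks touching different bytes can still overwrite each other. The Python sketch below simulates that interleaving deterministically. The 32-bit word, the helper `set_byte` and the two "tasks" are illustrative assumptions, not a model of any particular MCU.

```python
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit memory word

def set_byte(word, index, value):
    """Return word with byte number `index` (0 = lowest) replaced by value."""
    shift = 8 * index
    cleared = word & ~(0xFF << shift) & WORD_MASK
    return cleared | ((value & 0xFF) << shift)

def run_sequential():
    """Each task finishes its read-modify-write before the other starts."""
    memory = 0x00000000
    memory = set_byte(memory, 0, 0x11)   # task A updates byte 0
    memory = set_byte(memory, 1, 0x22)   # task B updates byte 1
    return memory

def run_interleaved():
    """Task B preempts task A between A's read and A's write."""
    memory = 0x00000000
    a_copy = memory                      # A reads the whole word
    b_copy = memory                      # B preempts and reads the same word
    a_copy = set_byte(a_copy, 0, 0x11)   # A modifies its private copy
    b_copy = set_byte(b_copy, 1, 0x22)   # B modifies its private copy
    memory = a_copy                      # A writes the word back
    memory = b_copy                      # B writes back and erases A's byte
    return memory

if __name__ == "__main__":
    print(hex(run_sequential()))
    print(hex(run_interleaved()))
```

In the interleaved run, task A's byte is lost even though the two tasks never touched the same byte, which is why such updates are usually protected by disabling interrupts, a lock, or an atomic instruction.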
@SimGunther 3 years ago
Gotos are NOT considered harmful. Wormholes, on the other hand, are considered VERY harmful.
@k7iq 3 years ago
If one does not like "goto" then just rename it to jmp and then it's OK because it's what the compiler might output in assembly anyway ! 😁
@imperatoreTomas 3 years ago
Goto is my favorite function
@programaths 3 years ago
In BASIC, well, it was very present. I learned that on my own and used to put GOTO everywhere, as it was the way to skip code based on a value: "ON x GOTO label1,label2,label3" (or line numbers!). Then I also used GOTO to recycle code (as in GOSUB). Very good for state machines too, even if I didn't know it had a name. Then I had to take Visual Basic courses at school, and the teacher was pulling her hair out reading my code... no FOR and IF, GOTO worked just fine. On top of that, I kept my habit of reusing code. I am not even sure I would be able to understand my own code now, as I totally forgot that habit. Still, I have good memories of that, because the teacher ended up saying she would not correct it anymore and would just give points for it working as intended. ^^ At the same time, others had trouble understanding what a variable was, and I had already implemented Snake and Sokoban just for fun :-D (As devs, we find it to be very simple, but I taught a bit too, and this is a huge hurdle!)
@LionKimbro 3 years ago
Wormhole = en.wikipedia.org/wiki/COMEFROM
@roygalaasen 3 years ago
@@programaths When I started out with computer classes back in 1991, we had to draw flowcharts before we were allowed to write a single line of code. Only one entry point, one exit point, and no lines were allowed to cross, essentially banning goto entirely. Now my favourite programming language, Swift, sometimes forces you to use a label to tell which loop you want to BREAK out of, which is essentially a goto in disguise. My brain cringes, but I have to get used to it lol. Edit: to clarify. Break in all programming languages breaks out of the nearest LOOP. If you are in a switch .. case you will still break out of the nearest loop. In Swift you will break out of the switch case and still be stuck in the loop unless you label the loop you want to break out of.
@ricos1497 3 years ago
If I'm to take just one thing from this video, it's that I shouldn't write viruses. One virus, absolutely fine - or recommended perhaps - viruses, not. Great advice, thanks.
@WhatsACreel 3 years ago
Hahaha :)
@clickrick 3 years ago
I'm glad you got to the point that there are assembly languages for just about every processor and didn't allow people to assume that x86 is all there is. As someone who has written assembler on ICL 1900, IBM 360 & 370, DEC PDP 11, as well as microprocessors like the 6502 and Z80, I've become aware of just how different the fundamental architectures are, in particular addressing modes.
@starpawsy 2 years ago
Most successful assembly program I wrote was in 1992. I did a square root function using Newton's method that was faster than what the compiler of the day provided in the maths library! In those days, the width of the floating point divide register was 80 bits. Dunno what it is today. This might not work today. As an aside, some people might say "only 80 bits"? Well, consider that 80 bits == 24 significant decimal digits. Consider that if you measure the diameter of the known universe to 24 significant figures, the last figure is less than the classical diameter of a hydrogen atom.
Newton's method for calculating the square root of x: start with a guess, call it a. Calculate b = x/a. Take the average c = (a+b)/2. That will be closer than either a or b. Use c as your next guess for a and iterate. Keep going until a and b vary only by 1 in the LSB.
The challenge was making a really really good guess for a that works for all numbers. I hit on the idea of dividing the exponent by 2 (shift right by 1), and zeroing all but the most significant bit of the mantissa. For negative exponents you do the opposite - double the value of the exponent. This actually worked really well!
Here's a worked example. Square root of 10 (well, actually 10.000000000000000000000000000000): start with 3. 10 / 3 = 3.33... 3 + 3.33... = 6.33... Divide by 2 = 3.166... In one iteration, you've got 2 decimal places.
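The iteration described above can be sketched in a few lines of Python. This is only an illustration on 64-bit doubles, not the commenter's 80-bit assembly routine: the names `newton_step`, `initial_guess` and `newton_sqrt` are made up here, the average is taken as (a + b) / 2 as in the worked example, and the initial guess simply halves the binary exponent (rounding down) for every finite positive input, which is a simplification of the exponent trick described.

```python
import math

def newton_step(x, a):
    """One refinement: average the guess a with b = x / a."""
    b = x / a
    return (a + b) / 2

def initial_guess(x):
    """Halve the binary exponent of x and drop the mantissa bits."""
    _mantissa, exponent = math.frexp(x)   # x == mantissa * 2**exponent
    return math.ldexp(1.0, exponent // 2)

def newton_sqrt(x, max_iterations=60):
    """Approximate sqrt(x) for finite x >= 0 by repeated averaging."""
    if x < 0:
        raise ValueError("square root of a negative number")
    if x == 0:
        return 0.0
    a = initial_guess(x)
    for _ in range(max_iterations):
        c = newton_step(x, a)
        if c == a:          # no further change at double precision
            break
        a = c
    return a

if __name__ == "__main__":
    print(newton_step(10.0, 3.0))   # the worked example's first iteration
    print(newton_sqrt(10.0), math.sqrt(10.0))
```

The stopping rule here (stop when the guess no longer changes, with an iteration cap as a safety net) stands in for the "a and b differ by 1 in the LSB" test, which needs access to the raw bit pattern.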
@draconite 2 years ago
#1: This does depend on the architecture you're building for. Compiling for the 68000 with GCC, it's easy to beat the compiler if you know what you're doing
@OpenGL4ever 10 months ago
You've already made an assumption here, using a specific compiler. On the other hand, if you use a compiler that is optimized for the use of fast calls and 68k, then it can look different.
@ParagonX13 2 years ago
i'm a young person and i taught myself reverse engineering/assembly over the past several years (messing around with disassemblers and searching my questions on the internet) and actually enjoyed it way more than i thought i would... at first it was just a means to an end but i very quickly grew fascinated with it all. i have no idea what to do with this passion though other than hobby projects... :p
@OpenGL4ever 10 months ago
If you need a playground. Many open source audio and video codecs are already optimized for the x86 and ARM architectures, but this is not yet the case for the RISC-V architecture. So you could buy a single board computer (SBC) with a RISC-V CPU and then see what could be optimized there. You would need to learn RISC-V assembly though.
@mattias3668 3 years ago
There are some cases where you want to use assembly for performance because the compiler will not choose the best instructions for you. For example, if you are doing addition on bigints, you will probably wish to use the addition-with-carry instruction, which the compiler probably will not be able to figure out that it can use. And there are probably a large number of very specialised instructions like this; I imagine, for example, that the compiler won't use the SHA or AES instructions. Not only are there different assembly languages for different architectures, you also have different dialects for different assemblers.
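To make the bigint case concrete, here is a small Python sketch of multi-word addition in which each 64-bit limb addition feeds its carry into the next one, which is the job an add-with-carry instruction does in one step per limb. The 64-bit limb size and the helper names `to_limbs`, `from_limbs` and `add_limbs` are assumptions chosen for illustration; Python's own integers are already arbitrary precision and are used here only to check the result.

```python
LIMB_BITS = 64                      # assumed machine word size
LIMB_MASK = (1 << LIMB_BITS) - 1

def to_limbs(value):
    """Split a non-negative integer into little-endian 64-bit limbs."""
    limbs = []
    while True:
        limbs.append(value & LIMB_MASK)
        value >>= LIMB_BITS
        if value == 0:
            return limbs

def from_limbs(limbs):
    """Rebuild the integer from little-endian limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

def add_limbs(a, b):
    """Add two limb lists, propagating the carry from limb to limb."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        total = x + y + carry            # what an add-with-carry computes
        result.append(total & LIMB_MASK) # low 64 bits stay in this limb
        carry = total >> LIMB_BITS       # carry flag: 0 or 1
    if carry:
        result.append(carry)
    return result

if __name__ == "__main__":
    a = 2**64 - 1
    print(add_limbs(to_limbs(a), to_limbs(1)))
```

In a high-level language without access to the carry flag, the `carry` value has to be recomputed with extra comparisons or wider arithmetic, which is the overhead the comment is pointing at.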
@WhatsACreel 3 years ago
I absolutely agree!
@shanehebert396 3 years ago
You would hope that if you are using a library that's implemented bigint or SHA/AES that the people who wrote the library used intrinsics to implement the library calls.
@mattias3668 3 years ago
@@shanehebert396 Actually, I wouldn't necessarily hope that. When I implemented addition for bigint, GCC didn't have a good intrinsic for doing add with carry (I don't know if it has now); the closest it had was addition with overflow detection, which it couldn't optimise, so inline assembly was necessary for good performance. So you want your bignum to use inline assembly in this case, and then just add a portable fallback for unknown architectures. In other situations, intrinsics may work just as well, but in these cases you still need a portable fallback, so the only reason to use intrinsics instead of inline assembly in these situations is that the intrinsics may be supported for multiple architectures, and hopefully most compilers will recognise them, but that's not necessarily the case, and it is more likely that they will recognise the inline assembly. Similarly, intrinsics for SHA/AES, if there even are any, are not portable.
@shanehebert396 3 years ago
@@mattias3668 yeah, that's the beauty of conditional compilation ;) if the arch is detected, use the version of the library that uses intrinsics, if not, fall back to the library made from portable code. Then it's up to the library providers (or an interested 3rd party in the case of open source) to add to the project. But yes, you're also at the mercy of the compiler and how it generates code (gcc, in your case, with add with carry).
@andrewdunbar828 2 years ago
Rotate instructions are also not accessible from your high level language. Endian-switching instructions used to be inaccessible too but various compiler + CPU combos I looked at a while ago could recognize most ways to do endian switching in C and produce the right ASM code... but not always!
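For readers who have not seen them, these are the usual shift-and-mask formulations of a 32-bit rotate and a 32-bit endian switch, sketched in Python. The function names `rotl32` and `bswap32` are illustrative. In C, shift-and-or patterns of this shape are what many compilers try to recognise and turn into a single rotate or byte-swap instruction, though, as the comment says, recognition is not guaranteed.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, count):
    """Rotate a 32-bit value left by count bits (count taken modulo 32)."""
    value &= MASK32
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def bswap32(value):
    """Reverse the byte order of a 32-bit value (endian switch)."""
    value &= MASK32
    return (((value & 0x000000FF) << 24) |
            ((value & 0x0000FF00) << 8) |
            ((value & 0x00FF0000) >> 8) |
            ((value & 0xFF000000) >> 24))

if __name__ == "__main__":
    print(hex(rotl32(0x80000001, 1)))
    print(hex(bswap32(0x12345678)))
```

Python integers are unbounded, so the explicit masking with `MASK32` is what keeps the results inside 32 bits; in C the same care goes into avoiding a shift by the full word width when the count is zero.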
@brannonharris4642 3 years ago
Reductive learning. Discovering what something is not is seemingly more potent than only pondering on what that thing is. Love this video!
@Alex-op2kc 3 years ago
Here's an alternative definition: an assembly language is a set of mnemonics and other language elements defined by an assembler that let you write symbolic statements that map to hardware instructions. Under that definition, there can be multiple assembly languages per architecture. For example, there are multiple assemblers for x86: MASM, NASM, YASM, and fasm. And each defines a different, although very similar, assembly language.
@robertobokarev439 9 months ago
Nasm has the finest "classical" syntax, while all you wanna do looking at masm is to go back to C. Can't tell anything about fasm and yasm, don't have enough experience
@_mrgrak 3 years ago
The best programming related content on youtube right now. Creel explains complex topics simply, truly a great teacher. Looking forward to the next video!
@TerjeMathisen 3 years ago
Congratulations Creel, you've managed to create a very informative set of videos on x86 asm, all stuff that I would have loved to have back in the days, starting in 1982 when I had to write interrupt drivers in hex. :-) PS. I went on to use asm on everything from video (DVD & BluRay) & audio codecs (ogg vorbis), crypto (AES competition), games (Quake) and I still write some really low-level code, usually using compiler intrinsics since Visual Studio doesn't allow inline asm anymore. :-(
@DownhillAllTheWay 3 years ago
12:15 "Assembly language is the language of the hardware." Permit me to nit-pick. *_Machine language_* is the language of the hardware. Asm is a near-English representation of it. Many years ago, I had access to a Data General Nova computer (it was the back-up machine on a customer site). I knew how to swap modules, and I was OK at hardware maintenance (scopes, and that sort of stuff) but I didn't know anything about computers at the time. By reading the manual, I entered a 3 (in binary) into a memory address, and a 6 into another address using the front-panel switches, then I wrote an instruction in machine code to add them together - and it produced a 9 in the destination address - a thrill that I remember to this day. I learned the machine code pretty well on that machine, and wrote an assembler in binary code. I had been intending to write diagnostics on the machine, but I moved on before I did that, and never used my (rather strange) assembler. Well, I had never seen an assembler up to that point, so I didn't have much to go on.
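The distinction drawn above, mnemonics on one side and the numbers the hardware actually consumes on the other, can be shown with a toy example in Python. The instruction set below is entirely hypothetical (it is not the Data General Nova's): four made-up opcodes, one byte each, followed by a one-byte operand. `assemble` turns the text into machine code, and `run` executes that machine code on a tiny accumulator machine, repeating the 3 + 6 = 9 experiment.

```python
# Hypothetical opcodes, invented for this sketch only.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}

PROGRAM = """
LOAD 0
ADD 1
STORE 2
HALT
"""

def assemble(source):
    """Translate mnemonic text into bytes: opcode, then operand."""
    code = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        operand = int(parts[1], 0) if len(parts) > 1 else 0
        code += [OPCODES[parts[0].upper()], operand]
    return bytes(code)

def run(code, memory):
    """Execute the machine code on a one-accumulator toy machine."""
    accumulator = 0
    pc = 0
    while True:
        opcode, operand = code[pc], code[pc + 1]
        pc += 2
        if opcode == 0x01:                      # LOAD address
            accumulator = memory[operand]
        elif opcode == 0x02:                    # ADD address
            accumulator = (accumulator + memory[operand]) & 0xFF
        elif opcode == 0x03:                    # STORE address
            memory[operand] = accumulator
        elif opcode == 0xFF:                    # HALT
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")

if __name__ == "__main__":
    machine_code = assemble(PROGRAM)
    print(machine_code.hex(" "))
    print(run(machine_code, [3, 6, 0]))
```

The bytes printed by the first line are what would have been keyed in on front-panel switches; the assembler only saves the programmer from looking the numbers up by hand.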
@ancapftw9113 2 years ago
The best example I saw was a guy making a 6202 (I think) program by writing to a ram chip and feeding it into the processor. He showed what the assembly would look like, but had to program it in hex code.
@alberto3028 3 years ago
ASM is perfect for bootloaders and some parts of OS
@WhatsACreel 3 years ago
It is indeed! UEFI changed the necessity a little, but certainly low level OS code is one of the most important use cases for ASM! Cheers for watching mate :)
@lewiscole5193 3 years ago
Assembly language gives complete control of the hardware to the programmer in a way that no HLL can, in no small part because assembly language is processor architecture specific, while an HLL is supposed to be processor architecture independent. So, it's not that "ASM is perfect for bootloaders and some parts of OS", it's that there is no other way to get there from here using an HLL.
@WhatsACreel 3 years ago
@ozan o. I would love to :) Judging by the recent reviews of Apple’s new M1, I think maybe ARM will give x86 a very good shake very soon! We might be witnessing the beginnings of the fall of x86 in the laptop and desktop markets...? Unbelievable! Not sure when I can cover these things, but they’re certainly on my to-do list. Thanks for the suggestions, and cheers for watching :)
@lewiscole5193 3 years ago
@ozan o. OSs have to change over time to meet new hardware and/or user demands, or else they die off. Unix is no different and has evolved over time to be different from what it originally started out as. So in a very real sense, I suspect that Tony Hoare's famous saying, “I don't know what the language of the year 2000 will look like, but I know it will be called Fortran,” has applicability to OSs, with "Linux"/"Unix" being substituted for "Fortran". And keep in mind that there are already environments where "Linux"/"Unix" is not king ... real-time environments such as can be found in cars, where QNX, a proprietary message-passing microkernel-based OS (which can run on ARM-based systems, by the way), is already more common. Yet, thanks to the Posix standard and the QNX people's interest in it, QNX now offers a similar interface ("abstraction") to application programs so that their developers feel warm and fuzzy about it. I suspect the same thing will likewise happen with any OS that depends on C, including Fuchsia.
@lewiscole5193 3 years ago
@ozan o.
> As you know, processes never really pause in posix, I don't know if it was due to hardware restriction or design error during constructing of unix back then.
I don't know what you mean by "processes never really pause in posix". Posix is an interface standard for OSs that just happens to look like the interface that Unix/Linux typically used to present. It's not an OS itself. An OS can be something other than Unix/Linux entirely under the hood and yet present a Posix-compliant interface, as is the case with QNX, which is a proprietary message-passing microkernel-based OS that is Posix compliant, as I indicated before. To the extent that Posix was supposed to look like Unix/Linux to the outside world (programmer), various interface calls such as a file read or write do block (pause), because that's what they historically did in Unix/Linux in The Good Old Days. That doesn't mean that an OS can't natively use non-blocking interfaces internally which look like they are blocking to the user.
> there is also root privilege problem.
Again, I don't know what you mean, since Posix isn't an OS.
> Plus Android turn into giant layers of burger. I guess that's why google wanna leave Android.
Android *IS* Linux by another name. Really.
> if any other new os becomes complicated and consist of many layers in the future, it will be loop then they will be wandering new solutions in the future:).
Again, OSs change over time or they die. To the extent that everyone thinks that what they want done is the way things should be, OS developers are likely to toss in lots of crap to satisfy different users. If you want a lean, mean OS for your specific machine(s)/application(s), feel free to write one yourself ... and spend forever doing it.
@jeffm2787 3 years ago
I was writing x86 before it was called x86. Did 6502, 6809, etc. as well. Stopped when the 486 came out.
@Guztav1337 3 years ago
You should get more cushions/backdrop in the room, there is a bit of echo in the background.
@mrdouble 3 years ago
Was thinking the same, looks like an expensive mic though :/
@swharden 2 years ago
The condenser microphone is "too nice". It's picking-up every little echo in the room. A dynamic microphone or a basic gaming headset (microphone closer to the mouth) could be better options for this space. Edit: audio is good in later videos
@stevem3432 3 years ago
I began learning assembly at uni this semester and I actually enjoy it. Thanks for these videos.
@hell0kitje 3 years ago
Glad to see you back, mate :) I started with your C++ vids and now I'm discovering asm, keep posting more!
@WhatsACreel 3 years ago
Thanks, will do!
@kevinz1991 3 years ago
great information and great delivery. thanks a lot for the time you put into this. subscribed
@VTdarkangel 1 year ago
I had to do some SPARC assembly programming when I was in school. The real advantage of it was when we had to do hardware interfaces. Those functions could have been done in C, but when I broke the object files down, I found out that the compiler was inserting a bunch of extra commands that were completely unnecessary, such as writes to the master register for settings that weren't being used. By doing the interfaces in assembly, I could bypass all of that.
@PaulaBean 10 months ago
When the rubber hits the road, you can always benchmark the speeds of your C++ code against assembly code. Measurement trumps speculation. Thanks for the nice video!
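The same "measure, don't guess" habit works in any language. As a small illustration in Python (standing in for the C++ versus assembly comparison, which needs a native toolchain), the sketch below times two implementations of the same sum with the standard `timeit` module and keeps the best of several runs. The function names and the choice of workload are arbitrary assumptions.

```python
import timeit

def sum_with_loop(values):
    """Straightforward hand-written loop."""
    total = 0
    for v in values:
        total += v
    return total

def sum_builtin(values):
    """Same result using the built-in, which runs in native code."""
    return sum(values)

def best_time(func, values, repeat=5, number=20):
    """Best wall-clock time (seconds) for `number` calls, over `repeat` runs."""
    timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
    return min(timings)

if __name__ == "__main__":
    data = list(range(100_000))
    # Check both versions agree before comparing their speed.
    assert sum_with_loop(data) == sum_builtin(data)
    for func in (sum_with_loop, sum_builtin):
        print(f"{func.__name__}: {best_time(func, data):.4f} s")
```

Taking the minimum over several repeats reduces the influence of other activity on the machine; checking that both versions return the same answer first avoids benchmarking a faster but wrong routine.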
@RufianEmbozado 9 months ago
Assembly will always retain two strong points. First, when you learn to code in assembly you go through a rush of "illuminations" (I'm always thinking of 8-bit platforms, because they are simple enough to have a grasp on all the landscape, and because I'm that old. Nothing is done for you yet; you push and pull all those pesky bits all over the place "by hand", a blazingly fast hand) that put a lot of pieces of the information science puzzle right into place. Second, there is an inherent beauty in assembly code. The Motorola 68000 had a beautiful, beautiful assembler (I crashed on it with an Amiga 500 and, man, what a joy it was! All those fancy chips at your command... Most missed piece of hardware ever). I never got that feeling when I tried to code assembly on i386. I still think learning to write assembly for any CPU is worth the price. No need to do great things, just some humble tasks. You'll have the ride of your life (as a nerd, at least) and won't fall for those kinds of misconceptions. Great video, of course. Assembly has the virtue of dispelling all sorts of misconceptions. But assembly itself is covered by some key misconceptions which keep it from teaching all it can.
@johnyoungquist6540 3 years ago
Talking about assembly in general across different processors is fraught with trouble. I do embedded apps in 8051 assembly only. In fact, I wrote the assembler. I can promise that C in the 8051 environment is at least 500% slower and also 500% bigger than assembly, even for simple things that C should be good at. It is widely accepted that compilers use a tiny fraction of the instruction set and leave a lot behind. It is easy to point out that ordinary languages contain no information to help compilers use special instructions or constructs. The assembly programmer will recognize an AES algorithm and use the AES instructions; a C compiler won't. In modern processors, the compiler code generator could hold a significant advantage over the programmer, with a detailed knowledge of architecture magic like pipelines, cores, caches, threads. I don't know how they handle the moving target of the new processor of the week, or tell what processor they will run on. One processor's optimization is another's downfall. In contrast, the assembly programmer wizard may better the C code speed by 100 times or more with devilishly clever thinking and detailed knowledge of the whole instruction set. One thing that is universally overlooked is how similar assembly and high level applications are. Apps are typically constructed of functions tailored to do common things for that app. If you need 98 digits of precision, you'll be writing routines to handle that in any language. These modules are easy to define and test and spread among several programmers. We build bricks first, then walls later. A function call is about the same complexity and work to implement in any language. Now all of a sudden, apps in all languages are basically function calls and logically look about the same. Neither is more difficult than the other. The planning stage and logic can be nearly identical for any language.
@donjindra 3 years ago
Exactly. People who don't regularly program in assembler have no idea how much faster assembler is than any high level language. Compiler optimization cannot compete with a programmer who knows the instruction set intimately and can tailor the use of those instructions for a particular task. A 10x improvement in speed is pretty normal. OTOH, a poor programmer is not going to benefit much from assembler code. You have to know what you're doing. The 8051 is a good example. That cpu is so weird a compiler can't deal with it efficiently. A compiler does better with something like ARM.
@SimonBuchanNz 3 years ago
@@donjindra Compiler optimisation can definitely beat any reasonable amount of effort for the majority of code, assuming you're not using the trivial C implementations that come with microcontrollers - inlining and avoiding pipeline stalls is drudge work that's better to let the computer handle, especially when your problem is getting something working or cleaning up a mess, not making something faster. Not always: there's always going to be some cases that confuse a compiler enough that it's easier for you to use assembly than to figure out how to mangle your code so the compiler does the right thing, but advanced instructions are available through intrinsics, and compilers will auto-vectorize loops, and so on. The low-hanging fruit is getting picked all the time.
@donjindra 3 years ago
@@SimonBuchanNz I don't know why you think that. In fact, I don't even know what sort of code you have in mind. I don't advocate using assembler to add two register-width numbers.
@SimonBuchanNz 3 years ago
@@donjindra sorry, could you clarify what I said that you have an issue with? I was talking about your statement that "a compiler can never compete with [an assembly] programmer": trivially true in that said assembly programmer could at worst use the same instructions, but not practically true. Not sure where you're getting adding numbers from, but if that's literally all you're doing, then actually yeah, you probably will beat a compiler. It's the 50kloc of "adding two numbers" that's not worth the absurd effort to keep optimized in assembly, and mixing and matching can (depending on your baseline) actually pessimize the code since the compiler can't inline now.
@donjindra 3 years ago
@@SimonBuchanNz Concerning adding numbers I said the opposite of what you think I said. If the task is simple, such as adding two numbers, the compiler does just fine. There's no point in resorting to assembler. It's the complicated, time consuming tasks that benefit from assembler. Compiler optimization was done by assembly language programmers. But they optimize general cases. They aren't magicians. They can't predict all particular cases. Therefore they cannot optimize for all of them. I have no idea what you mean by the end of your comment.
@sergiomarroquinjr3587 2 years ago
I always seem to learn something new from you. Keep it up!
@controlflow89 3 years ago
Absolutely amazing channel, keep up the great work!
@LukeAvedon 3 years ago
Wonderful video! Glad you are back.
@3Balala3 3 years ago
Great video, helps a lot in understanding assembly's place and purpose nowadays. Also great timing. Tomorrow I have an exam in assembly. We are programming on an emulated DOS program. Really, really interesting... :D
@herrbonk3635 3 years ago
2:34 _"That one clockcycle is called the latency"_ Not really, that one cycle is called _throughput_ in these contexts. The latency *for simple instructions* (like ALU reg,reg/im) usually equals the number of pipeline stages. In a simple pipelined CPU, that would be: fetch+decode+calculate+write result, i.e. 4 stages and so 4 clock cycles. For the 486, that was five stages and five cycles, for the P4 it was around 20 stages and cycles, and so on (again for simple instructions like ALU reg,reg/im).
@laurelsporter4569 2 years ago
But calculate can be repeated ad nauseam, and as long as that can go on, write can be hidden. The full pipeline isn't executed fully for each instruction before the next one executes.
@herrbonk3635 2 years ago
@@laurelsporter4569 Yes, that's the basic idea with a "pipeline", i.e. having all the stages of the instruction execution fully overlapping, so that (different stages of) several instructions in a sequence can be processed at the same time. (Typically instruction fetch -> decode -> effective address calculation -> operand fetch -> ALU -> write-back.)
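To put rough numbers on the overlap described in this thread, here is a minimal sketch in C. It assumes an idealized pipeline (one instruction enters per cycle, no stalls or hazards), which no real CPU matches exactly; the function names are hypothetical and chosen only for illustration.

```c
/* Idealized model (an assumption, not a real CPU): `stages` pipeline stages,
   one instruction can enter per cycle, and nothing ever stalls. */

/* Cycles needed if each instruction must leave the pipeline
   before the next one is allowed to enter. */
unsigned long cycles_unpipelined(unsigned long stages, unsigned long n) {
    return stages * n;
}

/* Cycles needed when the stages overlap: the first instruction takes
   `stages` cycles, each later one completes one cycle after the previous. */
unsigned long cycles_pipelined(unsigned long stages, unsigned long n) {
    if (n == 0) {
        return 0;
    }
    return stages + (n - 1);
}
```

Under these assumptions, 100 instructions on a 5-stage pipeline take 104 cycles instead of 500, which is why the per-instruction cost looks like one cycle even though each instruction spends five cycles in flight.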
@TellowKrinkle 1 year ago
Don't know how people talked about the 486, but on modern processors, when people talk about latency, they mean the number of cycles from when the register value is first needed to when it's available to the subsequent instruction. If your CPU has forwarding circuitry (like every modern processor), that's only the number of calculation stages. For the example of an `inc rax`, if you had four of those in a row, the cpu would fetch all four in parallel, decode them all in parallel, and calculate them serially, with each one forwarding its result to the next without waiting for writeback. In the end, four (dependent) `inc rax`s would run in four consecutive clock cycles, which is why `inc` is considered to have a latency of just 1 cycle, not 20 or however many a modern processor's pipeline has. The throughput of inc is not 1 but 1/4 for a skylake processor, meaning that the processor can execute four non-dependent inc's in one clock cycle.
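The difference between a dependent chain and independent instructions can be sketched in C. This is only an illustration under assumptions: the function names are made up, and whether the second version is actually faster depends on the CPU and on what the compiler does with either loop.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add needs the result of the previous add,
   so the adds form a single serial dependency chain. */
uint64_t sum_one_chain(const uint32_t *v, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++) {
        s += v[i];
    }
    return s;
}

/* Four accumulators: four independent chains, which a superscalar
   core is free to work on in the same cycle. */
uint64_t sum_four_chains(const uint32_t *v, size_t n) {
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; i++) {   /* leftover elements */
        s0 += v[i];
    }
    return s0 + s1 + s2 + s3;
}
```

Both functions return the same value because unsigned addition can be reordered freely; only the shape of the dependency graph changes.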
@BlackStarEOP 2 years ago
8:10 "Race conditions are brilliant" :D (y) Thumbs up for that... Tracking down race conditions has been the most difficult part of my career as a software engineer. If you implement something using more than 1 thread, if you carefully think things through, there's not much you can do wrong. However... when suddenly one guy in your team says "yes I know how to improve the performance, just put this and this into its own thread" then you know you need to buckle up. You're in for one hell of a ride...
@sikkavilla3996 3 years ago
Happy Holidays @Creel!
@WhatsACreel 3 years ago
Happy holidays to you :)
@mikefochtman7164 3 years ago
Good information. When we had some ASM instruction dependencies, we sometimes would look down a few lines and see if we could move some other instruction in between the dependent instructions. That meant we could space out the two dependent instructions to let the first one finish and give another ALU something to do while the first one crunched. Also worked on a different processor that had a special increment. Used in the OS interrupt handling, it had a couple of instructions that were non-interruptable so we could guarantee that the increment and sto would be atomic.
@spacewolfjr 3 years ago
The legend returns! Thanks Mr. Creel.. man..
@programaths 3 years ago
First year in school: Compute the volume of a cone... in assembly! Most students were stuck on the division!!! That's when they learn about overflow AND underflow. I do not remember the ins and outs, but the division gives you a good ride if you didn't pay attention to the curriculum. Then, while you are doing your work, you realize that registers can be split in different ways, and that there is a flags register too. At that time (15 years ago), there was "help PC" with nice explanations of all of this... Another difficulty of assembly is that it's "verbose". In a higher-level language, "if" is identified as such. In assembly it's CMP+JNE,JEQ,JZ,JNZ,JNP. And even conditions with conjunctions or disjunctions become challenging. Another nicety was using the stack for local variables instead of trying to guess which register is safe to use ^^ It's a bit cloudy, because it's far away now. But that wasn't that easy! It's a gymnastics of its own! But overall, whatever the language, programming is really complicated. It's all about solving problems and expressing the solution as code... And most of the time, the problem to be solved is also to be found!
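The point about "if" turning into compare-and-jump sequences can be sketched in C itself, using `goto` to mimic the branches. This is an illustration only: the function names are hypothetical, and the x86-style mnemonics in the comments show the general shape, not what any particular compiler emits.

```c
/* High-level form: one `if` with a conjunction. */
int both_positive(int a, int b) {
    if (a > 0 && b > 0) {
        return 1;
    }
    return 0;
}

/* The same logic written as separate compare + conditional-jump steps,
   roughly the way it has to be spelled out in assembly. */
int both_positive_lowered(int a, int b) {
    int result = 0;
    if (!(a > 0)) goto done;   /* cmp a, 0 ; jle done */
    if (!(b > 0)) goto done;   /* cmp b, 0 ; jle done */
    result = 1;
done:
    return result;
}
```

Each extra term in a conjunction adds another compare and another jump to the same exit label, which is where the verbosity comes from.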
@WhatsACreel 3 years ago
So true! Cheers for watching :)
@coder2k 3 years ago
Looking forward to seeing that next video you already teased :)
@theDemong0d 3 years ago
In my experience writing assembly (mostly to capitalize on AVX), yes the function call overhead is a huge performance hit, but you need to write your program in assembly anyways because when you switch to AVX intrinsics, you need to know what assembly you want the intrinsics to produce. Writing the function first in assembly makes it easy to translate into AVX intrinsics, and the intrinsics should allow you to write C++ that compiles almost exactly instruction-for-instruction identical to your handwritten assembly. Yeah, it's not quite as cool as your program running your handwritten x86, but it's the next best thing and with the call overhead eliminated, you can reap large performance boosts.
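As a rough illustration of the intrinsics approach described above, here is a small C sketch. It is not the commenter's code: `add_floats` is a hypothetical helper, the AVX path is only compiled when the compiler defines `__AVX__` (for example with `-mavx`), and the instruction names in the comment are what such a loop typically maps to rather than a guarantee.

```c
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Adds b[i] to a[i] for every i in [0, n). Hypothetical example helper. */
void add_floats(float *a, const float *b, size_t n) {
    size_t i = 0;
#ifdef __AVX__
    /* Eight floats per iteration; each intrinsic corresponds closely to one
       instruction (roughly vmovups, vaddps, vmovups). */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(a + i, _mm256_add_ps(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when AVX is not enabled. */
    for (; i < n; i++) {
        a[i] += b[i];
    }
}
```

Because this is ordinary C, the compiler can still inline `add_floats` into its caller, which is the call-overhead advantage over a separately assembled function.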
@roax206 1 year ago
Though from my understanding, assembly is mostly just machine code but replacing the binary instruction IDs with short nicknames for the instruction. Technically any compiled "higher level" language will be converted into assembly at one point (unless the person who wrote the compiler is a masochist and memorized all the instruction ID numbers). The main point when assembly becomes quicker then simply relies on whether the problem is easier to express in assembly language rather than the HLL used and to what level you are willing to manually optimize the assembly code.
@DigitalPhage 3 years ago
"x86 Assembly Language Misconceptions" would be a more apt title, however a good video.
@TheBypasser 3 years ago
Oh yeah, say Arduino compared to pure AVRASM is like a snail vs a ballistic missile (just like for the most of the RISC cores, HLL vs ASM that is).
@niclash 3 years ago
Misconception; x64 Instruction Set is a typical one. The micro controllers are typically magnitudes easier to learn fully. And then there are the funky/academic outliers, like 1 OpCode Instruction Set. But the majority of Assembly Languages out there are dozens, maybe 100 and a bit, and not the thousands in the Intel/AMD world.
@trashtrashisfree 9 months ago
I always wrote a good macro library for the assembly I was working in. System 360/370 didn't even have stacks so my first priority was writing things to push and pull values and create subroutines. Everyone else was hand-cutting every single line. Far more error free. Same for other issues in 6502.
@wrtlpfmpf 2 years ago
One thing doing a project on a small assembler can really help is with coding style. I used to write multiple screen long functions with control structured nested several levels deep. Writing in assembler can really teach you how to write code that is as simple as possible, yet correct. I once did that for a little project on an ATMega. Those are cute little 8-Bit micro controllers. Since they have different addresses for RAM and Flash, programming them in assembler is a lot less painful than, for example, C. Anyhow that project really helped me write readable code when I later did C projects. I later played around with those microcontrollers in C and looking at the assembly created by the compiler I have to say that it's highly dense. (The rationale behind assembler was that I had more experience with AVR assembler and that that code would use the remaining flash program storage as data storage, something that is even harder to do in C)
@steveokinevo 3 years ago
Another beaut of a video chris man, thanks again pal.
@AngDavies 3 years ago
Minor nit/clarification: while you definitely need to know assembly on a deep level to be able to code an optimising compiler- after all, it's a program that turns code in a given language into as efficient/fast machine code representation as possible. That doesn't mean you necessarily should write one in assembly itself- it wouldn't make faster code, only code, faster. The better option is often to write the compiler in the language that you intend to compile with. You spend loads of time writing a compiler that can create really optimised code for a given platform, build it using some existing compiler, which doesn't make very optimised code, and so the compiled compiler takes ages to compile code. But now you've just created a program that turns your code in your language into optimised machine code, so just feed the original code through the new compiler, and you now have an optimised optimizing compiler :D Having just "GCC" that compiles to your machine is so much better than having to find a version of GCC tailored to your exact platform
@michaelbuerge 3 years ago
Great stuff. Interesting and relevant info. Thanks. Allow me a remark about audio: You invested in a nice mic. Now you might want to think about the room you're recording in. Maybe put something absorbing in place to reduce room reverberation.
@Lantalia 2 years ago
So, with regards to #1: inline assembly skips the function call overhead; the main reason to do it is to use instructions not yet supported by your compiler
@rfvtgbzhn 9 months ago
From what I heard, you can get a significant performance boost in some cases by disassembling the compiled code and rewriting parts in Assembly language.
@danepane527 2 years ago
The algo sent me here.. was watching a bunch of Coach McGuirk videos.. subbed!
@PvblivsAelivs 3 years ago
I have seen many people say that compilers do these wonderful tricks and that hand-coded assembly language is not (generally) faster than a compiler's output. While there may be some compilers that do this, no compiler I have actually used does so. "You might get the right result." Especially if you use the lovely little LOCK. Any processor that can feasibly be part of a multi-processor system needs a way of executing at least certain instructions without interference from other processors. "The CPU will perform the instruction a lot slower." It will if two processor units are trying to access the same memory at the same time. After all, one must stall. But the processor that "gets there first" has a negligible performance penalty. It was a two-cycle penalty on the 8086. (I only have timing information up to the 486.)
@BrightBlueJim 3 years ago
So to summarize a couple of things you said: 1) Functions written in assembly don't really run faster than compiled functions. 6) Assembly is still necessary for low-level optimization, where speed is really important. Also, your point on atomic operations applies just as directly to C and C++, or indeed for ANY program written to take advantage of multi-threading.
@gregorymifsud5389 3 years ago
Great content mate love it
@sambrown9494 3 years ago
Very interesting stuff, enjoying these videos. Hope you don't mind my asking - is that microphone actually turned on? It's a bit echoey like it's the camera microphone doing the recording across the room ..? Looking forward to more vids! Thx :)
@sambrown9494 3 years ago
Ha umm sorry! I commented and only then read the description. Already covered. Just so you know I was paying attention! ;) Rock on ...
@CallousCoder 9 months ago
ARM 64 cpus actually have a couple of assembly dialects. You have your AARCH64 but also your Thumb instructions, which are a small instruction to save space.
@NomenNescio99 3 years ago
A long time ago in a galaxy far far away, before the time when gcc used the mmx instruction set to optimize vector arithmetic there was sometimes huuuge gains to be had from inlining some assembly code.
@kindpotato 3 years ago
"race conditions are brilliant" This guy is awesome.
@DukeDudeston 2 years ago
"You can do a lot of stupid things in any language" I was able to delete ntfs.sys in a language called "DarkBASIC" when I first started out. So yes. You can do a lot of stupid things in languages.
@microdocker 1 year ago
Very good and explanatory shot. One small weird thing (not related to the topic): the guy is literally sitting in front of a mic and still recording his voice on the on-camera microphone ^_^
@vikassm 3 years ago
Fantastic video and channel! Subbed. My 2¢ about the poor audio: Use your mobile phone with a ~5$ lapel mic to capture your "B-Roll" audio 🙂 That way if your nice desktop mic doesn't record for some reason, the backup audio from your cellphone is still wayyyyy better than the absolute garbage camera mic. Just clap once (Aaand ACTION) at the beginning and the end of each take to simplify A/V sync during editing.
@thomasmaughan4798 2 years ago
There was a time when assembly was much faster than compiled but eventually the compiler optimizations produced code that executed efficiently. Depending on what one is doing, assembly is considerably smaller. A function in COBOL to parse a text file was 30 kilo-words and took 30 seconds to execute; I re-wrote it in assembly and it produced an executable that was only 3 kilo-words and parsed the same file in 3 seconds. 1/10th the size and ten times faster! But that extreme example is a result partly of COBOL not really a good choice for that sort of thing and my re-write also used static linking; everything it needed was already linked in the executable so at run time, no "fixups" were needed.
@EvilSandwich 3 years ago
I like to program for old systems like the Apple II and the NES, so I code a lot in 6502 ASM. Believe me, you start to miss high level after a while. You guys ever try Hello World when you have to explain to the computer how to read and print strings before it can even do that? Heck, the NES doesn't even have ANY internal ROM, so you have to draw the letters manually before you can even start on strings. lol
@EvilTaco 3 years ago
I was wondering where I could find information on the cycles of different instructions for zen 3 (since I have the Ryzen 7 5800x) And instructions that reference ram are dependant on the memory frequency, right? (Mine is overclocked to 4000MHz)
@overcritical304 3 years ago
Love your videos man. Learned so much from this channel. Have you thought about ARM. Will love to learn that too
@WhatsACreel 3 years ago
I would love to do some ARM! Hopefully I can record soon, though I am unsure at the moment exactly when. Thank you for watching, and thank you for the suggestion :)
@gideonz74b 1 year ago
@Creel: Executing an instruction in one cycle does *not* mean that the *latency* is one cycle. It means that the *throughput* is one instruction per cycle. The latency is always a lot more than that, because it has to pass through the pipeline.
@FORTRAN4ever 3 years ago
I programmed in assembly on a Sperry Univac 1143 mainframe computer in the early 1980's. Each instruction consisted of a 36 bit word. Commenting was a must. I would prefer to program in FORTRAN or COBOL anyday over assembly.
@thadtheman3751 2 years ago
Actually, part of the complexity of assembler comes from the fact that "decorations" of instructions are not uniform. To clarify, I will make up an example (it's been a while, so don't expect this to be a real-world example). You might have INC A,N: increase A by N. A might be a memory location and N a number (direct addressing): INC $A,N. A might be a memory location pointed to by a memory location (indirect addressing): INC [$A],N. N might be a memory location: INC A,$N ... The thing is that some commands accept some of these addressing modes and others do not. A JMP, for example, might accept all addressing modes, but a JSR would not. So it gets complicated keeping track of which instruction does what.
@MistWing 9 months ago
The first computer language I learned was back in the 80's (back when the myths weren't myths :) ) and was assembly on the Z80. Back then, we had simple instructions like "LD reg1,reg2". Nice and simple. Now we have things like "AESKEYGENASSIST xmm1,xmm2/m128,imm8". And instructions were only 1 or 2 bytes long. Now, instructions can be up to 15 bytes long. My how times have changed :)
@briancampbell179 9 months ago
I started a few years before that on a Motorola 6800 D2 kit, then my own 6502 based SYM-1. Yes, assembly language was a lot simpler assuming you had the luxury of access to an assembler. I recall hand assembling programs and entering the code byte by byte. It wasn't a choice between a compiled language and assembly language, it was a choice between assembly language and raw object code. The key difficulty with assembly language is the sheer number of lines needed to do the same as a couple of lines of a high level language.
@Alex-op2kc 3 years ago
Creel's back on his cubemaps!
@malusmundus-9605 10 months ago
I love this channel
@tchiwam 3 years ago
Would be fun to see a video on transforming locked multithread to lockless thread with a thread manager and completely lock less multithread manager.
@erwinmulder1338 2 years ago
I grew up programming home computers in the 1980s. You had to write assembly (and sometimes even translate it to numbers by hand) to make anything that would run faster than at a snail's pace. I mean, 8 bit computers at 3.5MHz are not incredibly fast at anything. So if you had BASIC, which was interpreted (not even compiled), that was SUPER slow. You couldn't even draw an entire screen in one second most of the time. These days, I mostly work with assembly in writing (toy) compilers for my own programming languages. In the end, what any compiler really does is basically translate the source code to assembler instructions.
@derzweistein8973 3 years ago
Where do I learn "everything that [I need to learn] about a computer" to gain significant speed in assembly? (especially the fun hardware stuff like OoO execution, loop streaming, different execution engines)
@WolfCoder 3 years ago
The only time I've written assembly was for the 6502 (because it's fun), the Z80 clone in the Gameboy (because it's fun and the only compiler I found was terrible and couldn't handle ROM paging well, etc.) and the ARM7TDMI in the GBA where, while there's a port of gcc for it, you still have to write assembly for heavy duty subroutines like interrupts, audio engines, etc. as the compiler optimizations don't seem to work as well in the gcc port. For x86-64 though? Uh.. I think I'll let the compiler have the 'fun' when it comes to that.
@dcocz3908 3 years ago
I agree but there are lots of situations where the compiler simply fails for example gnuarm won't use multiple load and store properly which for me generated a lot larger code that wouldn't fit in SRAM so it had to run with wait states from flash on my project. By re-writing it in hand assembly allowed me to get a much smaller function, allowing it to be moved into SRAM with the data that was required by application and that is where I got a really large speed improvement. I couldn't have done it without swapping micro for larger memory footprint using just compiler
@rjones6219 11 months ago
Assemblers and machine code is where I did all my programming. Obviously writing in assembler takes more time than a higher level language. But the code space can be more efficient.
@RT55J 3 years ago
The effectiveness of unrolling loops as a performance optimization can vary wildly depending on the caching situation. If your architecture has no cache to worry about, then it would give a definite performance boost. However, if you have an instruction cache to worry about, then (depending on the size of the unrolled loop vs the cache) you might suffer a performance decrease from the extra instruction fetching from RAM.
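For readers who have not seen it spelled out, this is a minimal sketch of what unrolling by four looks like in C. The function names are hypothetical, and whether the unrolled version wins depends on the factors the comment lists (cache size, loop body size) as well as on whether the compiler already unrolls the simple loop on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain loop: small code, one loop-control check per element. */
void scale_rolled(int32_t *v, size_t n, int32_t k) {
    for (size_t i = 0; i < n; i++) {
        v[i] *= k;
    }
}

/* Unrolled by four: one loop-control check per four elements,
   at the cost of a larger loop body (more instruction bytes to fetch). */
void scale_unrolled4(int32_t *v, size_t n, int32_t k) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v[i]     *= k;
        v[i + 1] *= k;
        v[i + 2] *= k;
        v[i + 3] *= k;
    }
    for (; i < n; i++) {   /* leftover elements when n is not a multiple of 4 */
        v[i] *= k;
    }
}
```

The trade-off is visible in the source: the second function does the same work with fewer branches but takes several times as many lines, and the machine code grows in the same way.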
@nordgaren2358 2 years ago
Hey, Creel! Was curious what you mean by the overhead over jumping to assembly? Everything compiles down to assembly, so I am curious why the overhead? Thanks for the videos! Cheers!
@WhatsACreel 2 years ago
Oh, just meant calling the function. Often the compiler will inline functions, but if you use ASM, then it can't. It's going to be the time to set up the stack, pass parameters, and the jump to the function itself. Hope this helps, and thanks for watching :)
@nordgaren2358 2 years ago
@@WhatsACreel Yea, that's what I was guessing, but I figured I'd confirm! Thanks again, Creel!
@xeridea 2 years ago
Older compilers were known for being slow, and assembly was often used, especially in early consoles. Modern compilers are highly optimized. Besides all the basic stuff, they have all sorts of tricks for optimizing multiply, divide, and what instructions to use, even specific to CPUs if you want. Sometimes CPUs have weird quirks that compiler developers can take advantage of, or at least avoid penalties. Optimizing multiply and divide goes beyond obvious stuff, like bitshifts for powers of 2, they have all sorts of tables for methods for various numbers. Often they can even convert loops into SIMD instructions automatically. If not, doing SIMD completely manually is very tedious, there are methods available in some lower level languages to make it a bit easier. Some things can still be hand optimized, but requires very in depth knowledge of CPUs, and even then, may not even be faster. For most purposes, not worth it, though some low resource embedded systems, some drivers, and some other niche cases benefit.
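Two of the multiply/divide tricks mentioned above can be written out by hand in C. This is a sketch of well-known techniques, not a dump of any particular compiler's output; the function names are hypothetical, and the constant 0xCCCCCCCD is ceil(2^35 / 10), which gives exact results for every 32-bit unsigned input.

```c
#include <stdint.h>

/* Unsigned division by a power of two: x / 8 is a right shift by 3. */
uint32_t div8_shift(uint32_t x) {
    return x >> 3;
}

/* Unsigned division by 10 without a divide instruction:
   multiply by ceil(2^35 / 10) = 0xCCCCCCCD in 64 bits, then shift right by 35. */
uint32_t div10_mulshift(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

Writing plain `x / 10` in C and letting the optimizer pick the constant is usually the better choice; the hand-written form mainly shows why a division by a constant often compiles to a multiply and a shift.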
@scowell 2 years ago
If you go to the trouble of having a big condenser mic with popscreen, why use the camera audio? Also, please explain to me the 'overhead' of inline assembly... I thought that there was no state-save, the compiler/linker just inserted the ASM.
@y2ksw1 3 years ago
I have been programming in Assembly for a large part of my life, and the most challenging task was to write code in such a way that it runs in parallel in the separate pipelines (superscalar). The example you have given would have been rewritten, possibly longer, in order to get the parallel mechanism working. One way would be:
mov ebx, eax
inc eax
nop
inc ebx
So the first two run together, and then the remaining two. And we would gain at least 2 clock cycles. However: assembly made a lot of sense in the old days. Now, with multi-core multi-scalar processors and the brilliant optimisation of compilers, Assembly code has pretty much died out. I still use it on special hardware though. I am eyeballing the Raspberry Pi Pico, for example 😊
@OpenGL4ever 10 months ago
inc eax
mov ebx, eax
Does the same job as your code and requires less RAM.
@y2ksw1 10 months ago
@@OpenGL4ever It's not a question of memory, but to get part of this code running in a different pipeline and thus double up the speed.
@y2ksw1 10 months ago
Your code would run 4 times slower
@OpenGL4ever 10 months ago
@@y2ksw1 Why should it? In my opinion it runs at the same speed. Your code might do
mov ebx, eax
inc eax
in its own pipeline, but
nop ; does nothing
and
inc ebx
depends on the mov ebx, eax before.
@y2ksw1 10 months ago
@@OpenGL4ever If you first do an operation on eax and then assign its value to another register, it stalls and waits to settle, just that tiny bit which doesn't allow the code to move to the other pipeline. I have been timing these instructions very accurately, and your assumptions, while technically correct, perform far less efficiently. In time-critical applications, such as the real-time graphics manipulation I was working on, code alignment and sometimes illogical reordering of instructions made the difference between fluent and stuttering graphics. I mainly got the filter and render code prepared by graphics specialists, and my task was to speed it up. But also big-number mathematics and operating system libraries. Most of them grew noticeably in size, but were of unmatched speed.
@wingman2tuc 2 years ago
Modern CPUs also have "deep" pipelines: fetch -> decode -> exec -> mem access -> writeback, as a very simple example. Today's CPUs can have 20 to 40 steps for completing a single instruction. Things can be pipelined, but you need a very intelligent and complicated forwarding unit and branch predictor in order to take advantage of pipelines. Understanding modern CPU architecture is a must in order to use ASM efficiently. Also, ASM can be CPU-specific, so it may not work on other CPUs.
@christophergreeley4880 3 years ago
What would you say of using LLVM as an assembly language. I'm told you can not use it as a "compile once run anywhere" language, but might it give you the benefits of asm without as much work because LLVM bytecode can automatically be optimized? You can get hacky cool optimizations with it and the help of the compiler for performance. Also can LLVM bytecode overwrite itself? I know a lot *about* this stuff but haven't actually written much assembly.
@kvdrr 2 years ago
You should check out LLVM lifters.
@kylegivler8372 3 years ago
Thanks for sharing this 😁
@den2k885 2 years ago
Compilers optimize very well... for general purpose code, without knowing its data layout. It's very unlikely that a compiler will use SIMD instructions, and in the rare cases it does, it won't make use of the inner characteristics of your problem, as it has no knowledge of them. Using Assembler I managed to double a linear Sobel algorithm's performance and triple a segmented integral table algorithm's performance. Not even the Intel compiler managed to equal those times.
@ug333 3 years ago
Great information, great knowledge Side note: what's up with the audio?
@pugboi8017 3 years ago
what is dis? I’m so glad i got recommended this channel. The coding gsus is australian
@brorelien8447 3 years ago
14:43 I partially disagree with you on this point. Some processor like the 6502 has a little instruction set which can be easily learn (only around 56 instructions). I know an 8 bit CPU can't really be compared with a modern x64, but some embedded CPU still uses these simpler 8 bit instruction set. Otherwise I like the video.
@y2ksw1 3 years ago
Well, some 8 bit processors have a lot of instructions. Of course, if you group, then almost any processor has only a few: Add, subtract, multiply, divide, invert, move. That's about it. When I teach, I actually point out that most processors can only add and negate. They do it in a very efficient way though.
@NoNameAtAll2 2 years ago
RISC-V >_>
@amigalemming 9 months ago
15:45 I am too lazy to plan register usage myself, thus I use LLVM to generate real assembly code for me. But I inspect the results regularly in order to find weaknesses in LLVM or my code.
@GogiRegion 2 years ago
I’ve actually looked into virus programming, and commonly out of curiosity, and it looks like good hackers will use C and then compile to assembly for optimization, then assemble it. That’s assuming that you need high level functions in order to do what you need, you want it to take up as little space as possible so it’s harder to detect, and possibly want to remove null bytes (which is supposed to allow your code to work with a wider array of hacks since some rely on a lack of null bytes). It’s actually an interesting topic, and from what I was reading, it sounds like C is preferred over assembly for the same reason Linux is shown in primarily C.
@msoulforged 3 years ago
Great video!
@Cubinator73 3 years ago
15:49 I think you got something wrong there. Obviously, assembly is needed in all sorts of things like programming compilers and optimizing low-level routines. The "misconception" that "assembly language is no longer needed due to optimizing compilers" expresses the fact that your average programmer doesn't need to write assembly himself because far more competent people already did it and made their optimized routines available in the optimizing compiler. I myself only ever used assembly to explore how CPUs work and how compilers optimize stuff, but I never NEEDED to write my own assembly code for my own projects.
@lewiscole5193 3 years ago
That's nice ... OTOH being a former OS maintainer/developer, I used assembly a lot, not just because most of the OS was also written in assembly (which it was), but because it gave me control over data/code placement that no available compiler did/could, which was especially important in the bootstrap code I was responsible for the care and feeding thereof. And I suspect that's still true ... the hardware defines and uses data structures that I don't want/need a compiler guessing what sort of code should be generated for.
@WhatsACreel 3 years ago
Yes, I do wish that the proper position of ASM was expressed more clearly in computer science education. I was taught to fear the language during my degree, encouraged to neglect it entirely. Maybe it’s different in other institutions? I do not disagree entirely with the sentiment. But I do think it is skewed a little too far away from ASM. I think learning ASM for OS development or to understand the CPU are excellent applications! Cheers for watching and commenting folks :)
@lewiscole5193 3 years ago
@WhatsACreel I have no idea how ASM is being taught in schools these days, but back when I was a student -- just after the dinosaurs had been killed off by an asteroid -- there was no question that any non-impaired human could outdo a compiler in terms of generating fast/small code. The reason why you were supposed to use an HLL was because it increased programmer productivity. Studies had supposedly been done that showed that the average number of DEBUGGED lines of code that could be produced per programmer per day was about TEN (10), independent of programming language. And because each HLL statement typically turned into more than one ASM line, that meant that if you could use an HLL, you should, because you could potentially get more done using an HLL than you could with ASM, especially in terms of code that was supposedly "portable" across platforms. There were also supposedly studies that showed a wide variation in programmer output as well, and so YMMV, but familiarity with a particular language also had a lot to do with programmer productivity (I don't recall how much). The gist of this is that I usually write in ASM because that's what I'm most familiar with, and because I'm no longer getting paid for what I write, it's my choice. I can speak C if I have to, but I don't consider myself fluent and I simply don't see the need to spend time becoming more fluent in C when I can do what I want probably (?) faster in ASM. What bothers me is that people who shy away from using ASM seem to think that there's something fundamentally different in how you generate ASM code versus HLL code thrown at a compiler. To me, though, that's not the case. When I occasionally do write HLL code, I do the exact same thing that I do when I write ASM code, the only difference being how far "down" I "refine" the code before I come to a valid HLL or ASM statement.
I just don't understand what it is that makes people think there's something special when it comes to how to write ASM code versus HLL code. It makes me think that maybe too much time is spent teaching the structure of various HLLs and not enough on how to think and solve problems. Just my opinion ....
@WhatsACreel 3 years ago
@lewiscole5193 Ha! I know the feeling! I learned in the 90’s. Things have changed a lot since then. Especially Assembly language. It’s gone from maybe 100 instructions and 16 registers to massive SIMD register files and 3000 instructions! I certainly agree that programmer productivity and portability are very important. And the choice of language is a big part of that. Sometimes ASM is a good fit, and sometimes it is not. I do love how fast it can be, and how flexible. There’s some brain-melting, deep trickery that is natural to ASM, which is too low level to be practical in HLL’s. But for the most part, anything is pretty achievable in any language, and so it becomes a matter of choosing the best tool for the job. I couldn’t agree more! The problem with ASM is the perception of it. Folks shy away from it in a way that might not be warranted. It’s just a language, after all. IMHO, it’s a really fun and powerful language. I do love a good bit of HLL code too, but ASM will always hold a special place for me. If for nothing else, I made a video about ASM 10 years ago and put it up on YouTube, and have since built this little channel :)
@lewiscole5193 3 years ago
@WhatsACreel Ten years? My, how time goes by when you're having "fun".
@jp5000able 2 years ago
Back in the early 80's I did some 6502 assembly programming. What made it so difficult was that the CPU was only 8 bits. There were no instructions for 16-bit numbers or floating-point numbers.
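To make the 8-bit point concrete, here is a rough C sketch of what a 16-bit addition looks like when only 8-bit adds and a carry are available, which is roughly the shape of a CLC / ADC / ADC sequence on a 6502. The names `u16_pair` and `add16_via_8bit` are made up for illustration and are not from the video or any library.

```c
#include <stdint.h>

/* Hypothetical type: a 16-bit value kept as two separate bytes,
   the way an 8-bit program would hold it in two memory locations. */
typedef struct {
    uint8_t lo;  /* low byte  */
    uint8_t hi;  /* high byte */
} u16_pair;

/* Add two 16-bit values using only 8-bit additions plus a carry.
   The result wraps modulo 65536, like the hardware would. */
u16_pair add16_via_8bit(u16_pair a, u16_pair b)
{
    u16_pair r;
    unsigned sum_lo = (unsigned)a.lo + b.lo;  /* 0..510 */
    unsigned carry  = sum_lo >> 8;            /* 1 if the low byte overflowed */

    r.lo = (uint8_t)sum_lo;                   /* keep the low 8 bits */
    r.hi = (uint8_t)(a.hi + b.hi + carry);    /* high bytes plus carry, wraps at 256 */
    return r;
}
```

Multiplication, division and floating point had to be built up from small steps like this, which is a large part of why it felt like so much work.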
@davidliverman4742 2 years ago
Thanks dude!!
@connclark2154 3 years ago
I think one thing that wasn't mentioned is that assembly allows you flexibility that higher level languages do not. With this flexibility you can implement more efficient algorithms. For example, between assembly routines you can return more than one value from a function by using a custom calling convention. It's the ability to leverage these freedoms that gives assembly its power and performance.
@bigshrekhorner 9 months ago
That's not something exclusive to Assembly. C is able to do this by using pointers as function arguments. Even higher level languages are able to do this by using tuples that mix types (or simply the same type), or with methods similar to C's, if they allow memory management concepts like pointers. Compilers and compiler engineers are extremely smart and definitely way smarter than me or you. That means that if you have thought of an efficient implementation of an algorithm in Assembly, it's also pretty likely the compiler engineers have thought of it and implemented it. At least if we are talking about mainstream compilers, like GCC or Clang (for the case of C/C++).
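For anyone following along, a minimal C sketch of the two usual ways to hand back more than one value: pointer arguments, and a small struct returned by value. The names `divmod_out`, `divmod_struct` and `divmod_result` are hypothetical, chosen only for this example. Whether the struct travels in registers or through memory depends on the platform's calling convention, so treat any performance comparison with a hand-written convention as something to measure rather than assume.

```c
#include <stddef.h>

/* Style 1: the extra results come back through pointer arguments.
   Returns 0 on success, -1 on a zero divisor or a NULL pointer. */
int divmod_out(int num, int den, int *quot, int *rem)
{
    if (den == 0 || quot == NULL || rem == NULL)
        return -1;
    *quot = num / den;
    *rem  = num % den;
    return 0;
}

/* Style 2: pack both results into a small struct and return it by value. */
typedef struct {
    int quot;
    int rem;
} divmod_result;

/* The caller must make sure den is not zero. */
divmod_result divmod_struct(int num, int den)
{
    divmod_result r = { num / den, num % den };
    return r;
}
```

A call such as `divmod_struct(17, 5)` gives a result with `quot` equal to 3 and `rem` equal to 2.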
@emjizone 9 months ago
3:53 This "one instruction per cycle" might be true for the oldest machines, with no clever vectors and lookups and with a very limited set of instructions. This might explain why people believe it to be still true today. In that case you'd have to program most of usual math functions yourself (modulo, square root, etc…) and they would take several cycles anyways.
@furyzenblade3558 3 years ago
Great Video!