Some folks are asking about the rotate *left* instruction - wouldn't that have the same problem? I forgot to mention it in the video, but the 6502's rotate left is a little weird. It just adds the number to itself, with the carry coming in and going out just like in a regular add (ADC) operation. A pipeline delay takes care of the rest.
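To make the "adds the number to itself" point concrete, here is a minimal sketch in Python. It only models the arithmetic, not the actual circuit or the pipeline delay, and the function names `rol` and `add_to_self` are made up for illustration.

```python
def rol(value: int, carry_in: int) -> tuple[int, int]:
    """Rotate an 8-bit value left through the carry; returns (result, carry_out)."""
    wide = (value << 1) | carry_in
    return wide & 0xFF, (wide >> 8) & 1


def add_to_self(value: int, carry_in: int) -> tuple[int, int]:
    """Binary add of a value to itself plus the carry, the way an ADC would do it."""
    total = value + value + carry_in
    return total & 0xFF, (total >> 8) & 1


# The two agree for every 8-bit value and both carry states.
assert all(rol(v, c) == add_to_self(v, c) for v in range(256) for c in (0, 1))
```

Doubling a number is the same as shifting it left by one, so the adder's carry-in lands in bit 0 and its carry-out is the old bit 7, which is exactly what a rotate left through carry needs.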
Wow. A suit jacket. Was not expecting that. You look sharp! Great explanation that the "bug" was a feature or lack thereof. I have an old MOS KIM-1 dev board with the white ceramic package in a box somewhere. Been meaning to dig that out.
Is it a bug if it's intentionally masked out? Would you find defective circuitry if it's masked out? Or would you find blank spaces and tracks to nowhere if an intended instruction hadn't been resolved in time for the release date? All fascinating stuff though.
@@garyc3476 😵💫"Though". 😵💫 The designers left out one (or many!) nice-to-have instructions, some possibly decided in the last days due to lack of space. They documented no ROR instruction. The 6502A was complete and bug-free. When a new model 6502B came out, with identical specs and behavior except that one instruction was added, that was no bug either. Compared to the 1976 model, the 1975 model looked *AS* *IF* it were the new model with a defect introduced (undefined behavior instead of ROR). Nowadays, new instructions can be added (and bugs fixed) retroactively. The BIOS chip can update the CPU microcode after-the-fact, at power-up. Undefined instructions can trigger interrupts that call routines that emulate the missing instructions.
Knowing that these first processors, already complex with thousands of transistors, were designed on big sheets by *hand* is mind-blowing. Hats off to those people.
@@kirkhamandy I wasn't this deep into it but I was a PC board layout artist in the mid 80s for a communications company. It was all done with layout tape and rub-on transfer solder pads on clear mylar and then photo etched to copper clad boards. I have worked with rubylith, but that was for large scale art silk screening. The owner of the company contracted a programmer to attempt to create a PCB layout program to run on the fledgling IBM PC but he never got the "pathing" to work well enough to use it. I think that was when my eyesight started to blur. We used a lot of micro-miniature surface mount components dealing with microwave tolerances and I stupidly refused to use the desktop magnifier.
@@3DPDK I also did PCBs in the 80s. We used blue and red tape for double sided PCBs. I used to get paranoid when I had just sent off a design to be manufactured and got home to discover a bit of red or blue tape stuck to my arm!
0:51 The ubiquitous ARM microprocessor also worked the first time. Designed by a team of 4 at Acorn Risc Machines (ARM) of Cambridge UK in 1985, it took just 1 week from sketchpad to first working silicon. Really impressive.
Well said, Peter. The ARM lives on in tons of mobile devices. That initial realisation of a working design was a truly remarkable achievement by some incredibly talented people.
This is what fascinates me about history. Stories come up that dominate the reality of what actually happened. We don't even have a clear idea of what happened 50 years ago let alone 2000 years ago.
@@matthiaslipinsky501 No, we don't. We know how the climate will be in 50 years, given we do X. But the climate and the weather are two different things. To stay in the same area, we cannot predict even a single raindrop, and yet we can predict that it will rain and how much. Single raindrops are much, much harder to predict than the weather, which in turn is much, much harder to predict than the climate. Also, "climate collapse"? The climate doesn't collapse. It just changes. Ecosystems, however, can easily collapse if it changes too fast. Oh, and you're not a denier for asking questions - you're a denier (and a conspiracy theorist) if you claim that the vast majority of climate scientists are somehow lying about it.
@@KaiHenningsen According to Greta's youth, the "climate has collapsed". And yes, if we now have a strong rain flooding houses built close to a river or in flood areas, that is not because we reduced the rivers' area, but because of "climate change". And sure, you come up with "denier" to stop any discussion for lack of arguments. And yes, all "experts" say this. If they didn't, they wouldn't be called experts, but deniers or conspiracy theorists or Trump supporters or whatever.
In the MOS manuals that came with my KIM-1, it was documented as not being available until after June 1976. An errata sheet had initially said it would be available in production quantities by May 1976. That said, my KIM doesn't have it. Back then we relied on vendor documentation and as we had no internet, I never heard this rumor.
Copy of the January 1976 manual with confirmation on p.150: archive.6502.org/books/mcs6500_family_programming_manual.pdf August 1975 data sheet which does not mention ROR: archive.6502.org/datasheets/mos_6501-6505_mpu_preliminary_aug_1975.pdf
I imagine the "confusion" is that the majority of the documentation for the 6502 shows the ROR instruction, so unless you have the original you'd think it didn't work. People are very bad at looking at notes on data sheets, and when those data sheets get used in manuals and articles it's details like this that get omitted. If you've been told there is a bug in the early version, people obviously didn't bother to look for the correct data sheet to check it out.
Coincidentally, many years ago I wrote a VHDL model of a microprocessor, and it actually did have a bug in the ROR instruction. It took a while to find it, because most of the code written that used the instruction didn't tickle it, but eventually I found what I thought was a bug in the program that turned out to be a bug in the microprocessor itself. I can see why the original designers didn't include it, as most of the code I've written for my own CPU rarely uses ROR in its original form. Instead, I typically force the carry bit clear in order to use the instruction as a right shift, since the instruction set didn't include shift instructions. In fact, I so rarely use ROR with the carry intact that I added a compile-time switch that effectively turns it into a logical shift right.
3:33 No. Joke. I was just reading 6502 Assembly books yesterday. And I found one that had a nice graphic that explained the difference between shift and rotate. It was a big "ah ha" moment for me. It is just funny to me to have it explained well twice within two days!
Awesome deep dive. I would argue the original run of 6502s behave in an unexpected way when running the ROR instruction, which is one of the definitions of a bug. (Even if the underlying reason is that the instruction isn't fully implemented in hardware.) I think this is the age-old argument over whether a missing feature in a released product is a bug or just a "missing feature", and there are many arguments both ways. A thought experiment could be: if MOS had not marked ROR "as not working" on the original datasheet, would this problem then be considered a bug? (Supposedly ROR was listed on the original datasheet but it said it would not work until 1976.) Either way, clearly it isn't a flaw in the silicon or design, and that was really cool to see.
@@TubeTimeUS correct.. it can't "fail" something it never had even if something else in the future had an instruction added it is not possible for the original to be buggy.
@@ingmarm8858- not always true. If you manufactured a product, but left out a feature that was expected to be present by the market, even if not in your specification, what would your rivals call it?
@@TubeTimeUS The January 1976 manual listed ROR as a regular instruction, with "(available after June 1976)" in parentheses. Admitting there is a bug without admitting it...
WOW! That was really cool. I can't believe that I have been working with computers since 1982 and this is the first time I have seen what the circuits look like under a microscope. And I loved the circuit diagram explaining the physical circuit. Thanks so much!
Great video! I remember Adrian Black talking about this ROR thing awhile back on his youtube channel, but he didn't really get into the story like you did. Always enjoy seeing one of your projects or on Curious Marc's channel lol
If it's true that there were no plans to implement ROR, then why is there a perfect spot in the opcode decodes for ROR, right alongside ASL, LSR, and ROL? It may not be a bug, but it's not a missing feature either. It would be a planned but unimplemented or cancelled feature.
That's simply because the opcodes were designed to use very regular bit patterns, so as to be easy to decode. So for example, the shift direction is a specific bit in all the shift instructions. Then if you have three shift instructions, two which shift left and one which shifts right, then there's an obvious place for the second right-shifting instruction, and it's free because there is no obvious other instruction that would have fit there. It's completely what you would expect.
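For anyone who wants to see the regular pattern, here is a small Python sketch using the documented accumulator-mode opcodes ($0A, $2A, $4A, $6A). Reading bit 6 as "direction" and bit 5 as "rotate vs. shift" is just an interpretation of those four values, not a description of the real decode logic, and the `describe` helper is hypothetical.

```python
# Documented accumulator-mode opcodes for the four shift/rotate instructions.
SHIFT_OPCODES = {"ASL": 0x0A, "ROL": 0x2A, "LSR": 0x4A, "ROR": 0x6A}


def describe(opcode: int) -> tuple[str, str]:
    """Classify a shift/rotate opcode from two of its bits."""
    direction = "right" if opcode & 0x40 else "left"   # bit 6
    kind = "rotate" if opcode & 0x20 else "shift"      # bit 5
    return direction, kind


for name, opcode in SHIFT_OPCODES.items():
    print(f"{name} ${opcode:02X} {opcode:08b} {describe(opcode)}")
```

With three of the four combinations taken by ASL, ROL and LSR, the fourth slot ("right" plus "rotate") is the one ROR ended up in.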
@@KaiHenningsen There's definitely a lot of cheating going on in the instruction decoder which is why a lot of unused opcodes do things that similarly coded opcodes do. At the time, a true microcode ROM would have been as big as the CPU itself.
Really fun video. I recall the definition of "Bug" from my youth (the 1970s) - a bug is a feature you don't like. In this case, it's a missing feature. Thanks much for this video!
Next time we chat, I'm going to tell you in person what a superb job I think you did on this video...fantastic! So well done, and I learned a LOT by watching this! Thank you.
This was a very interesting video! Interestingly, in modern Atari 2600 programming, unimplemented instructions are actually sometimes used to save ROM space!
Would have loved to share this with my dad! He was able to meet Chuck Peddle way back when in the 70s and used the 6502 in a bunch of old printers of the era.
Really fascinating video, that finally makes sense of two conflicting stories I had heard. Shame it took YT 1 year to tell me the video existed. Also thanks for the explanation of rotate left in the comments.
I think people consider it a bug because the 6501 was supposed to be a drop-in replacement for the MC6800. The MC6800 has ROR while the 6501 and early 6502s don't.
I remember one of the undocumented instructions. Some people ended up calling it LDAX which loaded both the accumulator and the X-register with the same value.
My preferred undocumented was DCP (DCM): DEC oper + CMP oper that combined a decrement and a comparison so it was very appropriate for loop counter to zero and saved a couple cycles in a loop.
I remember knowing a guy that designed ICs he had transparencies all over and overhead projector and a microscope. Talking to him it seemed like it was as much an art as a complex puzzle. He was an interesting guy. I'll never forget going to his house and seeing these diagrams and transparencies all over his house. I wonder how many actually know how to do this today. It seems like it's a very small few that actually do this now.
It's all done on computers now, but for analog and critical digital circuits, it's drawn out by hand. Most logic is laid out with APR (automatic place and route). A lay out person could do it the old way for an old chip, but new chips have many layers and the masks are 'complicated'.
That's exactly how my house looked, back in my early R&D days. Stuff pinned to the walls, green lined code listings spread across the floor. I had a very understanding wife...
When I was 17 I wrote down all the 6502 op codes. That way I found missing assembly instructions that I immediately tried, hanging my cpu quite a lot. Fun times!
I did the same thing when I was 14! I used the built-in assembler (call -151) to list the assembly instructions so I could figure out what each one did. It was so fascinating.
I was a tech at CalComp in the early '70's. They built drum and flatbed plotters. One of the test plots we would run was called "IC Mask on Strippable Film" (RubyLith)
Oh wow, I enjoyed this quite a lot. I learned on an RCA 1802 (bc I couldn't afford an Apple ][ or Atari 400 / 800 as a kid) but graduated to the 6502 (and later 68000) a few years later. Though these days I do mostly firmware / embedded and mobile, it's _quite_ nice sometimes to travel back to the early days like this 🙂
I remember reading in BYTE magazine back then that a reader had discovered that Microsoft BASIC floating point performance on the Commodore PET (IIRC) was inferior to some other computer with the 6502. He traced it to the floating point math using seven rotate lefts (or some such, I forget the exact details and won't bother to get out a pen and paper) instead of a rotate right. When confronted with that, Microsoft (a small company back then) responded that in some early 6502 chips the rotate right 'did not work correctly'. I think this can be (one) reason why or how this rumour started.
Hehe, no wonder you prefer doing threads on microblogging platforms to YouTube videos... the production quality of this is off the charts! What microphone are you using? Excellent explanation. I learned something this morning :)
Good to know my mid 80's VIC20 had a non buggy processor in it! Also kudos for making such a high quality yt video explaining all these details. Liked and subscribed (and commented - anything [free] I can do to boost your channel)
I wish you would do more videos. Sometimes, in interviews, the interviewers will talk over you just as you are about to say something interesting and they change the subject. This is not inherently terrible, it's their show they are running but I would like to see more of this type of video where you are allowed to speak everything without being interrupted and with you in full control of the topics.
Way cool explanation and video. As is often joked about MS bugs 'Its a feature' When I was a teen working with the 6502 in my Atari 800, I always thought it would be somewhat simple to make a "monster 6502" out of 7400 series logic gates and wanted to make one. The closest I've come across and built was Ben Eater's 8 bit breadboard CPU/computer though my version of his project has all registers 8 bit instead of some registers only 4 bit with instruction code/operand sharing the same byte split into two 4 bit sections.
2:03 - "And at this price an enterprising young engineer" It's Woz, right? 2:21 - "That engineer was Steve Wozniak" I mean was that supposed to surprise us?
Ah, Rubylith. The reason masks in Photoshop are red. Fascinating story. Things can get missed when cutting down a complicated project in a hurry. Congratulations on your Monster 6502. Does it run fast at all? Pretty lights!
This was a very well-made video. I think the only question I have in my head after watching is why did Chuck think he didn't need the ROR instruction? Is there a reasonably efficient way to get the same result? I look forward to more videos from you. Thanks.
I remember working on the 6502 back in the day (circa 1983/4) and we found a bug in the chip. I don't recall the exact nature, but it involved a problem when an address was in an operand and the operand went across a page boundary. Basically, when it was fetching the third byte (so the second byte of the operand) it didn't increment the upper byte of the address, resulting in the byte being read from a page lower than where it should. Very obscure, almost impossible to spot / reproduce and I seem to recall losing weeks of development trying to track down a bug in our code that wasn't a bug in our code! The fix was to put in a NOP or two to prevent the operand spanning the page boundary. Fun times....
"Note: If you get clocks for less than $5, buy the mcs6501 and give your purchasing agent a bonus." This ad was/is absolute gold! Sadly for my purchasing agent I still will not be paying him as the alarm clocks he buys are over $10. Not sure what alarm clocks have to do with microcontrollers... also makes me wonder why I have the same purchasing agent buying me microcontrollers and alarm clocks, could it possibly be because he is me? Nah, what would that have to do with things?
It was intentional at one level (they made no effort to put it in). It was unintentional at a higher level (the market required it). The 6502 was designed to implement the minimum set of instructions that the market wanted to see in a cheap processor. On the whole, I call this a bug. If they had labelled the fixed cpu mos6503, then it was not a bug.
I think Woz could actually get a 6800 for free from HP as an engineer there to use even on personal projects. The Apple I board has straps that allow that processor to be used instead of a 6502, though I haven't heard of anyone actually doing that. But since he wanted other people to build his design, the price of the CPU would matter quite a bit to them.
Why is a register called a register? Is it because of the everyday meaning where you bring two things together and adjust one to match the other, then allow them to separate keeping the information? Eg, put face to name at conference registration, match water levels in a river lock, match diagram orientation via registration marks, and therefore, match voltage levels especially similar to a river lock?
I recall an ACTUAL silicon bug in the original 8080, possibly fixed in the 8080A, but the AI robot Overlord couldn't find any details about it: "The 8080 and 8080A microprocessors were two closely related microprocessors developed by Intel in the 1970s and 1980s. The 8080A was an enhanced version of the 8080, with some minor improvements and bug fixes. The 8080A had a faster maximum clock speed than the 8080, with a limit of 3.125 MHz compared to the 8080’s limit of 2 MHz. The 8080A also had a few additional instructions that were not present in the 8080." Also: "There was a bug in the 8086 microprocessor, which was released in 1978. The bug caused the processor to return incorrect results for certain division operations. Intel fixed the bug in later versions of the 8086." There is a very fine line between a feature and a bug, but those were bugs.
Excellent video, Eric! And excellently illustrated. Especially the info about the layout process was interesting. Oh I hope that manufacturing of very small gate count, small volume custom ICs will become viable in the future and we will see people producing new 6502s and similar in tiny packages. :)
Has anyone ported the reverse engineered 6502 to a modern place and route system? It'd be interesting to see how much better it could do on area. I guess you'd want to model the original process to get a fair comparison.
there are bugs in the 6502 that carried through the later 6510's used on the C64 such as the indirect jump bug. Electronic Arts used the X2 crash "bug" as part of their copy protection on certain games.
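The indirect jump bug mentioned here is widely documented: on NMOS parts, JMP ($xxFF) takes the high byte of the target from $xx00 instead of the start of the next page. A minimal Python sketch of that behavior follows; the function name, the `nmos_bug` switch and the memory contents are all made up for illustration.

```python
def jmp_indirect_target(memory: bytearray, pointer: int, nmos_bug: bool = True) -> int:
    """Return the address JMP (pointer) would jump to."""
    lo = memory[pointer]
    if nmos_bug:
        # Only the low byte of the pointer is incremented, so $xxFF wraps to $xx00.
        hi_addr = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    else:
        # Corrected behavior: the full 16-bit pointer is incremented.
        hi_addr = (pointer + 1) & 0xFFFF
    return (memory[hi_addr] << 8) | lo


memory = bytearray(0x10000)
memory[0x12FF] = 0x34   # low byte of the intended vector
memory[0x1300] = 0x56   # intended high byte, on the next page
memory[0x1200] = 0x78   # what the buggy fetch reads instead

print(hex(jmp_indirect_target(memory, 0x12FF, nmos_bug=True)))   # 0x7834
print(hex(jmp_indirect_target(memory, 0x12FF, nmos_bug=False)))  # 0x5634
```

The usual workaround was simply to never place an indirect jump vector at the last byte of a page.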
Like the show. I had no access to the rev A back then so I did not know about the rumor. I started with the 8080 that I bought from Radio Shack for under $20. My main issue with hand assembly of 6502 code was with relative jumps. You had to make sure it didn't cross a page boundary and you didn't necessarily know the location as you typed it. Luckily most of the linkers could catch it, so as long as you were not doing it by hand, no problem.
I'm pretty sure relative jumps could cross page boundaries, but they could not jump further than one signed byte. Crossing a page boundary may take an extra cycle, not sure about that.
@@ronald3836 I was referring to the NMOS version of the part. Relative jumps did not increment the page when reading the second byte, so the jump was wrong if it split the page boundary. I believe it was corrected with the CMOS version. If you were using an assembler with an updated linker it would not allow it to happen. You can find a better description online. I only encountered it while hand debugging (hand assembly).
@@user-fr3hy9uh6y Interesting! My experience is with the 6510 from the C64, but that is NMOS. I thought the 6510 was essentially identical to the NMOS 6502. The C64 programmer's reference guide confirms the 2/3/4 cycles for branch not taken/taken/taken+crossing. So if relative jumps could not cross a page boundary in the early versions, I suppose it was fixed quite early on? (I doubt the 6510 really is different here, but I could be wrong.)
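Here is a rough Python sketch of the 2/3/4 cycle rule quoted from the C64 programmer's reference guide, together with the signed one-byte offset. It assumes the page crossing is judged between the address of the instruction after the branch and the target; the `branch` helper and its arguments are hypothetical.

```python
def branch(pc: int, offset_byte: int, taken: bool) -> tuple[int, int]:
    """Return (next program counter, cycles) for a 2-byte relative branch at pc."""
    next_pc = (pc + 2) & 0xFFFF          # address of the following instruction
    if not taken:
        return next_pc, 2
    # The operand is a signed 8-bit offset: $80-$FF mean -128..-1.
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    target = (next_pc + offset) & 0xFFFF
    crossed = (target & 0xFF00) != (next_pc & 0xFF00)
    return target, 4 if crossed else 3


print(branch(0x1000, 0x10, taken=False))  # (4098, 2)  -> $1002, not taken
print(branch(0x1000, 0x10, taken=True))   # (4114, 3)  -> $1012, same page
print(branch(0x10F0, 0x20, taken=True))   # (4370, 4)  -> $1112, crosses a page
```

So in this model a branch can land on another page, it just costs one more cycle and can never reach further than -128 to +127 bytes from the next instruction.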
I'm still confused. The ROL instruction would require a similar latching. How did it work since it presumably did? Also, you mention that the original ROR lines weren't tied to transistors and that your best guess is that they originally invoked a completely different instruction that was ripped out but that the lines themselves weren't. How is it easy to rip out the desired logic from those lines but not the lines themselves? One final comment: There was a 6510 instruction that has a minor flaw in its addressing. I think it was STA [-,X] which had issues when x forced an address lookup across a 256-byte boundary. Was this 6510-only?
I'll never get tired with stories about this CPU. After all, as a kid I spent tons of time programming it and learning how to organize larger ideas into smaller parts encoded with these ascetic commands. What was it - only 56 commands and (impressive) 13 addr modes?
If the ROR instruction had never been added and documented, there would be no reason to call it a bug. But as it is, the very first 6502 cpus cannot run all of today's 6502 programs. For all intents and purposes, they are bugged/broken.
@@ronald3836 That's the catch -- a lot of older software relies on undocumented instructions. Since you said the first 6502's cannot run all 6502 programs --- I'm just saying modern 6502's can run all 6502 programs.
You forgot to mention: Although 6502 programs could be written without ROR using different accommodation depending on the situation, some programmers used a standard sequence of instructions that did the same thing as ROR, probably expanded from an assembler macro named "ROR" -- it was simple to use, but bloated the machine code slightly with the unusual "multi-byte instruction". Programs that used the sequence could run on 6502A or 6502B, but were slower; programs that used the ROR instruction could run only on 6502B, but were faster. Some people pointed out a way to patch their programs, replacing the work-around code with the new instruction, to increase speed (though not reduce program size). I have not seen the ROR sequence nor the typical code that was put in its place (probably something like ROR NOP NOP NOP?). What code sequence did they use to replace ROR?
6502/6510 ROR and ROL rotate the carry in and rotate out to the carry; it's not a direct circular rotate as on some MCUs. E.g. bit 0 does not move to bit 7; the carry is more like an extra 9th bit. Weird that they included LSR instead of ROR, as LSR could have been simulated with a CLC.
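A quick Python sketch of the difference, for illustration only (the function names are made up): `ror` models the 6502-style rotate through carry, `circular_ror` models the 8-bit circular rotate some other chips have, and the last check shows that a rotate right with the carry cleared first gives the same result as LSR.

```python
def ror(value: int, carry_in: int) -> tuple[int, int]:
    """Rotate right through carry: carry goes into bit 7, bit 0 goes to carry."""
    return (carry_in << 7) | (value >> 1), value & 1


def lsr(value: int) -> tuple[int, int]:
    """Logical shift right: 0 goes into bit 7, bit 0 goes to carry."""
    return value >> 1, value & 1


def circular_ror(value: int) -> int:
    """Plain 8-bit circular rotate: bit 0 moves straight to bit 7."""
    return ((value & 1) << 7) | (value >> 1)


print(ror(0x01, 0))               # (0, 1): bit 0 went to the carry, not to bit 7
print(hex(circular_ror(0x01)))    # 0x80

# CLC followed by ROR behaves like LSR for every 8-bit value.
assert all(ror(v, 0) == lsr(v) for v in range(256))
```

Because the carry acts as a ninth bit, it takes nine rotates through carry (not eight) to get the original byte and carry back.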
I'm glad you cleared this up. I hope Adrian's Digital Basement issues an apology to Chuck Peddle for his video "This 6502 processor has a hardware bug"
Surely a lot of that circuitry should be shared with ROL, since that instruction has the same issue with carry as both input and output? Or is ROL implemented differently, for example by using the adder to do ADC A, A, so no need for it? Some RISC load/store architectures omit a rotate left because it can be simulated with the adder. But 6502 rotates can write to memory as destination, whereas ADD cannot.
Interesting, the undertone is also that it was a good candidate for the cost-reduction chopping block because it had some complexity coming with it. Thanks for the explanation. I guess modern CPUs check their instructions and all have actual illegal instruction errors (I assume it comes with the capability bits, and other CPU variants)? And said instruction checking subsystem is more complex than an entire 1975 CPU?
We just started studying the NIOS II architecture designed to be run on FPGAs. This architecture has the ability to raise an exception if an instruction isn't implemented (common being multiply and divide). If the exception handling part is written properly, then the exception handler can be used to emulate the missing instruction without the original program knowing it doesn't exist. I'm not sure about other architectures though.
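The trap-and-emulate idea can be sketched with a toy machine in Python. Nothing here is Nios II specific: the opcode names, the `hardware_execute` function and the handler table are all invented for illustration, and the software multiply assumes a non-negative second operand.

```python
class IllegalInstruction(Exception):
    """Raised by the toy 'hardware' when it meets an opcode it does not implement."""


def hardware_execute(op: str, a: int, b: int) -> int:
    """The operations this toy CPU implements directly."""
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    raise IllegalInstruction(op)


def software_mul(a: int, b: int) -> int:
    """Emulate MUL with repeated hardware ADDs (assumes b >= 0)."""
    result = 0
    for _ in range(b):
        result = hardware_execute("ADD", result, a)
    return result


# Exception handlers that stand in for missing instructions.
TRAP_HANDLERS = {"MUL": software_mul}


def execute(op: str, a: int, b: int) -> int:
    """Run an instruction; fall back to an emulation routine on a trap."""
    try:
        return hardware_execute(op, a, b)
    except IllegalInstruction:
        handler = TRAP_HANDLERS.get(op)
        if handler is None:
            raise  # genuinely unknown instruction
        return handler(a, b)


print(execute("ADD", 2, 3))  # 5
print(execute("MUL", 6, 7))  # 42
```

The calling program just asks for MUL and gets an answer; it cannot tell whether the hardware or the handler produced it, only that the emulated path is slower.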
I'm not a silicon/transistor designer, just a coder with rudimentary knowledge in logic design. What I wonder is why there needs to be a copy of the carry bit? If you were to design a shift register using discrete logic you would use a D-latch for every bit. You don't need extra latches for the other bits, so why do you need one for the carry, which is just another latch? You just route the carry out to the first latch in, that should do it. If you want to implement a shift right, just add a mux to the input of the first latch so that it takes in either a zero or the carry bit. Obviously with pure transistors things are slightly different, but still I wonder about this explanation of a 'copy of the carry' bit ...
Indeed, if they can shift bits 0-7 into 1-7 and c in one clock, then surely the same can be done with c into 0. ROR applied to the accumulator is two clocks, and probably one is for instruction decoding. 2 is minimum for the 6502.
Is this also the reason the 6502 and 6510 have the instruction LSR (logical shift right) and ASL (arithmetic shift left)? The latter shifts bits into bit 7 which is the sign bit, so yes, seems to be named on purpose.
Note that LSL==ASL but ASR!=LSR. The 6502 is missing the ASR instruction, and I guess the market did not complain about it (but did about the missing ROR).
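A small Python sketch of why the right shifts differ while the left shifts do not. The helper names are made up; `asr` is the instruction the 6502 does not have, modeled here on plain 8-bit values.

```python
def to_signed(value: int) -> int:
    """Interpret an 8-bit value as two's complement."""
    return value - 0x100 if value & 0x80 else value


def lsr(value: int) -> int:
    """Logical shift right: bit 7 is filled with 0."""
    return value >> 1


def asr(value: int) -> int:
    """Arithmetic shift right: bit 7 (the sign) is kept."""
    return (value & 0x80) | (value >> 1)


def asl(value: int) -> int:
    """Shift left: bit 0 is filled with 0, whether you call it ASL or LSL."""
    return (value << 1) & 0xFF


x = 0xF0                                    # -16 as a signed byte
print(to_signed(x), to_signed(asr(x)))      # -16 -8   (sign preserved)
print(to_signed(x), to_signed(lsr(x)))      # -16 120  (sign lost)
```

For a left shift there is only one sensible thing to put into bit 0, so the arithmetic and logical versions are the same operation; for a right shift the choice of what goes into bit 7 is what separates them.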
OK, LSR (Logical Shift Right) could have been used instead, but if you are dividing a 16 bit number with LSR, you need extra commands getting the carry flag into the 7th bit of the lowest byte after the shift operation, so not having a ROR instruction was really an issue. It may sound strange, but LSR seems less important to me than a proper ROR instruction.
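The 16-bit case looks like this as a Python sketch (function name made up for illustration). It mirrors the usual two-instruction idiom, LSR on the high byte followed by ROR on the low byte, with the carry passing the dropped bit between them.

```python
def shift_right_16(hi: int, lo: int) -> tuple[int, int, int]:
    """Halve a 16-bit value held in two bytes; returns (hi, lo, carry_out)."""
    # LSR on the high byte: 0 into bit 7, old bit 0 into the carry.
    new_hi, carry = hi >> 1, hi & 1
    # ROR on the low byte: carry into bit 7, old bit 0 into the carry.
    new_lo, carry = (carry << 7) | (lo >> 1), lo & 1
    return new_hi, new_lo, carry


hi, lo, carry = shift_right_16(0x01, 0x00)   # 0x0100 >> 1
print(hex(hi), hex(lo), carry)               # 0x0 0x80 0
```

Without ROR, the step that moves the carry into bit 7 of the low byte has to be spelled out with a branch and an OR, which is the extra work the comment is talking about.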
Yes, the LSR could have been simulated with a CLC. Simulating ROR would require a BCC to skip over the ORA #$80 line, but that would be a direct circular ROR, which the current 6502 ROR is not. So you would need to BCC before the LSR, to two different LSR sections, one that has the ORA #$80.
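The branch-before-LSR idea can be checked with a Python sketch. This models the described sequence step by step, it is not taken from any real program, and the function names are hypothetical. It relies on ORA leaving the carry flag untouched.

```python
def ror_reference(a: int, carry: int) -> tuple[int, int]:
    """What a real ROR does: carry into bit 7, bit 0 out to carry."""
    return (carry << 7) | (a >> 1), a & 1


def ror_without_ror(a: int, carry: int) -> tuple[int, int]:
    """ROR built only from a branch on carry, LSR and ORA #$80."""
    if carry:
        # Carry was set on entry: LSR, then force bit 7 on.
        a, carry = a >> 1, a & 1   # LSR puts old bit 0 into the carry
        a |= 0x80                  # ORA #$80 does not change the carry
    else:
        # BCC taken: a plain LSR is already the right answer.
        a, carry = a >> 1, a & 1
    return a, carry


# Matches the real instruction for every value and both carry states.
assert all(
    ror_without_ror(v, c) == ror_reference(v, c)
    for v in range(256)
    for c in (0, 1)
)
```

Testing the carry before the shift matters because the LSR overwrites it with the old bit 0; branching afterwards would give the circular rotate instead.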
this brought back a memory from 2014... the minecraft mod RedPower 2 has a brilliant 6502 emulator. i was writing a full C compiler IDE (using cc65, scintilla and some direct minecraft integration) to support it. that is, until i ran into a bug with the emulator's SBC instruction mishandling a status register flag.. i even wrote some code to check the compiled C assembly for the issue and inject some extra instructions for a work-around. it worked, but unfortunately meant that some of the 8-bit branch/jump instructions also had to be changed because the extra instructions could push the destination out of 256 byte range, so they would have to be changed to long jumps... got too complicated and i gave up lol.