This Function Destroys Programs: MS-BASIC's VAL() 

8-Bit Show And Tell
Subscribers: 56K
Views: 45K

There's a bug lurking in the VAL() function of most early implementations of Microsoft BASIC that has the power to corrupt your program. We demonstrate it on the Commodore 64 and VIC-20, but it's present on many other 6502-based machines, as well as Z80 and even 6809 computers as discovered by many helpful people on the internets. We then attempt to explain why the bug happens: it's the result of a kind of nasty hack using Microsoft BASIC's evaluation routine combined with the particular edge case when an overflow error occurs.
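
The mechanism described above can be sketched in a few lines of Python. This is a simplified model and not the ROM code: the program line is held as plain ASCII rather than tokenized BASIC, `MAX_FLOAT` only approximates the largest Commodore BASIC float, the parser accepts a whole number string instead of a numeric prefix, and the names `parse_number` and `buggy_val` are invented for illustration.

```python
MAX_FLOAT = 1.70141183e38  # approximate upper limit of a Commodore BASIC float

def parse_number(mem, addr):
    """Parse the zero-terminated number text starting at mem[addr]."""
    end = mem.index(0, addr)               # the evaluator stops at a zero byte
    value = float(mem[addr:end].decode("ascii"))
    if abs(value) > MAX_FLOAT:
        raise OverflowError("?OVERFLOW  ERROR")
    return value

def buggy_val(mem, addr, length):
    """Model of VAL(): terminate the string in place, evaluate, then restore."""
    end = addr + length
    saved = mem[end]                 # byte after the string (the closing quote)
    mem[end] = 0                     # temporarily overwrite it with a terminator
    value = parse_number(mem, addr)  # an overflow error leaves from here...
    mem[end] = saved                 # ...so this restore is never reached
    return value

# Hypothetical program line, stored as ASCII for readability.
line = bytearray(b'A=VAL("1E39"):REM SHOW BUG\x00')
start = line.index(b'"') + 1
try:
    buggy_val(line, start, 4)
except OverflowError:
    pass
print(bytes(line))  # the closing quote has been left behind as a zero byte
```

Because BASIC lines end at a zero byte, everything after the string now looks like it belongs to no line at all, which is the corruption shown in the video.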
The VIC-20 Quick Reference Handbook by Jeff Daniels: jeffdaniels.itch.io/vic-20-qu...
Tool Kit: BASIC archive.org/details/Compute_s...
Allen Huffman's blog: subethasoftware.com/2023/08/1...
All the X-Tweets:
/ 1692238191720600008
/ 1692313328209559923
/ 1693074790855102861
/ 1692300818815283262
/ 1692368840569851958
Closing song lyrics "Call An Awesome Superhero" by Robin's son, aged 5.
To support 8-Bit Show And Tell:
Become a patron: / 8bitshowandtell
One-time donation: paypal.me/8BitShowAndTell
2nd channel: / @8-bitshowandtell247
Index:
0:00 A bit about VAL()
2:31 + Addition or Concatenation?
4:42 10 A=VAL("1E39"):REM SHOW BUG
7:53 The VIC-20: VAL(TI$)
11:45 Tool Kit: BASIC Explanation
13:40 About the Overflow Error? 39 digits
15:14 VAL() needs a null-terminated string
18:30 Machine Language Monitor time
22:22 Thanks to my patrons and X-Twitter pals!

Category: Science

Published: Aug 3, 2024

Comments: 408
@PaoloBergamo 7 months ago
Where was YouTube 40 years ago when I needed it?
@MichaelDoornbos 7 months ago
That was great. I suspect it took quite a bit longer than 20 minutes to work that out. Even system creators in 1977 had to say “we better check for this, you just know some user is gonna try it”
@rocketman475 7 months ago
Doornbos = Thornbush
@MichaelDoornbos 7 months ago
@rocketman475 I'm aware.
@AndreasDelleske 7 months ago
Error culture wasn't a thing yet: whether the code would still fit into ROM was the main consideration.
@jnharton 7 months ago
@AndreasDelleske It's also worth noting that "crashing your computer" was less of a huge problem back then, especially when all you are doing is running one program at a time and everything else it relies on is in ROM. Still super annoying of course, but all you needed to do was hit reset to get back to a clean slate and the BASIC prompt.
@marisakirisame867 7 months ago
Yes, I did this too and it got corrupted.
@stevethepocket 7 months ago
Something interesting I finally realized, when you showed a reversed "@" being used to represent the ASCII 0 null character in the memory dump: The reversed symbols used by BASIC's quotes-mode to indicate control characters weren't chosen at random; they're derived by taking the PETSCII code, looking up the _screen_ code for that number, and adding either 128 (if it's low enough) or 64 (if it's not). This happens to work because all of the control codes are either under 32, or between 128 and 192. So you get normal ASCII characters for the former and key-front symbols for the latter. Clever of Super Snapshot's monitor to extend that scheme to codes that can't be typed.
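
The rule described above can be written as a small Python sketch. The function name `quote_mode_screen_code` is invented here, and the upper block of control codes is assumed to be 128 to 159.

```python
def quote_mode_screen_code(petscii):
    """Screen code of the reversed symbol shown for a control code in quote mode."""
    if 0 <= petscii < 32:
        return petscii + 128   # low control codes: same number, reversed
    if 128 <= petscii < 160:
        return petscii + 64    # high control codes (assumed range 128-159)
    raise ValueError("not a control code")

# PETSCII 0 (null) maps to screen code 128, the reversed "@".
# PETSCII 147 (clear screen) maps to screen code 211, the reversed heart.
for code in (0, 147):
    print(code, quote_mode_screen_code(code))
```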
@D0Samp 6 months ago
This reminds me of PHP authentication bypasses where password hashes starting with "0e" were compared using the == operator, which recognized both sides as the numeric value 0 with an exponent.
@GeoffSeeley 7 months ago
And this is why we build unit tests now. Thanks Robin!
@chromosundrift 7 months ago
... with exception flows!
@mudi2000a 7 months ago
Very interesting! As soon as you read from the book how the VAL function works I guessed correctly how the bug happens. I think this channel really offers the best in-depth videos about the old Commodore machines. You have such deep knowledge and also very good explaining skills!
@jasejj 7 months ago
The best implementation of the VAL statement I've seen is in Sinclair Spectrum BASIC. You can put full mathematical formulae into the string, and it will evaluate them as the program runs. So something like:
10 INPUT A$
20 FOR X=0 TO 256: PLOT X, VAL A$: NEXT X
and entering, say, X*2, or even ((SIN X)*80)+80 (to centre the plot on the screen vertically and expand it so it fills the screen), gives a simple program that will plot any mathematical function on the screen. Obviously there's no sanity checking and it will return errors for some graphs, but it's an insanely powerful tool which I don't believe was ever documented by Sinclair either. I believe what is happening is that Sinclair is re-using the BASIC interpreter's line parser to implement the VAL statement itself. This bug of course does not exist in ZX BASIC as the code is unrelated. Regarding the VIC's version of this bug, I wonder if it behaves any differently with an 8k memory expansion as the memory map changes.
@SalivatingSteve 7 months ago
Reminds me how in C++ you can feed system terminal command strings directly into cin.
@NuntiusLegis 7 months ago
On the C64 and other CBM machines, you can use programmed direct mode to enter formulae at run time.
@marksilverman 7 months ago
@SalivatingSteve Fun fact: you can also overflow memory in C++ 😅
@CartoType 7 months ago
I used the Sinclair VAL function to implement the cell recalculation for a spreadsheet program I wrote and sold back in 1982. I was very lucky and made enough in royalties to pay the deposit for my first flat.
@silkwesir1444 6 months ago
@NuntiusLegis Well it's not really run-time, is it? It's a clever cheat (using invisible text on the screen and poking stuff into the keyboard buffer manually), where it seems you are still in the program but actually you have stepped out for a moment. And if the user doesn't play nice with it, it all goes kaput.
@retrozmachine1189 7 months ago
The bug is present in the L2 ROM BASIC of the TRS-80 Model I/III. Some 40 years on from owning these machines, I'm still finding things out about them. Not that I really did a lot with BASIC back then.
@GadgetUK164 7 months ago
Very interesting bug! Always a joy to watch your deep dives on the C64 and VIC-20 =D
@jack002tuber 7 months ago
Fascinating. This shows the structure of BASIC in memory. There's a line number in there, there's a pointer to the next line, there's special data for the last line. Neat stuff. Type a program, go into the ML monitor and look around. It would be simple to store an ML routine right in BASIC to use, provided no one goes in and changes the program afterward.
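
The layout described above (a two-byte pointer to the next line, a two-byte line number, the tokenized text, a zero terminator, and a zero pointer after the last line) can be modelled in Python. This is a sketch under assumptions: the helper names `build` and `walk` are invented, the base address $0801 is the usual C64 value, and $99 and $80 are used as the PRINT and END tokens.

```python
import struct

BASE = 0x0801  # usual start of BASIC program memory on the C64

def build(lines, base=BASE):
    """Lay out (line_number, body_bytes) pairs the way BASIC stores them."""
    image = bytearray()
    addr = base
    for number, body in lines:
        next_addr = addr + 4 + len(body) + 1   # link + number + body + terminator
        image += struct.pack("<HH", next_addr, number) + body + b"\x00"
        addr = next_addr
    image += b"\x00\x00"                        # a zero link marks the end
    return bytes(image)

def walk(image, base=BASE):
    """Follow the link pointers and return [(line_number, body_bytes), ...]."""
    lines = []
    offset = 0
    while True:
        (link,) = struct.unpack_from("<H", image, offset)
        if link == 0:
            return lines
        (number,) = struct.unpack_from("<H", image, offset + 2)
        end = image.index(0, offset + 4)        # body runs up to the first zero
        lines.append((number, image[offset + 4:end]))
        offset = link - base

program = build([(10, b'\x99 "HI"'), (20, b"\x80")])
print(walk(program))

# Overwrite the closing quote of line 10 with a zero, as the VAL() bug does.
damaged = bytearray(program)
damaged[4 + 5] = 0
print(walk(bytes(damaged)))  # line 10 is cut short; line 20 is still reachable
```

In this model the link pointers still lead to line 20, while the body of line 10 ends early at the stray zero.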
@renakunisaki 7 months ago
I was imagining hiding a copyright string...
@whitslack 7 months ago
Another of Robin's recent videos explicitly described the structure of BASIC lines in memory. I think it was the one about why the number of BASIC bytes free shown in the startup banner is exactly the number it is.
@skilletpan5674 7 months ago
Yes, as a 10 or 11 year old I once wrote a simple Applesoft parser. You could usually find a token list or table that had the number of each token (keyword), and with a little more research or messing around you could find out the starting address of BASIC and start to work out how to decompile it. Sometimes you'd want to move the address of your BASIC program, for things like copy protection, or maybe because you wanted some kind of disk loader.
@tramadol42 7 months ago
That's how we used to put ML routines into BASIC programs (mostly in REM lines)... Ahh memories...
@EyMannMachHin 7 months ago
I do remember a math function plotter in Commodore Basic on the VIC-20, that let you enter any function (eg. x^2+4x) and would use character set manipulation to actually plot the function like it's hires graphics. Instead of you having to change the line where the function was in the code and you having to rerun the program every time you changed the function, you would simply input the function as a string. It had one REM line with the maximum characters a line could have and it would parse your input and put it into that space as code to run in the plot loop. Really nifty thinking there.
@Dwedit 7 months ago
On an unrelated note, VAL is also broken if you run QBasic in MSVC builds of DosBox. Asking for VAL("5") gives you 4.99999... instead of 5. MSVC (Microsoft Visual C++) 64-bit compilers do not support 80-bit floating point numbers, and do not support inline-assembly that would use the legacy x87 floating point instructions that support the 80-bit numbers.
@williamdrum9899 7 months ago
Why are they using floats lol
@neilscales 7 months ago
@williamdrum9899 AFAIK all values in MS BASIC are floats internally.
@williamdrum9899 7 months ago
@neilscales Ewww
@neilscales 7 months ago
@williamdrum9899 When you only have a few hundred bytes (not kilobytes) to implement floating point mathematics (including sin/cos/tan/sqrt etc), on an 8-bit CPU that can only add or subtract, you don't have the luxury of separate routines for integers. I think people forget how clever Bill Gates and Steve Wozniak were at fitting complicated code into tiny places.
@williamdrum9899 7 months ago
@neilscales Fair point. I guess it was make everything a float or don't support them at all
@PeranMe 7 months ago
Thanks Robin, another great video! Thank you for taking the time to dig into these mysteries!
@retrozmachine1189 7 months ago
It's not the only bit of oddness that Microsoft did in BASIC. I have a vague memory of Z80 instruction reuse in the TRS-80's L2 ROMs. Pass through the ROM via one path and you get a particular instruction but another part of the ROM jumps into the same location + 1, halfway through the same instruction sequence, which decodes to a different instruction. It's probably present in any Z80 (and perhaps 8080) MS BASIC of the same version.
@danielmewes 7 months ago
The MS 4k 8080 Basic has such a behavior as a bug when the memory check overflows. It overflows if you run it on a machine with the full 64KB of RAM. It causes it to go into a loop where every second time, it executes the loop with a one-byte offset that alters the meaning of the instructions, and causes it to jump differently to get back to the correct offset again next time, going back and forth between the two ways to interpret the same code.
@melkiorwiseman5234 7 months ago
This was done in particular for the error printing routine. The A register needed to be loaded with the error number and all other registers would be ignored. Using a LD A,#nn instruction followed by a JR $nn instruction to skip the other LD A instructions would take 4 bytes per error message. The TRS-80 BASIC reduced that to 3 bytes by "hiding" the 2-byte LD A,#nn instruction inside the data for a LD HL,#nnnn instruction. Jumping to a particular LD A instruction would load the appropriate error number into A and then the CPU would do all of the following LD HL instructions, which would have no effect on A, before hitting the routine to actually print the error message. I called that "instruction hiding".
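
A toy Python model of the "instruction hiding" layout described above. It understands only two real Z80 opcodes, $3E (LD A,n) and $21 (LD HL,nn). The error codes 1, 2 and 3, the `TABLE` layout and the `run` helper are illustrative assumptions, not the TRS-80 ROM contents.

```python
LD_A_N = 0x3E    # Z80 opcode: LD A,n   (2 bytes)
LD_HL_NN = 0x21  # Z80 opcode: LD HL,nn (3 bytes)

# After the first entry, each extra entry costs 3 bytes: an LD HL opcode
# whose 2-byte operand is itself the encoding of "LD A,n".
TABLE = bytes([
    LD_A_N, 1,
    LD_HL_NN, LD_A_N, 2,
    LD_HL_NN, LD_A_N, 3,
    0xC9,  # stand-in for the shared error-printing routine
])

def run(code, pc):
    """Execute from pc until the stand-in routine; return the value left in A."""
    a = None
    while code[pc] in (LD_A_N, LD_HL_NN):
        if code[pc] == LD_A_N:
            a = code[pc + 1]
            pc += 2
        else:
            pc += 3      # LD HL,nn changes HL only; A is untouched
    return a

# Entering at offsets 0, 3 and 6 lands on a different LD A each time.
for entry in (0, 3, 6):
    print(entry, run(TABLE, entry))
```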
@damouze 7 months ago
There are quite a few of those in the MSX BIOS and BASIC ROMs as well.
@flatfingertuning727 7 months ago
@melkiorwiseman5234 A really nice technique on the 8080/Z80 which was used in the C128's ROM but not that of the C64 was placing arguments for a called routine immediately after the CALL (or JSR) instruction. Indeed, the 8080/Z80 has an instruction that feels like it was designed for that purpose, simultaneously reading the stacked PC into HL while storing the old HL to the stack. A function can thus read data from HL, incrementing HL after each byte, and then restack HL while retrieving its old value.
@c128stuff 7 months ago
@melkiorwiseman5234 A very similar approach is used in the Commodore KERNAL for I/O and file-related errors to save one byte per error. You do a jmp to the right lda # instruction, which is followed by a bit instruction which has the next lda instruction as its operand, effectively skipping it.
@tabachanker8716 7 months ago
Super interesting! I never knew this bug existed before. Two things I tried after this video on a C64: first, 5 a$="1e39" followed by 10 ? val(a$):rem bug?. I thought there would be no bug, since the val() routine should use the variable a$ in memory, right? Nope, the bug appears on line 5 now! The val() routine uses the string defined on line 5! Then I tried to build a string, so I changed line 5 to: 5 a$="1"+"e39". Now val() uses the string stored in memory and the bug doesn't appear on any BASIC line. This tells me that C64 BASIC may store string constants with a pointer directly to the line where the definition is provided. Only when the string is built does it use memory space in the string heap.
@melkiorwiseman5234 7 months ago
All versions of BASIC that I've used store strings in this way. If it's a string literal in the program, the variable pointer contains the length of the string plus the address where the string literal is stored inside the program.
@flatfingertuning727 7 months ago
@melkiorwiseman5234 The C128 BASIC uses one 64K bank of memory for holding the user program, and the other for variables and strings. String literals always have to be copied to the second bank.
@isaactanner6403 7 months ago
Hi all! I tested it in WebMSX and it worked fine. No bugs and no errors. The value of the variable A prints out as “1E+39”, exactly like that: scientific notation for big numbers. I have the real machine and will try it next week! My MSX is a Brazilian Gradiente MSX 1, upgraded to MSX 2 with a 256k mapper (64 resident).
@atomcode 6 months ago
Dear Robin! I really appreciate your videos! Many thanks for your efforts in presenting things in such detail. It is always interesting. 👍
@daniellomblock6216 7 months ago
Love your videos - so much detail and background info. Subscribed!
@8_Bit 7 months ago
Thank you!
@HelloKittyFanMan 7 months ago
Wow, that's a pretty weird bug that I had no idea was running around inside Commodore 64s! And then even weirder that basic BASIC (ha!) is such a direct port (but with a few proprietary things modded in by the computer brands) that not only are ports in different models of Commodore affected, but in different brands using the 6502 variants, but EVEN in systems using whole different styles of CPU (instruction sets, etc.) like the Zee-80, and it's been overlooked in so many cases until we got to some level of consumer IBM. So of course it's gonna be in Altair BASIC as observed via terminal, too.
@subethasoftware 7 months ago
I was getting some Deja Vu here, thinking “didn’t Robin already post a video about this?” I had to see what prompted me to try it on the CoCo - your Twitter post. Nice to see a video!
@RaquelFoster 7 months ago
Nice work! The part that seems crazy is that somebody actually wrote a book which describes the BASIC interpreter in plain English with literal per-instruction granularity!
@sonicunleashedfan124 7 months ago
Apple 1’s BASIC has it worse. Line 1 completely disappeared after running this program.
@tedthrasher9433 7 months ago
I can confirm that the bug exists in the Applesoft BASIC that is in ROM on my ROM01 Apple IIGS (1986) and in ProDOS BASIC 1.5 (1992). On an Apple IIGS this corrupts all kinds of things. When the shutdown command is issued from GS/OS after testing for this bug in Applesoft, it causes an overflow error on the smart port and the computer completely hangs. Even a soft reset doesn’t work. GSoft BASIC (1999) does not have the bug, but trying to print A returns “inf.” There is no overflow error when the VAL statement is evaluated in GSoft BASIC.
@mikegarland4500 7 months ago
Interesting video as always. Thanks!
@gcewing 7 months ago
At least it's not as egregious as the arithmetic bug in the BASIC interpreter I wrote for my kit-built Z80 system, whereby adding any negative power of 2 to itself gave 0. It was a surprisingly long time before I noticed it, and it didn't happen to cause a problem in any of the programs I wrote, so I never got around to fixing it. (The whole thing was hand-written and hand-assembled on the back of old line printer paper, so making any changes to it was rather a pain!)
@rotordave81 7 months ago
Merry Christmas Robin. I'm looking forward to my annual watching of your C64 Christmas video come Monday :) Thanks for another year of knowledge, fun and pedantry (sorry, I repeat myself). (This video's title is a little clickbaity, surely it *can* destroy programs?)
@8_Bit 7 months ago
Thanks Dave :) Does the title seem dishonest? VAL() can, has, and does destroy programs as demonstrated in this video. Does it have to destroy programs every time it's used to make the qualifier unnecessary? Hmmmm.
@8_Bit 7 months ago
Now that I made the thumbnail does it seem better? "This Function (When Used In The Way Shown In The Thumbnail) Destroys Programs: MS-BASIC's VAL()"
@8_Bit 7 months ago
You were right about the clickbait, and it's really working. This is the fastest one of my videos has got to 10,000 views in ages!
@rotordave81 7 months ago
@8_Bit In that case, great! I was just being pedantic :) I guess, after all, if you say a telephone sanitiser sanitises telephones, it doesn't mean they sanitise all telephones. Maybe you could even title it "this function reduces distraction" since it destroys programs.
@HelloKittyFanMan 7 months ago
Interesting video, thanks! Happy Christmas, Robin! 🎄🎅
@Thiesi 7 months ago
Great video as usual, and please thank your son for providing the lyrics for this banger of an outro.
@allenhuffman 7 months ago
When you finally get around to watching a video, and find yourself referenced...
@theuglycamel8122 7 months ago
Great Video! I had to give it a try on my Access Mach 5 cart and both it and the extended basic on floppy have the bug. The cart is labeled 1985!
@DavidAsta 7 months ago
I'm running MS BASIC v4.7 on my homebrew Z80 computer. It's a modified version of the MS BASIC from 1978 that came with the NASCOM 2 computers. The whole disassembly was published in a magazine in 1983. And I can confirm it has the bug. Thanks a lot Robin, at least now I know there is a bug to be solved 😀 BTW, I've also tested with my MSX (MSX BASIC v1.0 from 1983). The bug does not happen. PRINT A prints 1E+39
@gcewing 7 months ago
This suggests that numbers have a wider range in your version. To test for the bug properly, you would need to use a large enough number to trigger an overflow.
@Curt_Sampson 7 months ago
@gcewing I've tested with 1e69, which does overflow and print the error message, but the bug is not there. MSX-BASIC is based on v5.x of Microsoft BASIC, which is substantially different in terms of parsing from the versions up to 4.x. (Among other things, numbers are parsed differently because they're tokenised.)
@Lord-Sméagol 7 months ago
I found this video yesterday, great detail! Having disassembled Nascom ROM BASIC all those years ago, fixing many of the bugs and optimizing it, I just HAD to fix this bug :) I think this method should be easy enough to implement on other BASICs (6502, 8080, Z80, 6800, 6809 ...). Make VAL save the address and original byte somewhere safe. If VAL succeeds, clear the saved byte to indicate no further action is needed. Change all the jumps to the overflow error to go to a check routine, so that when an overflow error occurs: if the byte saved by VAL is zero, simply continue to the overflow error; otherwise (VAL didn't complete), restore the byte using the saved address, clear the saved byte to signal all is done, and continue to the overflow error. Not too bad: 26 bytes of Z80 to fix it. :) I also looked at MS BASIC-80 [5.21]. It doesn't have a problem: overflow is not a fatal error, it gives a warning, so VAL repairs the 'damage' it did.
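
The repair scheme described above can be modelled in Python. This is only a sketch of the idea, not the 26-byte Z80 patch: the class name `PatchedBasic`, the ASCII program line and the `MAX_FLOAT` limit are all assumptions made for illustration.

```python
MAX_FLOAT = 1.70141183e38  # assumed float limit, as on Commodore BASIC

class PatchedBasic:
    def __init__(self, mem):
        self.mem = mem
        self.saved_addr = 0
        self.saved_byte = 0          # zero means "nothing to repair"

    def overflow_error(self):
        """Check routine that every overflow jump is redirected to."""
        if self.saved_byte != 0:     # VAL did not complete: undo its poke
            self.mem[self.saved_addr] = self.saved_byte
            self.saved_byte = 0
        raise OverflowError("?OVERFLOW  ERROR")

    def val(self, addr, length):
        end = addr + length
        self.saved_addr, self.saved_byte = end, self.mem[end]
        self.mem[end] = 0            # temporary terminator, as before
        text = self.mem[addr:self.mem.index(0, addr)].decode("ascii")
        value = float(text)
        if abs(value) > MAX_FLOAT:
            self.overflow_error()
        self.mem[end] = self.saved_byte   # normal path: restore the byte
        self.saved_byte = 0               # and mark the repair as done
        return value

mem = bytearray(b'A=VAL("1E39"):REM SHOW BUG\x00')
before = bytes(mem)
basic = PatchedBasic(mem)
try:
    basic.val(mem.index(b'"') + 1, 4)
except OverflowError as err:
    print(err)
print(bytes(mem) == before)  # the closing quote was put back before the error
```

If the byte that VAL overwrote was already zero, the saved value is zero too and nothing needs restoring, which is why zero can double as the "no action" flag.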
@greggoog7559 7 months ago
Very interesting. I totally expected you to fix the bug in the BASIC interpreter at the end 😄 Regarding why they did it this way: I think it really does make sense from a CPU cycles point of view. A program with a lot of VAL()s would probably be significantly slower with all the copying going on. The best fix would probably have been to make an exception in the error subroutine to put the quote character back.
@flatfingertuning727 7 months ago
A really huge portion of the time spent in VAL(), or floating-point numeric evaluation in general, is spent in a sequence of instructions which perform a sign-extending right shift in situations where the operand is going to be positive. Patching that routine to simply use an LSR instruction will cause floating-to-integer conversions to fail for negative operands, but make it much faster. Creating a separate routine that uses LSR would greatly improve the performance of many BASIC programs. The time required to make an extra copy of a string pales in comparison.
@greggoog7559 7 months ago
@flatfingertuning727 Interesting, thanks... I'm by no means an expert on C64 BASIC implementation details or 6502 assembly in general. I did meddle with it a lot from the age of 7 or so 😄 but I'm now in my 40s and all I do now is occasionally use an emulator. But it does make sense that the actual floating point evaluation is so much slower that the string copy wouldn't matter. Thanks for pointing it out!
@chromosundrift 7 months ago
Wouldn't a good fix be to store the end byte or length of the string on the stack or in zero page, rather than splatting a zero terminator in what could be a BASIC listing? Other solutions that come to mind don't seem as simple. I don't think putting a fix in the error routine would be ideal since the quote problem is specific to VAL()'s parsing of a number rather than all causes of overflow error, e.g. the result of arithmetic.
@davidellsworth4203 7 months ago
Wouldn't the best fix just be to not do any NUL terminator overwrite at all, and still read the string in-place? The character that would be overwritten would always be a non-numeric character (a closing quote) which terminates VAL's evaluation anyway, wouldn't it? If the kind of fix you described were used, it might still leave a race condition in which pressing Ctrl-Break with perfect timing could leave the NUL there, corrupting the program... unless all error handlers restore the overwritten character, or Ctrl-Break handling is done by polling instead of an interrupt.
@flatfingertuning727 7 months ago
@davidellsworth4203 Given the statements "GET A$:GET A$:GET A$", if they read the characters "X", "2", and "1" in that order, the three bytes starting at the address of A$ would be "12X", with no other intervening bytes.
@ge97aa 7 months ago
I found that bug in my own reverse engineering of the BASIC ROM. Gotta make sure you clean up your mess before allowing BASIC to raise an error. There are a few places in the BASIC ROM where not doing so causes problems.
@chromosundrift 7 months ago
I wonder how feasible it is to run cleanup code for these types of exception cases. I think it may be better not to have situations that require cleanup. Data destruction is a pretty big deal. I'd be curious how they actually did fix it.
@ge97aa 7 months ago
It wouldn't have been trivial. You're right, it's better not to require the cleanup in the first place.
@flatfingertuning727 7 months ago
@chromosundrift In most cases, the assumption was that if a program died because of an error, it wouldn't matter if things were left in an awkward state, and some of today's compilers go out of their way to exploit such assumptions. When processing a function like "unsigned mul(unsigned short x, unsigned short y) { return x*y; }", a compiler may treat the code as an invitation to identify what inputs to the calling function would cause x to exceed INT_MAX/y, and omit from the machine code for the calling function any portions (such as array-bounds checks) that would only be relevant if such inputs were received.
@Green_House 7 months ago
The Val function stops reading the string at the first character that it can't recognize as part of a number. Symbols and characters that are often considered parts of numeric values, such as dollar signs and commas, are not recognized.
@AiOinc1 7 months ago
Interesting, this is also the cause of a stack leak I suspect. Does the character ever get pulled back off the stack? Do this enough times and it might overflow the stack.
@8_Bit 7 months ago
I just experimented with that now and it seems there's no stack leak, so it must be getting cleaned up somehow.
@gcewing 7 months ago
The 6502 stack wraps around in a 256 byte area, so even if there was a leak you might not notice. But it's likely that the stack pointer gets reset whenever BASIC bails out due to an error anyway.
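
The wrap-around mentioned above follows from the 6502's 8-bit stack pointer, which indexes a single 256-byte page ($0100 to $01FF). A minimal Python model, with the class name `Stack6502` invented for illustration:

```python
class Stack6502:
    """Toy model of the 6502 stack: one 256-byte page and an 8-bit pointer."""

    def __init__(self):
        self.page = bytearray(256)   # stands in for $0100-$01FF
        self.sp = 0xFF               # stack pointer after a typical reset

    def push(self, value):
        self.page[self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF   # 8-bit decrement wraps from $00 to $FF

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF   # 8-bit increment wraps from $FF to $00
        return self.page[self.sp]

# "Leak" 300 bytes without ever pulling them back off.
stack = Stack6502()
for value in range(300):
    stack.push(value)
print(hex(stack.sp))  # still a valid index into the same page
```

A leak therefore overwrites older stack entries rather than running into other memory, so it can go unnoticed for a while.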
@lostwizard 7 months ago
On the 6809 version at least, the stack is completely reset on error. That clears any GOSUB/RETURN frames, FOR loop records, expression evaluation intermediate results, and anything else something has stashed on the stack.
@GerbenWijnja 7 months ago
That was very interesting!
@glenm9376 7 months ago
Great insight thanks. So now I need to boot up just to see what happens when you put the @ symbol in a line.
@stevethepocket 7 months ago
It wouldn't do anything because the "@" shown here isn't a literal @; it's a reverse symbol used to represent an untypeable ASCII character. Like the ones that appear when you type a color code or a cursor key or something inside quotes, except there's no key combination I'm aware of that will let you input a null. Though you did get me wondering what happens if I trick the parser into letting me type control codes outside of quotes. I tried it just now and... it's very interesting! I'm going to have to tell Robin about it because it sounds like another fun subject for a video.
@csbruce 7 months ago
1:47 It makes complete sense that the task of parsing a number would be reused. In fact, it's odd that Commodore BASIC has separate parsing for line numbers. However, there are artifacts from reusing the BASIC-code parser, since, as you show, it skips over spaces.
5:35 Yeah, that's an ugly hack - just poke a $00 in after the string and don't bother to clean it up on error.
8:39 This would seem to be more of a problem on the expanded VIC-20 and C64. On the unexpanded VIC, the $00 is tossed into the memory right after the evaluated string, which happens to be screen RAM, and it works okay. But on an expanded VIC, following the RAM is normally empty space and on the C64, it's ROM, so the parsing behaviour really depends on what's read out of that location. What if it reads a digit out of that spot? It's interesting that evaluating the TI$ keeps reusing the same spot at the top of the string heap without needing a garbage collection.
10:58 That's unexpected. I thought string literals were just referenced to the raw BASIC code or input-buffer space where they reside.
20:20 What happens if you try to edit that line? BASIC should be confused about the length of that line and make the garbage after it into a new line.
21:58 A hack way to solve it would be to store $22 instead of $00. BASIC wouldn't get mangled, though the screen or heap might, but they're volatile anyway.
@8_Bit 7 months ago
Aha, on the C64 location $A000 (ROM) contains $94 which VAL will just ignore (and quit evaluating). Fortunate! If there had been a $34 there (for example) then we'd see some interesting results. String literals in immediate mode seem to be put on the heap immediately. Perhaps that's because doing something like A$="TEST" in immediate mode requires it, so rather than distinguishing based on the use, it's only checking if we're RUNning or not.
@fllthdcrb 7 months ago
"What happens if you try to edit that line? BASIC should be confused about the length of that line and make the garbage after it into a new line." I just tried that. Something like that happens. Except if I edit the affected line itself, it just moves all the real lines after it to be immediately after the edited line, resulting in the garbage just disappearing. It's only if I edit any _other_ line that the garbage becomes its own line, including a garbage line number, but minus the first two bytes that get overwritten with a proper next-line pointer.
@flatfingertuning727 7 months ago
@8_Bit In Applesoft, on a 48K machine when not using DOS, RAM is followed by the value of the keyboard input byte, with the high bit set if it hasn't been read yet, but cleared if it has. If a program performs e.g. GET A$:PRINT VAL(A$) and the user types a digit, the digit will be reflected many times in the output value.
@flatfingertuning727 7 months ago
Parsing line numbers directly as integers avoids the need to perform a floating-point-to-integer conversion after the fact. What's more curious is that the code wasn't used to handle the leading portion (which might be the whole thing) of floating-point numeric constants. Parsing a constant like 13579 requires converting the digit 1 to a floating-point value by shifting it left 7 places and adjusting the exponent, adding that to zero, then multiplying by ten by incrementing the exponent, copying that value to the second floating-point accumulator, adding two more to that exponent, and adding the two floating-point accumulators, then converting the digit 3 to a floating-point value by shifting it left 6 places and adjusting the exponent, adding that to the previous value, multiplying that result by ten, converting the digit 5 to a floating-point value by shifting left 5 places and adjusting the exponent, adding that to the previous value, etc. Processing as much of a conversion as possible using integer math, converting that result to a floating-point value, and then using the general-purpose recipe to handle anything that was left over, could likely roughly double the performance of a lot of code that uses many floating-point constants.
@csbruce 7 months ago
@flatfingertuning727: When dealing with a manually entered line, what does taking an extra millisecond matter? I was thinking more of a mode where the floating-point parser disallows all non-digit characters ("+", "-", "e", ".") (since you wouldn't want «10E=.5» to be misinterpreted). The biggest speedup with numbers would be to represent integers as integers for simple operations. I.e., have a special exponent like $00 to mean the mantissa holds a 16-bit unsigned integer. Most variables are small integers, especially in BASIC games. Operations like load, store, add, subtract, and convert to "integer" would be done with uint16 arithmetic, and if an over/underflow occurs or any other operation is requested, the FAC is converted to floating point first. You'd also want number parsing and INT() to detect if the result is a uint16.
@croysk
@croysk 7 months ago
Great stuff!
@codahighland
@codahighland 7 months ago
I have to wonder why the substitution was even necessary. A " character, being non-numeric non-whitespace, would have terminated the routine anyway.
@Lord-Sméagol
@Lord-Sméagol 7 months ago
I just fixed the bug in Nascom ROM BASIC 4.7 by having VAL save the byte and address so that an Overflow Error intercept can repair it. But seeing your comment, I just tried stopping it clearing the " ... and initial tests show that it is behaving properly ... another conditional assembly option to add to my source code :)
@Lord-Sméagol
@Lord-Sméagol 7 months ago
Update: some more testing: temporary strings don't appear to have termination bytes: I just tried A$="12":B$="34":? VAL(B$) ... and got 3412 !!!
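The 3412 result above is what you would expect if the two strings sit back to back with nothing between them. Here is a small Python model of that situation; it assumes the heap allocates downward so that B$ lands immediately below A$, and it uses a hypothetical digit-only `val` that stops only at a non-digit byte.

```python
def val(memory: bytes, start: int) -> int:
    """Read decimal digits from memory[start:] until a non-digit byte is seen."""
    result = 0
    pos = start
    while pos < len(memory) and ord("0") <= memory[pos] <= ord("9"):
        result = result * 10 + (memory[pos] - ord("0"))
        pos += 1
    return result

# Assumed layout: B$ ("34") is stored just below A$ ("12"); 0xFF marks unrelated memory.
heap = b"34" + b"12" + b"\xff"
b_start = 0

print(val(heap, b_start))  # prints 3412: the scan runs past the end of B$ into A$
```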
@rager1969
@rager1969 7 months ago
I don't know if you have any control over where ads are placed, but sometimes YT videos pop up at magical places. For example, in this video when you ran the overflow example at 14:48, it went to an ad, as if the bug is so powerful it forced an ad. It made me chuckle.
@RedPillRachel
@RedPillRachel 7 months ago
In Amstrad CPC's Locomotive BASIC, it does exactly this, as an integer or float. A similar function, ASC, returned the ASCII code of a single char.
@CRCO1975
@CRCO1975 7 months ago
I tried this in TI Extended BASIC on the 99/4A just for grins, knowing it isn't a Microsoft derived BASIC. The TI allows for values up to 1E127. 1E128 results in an overflow. (The computer only displays exponents up to 99, but maintains them up to 127.) Extended BASIC was used because TI BASIC doesn't allow for multiple statements per line. I learned a few things by accident doing this (or remembered something I knew long ago). Overflow errors in TI BASIC don't stop a program from running but just print a warning message. Extended BASIC allows that to be overridden and actually can stop on a warning if you tell it to do that. Leaving off the quotes around the string value in VAL caused interesting behavior:
10 A=VAL(1E128)::REM REST OF LINE
20 REM LINE 20
>RUN
* WARNING: NUMERIC OVERFLOW IN 10
* STRING-NUMBER MISMATCH IN 10
So it parsed the value first, declared it an overflow, then determined I hadn't entered a valid string into the function and stopped the program. 2 error messages for the price of 1!
@timewave02012
@timewave02012 7 months ago
You know you've spent too much time thinking about IEEE 754 floating point when you immediately recognize the significance of 1e39. I think the first floating point bug I fixed was when my employer's ATE software suite started handling NaN incorrectly after MSVC changed its floating point behavior.
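For anyone wondering why 1e39 in particular rings a bell, a quick check in Python shows where the limits sit. The sketch assumes the Microsoft 8-bit float format tops out just below 2^127 (an 8-bit exponent with a mantissa below 1), and compares that with the IEEE 754 single-precision maximum.

```python
# Assumed upper bound of the 8-bit Microsoft BASIC float format.
ms_basic_limit = 2.0 ** 127

# Largest finite IEEE 754 single-precision value.
ieee_single_max = (2 - 2 ** -23) * 2.0 ** 127

print(f"{ms_basic_limit:.9e}")
print(f"{ieee_single_max:.9e}")
print(1e38 < ms_basic_limit < 1e39)    # True: 1E38 fits, 1E39 does not
```

Both limits have 38 as their decimal exponent, so 1E39 is the smallest "1 followed by an exponent" literal that overflows either format.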
@agpxnet
@agpxnet 7 months ago
Nice catch. It's absurd that the BASIC interpreter has to modify the program (just to terminate the string with a null) to execute the VAL function. This is done by the following code:
B7CF A0 00 LDY #$00
B7D1 B1 24 LDA ($24),Y
B7D3 48 PHA
B7D4 98 TYA
B7D5 91 24 STA ($24),Y
B7D7 20 79 00 JSR $0079
B7DA 20 F3 BC JSR $BCF3
B7DD 68 PLA
B7DE A0 00 LDY #$00
B7E0 91 24 STA ($24),Y
First the original character is stored on the stack (B7D1 - B7D3), then it's turned to zero (B7D4 - B7D5). The routine at BCF3 converts the null-terminated string to float, and then the original character is restored (B7DD - B7E0). It's a trick to avoid copying the string to another area. In case of overflow, however, the routine doesn't return to the caller (B7DD), so the program line is cut. Another approach would be to pass the termination character to the routine at BCF3.
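The save, zero, parse, restore sequence above can be modelled in Python to show why the restore is skipped. This is a sketch under simplifying assumptions: the program line is held as plain ASCII rather than tokenized BASIC, the names `buggy_val` and `OverflowBail` are made up for the illustration, and a Python exception stands in for the ROM's jump to the error handler.

```python
class OverflowBail(Exception):
    """Stands in for BASIC jumping to its error handler and resetting the stack."""

def parse_until_zero(memory: bytearray, start: int) -> float:
    end = memory.index(0, start)                 # scan up to the zero terminator
    value = float(bytes(memory[start:end]).decode("ascii"))
    if value >= 2.0 ** 127:                      # assumed range limit of the format
        raise OverflowBail("?OVERFLOW ERROR")
    return value

def buggy_val(memory: bytearray, start: int, length: int) -> float:
    end = start + length
    saved = memory[end]                          # like PHA: remember the byte after the string
    memory[end] = 0                              # like STA ($24),Y: plant a terminator
    value = parse_until_zero(memory, start)      # on overflow this never returns here
    memory[end] = saved                          # like PLA / STA: skipped on overflow
    return value

line = bytearray(b'A=VAL("1E39"):REM SHOW BUG\x00')
start = line.index(b"1E39")
try:
    buggy_val(line, start, 4)
except OverflowBail:
    pass
print(line)   # the closing quote has been left as a zero byte
```

With a value that fits, such as "1E3", the restore runs and the line is left exactly as it was.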
@NuntiusLegis
@NuntiusLegis 7 months ago
I don't find it absurd because it is a fast and memory-saving solution, which is important on an 8-bit system, and that bug is quite unlikely to occur. I use VAL quite a lot (I like to use string-arrays like C-structs to store text and numbers and use VAL if a calculation is needed), so I am glad this works fast enough in most cases, and I never had an overflow error.
@agpxnet
@agpxnet 7 months ago
@@NuntiusLegis As I wrote, it would have been enough to pass the termination character ($22) to the routine and you can avoid this hack.
@NuntiusLegis
@NuntiusLegis 7 months ago
@@agpxnet What do you mean by "termination character"? I think there usually is no such thing in C64 BASIC, that's why a zero is set for this particular routine.
@agpxnet
@agpxnet 7 months ago
@@NuntiusLegis I mean, the character that a string ends with. The routine in BCF3 scans through all locations of the string until it hits a zero (which is why that hack modifies the code). However, in the case of literal string, it doesn't end with a zero, but with the quotation mark character (ASCII = $22). If we could pass, for example via a register, the end-of-string character to the BCF3 routine (rather than being hardcoded as 0), the hack would not be necessary.
@NuntiusLegis
@NuntiusLegis 7 months ago
@@agpxnet So you mean the quotation mark, which only works with literals, but VAL must also process string variables where the string data lies in string memory without any separating characters, as far as I know.
@joechevy2035
@joechevy2035 7 months ago
This should be a meme based on that famous scene in 'Conan, the Barbarian.' MS-BASIC: "What is best in life?!" C64 VAL Bug: "To be entered into a program.. Ran to completion... And watch the lamentation of the programmers!" 😂
@insectodium206
@insectodium206 5 months ago
This val() bug is present on the Oric as well ( Oric Extended Basic 1.1 (c) 1983 Tangerine )
@IllidanS4
@IllidanS4 7 months ago
It's so surreal to see a function get its argument right from the actual line of the program. That is something you could never see in modern languages.
@NuntiusLegis
@NuntiusLegis 6 months ago
You could never see such caring for the most efficient solution in modern bloatware.
@localroger
@localroger 7 months ago
Implementing VAL for floating point math is actually very tricky if you try to properly catch all the edge case errors, and on those early computers every byte of interpreter code was a byte that wasn't available for the application programmer to use. This was probably a deliberate decision to reduce code size in the interpreter as long as it worked in normal situations.
@chromosundrift
@chromosundrift 7 months ago
This prompts an interesting question, is there a comparably compact implementation of VAL() that doesn't overwrite the string terminator? I haven't tried it but I think storing the length or endpoint on the stack or in zeropage may prove to be as compact. I don't claim it would also be as fast but slightly slower VAL() speed is probably a tradeoff I would prefer.
@localroger
@localroger 7 months ago
@@chromosundrift I suspect this was just an early, lazy solution that worked "well enough" that it got propagated through generations of releases until probably someone more serious at IBM complained about it.
@NuntiusLegis
@NuntiusLegis 7 months ago
@@chromosundrift I prefer the faster version because I don't feel the need to overflow my computer with ridiculously high numbers.
@flatfingertuning727
@flatfingertuning727 7 months ago
@@chromosundrift Using the same code for VAL and for processing floating-point constants within a program, while allowing it to use length-counted strings that could be followed by irrelevant digit characters, would require incorporating logic to handle the length count within the code that parses strings stored in code. On the other hand, given how inefficient that code is already (given a loop "FOR I=256 to 511:Q=32768:NEXT", more than 70% of the loop's execution time would be spent evaluating the constant 32768), adding 7 extra cycles to the cost of processing each digit wouldn't be noticeable. The cost of code to process such checks would be offset by eliminating code to save, modify, and restore the byte after the string, so the total code cost would probably be about the same.
@markrosenthal9108
@markrosenthal9108 7 months ago
This was not a bug, it was a feature. In those days, the machine code size of the interpreters was critical. With as little as 4K of memory, the routines for parsing had to be very small if you wanted enough space to enter that Trek game you were typing in from Creative Computing. Bill Gates had some difficult trade-off decisions to make for MS Basic. You could make your conversion logic strict or forgiving and quirky. The one thing it couldn't be was large. A completely different numeric conversion issue happens with floating point numbers even on large computers with compilers. Have you ever seen code testing equality checking for a difference less than something like 0.000001? This comes from the way floating point numbers are represented in binary and associated rounding errors during calculations. This is why Cobol was used for business instead of Basic. With the extra space, you could afford to store numbers as decimal digits. Slower than floating-point, but more accurate.
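The rounding issue mentioned above is easy to reproduce on a modern machine. The Python snippet below is just an illustration using the standard `math` and `decimal` modules; the tolerance of 1e-6 is the example figure from the comment, not a recommendation.

```python
import math
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)                                        # False

# The usual workaround: compare against a small tolerance instead.
print(math.isclose(0.1 + 0.2, 0.3, abs_tol=1e-6))              # True

# Decimal storage, in the spirit of COBOL's decimal fields, keeps the digits exact.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))    # True
```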
@NuntiusLegis
@NuntiusLegis 7 months ago
If Cobol doesn't use floating point numbers, I guess it should also replace C and C++ in business.
@markrosenthal9108
@markrosenthal9108 7 months ago
@@NuntiusLegis Cobol has floating point as well. It's best used for scientific and engineering applications. C and C++ eventually addressed the lack of decimal data types with third-party libraries. Java has a standard decimal data type, but it is somewhat awkward to use.
@NuntiusLegis
@NuntiusLegis 7 months ago
@@markrosenthal9108 Science and engineering don't care about rounding errors? I'd rather lose a millionth of a penny in my bank account than live with nuclear power plants having glitches. ;-) I wonder if there is software for the C64 using the CPU's BCD mode to avoid rounding errors in calculations.
@8_Bit
@8_Bit 7 months ago
I can accept the flashing @ symbol on the screen as a feature (I called it a quirk) but the corruption of programs is absolutely a bug, not a feature.
@NuntiusLegis
@NuntiusLegis 7 months ago
@@8_Bit But with the gain in speed and memory with this "hacked" solution every time val() is used and the unlikeliness of this bug to occur, I can accept it as almost a feature.
@stuartmcconnachie
@stuartmcconnachie 7 months ago
This code for handling VAL is all horrendously complicated given that BASIC already has a built-in expression evaluation routine for calculating things like A$ = "HELLO" : A$ = A$ + " WORLD". For example, BBC BASIC just passes the expression inside the VAL parentheses to the expression evaluator, which returns the result and type (real, integer, string). That means you aren't limited to string constants and single string variables inside the VAL (you can put expressions there as well), and there's no need to jump through this adding and removing of termination bytes malarkey, either. Presumably things like VAL(A$+B$) and VAL("1"+A$) (contrived examples, but you get the point) are invalid in variants of MS BASIC also?
@NuntiusLegis
@NuntiusLegis 7 months ago
It works on the C64.
@gcewing
@gcewing 7 months ago
I'm pretty sure you can put expressions there. The code in question is run after the expression has been evaluated. If you read the comments, it talks about a "string descriptor" on the "temporary string descriptor stack". I think what's happening is that when the expression happens to be just a string literal, the string descriptor resulting from the evaluation points into the program memory. I suspect that if you triggered an overflow using a more complicated expression, some other part of memory would get corrupted that wasn't so noticeable. The zero-swapping thing seems to be a hacky way of re-using an existing routine they had lying around for converting a string to a number. A cleaner way would have been to refactor that routine to take an address and a length instead of relying on zero-termination.
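The "address and a length" refactoring suggested above might look something like this in Python. It is a digit-only sketch with a hypothetical name, `val_counted`; the point is simply that the loop bound comes from the string descriptor, so nothing in memory needs to be overwritten.

```python
def val_counted(memory: bytes, start: int, length: int) -> int:
    """Convert the leading digits of the string at memory[start:start+length]."""
    result = 0
    for pos in range(start, start + length):    # never looks past the string
        digit = memory[pos] - ord("0")
        if not 0 <= digit <= 9:
            break                               # a non-digit inside the string ends the number
        result = result * 10 + digit
    return result

heap = b"3412"                      # "34" stored directly in front of "12"
print(val_counted(heap, 0, 2))      # prints 34: the neighbouring "12" is never read
```

Because `memory` can be an immutable `bytes` object here, the sketch also makes it obvious that the routine has no way to corrupt the program.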
@NuntiusLegis
@NuntiusLegis 7 months ago
@@gcewing It seems VAL DOES use address and length with string data on the string heap; the "hack" is used with string data in the code, which, at least in case of a literal not assigned to a variable, does not have a string descriptor with address and length, I think. Anyway, in the following example the overflow error does not corrupt the code:
10 a$="1e+3"+"9": rem comment
20 print val(a$)
The concatenation in line 10 forces the storage of the string data on the string heap.
@countzer0408
@countzer0408 7 months ago
What happens if you RUN the program again after the overflow error? Do you get a different error?
@erwinvandenberg1815
@erwinvandenberg1815 7 months ago
I think the simplest way to fix this bug is to add the end of string character as parameter to the val function. So instead of assuming it is a zero terminated string, the eos character defines the end. This way the memory has not to be set to zero and repaired afterwards (!) and the bug will never appear.
@WY.C64-Guy
@WY.C64-Guy 7 months ago
Actually, reading the text, there's no need for the 00 byte... If the function encounters *anything* that is *not* +, -, E, or 0-9, it stops. Just leave the quote character... That's the easiest solution.
@flatfingertuning727
@flatfingertuning727 7 months ago
@@WY.C64-Guy The VAL() function is intended to be usable with strings that are stored in the heap without any other characters around them. Thus, if code performed "GET A$" three times, reading "X", "3", and "1", and then called "VAL(A$)", it would receive a pointer to the start of the byte sequence "13X". Including more information on the heap around strings, however, could greatly improve garbage-collection performance, and probably wouldn't have cost much: if a program has few line strings, adding one or two bytes each wouldn't amount to much, and if a program has many strings but couldn't afford the storage to handle extra couple bytes each, it would likely be rendered unusable by the existing slow garbage collector. If adding the extra pointers reduces free space by 80%, but their existence could cut GC times by 90%, that would be a net win. If pointers were stored MSB first during a GC cycle, and otherwise were stored as a zero byte followed by the length, putting a zero at the end of the heap would have eliminated the need to zero-terminate strings before calling "val".
@erwinvandenberg1815
@erwinvandenberg1815 7 months ago
@@WY.C64-Guy Good one. I made a small test program and it seems to work:
10 FORP=40960TO49152:POKEP,PEEK(P):NEXT:REM COPY BASIC
20 POKE47061,234:POKE47062,234:REM REMOVE STORE 0
30 POKE1,54:REM TURN BASIC ROM OFF
@botsjeh
@botsjeh 7 months ago
@@WY.C64-Guy The VAL code is also used to parse numbers from strings that are not delimited by quotes. Also, the reading of a non-zero and non-digit (or E or .) will cause an error situation that should force the program to stop in other situations, like reading numbers in a DATA statement.
@beakt
@beakt 7 months ago
So the only computer you showed where it's fixed from that era was the IBM PC. Do you think that IBM engineers inspected the code they licensed from Microsoft before bestowing upon it the IBM label, and, using proper techniques, discovered what no one had thought of? Or maybe just noticed the simple fact that a function which should have only returned a value instead had a global effect (even if temporary), and provisions in the error handling did not reverse the effect.
@choppergirl
@choppergirl 7 months ago
Man this is brutal, we did not know about this bug back in the day. It would have never occurred to me that the E for exponent would blow things up, because we rarely used numbers that way and stuck to integer math, which was faster and just better programming practice. Always use an integer as much as you can for speed, and even better if it's a one or two byte integer for real speed.
@mudi2000a
@mudi2000a 7 months ago
On the C64 due to implementation details in BASIC Integer was actually slower.
@choppergirl
@choppergirl 7 months ago
@@mudi2000a I seriously doubt that. Regardless, Real women programmed in TinyMon when on the C=64 anyway. It's all integer basic there. The closer you can get to the bare silicon of the machine the better.
@mudi2000a
@mudi2000a 7 months ago
@@choppergirl the problem with CBM Basic integers is that they actually were converted to floating point and back internally at least for calculations because the BASIC has only float math routines. Now I didn’t verify if what I remember is true but I am relatively confident. Of course you could use a BASIC compiler which usually has proper integer math or just assembly. EDIT: there is a video on this very channel which confirms what I wrote: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-wo14rDnGUbY.htmlsi=zF2hmi2dReIaoaVo
@choppergirl
@choppergirl 7 months ago
@@mudi2000a Well I'll have to test it some day with a timing routine. All us girls were spending so much time writing in Compute's TinyMon on the 64 back in the day, that Commodore took notice and built it and a Sprite editor into the C128's ROMs. By then though I had already moved on my BASIC Programming exploits to Microsoft QuickBasic both on the Macintosh and PC AT's.
@4rumani
@4rumani 7 months ago
@@choppergirl Well, you're wrong, "woman"
@TheGreatAtario
@TheGreatAtario 7 months ago
Here we see the wisdom of TRY/FINALLY
@BryonLape
@BryonLape 16 days ago
A nearly 50 year old bug in Microsoft code. Glorious.
@HelloKittyFanMan
@HelloKittyFanMan 7 months ago
Ohhh! I had TOTALLY forgotten about our friend Val! By the way, one of my aunts' name is Val (Valery Christensen/Aunt Val), hehe. -- Mike Christensen
@chromosundrift
@chromosundrift 7 months ago
So would the doublequote still be on the stack at $01ff ? Also I wonder if there are any other cases of basic functions altering the program memory temporarily such that other exception flows lead to program corruption. I know that BASIC memory layout was sometimes hacked to obscure program listings or to implement other forms of selfmod. Interesting branch of analysis, thank you Robin.
@melkiorwiseman5234
@melkiorwiseman5234 7 months ago
As I understand it, the error routine resets the stack pointer. The double quote mark would still be in memory, but not technically "on the stack" since the stack wouldn't be pointing to it.
@chromosundrift
@chromosundrift 7 months ago
@@melkiorwiseman5234 makes sense thanks
@HelloKittyFanMan
@HelloKittyFanMan 7 months ago
"So I shared this bug on X/Twitter a few months ago..." Ha, said like that it almost has a double-meaning: "X/Twitter" sounds like "ex-Twitter," which it also is! (Oh, that reminds me: I still need to get back to your reply in that other video about "hearing the parentheses," because here I "heard" that slash, ha!)
@monkybros
@monkybros 6 months ago
Just tested this on two Japanese micros, Fujitsu's FM-7 (1981 6809 machine with a Microsoft copyright for its BASIC) and Sega's SC-3000 (1983 Z80 machine with a BASIC interpreter written by Mitec, but pretty much 1:1 function offerings to the Tandy BASIC levels II and III), and neither exhibits this behaviour.
@damouze
@damouze 7 months ago
I checked it in OpenMSX and the bug does not seem to affect MSX BASIC 1.0, which is from 1983, either. It also did not give an overflow error, so I tried it with 1E99, which did give an overflow error, but indeed did not corrupt the BASIC program. It occurs to me that this is a very dirty way of putting a machine code payload into a BASIC program. I'm not sure if BASIC will save the payload to disk/cassette or load it from disk/cassette when requested, but suppose it will. Then the payload would be there, but it would be invisible to the user, obfuscated by that stray NUL character.
@JamesJones-zt2yx
@JamesJones-zt2yx 7 months ago
Had to try BASIC09. It printed 0 for VAL("1e39"), not causing an error--which would concern me in serious numeric code. The program doesn't get corrupted.
@aceenterprise
@aceenterprise 7 months ago
That's very interesting, especially about the null character to end the basic line like that. I wonder if any program/game ever took advantage of the null character on a line of code to hide additional code behind that?
@flatfingertuning727
@flatfingertuning727 7 months ago
Although Robin didn't test this, saving and loading a program which contains an embedded zero like that, which is at least four bytes from the end of a line, would result in the third and fourth bytes after the zero being interpreted as a line number of a line that would extend until the next zero byte. If this line number was larger than the following line number, this would wreak havoc on any GOTO statements which should target lines that appear after the out-of-sequence line but have a lower number, as well as efforts to edit the program.
@Curt_Sampson
@Curt_Sampson 7 months ago
@@flatfingertuning727 Are you sure about that? I thought that all versions of MS-BASIC used the offset-to-next-line value at the start of the tokenised BASIC line to determine where the next line starts.
@flatfingertuning727
@flatfingertuning727 7 months ago
@@Curt_Sampson That's what they do during execution, but to allow for the possibility that a program might be loaded at a different address from where it had been stored, versions of BASIC for the Commodore VIC-20, C64, and Apple II (and probably many others as well) scan all BASIC lines to find zero bytes and fix them. Actually, loading and saving the program isn't necessary to trigger this behavior. Adding a new line to the program can have the same effect, since it's simpler to generate line links from scratch than to try to apply relative offsets to existing line links. It's a bit of a shame, though, that MS didn't precede each line with a length byte, since traversing line links stored that way would be faster than traversing line links stored as pointers, since only one link-related byte would need to be fetched to traverse each link.
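To make the relinking effect concrete, here is a simplified Python model of a relinker that finds line boundaries by scanning for zero bytes. It is not the ROM routine: the program text is kept as untokenized ASCII, the link bytes are placeholders, and the helper names are invented for the sketch.

```python
def make_line(number: int, text: bytes) -> bytes:
    # Layout: 2-byte link placeholder, 2-byte line number (little-endian), text, zero byte.
    return b"\x01\x01" + number.to_bytes(2, "little") + text + b"\x00"

def line_numbers_after_relink(program: bytes) -> list:
    """List the line numbers a zero-scanning relinker would find."""
    numbers = []
    pos = 0
    # A zero in the high byte of the link marks the end of the program.
    while pos + 1 < len(program) and program[pos + 1] != 0:
        numbers.append(int.from_bytes(program[pos + 2:pos + 4], "little"))
        # Skip link and line number, then the next line starts after the next zero.
        pos = program.index(b"\x00", pos + 4) + 1
    return numbers

program = (make_line(10, b'A=VAL("1E39"):REM SHOW BUG')
           + make_line(20, b"REM LINE 20")
           + b"\x00\x00")
print(line_numbers_after_relink(program))      # prints [10, 20]

# Simulate the VAL bug: the closing quote of the literal is left as a zero byte.
corrupt = bytearray(program)
first_quote = corrupt.index(b'"')
corrupt[corrupt.index(b'"', first_quote + 1)] = 0
print(line_numbers_after_relink(corrupt))      # a bogus line appears between 10 and 20
```

In this model the bogus line number is built from the third and fourth bytes after the stray zero (here the ASCII letters "RE"), which matches the behaviour described earlier in the thread.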
@Curt_Sampson
@Curt_Sampson 7 months ago
@@flatfingertuning727 As for the scanning for zero bytes, yes, you are right that in the pre-5.x MS-BASICs the LNKPRG routine is called immediately after a LOAD, and that does simply scan for zeros. (I've checked only the C64 6502 versions, but I assume the others are the same.) That wouldn't work in later variants such as MSX-BASIC because the tokenisation of numbers can produce $00 bytes in the middle of the line; I don't at the moment recall how the line linker in that version dealt with this.
@Curt_Sampson
@Curt_Sampson 7 months ago
@@flatfingertuning727 Argh. For some reason my first reply vanished, leaving only my second. As I was mentioning in the first reply, no, on the 8080 (for which MS-BASIC was originally written) using a length byte instead of following pointers would have been slower, since the 8080 has good capabilities for loading and following 16-bit pointers, but poor capabilities for doing arithmetic on them.
@moddaudio
@moddaudio 7 months ago
I used to crash BBSes back in the '80s by entering '1e99' at a number prompt, good times.
@melkiorwiseman5234
@melkiorwiseman5234 7 months ago
I remember when I saw Paul Alger's game Galactic Conflict which was intended to run as a dial-up game, he fixed that problem by simply running the power for his modem through the cassette remote control port. Since TRS-80 CoCo BASIC would always shut off the cassette remote control when it dropped to BASIC, crashing his game would just cause you to be disconnected until he re-started the game.
@roysainsbury4556
@roysainsbury4556 7 months ago
The TRS-80 Model-I and Model-III have this bug (I tried it with the TRS32 emulator), but not the Model-4, as this used a version of Extended BASIC. Same goes for MBASIC in CP/M. The actual start address of a BASIC program on these machines isn't fixed, but there's a pointer somewhere that tells BASIC where it starts. This allowed for tricks like appending a program onto another, but that's a whole different story!
@Lord-Sméagol
@Lord-Sméagol 6 months ago
The BASICs with the bug simply bail out of any function that causes overflow, printing Overflow Error / ?OV Error. The stack pointer is reset and BASIC returns to immediate mode (This prevents VAL restoring the original byte). MBASIC uses an overflow flag, so it always restores the byte. Using an overflow flag is much cleaner, it also makes error trapping (ON ERROR GOTO ) easier.
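The difference between "bail out" and "always restore" can be sketched in Python, where `try`/`finally` plays the role that the overflow flag plays in MBASIC: the cleanup runs whether or not the conversion fails. The names `safe_val` and `BasicOverflow` are hypothetical, and the program line is plain ASCII rather than tokenized BASIC.

```python
class BasicOverflow(Exception):
    """Stands in for an Overflow Error raised during conversion."""

def parse_until_zero(memory: bytearray, start: int) -> float:
    end = memory.index(0, start)                 # scan up to the zero terminator
    value = float(bytes(memory[start:end]).decode("ascii"))
    if value >= 2.0 ** 127:                      # assumed range limit of the format
        raise BasicOverflow()
    return value

def safe_val(memory: bytearray, start: int, length: int) -> float:
    end = start + length
    saved = memory[end]
    memory[end] = 0                              # temporary terminator, as in the original hack
    try:
        return parse_until_zero(memory, start)
    finally:
        memory[end] = saved                      # runs on the normal path and the error path

line = bytearray(b'A=VAL("1E39"):REM SHOW BUG\x00')
original = bytes(line)
try:
    safe_val(line, line.index(b"1E39"), 4)
except BasicOverflow:
    print("OVERFLOW ERROR")
print(bytes(line) == original)                   # True: the closing quote was restored
```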
@hanybakir
@hanybakir 5 months ago
I believe that Basic got overflowed because the 1E39 is a very huge number. It is One Duodecillion.
@Rob2
@Rob2 7 months ago
When I saw that thumbnail I immediately knew what was going wrong 🙂 In those days I disassembled the TRS-80 basic and even now, 42 years or so later, it came back in my memory that VAL() actually saved and overwrote the character at the end of the string to temporarily terminate it, and now realized that would go wrong when an error occurs...
@maximilianmecke3012
@maximilianmecke3012 7 months ago
The Commander X16 has got that bug, too. I have just tried it on the emulator.
@63801170
@63801170 7 months ago
The Commander X16 modern computer uses C64 BASIC v2 (with extensions) and also has the bug included.
@chromosundrift
@chromosundrift 7 months ago
I wonder if their license agreement forbids them from fixing it!
@flatfingertuning727
@flatfingertuning727 7 months ago
If license agreements wouldn't be a problem, there are many places where a few tweaks to the BASIC interpreter could massively improve performance without breaking compatibility with any code that doesn't rely upon the addresses of routines in memory. The floating-point shift-right loop is sometimes used on positive values in FAC1, sometimes used on positive values in FAC2, sometimes used on potentially-negative up-to-32-bit values in INT, and sometimes used on potentially-negative values in cases requiring a 16-bit result. Having separate routines to handle these different uses might increase the ROM size by a few dozen bytes, but enormously improve performance. When executing "FOR I=128TO255:NEXT", the existing interpreter spends about a quarter of the overall execution time in one of the aforementioned shift loops, and when processing PEEK and POKE statements even more time is spent in those loops. Having special 16-bit versions of those loops could probably more than double the performance of something like "FOR I=1024 TO 2023:POKE I,PEEK(I+D):NEXT", without having to make any major changes to the overall design of the interpreter.
@KitsuneFuzzy
@KitsuneFuzzy 7 months ago
Not sure if this is an emulator thing, but I got curious: since everything of line 10 still seems to be in memory, what would happen if you run the program to break it as shown in the video, then list the broken program, with the "removed" REM comment, and put back the quotation mark from where it was removed? Immediately after placing the " it seems to break arrow key navigation through the code and instead inputs symbols. That only seems to happen if you put quotation marks back; any other symbol (at least tested with letters) broke nothing.
@silkwesir1444
@silkwesir1444 6 months ago
That is actually a feature of the screen editor. It automatically detects if you are within quotes, and then the control characters for special keys like arrow keys, function keys and so on can be input and displayed like that. Pretty neat actually, though it can get confusing sometimes...
@mdpenny42
@mdpenny42 7 months ago
FWIW, doesn't seem to affect Acorn's BASIC (as in "BBC BASIC" version 2 on a BBC Micro) - checked on an emulator. Then again, Acorn's BASIC was their own development, rather than extended from an extant version of Microsoft BASIC.
@HardDriveGuruOfficial
@HardDriveGuruOfficial 6 months ago
Okay, what's the song during the patron list? It's a bop for real!
@cheater00
@cheater00 7 months ago
hey Robin, I have an idea for a video, or series of videos, for you. you usually show a thing on the c64 and go deep into how it works and why. wouldn't it be fun if you compared some other computers? like, take a simple thing, like integers, and their overflow modes. compare basic implementations and see why they differ, and how that is grounded in the hardware it's running on. or how line drawing functions compare. or how strings work. or how the value of pi**2 changes. or what various control structures each computer has. this would be a great chance to whip out some really obscure basic machines or even clones and knockoffs from the soviet union, china, or south america... let me know if you like the idea!
@ChrisCromwellHP
@ChrisCromwellHP 7 months ago
I wonder if this bug was corrected in the Tandy COCO 3 Color Basic? 🤔
@lindnertim
@lindnertim 7 months ago
Nope.
@HelloKittyFanMan
@HelloKittyFanMan 7 months ago
The E actually means _"times_ ten to the specified power."
@williamdrum9899
@williamdrum9899 7 months ago
Hmm I guess I'm used to thinking of it as a hexadecimal value
@HelloKittyFanMan
@HelloKittyFanMan 7 months ago
@@williamdrum9899: Yeah, it's (dec.) 14 when in notations like "$3E7B" or "E4"; but means "Exponent" (with the already-understood "*10") in notations like "1E39" or "5E+26" or "18E-9," where the environment is assumed to be decimal (as it is in BASIC). But that raises the question: Normally I've seen big numbers expressed by the computer as "[x]E+/-[y]"; I mean always with a + or -. So that reminds me to ask Robin to try this with the + or - and see if that still triggers the bug. (If not today, then to wait until after Christmas to address it, of course.) Happy Christmas! 🎄🎅
@davidellsworth4203
@davidellsworth4203 7 months ago
Are there any platforms on which the bug is fixed (by undoing the NUL-terminator overwrite upon an overflow error, instead of copying the string), but pressing the equivalent of Ctrl-Break with perfect timing can still result in program corruption, by breaking out of the machine language routine that handles VAL (a race condition)? [Edit: I retract the following as it has been brought to my attention that that solution would only work for string literals and not for strings in the heap.] It's strange that they used a NUL termination overwrite at all, though. Wouldn't the character it overwrites always be a non-numeric character anyway (a quotation mark) which terminates VAL's evaluation? It should be possible to fix the bug merely by getting rid of that overwrite, and still operating on the string in situ.
@NuntiusLegis
@NuntiusLegis 7 months ago
If I understand it correctly: Only if a string literal in the code is evaluated, but VAL can also process string variables where the string data is in string memory without quotation marks or other terminating characters, and then the next string can start with a numeric character.
@WY.C64-Guy
@WY.C64-Guy 7 months ago
You mean like this?
10 a$="4321"+chr$(34)+"9876"
20 print val(a$)
@awilliams1701
@awilliams1701 7 months ago
I actually ran into 100+100 = 100100 yesterday. It's a problem I run into with javascript a lot. The way I fix it is I subtract 0. So I do "100"-0+100 = 200. I also recently learned thanks to our code scans that javascript scoping is shit compared to what I'm familiar with. I was getting "variable already declared" errors in the scanner. I'm like.....does javascript not have proper scopes? NOPE!!
@G1itcher
@G1itcher 7 months ago
Use let rather than var.
@awilliams1701
@awilliams1701 7 months ago
@@G1itcher I'm not using val or let. It's just adding variables with numbers. The numbers are numbers, but sometimes the variables are strings and sometimes they are numbers.
@whitslack
@whitslack 7 months ago
Rather than subtracting zero, just use the unary plus operator to convert the first operand into a number. +"100"+100 == 200.
@awilliams1701
@awilliams1701 7 months ago
@@whitslack I'm not testing against 200, I'm adding 100 to it and then using the 200 result later. or are you saying that +"100" = 100?
@whitslack
@whitslack 7 months ago
@@awilliams1701 +"100" evaluates to 100. Applying the unary plus operator to a string value produces a number value.
@CrazyBossDK
@CrazyBossDK 6 months ago
On Memotech BASIC (Memotech MTX500, 512 and RS128) you just get "Overflow", nothing else. You don't need to use the VAL version, but you can; Memotech BASIC also accepts PRINT 1E39 by itself. Both versions give an Overflow.
@CrazyMan_Engineer
@CrazyMan_Engineer 7 months ago
Who created the first version that had this bug? Was it intentional to stop theft of the basic software?
@LaserFur
@LaserFur 7 months ago
When I was in school I entered 1e9999 into an educational program and it crashed to BASIC. I set my score to 110% and ran the code to print out the test results. The teacher was so worried that I broke the Apple II.
@user-pn6yw2mz9d
@user-pn6yw2mz9d 4 months ago
Hi Robin. Nice video. You made a video about a VTECH PRECOMPUTER 1000. I've got the 2000 and the Precomputer Prestige computer for kids. They also have the bug. Looks like they have the same BASIC as the PC 1000?
@davidhand9721
@davidhand9721 7 months ago
I've used Basic dialects that didn't use the *+* operator for concatenation, and I've never liked the string+string syntax since. I think it was the *&* operator. It doesn't make a lot of sense to have numeric addition and string concatenation share a syntax because there's no case where the operations are interchangeable. The only thing you can do by changing the type of the variables alone is create a quiet error. It really frustrates me now when I use languages like Python that do the same thing.
@NuntiusLegis
@NuntiusLegis 6 months ago
It is quite intuitive to use + here. A few years later it would have been praised as clever "operator overloading".
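For readers who want to see the "quiet error" in the Python case mentioned above, here is a minimal sketch. The `total` function is a hypothetical example, not taken from any program discussed in the thread. The same expression adds or concatenates depending only on the operand types.

```python
def total(a, b):
    # The operator is chosen by the operand types at run time.
    return a + b

print(total(100, 100))          # 200: numeric addition
print(total("100", "100"))      # 100100: string concatenation, no error raised
print(total(int("100"), 100))   # 200: explicit conversion makes the intent clear
```

Unlike JavaScript, Python refuses to mix the two: `total("100", 100)` raises a TypeError instead of silently concatenating, so the quiet failure only appears when both operands happen to be strings.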
@philp4684
@philp4684 7 months ago
19:23 The singular form of "parentheses" is "parenthesis", not "parenthese".
@8_Bit
@8_Bit 7 months ago
Since I won't remember that, I'll just go back to calling them (round) brackets which is the usual Commonwealth term for them.
@therealxunil2
@therealxunil2 7 months ago
Guessing it’s storing 1x10^39. Now I’ll finish watching.
@aresaurelian
@aresaurelian 7 months ago
I have a memory of trying VAL() as a kid and getting this error when doing some elaborate mathematical wizardry. It annoyed me.
@barcoboy2
@barcoboy2 7 months ago
Very interesting. After running the program, if you try to insert a line between 10 and 20, or add a line to the top or bottom of the program, line 10 gets changed to:
10 PRINT VAL("1E39 8335 SHOW BUG
Fixing line 10 or even making it shorter or longer does not result in any corruption that I can see.
@WY.C64-Guy
@WY.C64-Guy 7 months ago
I think that's because the linker has to execute to make a correct linked list to include the new line(s), and runs across the garbage left behind.
@faenethlorhalien
@faenethlorhalien 7 months ago
Wow. I remember using it on the Speccy and never had any issues with it. Obvs even fewer on GWDOS on the pc.
@8_Bit
@8_Bit 7 months ago
Yeah, as far as I know Speccy's BASIC has no connection to Microsoft's so it suffers from a completely different set of bugs :) (I might finally make some Speccy videos in 2024, so watch out) ;)
@AureliusR
@AureliusR 7 months ago
Tim Lidner is such an awesome person.
@junker15
@junker15 7 months ago
I wonder what happens to the stack when the VAL subroutine encounters an unexpected error. I have the assembly listing in front of me, and that byte they replace to terminate the string is pushed onto the stack, and if there's no error, then it's pulled off and put back. But if there's an error, BASIC's error routine seems to bail through NEWSTT, which might make it all good as far as BASIC's concerned, but leave that tiny bit of junk still on the stack (NEWSTT saves the stack pointer, but doesn't seem to do anything else related to the stack). I don't imagine it being enough to overflow the stack, but "I don't think it'll matter" is a great start for some truly insidious bugs to happen.
@NuntiusLegis
@NuntiusLegis 7 months ago
The bug described in the video is extremely unlikely to happen anyway. I would be surprised if it had ever been a practical problem.
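The save, overwrite and restore sequence discussed above can be modelled roughly as follows. This is a Python simulation under stated assumptions, not a transcription of the ROM code: `buggy_val`, `parse_number`, `BasicOverflow` and the plain-text `program` buffer are hypothetical stand-ins (a real program is tokenized), and the Python list only imitates the CPU stack.

```python
class BasicOverflow(Exception):
    """Stands in for BASIC's ?OVERFLOW ERROR."""

MAX_FLOAT = 1.70141183e38  # approximate upper limit of the 5-byte float format

def parse_number(mem, addr):
    end = mem.index(0, addr)                      # scan up to the NUL terminator
    value = float(mem[addr:end].decode("ascii"))
    if abs(value) > MAX_FLOAT:
        raise BasicOverflow()
    return value

def buggy_val(mem, addr, length):
    stack = [mem[addr + length]]      # save the byte just past the string
    mem[addr + length] = 0            # overwrite it with a NUL terminator
    value = parse_number(mem, addr)   # an error here skips the restore below
    mem[addr + length] = stack.pop()  # normal path: put the byte back
    return value

ok_line = bytearray(b'A=VAL("1E38")')
ok_value = buggy_val(ok_line, ok_line.index(b'"') + 1, 4)

program = bytearray(b'A=VAL("1E39"):REM SHOW BUG')
try:
    buggy_val(program, program.index(b'"') + 1, 4)
except BasicOverflow:
    pass

print(ok_line)   # unchanged: the closing quote was restored
print(program)   # the closing quote is now a zero byte
```

In a tokenized program a zero byte also marks the end of a line, which is why the leftover terminator damages the listing rather than just one character. What happens to the saved byte on the real CPU stack is not modelled here.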
@kevinwnz
@kevinwnz 7 months ago
The Commander X16 also has this BASIC bug
@VintageGearFreak
@VintageGearFreak 7 months ago
Does VAL() also evaluate expressions like "4+5" ?
@8_Bit
@8_Bit 7 months ago
No, unfortunately, it'll stop when it hits the + sign. It converts a number in text form, consisting of the digits 0-9, and characters "." and "E" for decimals and scientific notation. It does parse "-" or "+" but only at the beginning of the number, or after the "E".
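The parsing rules described in this reply can be approximated with a regular expression. This is a simplified sketch in Python, assuming only the rules stated above; `val_sketch` and `NUMBER` are hypothetical names, and details of the real routine such as space handling and the overflow check are left out.

```python
import re

# Optional sign, digits with an optional decimal point, then an optional
# exponent: "E", optional sign, digits.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:E[+-]?\d+)?")

def val_sketch(text):
    """Convert the leading numeric part of text; return 0.0 if there is none."""
    match = NUMBER.match(text)
    return float(match.group()) if match else 0.0

print(val_sketch("4+5"))     # 4.0: parsing stops at the "+"
print(val_sketch("-1.5E2"))  # -150.0
print(val_sketch("1E+2X"))   # 100.0: a sign is accepted right after the "E"
print(val_sketch("ABC"))     # 0.0
```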
@abramuccio
@abramuccio 7 months ago
Commander X16 is also affected by this bug :)
@FuerstBerg
@FuerstBerg 7 months ago
I remember many crashes after an overflow error...
@watchmakerful
@watchmakerful 7 months ago
Why does it perform these actions in the program memory itself? Isn't it safer to copy this string somewhere else in the memory and then process it?
@8_Bit
@8_Bit 7 months ago
Yes, it really should do this somewhere else in memory, with an extra byte at the end for the null terminator. I can only guess that they did this sort of hacky solution just to save ROM code space, RAM at runtime, and CPU cycles which were always in short supply in those days.
@NuntiusLegis
@NuntiusLegis 7 months ago
@@8_Bit I am glad they did, because I won't lose sleep over this bug. :-)
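The copy-based alternative raised in this thread might look like the sketch below. It is a Python illustration under assumptions, not how any particular BASIC does it: `val_on_copy`, the `scratch` buffer and the plain-text `program` bytes are hypothetical, and Python's built-in OverflowError stands in for BASIC's error.

```python
MAX_FLOAT = 1.70141183e38  # approximate upper limit of the 5-byte float format

def val_on_copy(mem, addr, length):
    # Copy the string into a scratch buffer with one extra byte for the NUL,
    # so the original memory is never written to.
    scratch = bytearray(mem[addr:addr + length]) + b"\x00"
    value = float(scratch[:scratch.index(0)].decode("ascii"))
    if abs(value) > MAX_FLOAT:
        raise OverflowError("overflow")
    return value

program = bytearray(b'A=VAL("1E39"):REM SHOW BUG')
before = bytes(program)
try:
    val_on_copy(program, program.index(b'"') + 1, 4)
except OverflowError:
    pass
unchanged = bytes(program) == before
print(unchanged)  # True: the error leaves the program text intact
```

The trade-off is the one guessed at above: the copy needs a buffer of the string's length plus one byte and an extra copy loop, both of which cost RAM, ROM space and cycles.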
@fuzzix
@fuzzix 7 months ago
7:30 While Microsoft released fixed BASICs, it looks like Commodore still didn't shell out for an updated version and backported fixes themselves? Great investigation, cheers!
@williamdrum9899
@williamdrum9899 7 months ago
Probably to save money on the licensing
@joehoy9242
@joehoy9242 7 months ago
@@williamdrum9899 CBM (specifically Jack Tramiel) didn't do licensing; he insisted on buying a cut of MS BASIC outright in 1977 that incurred no further cost, and that CBM would be able to brand and modify themselves. This was done when the PET was in development, and because Micro-Soft were very much a fledgling company at the time and needed the money, they sold it lock, stock and barrel for about 50 thousand dollars (yup, you read that right), but they didn't read the fine print, which said CBM could use it on any derived technology, not just the PET itself. Microsoft then had to watch as that version of BASIC was used on the VIC-20 (the first home machine to ship a million units) and, to add insult to injury, the C64 (the most successful home computer of its generation, shipping around 8 million units), knowing that they would not see a red cent from any of it. Tramiel pretty much took them to the cleaners, and it wasn't until they lucked out leveraging BillG's parents' relationship with IBM that the company's future was certain. There's a reason MS (and Apple) don't like to talk much about CBM!
@pantherosgaming1995
@pantherosgaming1995 6 months ago
I found the same bug in the Commander X16 emu too.
@turnkit
@turnkit 7 months ago
This bug is also in the TRS-80 Model III BASIC. I think Radio Shack tried to claim the Model 4 BASIC wasn't Microsoft's anymore since they modified parts to get out of licensing. Something like that. Would be curious to see if the bug is still there.
@turnkit
@turnkit 7 months ago
The DOS based TRS-80 Model 4 BASIC 01.01.01 for TRSDOS Version 6, Copyright 1984 by Microsoft, licensed to Tandy Corp., works properly. (The Model 3 ROM BASIC did not.)
@turnkit
@turnkit 7 months ago
The ROM based BASIC for the TRS-80 Model 4 fails though. This BASIC claims to be "(C) '80 Tandy" but yet still has exactly the same bug as Microsoft's. Maybe because Tandy really didn't create their own BASIC but just stole Microsoft's code?
@crazyedo9979
@crazyedo9979 7 months ago
It can be easily identified. This bug has a hunchback, a wooden leg, a steel hook for a hand and a long grey beard. A parrot sitting on his shoulder.😁
@8bitsaga
@8bitsaga 7 months ago
Confirmed that MSX-BASIC does NOT contain the 1E39 bug.
@bjbell52
@bjbell52 7 months ago
I tried it on an Atari 800 (emulator) running Microsoft Basic. It comes back with a message "overflow" cr "overflow" .