Worse than the floating bus in this example is code that depends on uninitialized RAM. The power-on contents are often consistent for a given set of DRAM chips, so the code will always work on your machine or emulator but won't on someone else's machine with different DRAM. Invariably you catch this at a demoparty, when it won't run on the party machine and you only have 15 minutes to fix it before your demo is presented.
But still, DRAM is what you would use for a "real" system. Wozniak's design for the Apple II used a clever hack where the memory system actually runs at 2 MHz with an effective CPU rate of 1 MHz. Any read from a DRAM row refreshes the entire row, and approximately every other cycle the video system steps incrementally through memory, refreshing as it goes.
The reason is that the designers saved a few chips by repurposing the Z80's refresh circuit as a counter/address generator when generating the video signal. Specifically, the machine uses the instruction fetch cycle to read the character code from RAM, then uses the refresh cycle to read the actual line of character data from the ROM. The ZX80 nominally clocks the Z80 at 3.25 MHz, but a machine cycle is four clocks (two for fetch, two for refresh), so it's effectively the same speed as a 0.8125 MHz 6502.
I wrote a long section here about how the ZX80 uses the CPU to generate the screen and the extra logic that involves, but it was getting too long :) The ZX81 is basically just a cost-reduced ZX80 where all the discrete logic chips are moved into one semi-custom chip.
Doing this makes external RAM packs more expensive too. You couldn't use the real refresh address coming from the Z80 because the video generator would be hopping around a small range of addresses in the ROM, rather than covering the whole of RAM (or at least each row of the DRAM). The designer has two options:
1. Use static RAM in the external RAM pack, making it substantially more expensive for the RAM itself.
2. Use DRAM in the external RAM pack, and add extra refresh circuitry to refresh the DRAM while the main computer's refresh cycle is busy with its video madness.
I think most RAM packs did the second option.
This is more about saving chips than saving cycles, since the Apple II was implemented entirely with 74 series logic. A more traditional approach that used spare cycles during horizontal blanking would have required several more chips.
It does mean that the layout of the Apple II's screen memory is somewhat insane. Those DRAM chips needed to be refreshed every 2 ms, but it takes 16 ms to scan out a whole screen, so every 8th of the screen needs to be spread out across all 128 DRAM rows.
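The arithmetic behind that layout constraint can be sketched with the round numbers above (a toy calculation, not the exact Apple II timings):

```python
# Toy calculation using the round numbers from the comment above
# (hypothetical sketch; real Apple II timings differ slightly).
refresh_interval_ms = 2   # each DRAM row must be touched at least this often
frame_time_ms = 16        # time to scan out one full video frame
dram_rows = 128           # row addresses that need refreshing

slices = frame_time_ms // refresh_interval_ms
print(slices)  # 8: each 1/8 of the screen scan must hit all 128 row addresses
```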
The BBC Micro range all used 250 ns DRAM, with the CPU getting 2 million accesses per second and the video system the other 2 million (taking advantage of the 6502's predictable RAM access pattern). The display memory fetches served to refresh the RAM.
I don't know much about the Acorn Electron, which was very different internally, but it had dynamic RAM as well. I expect the video fetches were used to refresh the DRAM in this case too: the display memory layout was the same, so every 640 microseconds it would touch every possible address LSB.
The 6502 second processor had DRAM as well, refreshed by a circuit that ran on a timer and stole the occasional cycle from the CPU at some rate.
Though static RAM was quite common for RAM upgrade boards (of one kind or another), presumably cheaper for this case than the alternative.
For higher memory capacities, e.g. 32 kB, 48 kB, or 64 kB, static RAM would have been too expensive and too big, even though the 6502, unlike the Zilog Z80, did not have an integrated DRAM refresh counter.
Using SRAM instead of DRAM meant using 4 times more IC packages, e.g., 32 packages instead of 8, while the additional refresh circuitry required by DRAM would have needed only 1 to 4 extra IC packages. Frequently the display controller could also be used to ensure DRAM refresh.
The Apple II was one of the first 6502 systems to use DRAM (in 1977), and Woz was incredibly clever in getting the refresh for free as a side effect of the video generation.
Static RAM (SRAM) is a circuit that retains its data as long as the power is supplied to it. Dynamic RAM (DRAM) must be refreshed frequently. It's basically a large array of tiny capacitors which leak their stored charge through imperfect transistor switches, so a charged capacitor must be regularly recharged. You would think that you would need to read the bit and rewrite its value in a second cycle, but it turns out that reading the value is itself a destructive operation and requires the chip to internally recharge the capacitors.
Further, the chip is organised in rows and columns - generally there are the same number of Sense Amplifiers as columns, with a whole row of cells discharging into their corresponding Sense Amplifiers on each read cycle, the Sense Amplifiers then being used to recharge that row of cells. The column signals select which Sense Amplifier is connected to the output. So you don't need to read every row and column of a chip, just some column on every row. The Sense Amplifier is a circuit that takes the very tiny charge from the cell transistor and brings it up to a stable signal voltage for the output.
So why use DRAM at all if it has this need to be constantly refreshed? Because the Static RAM circuit requires 4-6 transistors per cell, while DRAM only requires 1. You get close to 4-6 times as much storage from the same number of transistors.
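The refresh-on-read behavior described above can be illustrated with a toy model (a hypothetical sketch, not real DRAM timing or circuitry): reading any column implicitly refreshes the entire row via the sense amplifiers, so one access per row per interval suffices.

```python
# Hypothetical toy model of DRAM refresh-on-read: touching any column
# in a row counts as refreshing the whole row, because the entire row
# discharges into the sense amplifiers and is rewritten.
class ToyDRAM:
    def __init__(self, rows, cols):
        self.data = [[0] * cols for _ in range(rows)]
        self.last_refresh = [0] * rows  # per-row "time of last refresh"

    def read(self, row, col, now):
        # The whole row is sensed and recharged, not just one bit.
        self.last_refresh[row] = now
        return self.data[row][col]

    def rows_needing_refresh(self, now, interval):
        return [r for r, t in enumerate(self.last_refresh)
                if now - t > interval]

mem = ToyDRAM(rows=4, cols=8)
mem.read(0, 3, now=10)  # touching column 3 refreshes all of row 0
print(mem.rows_needing_refresh(now=10, interval=5))  # [1, 2, 3]
```

Row 0 is fresh after a single column read; the untouched rows 1-3 are the ones a refresh circuit (or a conveniently arranged video scan) would still have to visit.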
Of course, if you just blindly ask it to write asm, it will occasionally invent new instructions or addressing modes, but it's very good at reviewing and making adjustments.
After reading, I realized that he just meant that the bus was "open" as in not connected to anything, because the address line decoders had no memory devices enabled at the specified address ($2000).
It's pretty funny that the omission of the immediate mode (#) went unnoticed until the obsolete emulator didn't behave in the same way as the real hardware when reading <nothing> from memory.
His solution of changing the instruction to use immediate addressing (instead of absolute) has the side effect of faster execution, because the code no longer performs a read from memory. It's probably now faster by about 2 us through that blob of code, but maybe this only matters on bare metal and not in the emulator, which is probably not time-perfect anyway.
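The back-of-envelope arithmetic behind that "~2 us" estimate, assuming a 1 MHz 6502 clock for round numbers (the real machine's clock may differ):

```python
# Cycle-count comparison for LDA on a 6502, assuming a 1 MHz clock for
# round numbers (hypothetical sketch; the actual clock rate may differ).
CLOCK_HZ = 1_000_000
LDA_ABSOLUTE_CYCLES = 4    # opcode, addr low, addr high, data fetch
LDA_IMMEDIATE_CYCLES = 2   # opcode, operand

cycles_saved = LDA_ABSOLUTE_CYCLES - LDA_IMMEDIATE_CYCLES
microseconds_saved = cycles_saved * 1e6 / CLOCK_HZ
print(microseconds_saved)  # 2.0
```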
(Some) SNES emulators really are basically time-perfect, at this point [0]. But 2us isn't going to make an appreciable difference in anything but exceptional cases.
[0] https://arstechnica.com/gaming/2021/06/how-snes-emulators-go...
That means the 2 clock cycles could theoretically make an observable difference if they cause the CPU to miss a frame deadline and cause the game to take a lag frame. But this is rather unlikely.
When byuu/near tried to find a middle-ground for the APU clock, the average turned out to be about 1025296 (32040.5 * 32). Some people have tested units recently and gotten an even higher average. They speculate that aging is causing the frequency to increase, but I don't really know if this is the case or if there really was that much of a discrepancy originally.
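For what it's worth, the arithmetic quoted there checks out:

```python
# Verifying the quoted APU clock average: 32040.5 Hz sample rate
# times the 32x clock multiplier mentioned above.
print(32040.5 * 32)  # 1025296.0
```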
It does cause some significant compatibility issues, too, like with attraction mode desyncs and random freezes.
IIRC ZSNES actually had basically no timing; all instructions ran for effectively one cycle. ZSNES wasn't an accurate emulator, but it mostly worked for most games most of the time.
Donkey Kong 64 has a memory leak that will kill the game after a (for that era) unlikely amount of continuous play time (8-9 hours, if I understand correctly). That was not caught in development, but it is a trivial amount of time to rack up if someone plays the game and saves progress via emulator save states instead of the in-game save feature.
(Note: there is some ambiguous history here. Some sources claim the game shipping with the Expansion Pak was a last-ditch effort to hide the bug by pushing the crash window out to 13-20 hours instead of 8-9. I think recent research on the issue suggests that was coincidence, and that the game shipped without either Rare or Nintendo being aware of the bug.)
Propeller's JMP #address is the 6502's JMP address (a direct jump).
Propeller's JMP address is the 6502's JMP (address) (an indirect jump).
And the worst thing, my buggy Propeller code sometimes works, like in this article. Until it stops and I spend hours figuring out why.
In order to understand what actually happens, we need to look a little closer at the physical structure of a data bus -- you have long conductors carrying the signals around the motherboard and to the cartridge, separated from the ground plane by a thin layer of insulating substrate. This looks a lot like a capacitor, and in fact it is described and modeled as "parasitic capacitance" by engineers who try to minimize it, since this effect limits the maximum speed of data transmission over the bus. But it also means that, whenever the bus is not being driven, it tends to stay at whatever voltage it was last driven to -- just like a little DRAM cell, producing the "open-bus reads return the last value transferred across the bus" effect described in the article.
It's not uncommon for games to accidentally rely on open-bus effects, like DKC2. On the NES, the serial port registers for connecting to a controller only drive the low-order bits and the high bits are open-bus; there are a few games that read the controller input with the instruction LDA $4016 and expect to see the value $40 or $41 (with the 4 sticking around because of open-bus).
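That $4016 case can be sketched as a toy model (hypothetical and heavily simplified: it ignores real electrical behavior and exact 6502 bus sequencing, and pretends only bit 0 is driven by the controller port):

```python
# Toy model of the NES open-bus effect on controller reads: the port
# drives only the low bit(s), and the undriven high bits retain the
# last value on the bus -- for "LDA $4016", the address high byte $40.
def read_controller(button_pressed):
    addr = 0x4016
    driven_bit = 0x01 if button_pressed else 0x00
    # The last value on the bus before the data phase was the operand's
    # high byte, fetched while decoding the absolute address $4016.
    open_bus = (addr >> 8) & 0xFF   # 0x40 lingers on the undriven lines
    return open_bus | driven_bit

print(hex(read_controller(True)))   # 0x41
print(hex(read_controller(False)))  # 0x40
```

This is why naive code that compares the whole byte against $40/$41 happens to work on hardware, and breaks on an emulator that returns 0 for the undriven bits.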
There's also speedrun strategies that rely on open-bus behavior as part of memory-corruption or arbitrary-code-execution exploits, such as the Super Mario World credits warp, which sends the program counter on a trip through unmapped memory before eventually landing in RAM and executing a payload crafted by carefully manipulating enemy positions [1].
But there are some exceptions to the usually predictable open-bus behavior. Nonstandard cartridges could return a default value for unmapped memory, or include pull-up or pull-down resistors that change the behavior of open bus. There is also an interesting interaction with DMA: the SNES supports a feature called HDMA, which allows applications to schedule DMA transfers from the CPU to the graphics hardware with precise timing, in order to upload data or change settings mid-frame [2]. Such a transfer temporarily pauses the CPU in order to use the bus, which can change the behavior of an open-bus read if a DMA transfer happens to occur in the middle of an instruction (between reading the target address and performing the actual open-bus read).
This very niche edge case has a significant impact on a Super Metroid speedrun exploit [3] which causes an out-of-bounds memcpy that attempts to transfer a large block of data from open bus to RAM. The open-bus read almost always returns zero (because the last byte of the relevant load instruction is zero), but when performed in certain rooms with HDMA-heavy graphical effects, there's a good chance that a DMA transfer will affect one of the reads, causing a non-zero byte to sneak in somewhere important and the exploit to crash instead of working normally. This has created a mild controversy in the community, where some routes and strategies are only reliable on emulators and nonstandard firmware; a player using original hardware or a very accurate emulator has a high chance of experiencing a crash, whereas most emulators (including all of Nintendo's official re-releases of the game) do not emulate this niche edge case of a mid-instruction HDMA transfer changing the value of an open-bus read.
Also, the current fastest TAS completion of Super Metroid [4] relies on this HDMA interaction. We found a crash that attempted to execute open bus, but wasn't normally controllable in a useful way; by manipulating enemies in the room to influence CPU timing, we were able to use HDMA to put useful instructions on the bus at the right timing, eventually getting the console to execute controller inputs as code and achieve full arbitrary code execution.
[1]: https://youtu.be/vAHXK2wut_I
[2]: https://youtu.be/K7gWmdgXPgk
Once again, I have to give a shout out to Ben Eater, whose video series on making a breadboard computer with the 6502 is why I actually understand what the article is about and what you're referring to when describing the hardware issues. (Obviously, extrapolating from his basic bus example to a commercial machine.) I'd be pretty clueless otherwise.
Lots of reads and writes in the original NES just toggled voltages on a line somewhere, and then what happened, happened. You got the effect you wanted by toggling those voltages in a very controlled manner, lock-stepped with the signal indicating the behavior of the CRT blanking intervals. Some animations in Super Mario Bros. 3 involved toggling a memory mux to select from multiple banks of sprite data, so that when the graphics hardware went to pull sprites, it'd pull them from an entirely different chip with slight variations in their look. And since the TV timing mattered, they had to release different software for regions with NTSC and PAL TVs, because those TVs operate with different refresh rates, and the refresh rate was the clock that drove the render logic.
It was a wild time.
Not quite related, but I get a similar feeling when a game seems really tough: "is this due to emulation latency?" I went down a rabbit hole on this one and built myself a MiSTer FPGA!
Sometimes it does feel that way...
I really enjoyed DKC 3 though, which apparently is not that popular among hardcore DKC fans, so there is that.
As for difficulty, we will have to agree to disagree. The only point that had me somewhat frustrated was the waterfall boss in the last game, and that one had me stumped for some time (should have read the manual, I guess?). Overall, I would still recommend the games as good for their time and among the better action platformers for their console, but they are nowhere close to a masterpiece like Super Mario World, which has pretty much perfected controls and where you can tell exactly where platforms and objects start and end.
Regarding my comment about the difficulty, for me it is mostly a result of the bad controls, which, to be honest, is a problem that plagues many games of that era. You have enemies popping out from the edge of the screen and you need to react in a fraction of a second, but the controls don't let you, so you have to memorize the whole game and react in advance. That is not fun for me.