Streaming GTA5 on the original Game Boy via WiFi cartridge

A few weeks ago I created a Game Boy cartridge with built-in WiFi. Now I've taught it to stream video and play games – in full resolution. At 20 fps. On an unmodified original Game Boy.

00:00 Intro
01:38 Drawing a full screen image on the Game Boy
04:32 Fast data transfer for fast videos
07:45 Video stream demos
08:50 Game demos
10:36 Conclusion


  4. The Game Boy sound chip has 4 channels: two that play square waves (that classic chiptune sound), one noise channel, and one that can play samples from storage. Your best bet would be to feed an audio stream to that sample channel by loading a continuous flow of chopped-up samples from the card. I think you need to downsample the audio to 4 bit first or something. Looking forward to hearing it happen!
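The 4-bit step suggested in comment 4 could be sketched roughly as below. The sketch assumes unsigned 8-bit mono PCM as input and relies on the Game Boy's wave RAM holding 32 four-bit samples in 16 bytes, high nibble first. The function names (`to_4bit`, `pack_wave_ram`, `wave_blocks`) and the padding choice are hypothetical. How the cartridge would deliver these blocks in time is not covered in the video description, so none of that is modelled here.

```python
WAVE_RAM_SAMPLES = 32  # wave RAM holds 32 samples of 4 bits each (16 bytes)


def to_4bit(samples_u8):
    """Reduce unsigned 8-bit samples (0..255) to 4 bits (0..15) by keeping the top nibble."""
    return [s >> 4 for s in samples_u8]


def pack_wave_ram(samples_4bit):
    """Pack exactly 32 four-bit samples into 16 bytes, high nibble first."""
    if len(samples_4bit) != WAVE_RAM_SAMPLES:
        raise ValueError("expected exactly 32 samples")
    if any(not 0 <= s <= 15 for s in samples_4bit):
        raise ValueError("samples must fit in 4 bits")
    evens = samples_4bit[0::2]  # played first, stored in the high nibble
    odds = samples_4bit[1::2]   # played second, stored in the low nibble
    return bytes((hi << 4) | lo for hi, lo in zip(evens, odds))


def wave_blocks(pcm_u8, pad=0x80):
    """Yield 16-byte wave RAM images from an 8-bit PCM stream.

    The last block is padded with a mid-level value (an assumption, chosen
    so the padding sits near the middle of the 4-bit range).
    """
    for start in range(0, len(pcm_u8), WAVE_RAM_SAMPLES):
        chunk = list(pcm_u8[start:start + WAVE_RAM_SAMPLES])
        chunk += [pad] * (WAVE_RAM_SAMPLES - len(chunk))
        yield pack_wave_ram(to_4bit(chunk))


if __name__ == "__main__":
    ramp = bytes(range(0, 256, 8))  # 32 samples rising from 0 to 248
    for block in wave_blocks(ramp):
        print(block.hex())
```

Keeping only the top nibble is the crudest possible quantizer; dithering or resampling to a lower rate first would be separate steps and are left out here.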

  5. Register A can begin hblank in a loaded state, saving two cycles.

    ZX Spectrum kids would suggest loading data into registers during LCD refresh, so slightly more can be moved into VRAM during hblank… but the GB doesn't have Z80 shadow registers. So you could start hblank with bytes in B & C, to do LD A,B / LD [HLI],A / LD A,C / LD [HLI],A, which saves one cycle per loaded register.

    Extending that becomes a tradeoff. You can overwrite DE… which also requires overwriting HL, temporarily. Basically, put the read address in HL, read data into B, C, D, and E, then pop the write address back into HL. Once hblank starts, do four of those 3-cycle writes, then pop DE from the stack and continue as before.

    You could probably begin hblank with A loaded, for an additional 2-cycle write, but the data order is weird. You MUST write A first, but you can only read it last. Which… isn't a problem for your wifi cartridge, since all reads come from the same address. You'd just change the order of bytes coming out.

    Pops take three cycles, but only one of them occurs during hblank itself. LD [HLI],A is two cycles. Register-to-register loads are one cycle, and you pair those with LD [HLI],A, four times. Then the pop. 17 cycles for the first five bytes, then back to 4 cycles per byte until hblank runs out. So 18 bytes per hblank. Which screws up HL's clean 8-bit alignment every 14 scanlines and probably isn't worth the hassle.

    Honestly I'd just lose one row of tiles to hit 30 Hz at 160×136 or 152×144. Throw an unsharp mask filter into the pre-processing. Maybe send 16 bytes of FFT data so the sample channel can broadly approximate in-game audio. It brings this ridiculous monstrosity closer to useful. And isn't that the best punchline?
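The cycle counting in comment 5 can be checked with a few lines of arithmetic. The per-instruction costs (2 cycles for LD [HLI],A, 1 for a register-to-register load, 3 for a pop) and the steady-state 4 cycles per byte are taken from the comment. The 69-cycle budget is an assumption back-derived from its "18 bytes per hblank" figure, not a measured value. The function names are hypothetical.

```python
LD_HLI_A = 2  # LD [HLI],A, in machine cycles
LD_R_R = 1    # register-to-register load such as LD A,B
POP = 3       # POP DE, counted in full as in the comment's 17-cycle total


def preloaded_prefix():
    """Return (cycles, bytes) for the preloaded start of hblank.

    Sequence: write A, then four (LD A,r / LD [HLI],A) pairs for B, C, D, E,
    then one POP to restore DE.
    """
    cycles = LD_HLI_A + 4 * (LD_R_R + LD_HLI_A) + POP
    return cycles, 5


def bytes_per_hblank(budget_cycles, steady_cycles_per_byte=4, preload=True):
    """Bytes that fit in a cycle budget, with or without the preload trick."""
    if not preload:
        return budget_cycles // steady_cycles_per_byte
    cycles, count = preloaded_prefix()
    if budget_cycles < cycles:
        raise ValueError("budget is too small for the preloaded prefix")
    return count + (budget_cycles - cycles) // steady_cycles_per_byte


if __name__ == "__main__":
    BUDGET = 69  # assumed budget, chosen to reproduce the comment's figure
    print(preloaded_prefix())                       # (17, 5)
    print(bytes_per_hblank(BUDGET))                 # 18
    print(bytes_per_hblank(BUDGET, preload=False))  # 17
```

Under that assumed budget the preload gains a single byte per scanline, which is consistent with the comment's view that it probably isn't worth the hassle.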

