AI Model Runs on 64 kB of RAM and a Zilog Z80 CPU
What you'll learn:
- How AI is run on a Zilog Z80.
- What kinds of responses the Z80 model can produce.
- Where you can try it yourself.
The Zilog Z80 CPU launched in 1976 and went on to power many everyday devices, including calculators, arcade systems, and home computers. Recently, developer HarryR released Z80-μLM on GitHub, an AI model that runs on a Zilog Z80 CPU with 64 kB of RAM. While this is a notable achievement, it doesn’t approach human-like intelligence.
HarryR wrote in the README, "Z80-μLM is a 'conversational AI' that generates short character-by-character sequences, with quantization-aware training (QAT) to run on a Z80 processor with 64kb of RAM." The developer wants to explore how small an AI model can be made while still retaining a personality. The project also aims to determine whether such a model can be trained and fine-tuned under extreme constraints. The result fits in 40 kB, including the chat UI, the weights, and the inference code.
According to HarryR, the Z80AI project features:
- Trigram hash encoding: Input text is hashed into 128 buckets — typo-tolerant, word-order invariant.
- 2-bit weight quantization: Each weight is {-2, -1, 0, +1}, packed 4 per byte.
- 16-bit integer inference: All math uses Z80-native 16-bit signed arithmetic.
- ~40KB .COM file: Fits in CP/M's Transient Program Area (TPA).
- Autoregressive generation: Outputs text character-by-character.
- No floating point: Everything is integer math with fixed-point scaling.
- Interactive chat mode: Just run CHAT with no arguments.
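To make the first three items more concrete, here is a minimal Python sketch of trigram bucketing, 2-bit weight packing, and a 16-bit integer dot product. It is an illustration only, not HarryR's code: the hash function, the per-word padding, the bit order within each byte, the two's-complement encoding of the weights, and the wrap-on-overflow behavior are all assumptions, and every function name is hypothetical.

```python
from typing import List

NUM_BUCKETS = 128  # bucket count quoted in the feature list


def trigram_buckets(text: str, num_buckets: int = NUM_BUCKETS) -> List[int]:
    """Count character trigrams of each word into hash buckets.

    Assumption: trigrams are taken per word (padded with one space on each
    side), which makes the result independent of word order. The hash is an
    illustrative choice, not the project's.
    """
    counts = [0] * num_buckets
    for word in text.upper().split():
        padded = " " + word + " "
        for i in range(len(padded) - 2):
            h = 0
            for ch in padded[i:i + 3]:
                h = (h * 31 + ord(ch)) & 0xFFFF  # keep the hash in 16 bits
            counts[h % num_buckets] += 1
    return counts


def pack_weights(weights: List[int]) -> bytes:
    """Pack weights from {-2, -1, 0, +1} four to a byte.

    Assumption: 2-bit two's complement, first weight in the lowest bits.
    """
    if len(weights) % 4 != 0:
        raise ValueError("weight count must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            if w not in (-2, -1, 0, 1):
                raise ValueError(f"weight {w} is outside the 2-bit range")
            byte |= (w & 0b11) << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack_weights(packed: bytes) -> List[int]:
    """Reverse pack_weights: expand each byte into four signed weights."""
    weights = []
    for byte in packed:
        for j in range(4):
            code = (byte >> (2 * j)) & 0b11
            weights.append(code - 4 if code >= 2 else code)
    return weights


def wrap_int16(x: int) -> int:
    """Reduce x to a signed 16-bit value, wrapping on overflow."""
    x &= 0xFFFF
    return x - 0x10000 if x >= 0x8000 else x


def dot_int16(inputs: List[int], weights: List[int]) -> int:
    """Integer dot product with a signed 16-bit accumulator.

    Assumption: overflow wraps; a real implementation might saturate or
    rescale instead.
    """
    acc = 0
    for x, w in zip(inputs, weights):
        acc = wrap_int16(acc + x * w)
    return acc
```

In this sketch, `trigram_buckets("hello world")` and `trigram_buckets("world hello")` return the same 128-entry vector, and a single typo changes only the few trigrams that overlap it. Packing four weights per byte also gives a sense of scale: a 40 kB file could hold at most about 160,000 such weights, minus whatever space the code and chat UI occupy.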
The project includes two pre-built examples, Tinychat and Guess. Tinychat is a minimalist conversational agent that gives short answers to greetings and questions. Guess is a 20 Questions-style game in which the model holds a hidden answer that the user must uncover through dialogue.
Both are distributed as precompiled binaries for CP/M systems and the Sinclair ZX Spectrum. For CP/M, the project uses buildz80com.py to generate standard .COM files that can be run directly. For the ZX Spectrum, the build produces two .TAP files that load on an emulator or on original hardware. That version uses the ZX Spectrum ROM routines for input and output, and its memory use is optimized for 48K systems.
The model also gives short, nuanced responses:
- OK - acknowledged, neutral
- WHY? - questioning your premise
- R U? - casting existential doubt
- MAYBE - genuine uncertainty
- AM I? - reflecting the question back
"It's a different mode of interaction. The terse responses force you to infer meaning from context or ask probing direct yes/no questions to see if it understands or not," said HarryR.
About the Author
Cabe Atwell
Technology Editor, Electronic Design

