TOY/2 - a minimalist 16-bit CPU
TOY/2 is a minimal 16-bit processor inspired by the TOY CPU described in
the literature (see the references at the end).
TOY/2 was implemented in 3 µm CMOS (Philips / Faselec SACMOS), and
takes up a whopping 3300 transistors (excluding I/O pads). It would
fit into a small corner of an FPGA today. It was designed by Pascal
Dornier and Stephan Paschedag at ETH Zürich in 1988 as part of a course
in chip design.
The architecture was improved in many areas:
- TOY/2 was implemented using pipelining - instructions are executed while
the next instruction is fetched.
- The program counter is incremented by the main ALU while the operand is
read. This reduced complexity. The accumulator and the program counter
switch places for this.  also does this.
- The instruction set was improved. Some redundant instructions were dropped,
others added. Control logic is minimal (max. 4 levels of logic).
- TOY/2 can address a full 64K word program and memory address space.
TOY/2 has a simple 16-bit data path based on an ALU described in the
literature (see the references at the end). The programmer sees the
following registers:
- A accumulator
- PC program counter
- T temporary register for indirect store
and the following flags, both used by the instructions below:
- C carry
- Z zero
All instructions are 16-bit: 4 bits for the opcode, and 12 bits
for the direct address.
|Description|Operation|
| |A:=A+[src]; update C|
| |A:=A xor [src]|
| |A:=A - [src] - C; update C|
|Rotate right through carry|A,C:=A,C ror 1|
|Transfer to T| |
| |A:=A or [src]|
| |A:=A and [src]|
|Load A, clear carry| |
|Indirect jump if carry clear|IF C=1 THEN PC:=[vec]|
|Indirect jump if not zero|IF Z=0 THEN PC:=[vec]|
|Load A indirect| |
|Store T indirect| |
|Load A, don't clear carry| |
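The instruction format and two of the operations listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives the field widths (4-bit opcode, 12-bit direct address) but not their position in the word, so the sketch assumes the opcode sits in the top four bits. It also assumes the conventional reading of "rotate right through carry", where the old carry enters bit 15 and bit 0 becomes the new carry. The helper names (`decode`, `add`, `ror_through_carry`) are hypothetical and not part of TOY/2. Opcode numbers are not assigned here because the source does not give them.

```python
MASK16 = 0xFFFF

# A 12-bit direct address reaches 4096 words; a full 16-bit word used as a
# pointer (as the indirect instructions presumably do) reaches 65536 words.
DIRECT_WORDS = 1 << 12
INDIRECT_WORDS = 1 << 16


def decode(word):
    """Split a 16-bit instruction word into (opcode, direct address).

    Assumption: opcode in bits 15..12, address in bits 11..0.
    """
    word &= MASK16
    return word >> 12, word & 0x0FFF


def add(a, operand):
    """A := A + [src]; update C. Returns (new A, new C)."""
    total = (a & MASK16) + (operand & MASK16)
    return total & MASK16, total >> 16


def ror_through_carry(a, c):
    """A,C := A,C ror 1. Returns (new A, new C).

    Assumption: the old carry moves into bit 15 of A and
    bit 0 of A becomes the new carry.
    """
    new_c = a & 1
    new_a = ((c & 1) << 15) | ((a & MASK16) >> 1)
    return new_a, new_c


if __name__ == "__main__":
    print(decode(0xA123))
    print(add(0xFFFF, 0x0001))
    print(ror_through_carry(0x0002, 1))
```

Under these assumptions, the 12-bit field covers only the first 4096 words directly, which suggests that the indirect loads, stores and jumps are what make the full 64K word space reachable.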
TOY/2 does not implement a stack, call instructions (which can be
emulated with indirect jumps) or interrupts. The focus was on a
minimalist design that could be implemented in the time available (and
that was so simple that the assistants would let us do it).
The architecture does not make very effective use of code space.
It should still do a bit better than a Turing machine.
Performance was still projected to be quite respectable. For
example, the prime number sieve benchmark would take about 1.2 s at
4 MHz compared to 0.7 s on an 80286 (10 MHz, no wait states) or 0.35 s
on a 68020 (16 MHz, 1 wait state, cache enabled).
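To compare these figures independently of clock speed, the run times can be converted to clock cycles. The short Python sketch below does only that arithmetic, using the times and clock rates quoted above. The TOY/2 number is a projection, and the helper name `cycles` is hypothetical. Integer milliseconds and kilohertz are used so the products are exact.

```python
# (run time in ms, clock in kHz) as quoted in the text
benchmarks = {
    "TOY/2": (1200, 4_000),
    "80286": (700, 10_000),
    "68020": (350, 16_000),
}


def cycles(time_ms, clock_khz):
    """Clock cycles consumed: milliseconds times kilohertz."""
    return time_ms * clock_khz


if __name__ == "__main__":
    for name, (time_ms, clock_khz) in benchmarks.items():
        print(f"{name}: {cycles(time_ms, clock_khz):,} cycles")
    # TOY/2: 4,800,000 cycles
    # 80286: 7,000,000 cycles
    # 68020: 5,600,000 cycles
```

Measured in clock cycles rather than seconds, the projected TOY/2 run is the shortest of the three, so its slower wall-clock time comes from the lower clock rate.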
Microcoded versus Hard-wired Control, Phil Koopman,
The Architecture of Microprocessors, Francois Anceau,
Strip Architecture Fits Microcomputer Into Less Silicon,
|© 2002-2021 PC Engines GmbH. All rights reserved.|