Chip-8 on the COSMAC VIP: The Call Routine (Fetch and Decode)

This is part of a series of posts analysing the Chip-8 interpreter on the RCA COSMAC VIP computer. These posts may be useful if you are building a Chip-8 interpreter on another platform or if you have an interest in the operation of the COSMAC VIP. For other posts in the series refer to the index or instruction index.

In a previous post I explained how the initialisation section of the Chip-8 interpreter worked. In this post I’ll look at the Call loop (which is the equivalent of a fetch and decode sequence for a machine language).

This part of the interpreter is responsible for fetching the next Chip-8 instruction from memory, decoding it by analysing its parts and then calling the appropriate routine to deal with each instruction type. To understand how this works, we first need to understand how a Chip-8 instruction is structured.

Each Chip-8 instruction is two bytes. Each byte can be broken down further into two nibbles. The first nibble of the first byte is used to indicate one of sixteen possible instruction groups. These are shown in the table below.

| First Instruction Nibble | Instruction Group |
|---|---|
| 0 | Call machine code instructions |
| 1 | Branch instructions |
| 2 | Call Chip-8 subroutine instructions |
| 3 | Skip if variable equal to immediate operand instructions |
| 4 | Skip if variable not equal to immediate operand instructions |
| 5 | Skip if variable equal to register instructions |
| 6 | Load variable with immediate operand instructions |
| 7 | Add immediate operand to variable instructions |
| 8 | Other arithmetic and logic instructions |
| 9 | Skip if register not equal to register instructions |
| A | Memory indexing instructions |
| B | Branch with offset instructions |
| C | Random number generation instructions |
| D | Display instructions |
| E | Skip if key pressed or not pressed instructions |
| F | Various I/O (timers and keyboard) and memory instructions |

So the first thing the interpreter must do after it has fetched the first byte of the instruction is decode the most significant four bits to determine which of these instruction groups is to be executed.
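As an illustration of this breakdown, here is a minimal Python sketch that splits a two-byte Chip-8 instruction into its four nibbles. The function name `split_instruction` is my own and is not taken from the interpreter; it only shows the bit manipulation involved.

```python
def split_instruction(first_byte: int, second_byte: int) -> tuple[int, int, int, int]:
    """Split a two-byte Chip-8 instruction into its four nibbles."""
    group = (first_byte >> 4) & 0x0F   # most significant nibble: instruction group
    x = first_byte & 0x0F              # low nibble of the first byte
    y = (second_byte >> 4) & 0x0F      # high nibble of the second byte
    n = second_byte & 0x0F             # low nibble of the second byte
    return group, x, y, n


# Example: the instruction 0xD123 belongs to group 0xD (display instructions).
print(split_instruction(0xD1, 0x23))  # (13, 1, 2, 3)
```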

Group 0 is dealt with as a special case. These instructions are used to call machine language subroutines from within a Chip-8 programme. If one of these instructions is detected, the handling routine is executed immediately at this point. It forms an address by masking off the most significant digit of the first byte of the Chip-8 instruction to give the high order byte of the address, and using the second byte of the Chip-8 instruction as the low order byte. The routine at that address is then called. Because only twelve bits of address are available, Chip-8 can call any address in the on-card RAM (i.e. any address from 0x0000 to 0x0FFF), but routines in any extended RAM cannot be called directly. I’ll look further at machine code integration in a future post.
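The address calculation for group 0 can be sketched in Python as follows. The function name `machine_call_address` is hypothetical; the masking and combining steps follow the description above.

```python
def machine_call_address(first_byte: int, second_byte: int) -> int:
    """Form the 12-bit machine code address from a group 0 instruction."""
    high = first_byte & 0x0F    # mask off the group digit (which is 0 anyway)
    low = second_byte & 0xFF    # second byte is used unchanged
    return (high << 8) | low


# Example: the instruction 0x0250 calls the machine code routine at 0x0250.
print(hex(machine_call_address(0x02, 0x50)))  # 0x250
```

The largest address that can be formed this way is 0x0FFF, which is why routines beyond that point are out of reach.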

For the remaining instruction groups, the interpreter next sets up two variable pointers. For some instructions the low order nibble of the first byte is used to select a variable, designated VX. For some of these instructions a second variable, designated VY, is indicated by the high order nibble of the second byte. The interpreter gets these values and uses them to construct pointers to where the relevant variables are stored in memory.
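A rough Python sketch of how these two pointers could be built is shown below. The variables occupy the last sixteen bytes of a page (addresses ending F0 to FF), so the low byte of each pointer is 0xF0 combined with the variable number. The function name and the `variable_page` parameter are my own assumptions: the page depends on how much RAM is fitted, so it is passed in, and the value 0x0E in the example is purely illustrative.

```python
def variable_pointers(first_byte: int, second_byte: int, variable_page: int) -> tuple[int, int]:
    """Return (VX pointer, VY pointer) for a Chip-8 instruction.

    variable_page is the high order byte of the page holding the variables
    (an assumption of this sketch, supplied by the caller).
    """
    vx_low = 0xF0 | (first_byte & 0x0F)          # low nibble of first byte selects VX
    vy_low = 0xF0 | ((second_byte >> 4) & 0x0F)  # high nibble of second byte selects VY
    base = (variable_page & 0xFF) << 8
    return base | vx_low, base | vy_low


# Example: instruction 0x8AB4 uses VA and VB.
vx_ptr, vy_ptr = variable_pointers(0x8A, 0xB4, 0x0E)
print(f"{vx_ptr:04X} {vy_ptr:04X}")  # 0EFA 0EFB
```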

Note that these pointers are established even if the instruction will not make use of them. This makes some instructions a little less efficient than they could be in terms of execution time, but the trade off is that the interpreter is more straightforward and compact. In a 2K system, efficient memory use was more important than getting the best possible execution speed.

The interpreter now uses the instruction group code to index a couple of lookup tables that are used to find the address of the handling routine for each instruction group. The routine at this address is then called. When execution is returned to the Chip-8 call routine, it loops back round to fetch the next instruction. Here’s a flowchart of the sequence:

A flowchart describing the Chip-8 fetch and decode routine.

Now here’s the code from the interpreter:

| Labels | Address (hex) | Code (hex) | Assembly | Comments |
|---|---|---|---|---|
| FETCH_DECODE_LOOP: | 001B | 96 | GHI 6 | Get the high order byte of the VX pointer … |
| | 001C | B7 | PHI 7 | … and copy this to the high order byte of the VY pointer. |
| | 001D | E2 | SEX 2 | Use the stack pointer (R2) for indirect register addressing operations. |
| | 001E | 94 | GHI 4 | Get the high order byte of the CALL routine pointer (R4) … |
| | 001F | BC | PHI C | … and copy it to RC (RC will be used later as a pointer into a pair of lookup tables that hold the addresses of the routines that handle each instruction group). |
| | 0020 | 45 | LDA 5 | Get the first byte of the next Chip-8 instruction and advance the instruction pointer (R5). |
| | 0021 | AF | PLO F | Copy first byte of Chip-8 instruction to RF.0. |
| | 0022 | F6 | SHR | The next four instructions move the most significant digit of the Chip-8 instruction (first byte) - the instruction group code - to the position of the least significant digit. The least significant digit is discarded. |
| | 0023 | F6 | SHR | |
| | 0024 | F6 | SHR | |
| | 0025 | F6 | SHR | |
| | 0026 | 32 44 | BZ FIRST_DIGIT_0 | If the instruction group code indicates a machine language call (code 0), jump directly to the routine that handles this. |
| | 0028 | F9 50 | ORI 0x50 | OR the instruction group code with 0x50 to turn it into the low order part of an address that points to an entry in a lookup table (this table is stored from 0x0051 to 0x005F). |
| | 002A | AC | PLO C | RC now points to the correct entry in a lookup table for the instruction group of the current instruction - this table holds the high order byte of the address of the routine that handles that instruction group. |
| | 002B | 8F | GLO F | Retrieve the unaltered copy of the first byte of the Chip-8 instruction from RF.0. |
| | 002C | FA 0F | ANI 0x0F | Mask off the most significant digit of the first byte, leaving the VX number in the least significant digit … |
| | 002E | F9 F0 | ORI 0xF0 | … and set the most significant digit to F to form the low order byte of a pointer to the relevant variable (these variables are stored in the final page of on-card RAM from 0x0XF0 to 0x0XFF). |
| | 0030 | A6 | PLO 6 | The VX pointer (R6) now points to the correct variable for this instruction. |
| | 0031 | 05 | LDN 5 | Get the second byte of the Chip-8 instruction (do not advance the instruction pointer). |
| | 0032 | F6 | SHR | The next four instructions move the most significant digit of the Chip-8 instruction (second byte) - VY - to the position of the least significant digit. The least significant digit is discarded. |
| | 0033 | F6 | SHR | |
| | 0034 | F6 | SHR | |
| | 0035 | F6 | SHR | |
| | 0036 | F9 F0 | ORI 0xF0 | OR the VY part of the Chip-8 instruction with 0xF0 to form the low order byte of a pointer to the relevant variable (these variables are stored in the final page of on-card RAM from 0x0XF0 to 0x0XFF). |
| | 0038 | A7 | PLO 7 | The VY pointer (R7) now points to the correct variable for this instruction. |
| | 0039 | 4C | LDA C | Get the high order byte of the routine address from the lookup table. |
| | 003A | B3 | PHI 3 | Store this in the high order byte of the interpreter programme counter (R3). |
| | 003B | 8C | GLO C | Get the low order byte of RC - this will have been advanced by 1 by the LDA instruction … |
| | 003C | FC 0F | ADI 0x0F | … so, as the corresponding entries in each table are placed 16 bytes apart, it's just necessary to add 0x0F to the address … |
| | 003E | AC | PLO C | … so that RC now points to the correct place in the second lookup table. |
| | 003F | 0C | LDN C | Get the low order byte of the address from the lookup table. |
| CALL_SUBROUTINE: | 0040 | A3 | PLO 3 | And use this to set the low order byte of the interpreter programme counter (R3). |
| | 0041 | D3 | SEP 3 | Now call the interpreter subroutine to handle this instruction group. |
| | 0042 | 30 1B | BR FETCH_DECODE_LOOP | On return from the subroutine, loop back and get the next Chip-8 instruction. |
| FIRST_DIGIT_0: | 0044 | 8F | GLO F | This routine is entered when the first digit of the instruction is 0x0. This indicates a call to the machine code routine whose address is given by the remaining three digits of the instruction. The routine starts by retrieving the original first byte of the Chip-8 instruction from RF.0. |
| | 0045 | FA 0F | ANI 0x0F | Use a mask to remove the first digit of the instruction (leaving the high order byte of the address to be called). |
| | 0047 | B3 | PHI 3 | Use this to set the high order byte of the interpreter programme counter (R3), as this is also used as the programme counter for machine code routines called with this instruction. |
| | 0048 | 45 | LDA 5 | Get the low order byte of the address to be called directly from memory using the Chip-8 programme counter (R5) and then advance this. |
| | 0049 | 30 40 | BR CALL_SUBROUTINE | Now branch back into the main fetch and decode loop, which sets the low order byte of R3 and calls the routine. |
| | 004B-004E | | | There is a short routine here to turn on the COSMAC VIP's display. I'll analyse this in a future post. |
| | 004F | 00 00 | DB 0x00, 0x00 | This is filler before the subroutine address lookup tables so that the last digit of the address for each entry corresponds to the digit that indicates the instruction group (i.e. the entry for instruction group 1 is found at 0x0051, the entry for instruction group 2 at 0x0052, etc.). |
| | 0051 | 01 01 01 01 01 01 01 01 01 01 01 01 00 01 01 | DB 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x01, 0x01 | A lookup table holding the high order bytes of the addresses of the subroutines for Chip-8 instruction groups 1 through F. |
| | 0060 | 00 | DB 0x00 | This is filler between the tables so that the second table is also aligned to instruction group numbers (i.e. 1 is at 0x0061, 2 at 0x0062, etc.). |
| | 0061 | 7C 75 83 8B 95 B4 B7 BC 91 EB A4 D9 70 99 05 | DB 0x7C, 0x75, 0x83, 0x8B, 0x95, 0xB4, 0xB7, 0xBC, 0x91, 0xEB, 0xA4, 0xD9, 0x70, 0x99, 0x05 | A lookup table holding the low order bytes of the addresses of the subroutines for Chip-8 instruction groups 1 through F. |

So the completed addresses for each digit are:
0x1: 0x017C
0x2: 0x0175
0x3: 0x0183
0x4: 0x018B
0x5: 0x0195
0x6: 0x01B4
0x7: 0x01B7
0x8: 0x01BC
0x9: 0x0191
0xA: 0x01EB
0xB: 0x01A4
0xC: 0x01D9
0xD: 0x0070
0xE: 0x0199
0xF: 0x0105
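To make the two-table lookup easier to follow, here is a small Python model of the address arithmetic the loop performs on RC. It is only a sketch: the `memory` dictionary stands in for the VIP's RAM and the function name `handler_address` is my own. The table bytes are taken to agree with the completed addresses listed above (so 0x8B for group 4 and 0xB7 for group 7).

```python
HIGH_TABLE_BASE = 0x0051  # entry for group 1; group N lives at 0x0050 + N
LOW_TABLE_BASE = 0x0061   # entry for group 1; group N lives at 0x0060 + N

HIGH_BYTES = [0x01] * 12 + [0x00, 0x01, 0x01]
LOW_BYTES = [0x7C, 0x75, 0x83, 0x8B, 0x95, 0xB4, 0xB7, 0xBC,
             0x91, 0xEB, 0xA4, 0xD9, 0x70, 0x99, 0x05]

# Model only the part of memory that holds the two lookup tables.
memory = {}
for offset, (high, low) in enumerate(zip(HIGH_BYTES, LOW_BYTES)):
    memory[HIGH_TABLE_BASE + offset] = high
    memory[LOW_TABLE_BASE + offset] = low


def handler_address(group: int) -> int:
    """Mirror the RC pointer arithmetic for instruction groups 1 to F."""
    if not 1 <= group <= 0xF:
        raise ValueError("group 0 is handled separately; valid groups are 1 to F")
    rc = 0x0050 | group   # ORI 0x50 then PLO C: point at the high byte entry
    high = memory[rc]     # LDA C reads the high byte ...
    rc += 1               # ... and advances RC by one
    rc += 0x0F            # ADI 0x0F: now 16 bytes on from the first entry
    low = memory[rc]      # LDN C reads the low byte
    return (high << 8) | low


print(f"{handler_address(0xD):04X}")  # 0070
```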

Execution times for the fetch and decode loop are 40 machine cycles (181.6 microseconds) for group 0 instructions and 68 machine cycles (308.72 microseconds) for all other Chip-8 instructions. Note that these are the execution times for the fetch and decode loop only – the execution time for the called routine needs to be added to this to get the total execution time for the instruction.
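The arithmetic behind these figures can be sketched as below. The 4.54 microsecond machine cycle is the value implied by the figures above (181.6 / 40); the function name and the `handler_cycles` parameter are hypothetical and stand for whatever the called routine costs.

```python
MACHINE_CYCLE_US = 4.54  # implied by the figures above: 181.6 / 40

FETCH_DECODE_CYCLES_GROUP_0 = 40
FETCH_DECODE_CYCLES_OTHER = 68


def instruction_time_us(group: int, handler_cycles: int) -> float:
    """Total time for one Chip-8 instruction: fetch and decode plus handler."""
    overhead = FETCH_DECODE_CYCLES_GROUP_0 if group == 0 else FETCH_DECODE_CYCLES_OTHER
    return (overhead + handler_cycles) * MACHINE_CYCLE_US


# With a handler cost of zero we recover the fetch and decode times alone.
print(f"{instruction_time_us(0x0, 0):.2f}")  # 181.60
print(f"{instruction_time_us(0x6, 0):.2f}")  # 308.72
```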

Contemporary interpreters might not use this fetch and decode algorithm. For example, a fairly common way to select routines for each instruction group is to use a switch statement. Other interpreters might use function pointers, which is closer to the algorithm used here. It also may not be necessary to set up variable pointers this early in a contemporary interpreter.
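As a rough illustration of the function pointer approach, here is a minimal Python sketch that dispatches on the first nibble through a table of handlers. Everything in it (the `state` dictionary, the handler names, and the `new_state` and `step` helpers) is hypothetical, and only groups 6 and 7 are implemented. Programmes are loaded at 0x200, following the usual Chip-8 convention.

```python
def op_load_immediate(state, first, second):
    """Group 6: load VX with the immediate operand."""
    state["v"][first & 0x0F] = second


def op_add_immediate(state, first, second):
    """Group 7: add the immediate operand to VX (8-bit wrap-around)."""
    x = first & 0x0F
    state["v"][x] = (state["v"][x] + second) & 0xFF


# Dispatch table indexed by instruction group, playing the role of the lookup tables.
HANDLERS = {0x6: op_load_immediate, 0x7: op_add_immediate}


def new_state(program: bytes, load_address: int = 0x200) -> dict:
    """Create an interpreter state with the programme loaded into memory."""
    memory = bytearray(4096)
    memory[load_address:load_address + len(program)] = program
    return {"memory": memory, "pc": load_address, "v": [0] * 16}


def step(state: dict) -> None:
    """Fetch one instruction, decode its group and call the handler."""
    pc = state["pc"]
    first, second = state["memory"][pc], state["memory"][pc + 1]
    state["pc"] = pc + 2
    handler = HANDLERS.get(first >> 4)
    if handler is None:
        raise NotImplementedError(f"group {first >> 4:X} not implemented in this sketch")
    handler(state, first, second)


state = new_state(bytes([0x6A, 0x05, 0x7A, 0x03]))  # VA = 5, then VA += 3
step(state)
step(state)
print(state["v"][0xA])  # 8
```

Here each handler extracts the variable number itself, which reflects the point above that the pointers need not be set up before dispatch.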

In future posts I’ll analyse each instruction group, starting with group 0 for machine code integration.
