CS 231: Introduction to Programming
CS 231 Supplement: Programming the PC-231 Computer


This supplement describes an imaginary computer which has been designed for instructional purposes, the PC-231. We will use this computer, and a simulator and assembler for it, to explore how computers work at a fairly low level (there are actually several lower levels, including micro-code and device-based descriptions, but we will ignore these). This supplement includes a basic description of the computer's architecture, its normal operation, assumptions made about its environment and a description of a more readable assembly language that can be used as an alternative to its native machine code. It also includes a description of the sim231 simulator program and some examples of its use (by next week we should also have an assembler available: for this week you will have to assemble your programs into hexadecimal codes by hand).

Acknowledgements:

This version of a simple instructional computer is actually the third in a series (the other two were the PC-101 from Oberlin College and the SCA-130 from the Fall 1997 Computing Concepts course). The original idea for this approach came from my wife, Carol Goldfarb. Brian Davis (a visiting faculty member during 1997-98) contributed to ideas and documentation for an improved version. The current version is the most sophisticated, sporting indirect addressing, multiple I/O devices, etc.



Architecture and operation

The PC-231 machine has 16 internal registers, which are referred to in several ways, including a "raw" binary or hexadecimal ordinal or a special mnemonic code as used in the standard assembly language. Since there are exactly 16 registers, we can use a single hex digit or 4 bits to refer to them in machine code instructions. Each register is 12 bits wide, as are the storage locations in the main RAM and the data path between RAM and the CPU. The PC-231 normally comes equipped with 256 words of RAM, which can thus be accessed with 8-bit addresses (or 2 hex digits).


Several I/O devices may be attached to the machine, each with its own protocol and uses: for now, we will assume 3 standard devices for each of input and output, supporting direct reads and writes of decimal, hexadecimal and ASCII values, respectively (we may add a fourth device for color graphics later on).

Machine instructions are structured as a 4-bit opcode plus up to 8 bits of operand or argument specifications. The operand portion of an instruction usually conforms to one of the following patterns:

As mentioned above, the PC-231 machine is normally equipped with a 256-word random-access memory, with each word holding 12 bits of data. These memory locations are accessed through the LOAD and STORE commands. The LOAD command allows the user to retrieve data from a location in memory and place it in a register. The STORE command is dual to the LOAD, allowing the user to take data in a register and place it at a specific location in memory.

As with most modern computers, the PC-231 is a stored-program computer, so its instructions are also considered to be data: thus they take up space in the memory and must be retrieved from memory (or "fetched") in order to be performed.

The LOAD and STORE commands both use an indirect addressing mode in which the address being referred to is not directly kept as a part of the instruction. Rather, the instruction refers to another register, which in turn holds the location which will be used. This approach is a bit inconvenient in that addresses normally have to be copied into the indirection registers in a separate step. It does, however, allow us to address significantly more RAM if necessary (why?) and it makes some kinds of operations much more straightforward, since the addresses in registers can be incremented, decremented and otherwise modified during program execution.

For all examples and problems we will assume that the PC-231 starts out with all registers and RAM set to a full 12-bit zero value. When the machine is used, its RAM is normally loaded with instructions and data, and then execution proper is begun. Thus any memory locations which are not specifically set during the program load will remain zeroed out until execution begins. This also means that the first instruction to be fetched by the machine will be the first data value in memory, at location 0.

Once the machine is started up, it operates as follows:

  1. An instruction is loaded, from the memory indexed by the PC register;
  2. The program counter register is incremented (this is important!);
  3. The instruction is decoded and executed according to the rules below; such execution may in some cases (called jumps) change the value of the program counter;
  4. Go back to step 1.
This cycle is repeated until a HALT instruction is executed, which causes the machine to stop completely.
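
The cycle above can be sketched in a few lines of Python. This is only an illustration, not the sim231 program: it models just HALT (opcode 0) and JUMP (opcode F), treats every other opcode as a no-op, and assumes the program counter wraps around at the end of the 256-word RAM. The function name `run` and its `max_steps` parameter are invented for this sketch.

```python
RAM_SIZE = 256  # the PC-231 normally has 256 words of RAM


def run(program, max_steps=1000):
    """Run a list of 12-bit words; return the addresses that were fetched."""
    # Locations not set by the program load stay zero (i.e., HALT).
    ram = list(program) + [0] * (RAM_SIZE - len(program))
    pc = 0                          # execution starts at location 0
    trace = []
    for _ in range(max_steps):
        instruction = ram[pc]       # step 1: fetch the word indexed by PC
        trace.append(pc)
        pc = (pc + 1) % RAM_SIZE    # step 2: increment PC (wrap-around is an assumption)
        opcode = instruction >> 8   # step 3: decode the top 4 bits ...
        if opcode == 0x0:           # ... HALT stops the machine
            break
        if opcode == 0xF:           # ... JUMP over-writes PC with the low 8 bits
            pc = instruction & 0xFF
        # all other opcodes are ignored in this sketch
    return trace                    # step 4 is the loop itself
```

For instance, `run([0xF02, 0x000, 0x000])` fetches locations 0 and 2 only, because the JUMP replaces the already-incremented PC. A program with no HALT of its own, such as `run([0x100, 0x100])`, still stops at location 2 when it runs into zeroed memory.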

The PC-231 Instruction Set

The PC-231 machine has 16 instructions which can be categorized by purpose into those involving data movement, control transfer, logic, arithmetic and input/output. Each instruction fits in a 12-bit wide field which includes an opcode (i.e., bits which distinguish this instruction from all others) and specifications of register numbers, memory locations or constant values. The table below describes the formats of each of the instructions:

Code  HEX  Mnemonic  Argument format  Description
0000  0    HALT      ---- ----        Halts the machine
0001  1    ZERO      RRRR ----        Zeroes (or "clears") out register RRRR
0010  2    SET       RRRR BBBB        Sets the 4 lowest-order bits of register RRRR to BBBB
0011  3    DATA      BBBB BBBB        Clears the data register DR, then sets its 8 lowest-order bits
0100  4    INC       RRRR sNNN        Adds or subtracts from 1 to 8 from register RRRR
0101  5    SHIFT     RRRR sNNN        Shifts register RRRR left (-) or right (+) by from 1 to 8 bits
0110  6    ADD       RRR1 RRR2        Adds the contents of register RRR1 to RRR2 (result in RRR2)
0111  7    SUB       RRR1 RRR2        Subtracts the contents of register RRR1 from RRR2 (result in RRR2)
1000  8    AND       RRR1 RRR2        Logically ANDs the contents of register RRR1 into RRR2 (result in RRR2)
1001  9    COPY      RRR1 RRR2        Copies the contents of register RRR1 to RRR2
1010  A    LOAD      RRR1 RRR2        Loads the contents of location addressed by RRR2 into register RRR1
1011  B    STORE     RRR1 RRR2        Stores the contents of register RRR1 into location addressed by RRR2
1100  C    READ      RRRR DDDD        Reads a value (up to 12 bits) into register RRRR from device DDDD
1101  D    WRITE     RRRR DDDD        Writes a value (up to 12 bits) from register RRRR to device DDDD
1110  E    JPIF      RRRR CCJJ        If contents of RRRR meet condition CC, jump to address in JJ
1111  F    JUMP      AAAA AAAA        Jump directly to the location whose address is AAAA AAAA
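
Since every instruction is a 4-bit opcode followed by two 4-bit fields, a 12-bit word can be taken apart with shifts and masks. The following Python sketch shows one way to do this; the names `MNEMONICS` and `decode` are made up for illustration and are not part of sim231 or the assembler.

```python
# Mnemonics listed in opcode order, 0 through F, as in the table above.
MNEMONICS = ["HALT", "ZERO", "SET", "DATA", "INC", "SHIFT", "ADD", "SUB",
             "AND", "COPY", "LOAD", "STORE", "READ", "WRITE", "JPIF", "JUMP"]


def decode(word):
    """Split a 12-bit instruction into (mnemonic, first nibble, second nibble)."""
    opcode = (word >> 8) & 0xF   # highest 4 bits select the instruction
    first = (word >> 4) & 0xF    # middle 4 bits: usually a register number
    second = word & 0xF          # lowest 4 bits: register, constant or device
    return MNEMONICS[opcode], first, second
```

With this, `decode(0x557)` gives `("SHIFT", 5, 7)`. For DATA and JUMP the two nibbles together form a single 8-bit value, which can be recovered as `first * 16 + second`.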

Instruction Details for the PC-231 Machine

Following are detailed descriptions of each of the instructions used by the PC-231 machine. Each description is titled with the assembly mnemonic of the associated instruction, followed by its instruction format, an English description of its action and an example of its use.

HALT ---- ----

This instruction is used to stop the machine; it takes no arguments or parameters. Any bit values in the remaining part of the instruction (i.e., beyond the 4-bit opcode) are ignored. For assembly purposes, the remaining bits will always be filled with 0 values. Note that the use of the "all zero" opcode (in combination with the assumption that the RAM will start up filled with zeros before program load) will tend to prevent "runaway programs", since they will tend to run into zero-opcode commands in the higher parts of memory.

Example:

The instruction

HALT
stops the computer.


ZERO RRRR ----

This instruction is used to "clear out" a register, i.e., to set all of its bits to zeros (including the left-most bits). The single argument is the register to be cleared. By convention, the assembler will generate zeros in the 4 low-order bits of the instruction, but they are ignored by the machine when the instruction is decoded and executed.

Example:

The instruction

ZERO R3
clears the contents of the R3 register.


SET RRRR BBBB

This instruction is used to set the 4 lowest-order bits of a specified register (RRRR) to hold the bits given directly in the instruction (BBBB). The eight higher-order bits are left unchanged in the register. This instruction is typically used in conjunction with the SHIFT instruction in order to completely fill a register with a sequence of 8 or 12 bits. Alternatively, eight bits can be set at once in the data register using the DATA instruction, then copied into the relevant register.

Example:

The instruction

SET R0,xF
changes the 4 lowest-order bits of register R0 to ones. If the register originally contained the hex value FF0, it will now contain all ones, i.e., hex value FFF.
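
The combination of SET and SHIFT mentioned above can be sketched in Python as follows. This is a model of the register arithmetic only, with invented function names; it fills a register one hex digit at a time, using a left shift by 4 bits (SHIFT R,-4) between successive SETs.

```python
WORD_MASK = 0xFFF  # registers are 12 bits wide


def set_low_nibble(reg, bits):
    """SET: replace the 4 lowest-order bits; the upper 8 bits are unchanged."""
    return (reg & 0xFF0) | (bits & 0xF)


def shift_left_4(reg):
    """SHIFT by -4: move the contents one hex digit left, filling with zeros."""
    return (reg << 4) & WORD_MASK


def build_constant(hex_digits):
    """Fill a register from a list of hex digits, most significant first."""
    reg = 0                               # ZERO R
    for position, digit in enumerate(hex_digits):
        if position > 0:
            reg = shift_left_4(reg)       # SHIFT R,-4
        reg = set_low_nibble(reg, digit)  # SET R,digit
    return reg
```

Here `build_constant([0xA, 0xB, 0xC])` passes through 00A, 0A0, 0AB and AB0 before ending at ABC, so a full 12-bit constant costs one ZERO, three SETs and two SHIFTs.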


DATA BBBB BBBB

This instruction is used to put a specified value (8 bits maximum) directly into the machine's data register (DR). Executing the instruction will always over-write the eight lowest-order bits in the data register, even if fewer bits are specified in the assembly language version of the instruction. In other words, the assembled instruction will always have some values in all eight of these bits (by convention, the assembler will pad the value with zeros on the left if necessary).

Example:

The instruction

DATA xFF
puts the hex value FF into the eight lowest-order bits of the data register. If the instruction had been
DATA xF
instead, then the hex value 0F (4 bits of zeros, 4 bits of ones) would have been used to over-write the eight lowest-order bits of DR.


INC RRRR sNNN

The INC instruction is used to modify the contents of a specified register by adding or subtracting a small constant value ("INC" stands for "increment"; the word "decrement" is normally used when a subtraction is being made). The first argument (RRRR) specifies the register to be affected. The second argument (sNNN) specifies a number in the range between 1 and 8 (if the sign bit s is 0) or between -1 and -8 (if the sign bit s is 1). The actual magnitude of the value NNN, if read in binary, is one less than the absolute value of the incrementation amount. This is justified by appeal to the fact that we would not normally need to change a register's value by 0 (or negative 0), and using the bits this way gives a fuller and more symmetric set of arguments.

Example:

The instruction

INC R7,-3
causes the value of the R7 register to be decremented by 3. Note that the assembler (or the programmer, if assembled by hand) is responsible for properly translating the "-3" argument into the corresponding binary value: in this case it would be "1010" (xA), where the leading "1" stands for a negative amount, and the trailing "010" stands for 2, which is one less than the absolute value of the decrement amount.
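
When assembling by hand, the translation of a signed amount into the sNNN nibble is easy to get wrong, so here is a small Python sketch of it. The function names are hypothetical; the encoding itself follows the description above (sign bit, then the magnitude minus one).

```python
def encode_amount(n):
    """Translate an amount in 1..8 or -8..-1 into the 4-bit sNNN field."""
    if not 1 <= abs(n) <= 8:
        raise ValueError("amount must be between 1 and 8 in absolute value")
    sign = 1 if n < 0 else 0          # s = 1 means a negative amount
    return (sign << 3) | (abs(n) - 1)  # NNN holds one less than the magnitude


def decode_amount(nibble):
    """Recover the signed amount from an sNNN field."""
    magnitude = (nibble & 0b0111) + 1
    return -magnitude if nibble & 0b1000 else magnitude
```

For the example above, `encode_amount(-3)` yields binary 1010 (xA), and `decode_amount(0xA)` gives back -3. An amount of 0 is rejected, since the format has no way to express it.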


SHIFT RRRR sNNN

This instruction is used to move data around within a register. More specifically, the register operand (RRRR) specifies which register is to be affected and the signed value (sNNN) specifies the amount (in bits) that the data is to be shifted, to the left if negative or the right if positive. The register is filled with zeros on the other side as bits are moved out (and lost) on the side of the specified direction (given by s). The amount specified is given in an "off-by-one" fashion using the bits NNN, just as for the INC instruction (i.e., we can specify shifts by as much as 8 bits in either direction, but not by zero bits).

Example:

The instruction

SHIFT R5,8
shifts the contents of the R5 register right by 8 bits. The corresponding instruction generated by the assembler would be 0101 0101 0111, where the last nibble represents a positive value (the leading 0) and a distance of eight (since eight is one more than seven, written 111 in binary).
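
The effect of SHIFT on a 12-bit register can be modeled in Python as below. This is a sketch with an invented function name; it takes the signed amount as written in assembly language (positive for right, negative for left) rather than the encoded sNNN bits.

```python
WORD_MASK = 0xFFF  # registers are 12 bits wide


def shift(value, amount):
    """Shift a 12-bit value right (amount > 0) or left (amount < 0), zero-filling."""
    if not 1 <= abs(amount) <= 8:
        raise ValueError("shift amount must be between 1 and 8 bits")
    value &= WORD_MASK
    if amount > 0:
        return value >> amount              # bits fall off the right end
    return (value << -amount) & WORD_MASK   # bits fall off the left end
```

For example, `shift(0xABC, 8)` leaves 00A, while `shift(0xABC, -8)` leaves C00. A left shift by 3, as in `shift(0x005, -3)`, gives 028 (decimal 40), which is how shifting can stand in for multiplication by 8.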


ADD RRR1 RRR2

This instruction is used to add the numeric values in any two registers RRR1 and RRR2 (Note: this doesn't mean that it only works for registers R1 and R2!). The result of the addition is stored in the second register argument, RRR2 (if you wanted it stored in the first register, you should have specified them in the other order!). Note that if two large values are added together, the results in the second register may not be accurate due to overflow of the twelve-bit width of the registers; the machine gives no overt indication of this situation.

Numbers stored in registers are interpreted as signed integers with one sign bit (the leftmost, one for negative, zero for positive) and 11 "data bits".

Example:

The instruction

ADD R1,R3
adds the contents of the registers R1 and R3, as signed 11-bit integers, and puts the results in register R3 (register R1's contents remain unchanged).


SUB RRR1 RRR2

This instruction is used to subtract the numeric value in register RRR1 from that in register RRR2. The result of the subtraction is stored in the second register argument, RRR2. Note that if a large value is subtracted from a small (negative) one, the results in the second register may not be accurate due to underflow; the machine gives no overt indication of this situation.

Numbers stored in registers are interpreted as signed integers with one sign bit (the leftmost, one for negative, zero for positive) and 11 "data bits".

Example:

The instruction

SUB R1,R7
subtracts the contents of the register R1 from that of R7, as signed 11-bit integers, and puts the results in register R7 (register R1's contents remain unchanged).


AND RRR1 RRR2

This instruction logically ANDs together the contents of registers RRR1 and RRR2 and puts the result into register RRR2. The logical AND is done bit-wise, so that, for example, bit 5 of RRR1 and bit 5 of RRR2 are ANDed together to get bit 5 of the result (thus all bit positions are independent of each other).
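
A common use of a bit-wise AND is "masking", i.e., keeping some bits of a value and clearing the rest. The short Python sketch below (the function name is invented) shows the operation on two 12-bit values.

```python
def and_registers(rrr1, rrr2):
    """AND two 12-bit values bit by bit; the result would go into RRR2."""
    return rrr1 & rrr2 & 0xFFF


# Keep only the 8 lowest-order bits of xABC by ANDing with the mask x0FF.
masked = and_registers(0x0FF, 0xABC)
print(format(masked, "03X"))
```

Each result bit is 1 only where both inputs have a 1, so the mask x0FF passes the low 8 bits through and clears the top 4.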


COPY RRR1 RRR2

Copies the contents of register RRR1 to RRR2. Note that the value originally stored in RRR2 is lost, i.e., it is written over by the new value from RRR1.


LOAD RRR1 RRR2

Loads the contents of location addressed by RRR2 into register RRR1. That is to say, register RRR2 is considered to hold an address in its rightmost 8 bits. That address is used to index into memory and obtain some value (a 12 bit quantity held at that memory address). That value is put into RRR1, over-writing any value previously in RRR1. The contents of the indexed memory location and that of RRR2 remain unchanged.

See this picture for a graphic interpretation of a sample LOAD command.


STORE RRR1 RRR2

Stores the contents of register RRR1 into location addressed by RRR2. That is to say, register RRR2 is considered to hold an address in its rightmost 8 bits. That address is used to index into memory; the value at that memory address is over-written with the value (a 12-bit quantity) from register RRR1 (the contents of RRR1 and RRR2 remain unchanged).
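
The indirect addressing used by LOAD and STORE can be sketched in Python by modeling the registers and the RAM as two lists. The function names and the list representation are assumptions made for illustration; register arguments are given as register numbers (0 through 15).

```python
def load(regs, ram, rrr1, rrr2):
    """LOAD: copy the word addressed by register rrr2 into register rrr1."""
    address = regs[rrr2] & 0xFF   # only the rightmost 8 bits form the address
    regs[rrr1] = ram[address]     # memory and rrr2 are left unchanged


def store(regs, ram, rrr1, rrr2):
    """STORE: copy register rrr1 into the word addressed by register rrr2."""
    address = regs[rrr2] & 0xFF
    ram[address] = regs[rrr1] & 0xFFF  # both registers are left unchanged


regs = [0] * 16        # 16 registers, all zero at start-up
ram = [0] * 256        # 256 words of RAM, all zero at start-up
regs[0] = 0x123        # a value to save
regs[1] = 0x010        # the address where it should go
store(regs, ram, 0, 1)
load(regs, ram, 2, 1)  # read it back into a different register
```

After these two calls, location x10 and register 2 both hold x123. Because the address lives in a register, a loop can step through memory simply by incrementing that register between LOADs.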


READ RRRR DDDD

Reads a value (up to 12 bits) from device DDDD into register RRRR. There are 3 "devices", the decimal device (device 0 or code "DD"), the hexadecimal device (device 1 or code "HD") and the ASCII device (device 2 or code "AD"). In the case of the ASCII device, only as many as 8 bits will actually be read. In the case of the decimal device, negative number notations will be properly converted by the device.


WRITE RRRR DDDD

Writes a value (up to 12 bits) from register RRRR into device DDDD. The output "devices" mirror the input devices (see the description for the READ instruction above).


JPIF RRRR CCJJ

If the contents of register RRRR meet the condition CC, then the machine "jumps" to the address in register JJ. Here RRRR is any register, but JJ must be one of the 4 "jump registers" J0, J1, J2 or J3 (these are numbered A-D and can be referred to as RA through RD in the assembly language, if desired). The condition CC must be one of the following:
  • LZ (binary 00) meaning "less than zero";
  • GZ (binary 01) meaning "greater than zero";
  • EZ (binary 10) meaning "equal to zero";
  • NZ (binary 11) meaning "not equal to zero".
If the machine does not "jump", it just continues on to the next instruction it would have executed otherwise. Thus the "jump" is really just an over-writing of the program counter register (PC) with a new value, since the PC is updated by adding one before each instruction is executed.
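
The JPIF instruction packs two 2-bit fields into its last nibble, which makes it a little harder to assemble by hand than the others. The Python sketch below builds the 12-bit word and also tests a condition against a register value. The function names are invented; the condition test relies only on the leftmost bit being the sign bit, as described for ADD and SUB, and the handling of any "negative zero" pattern is not specified here.

```python
CONDITIONS = {"LZ": 0b00, "GZ": 0b01, "EZ": 0b10, "NZ": 0b11}


def encode_jpif(reg, cond, jump_reg):
    """Build a JPIF word; jump_reg is 0..3 for the jump registers J0..J3."""
    if not 0 <= jump_reg <= 3:
        raise ValueError("jump register must be J0, J1, J2 or J3")
    return (0xE << 8) | (reg << 4) | (CONDITIONS[cond] << 2) | jump_reg


def condition_holds(value, cond):
    """Check a 12-bit register value against one of the four conditions."""
    value &= 0xFFF
    negative = bool(value & 0x800)   # leftmost bit set means negative
    zero = value == 0
    return {"LZ": negative,
            "GZ": not negative and not zero,
            "EZ": zero,
            "NZ": not zero}[cond]
```

For example, `encode_jpif(1, "NZ", 0)` produces xE1C, the word for JPIF R1,NZ,J0 (assuming R1 is register number 1).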


JUMP AAAA AAAA

Jump directly to the location whose address is AAAA AAAA. This is an unconditional transfer of control and is equivalent to copying the eight bits AAAA AAAA into the PC register (the upper 4 bits will always be zeroed out).

Standard I/O Devices for the PC-231

The PC-231 ships with 3 standard kinds of I/O devices:
  • decimal input and output: device number 0 or code "DD" (decimal device)
  • hexadecimal input and output: device number 1 or code "HD" (hex device)
  • ASCII input and output: device number 2 or code "AD" (ASCII device)
These device numbers are used in the second argument of the READ and WRITE instructions.

Later on, we may use other devices, such as black-and-white or color output. But until then, these will be the only three devices connected to the PC-231. READs or WRITEs to any other devices will simply be ignored by the machine.


Assembler Mnemonics

A simple assembler program will be available soon for the PC-231. An assembler is a program which translates human-oriented, text-based instructions (called mnemonics) into the ones-and-zeros of machine code. Features of the assembly language include the use of named mnemonics for instructions, the use of special and general register names, the use of binary, decimal and hexadecimal constants for locations and numeric values, and the use of labels to mark and refer to various memory locations.

More detailed information on the assembly language will be provided later in lab and on-line. For now, take a look at the example programs in the section which follows.


Some Example Programs

  1. read in a number, multiply it by eight, then print it out again and halt. This program demonstrates the use of bit-shifting to implement arithmetic operations.

    ; ------------------------------------------------------
    ; A sample PC-231 program which multiplies an input by 8
    ;
    ; by Fritz Ruehr # CS 231 # Fall 1998
    
    	READ R0,DD		; read in input value using DD (decimal device)
    	SHIFT R0,-3		; shift by -3 is a LEFT shift, thus multiplies
    	WRITE R0,DD		; write out answer in decimal, too
    	HALT
    
    ; --- End of multiply program ---
    	

  2. compute the values of the letters of the alphabet and print them out. This program demonstrates the use of labels and loops, as well as input and output of ASCII values.

    ; ---------------------------------------------------------
    ; A sample PC-231 program to compute and print the alphabet
    ;
    ; by Fritz Ruehr # CS 231 # Fall 1998
    
            DATA #print     ; address of WRITE
            COPY DR,J0      ; set up for jump
            DATA 'Z'        ; finish of alphabet
            COPY DR,R0      ; put 'Z' in R0 for testing
            DATA 'A'        ; start of alphabet
    #print  WRITE DR,AD     ; print current letter in ASCII
            COPY R0,R1      ; get fresh 'Z' for test
            SUB  DR,R1      ; compare current and finish
            INC DR,1        ; compute next letter
            JPIF R1,NZ,J0   ; continue if compared different
            HALT
    
    ; --- End of alphabet program ---
    	

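  3. assuming a string (sequence of ASCII characters) is stored in RAM, print them out in order. The program stops when it loads a zero word, so it relies on the (initially zeroed) memory location just after the string to mark the end (other approaches, such as hard-coding the length of the string, are also possible).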

    ; -------------------------------------------------------
    ; A sample PC-231 program to print a string stored in RAM
    ;
    ; by Fritz Ruehr # CS 231 # Fall 1998
    
            DATA  #done     ; address of end of program
            COPY  DR,J0     ; set up J0 for the jump
            DATA  #data     ; address of the data section
            COPY  DR,R0     ; set up R0 for use in LOADs
    #next
            LOAD  R1,R0     ; get the next character from RAM
            JPIF  R1,EZ,J0  ; skip to #done on 0 data
            WRITE R1,AD     ; echo ASCII character to display
            INC  R0,1       ; point to the next RAM character
            JUMP  #next     ; repeat the LOAD/WRITE cycle
    #done
            HALT            ; end of program
    
    ; ---------------------------------------
    ; Data section (the string to be printed)
    
    #data
            CONST 'H'
            CONST 'e'
            CONST 'l'
            CONST 'l'
            CONST 'o'
            CONST ' '
            CONST 't'
            CONST 'h'
            CONST 'e'
            CONST 'r'
            CONST 'e'
            CONST '!'
    
    ; --- End of printer program ---
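
As a check on hand assembly, the first example program above (multiply by 8) can be translated into hex codes with a few lines of Python. This is a sketch, not the asm231 assembler: the helper names are invented, register R0 is assumed to be register number 0, and the exact input format expected by sim231 may differ from the 3-digit hex strings printed here.

```python
DEVICES = {"DD": 0, "HD": 1, "AD": 2}  # device codes from the I/O section


def word(opcode, first=0, second=0):
    """Pack an opcode and two 4-bit fields into one 12-bit instruction."""
    return (opcode << 8) | (first << 4) | second


def signed_amount(n):
    """Encode an amount in 1..8 or -8..-1 as the sNNN field."""
    return (8 if n < 0 else 0) | (abs(n) - 1)


program = [
    word(0xC, 0, DEVICES["DD"]),      # READ  R0,DD
    word(0x5, 0, signed_amount(-3)),  # SHIFT R0,-3
    word(0xD, 0, DEVICES["DD"]),      # WRITE R0,DD
    word(0x0),                        # HALT
]

hex_codes = [format(w, "03X") for w in program]
print(hex_codes)  # prints ['C00', '50A', 'D00', '000']
```

The only instruction needing care is the SHIFT: -3 becomes a sign bit of 1 followed by 010 (one less than 3), giving the nibble A.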
    	


The PC-231 Simulator and Assembler Programs

A simulator for the PC-231 is now available on Gemini and Hudson shell.willamette.edu. You can call it by typing "sim231" (without the quotes) into a command line. It runs in an interactive mode (I may make a batch version available at a later date). The simulator can be put in a "chatty" mode (i.e., a mode which reports on each instruction as it is executed) by using an optional argument: "sim231 chatty".

There is also an assembler available on the aforementioned machines: it can be called two different ways. The first way is using cut and paste: just type "asm231" (no quotes) into the unix command line, then paste in your program. You can then cut out the hex codes to paste into the sim231 program.

The second way is to use files for input and/or output. In this case, you need to specify the filename(s) using unix re-directions in the command lines, e.g.:

shell% asm231 < input
or, to re-direct output to a file:
shell% asm231 < input > output

In order to get to the unix command line, run the "telnet" program on one of the lab machines, then choose either Gemini or Hudson from the list. When you get connected, type your login and password, then type "un" (no quotes) to get to the unix command line. When you are all done, use "logout" or type Control-D to finish.