Friday, 18 February 2022

A refresh on elementary programming

In the course of reading Damasio's book at reference 1, of which more in due course, I have been prompted to wonder about the distinction between data and process, a distinction which seems to break down in neural networks, biological or otherwise. Then, reading about some of the workings of DNA, I have been prompted to wonder about the extent to which biological or chemical reactions can be exploited to do the sort of stuff more usually associated with computer programs.

Which has taken me back to the foundations, that is to say, what sort of repertoire of instructions does one need to build a decent computer?

To which end, I start with the 24 bit instruction used by ICL 1900 computers when I started out in the early 1970’s. A time when some large computers were water cooled in the way of the engine of a car – and maybe super computers still are – maybe the one snapped at reference 4.

Refresh on ICL 1900 instruction layout

In the snap above of an instruction for an ICL 1900 computer, taken from reference 2 below, the F field says what instruction is to be executed and the X field gives the number of the accumulator to be used. The two fields M and N between them usually define the address of the memory data item required by the instruction. So the operation might be to load the specified accumulator with the contents of that address. If the M field is zero, the 12-bit N field is the address of the operand – with 12 bits or 4,096 giving a very low limit on size, compared with a modern computer. If the M field is non-zero, the address consists of the contents of the modifier added to the N field. A tweak which I imagine is there for the convenience of the compiler, that is to say the thing that converts what you type into something that the computer can understand, can execute.

Part of the point here being that there is only a small number of accumulators (on the processing chip) but lots of addresses (of data items, of places in memory, usually a short distance from the chip). Most of the heavy lifting is done with accumulators, which means that a lot of code and processing time goes into moving stuff in to and out of them.

And while one might think that an instruction is process rather than data, that simple picture is confused by more than half the instruction being data about where data is to be found.

A program is a finite sequence of such instructions, usually stored in a file, but loaded into memory for execution.

Computers like this one can only do one thing at a time, and while several programs can be interlaced in time, appearing to be running at the same time, each one can be considered to running by itself, in complete privacy, executing one instruction after another, by default in the order in which those instructions have been written to file.

A completely centralised operation with the programs existing inside a management and control environment, here called the supervisor. Which is a long remove from the completely decentralised operation of a tank (or body) full of chemicals. In this case, programs, if they can be called such, work of their own accord. They do not need a supervisor to breathe them into organised life.

Refresh on Jackson SDM diagrams

The idea here, again from the early 1970’s, was that (the design of) any program could be built out of the elementary building blocks of sequence, repetition and choice. Where the ‘goto’ instruction was rather frowned upon. While the ‘call’ (subroutine) instruction was desirable but not strictly essential. Where each block was a chunk of processing which was either sufficiently small and simple for the code to be self explanatory or which could be further broken down in a subordinate diagram. 

An important part of this was the breaking of a large problem down into a number of small problems. Each small problem was implemented as a block of code, quite possibly as a subroutine, which could be written, worked up and tested independently of all the other blocks. Then, hopefully, not all that much would go wrong when the blocks were all stitched together.

With some further introduction being provided at reference 3.

Programs and instruction sets

We suppose our computer has a small number of accumulators and a large number of data addresses. As above, the values that these accumulators and data addresses can take are 24 bit integers; positive, zero or negative. Data is usually about some real world entity, perhaps a person or a driving license. But a data item can also be the address of some other bit of data or the address of some particular line of the program using that data.

Program addresses are derived by numbering the instructions from zero. 

The program lives inside our computer and is activated and generally looked after by the supervisor. Programs need supervision to bring them to life; without supervision they are just a chunk of text, dead to the world.

There is a special bit of state data called the cursor, a non negative integer, which keeps track of, tells the supervisor which instruction is currently being executed. By default, the instruction immediately following is the next one to be executed, that is to say that the instructions in the program are executed one after the other. One exception being the case that a branch instruction has been executed, and execution jumps to somewhere else. Another exception is the case that the program has executed a stop instruction.  And the last is the case that execution has reached the end of the program and there is no next instruction, in which case again, it stops. 

The program will also stop if there is some kind of error, for example trying to use an inappropriate address.

When the supervisor starts the program, this cursor is set to zero.

The supervisor recognises a number of different instructions, possibly as many as 100 or them, otherwise the instruction set. A great deal of work has gone into devising instruction sets, into striking the right balance between the convenience of the processor chip and the convenience of the humans writing programs. While some logicians delight in inventing instruction sets which are both minimal and sufficient – although perhaps inconvenient. And others went in for exotic high level languages, for example the Algol 68-R, fashionable in defence circles in the mid 1970’s. High level in the sense that they are many layers above the sort of thing that a processor chip might understand, and much compilation is required to take one down from the high level to the chip-ready, low level. Generally speaking, programs in high level languages are much shorter and much more legible than the equivalent programs in low level languages.

An instruction set

It can be shown that a program consisting of instructions of one of the following forms can do a great deal, particularly in the way of doing sums. And by adding ever more tricky instructions, a program can do even more, including processing text rather than numbers.

PULLA. Put the contents of data address X into accumulator Y. Here and in what follows the values of X and Y have been computed as part of the preparation of the program, before it is executed and in the instruction, in the program that has been loaded into the computer, we have actual numbers, not pointers to numbers. Note that the program itself is inviolate; it is executed but not amended during that execution. What changes during execution is the data in the accumulators – which might be called state data – and the data in memory.

PULLB. Put the data addressed by accumulator X into accumulator Y. This sort of indirection is the key to much that is important in computing: the transition zone between data and process. Supervisors are supposed to fail instructions which reference addresses which are out of bounds.

ADD1. Add the contents of data address X to the contents of accumulator Y, putting the answer into Y.

ADD2. Add the contents of data location addressed by accumulator X to the contents of accumulator Y, putting the answer into Y.

ADD3. Add constant A to the contents of accumulator Y, putting the answer into Y.

SUB1. Subtract the contents of data address X from the contents of accumulator Y, putting the answer into Y. 

SUB2. Subtract the contents of data location addressed by accumulator X from the contents of accumulator Y, putting the answer into Y.

SUB3. Subtract constant A from the contents of accumulator Y, putting the answer into Y. 

These three add and three subtract instructions are the foundation on which our computer does sums. The subject matter of this particular computer. All the rest is moving stuff about and control.

SET1. Put the constant A into accumulator Y.

PUT1. Put the contents of accumulator Y into data address X.

PUT2. Put the contents of accumulator Y into the data location addressed by accumulator X.

PUT3. Put the current program address into accumulator Y. More trickery of the sort started at PULLB above. Needed if we are to be able to return from a call to a subroutine, as, for example, in Basic. Actual use needs to take account of the need to return to a point a few instructions after the one that gets into the accumulator. With subroutines being an important part of the modularity needed if large programs are to be manageable – a modularity which is not, as noted at reference 3, very visible in a human brain.

IF. If the contents of accumulator Y is zero or negative, then transfer control to the program address given by accumulator X. A conditional branch. One or more of these instructions are used to implement the Jackson choice.

GOTO1. Transfer control to program address A. An unconditional branch. Used when the target address is known at the time of writing the program. Paired with IF, used to implement the Jackson repetition.

GOTO2. Transfer control to the program address given by accumulator X. Another unconditional branch, useful when the target address is not known at the time of writing the program. It may, for example, depend on circumstances. Teamed with PUT3 and GOTO1, used to implement calling and subsequently returning from a subroutine.

STOP. Stop.

In this we have more or less followed the layout used by ICL, as illustrated above: do something or other, using contents of this accumulator and of that data address.

As noted above, there is no one such set of instructions and much time has been spent on devising sets suitable for various computing environments and circumstances. While the high level languages actually used by most computer programmers are layered on top of low level languages of this sort.

The business of moving data between memory and peripheral devices such as discs, tapes and humans has been omitted. A substantial complication, but one which does not bear on the matter in hand.

A simple example to add up some numbers held in memory

In what follows, ‘[‘ is used to mark comments, which are not part of the instruction. Most programming languages allow comments.

0: SET1 1, 100 [the code that follows the ‘IF’ below is to be executed 100 times

1: SET1 2, 0 [set the value of the accumulator which will be use to accumulate the sum to zero

2: SET1 3, 1000 [the address of the first of the 100 items – which taken together occupy a block of 100 data locations

3: IF 1, 8 [stop the loop when we have done 100 executions

4: ADD2 2, 3 [add the first or the next data item, addressed by accumulator 3, into accumulator 2

5: ADD3 3, 1 [move the data address pointer to the next position

6: SUB3 1, 1 [subtract 1 from accumulator 1. This new value is tested at 3 above, after the goto which follows

7: GOTO1 3 [go back to the start of the loop

8: we now have the sum of 100 data values in accumulator 2. The program winds up and stops

Note that there are lots of slightly different ways to code up even this simple example. Some of these differences may be to do with style: in a large program it is helpful if the programmer has a style and sticks to it.

Also that it would be possible to avoid the use of the loop by writing out 100 short blocks of code, which would be clumsy and prone to error. But not possible at all in cases where the number of repetitions is not known in advance and may be large.

Conclusions

We have set the scene for comparison with more exotic forms of program. How much of this sort of thing we can we get a bath full of chemicals to do?

References

Reference 1: Feeling and knowing: making minds conscious – Antonio Damasio – 2021.

Reference 2: http://www.chilton-computing.org.uk/acl/literature/manuals/icl1906a/p001.htm. The source for ICL 1900’s. For present purposes, the best of the various offerings turned up by Bing or Google.

Reference 3: http://psmv2.blogspot.com/2015/11/how-does-brain-do-sort-of-things-that.html

Reference 4: https://psmv3.blogspot.com/2018/06/oak-ridge.html

No comments:

Post a Comment