Forth in C for Arduino Uno Board
ceForth_328
Forth in C for Arduino on ATmega328P
The first large-scale, working computer in the US was the Harvard
Mark I, designed by Howard Aiken at Harvard University and built by
IBM. It was an electromechanical monster completed in May 1944,
with programs stored on paper tape. Then came ENIAC built by J.
Presper Eckert and John Mauchly at University of Pennsylvania in July
1946. It was based on vacuum tubes, and programmed by patch
cords and switches. In these computers, programs were entered
through media completely different from the mechanisms performing
computation. This arrangement came to be called the Harvard
Architecture.
In 1945, John von Neumann, then at Princeton University, was invited
to visit ENIAC, and then wrote the classic "First Draft of a Report on the
EDVAC", in which he proposed the stored program computer, where
programs and data resided on the same memory medium. This
came to be called the Princeton Architecture or the von Neumann
Architecture, and has been adopted by most computer designers, but
not all of them.
The AVR family of microcontrollers from Atmel happened to follow the
Harvard Architecture, against the common practice in the industry. The
reason is that they use a large flash memory to store programs and
a small RAM memory to store data. The flash memory is organized in
16-bit words and the RAM memory is organized in 8-bit bytes. The two
memories are very different in their timing and read/write behaviors,
which warrants two separate memory buses and separate
instructions to access them.
The Arduino 0022 system used on Arduino boards requires that you
write application programs in 'sketches', which are based on the C
programming language. The C language hides the underlying
microcontroller from you. Instead, it presents you with a computing
model which is essentially a Harvard Architecture. Programs are placed
in locations hidden from you. Data are placed in locations you have to
declare, and then are secretly assigned by the compiler and the linker.
Functions and data are accessed by assigned names so that you are
prevented from making serious mistakes which may cause the computer
to crash. For casual users, Arduino 0022 matches the ATmega328P
microcontroller very well, sharing the same Harvard Architecture,
and this undoubtedly is one of the reasons why the Arduinos are such
a huge success.
The FORTH programming language definitely belongs to the Princeton
Architecture. It assumes that you have free access to all parts of a
microcontroller, and that programs and data share the same memory
space. Therefore, new commands and new data structures can be
added freely so that you have an interactive and extensible system to
develop and debug your applications.
The Harvard Architecture in the ATmega328P microcontroller is not a
big problem for me. In assembly language, I have complete
control over the instructions, RAM/flash memory spaces, and all the IO
devices, and I can impose a FORTH Virtual Machine on the
ATmega328 chip. This is 328eForth. An interesting feature of
ATmega328 is that if you have to write into the application section of
the flash memory, that part of your code must reside in the bootloader
section of the flash memory. Therefore, I have to take over the
bootloader section, and the resulting 328eForth system cannot
peacefully co-exist with the Arduino 0022 bootloader.
In ATmega328P, there are 2 KB of RAM, and at least 1.5 KB are free. I
cannot store machine instructions in RAM, because ATmega328P only
executes machine instructions stored in the flash memory. However, I
can design a FORTH Virtual Machine with pseudo instructions, which
are pure data as far as ATmega328P is concerned. These pseudo
instructions can be stored either in flash or in RAM. All I need is a
scheme to unify these two different memories so that I can use the
same set of read/write pseudo instructions to read the flash and RAM
memories, and to write the RAM memory.
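The idea of pseudo instructions that are "pure data" to the
microcontroller can be illustrated with a very small byte code
interpreter. This is only a sketch under stated assumptions: the
opcode names and numbers (OP_LIT, OP_ADD and so on), the stack depth
and the function name fvm_run are hypothetical and are not taken from
ceForth_328, whose own byte codes are not listed in this text.

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical opcodes for illustration; not the actual ceForth_328 set. */
enum { OP_HALT = 0, OP_LIT = 1, OP_ADD = 2, OP_DUP = 3, OP_MUL = 4 };

#define STACK_DEPTH 16   /* assumed depth, chosen only for this sketch */

/* Execute byte codes starting at code[0] until OP_HALT (or an unknown
   opcode) and return the top of the data stack, or 0 if it is empty.
   The byte codes are plain data, so the array may live anywhere.
   No bounds check is made on the code array itself. */
int16_t fvm_run(const uint8_t *code)
{
    int16_t  stack[STACK_DEPTH];
    int      sp = 0;   /* number of items currently on the stack */
    uint16_t ip = 0;   /* index of the next byte code to execute */

    for (;;) {
        uint8_t op = code[ip++];
        switch (op) {
        case OP_LIT:   /* push the following byte as a signed literal */
            assert(sp < STACK_DEPTH);
            stack[sp++] = (int8_t)code[ip++];
            break;
        case OP_ADD:   /* replace the top two items with their sum */
            assert(sp >= 2);
            stack[sp - 2] = (int16_t)(stack[sp - 2] + stack[sp - 1]);
            sp--;
            break;
        case OP_MUL:   /* replace the top two items with their product */
            assert(sp >= 2);
            stack[sp - 2] = (int16_t)(stack[sp - 2] * stack[sp - 1]);
            sp--;
            break;
        case OP_DUP:   /* duplicate the top item */
            assert(sp >= 1 && sp < STACK_DEPTH);
            stack[sp] = stack[sp - 1];
            sp++;
            break;
        case OP_HALT:
        default:       /* stop on halt or on anything unrecognized */
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
}
```

Given the byte sequence OP_LIT 3, OP_LIT 4, OP_ADD, OP_DUP, OP_MUL,
OP_HALT in an ordinary array, fvm_run adds 3 and 4, duplicates the
result and multiplies, leaving 49 on top of the stack. The
microcontroller never executes these bytes as machine instructions; it
only reads them.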
This is my way of building a Princeton Architecture on a Harvard
microcontroller with a Harvard programming language. The FORTH
Virtual Machine (FVM) has 30 pseudo instructions as byte codes. The
pseudo instructions are written as C routines, and a simple Finite
State Machine (FSM), also written in C, executes these byte codes,
which can be stored either in the flash or RAM memory. The complete
FORTH operating system, including an interpreter, a compiler and
other programming and debugging tools, is contained in a big data
structure called a dictionary. This dictionary contains a set of records
linked into a searchable linked list. Each record is the embodiment of
a FORTH command, and consists of a link field, a name field and a
code field. A FORTH command is called externally by a name which is
an ASCII string, and internally by a token, which is the address of its
code field. There are two types of FORTH commands: the primitive
commands having lists of pseudo instructions in their code fields, and
the compound commands having lists of tokens in their code fields.
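The record structure described above (link field, name field, code
field, with the token being the address of the code field) can be
sketched in C over a plain byte array. The layout used here is an
assumption made for illustration: a 2-byte little-endian link, a
1-byte name length, the name characters, then the code bytes. The
names dict_add, dict_find, DICT_BYTES, here and last are hypothetical,
and the actual ceForth_328 dictionary image may be laid out
differently.

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

#define DICT_BYTES 256   /* small size chosen only for this sketch */

uint8_t  dict[DICT_BYTES];
uint16_t here = 2;   /* next free byte; offset 0 is reserved as "end of list" */
uint16_t last = 0;   /* offset of the newest record's link field, 0 if none */

/* Append one record and return its token (offset of its code field).
   Assumed layout: link (2 bytes), name length (1 byte), name, code. */
uint16_t dict_add(const char *name, const uint8_t *code, uint8_t code_len)
{
    size_t name_len = strlen(name);
    assert(name_len > 0 && name_len < 32);
    assert(here + 3u + name_len + code_len <= DICT_BYTES);

    uint16_t record = here;
    dict[here++] = (uint8_t)(last & 0xFF);   /* link field, low byte  */
    dict[here++] = (uint8_t)(last >> 8);     /* link field, high byte */
    dict[here++] = (uint8_t)name_len;        /* name field: count ... */
    memcpy(&dict[here], name, name_len);     /* ... then characters   */
    here = (uint16_t)(here + name_len);

    uint16_t token = here;                   /* code field starts here */
    memcpy(&dict[here], code, code_len);
    here = (uint16_t)(here + code_len);

    last = record;                           /* newest record heads the list */
    return token;
}

/* Follow the links from the newest record backwards.
   Return the token of the matching command, or 0 if it is not found. */
uint16_t dict_find(const char *name)
{
    size_t   want = strlen(name);
    uint16_t rec  = last;

    while (rec != 0) {
        uint8_t name_len = dict[rec + 2];
        if (name_len == want && memcmp(&dict[rec + 3], name, want) == 0)
            return (uint16_t)(rec + 3 + name_len);
        rec = (uint16_t)(dict[rec] | (dict[rec + 1] << 8));
    }
    return 0;
}
```

Because both the name and the code field vary in length, each record
can only be located through the link of the record added after it,
which is why the search starts at the newest record and walks
backwards.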
The FORTH dictionary is a large and rather complex data structure,
because the name fields and the code fields are of variable length. My
very limited experience in C is not sufficient for me to build this data
structure in C, although very experienced C programmers in SVFIG
assured me that it can be done. I fell back to FORTH to build this
dictionary and then imported it into the ceForth_328 sketch as a code
array.
The dictionary code array is 8 KB in size, and is allocated and
initialized by C in the flash memory. However, the lowest 2304 bytes of
this code array are mapped to the RAM memory space in ATmega328P.
In ATmega328P, the RAM memory space is divided into two parts. The
lowest 256 bytes are mapped to the CPU and IO registers, and the
remaining 2048 bytes are RAM memory. The read/write pseudo
instructions are smart in that they use RAM memory instructions when
the memory address is below 2304, and they use flash memory
instructions otherwise. Therefore, the dictionary spans across the
flash and RAM memory spaces, so that I can add new commands to
the dictionary branch in RAM, while Arduino 0022 thinks I am just
writing harmless data into the RAM memory.
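The address test described above can be sketched as follows. The
boundary of 2304 bytes and the 8 KB image size come from the text;
everything else is an assumption for illustration. In particular, the
two arrays are host-side stand-ins for the RAM and flash memory
spaces, and the names fvm_fetch, fvm_store, ram_space and flash_image
are hypothetical. On the real chip the flash branch would presumably
go through something like avr-libc's pgm_read_byte rather than a plain
array access.

```c
#include <stdint.h>
#include <assert.h>

#define RAM_LIMIT   2304u   /* 256 register bytes + 2048 RAM bytes (from the text) */
#define IMAGE_SIZE  8192u   /* 8 KB dictionary code array (from the text) */

/* Stand-ins for the two memory spaces (assumption for a host build).
   flash_image is left writable only so a test can preload it. */
uint8_t ram_space[RAM_LIMIT];
uint8_t flash_image[IMAGE_SIZE];

/* Unified read: one address space, two physical memories. */
uint8_t fvm_fetch(uint16_t addr)
{
    assert(addr < IMAGE_SIZE);
    if (addr < RAM_LIMIT)
        return ram_space[addr];     /* RAM access below the boundary */
    return flash_image[addr];       /* flash access at or above it   */
}

/* Unified write: only the RAM part can be written.
   Returns 1 if the byte was stored, 0 if the address is in flash. */
int fvm_store(uint16_t addr, uint8_t value)
{
    if (addr >= RAM_LIMIT)
        return 0;                   /* flash is read-only at run time */
    ram_space[addr] = value;
    return 1;
}
```

With this arrangement the byte codes never need to know which memory
they are touching: a store to address 300 lands in RAM, while a store
to address 3000 is refused and the flash contents stay unchanged.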
This is how you can impose a Princeton Architecture on a Harvard
Architecture.
The limitations are that you have only 1.5 KB to compile new FORTH
commands, and that you lose these new commands when you lose
power.
As I said before, this ceForth_328 system is a teaser, allowing you to
experience FORTH within the confines of Arduino 0022. If you are
ready for serious application programming, move on to the 328eForth
system.
328eForth for Arduino Uno Board
$25 including assembly source code,
demo code, lessons, and detailed
documentation,