I used to think that 8- and 16-bit microprocessor chips were pretty intelligent, but that was before I started using single-chip microcontrollers. These devices, with their on-chip EPROM, really are smart. They power up already knowing what to do—unlike their more powerful brothers, which just sit there and go "Duh!" until they get hooked up to some program memory.
Without external RAM and PROM, a microprocessor chip is so much wasted silicon. Consequently, a product using an embedded processor usually has the system code and the application program stored in one or more large PROMs. Development is accomplished with a PROM emulator or a similar device.
Almost all applications require some RAM, so it's tempting to put only RAM chips on the board—particularly since they still have a speed advantage over PROMs. Program development is easier because the code is easily changed. Yet in the final product, the system and application code has to come from somewhere. This source doesn't have to be connected via a wide data or address bus, though, or even be particularly fast. It could be a single slow, cheap EPROM hooked up via an 8-bit port. If you don't mind an appreciable startup delay, the code could come from a disk drive, over a serial link from some other equipment, or even be stored in a serial EEPROM, which takes up almost no board space.
Unfortunately, an all-RAM board has no way to load the system code and get the processor out of the "Duh!" mode. At the very least, a bootstrap loader is necessary. That means finding space for two whopping big PROM chips in a 16-bit system.
I recently redesigned a board that had its system code stored in the processor, a 17C43. This was to be replaced with an 80C86, but there was no room on the board for one PROM, much less two. Luckily, there was an I/O board in the system. It had room for a PROM and a 16-bit address generator. In theory, then, code could be copied to RAM via the 8-bit I/O port.
Enter The Smart Loader
If you could somehow pre-load the RAM with a hundred bytes of code, no on-board PROM would be needed merely to load the system code. Fortunately, there is a device that contains enough PROM space for a bootstrap loader and has enough smarts to load it into RAM. It even fits in an 18-pin package. Two accessory chips are necessary, but the space penalty isn't great. Although there was no space on my board for even one PROM, it was possible to squeeze in three smaller chips.
The smart chip is a PIC16C54 microcontroller from Microchip Technology Inc., Chandler, Ariz. One accessory chip is a quad two-input multiplexer, such as a 74HC157. This gives the PIC access to the RAM control signals. The other chip is an 8-bit buffer, such as a 74HC541. This lets the 8-bit port of the PIC drive both halves of the board's 16-bit bus.
Here's how things work. One PIC port bit sends a reset to the main processor. This disconnects the processor from the data/address bus, allowing the PIC to drive the lower eight bus bits. The buffer chip bridges the two halves of the 16-bit bus, so the same byte appears on the high half. The PIC's byte output is then available to both the high-byte and low-byte memory chips. The PIC generates a separate write-enable signal for each RAM chip.
In a typical small-system configuration, the data/address bus already has two 8-bit latch chips on it to capture the current RAM address (Fig. 1). In principle, the PIC could load these chips with any 16-bit address, but this would require either an extra port bit or a decoder chip. I compromised by wiring the high-latch chip's output-enable pin to the reset line. With the high latch's outputs disabled, pull-up resistors hold address bits 9 through 15 high as long as the PIC is in control. Any other active address bits and the M/IO selector bit also are pulled high. The address decoder thus activates the highest page of memory.
The PIC loads eight address bits into the lower address latch, limiting the bootstrap code to at most 512 bytes (256 words). This code is normally used only to transfer a few thousand bytes to RAM from a PROM or a serial port, so this should be ample space. Even a minimal disk controller could be included.
In theory, it's possible to set any high address with pull-up and pull-down resistors. But because the 80C86's startup address is FFFF0H, it's convenient to put the bootstrap code on the same 256-word page. Once the main processor is running, it can relocate code as required—for example, to use the interrupt vectors on page 0.
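The address arithmetic behind that choice can be checked with a few lines of Python. This is only a sketch: it assumes that every address bit from 9 through 19 is pulled high while the PIC is in control, and the names `HIGH_BITS` and `physical_address` are made up for illustration.

```python
PAGE_BYTES = 512                 # one 256-word page of 16-bit words
RESET_VECTOR = 0xFFFF0           # 80C86 startup address

# Assumption: address bits 9..19 are all held high by the pull-ups,
# so only the low nine bits of the address can vary.
HIGH_BITS = 0xFFFFF & ~(PAGE_BYTES - 1)


def physical_address(offset):
    """Return the 20-bit address of a byte offset inside the boot page."""
    if not 0 <= offset < PAGE_BYTES:
        raise ValueError("offset must lie inside the 512-byte boot page")
    return HIGH_BITS | offset


def page_offset(address):
    """Return the offset of an address inside the boot page."""
    if address & ~(PAGE_BYTES - 1) != HIGH_BITS:
        raise ValueError("address is not in the top page")
    return address & (PAGE_BYTES - 1)
```

Under these assumptions the page runs from FFE00H to FFFFFH, and the startup address falls at offset 1F0H inside it, which is why no pull-down resistors are needed.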
Both the write enable and the Address Latch Enable (ALE) pulses from the main processor pass through the multiplexer chip in normal operation. They are replaced by signals from the PIC's 4-bit port when the PIC is in control.
High Address Latch Disabled
The ALE signal generated by the CPU is gated off by the multiplexer chip to permit the PIC to load the address latches. (Both latches are loaded with the same value, but the high address latch is disabled.) The write-enable pins of the high and low RAM chips, which are driven in common when the CPU is writing to RAM, are separated to allow the PIC to write bytes to the RAM chips individually. If the main processor has an 8-bit data bus, however, things are much simpler.
Figure 2 shows the complete loader circuit, apart from the 8-pin, 10-kΩ SIP that pulls up address bits 9 through 15 when the address latch is disabled. An earlier implementation of this loader used three-state drivers to control the RAM. The drivers and the PIC output pins were connected in parallel to the RAM chips. Using a multiplexer chip eliminated an inverter.
When loading code, the PIC outputs an 8-bit address and writes it to the address latches (see the code listing). Next, it fetches an instruction byte from its EPROM, puts it on the bus, and sends a memory write pulse to the high or low RAM chip as appropriate. It repeats these operations sequentially until it has transferred as many bytes as are needed. Then it turns the main processor on and goes to sleep, consuming negligible power.
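The sequence can be modeled in Python to make the order of operations explicit. This is a toy model, not the PIC listing: the class `BootPage` and the function `load` are hypothetical, the latch value is treated as a plain even byte offset below 256, and any packing of higher address bits into the latch byte is left out.

```python
class BootPage:
    """Toy model: two byte-wide RAM chips behind one 8-bit address latch."""

    def __init__(self):
        self.low = {}    # RAM chip on the low half of the bus
        self.high = {}   # RAM chip on the high half of the bus
        self.latch = 0   # contents of the low address latch
        self.bus = 0     # byte the PIC is currently driving

    def latch_address(self, value):
        """PIC puts an address byte on the bus and pulses ALE."""
        self.latch = value & 0xFF

    def drive(self, byte):
        """PIC drives a data byte; the buffer mirrors it to the high half."""
        self.bus = byte & 0xFF

    def strobe_write(self, high):
        """PIC pulses the write enable of exactly one RAM chip."""
        chip = self.high if high else self.low
        chip[self.latch] = self.bus


def load(page, code, first_offset=0):
    """Copy an even-length byte string into the model, low byte first."""
    if len(code) % 2 or first_offset % 2:
        raise ValueError("code length and start offset must both be even")
    if first_offset + len(code) > 256:
        raise ValueError("this simplified model covers offsets below 256 only")
    for i in range(0, len(code), 2):
        page.latch_address(first_offset + i)   # one address per word
        page.drive(code[i])
        page.strobe_write(high=False)          # even byte to the low chip
        page.drive(code[i + 1])
        page.strobe_write(high=True)           # odd byte to the high chip
    return page
```

Each word costs one address write and two data writes, so the bus traffic is three port operations per 16-bit word.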
It's undesirable for the main processor to be turned on before there is code ready for it to execute. Consequently, the circuit has to allow for the PIC's built-in power-on delay, which can be as long as 30 ms. Initially, an RC reset signal some 100 ms long holds the processor off the bus. As soon as the PIC starts up, it generates an overriding reset. After that, it loads the bootstrap code and waits 100 ms to make sure that the RC circuit has timed out before it releases its reset.
Driving The Master Clock
The PIC can run from any convenient clock of up to 20 MHz. In my system, I used the PIC to drive the master clock crystal, saving the space taken by a crystal oscillator. If you do this, you must end the loader program with an endless loop since putting the PIC in sleep mode stops its clock.
In a word-addressed system, the low address latch normally stores address bits 1 through 8. Bit 0 isn't needed because it only serves to distinguish the high and low halves of a word when reading bytes. The two write-enable strobes in this circuit serve that function. When the microcontroller writes an address byte to the latch, bits 1 through 7 of the address appear in their normal positions in the byte. The same byte appears on both the high and low halves of the bus, so bit 8 of the address must be placed in the bit 0 position of the byte.
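The bit shuffle is easy to get wrong, so here is a short Python sketch of it. The function names are hypothetical, and the offset is assumed to be an even byte offset within the 512-byte boot page.

```python
def latch_byte(offset):
    """Pack an even byte offset (0..510) into the byte the PIC writes.

    Address bits 1..7 keep their positions; address bit 8 moves to bit 0.
    """
    if not 0 <= offset < 512:
        raise ValueError("offset must lie inside the 512-byte boot page")
    if offset & 1:
        raise ValueError("word addresses must be even")
    return (offset & 0xFE) | (offset >> 8)


def offset_from_latch(byte):
    """Recover the byte offset from a packed latch byte."""
    return (byte & 0xFE) | ((byte & 1) << 8)
```

With this packing, offset 1F0H (where the startup address lands in the top page) becomes latch byte F1H, and offset 100H becomes 01H.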
One peculiarity of the smaller PIC chips is their mechanism for storing data in their program EPROM. Data bytes are encoded as one of 256 different return instructions. To implement a lookup table, a subroutine call must be made to a calculated address. The number embedded in the return instruction at that address is loaded into the working register. Table addresses have eight bits. The ninth bit is zero, and as a result, the table must be located in the first 256 words of the chip's 512-word program memory.
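A Python model of the mechanism may help. It is a sketch only: `encode_table` and `lookup` are hypothetical helpers, and the RETLW opcode pattern (1000 kkkk kkkk) is taken from the 12-bit PIC instruction set and should be checked against the data sheet before being relied on.

```python
RETLW = 0x800        # 12-bit opcode pattern 1000 kkkk kkkk
OPCODE_MASK = 0xF00  # top four bits select the instruction


def encode_table(data):
    """Turn data bytes into program words, one return instruction each."""
    return [RETLW | (b & 0xFF) for b in data]


def lookup(program, address):
    """Model a call to a calculated address and return the working register.

    A call target carries only eight address bits, so the ninth bit is
    forced to zero here, just as it is in the chip.
    """
    target = address & 0xFF
    word = program[target]
    if word & OPCODE_MASK != RETLW:
        raise ValueError("call landed on something other than a table entry")
    return word & 0xFF
```

The masking in `lookup` shows why a table placed above word 255 can never be reached: the call simply lands in the lower half instead.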
Another oddity of the 16C5x series is that these models start execution at their highest memory address. This is a nuisance when programming the chip. The PIC's EPROM is loaded sequentially, making it important to pad all programs to exactly 512 instructions. If the unused program area is left unprogrammed, the chip will execute a dummy instruction on startup and roll over to address zero.
I once got into trouble when I put a copyright notice into the last few code words. The first thing the chip executed was a return instruction. If a chip happened to power up with nonzero return-stack contents, the program crashed. I mention this because the listing shows GOTO START as the first instruction, whereas ideally it should be instruction 511. The indirect subroutine call must lie within the first 256 instructions. That's why I put it immediately after GOTO START.
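One way to avoid this trap is to build the full 512-word image with a script rather than by hand. The Python sketch below does that under stated assumptions: `build_image` is a hypothetical helper, unused words are left in the erased all-ones state, and the GOTO opcode pattern (101k kkkk kkkk) comes from the 12-bit PIC instruction set and should be verified against the data sheet.

```python
PROGRAM_WORDS = 512   # PIC16C54 program memory size
ERASED = 0xFFF        # an unprogrammed 12-bit EPROM word reads all ones
GOTO = 0xA00          # 12-bit opcode pattern 101k kkkk kkkk


def build_image(code_words, start):
    """Pad a program to 512 words and put GOTO start in the last word."""
    if not 0 <= start < PROGRAM_WORDS:
        raise ValueError("start address must fit in nine bits")
    if len(code_words) > PROGRAM_WORDS - 1:
        raise ValueError("program leaves no room for the startup jump")
    padding = [ERASED] * (PROGRAM_WORDS - 1 - len(code_words))
    return list(code_words) + padding + [GOTO | start]
```

Because the last word is always the jump, nothing else (a copyright notice, for instance) can end up being the first instruction executed.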
Since the first two table entries indicate the loading address in RAM and the number of words to be loaded, only 252 entries—126 words—are available for the bootstrap program itself. The count represents the number of 16-bit words to be loaded, so the table must be padded to have an even number of bytes. Similarly, the loading address must be even.
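These constraints can be enforced when the table is generated. The Python sketch below is illustrative only: `build_table` is a hypothetical helper, the order of the two header entries is assumed, and since the text does not say how the loading address is packed into one entry, the sketch simply stores the word index. The pad byte 90H is the 8086 NOP.

```python
MAX_CODE_ENTRIES = 252   # table entries left for bootstrap bytes
PAD = 0x90               # 8086 NOP, used to round up to a whole word


def build_table(load_offset, code):
    """Return [address entry, word count, code bytes...] as a list of ints."""
    if load_offset % 2 or not 0 <= load_offset < 512:
        raise ValueError("loading address must be an even offset in the page")
    body = list(code)
    if len(body) % 2:
        body.append(PAD)             # the count is in words, so pad to even
    if len(body) > MAX_CODE_ENTRIES:
        raise ValueError("bootstrap code exceeds 252 bytes (126 words)")
    address_entry = load_offset >> 1   # assumption: stored as a word index
    return [address_entry, len(body) // 2] + body
```

A three-byte fragment, for example, comes out as two words, with the NOP filling the last byte.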
If some 250 bytes of bootstrap code aren't enough, the 16C56 can be substituted for the 16C54. The 16C56 has 1024 locations for storing instructions, of which 512 can hold data table bytes. The loading program must be changed to select the appropriate table.