

6502 PRIMER: Building your own 6502 computer


Memory Map Requirements

The 6502 treats I/O and memory the same.  For example, if you have the instruction STA $4006, the processor doesn't care if you're storing the accumulator value to RAM or to an I/O IC's register, or anything else.  (You would normally want to have your source code specify the name of a variable, I/O register, or whatever, but the emphasis here is on the actual address.)  There are only a couple of relatively non-negotiable requirements to keep in mind.


RAM in Page 0 and Page 1: Imperative!

Please note that it is almost imperative that you have at least some RAM in page 0 (addresses $0000 to $00FF) and page 1 (addresses $0100-$01FF).  This space does not have to be 100% RAM, but virtually all applications will need at least some in each of these pages.

Page 0 (or zero page, abbreviated ZP) is used for shorter, single-byte addressing; but more importantly, ZP is, in a sense, 256 8-bit processor registers.  The pre-indexed and post-indexed indirect addressing modes work only through ZP, which is where their pointers must reside.  LDA (ZP,X) and LDA (ZP),Y are examples.  (Assuming you've been previewing the instruction set, you will recognize "LDA" as "LoaD Accumulator," the accumulator being the primary working register.  After the mnemonic come the operand and addressing mode, with the parentheses meaning "indirect" and the X or Y index register inside or outside the parentheses meaning pre- or post-indexing.  "ZP" gets filled in with a zero-page address, usually referred to by a name that's meaningful to humans.  The excellent programming manual referenced earlier explains the instruction set very well.)
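To make the difference between the two indirect modes concrete, here is a minimal Python model of how each one forms its effective address.  It is only a sketch: the function names, the 64K bytearray standing in for memory, and the pointer values are all made up for illustration, and the pointer fetch is assumed to wrap around within page 0.

```python
def read_word_zp(mem, zp_addr):
    """Read a little-endian 16-bit pointer from zero page (wraps within page 0)."""
    lo = mem[zp_addr & 0xFF]
    hi = mem[(zp_addr + 1) & 0xFF]
    return (hi << 8) | lo

def ea_pre_indexed(mem, zp, x):
    """LDA (ZP,X): add X to the zero-page address first, then fetch the pointer."""
    return read_word_zp(mem, (zp + x) & 0xFF)

def ea_post_indexed(mem, zp, y):
    """LDA (ZP),Y: fetch the pointer from zero page first, then add Y to it."""
    return (read_word_zp(mem, zp) + y) & 0xFFFF

# Hypothetical memory contents, chosen only for the demonstration.
mem = bytearray(0x10000)
mem[0x20], mem[0x21] = 0x00, 0x40   # pointer at $20 holds $4000
mem[0x24], mem[0x25] = 0x34, 0x12   # pointer at $24 holds $1234

print(hex(ea_pre_indexed(mem, 0x20, 4)))    # uses the pointer stored at $24
print(hex(ea_post_indexed(mem, 0x20, 4)))   # uses the pointer at $20, plus 4
```

With the same operand and the same index value, pre-indexing selects a different pointer, while post-indexing steps through the data that one pointer points to.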

As for page 1 (address range of $0100-$01FF), you need at least some RAM there because the 6502 uses this page for its hardware stack.  The instruction set makes it somewhat practical to implement other stacks as well, but the processor's own native stack that it uses to find its way back from subroutines and interrupts is in page 1.  (In the case of interrupts, the processor status is automatically saved on the stack as well.)  If you don't have RAM in page 1, you will have the severe limitation of not being able to have any subroutines or interrupts.

The low 8 bits of the stack pointer are held in register S, and the high 8 bits are always implied to be $01.  When the 6502 stores a byte at the address pointed to by the stack pointer, it immediately decrements the stack pointer.  In other words, the stack grows down, not up.  If you had 128 bytes of RAM in addresses $0180 to $01FF, you would initialize your stack pointer to $FF by putting LDX #$FF, TXS at the beginning of your program.  This puts $FF in index register X, and then transfers it to the stack pointer register S.  Later, for indexing into the stack, you can TSX and use LDA $0105,X for example.
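The push and pull behavior described above can be modeled in a few lines of Python.  This is a sketch under stated assumptions, not a full emulator: the class name Stack6502 and its methods are hypothetical, and the whole 64K address space is treated as RAM for simplicity.

```python
class Stack6502:
    """Toy model of the 6502 hardware stack in page 1."""

    def __init__(self):
        self.mem = bytearray(0x10000)   # pretend all 64K is RAM
        self.s = 0xFF                   # as left by LDX #$FF, TXS

    def push(self, value):
        # Store at $0100 + S, then decrement S: the stack grows down.
        self.mem[0x0100 | self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def pull(self):
        # Increment S first, then read from $0100 + S.
        self.s = (self.s + 1) & 0xFF
        return self.mem[0x0100 | self.s]

stack = Stack6502()
stack.push(0x12)        # lands at $01FF
stack.push(0x34)        # lands at $01FE
print(hex(stack.s))     # S now points at the next free byte
print(hex(stack.pull()))
```

Note that S always points to the next free location, one below the most recently pushed byte.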

The S register is not directly accessible, but the stack itself is.  You can push values on the stack with PHA, PHX, and PHY (all taking three clocks, or 3µs @ 1MHz) and pull values off the stack with PLA, PLX, and PLY (all taking four clocks).  (PHX, PHY, PLX, and PLY are 65C02 instructions; the original NMOS 6502 does not have them.)  Sometimes this stack is also used to pass parameters to and from various routines.  PHP and PLP push and pull the processor status register P.  JSR, the 6502's subroutine-call instruction, pushes two bytes on the stack for a return address.  An interrupt pushes one additional byte, the processor status, and still takes only 7 clocks altogether, giving it the fastest interrupt response of any 8-bit processor.  The return-from-subroutine instruction is RTS and the return-from-interrupt instruction is RTI.  Both take 6 clocks, including the restoration of the processor status register in the case of RTI.
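If you want to turn the clock counts above into time at your own clock speed, the arithmetic is simple enough to sketch in Python.  The table holds only the counts quoted in the paragraph above; the names CLOCKS and microseconds are made up for this example.

```python
# Clock counts quoted in the text above.
CLOCKS = {
    "PHA": 3,
    "PLA": 4,
    "RTS": 6,
    "RTI": 6,
    "interrupt entry": 7,
}

def microseconds(clocks, mhz=1.0):
    """Time taken by a number of clocks at a given clock frequency in MHz."""
    return clocks / mhz

# Minimum overhead of taking an interrupt and returning from it,
# not counting anything the ISR itself does.
overhead = CLOCKS["interrupt entry"] + CLOCKS["RTI"]

print(microseconds(CLOCKS["PHA"]))        # one PHA at 1 MHz
print(microseconds(overhead, mhz=2.0))    # entry plus RTI at 2 MHz
```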


ROM in Page $FF: (Almost) Imperative

Normally your design will need ROM in addresses $FFFA-FFFF.  These addresses contain the addresses of the reset routine, the IRQ ISR (interrupt-service routine) and the NMI ISR.

Reset (RST):  When the 6502's RST input gets pulled low and then brought back high, the 6502 starts its reset process, and gets the address to start executing program instructions from $FFFC-FFFD.  Notice it does not start executing at address $FFFC, but reads it to get the beginning address of the routine where it should start executing.  That routine will normally have to be in ROM.

Non-maskable interrupts (NMI):  When the 6502's NMI input experiences a high-to-low transition, the processor finishes the currently-executing instruction and then gets the beginning address of the NMI interrupt service routine (ISR) from $FFFA-FFFB.  Again, it does not start executing at address $FFFA, but reads this pair of bytes to get the beginning address of the NMI ISR where it should start executing.

(Maskable) Interrupt request (IRQ):  When the 6502's IRQ input is low and the I (interrupt-disable) bit in the status register is clear, the processor finishes the currently-executing instruction and then gets the beginning address of the IRQ ISR from $FFFE-FFFF.  Again, it does not start executing at address $FFFE, but reads this pair of bytes to find the beginning address of the IRQ ISR.
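The point that the processor reads an address from the vector, rather than executing at the vector, can be shown with a small Python sketch.  The routine addresses ($8000, $9000, $9100) are hypothetical, picked only for the demonstration, and read_vector is a made-up helper name.

```python
# Vector locations as described above.
VECTORS = {"NMI": 0xFFFA, "RESET": 0xFFFC, "IRQ": 0xFFFE}

def read_vector(mem, name):
    """Return the 16-bit address stored (low byte first) at the named vector."""
    addr = VECTORS[name]
    return mem[addr] | (mem[addr + 1] << 8)

mem = bytearray(0x10000)
# Hypothetical ROM contents for $FFFA-$FFFF:
#   NMI ISR at $9000, reset routine at $8000, IRQ ISR at $9100.
mem[0xFFFA:0x10000] = bytes([0x00, 0x90, 0x00, 0x80, 0x00, 0x91])

for name in ("RESET", "NMI", "IRQ"):
    print(name, hex(read_vector(mem, name)))
```

After reset, the program counter would be loaded with the value returned for "RESET", and execution would begin there.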

The reason you normally have to have ROM here is that when the computer powers up and begins the reset process, it will not have had a chance yet to store anything into RAM.  The first thing it turns to is the reset vector above ($FFFC-FFFD), and that must be in place already.  This normally means ROM.  There are ways to have RAM there and pre-load the RAM before the processor's RST is released, meaning you would not need any ROM at all in the memory map.  This is beyond the scope of this primer, but we will mention that having the interrupt vectors at $FFFA-FFFB and $FFFE-FFFF in RAM allows you to change them on the fly, possibly expediting the interrupt service if less polling is required to find out which device requested the interrupt service.  This is especially true if the hardware itself can make these locations reflect the addresses of the appropriate ISRs based on which device is requesting interrupt service.  Normally ROM must also contain the reset routine and the ability to load your application program if that's not already in ROM.


Low Byte First

By the way, in two-byte values, the low byte is always first.  This is called "little endian."  It may seem backwards, but it allows the processor to run faster.  (It has nothing to do with even and odd addresses.  A two-byte value can start with the low byte on an even or an odd address, and the high byte will follow, regardless.)  While reading the next byte after an op code, the processor can simultaneously be decoding the op code and determining if that next byte is the next op code, the first of two operand bytes, or the only byte of a one-byte operand.  If indexing is required, the addition can begin taking place on the low operand byte (which is where you start the addition anyway), regardless of whether there's a high operand byte still to come.
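As a quick illustration of the byte order, here is how a two-byte value is laid out in memory, using Python's built-in integer conversions.  The value $684A is just an example.

```python
value = 0x684A

# Low byte first, as the 6502 stores two-byte values.
stored = value.to_bytes(2, "little")
print(stored.hex())                      # low byte, then high byte

# Reassembling the value from the two bytes in memory order.
lo, hi = stored
print(hex(lo | (hi << 8)))
print(hex(int.from_bytes(stored, "little")))
```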

Take for example the instruction LDA $684A,X.  The processor reads the op code in the first clock period, and the $4A low byte in the second.  Now in the third, while it is reading the $68 high byte, it can simultaneously be adding the X register's value to the $4A low byte.  It would have to wait and waste another clock period if the high byte were first.  If the first addition generated a carry, it gets added into the high byte in the next clock period; otherwise that next clock period is used for fetching the value from the resulting address, meaning the whole absolute indexed load-accumulator instruction took only four clocks.
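The clock-by-clock description above can be condensed into a small Python model of an absolute indexed load.  It is a sketch of the timing rule only (four clocks, plus one if adding X carries into the high byte); the function name lda_abs_x is hypothetical.

```python
def lda_abs_x(operand_lo, operand_hi, x):
    """Return (effective_address, clocks) for an absolute,X load."""
    # The low-byte addition happens while the high operand byte is being read.
    lo_sum = operand_lo + x
    carry = lo_sum > 0xFF
    # A carry costs one extra clock to fix up the high byte.
    hi = (operand_hi + (1 if carry else 0)) & 0xFF
    address = (hi << 8) | (lo_sum & 0xFF)
    clocks = 5 if carry else 4
    return address, clocks

# LDA $684A,X with two different values in X.
for x in (0x10, 0xC0):
    address, clocks = lda_abs_x(0x4A, 0x68, x)
    print(hex(x), hex(address), clocks)
```

With X = $10 the sum stays within the page, so the load takes four clocks; with X = $C0 the low-byte addition carries, and the load takes five.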

It's part of the pipelining scheme whereby the 6502 does more than one thing per clock period.  Take another example, addition of a constant to the accumulator, like ADC #51.  This requires five operations:  1. fetch op code;  2. interpret op code;  3. get operand;  4. add the operand and the accumulator's contents;  and 5. store the result in the accumulator.  The 6502 does the five steps in only two clock periods (2 microseconds at 1MHz).  Op code interpretation (step 2) happens while the operand is being fetched (step 3).  Steps 4 and 5 (doing the addition and putting the result in the accumulator) happen while the processor is getting the next instruction op code.

This, plus things like the implied compare-to-0 that is built into any instruction that changes the accumulator or the X or Y register, was part of why the 6502 performed so well.
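The implied compare-to-0 means that loading or changing A, X, or Y updates the Z (zero) and N (negative) flags automatically, so a branch can follow directly with no separate compare instruction.  A minimal Python sketch of that flag update, with load_flags as a made-up name:

```python
def load_flags(value):
    """Flags as left by an instruction that loads an 8-bit value into A, X, or Y."""
    value &= 0xFF
    return {
        "Z": value == 0,            # set if the result is zero
        "N": bool(value & 0x80),    # copy of bit 7 of the result
    }

for v in (0x00, 0x01, 0x80):
    print(hex(v), load_flags(v))
```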

ZP could have been made to be any other page, but ZP addresses are more intuitive with leading zeroes suppressed.  Although I'm not a microprocessor designer, I suspect it was easier (made the chip simpler internally) to clear the ADH (address-high register) during the reading of the first operand byte than to put some other page number there.  Putting the reset and interrupt vectors right at one end in the last few addresses makes the greatest stretches of regular memory available.  Putting them at the $FFFF end does not eat into ZP RAM and makes address decoding much easier than it would be if addresses $0000-$0005 had to be ROM while $0006-$01FF (or at least some of that range) had to be RAM.



last updated Feb 20, 2016