Please note that it is almost imperative that you have at least some RAM in page 0 (addresses $0000 to $00FF) and page 1 (addresses $0100-$01FF). This space does not have to be 100% RAM, but virtually all applications will need at least some in each of these pages.
Page 0 (or zero page, abbreviated ZP, occupying addresses $0000 to $00FF) is used for shorter, single-byte addressing; but more importantly, ZP is, in a sense, 256 8-bit processor registers. Pre-indexed and post-indexed indirect addressing modes are only available through ZP where the pointer addresses reside. LDA (ZP,X) and LDA (ZP),Y are examples. (Assuming you've been previewing the instruction set, you will recognize "LDA" as "LoaD Accumulator" which is the primary working register, and then there's the operand and addressing mode, with the parentheses meaning "indirect" and the X or Y index register inside or outside the parentheses meaning pre- or post-indexing. "ZP" gets filled in with a zero-page address, usually referred to by a name that's meaningful to humans. The excellent programming manual referenced earlier explains the instruction set super well.)
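To make the difference between pre- and post-indexing concrete, here is a minimal sketch in Python (not 6502 code) of how the two effective addresses are formed. The function names, the pointer locations $20 and $24, and the target addresses are made up purely for illustration.

```python
def read_word_zp(mem, zp_addr):
    """Read a little-endian 16-bit pointer from zero page, wrapping within page 0."""
    lo = mem[zp_addr & 0xFF]
    hi = mem[(zp_addr + 1) & 0xFF]
    return lo | (hi << 8)

def ea_pre_indexed(mem, zp, x):
    """(ZP,X): add X to the zero-page address first, then fetch the pointer."""
    return read_word_zp(mem, (zp + x) & 0xFF)

def ea_post_indexed(mem, zp, y):
    """(ZP),Y: fetch the pointer first, then add Y to the pointer's value."""
    return (read_word_zp(mem, zp) + y) & 0xFFFF

# Hypothetical memory contents, chosen only for this example.
mem = bytearray(0x10000)
mem[0x20], mem[0x21] = 0x00, 0x30   # pointer at $20-$21 holds $3000
mem[0x24], mem[0x25] = 0x80, 0x40   # pointer at $24-$25 holds $4080

print(hex(ea_pre_indexed(mem, 0x20, 4)))   # LDA ($20,X) with X=4 -> 0x4080
print(hex(ea_post_indexed(mem, 0x20, 4)))  # LDA ($20),Y with Y=4 -> 0x3004
```

With pre-indexing, X selects which zero-page pointer gets used; with post-indexing, Y selects an offset from the address that a single pointer holds.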
As for page 1 (addresses $0100-$01FF), you need at least some RAM there because the 6502 uses this page for its hardware stack. The instruction set makes it somewhat practical to implement other stacks as well, but the processor's own native stack, which it uses to find its way back from subroutines and interrupts, is in page 1. (In the case of interrupts, the processor status is automatically saved on the stack as well.) Without RAM in page 1, you are severely limited: you cannot use any subroutines or interrupts.
The low 8 bits of the stack pointer are held in register S, and the high 8 bits are always implied to be $01. When the 6502 stores a byte at the address pointed to by the stack pointer, it immediately decrements the stack pointer. In other words, the stack grows down, not up. If you had 128 bytes of RAM in addresses $180 to $1FF, you would initialize your stack pointer to $FF by putting LDX #$FF, TXS at the beginning of your program. This puts $FF in index register X, and then transfers it to the stack pointer register S. Later, to index into the stack, you can TSX and then use LDA $105,X for example.
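The push and pull behavior just described can be sketched as a small Python model. This is an illustration only; the class name Stack6502 and the pushed values are hypothetical.

```python
class Stack6502:
    """Simplified model of the page-1 hardware stack (illustration only)."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.s = 0xFF                 # as set up by LDX #$FF, TXS

    def push(self, value):
        # Store at $0100+S, then decrement S: the stack grows downward.
        self.mem[0x0100 | self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def pull(self):
        # Increment S first, then read from $0100+S.
        self.s = (self.s + 1) & 0xFF
        return self.mem[0x0100 | self.s]

stack = Stack6502()
stack.push(0x11)
stack.push(0x22)
print(hex(stack.s))                       # 0xfd

# After TSX, X holds S. Since S points at the next free byte,
# LDA $101,X reads the most recently pushed byte.
x = stack.s
print(hex(stack.mem[0x0101 + x]))         # 0x22
```

Because S always points at the next free location, offsets into the stack start at $101,X for the newest byte, $102,X for the one pushed before it, and so on.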
The S register is not directly accessible; but the stack itself is. You can push values on the stack with PHA, PHX, and PHY (all taking three clocks, or 3µs @ 1MHz) and pull values off the stack with PLA, PLX, and PLY (all taking four clocks). (The X and Y versions exist only on the CMOS 65C02, not on the original NMOS 6502.) Sometimes this stack is also used to pass parameters to and from various routines. PHP and PLP push and pull the processor status register P. JSR, the 6502's subroutine-call instruction, pushes two bytes on the stack for a return address. An interrupt pushes three bytes (the two-byte return address plus the processor status) and still takes only 7 clocks altogether, giving it the fastest interrupt response of any 8-bit processor. The return-from-subroutine instruction is RTS and the return-from-interrupt instruction is RTI. Both take 6 clocks, including the restoration of the processor status register in the case of RTI.
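As a rough sketch of what JSR, RTS, an interrupt, and RTI do to the stack, here is a heavily simplified Python model. The class name, the addresses $8000, $9000 and $A000, and the initial status value are assumptions made for the example. Details such as the B flag in the pushed status byte are left out.

```python
class Cpu:
    """Hypothetical, heavily simplified model: only PC, S, P and memory."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.pc = 0x0000
        self.s = 0xFF      # stack pointer low byte; high byte is implied $01
        self.p = 0x20      # processor status (value chosen arbitrarily here)

    def push(self, value):
        self.mem[0x0100 | self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def pull(self):
        self.s = (self.s + 1) & 0xFF
        return self.mem[0x0100 | self.s]

    def jsr(self, target):
        # self.pc is assumed to point at the JSR op code (a 3-byte instruction).
        # The 6502 pushes the address of that instruction's last byte, high byte first.
        ret = (self.pc + 2) & 0xFFFF
        self.push(ret >> 8)
        self.push(ret & 0xFF)
        self.pc = target

    def rts(self):
        lo = self.pull()
        hi = self.pull()
        self.pc = (((hi << 8) | lo) + 1) & 0xFFFF   # resume after the JSR

    def interrupt(self, isr_address):
        # Three bytes: return address high, return address low, then status.
        self.push(self.pc >> 8)
        self.push(self.pc & 0xFF)
        self.push(self.p)
        self.p |= 0x04                # set the I (interrupt-disable) bit
        self.pc = isr_address

    def rti(self):
        self.p = self.pull()          # status is restored first
        lo = self.pull()
        hi = self.pull()
        self.pc = (hi << 8) | lo

cpu = Cpu()
cpu.pc = 0x8000
cpu.jsr(0x9000)
print(hex(cpu.s))      # 0xfd: two bytes used
cpu.rts()
print(hex(cpu.pc))     # 0x8003: the instruction after the JSR
cpu.interrupt(0xA000)
print(hex(cpu.s))      # 0xfc: three bytes used
cpu.rti()
print(hex(cpu.pc))     # 0x8003
```

Note that JSR saves the return address minus one and RTS adds the one back, whereas an interrupt saves the actual return address and RTI uses it unchanged.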
ROM in Page $FF: (Almost) Imperative
Normally your design will need ROM in addresses $FFFA-FFFF. These addresses contain the addresses of the reset routine, the IRQ ISR (interrupt-service routine) and the NMI ISR.
Reset (RST): When the 6502's RST input gets pulled low and then brought back high, the 6502 starts its reset process, and reads from $FFFC-FFFD the address at which to start executing program instructions. Notice it does not start executing at address $FFFC, but reads it to get the beginning address of the routine where it should start executing. That routine will normally have to be in ROM.
Non-maskable interrupts (NMI): When the 6502's NMI input experiences a high-to-low transition, the processor finishes the currently-executing instruction and then gets the beginning address of the NMI interrupt service routine (ISR) from $FFFA-FFFB. Again, it does not start executing at address $FFFA, but reads this pair of bytes to get the beginning address of the NMI ISR where it should start executing.
(Maskable) Interrupt request (IRQ): When the 6502's IRQ input is low and the I (interrupt-disable) bit in the status register is clear, the processor finishes the currently-executing instruction and then gets the beginning address of the IRQ ISR from $FFFE-FFFF. Again, it does not start executing at address $FFFE, but reads this pair of bytes to find the beginning address of the IRQ ISR.
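The three vector fetches described above can be sketched as a simple table lookup. This Python fragment is an illustration only; the routine addresses $8000, $9000 and $9040 are made up, and only the vector locations themselves come from the text.

```python
# Vector locations from the text; each holds a two-byte address, low byte first.
VECTORS = {"NMI": 0xFFFA, "RESET": 0xFFFC, "IRQ": 0xFFFE}

def fetch_vector(mem, name):
    """Return the address stored at the named vector (not the vector's own address)."""
    base = VECTORS[name]
    return mem[base] | (mem[base + 1] << 8)

# Hypothetical ROM contents for $FFFA-$FFFF.
mem = bytearray(0x10000)
mem[0xFFFA:0x10000] = bytes([
    0x00, 0x90,   # NMI ISR at $9000
    0x00, 0x80,   # reset routine at $8000
    0x40, 0x90,   # IRQ ISR at $9040
])

for name in ("RESET", "NMI", "IRQ"):
    print(name, hex(fetch_vector(mem, name)))
```

In each case the processor loads the fetched value into the program counter and starts executing there, which is why the vector locations contain addresses rather than instructions.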
The reason you normally have to have ROM here is that when the computer powers up and begins the reset process, it will not have
had a chance yet to store anything into RAM. The first thing it turns to is the reset vector above ($FFFC-FFFD), and that must be
in place already. This normally means ROM. There are ways to have RAM there and pre-load the RAM before the processor's RST is
released, meaning you would not need any ROM at all in the memory map. This is beyond the scope of this primer, but we will
mention that having the interrupt vectors at $FFFA-FFFB and $FFFE-FFFF in RAM allows you to change them on the fly, possibly
expediting the interrupt service if less polling is required to find out which device requested the interrupt service. This is
especially true if the hardware itself can make these locations reflect the addresses of the appropriate ISRs based on which
device is requesting interrupt service. Normally ROM must also contain the reset routine and the ability to load your application
program if that's not already in ROM.
Low Byte First
By the way, in two-byte values, the low byte is always first. This is called "little endian." It may seem backwards, but it allows the processor to run faster. (It has nothing to do with even and odd addresses. A two-byte value can start with the low byte on an even or an odd address, and the high byte will follow, regardless.) While reading the next byte after an op code, the processor can simultaneously be decoding the op code and determining if that next byte is the next op code, the first of two operand bytes, or the only byte of a one-byte operand. If indexing is required, the addition can begin taking place on the low operand byte (which is where you start the addition anyway), regardless of whether there's a high operand byte still to come.
Take for example the instruction LDA $684A,X. The processor reads the op code in the first clock period, and the $4A low byte in the second. Now in the third, while it is reading the $68 high byte, it can simultaneously be adding the X register's value to the $4A low byte. It would have to wait and waste another clock period if the high byte were first. If the first addition generated a carry, it gets added into the high byte in the next clock period; otherwise that next clock period is used for fetching the value from the resulting address, meaning the whole absolute indexed load-accumulator instruction took only four clocks.
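The LDA $684A,X timing can be sketched in Python as follows. This is a simplified model of the cycle count described above, not a description of the chip's internal circuitry; the function name and the X values are chosen for illustration.

```python
def lda_abs_x(lo, hi, x):
    """Return (effective address, clock count) for LDA abs,X.

    lo and hi are the operand bytes in the order they appear in memory
    (low byte first); x is the index register value.
    """
    low_sum = lo + x                 # can start as soon as the low byte arrives
    carry = low_sum > 0xFF           # did the index cross a page boundary?
    address = ((hi << 8) + low_sum) & 0xFFFF
    cycles = 4 + (1 if carry else 0) # one extra clock to fix up the high byte
    return address, cycles

addr, clocks = lda_abs_x(0x4A, 0x68, 0x10)
print(hex(addr), clocks)             # 0x685a 4

addr, clocks = lda_abs_x(0x4A, 0x68, 0xC0)
print(hex(addr), clocks)             # 0x690a 5
```

The second call shows the carry case: $4A + $C0 overflows the low byte, so the high byte must be incremented to $69, which costs the extra clock period.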
It's part of the pipelining scheme whereby the 6502 does more than one thing per clock period. Take another example, addition of a constant to the accumulator, like ADC #51. This requires five operations: 1. fetch op code; 2. interpret op code; 3. get operand; 4. add the operand and the accumulator's contents; and 5. store the result in the accumulator. The 6502 does the five steps in only two clock periods (2 microseconds at 1MHz). Op code interpretation (step 2) happens while the operand is being fetched (step 3). Steps 4 and 5 (doing the addition and putting the result in the accumulator) happen while the processor is getting the next instruction op code.
This, plus things like the implied compare-to-0 that is built into every instruction that changes the accumulator or the X or Y register, was part of why the 6502 performed so well.
ZP could have been made to be any other page, but ZP addresses are more intuitive with leading zeroes suppressed. Although I'm
not a microprocessor designer, I suspect it was easier (made the chip simpler internally) to clear the ADH (address-high register)
during the reading of the first operand byte than to put some other page number there. Putting the reset and interrupt vectors
right at one end in the last few addresses makes the greatest stretches of regular memory available. Putting them at the $FFFF
end does not eat into ZP RAM and makes address decoding much easier than it would be if addresses $0000-$0005 had to be ROM while
$0006-$01FF (or at least some of that range) had to be RAM.
last updated Feb 20, 2016