

6502 STACKS TREATISE


Local variables & environments

In a system that does not use preemptive multitasking and memory protection, nearly any program or subroutine has access to the same things in the computer that the others do.  That is the global environment.  A program or subroutine may also have access to certain data, mass-storage buffers, etc. that other programs and subroutines do not.  This "inner circle" of features collectively forms its local environment, and the typical way to provide it on the simple systems we're talking about here is by way of stacks.

The very simplest example is pushing processor registers at the beginning of an interrupt-service routine (ISR) so the ISR can use the same registers without interfering with the background task, and restoring the registers at the end.
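
As a minimal sketch of that, assuming an NMOS 6502 (which has no instructions to push X and Y directly) and a hypothetical label ISR, it might look like this:


ISR:    PHA          ; Save A first, since X and Y have to go through it.
        TXA
        PHA          ; Save X.
        TYA
        PHA          ; Save Y.

        <service_the_interrupt>   ; A, X, and Y are free to use here.

        PLA          ; Restore in the reverse order of the pushes.
        TAY
        PLA
        TAX
        PLA
        RTI          ; RTI restores the status register and program counter.


On the 65c02, PHX, PHY, PLX, and PLY can shorten this, and an ISR only needs to save the registers it actually uses.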

We can extend that idea and push data onto a stack to preserve it temporarily while another routine uses those variables, something probably obvious to the intermediate programmer.  For example, if you have a subroutine that needs to set the screen's cursor position for its own window without messing it up for other pending routines, you can push the current cursor position to save it, then change it to whatever you need, take care of business using the same display subroutines and the same cursor-position variables that other pending routines use, and at the end of the subroutine, restore the old cursor x and y values that had been saved on the stack, and you will not have interfered with any other pending routines.  It might go something like this:



        LDA  Column
        PHA
        LDA  Row
        PHA

        <do_stuff>   ; Operations using the Row and Column variables will not
        <do_stuff>   ; interfere with other routines' use of the same variables.

        PLA
        STA  Row
        PLA
        STA  Column


Other aspects of local variables and environments, especially stack frames and how they can be carried out on a 6502, may be new, but hopefully easy to envision with the groundwork already laid in previous sections of this treatise.  It is not necessary to get into operating systems and higher-level languages to find relevance.

My first experience with going beyond the basics came from my HP-71 hand-held computer (shown at right), which came out in 1984 (I bought mine in '87) and was way ahead of its time.

It came with by far the best BASIC I've ever seen (especially with the user groups' contributions further improving it).  Actually, at the time I took the picture, I was running Forth, not BASIC; but it can hold any number of BASIC programs and subprograms in memory at once, and any of these can call any other subprogram, or even call itself recursively, without stepping on other programs' or subprograms' pending variables, channel numbers, user-defined functions, labels, error-handling setups, etc. which might share names.  Data are passed to and from the subprogram in the BASIC line calling it.  The number of pending subprograms it can keep environments for is limited only by the amount of available RAM.  (I have a total of 177KB of battery-backed RAM in mine, limited by my budget back when RAM was far more expensive than it is today, plus a similar amount of ROM.)  When a subprogram is called, the current environment is saved and a new local environment is created for the subprogram.  The subprogram has access to the elements of its local environment, plus those of the global environment, but not to those of other pending subprograms' saved local environments.  The local environment is erased when the subprogram ends; then the last previous environment becomes active again.  At that point, memory taken by the just-closed subprogram's local environment is freed up, and any files associated with local channels are closed, among other things that happen.

One place I took advantage of this was in writing a very full-featured text editor.  (I do have the video monitor for it, but I wanted an editor that was optimized for using just the small LCD on the computer itself.  Having to view your work as if through a keyhole is not nearly as limiting as you might think when that keyhole can be moved around the file quite nimbly.)  Since this system (the HP-71) did not offer true multitasking or multithreading, having lots of files open at once did not allow moving from one file to another without either closing one of them (to get back to the last previously opened one) or opening another; but I could have the text editor call itself and use the same program (not another copy of it), and each call had its own set of variables, like which line and column I was on, what block was marked, where the tabs and margins were, what the current print device was, etc., which were part of its environment.  I might be working in one file, need to check something in another or copy to or from it (via a cutpaste file that's available to all of them) so I pull that up, and another, and another—and this could go dozens of files deep, although to get back to earlier ones, I had to close files, since the environments were saved on a stack.


A couple of locals methods:

  1. On a stack, store a backup of the old environment's portions which you plan to re-use, as shown in the initial example above.  Then it's ok to use them in place, because you will pull the last previous values off the stack and restore them when you're done.  This method has some overhead for opening and closing the environment, but it may be more efficient addressing things while using it, because you can reduce or eliminate indexing and indirection.

  2. Start an entirely new environment and keep it on the stack, operating on it in the stack space, knowing that it's temporary; ie, separating the new instead of the old.  This method does not move or even temporarily modify older data that will be needed again when you get back to pending routines.  Compared to #1 above, it may have less overhead to open or close the environment, but addressing things in it may be a little less efficient because of the required stack-relative addressing.


These can of course be combined.  For example, you could:
  1. push backups of the old environment's portions you need to overwrite (like in the example listing at the top, dealing with the column and row screen cursor positions), then
  2. set up any other necessary variables, buffers, etc. on the stack for referencing by way of stack-relative addressing,
  3. run your routine in its environment, then, when you're done and you're ready to close the environment,
  4. delete the variables, buffers, etc. you added to the stack,
  5. pull the things off the stack that you pushed above and restore them to their original places, and
  6. exit the routine that needed its own environment.
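
The six steps above might be sketched as follows.  This is only an illustration under stated assumptions:  window_task is a hypothetical subroutine name, the two extra local bytes are arbitrary, and whatever <do_stuff> does is assumed to leave the stack pointer where it found it.


window_task:
        LDA  Column      ; 1. Back up the globals that will get overwritten.
        PHA
        LDA  Row
        PHA

        PHA              ; 2. Add two bytes of local variable space.
        PHA              ;    Their contents don't matter yet.
        TSX              ;    X is now the index for stack-relative access.

        LDA  #0          ; 3. Run the routine in its environment.  The two
        STA  $101,X      ;    locals are at $101,X and $102,X.  (The saved
        STA  $102,X      ;    Row and Column are at $103,X and $104,X.)
        <do_stuff>

        PLA              ; 4. Delete the two local bytes.
        PLA

        PLA              ; 5. Restore the saved globals, in the reverse
        STA  Row         ;    order of the pushes.
        PLA
        STA  Column

        RTS              ; 6. Exit.
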


For the second method, suppose you get into a routine that needs four bytes of input and output, passed through the stack, and three bytes of independent local variables.  (N might be used as well for local variable space, but it must never be in use when the program counter jumps to another routine.)  The routine might start with:



         PHA          ; Add three more bytes to the stack.  They will get
         PHA          ; used below.  (Remember to pull them off the stack
         PHA          ; at the end.)  Their contents don't matter yet.

length:  SETL  $101   ; Assign names to the three bytes of
width:   SETL  $102   ; local variables created above.  Each
height:  SETL  $103   ; variable is one byte in this case.

weight:  SETL  $104   ; Now assign names to the ones passed on the stack.
density: SETL  $106   ; weight gets 2 bytes, and density and speed each get
speed:   SETL  $107   ; one.  These could have additional names for data
                      ; sent back to the calling routine in the same bytes.
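
Using those names afterward might look something like the sketch below.  It assumes the two bytes of weight are stored low byte first, and that nothing is pushed or pulled between the TSX and the last use of the names.  It also assumes the offsets above match the way the routine was entered:  if it is reached with a JSR after the caller pushed the four bytes, the two return-address bytes will sit at $104,X and $105,X after the three PHA's, and the passed bytes would start at $106,X instead.


         TSX               ; X = stack pointer, the base for all the names.

         LDA  density,X    ; Read a byte that was passed in, and
         STA  width,X      ; keep a working copy in one of the locals.

         ASL  weight,X     ; Double the 2-byte weight in place, low byte
         ROL  weight+1,X   ; first, then the high byte with the carry.

         PLA               ; At the end, pull the three local bytes
         PLA               ; back off the stack.  (This leaves junk in A.)
         PLA
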


SETL in the C32 assembler is "SET Label," like EQU in most assemblers, except that you can change the value assigned to a label as many times as you wish.  I believe Kowalski's assembler uses .= or .SET for this.

I would like to use macros to automate the creation of local variables and the assignment of names to them, but every way I can think of runs into roadblocks, at least with the assemblers I'm familiar with.  [Edit, spring 2019:  I have an idea I want to try, as time allows.]  It would be nice to be able to do for example,



SUB_TOT:   LOCAL  3   ; Make 3-byte local variable SUB_TOT.
PRESSURE2: LOCAL  1   ; Make 1-byte local variable PRESSURE2.
FLOW2:     LOCAL  2   ; Make 2-byte local variable FLOW2.


subroutine_label:
           TSX
           <continue with the program for the process>

           DESTROY_LOCALS  ; Get local variables off the stack
           RTS             ; at the end before exiting.
 ;----------------


to create local variables on the stack by using PHA's and assign the names for the stack offset (in this case giving FLOW2 the value $101, PRESSURE2 the value $103, and SUB_TOT the value $104), and counting the bytes so the DESTROY_LOCALS macro at the end knows how many to pull off the stack.  If you think of a way to do it that would work with most macro assemblers (even if the syntax may need a little modification to work with some), and are willing to share it, please email me, or bring it up on the forum.  If I use it, I'll give you credit.

Inside the local environment, ie, in the subroutine that carries out the process and comes right after the set of locals definitions, locals will be referred to with absolute indexed addressing, like LDA FLOW2,X where X's contents came from the TSX.  (There's also LDA (ZP),Y where the ZP address points to the top of the stack and Y indexes into it; but that works a little differently, so you can't use the same constants defined above.)  You will sometimes need one or more pairs of ZP bytes to use as virtual registers for the things the 6502 doesn't have the addressing modes to do in page 1.  (This was discussed in section 5, on stack addressing.)
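
A short sketch of both cases, assuming FLOW2 and PRESSURE2 have the stack-offset values given above, and that ZPTR is a hypothetical pair of zero-page bytes set aside as a virtual register.  Treating FLOW2's two bytes as an address in the second part is purely for illustration.


        TSX                ; X = stack pointer, the base for the locals.

        INC  FLOW2,X       ; 16-bit increment of FLOW2:  low byte first,
        BNE  fl1           ; and the high byte only if the low byte
        INC  FLOW2+1,X     ; rolled over to 0.
fl1:    LDA  PRESSURE2,X   ; One-byte locals are read the same way.

        LDA  FLOW2,X       ; There is no indirect mode through page 1,
        STA  ZPTR          ; so to use a 2-byte local as a pointer,
        LDA  FLOW2+1,X     ; copy it to the ZP pair first,
        STA  ZPTR+1
        LDY  #0
        LDA  (ZPTR),Y      ; then read the byte it points to.
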

Since a label can be assigned new values as many times as you wish (with SETL or .= or similar), and since you put the relevant locals assignments in your source code right before the subroutines that need them, names can be re-used, and the right stack offset value will be used for each subroutine.  So for example we could have another routine that has the following locals in the same source code file, and there will be no conflict between FLOW2 below and FLOW2 above.  Each subroutine will use the right one.



FLOW1:  LOCAL  2
FLOW2:  LOCAL  2
FLOW3:  LOCAL  2


subroutine_label:
        TSX
        <followed by the code that uses these local variables>


If you were to re-use a name that's already a global label (regardless of whether or not it's the address of a variable), the fact that the SETL (or equivalent) assembler directive overwrites the global label's original value would be a problem for any code that comes after the subroutine and expects the global label to still be intact.  The good news is that if the global was assigned with EQU, the assembler should generate an error message, so you won't have a hard-to-find bug.  The bad news is that it will require you to choose another name, one that's available.  (Should be easy enough, huh? ;-) )

If you need a lot of local variable space, using a lot of PHA's will of course not be as efficient as:


        TSX
        TXA
        SEC
        SBC  #$18
        TAX
        TXS


In this case, putting $18 (24 in decimal) bytes on the stack takes 12 clocks instead of 72, and 7 bytes instead of 24, so it's 6 times as fast and about 3.4 times as memory-efficient.  The break-even point is at 4 bytes of variables for speed, and 7 bytes for program memory.  (Be careful that you don't depend on uninitialized variables though.)
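
Releasing the space at the end can be done the same way in reverse.  The following is a sketch, assuming the same $18 bytes were claimed, that decimal mode is off (the D flag is clear, which the SBC above needs too), and that nothing you still need is in A or X:


        TSX
        TXA
        CLC
        ADC  #$18        ; Add back the $18 bytes claimed earlier.
        TAX
        TXS              ; The local variable space is now gone.


Since the stack pointer only changes in the single TXS instruction, an interrupt hitting anywhere in either sequence does no harm.  Also note that right after the allocation sequence, X already holds the new stack-pointer value, so the locals can be addressed with $101,X and up without another TSX.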

Note that the looping controls presented in section 8 on RPN operations, with source code in Appendix A (StackOps.ASM), are automatically local and nestable.  And while you're in a given loop, I puts a copy of the immediate loop index on the data stack, and J puts a copy of the loop index of the next nested loop out on the data stack.  It doesn't matter how many nested loop or subroutine levels deep you are.  (There might be interesting uses for these when you're not in a loop, too.  Hmmm...)  There may be limitations in how you refer to variables on the hardware stack when you are inside loops that are controlled this way; but this looping-control method is really for use when you're passing data on the ZP data stack anyway, not on the page-1 hardware stack.

You might have already noticed that a subroutine can be recursive, which means it can call itself, over and over, and each nesting level of the subroutine has its own variables.  The next section, section 15 on recursion, discusses this.

So, what if you need to push entire arrays or other environment data that are too large for the 6502's stack, or at least too large to fit in the space left?  Some things that come to mind are:


The 65816 has the attraction of a 16-bit stack pointer, and optional 16-bit X and Y, accommodating deeper stacks.  (It also has native stack-relative addressing modes, with even double indirection and double indexing, making it much more efficient for stack frames and local variables.)  I have an article about the '816 and common misunderstandings about it here.



These relevant links have good material I won't repeat here:






last updated Nov 2, 2024