After programming in 6502 assembly language for over a decade, I was getting a bit BORED. One can only code the same routines with the same opcodes so many times before the nausea of repetition becomes overpowering. When I heard the news that CMD was building a cartridge based on a 20 MHz 65816, I was overjoyed. For years I've heard those with 65816-based systems brag about its capabilities. To us old 6502 programmers, the opportunity to program the fabled 65816 is a new lease on life.
The 65816 is an 8-/16-bit register selectable upgrade to the 6502 series processor. With 24-bit addressing of up to 16 Megabytes of RAM, the powerful 65816 is a logical upgrade that leaves 6502 programmers feeling right at home. It is amazing how fast one can adapt to the new processor. It sounds funny to say it, but the only difficulty I have had learning the 65816 is that there are so many ways to complete the same task that it is hard to decide which method is best.
To get started programming the 65816, I would recommend purchasing the book "Programming the 65816" from The Western Design Center, manufacturer of the 65816. While it is a bit pricey, the sheer quality and content of the 600-page book is worth the money. Rarely, if ever, has there been a CPU manual as thorough and detailed as the Western Design book. If you know 6502 assembly, then Programming the 65816 is probably the only 65816 book you will ever need.
The 65816 may be operated in Native mode or 6502 Emulation mode. Emulation mode is a 100% 6502 compatible mode where the whole processor looks and feels like a vintage 6502. Native mode offers 8- or 16-bit user registers and full access to 24-bit addressing.
While in emulation mode, not only are all the 6502 opcodes present in their virgin form, but the new 65816 instructions are also available for use. In fact, the first lesson to learn about programming the 65816 is that emulation mode is much more powerful than a stock 6502. The only true difference between emulation mode and our venerable C64's 6510 processor is that the 6510's unimplemented opcodes will not produce the results they did on the 6510. Since all 256 of the potential opcodes are now implemented on the 65816, older C64 software that uses previously unimplemented opcodes will produce erratic results.
To select between emulation and native modes, a new phantom hidden emulation bit (E) was added to the status register. Shown in programming models hanging on top of the Carry bit, the emulation bit is only accessible by one instruction. The new instruction (XCE) exchanges the status of the Carry bit and Emulation bit. To move to emulation mode, set the carry and issue an XCE instruction. To move to native mode, clear the carry and issue the XCE instruction.
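The mode switch described above can be sketched in a few lines. This is only an illustration; where you place it in your own program is up to you.

```asm
; Switch to native mode
        CLC             ; carry = 0
        XCE             ; swap Carry and Emulation bits: E = 0, native mode
                        ; (carry now holds the old value of E)

; Switch back to 6502 emulation mode
        SEC             ; carry = 1
        XCE             ; swap Carry and Emulation bits: E = 1, emulation mode
```

Because XCE is an exchange, the carry flag afterwards tells you which mode the processor was in before the switch.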
Two new instructions are used to clear or set bits within the status register. The SEP instruction sets bits, and REP clears bits. SEP and REP use a one-byte immediate addressing mode operand to specify which bits are to be set or cleared. For example, to set or clear the X bit, which selects 8- or 16-bit index registers:
SEP #%00010000  ; set bit 4 for 8-bit index registers.
REP #%00010000  ; clear bit 4 for 16-bit index registers.
When in 8-bit mode (X=1), the index registers behave in standard 6502 form. When status bit X is cleared to 0, both the X and Y index registers become 16 bits wide. With a 16-bit index register you can now reach across a full 64K with the various indexed addressing modes. An absolute load to an index register in 16-bit mode retrieves 2 bytes of memory: the one at the effective address and the one at the effective address plus one. Simple things like INX or DEY operate on the full 16 bits, which means you no longer need a memory location for many counters, and loops based on index counters can be coded more efficiently.
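As a rough sketch of what a 16-bit index buys you, here is a copy loop that moves $1000 bytes without any counter in memory. The addresses $2000 and $4000 are made up for the example, the processor is assumed to be in native mode already, and your assembler must be told the register widths (the directive for that is assembler-specific and is not shown).

```asm
        REP #%00010000  ; X bit = 0: 16-bit index registers
        SEP #%00100000  ; M bit = 1: 8-bit accumulator
        LDX #$0000      ; 16-bit index starts at zero
LOOP    LDA $2000,X     ; read one byte from the source (hypothetical address)
        STA $4000,X     ; write it to the destination (hypothetical address)
        INX             ; full 16-bit increment, no page crossing logic needed
        CPX #$1000      ; copied $1000 bytes yet?
        BNE LOOP
```

On a 6502 the same job needs either self-modifying code or a pair of zero page pointers and a nested loop.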
The formerly unused status register bit 5 is now referred to as bit M. M selects an 8- or 16-bit wide accumulator and memory accesses. When in 8-bit mode (M=1), the high-order 8 bits are still accessible by exchanging the low and high bytes with an XBA instruction; it is like having two accumulators! However, when set for a full 16-bit wide accumulator (M=0), all math and accumulator-oriented logical instructions operate on all 16 bits! If you add up the clock cycles and bytes required to perform a standard two-byte addition, you can start to see the true power of 16-bit registers.
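To make the two-byte addition comparison concrete, here is a side-by-side sketch. NUM1, NUM2 and SUM are hypothetical two-byte variables at made-up addresses; the 65816 version assumes native mode and an assembler that has been told the accumulator is 16 bits wide.

```asm
NUM1    = $1000         ; hypothetical 16-bit variables
NUM2    = $1002
SUM     = $1004

; 6502 style: one byte at a time
        CLC
        LDA NUM1        ; low bytes
        ADC NUM2
        STA SUM
        LDA NUM1+1      ; high bytes, carry propagates from the low add
        ADC NUM2+1
        STA SUM+1

; 65816 style: both bytes at once
        REP #%00100000  ; M bit = 0: 16-bit accumulator
        CLC
        LDA NUM1        ; loads NUM1 and NUM1+1
        ADC NUM2
        STA SUM         ; stores SUM and SUM+1
```

With absolute addressing the first version assembles to 19 bytes and the second to 10, not counting the two-byte REP, which is only needed if the accumulator is not already 16 bits wide.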
Zero Page has now been renamed to Direct Page (corporate thinking, go figure). A new processor register, D, was added to allow Direct Page to be moved anywhere within the first 64K of memory. The direct page register is 16 bits wide, so direct page can start at any byte. Several old instructions now include direct page addressing as well. To move direct page, just push the new 16-bit value onto the stack and then PLD to pull it into the direct page register. You may also transfer the value from the 16-bit accumulator to the direct page register with the TCD instruction. Direct page may also be moved while in emulation mode.
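Both ways of moving direct page might look like the sketch below. The base address $0300 is just an example value, not a recommendation for any particular machine, and the second half assumes native mode so that the accumulator can be made 16 bits wide.

```asm
; Method 1: push a 16-bit value and pull it into D
        PEA $0300       ; push the 16-bit value $0300 onto the stack
        PLD             ; D = $0300
        LDA $10         ; direct page address $10 now reads bank 0, $0310

; Method 2: transfer from the 16-bit accumulator
        REP #%00100000  ; M bit = 0: 16-bit accumulator
        LDA #$0000
        TCD             ; D = $0000, direct page is back at "zero page"
        SEP #%00100000  ; M bit = 1: back to an 8-bit accumulator
```

PEA is one of the new push instructions; it pushes its 16-bit operand itself rather than the contents of a memory location.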
While in native mode, the stack pointer is a full 16 bits wide, which means the stack is no longer limited to just 256 bytes. It can be moved anywhere within the first 64K of memory (although while in emulation mode, the stack stays in page one). There are several new addressing modes that can use the stack pointer as a quasi-index register to access memory. Numerous new push and pull instructions allow you to manipulate the stack. Among the most useful are the new instructions to push and pull the index registers: PHX/PHY and PLX/PLY.
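Here is a small sketch of the new index pushes together with stack relative addressing, one of the modes that treats the stack pointer like an index register. It assumes 8-bit registers, so each push is a single byte.

```asm
        SEP #%00110000  ; 8-bit accumulator and index registers
        PHX             ; save X (no more TXA / PHA shuffle)
        PHY             ; save Y
        LDA 1,S         ; peek at the last byte pushed: the saved Y
        LDA 2,S         ; one byte further up the stack: the saved X
        PLY             ; restore in reverse order
        PLX
```

The stack relative loads read the values without pulling them, so the stack is unchanged until the PLY/PLX pair.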
Two other new processor registers are the Program Bank Register (PBR) and the Data Bank Register (DBR). The Program Bank Register can be thought of as extending the program counter out to 24 bits. Although you can JSR and JMP to routines located in other RAM banks (using the long forms of those instructions, JSL and JML), individual routines on the 65816 still must run within a single bank of 64K; there's no automatic rollover from one bank of RAM to the next when executing successive instructions. In this sense, it may help to think of the 65816 processor as a marriage of Commodore's C128 Memory Management Unit (MMU) and an 'enhanced' 6502; the concept is very similar.
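A cross-bank subroutine call can be sketched as follows. The target address $021000 (offset $1000 in bank 2) is made up for the example.

```asm
; Caller, running in any bank
        JSL $021000     ; push PBR and the return address, then continue
                        ; at $1000 in bank 2 (hypothetical routine)

; The routine in bank 2 must end with a long return
        RTL             ; pull the return address and PBR, resume in the
                        ; caller's bank
```

A routine called with JSL has to return with RTL rather than RTS, since RTS would leave the pushed bank byte on the stack.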
The Data Bank Register is used to reach any address within the 16-megabyte address space of the 65816. When any of the addressing modes that specify a 16-bit address are used, the Data Bank byte supplies the top 8 bits of the full 24-bit address. This allows access to all 16 megabytes without having to resort to 24-bit addressing instructions, and helps enable code that can operate from any bank.
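There is no "load DBR" instruction; the register is set by pulling a byte from the stack with PLB. A minimal sketch, using bank 4 and the address $45F2 purely as example values:

```asm
        PHB             ; save the current data bank
        SEP #%00100000  ; 8-bit accumulator, so PHA pushes exactly one byte
        LDA #$04
        PHA
        PLB             ; DBR = $04
        LDA $45F2       ; ordinary absolute addressing now reads $0445F2
        PLB             ; restore the caller's data bank
```

A common variation is PHK followed by PLB, which copies the Program Bank into the Data Bank so that a routine can find data stored alongside its own code.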
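Another new feature is a pair of Block Move instructions, MVN and MVP. Load the 16-bit X register with the source address, the 16-bit Y register with the destination address, and the 16-bit accumulator with the number of bytes to move minus one, then issue the MVN or MVP instruction. MVN (move negative) works upward from the start of each block, so X and Y hold the starting addresses; use it when the destination is lower in memory than the source. MVP (move positive) works downward from the end of each block, so X and Y hold the ending addresses; use it when the destination is higher in memory than the source. Choosing the right one ensures that overlapping moves don't overwrite themselves. Block Moves use two operand bytes: one for the source bank of 64K and one for the destination bank. Memory is moved at the rate of seven clock cycles per byte.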
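As a sketch, here is a block move of $1000 bytes within bank 0. The addresses are invented for the example, native mode is assumed, and note that the accumulator holds the byte count minus one. The order in which an assembler expects the two bank operands varies, so check your assembler's manual; here both banks are the same, so it makes no difference.

```asm
        REP #%00110000  ; 16-bit accumulator and index registers
        LDX #$2000      ; source start address (hypothetical)
        LDY #$4000      ; destination start address (hypothetical)
        LDA #$0FFF      ; byte count minus one: moves $1000 bytes
        MVN $00,$00     ; source bank, destination bank
```

The two blocks here don't overlap, so MVP would work as well, provided X and Y were loaded with the ending addresses ($2FFF and $4FFF) instead. After the move the Data Bank Register is left set to the destination bank.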
Several new addressing modes are used to access the full address space. A 65816 assembler would decode "long" addressing given this input:
LDA $0445F2     ; load byte from $45F2 of RAM bank 4
LDA $03412F,x   ; load byte from $412F of bank 3 plus x.
LDA ($12)       ; load indirect without an offset.
JSR ($1234,x)   ; jump to a subroutine via indexed indirect addressing!
Instruction | Function |
---|---|
TXY, TYX | Transfer directly between index registers |
BRA | Branch always regardless of status bits |
TSB | Test and set any bit of a byte |
TRB | Test and reset (clear) any bit of a byte |
INC A, DEC A | Increment or decrement the accumulator directly |
STZ | Store a zero to any byte |
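A few of these instructions in action, as a rough sketch. FLAGS and COUNT are hypothetical variables at made-up addresses.

```asm
FLAGS   = $1000         ; hypothetical variables
COUNT   = $1001

        SEP #%00100000  ; 8-bit accumulator
        LDA #%00000100  ; mask selecting bit 2
        TSB FLAGS       ; set bit 2 of FLAGS, other bits untouched
        TRB FLAGS       ; clear bit 2 of FLAGS again
        STZ COUNT       ; COUNT = 0 without disturbing the accumulator
        INC A           ; A = %00000101, no CLC / ADC #1 needed
        TXY             ; Y = X, no trip through the accumulator
        BRA DONE        ; always taken, whatever the flags say
        NOP             ; never executed
DONE    RTS
```

TSB and TRB also set the Z flag according to whether any of the masked bits were set in memory before the change, which makes them handy for flag bytes.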
m | x | Accumulator/memory (A/M) | Index registers (X/Y) | Instructions |
---|---|---|---|---|
0 | 0 | 16-bit | 16-bit | REP #$30 |
0 | 1 | 16-bit | 8-bit | REP #$20, SEP #$10 |
1 | 0 | 8-bit | 16-bit | REP #$10, SEP #$20 |
1 | 1 | 8-bit | 8-bit | SEP #$30 |
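Since the m and x bits live in the status register, a subroutine can save and restore its caller's register widths with PHP and PLP. A minimal sketch, assuming native mode (in emulation mode both registers are always 8 bits wide) and a hypothetical routine name:

```asm
ROUTINE PHP             ; save the caller's status, including m and x
        REP #$30        ; m = 0, x = 0: everything 16 bits wide
        LDX #$0000      ; do some 16-bit work here
        INX
        PLP             ; restore the caller's m and x along with the flags
        RTS
```

This way the caller doesn't need to know, or care, which row of the table the routine runs in.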