Revision as of 22:36, 4 November 2007

This segment of the tutorial introduces branches, particularly conditional branches and function calls. This is Part 3 of a series. If you haven't yes, you may wish to review at least Part 1 and Part 2.

Why do we need branches?

Branches and jumps provide a way to guide the CPU through a sequence of code. (Jump is more or less a synonym for branch.) Ordinarily, the CPU just keeps executing instructions in the order you wrote them. It's like going down a to-do list. For a simple example, suppose we wanted to add two numbers—call them A and B—together, and write the result to a third place—call it C. The steps might look like this:

Read A into a R0
Add B to R0
Write R0 to C

These steps translate to the following instruction sequence:

   MVI  A,  R0    ; Read A into R0
   ADD  B,  R0    ; Add B to R0
   MVO  R0, C     ; Write R0 out to C

The CPU, when it encounters this sequence of instructions, will always execute them in this order. That's useful by itself, but not useful enough to write interesting programs.

For example, suppose you wanted to fill the entire screen with blank characters. The screen occupies 240 locations of memory starting at location $200. You need to write a blank character (which happens to be the value $0000) to all 240 locations. One way to do that would be to do this:

Set R0 to 0
Write R0 to $200
Write R0 to $201
Write R0 to $202

... 234 lines omitted ...

Write R0 to $2ED
Write R0 to $2EE
Write R0 to $2EF

That works, but it leads to very large and tedious code. The simple way to write this in assembly code would be:

   CLRR R0
   MVO  R0,  $200
   MVO  R0,  $201
   MVO  R0,  $202
   ; ... 234 lines omitted ...
   MVO  R0,  $2ED
   MVO  R0,  $2EE
   MVO  R0,  $2EF

That runs very fast, but it takes up a ton of ROM. Also, it clears the screen only once. What if you want to clear the screen many times? You really don't want to copy this same code everywhere you need to clear the screen. That's where branches come in.

Branches tell the CPU to start executing at a different location, rather than barreling forward. Unconditional branches and jumps tell the CPU to always jump to that location. Conditional branches tell the CPU to jump to a new location if some condition is true. These let you skip pieces of code, or repeat them a number of times. Jumps to subroutine tell the CPU "go do this and come back when you're done." The following sections explore these concepts a little more closely.

Branches and Labels

Like a "goto" statement in other languages, branches need to tell the CPU where to branch to. In assembly code, you have two options:

Specify the address of where you'd like the CPU to branch to
Specify the label of the instruction you'd like the CPU to branch to

Most of the time, you will want to write your code in terms of branches to labels. This is far, far easier than trying to figure out the addresses of instructions you might wish to jump to. See the Assembly Syntax Overview for details on how to write labels. All of the examples in this tutorial will use branches and jumps to labels.

Unconditional Branches and Jumps

Jump and Unconditional Branch instructions are like "goto" instructions found in higher-level languages. They tell the CPU to immediately start executing code from somewhere else as opposed to executing the next instruction in sequence. They're useful to tell the CPU to skip a section of code, or to repeat a section of code. Be careful when repeating a block of code with an unconditional branch: If there isn't a conditional branch somewhere in that block, it could end up running forever.

The following table lists the instructions:

Mnemonic	Description	Cycles	Size
B	Branch to label	9	2 words
J	Jump to label	12	3 words
JD	Jump to label while disabling interrupts	12	3 words
JE	Jump to label while enabling interrupts	12	3 words

As you can see, the primary difference between branches and jumps is that branches are smaller and faster. Branches encode their "target address," the address being jumped to, as a relative displacement to the current address. Jumps, on the other hand, store the actual address of the target. In most cases, especially in a 16-bit ROM, there are few reasons to use a J instruction, although the combination instructions, JD and JE can be useful.

There is also a pseudo-instruction, JR, that allows "jumping to a location held in a register." It is really a pseudonym for "MOVR Rx, R7". Because it is a MOVR instruction, it will modify the Sign Flag and Zero Flag, which may be confusing if you're not expecting it. Otherwise, it is an efficient method for jumping to an address held in an register, such as when returning from a CALL.

Conditional Branches

Conditional branches are similar to "if ... goto" type constructs found in other languages. They tell the CPU to start executing at another location if a particular condition is true. This allows us to build "if-then-else" types of statements, as well as "for" and "while" loops.

The CP1610 has a rich set of conditional branch instructions. They work by looking at the CPU's flag bits to decide when to branch. Instructions like CMP and DECR set these flags, making it easy to write those if/else statements and for/while loops. Even fancier uses are possible with some creativity.

The following table summarizes the conditional branches.

Mnemonic	Name	Branch taken when...	Mnemonic	Name	Branch taken when...
BC	Branch on Carry	`C = 1`	BNC	Branch on No Carry	`C = 0`
BOV	Branch on OVerflow	`OV = 1`	BNOV	Branch on No OVerflow	`OV = 0`
BPL	Branch if PLus	`S = 0`	BMI	Branch on MInus	`S = 1`
BEQ	Branch if EQual	`Z = 1`	BNEQ	Branch on Not Equal	`Z = 0`
BZE	Branch on ZEro	`Z = 1`	BNZE	Branch on Not ZEro	`Z = 0`
BLT	Branch if Less Than	`S <> OV`	BGE	Branch if Greater than or Equal	`S = OV`
BNGE	Branch if Not Greater than or Equal	`S <> OV`	BNLT	Branch if Not Less Than	`S = OV`
BLE	Branch if Less than or Equal	`Z = 1 OR S <> OV`	BGT	Branch if Greater Than	`Z = 0 AND S = OV`
BNGT	Branch if Not Greather Than	`Z = 1 OR S <> OV`	BNLE	Branch if Not Less than or Equal	`Z = 0 AND S = OV`
BUSC	Branch on Unequal Sign and Carry	`S <> C`	BESC	Branch on Equal Sign and Carry	`S = C`

Conditional branches are most often used with numeric comparisons, or as the branch at the end of a loop. The following sections illustrate how numeric comparisons, increment and decrement work in concert with branches.

Conditional branches can also be paired with other instructions that manipulate the flags. For instance, shift instructions, as described in Part 4 update sign, zero, carry and overflow flags depending the operation performed. This can lead to interesting and creative combinations of shifts and branches.

Another use of flags and branches is to pass status information in CPU flags (such as the Carry Flag) and then act on that information later. The SETC and CLRC instructions make it easy to manipulate the Carry Flag to pass this status information around.

Signed Comparisons

The following branches are particularly useful when comparing signed numbers. These are the conditional branches you will use most when writing "if-then-else" statements, since most numbers are signed, or can be treated as signed.

Mnemonic		Branch taken when...	Mnemonic		Branch taken when...
BEQ	BZE	`Z = 1`	BNEQ	BNZE	`Z = 0`
BLT	BNGE	`S <> OV`	BGE	BNLT	`S = OV`
BLE	BNGT	`Z = 1 OR S <> OV`	BGT	BNLE	`Z = 0 AND S = OV`
BOV		`OV = 1`	BNOV		`OV = 0`

Note: One pair of branches shown above—BOV and BNOV—are useful in this context only for detecting overflow and little else. I included them here for completeness. These actually find more use paired up with shift instructions. Those are described Part 4 of this tutorial.

How Signed Arithmetic Works With Conditional Branches

The next couple of paragraphs describe how these branches work with compares. Most of the time, you do not need to think about these details when writing programs, so feel free to skim it for now and come back to it later.

The compare instruction compares two numbers by subtracting them, and then setting the flags based on the result. This provides a lot of information about the relative values of the two numbers, as this table shows (ignoring overflow):

If this is true...	...then this also must be true...	...which implies the flags get set as follows (if you ignore overflow).
`x = y`	`x - y = 0`	S = 0	Z = 1
`x < y`	`x - y < 0`	S = 1	Z = 0
`x > y`	`x - y > 0`	S = 0	Z = 0

That is, we can determine whether two numbers are equal or not by looking at the Zero Flag. We can determine if one's less than the other by looking at the Sign Flag. At least, that would be true if there was no such thing as overflow.

The CP1610 can only work with 16 bits at a time. If you try to subtract two numbers whose values are very far apart, such as, in the worst case 32767 - (-32768), you will trigger an overflow, because the results don't fit in 16 bits. Overflow causes the sign of the result to be the opposite of what you would get if no overflow had occurred. The branches take this into account and look at the overflow bit in addition to the sign bit to decide whether one number is greater than or less than another. The following table illustrates the relationships both with and without overflow.

If this is true...	...then the flags get set as follows...			...which matches these branches.
`x = y`	S = 0	Z = 1	OV = 0	BEQ, BGE, BLE
`x < y`	S = 1	Z = 0	OV = 0	BNEQ, BLT, BLE
`x < y`	S = 0	Z = 0	OV = 1	BNEQ, BLT, BLE
`x > y`	S = 0	Z = 0	OV = 0	BNEQ, BGT, BGE
`x > y`	S = 1	Z = 0	OV = 1	BNEQ, BGT, BGE

As you can see, the flags and the branches cooperate quite nicely.

Gotchas With the CP1610 Compare Instruction

The syntax for the CP1610's compare instruction can confuse things slightly, since it does a "subtract from". Consider the following example:


   MVII    #1, R0  ; R0 = 1
   MVII    #2, R1  ; R1 = 2
   CMPR    R0, R1  ; Subtract R0 from R1 to set flags
   BLT     label   ; Is this taken?

This computes "R1 - R0", not "R0 - R1". It compares R0 to R1 by subtracting R0 from R1. In this example, that leaves S=0 and OV=0. R1 is not less than R0, so the branch is not taken. In other words, to determine if a given branch is taken, you have to read right-to-left. "Is R1 less than R0?" In this case, the answer is no, so the branch is not taken.

Unsigned Comparisons

These branches are useful when comparing unsigned numbers. Most often, these get used with pointers into your game ROM, or with fixed point values such as screen coordinates.

Mnemonic		Branch taken when...	Mnemonic		Branch taken when...
BC		`C = 1`	BNC		`C = 0`
BEQ	BZE	`Z = 1`	BNEQ	BNZE	`Z = 0`

The unsigned comparisons require some additional explanation to be useful. After comparing two unsigned numbers, the BC instruction will branch if the second number is greater than or equal to the first. The BC instruction will branch if the second number is smaller than the first. The CP1610 does not offer unsigned equivalents for "branch if greater than" or "branch if less than or equal." You can get similar effects, though, by combining BC and BNC with BEQ or BNEQ to separate out the "equals", "greater-than" and "less-than" cases.

How Unsigned Arithmetic Works With Conditional Branches

The following couple of paragraphs explain how BC and BNC work in concert with other instructions to provide unsigned comparisons. Feel free to skim this for now and come back to it later.

Unsigned arithmetic is actually pretty similar to signed arithmetic, except that there isn't a sign bit. Rather, all the interesting information ends up in the carry bit. Read "How Signed Arithmetic Works With Conditional Branches" above to get the basics.

As mentioned before, the compare instruction compares two numbers by subtracting them and setting the flags. When subtracting two unsigned numbers, the Carry flag doubles as a "do not borrow" flag. With that in mind (and handwaving just how carry functions that way for now), we have:

If this is true...	...then this also must be true...	...which implies the flags get set as follows
`x = y`	`x - y = 0`	C = 1 (no borrow)	Z = 1
`x > y`	`x - y > 0`	C = 1 (no borrow)	Z = 0
`x < y`	`x - y < 0`	C = 0 (borrow needed)	Z = 0

As you can see, the carry flag gets set whenever 'x' is greater than or equal to 'y'. The carry flag is clear whenever 'x' is less than 'y'. This is why BC works as the equivalent of an unsigned "branch if greater than or equal," and BNC works as an unsigned version of "branch if less than."

Sign/Zero Comparisons

Common looping instructions, such as DECR and INCR only modify the sign and zero flags without updating the carry or overflow flags. These are best used with the following branches:

Mnemonic		Branch taken when...	Mnemonic		Branch taken when...
BPL		`S = 0`	BMI		`S = 1`
BEQ	BZE	`Z = 1`	BNEQ	BNZE	`Z = 0`

These will be explored in more detail in the "looping" section below.

If-Then and If-Then-Else

The if-then construct lets you say "Run this bit of code here only if this is true." The if-then-else construct is similar, with the added twist of letting you say "...or, run this other bit of code if it isn't true."

If-Then

Let's start with the if-then statement. The following diagram shows its overall structure. In this case, the code says "if R1 < R2, run the extra bit of code on the left." (green background)

In this example, the CPU runs the green portion only if R1 is less than R2, as indicated in the blue square. The other bits of code always run. The following diagram shows how this looks as assembly language code:

The code in the blue square implements the actual "if" code. It first compares R2 to R1 using the CMPR instruction, and then branches around the "then" portion if R1 is greater-than or equal to R2 with BGE. That way, the green portion only runs if R1 is less than R2, as we intended. (See Gotchas With the CP1610 Compare Instruction for an explanation as to why this is CMPR R2, R1 and not CMPR R1, R2.)

If-Then-Else

The if-then-else statement is very similar. It just adds code that runs when the "if" condition is false, as shown in this diagram:

As with the previous example, the code in the green box on the left runs when R1 is less than R2. The code in red box at the right runs when R1 is greater than or equal to R2. The CPU either executes one or the other. It doesn't execute both. So how does this look in assembly code?

Code example for an if-then-else statement

Again, the blue box holds the code that implements the "if". As with the previous example, the branch gets taken if R1 is greater than or equal to R2, thereby skipping the "then" portion of code. The branch now goes to the "else" code in the red box. The "then" code has an additional instruction: B done. What this does is skip the code for the "else" portion when the "then" portion is done. In either case, the CPU continues executing with the final block of code at the bottom.

Concrete Example: Staying On Screen

The above examples are somewhat abstract in nature. The following example is somewhat more concrete: Make sure COL stays in the range 0..19. This corresponds to the column coordinate in BACKTAB. If COL gets bigger than 19 (beyond the right edge of the screen), wrap around to the left edge by subtracting 20, else if COL gets smaller than 0, wrap around to the right edge by adding 20. The following example assumes COL is in 16-bit memory for now.

          MVI  COL, R0   ; Put COL into R0

          CMPI #20, R0   ; Is COL >= 20?
          BLT  ok_right  ; Branch around if COL < 20
          SUBI #20, R0   ; Wrap around from right edge to left by subtracting 20
          B    done      ; Skip the rest of the code, since we know we're not off the left

ok_right: CMPI #0,  R0   ; Is COL < 0?  
          BGE  done      ; Branch around if COL >= 0
          ADDI #20, R0   ; Wrap around from left edge to right edge by adding 20

done:     MVO  R0,  COL  ; Write the final result out

Notice that this code has both an if-then and an if-then-else. The first test, "COL >= 20" is part of an if-then-else. The "then" clause subtracts 20 from COL and then branches around the "else" clause. The "else" clause holds the second test, "COL < 0" as part of an if-then statement. That statement's "then" clause adds 20 to COL if it is indeed less than 0.

All paths finally lead to done, where the updated value of COL gets written back to memory. The following diagram shows how it all works:

Diagram illustrating the "stay on screen" example

A quick note on efficiency: The above code works, but it can be made slightly faster and more compact using TSTR instead of CMPI #0. The TSTR instruction only sets the sign and zero based on the value in a register. In this case, we're looking for whether COL is less than 0, which means its sign bit would be '1'. Thus:

ok_right: CMPI #0,  R0   ; Is COL < 0?  
          BGE  done      ; Branch around if COL >= 0

becomes:

ok_right: TSTR R0        ; Is COL < 0?  
          BPL  done      ; Branch around if COL >= 0

This is 2 cycles faster and 1 word shorter. Most of the time you won't notice the difference, but sometimes every cycle and every word counts.

Looping

Loops tell the CPU to repeat a sequence of instructions some number of times. There are two main kinds of loops, traditionally referred to as for loops and while loops. For loops know how many times they'll iterate before they start. While loops figure out when to stop iterating based on something that happens during the loop. Clearing the screen is an example of a for loop: It will iterate 240 times, writing a 0 to each location of BACKTAB. Finding the length of a string is an example of a while loop: It iterates until it finds the NUL character at the end of the string.

(Note: Do not confuse these terms with the for and while keywords from languages like C and C++. You can write either kind of loop with either keyword in those languages.)

Basic Structure

In assembly code, loops tend to have the structure shown below. This structure works for both for amd while loops. The green part in the middle is called the "body" of the loop. The blue part is sometimes called the "loop test" or "loop condition." The difference between a for loop and a while loop is in the loop condition.

If you look carefully, you'll see that this loop always executes at least one iteration. Sometimes, though, you don't want that. This can be accomplished by branching around the body of the loop and directly to the loop test, as shown below:

Basic loop structure, with jump to loop test on first iteration

In for loops, the loop test is generally a decrement and a branch. The DECR instruction will set the zero flag when the register it decrements becomes zero. This makes it easy to write loops that iterate a prescribed number of times. The following example iterates 10 times:

       MVII #10, R0
loop:  ; put loop body here
 
       DECR R0       ; Count down from 10 down to 0
       BNEQ loop     ; Iterate as long as count isn't expired

While loops have more varied loop tests. What they look like depends on what the loop is trying to accomplish.

Terminating a Loop Early

Sometimes, you'll want to break out of a loop early if some alternate condition gets matched. You can do this by putting a test and branch in the body of the loop, with the branch pointing outside the loop. This is similar to the action of C's break keyword. The following diagram illustrates this structure:

"Searching" loops often look like this, breaking out of the loop when they find whatever it is they're searching for.

Concrete For Loop Example: Clearing the screen

To clear the screen, we need to write a blank character to each of the 240 locations in BACKTAB, starting at location $200. The blank character happens to be $0000. The following simple code accomplishes this:

       ; Clear the screen
       MVII    #$0200,  R4   ; Point to BACKTAB
       CLRR    R0            ; Set R0 = $0000
       MVII    #240,    R1   ; Use R1 as our "loop counter"
loop:
       MVO@    R0,      R4   ; Write a blank to the screen.
       DECR    R1            ; Count down the loop counter
       BNEQ    loop          ; Keep looping until counter goes to zero

This code uses Indirect Mode accesses, as described in Part 1 of the tutorial. It uses R4 as the indirect pointer. R4 has the special property that its value increments after each indirect access. So, on the first iteration of the loop, the MVO@ R0, R4 instruction writes to location $200. On the second iteration, it writes to $201, and so on. Very handy.

Concrete While Loop Example: Finding the length of a string

Function Calls

Simple Call/Return

Nested Call/Return

Passing Arguments via Return Address

Indirect Branches and Jump Tables

"It was a 'Jump to Conclusions' mat. You see, it would be this mat that you would put on the floor... and would have different conclusions written on it that you could jump to." -- Tom Smykowski, Office Space

Indirect Branching: "Jump Vectors"

Simple Jump Tables

Adding to the Program Counter

Moving On

At this point, you may wish to move along to the last part, or review earlier parts of this tutorial:

Introducing the Instruction Set Part 1: The CPU, Memory and Registers; Primary Instructions and Addressing Modes
Introducing the Instruction Set Part 2: Single Register and Implied Operand Instructions
Introducing the Instruction Set Part 4: Shift and Rotate Instructions

Or, you can return to the Programming Tutorials index.

Revision as of 22:34, 4 November 2007 (view source) Mr z (talk \| contribs) (→‎Basic Structure) ← Older edit		Revision as of 22:36, 4 November 2007 (view source) Mr z (talk \| contribs) m (→‎Concrete For Loop Example: Clearing the screen) Newer edit →
Line 296:		Line 296:

	This code uses [[Indirect Mode]] accesses, as described in [[Introducing the Instruction Set Part 1#Indirect_Mode_Instructions\|Part 1]] of the tutorial. It uses R4 as the indirect pointer. R4 has the special property that its value increments after each indirect access. So, on the first iteration of the loop, the <CODE>[[MVO@]] R0, R4</CODE> instruction writes to location $200. On the second iteration, it writes to $201, and so on. Very handy.		This code uses [[Indirect Mode]] accesses, as described in [[Introducing the Instruction Set Part 1#Indirect_Mode_Instructions\|Part 1]] of the tutorial. It uses R4 as the indirect pointer. R4 has the special property that its value increments after each indirect access. So, on the first iteration of the loop, the <CODE>[[MVO@]] R0, R4</CODE> instruction writes to location $200. On the second iteration, it writes to $201, and so on. Very handy.
−

	=== Concrete While Loop Example: Finding the length of a string ===		=== Concrete While Loop Example: Finding the length of a string ===

Difference between revisions of "Introducing the Instruction Set Part 3"