Extended Precision Arithmetic
Sometimes programs need to operate on numbers longer than 16 bits.  Common operations include addition, subtraction, comparison and shifting.  (Shifting is important for implementing more complicated operations, such as multiplication and division.)  The Carry Flag and occasionally the Overflow Flag serve an important role in implementing extended precision arithmetic.
Contents
Addition
To add two 32-bit numbers, one must connect two ADDs with an ADCR. The following example illustrates adding the 32 bit number in R1:R0 to R3:R2. The upper halves of the numbers are in R1 and R3. The lower halves are in R0 and R2.
ADDR R0, R2 ; Add lower halves. This may result in a carry ADCR R3 ; Add the carry into R3 ADDR R1, R3 ; Add the upper halves.
Things get slightly more complicated if you try to add numbers longer than 32 bits. This is because ADCR itself may generate a carry. The following example illustrates adding two 64 bit numbers. This time, the first number is stored in memory pointed to by R4, and the sum is held in R3:R2:R1:R0. The most significant portion is in R3, and the least significant portion is in R0.
ADD@ R4, R0 ; Add bits 0..15 together. ADCR R1 ; propagate the carry to bits 16..31 ADCR R2 ; propagate the carry to bits 32..47 ADCR R3 ; propagate the carry to bits 48..63 ADD@ R4, R1 ; Add bits 16..31 together. ADCR R2 ; propagate the carry to bits 32..47 ADCR R3 ; propagate the carry to bits 48..63 ADD@ R4, R2 ; Add bits 32..47 together. ADCR R3 ; propagate the carry to bits 48..63 ADD@ R4, R3 ; Add bits 48..63 together.
As you can see, things get slightly more complicated as the precision goes up, but it remains manageable. One simply needs to ensure all carries propagate to the most significant word.
Subtraction
Subtraction is similar to addition, except that "borrows" take the place of "carries." When subtracting two numbers that are larger than 16 bits, it's important to realize that all the "internal" portions of the subtract are effectively "unsigned," even if the overall number is signed. Thus, the Carry Flag can be thought of as a "Not Borrow" bit for the lower order terms. To subtract two extended precision numbers, one must include one additional step to correctly handle borrowing.
The following example illustrates subtracting the 32 bit number in R1:R0 from R3:R2. The upper halves of the numbers are in R1 and R3. The lower halves are in R0 and R2.
SUBR R0, R2 ; Subtract the lower halves. This may "borrow" from the upper half DECR R3 ; \_ Subtract 1 if we need to borrow, otherwise leave R3 alone. ADCR R3 ; / (ADCR and DECR cancel in the not-borrow case.) SUBR R1, R3 ; Subtract the upper halves.
If you're feeling hackish, you can speed this up in the non-borrowing case like so:
SUBR R0, R2 ; Subtract the lower halves. This may "borrow" from the upper half ADCR PC ; Skip next instruction if we did not borrow DECR R3 ; Decrement R3 if we borrowed SUBR R1, R3 ; Subtract the upper halves.
Do either of these work? Suppose we want to compute $1000_0000 - $0001_0001. (Underscores inserted every 16 bits to aid readability.) This should give us $0FFE_FFFF. So, let's step through both code sequences:
; R3:R2 = $1000:0000 ; R1:R0 = $0001:0001 SUBR R0, R2 ; R2 = R2 - R0 ==> $0000 - $0001 == $FFFF, C=0 DECR R3 ; R3 = R3 - 1 ==> $1000 - 1 == $0FFF ADCR R3 ; R3 = R3 + C ==> $0FFF + 0 == $0FFF, C=0 SUBR R1, R3 ; R3 = R3 - R1 ==> $0FFF - $0001 == $0FFE, C=1 ; Thus: R3:R2 = $0FFE:FFFF
Let's try it with the 'hackish' code example:
; R3:R2 = $1000:0000 ; R1:R0 = $0001:0001 SUBR R0, R2 ; R2 = R2 - R0 ==> $0000 - $0001 == $FFFF, C=0 ADCR PC ; skips next instruction if C=1, but C=0 so it's not skipped. DECR R3 ; R3 = R3 - 1 ==> $1000 - 1 == $0FFF SUBR R1, R3 ; R3 = R3 - R1 ==> $0FFF - $0001 == $0FFE, C=1 ; Thus: R3:R2 = $0FFE:FFFF
Now let's try an example without a 'borrow.' Suppose we compute $1000_2000 - $0002_0001. The result should be $0FFE_$1FFF.
; R3:R2 = $1000:2000 ; R1:R0 = $0002:0001 SUBR R0, R2 ; R2 = R2 - R0 ==> $2000 - $0001 == $1FFF, C=1 DECR R3 ; R3 = R3 - 1 ==> $1000 - 1 == $0FFF ADCR R3 ; R3 = R3 + C ==> $0FFF + 1 == $1000, C=0 SUBR R1, R3 ; R3 = R3 - R1 ==> $1000 - $0002 == $0FFE, C=1 ; Thus: R3:R2 = $0FFE:1FFF, C=1 (no borrow for final result)
Again, with the 'hackish' code example:
; R3:R2 = $1000:2000 ; R1:R0 = $0002:0001 SUBR R0, R2 ; R2 = R2 - R0 ==> $2000 - $0001 == $FFFF, C=1 ADCR PC ; skips next instruction if C=1 DECR R3 ; (skipped, because C=1) SUBR R1, R3 ; R3 = R3 - R1 ==> $1000 - $0002 == $0FFE, C=1 ; Thus: R3:R2 = $0FFE:1FFF, C=1 (no borrow for final result)
Subtracting longer numbers has the same complications as adding longer numbers, with some additional complications to handle the fact that the CPU isn't as helpful handling borrows as it is handling carries.  A single "borrow" in the lowest word might propagate all the way to the top, just as a carry might.  Propagating a borrow is more expensive than propagating a carry, though.  
For example, suppose we want to subtract 1 from the 64-bit value $1000_0000_0000_0000. (Again, underscores inserted every 16 bits for readability.) We know the result will be $0FFF_FFFF_FFFF_FFFF. How do we get there?
Consider the operation on the lowest word: $0000 - $0001. This gives $FFFF, and sets C=0. (That is, we "borrowed".) To propagate this borrow up to higher words, we start out as we did with the 32-bit example above. We decrement the next work, giving $FFFF. We then add the carry with ADCR. Since C=0, that leaves us with $FFFF. We move to the next word, and we DECR and ADCR. This also leaves us with $FFFF and C=0. Finally we get to the last word. One more DECR and ADCR, and we're left with $0FFF, since C=0 from the previous operation. The final computation sets C=1 when it finishes.
That's fine. What about a "non-borrow" situtation? For instance, what if we subtract 0 from a number? Let's try $1000_0000_0000_0000 - 0.
Initially we compute $0000 - 0, which gives us 0, with C=1. (Remember, Carry is "Not Borrow."). So, we move to the next word (bits 16..31), and we DECR (which changes the value to $FFFF) and then ADCR. This leaves us with $0000, and again sets C=1. We move to the next word (bits 32..47), and again DECR (giving us $FFFF) and ADCR (giving us $0000, C=1). Finally we reach the last word, and we DECR (giving us $0FFF) and ADCR (giving us $1000 and C=0).
Thus, one can see that "DECR" followed by "ADCR" successfully propagates a borrow in the same manner that ADCR propagates a carry in extended precision arithmetic.
So, if one wants to subtract two 64-bit numbers, it becomes trivial to adapt the addition example above to perform subtraction. The result isn't pretty though. The following code subtracts the 64-bit number pointed to by R4 from the 64-bit number in R3:R2:R1:R0. (R3 is the most signficant word, and R0 is the least significant.)
;; warning, this code may be broken. --Mr Z 16:34, 8 Jan 2006 (EST) SUB@ R4, R0 ; Subtract bits 0..15. DECR R1 ; \_ propagate the borrow to bits 16..31 ADCR R1 ; / DECR R2 ; \_ propagate the borrow to bits 32..47 ADCR R2 ; / DECR R3 ; \_ propagate the borrow to bits 48..63 ADCR R3 ; / SUB@ R4, R1 ; Subtract bits 16..31. DECR R2 ; \_ propagate the borrow to bits 32..47 ADCR R2 ; / DECR R3 ; \_ propagate the borrow to bits 48..63 ADCR R3 ; / SUB@ R4, R2 ; Subtract bits 32..47. DECR R3 ; \_ propagate the borrow to bits 48..63 ADCR R3 ; / SUB@ R4, R3 ; Subtract bits 48..63
Comparison
For Addition and Subtraction, one usually works from the low end of the number towards the high end. For comparisons, however, the opposite is true. To compare two large numbers, one first compares the most significant words. If they compare as equal, the process moves to the next most significant words. The process continues until two corresponding words miscompare—at which point the program has established one number as larger than the other—or the program reaches the end of both numbers, indicating they are equal.
The process works identically regardless of whether the two numbers are signed or unsigned. The only difference between signed and unsigned comparison is whether one uses BGT/BLT (signed numbers) or BNEQ/BNC (unsigned numbers).
The following example compares the signed 32-bit number in R1:R0 to the signed 32-bit number in R3:R2. R1 and R3 hold the upper 16 bits. R0 and R2 hold the lower 16 bits.
    CMPR R3, R1   ; is R1 greater, equal-to or less than R3?
    BLT  lesser
    BGT  greater
    CMPR R2, R0   ; Upper halves tie, so check lower halves
    BLT  lesser
    BGT  greater
    ; If we get here, R1:R0 = R3:R2
equal:
Here's the same example, using unsigned arithmetic. Recall that BNC is the unsigned variation of "branch if less than".
    CMPR R3, R1   ; is R1 greater, equal-to or less than R3?
    BNC  lesser
    BNEQ greater
    CMPR R2, R0   ; Upper halves tie, so check lower halves
    BNC  lesser
    BNEQ greater
    ; If we get here, R1:R0 = R3:R2
equal:
Notice that the unsigned example uses BNEQ, since the CP1610 lacks a conditional branch that's the unsigned equivalent of "branch if greater than."
Shifting
To do.
