GAMES ONLINE: 65816: The power of 16bit.

Saturday, July 7, 2007

65816: The power of 16bit.

I've been wondering just how much faster the SuperCPU actually is than a stock C64, and aside from the x20 jump you get from the raw clock speed, the new instructions and 16bit nature give you an even bigger boost - almost another x2! Here's a little example....

The scrolling in XeO3 takes a long time. Every game cycle I do this:


         ldx  #39
ScrollLoop
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*00),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*01),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*02),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*03),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*04),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*05),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*06),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*07),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*08),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*09),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*10),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*11),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*12),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*13),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*14),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*15),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*16),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*17),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*18),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*19),x
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*20),x
         dex
         jpl  ScrollLoop
         rts

This code is self-modified to address the new location of the back buffer, and I have to use a jpl (macro) since a normal branch is just out of reach. Each lda/sta pair costs 4+5 = 9 cycles and the dex and jpl at the bottom add 7 more per pass, so this takes (40*21*9)+(40*7) = 7840 cycles. (This is approximate, as there are also page boundary crossings hidden in here.)
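
The jpl macro itself isn't shown here. As a minimal sketch, assuming it is a "jump if plus" built from a bmi that skips over a jmp (the Done label is made up for the example), the tail of the loop might expand to something like this, which is where the 7 cycles of overhead per pass would come from:

```asm
         dex                  ; 2 cycles
         bmi  Done            ; 2 cycles while x is still >= 0 (branch not taken)
         jmp  ScrollLoop      ; 3 cycles, and a jmp can reach any address
Done
         rts
```

A bpl can only reach 128 bytes backwards, counted from the instruction after the branch. The 21 lda/sta pairs plus the dex come to 127 bytes and the branch itself adds 2 more, so the loop start ends up 129 bytes back: just out of reach.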

Now in 65816, I can do exactly the same, but being 16 bit, the loop only runs half as many times, and although we add a couple more cycles for each LDA/STA pair (11 instead of 9), it's still much quicker. So the loop is now (20*21*11)+(40*7) = 4900 cycles.
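
A rough sketch of what the 16 bit loop could look like, trimmed to the first two rows. It assumes the CPU has to be switched from emulation to native mode first, that interrupts are either masked or have valid native-mode vectors while it runs, and that the jpl macro works unchanged. The ScrollLoop16 label is made up for the example.

```asm
         clc
         xce                  ; E=0: switch the 65816 to native mode
         rep  #$20            ; M=0: 16 bit accumulator (X and Y stay 8 bit)
         ldx  #38             ; last even column, each pass copies columns x and x+1
ScrollLoop16
         lda  BackBuffer,x    ; 5 cycles with a 16 bit accumulator
         sta  HWScreen2+$400+(40*00),x   ; 6 cycles
         lda  BackBuffer,x
         sta  HWScreen2+$400+(40*01),x
         ; rows 02 to 20 repeat the same lda/sta pair
         dex
         dex                  ; step back one 16 bit word
         jpl  ScrollLoop16    ; same long-branch macro as the 6502 loop
         sep  #$20            ; M=1: back to an 8 bit accumulator
         sec
         xce                  ; E=1: return to emulation mode
         rts
```

X counts 38, 36, ... 0, and the pass after that wraps it to $FE, which is negative, so the loop drops out after 20 passes.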

And now lastly, the 65816 has block transfer instructions, MVN and MVP, which are like the Z80's LDIR instruction. If the move costs 7 cycles per 16 bit word (BEST case), it's now (20*21*7) = 2940 cycles. Now, although the block transfer would be broken up a little more (to do lines mainly), it's still only going to be around 3000. So not only is it more than twice the speed of the 6502 version, but we have the new 20MHz clock as well.
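
A minimal sketch of the block move in its best case, as a single transfer. It assumes the back buffer is one contiguous 40*21 byte block laid out like the screen, that both buffers live in bank 0, and that the assembler emits 16 bit immediates after the rep #$30 (many assemblers need a directive for that).

```asm
         clc
         xce                  ; E=0: native mode
         rep  #$30            ; 16 bit A, X and Y
         phb                  ; mvn overwrites the data bank register, so save it
         lda  #40*21-1        ; number of bytes to move, minus one
         ldx  #BackBuffer     ; source address
         ldy  #HWScreen2+$400 ; destination address
         mvn  0,0             ; bank 0 to bank 0, so operand order doesn't matter here
         plb                  ; restore the data bank register
         sep  #$30            ; back to 8 bit A, X and Y
         sec
         xce                  ; E=1: return to emulation mode
         rts
```

mvn takes the byte count minus one in the 16 bit accumulator, the source address in X and the destination address in Y, and it leaves the data bank register pointing at the destination bank, hence the phb/plb. The WDC datasheet lists mvn and mvp at 7 cycles per byte moved, so it's worth timing the real thing against the 2940 estimate.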

..............Bitmap blitting suddenly becomes REALLY interesting!!
