Normmatt, posted January 21, 2005 (edited):
This may improve speed somewhat: "for(b=0;b<240;b++)" is slower than "for( b=240;b--; )", so using "b--" instead of "b++" should help a little. Hope this helps.
Edited January 21, 2005 by Normmatt
Two9A, posted January 21, 2005 (edited):
Normmatt said: this may improve speed somewhat "for(b=0;b<240;b++)" is slower than "for( b=240;b--; )" so if you use the "b--" instead of the b++ it should improve speed somewhat hope this helps
Just one problem, really: if you profile the application, you'll find that the bitmap-mode scanline renderer doesn't get called all that often; it's the CPU dispatch loop where all the time is lost. I can put that loop in, sure; I'm just not sure it'll help much.
Two minutes later: I tried it out on mode3, and it actually slowed things down. So maybe the compiler's doing a better job than I am?
Edited January 21, 2005 by Two9A
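For anyone who wants to try the same experiment, here is a minimal sketch of the two loop forms side by side. The function names, the `uint16_t` pixel type and the plain copy in the loop body are assumptions made for illustration; they are not taken from the emulator's source. Both functions touch the same 240 entries, only in opposite order, so any speed difference comes down to what the compiler generates and is worth measuring rather than assuming.

```c
#include <stdint.h>

/* 240 is the loop bound quoted in the posts above. */
#define SCANLINE_WIDTH 240

/* Count-up form: for(b=0;b<240;b++). Hypothetical example body. */
void copy_scanline_up(uint16_t *dst, const uint16_t *src)
{
    int b;
    for (b = 0; b < SCANLINE_WIDTH; b++) {
        dst[b] = src[b];
    }
}

/* Count-down form: for(b=240;b--;).
 * The test "b--" checks b against zero and then decrements it,
 * so the body sees b = 239, 238, ..., 0. */
void copy_scanline_down(uint16_t *dst, const uint16_t *src)
{
    int b;
    for (b = SCANLINE_WIDTH; b--; ) {
        dst[b] = src[b];
    }
}
```

Timing both versions in the real renderer, with the same compiler flags as the release build, is the only reliable way to tell which one wins.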
Normmatt (Author), posted January 21, 2005:
Well, it was worth a try. Oh, and I've seen your resume: why don't you rewrite the ARM7 core in asm? Then it would run 2x as fast.
Two9A, posted January 21, 2005:
Normmatt said: why dont you rewrite the arm7 core in asm then it would run 2x as fast
Dude. 1) It's mostly in asm already. 2) I can't write that stuff to save my life, so I'm not gonna touch the dispatch loop; I'd only slow it down.
Federelli, posted January 22, 2005 (edited):
Do these help?
Instead of:
xx = (x * x) * i;
yy = (y * y) * i;
zz = (z * z) * i;
xy = (x * y) * i;
yz = (y * z) * i;
zx = (z * x) * i;
Use:
/* assumes x, y, z and i are positive integers */
xx1 = x; xy1 = y; yy1 = y; yz1 = z; zz1 = z; zx1 = z;
xx = 0; xy = 0; zx = 0; yy = 0; yz = 0; zz = 0;
for(m=1;m<x;m++){ xx1 = xx1 + x; xy1 = xy1 + y; zx1 = zx1 + z; }  /* xx1 = x*x, xy1 = x*y, zx1 = z*x */
for(m=1;m<y;m++){ yy1 = yy1 + y; yz1 = yz1 + z; }  /* yy1 = y*y, yz1 = y*z */
for(m=1;m<z;m++){ zz1 = zz1 + z; }  /* zz1 = z*z */
for(m=1;m<=i;m++){ xx = xx + xx1; xy = xy + xy1; zx = zx + zx1; yy = yy + yy1; yz = yz + yz1; zz = zz + zz1; }  /* scale each product by i */
Edited January 22, 2005 by Federelli
Two9A, posted January 22, 2005:
Federelli said: Instead of: (12 muls) Use: (6 new variables, 4 loops and insane numbers of adds)
That'd work great for something like the 6502, where there's no MUL instruction, Fed. Dunno where you want this code to go, though; do tell which section you think could be improved by this.
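To make the trade-off concrete, here is a small sketch of multiplication by repeated addition next to the direct multiply, using the xx term from the post as the example. The helper `mul_by_adds` and the two wrapper functions are hypothetical names introduced for illustration, and the sketch assumes non-negative integer operands. The add-only version runs a loop whose length grows with the operand, which is why it only tends to pay off on a CPU without a multiply instruction.

```c
/* Hypothetical helper: computes a * n using additions only.
 * Assumes n >= 0; the loop body runs n times. */
long mul_by_adds(long a, long n)
{
    long acc = 0;
    long m;
    for (m = 0; m < n; m++) {
        acc += a;
    }
    return acc;
}

/* xx = (x * x) * i, written with two multiplies. */
long xx_with_mul(long x, long i)
{
    return (x * x) * i;
}

/* The same value built from additions: x + i loop iterations in total. */
long xx_with_adds(long x, long i)
{
    return mul_by_adds(mul_by_adds(x, x), i);
}
```

For x = 100 and i = 10 the add-only version performs 110 additions to replace two multiplies, so it is worth timing on the target CPU before adopting it.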
Federelli, posted January 22, 2005:
I don't know really; all I know is this does optimize multiple rendering functions in OpenGL. So if you ever use that...