You are like a lexicon man
(Disclaimer: Verify this on your target platform before believing this.)
Current testing Results
The Real Take Away
Test any optimization on your target machine before assuming it will work.
The Short Description Of This Tip (for the attention-challenged)
local function sqrt( num ) return num ^ 0.5 end
may be faster than:
local mSqrt = math.sqrt
local function sqrt( num ) return mSqrt( num ) end
The Long Description Of This Tip
I recently answered a math question which led me to calculate the Nth root of a number. The equation for this is:
local function nthRoot( num, root ) return num ^ (1/root) end -- parentheses required: ^ binds tighter than /
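A quick sketch of why the parentheses around the exponent matter. In Lua, `^` has higher precedence than `/`, so `num ^ 1/root` is evaluated as `(num ^ 1) / root`. The helper name `nthRoot` is only an illustrative choice.

```lua
-- Hypothetical helper name, used here only for illustration.
local function nthRoot( num, root )
    return num ^ (1 / root)
end

local wrong = 8 ^ 1 / 3        -- parsed as (8 ^ 1) / 3, i.e. 8 divided by 3
local right = nthRoot( 8, 3 )  -- cube root of 8, approximately 2

print( "without parentheses:", wrong )
print( "with parentheses:   ", right )
```

Floating point results from `^` can differ in the last digits between platforms, so compare with a small tolerance rather than with `==`.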
While there is nothing exciting about this, it suddenly occurred to me that this might be faster than the math.sqrt function for root == 2.
It is! .... or is it?
On my Windows test machine, num^0.5 is almost 40% faster than a localized math.sqrt() call. However, on my iPad Air, math.sqrt is 21% faster.
Here is my test code:
local function round(val, n)
    if (n) then
        return math.floor( (val * 10^n) + 0.5) / (10^n)
    else
        return math.floor(val+0.5)
    end
end

local function test1( num )
    local mSqrt = math.sqrt
    local startTime = system.getTimer()
    local v
    for i = 1, num do
        v = mSqrt(i)
    end
    local endTime = system.getTimer()
    local dt = (endTime-startTime)
    print("math.sqrt x " .. num .. " == " .. dt .. " ms" )
    return dt
end

local function test2( num )
    local mSqrt = math.sqrt
    local startTime = system.getTimer()
    local v
    for i = 1, num do
        v = i^0.5
    end
    local endTime = system.getTimer()
    local dt = (endTime-startTime)
    print("M^0.5 x " .. num .. " == " .. dt .. " ms" )
    return dt
end

local t1 = 0
local t2 = 0

t1 = t1 + test1(1000000)
t1 = t1 + test1(1000000)
t1 = t1 + test1(1000000)

t2 = t2 + test2(1000000)
t2 = t2 + test2(1000000)
t2 = t2 + test2(1000000)

if( t1 > t2 ) then
    print(" M^0.5 is faster by " .. round( 1 - t2/t1,2 ) * 100 .. "%" )
else
    print(" math.sqrt is faster by " .. round( 1 - t1/t2,2 ) * 100 .. "%" )
end
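For anyone who wants to try the comparison outside Corona, here is a minimal sketch that assumes a plain Lua interpreter, where `system.getTimer()` is not available. It swaps in `os.clock()` (CPU time in seconds), and the helper name `timeIt` is hypothetical. Wrapping each operation in a function adds the same call overhead to both sides, so the relative gap will likely look smaller than in the test above.

```lua
-- Assumption: plain Lua, so os.clock() stands in for Corona's system.getTimer().
-- timeIt is a hypothetical helper name, not part of any library.
local function timeIt( label, fn, count )
    local startTime = os.clock()
    local v
    for i = 1, count do
        v = fn( i )
    end
    local dt = ( os.clock() - startTime ) * 1000  -- convert seconds to milliseconds
    print( label .. " x " .. count .. " == " .. dt .. " ms" )
    return dt
end

local mSqrt = math.sqrt
local tSqrt = timeIt( "math.sqrt", function( i ) return mSqrt( i ) end, 1000000 )
local tPow  = timeIt( "M^0.5", function( i ) return i ^ 0.5 end, 1000000 )

if tSqrt > tPow then
    print( "M^0.5 was faster on this run" )
else
    print( "math.sqrt was faster on this run" )
end
```

Results from a single run are noisy, so repeat it a few times on each target before drawing any conclusion.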
Edited by roaminggamer, 17 March 2015 - 12:23 PM.
On that same note, I've found multiplying by powers of 2, e.g. x * 2^power, quite a lot faster than bit.lshift() or math.ldexp() (with the localizations, of course) in the Windows simulator. On device, I don't know if I've done any tests. Likewise x^2 versus x * x (lookup local, invoke operator with constant, vs. lookup local, lookup "another" local, invoke operator).
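A small sketch of the arithmetic substitutions mentioned here, written without the bit library so it runs in any Lua. The helper name `shiftLeft` is hypothetical, and the equivalence with a real left shift is only assumed for non-negative integers whose result stays within the range a double can represent exactly.

```lua
-- Hypothetical helper: arithmetic stand-in for a left shift.
-- Assumes x is a non-negative integer and the result does not overflow.
local function shiftLeft( x, power )
    return x * 2 ^ power
end

local shifted = shiftLeft( 3, 4 )  -- 3 * 16

-- The two ways of squaring give the same value; only the speed may differ.
local x = 7
local viaPower    = x ^ 2
local viaMultiply = x * x

print( shifted, viaPower, viaMultiply )
```

Which form is faster is exactly the kind of thing that needs measuring on the target device.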
On the subject of bitwise ops, where possible, I tend to favor the + and - operators over bit.bor() and bit.band() / bit.bnot(). The % operator, on the other hand, still seemed pretty expensive vs. bit.band().
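One way to read the "where possible" caveat, as a sketch: `+` can only stand in for an OR when the bits being combined do not overlap, and `-` can only clear a bit that is known to be set. The flag names below are hypothetical.

```lua
-- Hypothetical flag values, each a distinct power of two.
local FLAG_A, FLAG_B, FLAG_C = 1, 2, 4

-- Same result as OR-ing the flags, but only because their bits are disjoint.
local flags = FLAG_A + FLAG_C

-- Same result as masking FLAG_A out, but only because FLAG_A is known to be set.
flags = flags - FLAG_A

-- x % 2^n keeps the low n bits of a non-negative integer,
-- like AND-ing with the mask 2^n - 1.
local low3 = 29 % 8

print( flags, low3 )
```

If a flag might already be set (or already cleared), the arithmetic versions give wrong answers and the real bitwise functions are the safer choice.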
Any non-Windows numbers, or contrary results on Windows itself, would be much appreciated!
For optimization geeks, I created an additional way to test the above (requires above code):
local getTimer = system.getTimer -- assumed alias; getTimer() is called below but not defined in the snippet

local function visualizeResults2( test1, test2, size, num, doPower )
    size = size or 2
    num = num or 100
    local t1 = 0
    local t2 = 0
    local count = 1
    local width = round(display.contentWidth/size)
    local height = round(display.contentHeight/size)
    local yield = coroutine.yield

    print("Running ", width * height * num * 2, " calculations." )

    local lastTime = getTimer()
    local timerID

    local wrapped = coroutine.wrap( function( event )
        for i = 1, height do
            for j = 1, width do
                if( doPower ) then
                    t1 = test1( num, 10^(count-1) )
                    t2 = test2( num, 10^(count-1) )
                else
                    t1 = test1( num, count )
                    t2 = test2( num, count )
                end
                local tmp = display.newRect( (j-1) * size, (i-1) * size, size, size )
                tmp.anchorX = 0
                tmp.anchorY = 0
                if( t1 > t2 ) then
                --if( count%7 == 0 ) then
                    tmp:setFillColor(0,1,0)
                else
                    tmp:setFillColor(1,0,0)
                end
                count = count + 1
                --if( count%100 == 0 ) then yield() end
                if( getTimer() - lastTime > 25 ) then
                    lastTime = getTimer()
                    yield()
                end
            end
        end
        print("DONE")
        timerID = event.source
        timer.cancel(timerID)
    end )

    timer.performWithDelay( 1, wrapped, 0 )
end

visualizeResults2( test1, test2, 8, 10000, true )
This produces a full-page visualization of which numbers are faster one way or the other. Interestingly, for certain powers of 10, one method is faster than the other.
Note: I think I have an error in the above code as it crashes at the end. oops!
Yes, the do block is actually a really handy little thing.
If there is a certain order in which you like to fire functions and blocks of code, you can put them in do blocks,
and the way you show it here is really nice because the variables live only as long as the do block is active.
When it's done executing, the scope is gone and all the local variables in that scope are released too.
I use it quite often.
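A minimal sketch of that scoping behavior, assuming nothing beyond standard Lua. The variable names are made up for the example.

```lua
local total = 0

do
    local scratch = {}  -- visible only inside this do block
    for i = 1, 3 do
        scratch[i] = i * 10
    end
    total = scratch[1] + scratch[2] + scratch[3]
end

print( total )    -- 60
print( scratch )  -- nil: out here the name refers to an unset global
```

Once the block ends, nothing references the `scratch` table any more, so it becomes eligible for garbage collection.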
Here are a couple of functions I made to check what's actually going on inside the package.loaded table at runtime, and to remove a module from it.
I make them global functions in the main file:
function printClasses()
    for k,v in pairs(_G.package.loaded) do
        print(k)
    end
end

function deleteClass(class)
    if package.loaded[class] then
        package.loaded[class] = nil
        print("**** Removed class: " .. class, " ******")
    end
end
If you run your code with these functions in place, you can look in the terminal and see all the modules/classes loaded.
Now, if you have modules loaded in your scripts, for example:
This will be listed in the terminal window if you call:
Now, let's say that you want to clean up because the player has won or lost and is taken to a typical game-over page. You can run the second function to remove the module completely; just call it from wherever you like in your code:
The simulator will print out: **** Removed class: MyModule ******
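A self-contained sketch of that workflow, runnable in plain Lua. The module name `MyModule` is taken from the printed message; registering it through `package.preload` is an assumption made only so the example needs no separate file, and the two helpers are restated as locals so the snippet stands on its own.

```lua
-- Local copies of the two helpers so this sketch runs by itself.
local function printClasses()
    for k in pairs( package.loaded ) do
        print( k )
    end
end

local function deleteClass( class )
    if package.loaded[class] then
        package.loaded[class] = nil
        print( "**** Removed class: " .. class, " ******" )
    end
end

-- Assumption: a stand-in module registered via package.preload instead of a file.
package.preload["MyModule"] = function()
    return { name = "MyModule" }
end

local MyModule = require( "MyModule" )
local wasLoaded = package.loaded["MyModule"] ~= nil

printClasses()             -- the listing now includes MyModule
deleteClass( "MyModule" )  -- removes the entry from package.loaded

print( package.loaded["MyModule"] )  -- nil
```

Clearing the `package.loaded` entry only makes the next `require` load the module again. Any variable that still holds the module table, such as the local `MyModule` here, keeps it alive until that reference is gone too.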
I use these functions a lot.
I noticed the localizing parts refer to the main Lua libraries (math, io, type).
1st question: Would this also prove useful for the Corona SDK libraries? For example, if I was creating a lot of display.newText objects in a file, would it be better or worse to declare local dNewText = display.newText, or would there be no noticeable difference?
2nd question: I am also assuming that all these core Lua functions are ones I should localize for optimization if I use them in my code.
On the bottom of the manual
I am trying to find the line between optimization and over-optimization.
1. Localization would make it faster, but not noticeably unless you made many calls.
2. Localization gives you the most speedup for functions where the 'table lookup' time is comparable to the execution time of the function.
(T below is a 'made up unit of time' for the examples just to show relative improvement)
Example 1 - Short Execution Time:
Example 2 - Long Execution Time:
Generally, I use localization for two reasons:
So in your example, I'd simply do this:
local newText = display.newText -- some speedup, way faster to type.
Can't argue too much with Ed's logic, in particular that newText is easier to type than display.newText. The one concern is supportability. I was just looking at a project for someone else where they had abstracted things out, and it made the code tougher to read.
But on software optimization, you need to decide how often you are making an API call. For instance, if you only need one random number at the beginning of a scene:create() event, the few microseconds you would gain aren't really perceivable. Our modern devices execute tens of millions of instructions per second.
Now let's say you're generating a curve and need to call a lot of sin(), cos() and other trig functions in a tight loop. Then those few microseconds add up in a hurry.
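As a sketch of that tight-loop case, here is one way localized trig functions might be used when generating a curve. The helper name `circlePoints` and its parameters are hypothetical; the only point is that `sin` and `cos` are looked up once, outside the loop.

```lua
-- Look the functions up once instead of indexing the math table on every iteration.
local sin, cos, pi = math.sin, math.cos, math.pi

-- Hypothetical helper: sample evenly spaced points on a circle.
local function circlePoints( radius, segments )
    local points = {}
    for i = 0, segments - 1 do
        local angle = 2 * pi * i / segments
        points[#points + 1] = { x = radius * cos( angle ), y = radius * sin( angle ) }
    end
    return points
end

local pts = circlePoints( 100, 360 )
print( #pts )  -- 360
```

For a one-off call the localization makes no practical difference; it only starts to matter when the loop body is about as cheap as the table lookup itself.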