Wednesday, January 21, 2009

refactoring safety net

I recently started porting and collecting some code for doing various sorts of intersection, or collision, tests in haxe.

At the moment, it is pretty limited with ray/triangle and ray/sphere tests.

The first test I ported was based on based on "Fast, Minimum Storage Ray / Triangle Intersection" by Tomas Möller and Ben Trumbore and ported directly from the copy of the code on here.

The results documented in the paper are pretty impressive and having source code certainly made it an easy win, but some aspects of the code concern me:
  • points are represented as arrays, perhaps not very efficient in haxe
  • the code has multiple returns points, can't be inlined in haxe

I want to refactor the code to be as efficient as possible because this is likely to be the innermost loop of particle simulation routines, etc, but I don't want to break the implementation.

Enter the test case.

My initial testcase generated 1000 random triangles and rays and tested the culling and non-culling version of the algorithm.

All well and good, but not good enuff for refactoring since I need to make sure the same inputs produce the same outputs before and after my changes.

So... I put in some trace commands to produce xml like so:
<hit non_hit="1" cul_hit="1" t="0.5373583852">
<v0 x="0.6581358404" y="0.1374191698" z="0.9648343668" />
<v1 x="0.308879062" y="0.6053369969" z="0.3154988046" />
<v2 x="0.8667143827" y="0.1228679823" z="0.3068826511" />
<src x="0.1215154344" y="0.2182366329" z="0.4350455069" />
<dir x="0.7023522276" y="0.3298090568" z="0.1063413928" />
</hit>

Now I can verify that the same input produces the same output.

My test looks like this now:
public function testCanned() {  
var count = 0;
var tests =
loadTriangles();

for ( i in 0 ... 50 ) {
for ( test in tests ) {
verifyTriangle( test );
count++;
}
}
var stopTime = Date.now();
var diff = ( stopTime.getTime() - startTime.getTime() ) / 1000.0;

trace( count + ' tests / ' + diff + 's = ' + ( count / diff ) + ' tests/s' );
}


After a few test runs, the initial numbers look like:
AppTest.hx:99: 459600 tests / 32s = 14362.5 tests/s
AppTest.hx:99: 459600 tests / 36s = 12766.66667 tests/s
AppTest.hx:99: 459600 tests / 35s = 13131.42857 tests/s
AppTest.hx:99: 459600 tests / 35s = 13131.42857 tests/s
AppTest.hx:99: 459600 tests / 36s = 12766.66667 tests/s

So... it averages out to around 13231.738096 tests/s

My first refactoring was to use a "result" variable and some branching to only have a single return point so I can use the inline function modified:
AppTest.hx:99: 459600 tests / 31s = 14825.80645 tests/s
AppTest.hx:99: 459600 tests / 31s = 14825.80645 tests/s
AppTest.hx:99: 459600 tests / 31s = 14825.80645 tests/s
AppTest.hx:99: 459600 tests / 32s = 14362.5 tests/s
AppTest.hx:99: 459600 tests / 31s = 14825.80645 tests/s

For an average of around: 14733.145160 tests/s

This comes to 14733.145160/13231.738096 = 1.11347013167180814489 , for an 11% speedup just from this trivial change.

From some earlier work, I saw that an optimized point class could be significantly faster than a typedef, some additional tests (unpublished) show that using arrays to represent points was about 4x slower than typedefs!

The next refactoring will be to use this "InlinePoint" class

AppTest.hx:101: 459600 tests / 18s = 25533.33333 tests/s
AppTest.hx:101: 459600 tests / 17s = 27035.29412 tests/s
AppTest.hx:101: 459600 tests / 17s = 27035.29412 tests/s
AppTest.hx:101: 459600 tests / 17s = 27035.29412 tests/s
AppTest.hx:101: 459600 tests / 18s = 25533.33333 tests/s

Average is 26434.509804. 26434.509804/13231.738096 = 1.99781084028493908605.

Damn! Of course it comes at the cost of a new type, but still.

So pretty, happy, but still a little concerned about this:
var edge1 = threePt();
var edge2 = threePt();
var tvec = threePt();
var pvec = threePt();
var qvec = threePt();

That's a lot of memory allocation madness... One optimization I tried was to move the declarations to where they are actually used... not much difference...

What about passing in a workspace? This bumped up to 28725 for an improvement of 8%, but doesn't really seem worth the bother.

So that's the good news... now the bad news...
ERR: AppTest.hx:48(tests.AppTest.randomTriangle) - expected '-0.206685315' but was '0.4077964499'

Happily, it turned out to be a red herring from the refactoring of the test itself:
assertEquals( cul.x, non.y );

should have been (and now is)
assertEquals( cul.x, non.x );

Nutty!

No comments:

Post a Comment