Saturday, 17 October 2009

Multithreading, parameter update rate, etc.

Big news: the first (huge) step of the multithreaded architecture has been completed.

Pros:
- I can switch from a single-threaded pipeline to a multithreaded one with one line of code.
- I can adjust the scene update speed. Even on the fly!
- The system is robust, error-free and works well regardless of the pressure put on the CPU and/or the GPU.
- When the CPU is under heavy pressure, there's a tremendous improvement. The latest tests show 250fps for the multithreaded version vs 60fps for the single-threaded one.
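
To give an idea of what a "one line" switch can look like, here is a minimal Python sketch. It is not the engine's code: `run_pipeline`, `update` and `render` are hypothetical names, and the hand-off queue is just one assumed way to connect a scene thread to a rendering thread.

```python
import queue
import threading


def run_pipeline(num_frames, multithreaded):
    """Update and render num_frames frames; return the frame ids in render order."""
    rendered = []

    def update(frame):
        # Stand-in for the scene update: produce the state for one frame.
        return {"frame": frame, "angle": frame * 0.5}

    def render(state):
        # Stand-in for the draw calls: just record which frame was drawn.
        rendered.append(state["frame"])

    if not multithreaded:
        # Single-threaded pipeline: update and render back to back.
        for frame in range(num_frames):
            render(update(frame))
        return rendered

    # Multithreaded pipeline: the scene thread produces states,
    # the calling thread plays the role of the rendering thread.
    handoff = queue.Queue(maxsize=2)

    def scene_thread():
        for frame in range(num_frames):
            handoff.put(update(frame))
        handoff.put(None)  # sentinel: no more frames

    worker = threading.Thread(target=scene_thread)
    worker.start()
    while True:
        state = handoff.get()
        if state is None:
            break
        render(state)
    worker.join()
    return rendered
```

With this shape, the caller only flips one argument: `run_pipeline(100, multithreaded=True)` instead of `run_pipeline(100, multithreaded=False)`.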

Cons:
- The system doesn't scale linearly with N cores.

My main concern for the last month was the way the scene entities get updated over time, and in particular which parts of the pipeline should be responsible for the updates.

I wanted a straightforward way to update parameters on a per-frame basis. The older one was quite hacky and probably not as fast as expected.

The problem I had to face was that there are multiple "reference times".
Some parameters should be updated every frame, some every N milliseconds, some at unknown times (think of network data), and some upon request from an external class.
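
The different timings listed above can be sketched roughly like this. It is only an illustration under my own assumptions: `ScheduledParam`, `UpdateClock`, `push` and `tick` are hypothetical names, and a real engine would certainly organize this differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScheduledParam:
    name: str
    # Maps the current time (ms) to a value; None means "pushed from outside only".
    compute: Optional[Callable[[float], float]] = None
    # None means "every frame"; otherwise the minimum interval between updates.
    interval_ms: Optional[float] = None
    value: float = 0.0
    last_update_ms: float = float("-inf")


class UpdateClock:
    def __init__(self):
        self.params = []
        self.pending = {}  # values that arrived at unknown times (network, requests)

    def add(self, param):
        self.params.append(param)

    def push(self, name, value):
        # Called whenever external data shows up; applied on the next tick.
        self.pending[name] = value

    def tick(self, now_ms):
        """Apply all updates that are due at now_ms; return the updated names."""
        updated = []
        for p in self.params:
            if p.name in self.pending:
                p.value = self.pending.pop(p.name)
            elif p.compute is not None and (
                p.interval_ms is None or now_ms - p.last_update_ms >= p.interval_ms
            ):
                p.value = p.compute(now_ms)
            else:
                continue
            p.last_update_ms = now_ms
            updated.append(p.name)
        return updated
```

A parameter with no interval is refreshed on every `tick`, one with `interval_ms=100` at most every 100 ms, and one without a `compute` function only changes when somebody calls `push`.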

All those different timings must coexist. You may ask: what does this have to do with rendering speed? As layers of abstraction and containers are added to cope with different timings, the "distance" between the data needed and the class receiving the data widens.

Unless there is a "direct" way to map the correct data to the correct parameter, that distance is going to affect the rendering speed. How? Why? If the scene thread is responsible for updating data, why should the rendering thread be affected? Isn't that separation the main purpose of a multithreaded pipeline?

The answer to all those questions is: rendering speed is affected because some updates are performed by the rendering thread.

You can't update a shader parameter from the scene thread, as you would be changing it while the rendering thread is running. Even if a multithreaded device didn't slow everything down, synchronizing something that is set so many times during a single rendering frame would be very bad, performance-wise.
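
One common way around this is to synchronize once per frame instead of once per parameter. The sketch below shows the idea in Python. It is an assumption of mine, not the code described in this post: `ParameterSnapshot` and its methods are hypothetical, and copying into a tuple is just the simplest way to get an immutable snapshot.

```python
import threading


class ParameterSnapshot:
    """Scene thread writes freely; the render thread only sees published snapshots."""

    def __init__(self, count):
        self._working = [0.0] * count            # touched only by the scene thread
        self._published = tuple(self._working)   # immutable, read by the render thread
        self._lock = threading.Lock()

    def set(self, index, value):
        # No lock here: only the scene thread ever writes the working copy.
        self._working[index] = value

    def publish(self):
        # Called once per scene update: a single synchronization point.
        snapshot = tuple(self._working)
        with self._lock:
            self._published = snapshot

    def latest(self):
        # Called once per rendered frame; the result never changes afterwards.
        with self._lock:
            return self._published
```

The rendering thread calls `latest()` at the start of a frame and sets its shader parameters from that tuple, so whatever the scene thread does in the meantime cannot change values halfway through the frame.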

The problem is that with fewer abstraction layers it becomes impossible to keep multiple reference timings coherent, while with more layers the distance the rendering thread has to cover to get the data it needs widens.
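
A possible compromise, sketched here in Python purely as an illustration, is to pay for the abstraction once at bind time and use plain indices afterwards. `BindingTable`, `bind` and `write` are hypothetical names, not part of the engine.

```python
class BindingTable:
    """Resolve (entity, parameter) names to flat slots once, then write by index."""

    def __init__(self):
        self._slots = {}   # (entity, param) -> slot index, used only at bind time
        self.values = []   # flat array the rendering side would read directly

    def bind(self, entity, param):
        # The slow, layered lookup happens here, once per parameter.
        key = (entity, param)
        if key not in self._slots:
            self._slots[key] = len(self.values)
            self.values.append(0.0)
        return self._slots[key]

    def write(self, slot, value):
        # Per-frame path: a single indexed store, no name lookups.
        self.values[slot] = value
```

Whatever produces the value (per-frame animation, a timer, the network) keeps the slot returned by `bind` and calls `write(slot, value)`, so the path to the data stays short no matter how many layers sit above it.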

The results are OK: the rendering thread can render a static 50k-poly model plus 500 animated objects (each with 7 animated parameters) at approximately 250fps.

I'm fine with 3500 parameters updated per frame, plus rendering, at 250fps, considering some of them can be grouped to increase speed. After all, my notebook has a T5550 and an 8600M GT, definitely not the fastest hardware around.
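
For reference, the arithmetic behind those numbers, written out in Python. The per-parameter figure is only an upper bound, because the same frame time also has to cover the rendering itself.

```python
objects = 500
params_per_object = 7
fps = 250

params_per_frame = objects * params_per_object       # 500 * 7 = 3500
updates_per_second = params_per_frame * fps           # 3500 * 250 = 875000
frame_budget_ms = 1000 / fps                          # 4.0 ms per frame

# Upper bound on the time available per parameter, in microseconds.
budget_per_param_us = frame_budget_ms * 1000 / params_per_frame

print(params_per_frame, updates_per_second, frame_budget_ms)
print(round(budget_per_param_us, 2))
```

So each frame has 4 ms in total, which leaves a little over one microsecond per parameter even before counting the draw calls.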

I guess I should now try to scale linearly with N cores and in general work on job pools/graphs and schedulers, but I miss graphics stuff...
