The multithreaded pipeline is almost working.
I still have to clean up the code, remove the hacks, change the way the frame data is handled and test everything extensively.
On Thursday I turned off my PC at 10:15 pm, so I didn't have much time to play with the pipeline!
The first thing I did on Friday morning was to change my singlethreaded pipeline so that it could run exactly like the multithreaded one does.
I prepared a simple test scene (the Sponza atrium imported as separate meshes, plus 500 cubes, each with its own VB/IB pair and each moving left/right driven by a simple sin(x)) and started measuring rendering speed.
The first results in dx10 windowed mode were encouraging:
- singlethreaded 133 fps
- multithreaded 177 fps
This improvement comes only from the separation of the scene update thread and the rendering thread, both running at full speed.
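Here is a minimal sketch of that kind of split, not the actual engine code. It assumes the two threads communicate through a single mutex-protected snapshot of the frame data. `FrameData`, `FrameExchange` and `runDecoupled` are hypothetical names made up for the example, and the "rendering" is just a frame counter standing in for the real draw calls.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical frame data: the tick that produced it and one x offset per cube.
struct FrameData {
    int tick = 0;
    std::vector<float> cubeX;
};

// Hypothetical hand-off point between the two threads.
class FrameExchange {
public:
    // Called by the update thread when a new scene state is ready.
    void publish(FrameData data) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(data);
    }
    // Called by the render thread; returns a copy of the newest state.
    FrameData latest() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return latest_;
    }
private:
    mutable std::mutex mutex_;
    FrameData latest_;
};

// Runs `ticks` scene updates on one thread while a second thread "renders"
// as fast as it can. Returns how many frames the render thread produced.
int runDecoupled(int ticks, std::size_t cubeCount, FrameExchange& exchange) {
    assert(ticks > 0);
    std::atomic<bool> done{false};
    int framesRendered = 0;

    std::thread update([&] {
        for (int t = 1; t <= ticks; ++t) {
            FrameData data;
            data.tick = t;
            data.cubeX.resize(cubeCount);
            for (std::size_t i = 0; i < cubeCount; ++i) {
                // Left/right motion driven by sin(x), as in the test scene.
                data.cubeX[i] = std::sin(0.01f * static_cast<float>(t) +
                                         static_cast<float>(i));
            }
            exchange.publish(std::move(data));
        }
        done.store(true);
    });

    std::thread render([&] {
        bool finished = false;
        do {
            // Read the flag first, so the last pass sees the final state.
            finished = done.load();
            FrameData snapshot = exchange.latest();
            (void)snapshot;  // a real renderer would issue draw calls here
            ++framesRendered;
        } while (!finished);
    });

    update.join();
    render.join();
    return framesRendered;
}
```

Neither thread waits for the other to finish a step. The render thread simply draws whatever state was published last, which is why both can run at full speed.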
Reducing the scene update time and performing interpolation widens the difference in windowed mode:
- singlethreaded 133 fps
- multithreaded 195 fps
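The interpolation step could look something like the sketch below. It assumes the render thread keeps the two most recent scene states and blends linearly between them, which is one common way of doing it and not necessarily what my pipeline does. `SceneState`, `mix` and `interpolate` are hypothetical names for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scene snapshot: the time it was computed for and one x offset per cube.
struct SceneState {
    double time = 0.0;
    std::vector<float> cubeX;
};

// Linear blend between a and b, with t expected in [0, 1].
float mix(float a, float b, float t) {
    return a + (b - a) * t;
}

// Builds the positions to draw at renderTime from the two latest scene states.
// renderTime is clamped to [prev.time, next.time], so this never extrapolates.
std::vector<float> interpolate(const SceneState& prev,
                               const SceneState& next,
                               double renderTime) {
    assert(prev.cubeX.size() == next.cubeX.size());
    assert(next.time > prev.time);

    double t = (renderTime - prev.time) / (next.time - prev.time);
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;

    std::vector<float> result(prev.cubeX.size());
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = mix(prev.cubeX[i], next.cubeX[i], static_cast<float>(t));
    }
    return result;
}
```

With interpolation in place the scene can be updated less often than it is drawn, so the update thread has less work to do per rendered frame.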
The final results for all configurations in fullscreen mode are:
- dx9 singlethreaded 84 fps
- dx9 multithreaded 104 fps
- dx10 singlethreaded 170 fps
- dx10 multithreaded 275 fps
The dx9 version is clearly limited by a GPU bottleneck. The trick speeding up dx10 seems to be related to the fact that the vertex format/vertex shader linkage is the same for all meshes. While dx9 recalculates it whenever a shader/VB change is requested, in dx10 it is the user's responsibility to link them, so there's no state change when switching VBs.
Despite being GPU-limited, the dx9 version still shows an improvement of approximately 20-25% when using the multithreaded pipeline.
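For reference, the relative gains can be worked out from the figures above with a couple of small helpers. This assumes the fullscreen numbers are frames per second like the windowed ones; `speedupPercent` and `frameTimeMs` are hypothetical helper names.

```cpp
#include <cassert>

// Relative gain of the multithreaded run over the singlethreaded one, in percent.
double speedupPercent(double singleFps, double multiFps) {
    assert(singleFps > 0.0);
    return (multiFps / singleFps - 1.0) * 100.0;
}

// Time spent on one frame, in milliseconds.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// With the measurements from this post:
//   dx10 windowed,   133 -> 177 fps: about 33%
//   dx10 windowed,   133 -> 195 fps: about 47%
//   dx9 fullscreen,   84 -> 104 fps: about 24%
//   dx10 fullscreen, 170 -> 275 fps: about 62%
```

In frame time terms, the dx10 fullscreen case goes from roughly 5.9 ms to roughly 3.6 ms per frame.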
What's next?
As I said, I'd like to clean up and extend the code, then I'll probably look at parallelizing my pipeline further. In particular I'd like to take a closer look at jobs, job pools, job graphs and schedulers. There's still room for improvement there.