In the last post, I discussed creating a point allocation algorithm that assigns points on a grid for instancing building models of different sizes for the Skylines: City Generator project. In this post, I'll cover the new problems I discovered once I was able to instance hundreds of different buildings, and how I went about solving them.
Too many vertices
I wanted to create a fairly large Cyberpunk city with lots of variation and hundreds of different dystopian-looking buildings, so that when we flew through the city there wouldn't be too much repetition. The building models I had chosen had a fairly high polygon count, which meant that when I instanced a large number of buildings in the scene, it took too long to render and the frame rate dropped into the single digits. I had to come up with a way to optimise the system so it could run in real time.
With my limited experience in Blender, I knew it would take me a very long time to clean up the 80+ models and re-import them into TouchDesigner. Instead, I decided to use the polyreduceSOP in TouchDesigner to reduce the vertex count of the models on import. I also enabled backface culling in the renderTOP so the GPU would skip drawing faces that point away from the camera. These initial steps brought some performance gains, but they were still not enough to make the project real time.
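As a rough illustration of the kind of setup involved, the snippet below sets a Polyreduce SOP's keep percentage and turns on backface culling from TouchDesigner Python. The operator names and the parameter names (percentage, cullface) are assumptions made for the sake of the example, not the exact network used in the project.

```python
# Hypothetical TouchDesigner Python sketch -- operator and parameter names
# ('polyreduce1', 'percentage', 'render1', 'cullface') are illustrative
# assumptions, not the exact project network.

# Keep roughly a third of each imported building model's polygons.
reducer = op('polyreduce1')
reducer.par.percentage = 30

# Skip rasterising faces that point away from the camera.
render = op('render1')
render.par.cullface = 'backfaces'
```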
Cull everything
While searching for a solution to this optimization problem, I discovered that reducing the amount of geometry that needed to be instanced significantly improved the project's performance. I realised that I needed a way to dynamically reduce the number of buildings being instanced as the camera moves around the scene.
I decided to dynamically delete points that had been assigned to instance a building whenever they fell outside a culling volume. The culling volume is a sphere centred at the intersection of the camera's view vector with the source grid plane, with a radius equal to the camera's distance from that intersection point. This way, the culling volume shrinks and less geometry is instanced as the camera moves closer to the grid plane. I use the extremely useful objectCHOP to obtain the camera's transform, and from that I calculate its view vector, the intersection point with the plane, and the distance to that point.
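To make the geometry concrete, here is a minimal Python sketch of how the sphere's centre and radius can be derived from the camera transform. It assumes the source grid lies on the XZ plane at y = 0 and that the view vector is normalised; it is a stand-in for the CHOP math, not the actual network.

```python
# Minimal sketch of the culling-sphere construction, assuming the source
# grid lies on the XZ plane (y = 0). Inputs would come from the camera's
# world position (e.g. via the objectCHOP) and its forward direction.

def culling_sphere(cam_pos, view_dir):
    """Return (centre, radius) of the culling sphere, or None."""
    px, py, pz = cam_pos
    dx, dy, dz = view_dir

    # Intersect the view ray with the plane y = 0:
    # cam_pos + t * view_dir has y == 0  =>  t = -py / dy
    if abs(dy) < 1e-6:
        return None  # camera looking parallel to the grid, no intersection
    t = -py / dy
    centre = (px + t * dx, 0.0, pz + t * dz)

    # Radius = distance from the camera to the intersection point.
    radius = ((t * dx) ** 2 + (t * dy) ** 2 + (t * dz) ** 2) ** 0.5
    return centre, radius
```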
I initially tried to perform this culling operation as a SOP network, but since SOPs in TouchDesigner run on the CPU, this created an even worse bottleneck and performance was poorer than before. With the help of the incredible Ian Shelanskey, I came up with a much more performant CHOP-based network that significantly improved performance and brought the project up to real time.
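Conceptually, the culling itself then reduces to a per-point distance check against that sphere. The sketch below shows the idea in plain Python; in the project this logic lives in the CHOP network rather than in a script.

```python
# Conceptual per-point culling test. The project implements the equivalent
# logic as a CHOP network so it stays off the CPU-bound SOP path.

def cull_points(points, centre, radius):
    """Keep only the instance points that fall inside the culling sphere."""
    cx, cy, cz = centre
    r2 = radius * radius
    kept = []
    for (x, y, z) in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        if dx * dx + dy * dy + dz * dz <= r2:
            kept.append((x, y, z))
    return kept
```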
An important thing I had to keep in mind is that TouchDesigner is a pull-based system: if an operator is cooking or changing its values, all downstream operators in the chain will cook as well. Since the objectCHOP cooks on every frame the camera moves, I placed the culling operation towards the end of the point allocation network to ensure that only the minimum number of operators need to cook.
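One quick way to sanity-check that arrangement is to read operators' cook times from Python while the camera moves. The operator names below are placeholders, and this is simply a convenience alongside the Performance Monitor, not part of the project network.

```python
# Placeholder operator names -- print the last cook time (in milliseconds)
# of a few operators in the point-allocation chain to confirm that only
# the tail of the network cooks while the camera is moving.
for name in ['object1', 'cull_math1', 'geo_instances']:
    o = op(name)
    if o is not None:
        print(name, 'last cook:', o.cookTime, 'ms')
```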
Points culled, performance gained
Optimization was a big hurdle that I spent a lot of time trying to clear. I had to come up with a novel way to reduce the amount of geometry being instanced in TouchDesigner at any given time. I also learnt how to use the Performance Monitor and Probe tools to identify both CPU and GPU bottlenecks, and implemented several practical ways of gaining back some cook time. I'm hoping to take all of these lessons with me onto the next project. In this post, I have only introduced my approach to optimization; in the next post, I will take a deeper dive into the optimization network.