Why hardware scissoring and batches don’t like each other:
Some of you might have used the XNA SpriteBatch or the SlimDX Sprite in order to batch quads.
And a few of you might have wondered why changing the ScissorRect between Batch.Draw calls doesn’t do what you expect.
The explanation is really simple when you understand what the batch class does under the hood:
Every Batch.Draw call stores a single primitive in the batch. The next call to Batch.End then updates a dynamic vertex buffer, binds it to the device and issues a Device.Draw call.
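Since nothing is actually drawn until Batch.End, the scissor rect that is set at that moment applies to the whole batch. There is no way to change the scissor rect in between Batch.Draw calls.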
The only way around this is to end the batch, change the scissor rect, and start the batch again, which is exactly what Squid does. But this leads to more draw calls: at least one extra draw call per scissor rect change.
In practice, the batch will also submit a separate Device.Draw call every time the texture changes between consecutive Batch.Draw calls, in order of appearance. And this leads to even more draw calls. Some batches give you the option to sort by texture, which is convenient in some situations, but not relevant for GUI drawing. For our purposes, all we need is an alpha-blended painter's algorithm, meaning everything is drawn in the order it was submitted.
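To make the draw call counting concrete, here is a minimal sketch of a deferred batch. It is written in Python purely as illustration and is not the XNA or SlimDX API; the `Batch` class and its members are made-up names, and the "device" is just a counter.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Toy deferred sprite batch (hypothetical, not a real API)."""
    quads: list = field(default_factory=list)
    draw_calls: int = 0

    def begin(self):
        self.quads.clear()

    def draw(self, texture, rect):
        # Nothing reaches the device here; the quad is only recorded.
        self.quads.append((texture, rect))

    def end(self):
        # One device draw per run of consecutive quads that share a
        # texture, in submission order (painter's algorithm).
        previous = object()  # sentinel that matches no texture
        for texture, _rect in self.quads:
            if texture != previous:
                self.draw_calls += 1
                previous = texture
        self.quads.clear()


batch = Batch()
batch.begin()
batch.draw("skin", (0, 0, 10, 10))
batch.draw("skin", (10, 0, 20, 10))
batch.draw("font", (0, 0, 8, 8))
batch.draw("skin", (20, 0, 30, 10))
batch.end()
print(batch.draw_calls)  # 3: skin run, font run, skin run
```

Ending and restarting the batch for a scissor change has the same effect as a texture change: the run is cut in two, even if both halves use the same texture.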
Now, all of this is not a big problem, since you can clip your quads manually before drawing them in a batch.
That way you can avoid the hardware scissor rect altogether, but that’s where fonts become a real problem.
Even in XNA 4.0 the SpriteFont class doesn’t let you do software clipping. In order to clip text with pixel precision,
you either need to use the hardware scissor rect (which we’re trying to get rid of here) or roll your own SpriteFont class.
The solution is a custom SpriteFont class that respects the current software scissor rectangle.
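A rough sketch of what such software clipping could look like, again in Python as neutral pseudocode rather than XNA code. The function names, the `(left, top, right, bottom)` rect layout and the glyph table format are all assumptions made for this example; a real SpriteFont replacement would also handle kerning, line breaks and so on.

```python
def clip_quad(rect, uv, scissor):
    """Clip an axis-aligned quad against a scissor rect.

    rect and scissor are (left, top, right, bottom) in pixels,
    uv is (u0, v0, u1, v1). Returns (rect, uv), or None if the
    quad lies completely outside the scissor rect.
    """
    l, t, r, b = rect
    sl, st, sr, sb = scissor
    cl, ct, cr, cb = max(l, sl), max(t, st), min(r, sr), min(b, sb)
    if cl >= cr or ct >= cb:
        return None  # nothing left to draw (also covers empty quads)
    u0, v0, u1, v1 = uv
    du = (u1 - u0) / (r - l)  # UV change per pixel, horizontally
    dv = (v1 - v0) / (b - t)  # UV change per pixel, vertically
    clipped_uv = (u0 + (cl - l) * du, v0 + (ct - t) * dv,
                  u1 - (r - cr) * du, v1 - (b - cb) * dv)
    return (cl, ct, cr, cb), clipped_uv


def clip_text(text, x, y, glyphs, scissor):
    """Lay out text left to right and clip every glyph quad.

    glyphs maps a character to (width, height, uv); this table
    format is hypothetical.
    """
    quads = []
    for ch in text:
        w, h, uv = glyphs[ch]
        clipped = clip_quad((x, y, x + w, y + h), uv, scissor)
        if clipped is not None:
            quads.append(clipped)
        x += w  # advance the pen even if the glyph was clipped away
    return quads
```

The returned quads can go straight into the batch, so text no longer forces a hardware scissor change.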
All in all, here’s what we have to do in order to reduce draw calls, and possibly even reduce the whole GUI rendering to a single Device.Draw call:
- replace hardware scissor test with software clipping (scissor quads before sending them to the batch)
- use custom spritefont (scissor glyphs before sending them to the batch)
- avoid texture switches (use texture array or atlas, depending on available hardware)
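For the atlas route, the only extra work per quad is remapping its UVs into the atlas. A small sketch, assuming each sub-image is described by a pixel rectangle inside the atlas (the function name and parameter layout are invented for this example):

```python
def atlas_uv(region, atlas_size, local_uv=(0.0, 0.0, 1.0, 1.0)):
    """Map UVs local to one sub-image into UVs of the whole atlas.

    region is (x, y, width, height) of the sub-image in atlas pixels,
    atlas_size is (width, height) of the atlas texture.
    """
    x, y, w, h = region
    aw, ah = atlas_size
    u0, v0, u1, v1 = local_uv
    return ((x + u0 * w) / aw, (y + v0 * h) / ah,
            (x + u1 * w) / aw, (y + v1 * h) / ah)


# A 64x64 image stored at (64, 0) in a 256x256 atlas:
print(atlas_uv((64, 0, 64, 64), (256, 256)))  # (0.25, 0.0, 0.5, 0.25)
```

With every GUI texture and font page in one atlas, the batch never sees a texture change.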
How is this relevant for Squid?
Whenever the scissor rect changes, Squid makes a call to Renderer.EndBatch, Renderer.Scissor, Renderer.StartBatch.
This degrades the performance of batching. Imagine if every single control used scissoring: that would result in at least one draw call per control, which is not acceptable.
So it’s very important to wisely decide which controls actually need scissor and which do not.
This is why Squid lets you enable/disable the scissor testing via the Control.Scissor property, which is false by default.
But there is more.
Because I assumed any renderer implementation would use the hardware scissor test, Squid currently sends you unclipped quads. This little-known fact will change in the next version, where Squid will send you software-clipped quads with correctly adjusted UVs. That means everything but text will be clipped correctly, even without the hardware scissor test! The clipping itself is boilerplate code which will be hidden inside Squid. And Squid will still give you a scissor rect which you can use to clip font glyphs.
Hint: You can do all this by yourself already, if you use a custom SpriteFont equivalent!
Wait, if we don’t use the hardware scissor, why stop the batch? The answer is we don’t have to.
Squid will still make the same calls to EndBatch/Scissor/StartBatch, but you can ignore every End/Start between the first Start and the very last End call. I’ll show an example of this in my next blog.
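Until then, here is the general idea as a sketch. It is Python pseudocode, not Squid's renderer interface: the class and method names are invented, and it assumes the renderer is told which End call is the last one (here through a `final` flag).

```python
class SingleDrawRenderer:
    """Hypothetical renderer that ignores intermediate End/Start calls."""

    def __init__(self):
        self.pending = []      # quads collected for the current frame
        self.scissor = None    # only used for software clipping
        self.started = False
        self.device_draws = 0  # stands in for real Device.Draw calls

    def start_batch(self):
        # Only the very first Start of a frame opens the batch.
        if not self.started:
            self.started = True
            self.pending.clear()

    def set_scissor(self, rect):
        # No hardware state change; just remember the rect.
        self.scissor = rect

    def draw_quad(self, rect, uv):
        # Software clipping against self.scissor would happen here.
        self.pending.append((rect, uv, self.scissor))

    def end_batch(self, final=False):
        # Intermediate End calls do nothing; only the last one flushes.
        if not final:
            return
        if self.pending:
            self.device_draws += 1  # one draw for the whole GUI
        self.pending.clear()
        self.started = False


r = SingleDrawRenderer()
r.start_batch()
r.draw_quad((0, 0, 10, 10), (0, 0, 1, 1))
r.end_batch()                    # ignored
r.set_scissor((0, 0, 5, 5))
r.start_batch()                  # ignored
r.draw_quad((0, 0, 10, 10), (0, 0, 1, 1))
r.end_batch(final=True)          # single flush
print(r.device_draws)  # 1
```

This only yields one draw call if all quads also share one texture (or atlas).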
Why that is important:
- Smart users can use Squid with software scissor on every single control at virtually no cost.
- Very smart users can reduce the whole GUI rendering to a single Device.Draw call.
- A Unity3D renderer can be implemented in several ways.
How do I get optimal Squid performance:
- Use software clipping, also for fonts.
- Use fewer textures. Merge textures into arrays or an atlas.
Upcoming Squid changes:
- Squid will send you clipped quads with clipped UVs.
- EndBatch will change to EndBatch(bool final), to signal when the whole GUI rendering is done.
- Squid will include a sample XNA4.0 renderer.
- Squid will include a sample DirectX11 renderer, using PointList and Geometry Shader.
I’ll explain more about the last item (the DirectX11 renderer) in my next blog: An optimal Squid renderer in DirectX11.