Multithreaded game engine–Using Concurrent collections instead

22Jun13

So this seems to be turning into a series of some sort…

What happens if, instead of synchronizing with reset events, we use one of the new concurrent collections? Let's find out.

Implementation with Concurrent Queue

While I was trying to implement the double buffer (covered in the previous post), I started thinking that one of the concurrent collections might give a simpler implementation, so I tried it with a ConcurrentQueue<T>.

The idea behind it is pretty simple. Instead of having the two collections of RenderCommand, we have a ConcurrentQueue<RenderCommand[]>, and we think of each element of the queue as a frame, ready to be rendered.

As before, we are starting off with the primitives3DWindows sample. A bit obvious, but we'll need to initialize the queue in the constructor of the Renderer class (I just thought I'd mention that).
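To make the setup concrete, here is a minimal sketch of the two fields and the constructor. The field names match the ones used in the snippets in this post. The RenderCommand struct is a stand-in with placeholder field types so the sketch compiles without XNA, and the PendingFrames property is a hypothetical helper added only for illustration.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Stand-in for the sample's RenderCommand. The real one holds XNA types
// (Color, Matrix); placeholders are used here so the sketch compiles alone.
public struct RenderCommand
{
    public float Radius;
    public string Color; // placeholder for Microsoft.Xna.Framework.Color
}

public class Renderer
{
    // Commands collected by the update side for the frame being built.
    private readonly List<RenderCommand> _updatingRenderCommands;

    // Each element is one complete frame, ready to be rendered.
    private readonly ConcurrentQueue<RenderCommand[]> _concurrentRenderCommandsThatRepresentAFrame;

    public Renderer()
    {
        _updatingRenderCommands = new List<RenderCommand>();
        _concurrentRenderCommandsThatRepresentAFrame = new ConcurrentQueue<RenderCommand[]>();
    }

    // Hypothetical helper (not in the original): number of frames waiting.
    public int PendingFrames
    {
        get { return _concurrentRenderCommandsThatRepresentAFrame.Count; }
    }
}
```

Only the queue is shared between threads; the list is touched by the update side alone, which is why a plain List<T> is enough for it.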

From the World we'll add all the necessary cubes to the list of render commands using the AddCube() method. At the end of the update, we'll need to add that collection to the queue by calling EndFrame(). Like this:

public void AddCube(Cube primitive)
{
    // Build the cube's world matrix: rotate first, then translate.
    var translation = Matrix.CreateFromYawPitchRoll(
        primitive.Rotation.X, primitive.Rotation.Y, primitive.Rotation.Z) *
        Matrix.CreateTranslation(primitive.Position);

    // Record everything the renderer needs to draw this cube later.
    _updatingRenderCommands.Add(new RenderCommand
    {
        Color = primitive.Color,
        Radius = primitive.Radius,
        World = translation
    });
}

public void EndFrame()
{
    // Snapshot the commands into their own array so the renderer owns the frame.
    var renderCommands = new RenderCommand[_updatingRenderCommands.Count];
    _updatingRenderCommands.CopyTo(renderCommands, 0);

    // Publish the finished frame, then start collecting the next one.
    _concurrentRenderCommandsThatRepresentAFrame.Enqueue(renderCommands);
    _updatingRenderCommands.Clear();
}

These methods are public because they are called from World. However, as the sequence diagram below shows, World first checks whether the renderer CanAcceptCommands(). The reason is that we don't regulate how often world.Update() runs, and we only ever want one frame (one RenderCommand collection ready to render) waiting when Draw is called, so that no latency builds up.

[Image: sequence diagram]
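CanAcceptCommands() is not listed in this post, so the following is only a sketch of one way it could work, assuming the policy described above of keeping at most one finished frame waiting. FrameGate, TryTakeFrame and the generic TFrame parameter are hypothetical names used to keep the example self-contained.

```csharp
using System.Collections.Concurrent;

// Hypothetical wrapper around the frame queue; TFrame stands in for RenderCommand[].
public class FrameGate<TFrame>
{
    private readonly ConcurrentQueue<TFrame> _frames = new ConcurrentQueue<TFrame>();

    // Assumed policy: accept a new frame only when none is waiting to be drawn.
    public bool CanAcceptCommands()
    {
        return _frames.IsEmpty;
    }

    // Called by the update side once a frame is complete.
    public void EndFrame(TFrame frame)
    {
        _frames.Enqueue(frame);
    }

    // Called by the render side; returns false when no frame is waiting.
    public bool TryTakeFrame(out TFrame frame)
    {
        return _frames.TryDequeue(out frame);
    }
}
```

With a single update thread, checking first and enqueueing afterwards is safe: only that thread adds frames, so between the check and the enqueue the queue can only get emptier, never fuller.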

With the update done, we now need to render. At this point this is pretty trivial. Code below:

public void Draw(Matrix view, Matrix projection)
{
    // Take the next finished frame; if none is waiting there is nothing to draw.
    RenderCommand[] renderCommands;
    if (!_concurrentRenderCommandsThatRepresentAFrame.TryDequeue(out renderCommands))
        return;

    // Draw every cube recorded for this frame.
    foreach (var renderingRenderCommand in renderCommands)
    {
        _cubePrimitive.Draw(
            renderingRenderCommand.World,
            view,
            projection,
            renderingRenderCommand.Color);
    }
}

What is happening here is pretty self-explanatory, but I will explain it anyway. If we call Draw() and TryDequeue fails, there is no frame waiting, so we simply return: the out parameter does not hold a usable array of RenderCommands, and there is nothing to draw. If we successfully dequeue a frame, we let the CubePrimitive draw each of its commands.

Analysis

It seems to me that this implementation is easier to follow and implement than the previous one. However, I took some performance data, and this is the result:

Double buffer:

image

ConcurrentQueue:

image

Green means execution time, and red means synchronization time.

I think what these graphs tell us is that the double buffer is more efficient. ConcurrentQueue uses spin waits to synchronize, so its waiting shows up in the thread analysis as execution time. In the previous double buffer implementation the waiting shows up as synchronization time, meaning the threads are simply asleep.

Next I’m going to try with a concurrent collection that doesn’t use spin wait: BlockingCollection<T>.
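As a rough idea of what that could look like, here is a minimal sketch (not the implementation from the follow-up) that uses a BlockingCollection<T> bounded to one item, so that Add blocks while a frame is still waiting and the consumer blocks while nothing is available. BlockingFrameDemo and Run are hypothetical names, and int[] stands in for RenderCommand[].

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class BlockingFrameDemo
{
    // Produces frameCount frames on one thread, consumes them on the caller's
    // thread, and returns how many frames were consumed.
    public static int Run(int frameCount)
    {
        // Bounded capacity 1: at most one finished frame waits for the renderer.
        using (var frames = new BlockingCollection<int[]>(1))
        {
            var update = Task.Run(() =>
            {
                for (var i = 0; i < frameCount; i++)
                {
                    frames.Add(new[] { i }); // blocks while the collection is full
                }
                frames.CompleteAdding();     // lets the consuming loop finish
            });

            var drawn = 0;
            // Blocks while empty; ends once adding is complete and all items are taken.
            foreach (var frame in frames.GetConsumingEnumerable())
            {
                drawn++;
            }

            update.Wait();
            return drawn;
        }
    }
}
```

The bounded capacity takes over the role of the CanAcceptCommands() check: the update side cannot get more than one frame ahead, because Add does not return until there is room.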


4 Responses to “Multithreaded game engine–Using Concurrent collections instead”

  1. Hi. Did you do the test with BlockingCollection? I’m finding this series about multithreaded rendering very interesting as I am looking to do a similar thing in my projects moving forward.

  2. Have you measured the garbage generated by these approaches? Garbage collection can have a measurable effect on performance, so keeping allocations to a minimum (ideally to zero) is a target I aim for.

  3. roundcrisis

    Hi there,
    Thanks for the comments :D. This might be the kick in the arse I needed to go and do it. I'll do some measuring when I work on it this weekend.

    Cheers

  4. roundcrisis

    Hi there. I just implemented this with BlockingCollection. It was relatively simple, though it didn't get the results I thought it might … http://i.imgur.com/yNQDpQ4.png I didn't check the garbage yet… will post up soon enough. Promise 😀

