Parametric GPU-accelerated particles

Last week I wrote about how we can use parametric equations in particle systems.

Doing so allowed us to eliminate mutability from our particles. In that post I already hinted that this property makes it easy to move our particle simulation to the GPU.

That is what we will do today!

Here is a sneak peek of my basic implementation (all code below!):

The goal in this post is to upload the particles of, for example, an explosion to the GPU only a single time, and then simulate and render them with minimal effort until all the particles have disappeared.

We will make one key assumption that while not strictly speaking necessary, will make our code much simpler for this example:

All our particles will be spawned in large groups, all at the same time, and then stay alive for about the same time as other particles in their group.

More complex creation behaviour is possible as well, but let us get the basics down first. And even though this may seem quite limiting, it is in fact already very powerful.

For example, most of the particles in Roche Fusion are simulated with exactly the same technique as described below.

GPU particles in Roche Fusion

The general idea

Before we start, let's step through the main points of what we are going to implement. Our particles will be created, simulated and drawn as follows (see the host-side sketch after the list):

  1. Each time we spawn a group of particles, we upload a vertex buffer to the GPU, where each vertex represents one particle, containing its starting parameters.
  2. Each frame we render that vertex buffer. In the vertex shader each vertex represents one particle, which we move according to the elapsed time using our parametric equations.
  3. The correctly positioned particle is then passed on to a geometry shader. Here we expand the particle – still a single vertex – into a screen-aligned quad.
  4. Lastly we use a regular fragment shader to render our particle quads as usual.
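
To make these steps more concrete, here is a minimal host-side sketch of steps 1 and 2. It uses raw OpenTK-style GL calls rather than my library, and names like `ParticleGroup` and the `time` uniform are illustrative assumptions; `ParticleVertex` is the vertex struct sketched in the implementation section below:

```csharp
using System;
using OpenTK.Graphics.OpenGL4;

// Hypothetical host-side wrapper for one group of particles.
class ParticleGroup
{
    readonly int vertexBuffer;
    readonly int vertexCount;
    readonly float spawnTime;

    // Step 1: upload the particle group to the GPU a single time.
    public ParticleGroup(ParticleVertex[] particles, float spawnTime)
    {
        this.vertexCount = particles.Length;
        this.spawnTime = spawnTime;

        this.vertexBuffer = GL.GenBuffer();
        GL.BindBuffer(BufferTarget.ArrayBuffer, this.vertexBuffer);
        GL.BufferData(BufferTarget.ArrayBuffer,
            (IntPtr)(this.vertexCount * ParticleVertex.SizeInBytes),
            particles, BufferUsageHint.StaticDraw); // never modified again
    }

    // Step 2: each frame, only a single time uniform changes;
    // the vertex data itself stays untouched on the GPU.
    public void Draw(int shaderProgram, float currentTime)
    {
        GL.UseProgram(shaderProgram);
        GL.Uniform1(
            GL.GetUniformLocation(shaderProgram, "time"),
            currentTime - this.spawnTime);

        GL.BindBuffer(BufferTarget.ArrayBuffer, this.vertexBuffer);
        // (vertex attribute setup for ParticleVertex omitted for brevity)
        GL.DrawArrays(PrimitiveType.Points, 0, this.vertexCount);
    }
}
```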

Some notes and tricks

There are a couple of additional details we will include in our implementation of the above algorithm. Most notably, geometry shaders have the ability to not emit any geometry, effectively discarding the particle before it is even passed to the vertex post-processing pipeline.

We will calculate the correct alpha value of our fading particles in the vertex shader and then use this feature to discard particles with non-positive alpha, which prevents us from rendering particles that would not be visible in the first place.

There is no obvious best way to split the work between the vertex and geometry shaders. In principle, we could shift all the processing to the geometry shader, but since vertex shaders cannot be skipped, we might as well use them and simplify our code accordingly.

Note that if we split the work between the two shaders correctly, we could use the same geometry and fragment shaders – and a simple pass-through vertex shader – to also render regular sprites simulated on and streamed from the CPU.

This is useful to avoid duplicate code when the same specialised shaders are used to render both GPU particles and other sprites.
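
For illustration, a pass-through vertex shader like that could be as simple as the following sketch (the attribute and output names are placeholders chosen to match the shader sketches further below):

```glsl
#version 150

in vec3 v_position;
in float v_alpha;

out vec3 p_position;
out float p_alpha;

// already-simulated sprites: just hand the data to the geometry shader
void main()
{
    p_position = v_position;
    p_alpha = v_alpha;
}
```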

Implementation

To implement particles as outlined above, I used my C# OpenGL graphics library, which is available on GitHub. However, it should not be difficult to translate the code below to a different shader language or framework.

The example I implemented can be found on GitHub as well. It is a small program that allows the user to spawn particles in batches of a thousand.

After the post on parametric particles from last week I was asked to show an example of a non-linearly moving particle. So, the particles in the example are initialised with randomised parameters and then follow a parabolic path – as if affected by gravity – before fading away.

The result is not terribly pretty but serves to demonstrate the basic implementation well.

Particle Vertex

To start our implementation, we need a vertex representing our particle.

In our case, each particle needs to know its initial position, velocity, and its lifetime. We could include a lot more attributes like size, colour, and UV coordinates, but for this post we will focus on the essentials.
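
As a rough sketch, such a vertex could be written as a plain C# struct like the one below. The actual example uses my library's helpers; here the struct is spelled out by hand, and the OpenTK `Vector3` type is assumed:

```csharp
using System.Runtime.InteropServices;
using OpenTK; // for Vector3

// One particle; uploaded to the GPU once and never modified afterwards.
[StructLayout(LayoutKind.Sequential)]
struct ParticleVertex
{
    public Vector3 Position; // initial position
    public Vector3 Velocity; // initial velocity
    public float Lifetime;   // seconds until the particle has fully faded

    public static readonly int SizeInBytes =
        Marshal.SizeOf(typeof(ParticleVertex));

    public ParticleVertex(Vector3 position, Vector3 velocity, float lifetime)
    {
        this.Position = position;
        this.Velocity = velocity;
        this.Lifetime = lifetime;
    }
}
```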

Note that I use my library to easily create a vertex attribute array and determine the byte size of the vertex. These are needed to pass the data on to OpenGL.

Shaders

Once we have our vertex, we can move on to the real meat of this post: the shaders.

Vertex shader

In the vertex shader we take the vertex from above and calculate its current position and transparency using a uniform time value. These are then passed on to the geometry shader for expansion into the particle sprite.
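
Here is a sketch of what this can look like, assuming a `time` uniform holding the seconds since the particle group was spawned, and a hard-coded gravity constant; the attribute and output names are my own choices for this example:

```glsl
#version 150

uniform float time; // seconds since this particle group was spawned

const vec3 gravity = vec3(0, -10, 0);

in vec3 v_position; // initial position
in vec3 v_velocity; // initial velocity
in float v_lifetime;

out vec3 p_position;
out float p_alpha;

void main()
{
    // parametric parabolic path: x(t) = x0 + v0 * t + 0.5 * g * t^2
    p_position = v_position
        + v_velocity * time
        + 0.5 * gravity * time * time;

    // fade out linearly; non-positive once the lifetime is over, which
    // tells the geometry shader to discard the particle entirely
    p_alpha = 1.0 - time / v_lifetime;
}
```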

Geometry shader

The input for our geometry shader is the positioned particle from the vertex shader. We take this particle and first transform its position into camera space.

We then expand it into a quad of four vertices. Doing so in camera space means that our particle will always be drawn aligned with the screen.

Lastly, we apply the projection transformation of our camera to each vertex and emit them.

We also include simple UV coordinates which we will use in the fragment shader.

Note that we return early – before emitting any geometry – if the particle in question has a non-positive alpha value, meaning that it has faded out.
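
Putting these steps together, a sketch of the geometry shader could look like the following, assuming uniforms for the modelview and projection matrices and a fixed particle size:

```glsl
#version 150

layout (points) in;
layout (triangle_strip, max_vertices = 4) out;

uniform mat4 modelviewMatrix;
uniform mat4 projectionMatrix;

const float size = 0.5; // half the side length of the particle quad

in vec3 p_position[];
in float p_alpha[];

out vec2 f_uv;
out float f_alpha;

void main()
{
    // faded out: emit nothing, discarding the particle early
    if (p_alpha[0] <= 0.0)
        return;

    // transform the particle's centre into camera space
    vec4 center = modelviewMatrix * vec4(p_position[0], 1.0);

    // expand into a screen-aligned quad of four corners
    for (int x = 0; x < 2; x++)
        for (int y = 0; y < 2; y++)
        {
            vec2 offset = vec2(x * 2 - 1, y * 2 - 1) * size;
            gl_Position = projectionMatrix * (center + vec4(offset, 0, 0));
            f_uv = vec2(x, y);
            f_alpha = p_alpha[0];
            EmitVertex();
        }
}
```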

Fragment shader

The fragment shader is the least interesting of the three. To get output that is slightly more appealing than flat quads, I wrote a small shader that uses the UV coordinates of the fragment to render the particles as blurry circles.
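
Here is a minimal sketch of such a shader, fading the particle out towards the rim of its quad:

```glsl
#version 150

in vec2 f_uv;
in float f_alpha;

out vec4 fragColor;

void main()
{
    // distance from the quad's centre, scaled so that 1 is the rim
    float d = length(f_uv - vec2(0.5)) * 2.0;

    // opaque in the centre, fading to fully transparent at the rim
    float a = clamp(1.0 - d, 0.0, 1.0) * f_alpha;

    fragColor = vec4(1.0, 1.0, 1.0, a);
}
```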

Results

Putting this all together, the result looks as follows:

At its peak, this video shows about 300 thousand particles at the same time. The slow motion at the end is on purpose, and not related to the number of particles.

For the exact details of how I use my library to put the code in this post together, feel free to check out the repository of the example project.

If you would like to know more about how to use OpenGL in C#, I suggest you check out my in-depth post on the topic, where I build and explain a small object-oriented framework from scratch.

Performance

While I have not talked about it much, the main reason to move particles – or indeed anything at all – to the less flexible GPU is performance.

I did not do any tests or comparisons on the particles from this post (let me know if you would like a post on that!), but I hope that being able to simulate and render 300 thousand particles on my laptop with a very basic implementation speaks for itself.

However, when I first investigated this topic to fight performance problems in Roche Fusion, I did a variety of tests. Below you can see a frame time graph of a major explosion (~5000 particles) before and after switching to GPU particles.

Performance before and after GPU particles in Roche Fusion

The difference is clearly significant.

For more details on the more complex GPU particles in Roche Fusion, you can find a write-up, including shader code, on the Roche Fusion devlog.

Conclusion

I hope this post has been interesting and useful.

Let me know if you would like me to go into more detail on any of the aspects discussed above, or anything else related.

If you have used or are using the GPU to simulate particles yourself, leave a comment below and let me know what your experience has been!

Enjoy the pixels!
