Detailed frame-time performance measurement

Performance has always been an important aspect of game development. From the early days, where the processing power of hardware was extremely limited, to this day, where we may want to simulate hundreds of thousands of particles at once.

There are many ways to achieve good performance – and thus providing a better experience to the player. From following best practices and applying the right design patterns to using the best technologies.

Ultimately, we can – and should – only do so much before measuring our performance.

In general, the most powerful tool we can apply for this purpose is probably a good performance profiler. This way we can measure the performance of our entire program, down to specific lines of code that may be bottlenecks. Especially when it comes to micro-optimisation such a tool can be very valuable in identifying or confirming problematic code.

Profilers are not necessarily good at giving a higher level overview about our game’s performance, especially if we care less about pieces of code and more about semantic groupings like ‘physics’, ‘AI’ or ‘particles’.

Today we will develop a small class that will allow us to easily measure how much time our game spends on activities like that, and a frame-by-frame basis.

The resulting data could then be displayed or used to create graphs in real-time, allowing us to observe the performance of certain in-game actions and effects directly

Performance graph in Roche Fusion

The screenshot above shows how we use the system developed below in Roche Fusion to display the measured frame times in real-time, and split into a large number of activities. The graph shows the performance impact of a particularly large explosion. Check out this post on how we used GPU accelerated particles to virtually eliminate this graphical performance bottleneck.

Requirements

Before we start writing code, let us make a small list of requirements that we want our system to fulfil:

  • measure a list of defined activities
    We want to measure high-level semantic activities, not actual lines of code. It should be easy to add more activities by changing the code as little as possible.
  • easy to use
    If our system is difficult to use, it will only annoy us, and we may not use it consistently – or correctly – which will make it pointless. In particular we want to be able to easily switch between different activities.
  • fast
    We do not want our performance measuring system to take up more time than the activities it measures – this would be reasonably counter productive.

As it often happens, these requirements actually conflict with each other. The fastest solution will almost never be the one that is easiest to use, and the system will have to do a reasonable amount of book-keeping, if we want to be able to easily add more activities, and switch between them easily.

Our solution will thus be a compromise – and in fact out of many possible ones. I will make no claims to my approach being the best, however I do believe that the trade-offs we will make to make the system easy to use are worth the slight overhead.

Defining activities

We will start with the first of our requirements: We need to define a list of activities to measure. There are a number of ways of doing so, however there is one that provides just the right amount of flexibility, and allows a couple of neat tricks, while still being extremely fast to use: enumeration types or enums.

For this example, let us define a bunch of activities as follows:

enum MeasuredActivity
{
    None = 0,
    SimulateGame,
    SimulatePhysics,
    SimulateParticles,
    RenderGame,
    RenderParticles,
    PostProessing
}

To be able to keep track of separate times for each activity we need to keep a collection of multiple time totals. We could use a dictionary mapping activities to times, however that is overhead we can avoid (ab)using a property of enums:

Enums have numeric values that by default increment in the same order as we define them.

That means that by accessing the last defined value and casting it to an integer, we can find out how many activities there are:

int numberOfActivities = Enum.GetValues(typeof(MeasuredActivity))
    .Cast<int>.Last() + 1;

Alternatively we could add another value to our enum that we use to only use to get the total number. This is the solution I will go for since it allows us to define the number of activities statically:

enum MeasuredActivity
{
    None = 0,
    /* all the values from above */

    NumberOfActivities, // must be last
}

const int numberOfActivities = (int)MeasuredActivity.NumberOfActivities;

Which of these two solutions you like better is of course up to you.

Measuring activities

Now that we have out list of activities, and know how many there are – and in fact can index them using integers by casting the enum values to integers – we can simply keep an array of accumulated times.

TimeSpan[] measuredTimes = new TimeSpan[numberOfActivities];

Note that we are using the TimeSpan type to give our measured values a semantic meaning. It further allows us to not have to care about the unit of the stored values and let the type handle operations and conversions for us when needed.

We can now use the Stopwatch class to measure our time, wrapping it in a simple class with start and stop methods as follows:

class FrameTimer
{
    const int numberOfActivities = (int)MeasuredActivity.NumberOfActivities;

    private readonly TimeSpan[] measuredTimes =
        new TimeSpan[numberOfActivities];
    private readonly Stopwatch timer = new Stopwatch();

    private MeasuredActivity activeActivity = MeasuredActivity.None;

    public void StartActivity(MeasuredActivity activity)
    {
        this.timer.Restart();
        this.activeActivity = activity;
    }

    public void StopActivity()
    {
        this.measuredTimes[(int)this.activeActivity] += this.timer.Elapsed;
    }
}

In principle, this is all that we need. We can now call StartActivity when we start an activity we want to measure and call StopActivity afterwards to add up the measured time.

Stacking activities

However, there are still a lot of things that can go wrong. We will never measure time unless we specifically start the timer, and we will lose our time if we forget to call StopActivity.

There are many small things we can do to improve on these issues, but there is one larger change I would like to propose: instead of keeping track of only a single activity, let us instead use nested activities instead.

If we look at our measured activities above, we see that SimulatePhysics and SimulateParticles are really sub-activities of SimulateGame. We will incorporate this into our system by keeping track of multiple layers of activities.

To make sure this cannot be abused we will require that we always stop an activity before stopping its parent. Further, any activity that we start is assumed to be a child of the previous one.

Note that while we could do so, we will not actually accumulate the measured times of child activities into their parents. Instead the parent activities will take up the space between their children – including parts of the code that may not specify to be tracking a specific child activity. This will make sure that at the end of a frame, we will have measured all elapsed time exactly once, and by adding up the different numbers we can easily confirm the total frame time.

It would be easily possible to accumulate times into parent activities instead, but I have found it easier to work with the data in this way.

To then keep track of these nested – or stacked – activities we will use a simple stack and expand our class from above as follows:

class FrameTimer
{
    const int numberOfActivities = (int)MeasuredActivity.NumberOfActivities;

    private readonly TimeSpan[] measuredTimes =
        new TimeSpan[numberOfActivities];
    private readonly Stopwatch timer = new Stopwatch();
    private readonly Stack<MeasuredActivity> stackedActivities =
        new Stack<MeasuredActivity>(); 

    private MeasuredActivity activeActivity = MeasuredActivity.None;

    public void StartActivity(MeasuredActivity activity)
    {
        if (this.activeActivity == activity)
            throw new Exception("Cannot start activity that is already running!");

        this.measuredTimes[(int)this.activeActivity] += this.timer.Elapsed;
        this.timer.Restart();

        this.stackedActivities.Push(activity);
        this.activeActivity = activity;
    }

    public void StopActivity(MeasuredActivity activity)
    {
        if (this.activeActivity != activity)
            throw new Exception("Cannot stop activity that is not active!");

        this.measuredTimes[(int)this.activeActivity] += this.timer.Elapsed;
        this.timer.Restart();

        this.stackedActivities.Pop();
        this.activeActivity = this.stackedActivities.Peek();
    }
}

Notice how we now also accumulate time when starting a new activity, to make sure nothing falls through the cracks. In addition I added two checks to make sure we use the system as intended and do not nest the same activity multiple times, or stop an activity that is not actually running.

Something that is missing so far is being able to reset the measured time each frame. This can be done easily however:

public void Restart()
{
    for (int i = 0; i < this.MeasuredTimes.Length; i++)
    {
        this.measuredTimes[i] = new TimeSpan();
    }
    this.activeActivity = MeasuredActivity.None;
    this.stackedActivities.Clear();
    this.stackedActivities.Push(MeasuredActivity.None);
    this.timer.Restart();
}

The usage of our code is now reasonably simple:

For example, we might call StartActivity(MeasuredActivity.SimulateGame) and the appropriate StopActivity(MeasuredActivity.SimulateGame) around our main update loop. Different components of our game would then start and stop more specific activities as applicable – and as detailed as we like.

At the end of the frame we then have an array full of measured times that we can use as we like before resetting the timer.

Abusing the using statement

There is one trick we can use to make using our FrameTimer even easier.

This trick relies on C#’s using statement.

In essence, the using statement’s purpose is to allow for the temporary use of (unmanaged) resources, using the IDisposable interface.

For example, to keep our application’s memory footprint as low as possible, we may want to dispose of a font we only created to use a single time:

using(Font font = new Font("Arial", 14))
{
    /* do something with the font */
}

This will be translated into the following:

{
    Font font = new Font("Arial", 14);
    try
    {
        /* do something with the font */
    }
    finally
    {
        if(font != null)
            ((IDisposable)font).Dispose();
    }
}

Not worrying to much about the exact syntax inside the finally block, the key point here is that the using statement will make sure that Dispose() is called on the specified object after we exit the block inside. By using try finally this will even happen if we return or if the code throws an exception inside the block.

We will (ab)use this feature to allow for the following usage of our timer:

using(frameTimer.Measure(MeasuredActivity.SomeActivity))
{
    /* the code we want to measure */
}

To make this possible, we will need a custom type inheriting from IDisposable. To prevent the creation of lots of small objects, we will make this type a structure and nest it in our FrameTimer class.

struct FrameTimerActivityHandle : IDisposable
{
    private readonly FrameTimer timer;
    private readonly MeasuredActivity activity;

    private FrameTimerActivityHandle(FrameTimer timer, MeasuredActivity activity)
    {
        this.timer = timer;
        this.activity = activity;
        timer.StartActivity(activity);
    }

    public void Dispose()
    {
        timer.StopActivity(activity);
    }
}

The FrameTimer.Measure() method looks as follows.

public FrameTimerActivityHandle Measure(MeasuerdActivity activity)
{
    return new FrameTimerActivityHandle(this, activity);
}

Note that this method has to return our FrameTimerActivityHandle structure, to make sure the using block can call Dispose() on it. We do not need to store it in a variable however, since we do not need to access the handle ourselves. Dispose() will still be called.

Conclusion

I hope this post has given you insight into the proposed solution to measuring frame times.

If you have an alternative solution, or think that the one proposed here has flaws, feel free to let me know in the comments! I am always interested to hear how others solve the problems I encounter myself.

Enjoy the pixels!

Leave a Reply