API Performance Problems: Lets Capture More Data

The performance tests were failing seemingly randomly. No one on my team knew why our API would suddenly fail on the 99th or 95th percentiles. They wanted easy access to data and visualisations.

We use Gatling for our performance tests and our service runs in Google Cloud Platform (GCP). What if performance statistics from Gatling could be captured and stored in GCP to be visualised using a self-service approach. Turns out they can.

Solution Overview

Gatling stores test statistics in JSON files on disk (more below) those stats can be parsed, stored in GCP Datastore and then using a Datastore backup (in GCP Storage) the results can be loaded into GCP Big Query. From there you can use Google Sheets to load the data into a spreadsheet. Once in Google Sheets, you can do all the usual things you might do with a spreadsheet. Create some pretty graphs, a pie chart maybe.

Datastore --> Storage (Bucket) --> BigQuery --> Google Sheets

That sounds like a lot of moving parts and it is. But the moving parts individually are simple and work together well. The individual tasks are not hard to setup. Let’s start with Gatling.

Gathering Data from Gatling

The first stage is to get the performance stats from Gatling. Starting in the report folder generated by your Gatling test there are two key files that I used for stats:

./js/assertions.json

This file contains the details of all the assertions you make within your test. This is handy as, chances are all the performance stats you care about are in here as your test asserts on them.

JSON Structure

The file contains one JSON object with the following structure:

{
    "simulation": "simulation-name",
    "simulationId": "simulation-id",
    "start": 1576246324329,
    "description": "",
    "scenarios": [
        "scenario1",
        "scenario2"
    ],
    "assertions": [
        {
            "path": "path_to_assertion_target",
            "target": "percentage of successful events",
            "condition": "is greater than",
            "expectedValues": [
                99.9
            ],
            "result": false,
            "message": "path_to_assertion_target: percentage of successful events is greater than 99.9",
            "actualValue": [
                0.0
            ]
        }
    ]
}

./js/global_stats.json

If you don’t assert on the values you want to capture (it is perhaps worth double-checking why you care about them if you don’t assert on them) then your second source of data is global_stats.json. It contains the key stats you might need such as percentiles and response times.

However, one thing to note is the percentile fields follow a Gatling naming convention instead of a clearer 99th, 95th and so on. Use this mapping to help:

Percentile Gatling Percentile Name
50th Percentiles 1
75th Percentiles 2
95th Percentiles 3
99th Percentiles 4

Also note the information on groups are setup either using the defaults or by what is in your Gatling config but the naming is consistent regardless (e.g. group1, group2, group3 and group4).

JSON Structure

The file contains one JSON object with multiple sub-objects within it:

{
    "name": "Global Information",
    "numberOfRequests": {
        "total": 1460,
        "ok": 0,
        "ko": 1840
    },
    "minResponseTime": {
        "total": 41,
        "ok": 0,
        "ko": 41
    },
    "maxResponseTime": {
        "total": 3550,
        "ok": 0,
        "ko": 3550
    },
    "meanResponseTime": {
        "total": 146,
        "ok": 0,
        "ko": 146
    },
    "standardDeviation": {
        "total": 365,
        "ok": 0,
        "ko": 365
    },
    "percentiles1": {
        "total": 63,
        "ok": 0,
        "ko": 63
    },
    "percentiles2": {
        "total": 72,
        "ok": 0,
        "ko": 72
    },
    "percentiles3": {
        "total": 761,
        "ok": 0,
        "ko": 761
    },
    "percentiles4": {
        "total": 9999,
        "ok": 0,
        "ko": 9999
    },
    "group1": {
    "name": "t < 55 ms",
    "count": 0,
    "percentage": 0
},
    "group2": {
    "name": "55 ms < t < 75 ms",
    "count": 0,
    "percentage": 0
},
    "group3": {
    "name": "t > 75 ms",
    "count": 0,
    "percentage": 0
},
    "group4": {
    "name": "failed",
    "count": 18191,
    "percentage": 100
},
    "meanNumberOfRequestsPerSecond": {
        "total": 12,
        "ok": 0,
        "ko": 9
    }
}

Pushing Gatling Data into GCP Datastore

With the data found, we can move it into GCP Datastore. Briefly, GCP Datastore is a NoSQL database offered by GCP. It is perfect for storing data like this where we don’t care about modding data relationships and just storing the data.

Personally I used Python to parse the Gatling JSON files and push the data into Datastore. At the time of writing the python packages I used for GCP were:

google-api-core==1.14.3
google-auth==1.8.2
google-cloud-core==1.1.0
google-cloud-datastore==1.10.0
googleapis-common-protos==1.6.0

These things update all the time so check the latest so check pypi package listings for the latest versions to use.

If you are new to Python take a look at Pip and VirtualEnv which makes managing your python packages easy.

It is a bit beyond the scope of this post to teach GCP & Datastore but assuming you have some knowledge of both these topics then this example should get you started:

# Dict containing my stats...
perf_stat = {}
stat_key = "MyStatKey"

# Code to populate dict from JSON files... omitted

# Create, populate and persist an entity
client = datastore.Client(namespace="mynamespace", project="myproject")
stat_key = client.key('MyDataKind')
entity = datastore.Entity(key=stat_key)
entity.update({
    'testRunId': perf_stat['testRunId'],
    'lastUpdated': dt2ts(datetime.now()),
    'testRunStartTimestampUtc': perf_stat['testRunStartTimestampUtc'],
    'global_min_response_time_ms': perf_stat['global_min_response_time'],
    'global_max_response_time_ms': perf_stat['global_max_response_time'],
    'global_mean_response_time_ms': perf_stat['global_mean_response_time'],
    'global_mean_requests_per_second': perf_stat['global_mean_requests_per_second']
})
client.put(entity)

Backing Up the Data to GCP Storage

With the stats captured in Datastore, you could stop here and tell the rest of your team to access the data via GCP Cloud Console. That would work but (particularly with lots of stats captured) the data can be hard to read.

Instead, we are going to load a backup of the data into a bucket using Storage. Using this backup we can load the data into BigQuery and view it using Google Sheets.

Create a GCP Storage Bucket

First, create a Bucket in Storage to store the backup. I also set the retention policy on our Bucket to 7 days because in our implementation this data was loaded into BigQuery as soon as the backup was completed. Once the data is in BigQuery the backup is no longer needed.

Create a Datastore Backup into your Bucket

With the bucket created there are a number of ways to create a datastore backup to it. But I used the command line during our deployment pipeline.

Create a bash file and first make sure you are authenticated using the pipeline:

gcloud auth activate-service-account --key-file=my_key.json

Now add to your bash file the command to start the backup:

# Backup the perf data to a bucket and wait
urlPrefix=$(gcloud datastore export --kinds='MyDataKind' --namespaces='MyNamespace' --format=json gs://my-bucket-gs-url | jq --raw-output '.metadata.outputUrlPrefix')

The key part of that snippet is the usage of jq (which you may need to install). It grabs the first part of the URL returned by the gcloud command which tells you where your backup is stored. You will need that for the next part.

Loading the Data into GCP BigQuery

With the backup now in your bucket here is the final part. Loading the data into BigQuery. BigQuery is a serverless data warehouse solution provided by GCP. It looks very powerful and complicated but for our purpose it is actually quite simple.

First using the GCP Cloud Console navigate to BigQuery and create a dataset and table within it to store your data in.

Update your bash script (the same one with the backup logic) to load the data into BiqQuery:

# Import datastore backup into big data
bq --location=europe-west2 load \
--source_format=DATASTORE_BACKUP \
--replace \
mynamespace:bigquery_dataset_name.bigquery_table_name \
$urlPrefix/namespace_mynamespace/kind_MyDataKind/namespace_mynamespace_kind_MyDatakind.export_metadata

The key things to note in the snippet above:

  • It uses $urlPrefix defined earlier in your bash script to know where to look for the backup.
  • The bq command used to load the backup into BigQuery has the --replace flag passed to it. This is important for us as it we never want to append data to BigQuery - just use the latest backup.

With that done your data is in BigQuery and ready to use by Google Sheets.

Bonus! Accessing the GCP BigQuery Data from Google Sheets

Google Sheets can access data stored in BigQuery. It does so using a Data Connector found under the Data menu.

This is a simple wizard to follow. Tell it the dataset and table you want and the query you want to use to get your data and that is it. The data is inserted in a Sheet and you are ready to go.

When you want to get the latest version of the data just hit the Refresh button on the bottom left.

Wrapping Up

By following this post you have (or are perhaps planning to) string together a couple of GCP services in a relatively simple way:

Datastore --> Storage (Bucket) --> BigQuery --> Google Sheets

The reward for this plumbing work is a more friendly way to allow your whole team to access performance data. Everyone on your team is probably familiar with spreadsheets. The less technical members of your team don’t need to understand the other services sitting underneath delivering this information to the sheet. Finally, it can all be orchestrated via your deployment pipeline.

It is still early days for my team but so far letting them manipulate performance data themselves has proven quite useful. It is self-service and they can create any number of visualisations they like using a familiar tool.