Parsing Qlik Sense healthcheck API results

2021-09-15

Abstract

When stress testing or troubleshooting a Qlik Sense node, it is useful to collect the responses of the healthcheck API and extract some useful information from them (which and how many applications were loaded in memory, and so on).

Collecting data

I usually use the command-line tool qsense for querying the Qlik Sense APIs.
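
qsense is a Python command-line tool; assuming it is still published on PyPI, it can be installed with

pip install qsense

To collect one healthcheck snapshot per minute: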

while true
do
    # append one snapshot per minute, one JSON object per line
    qsense healthcheck qlikhost1.redaelli.org ~/certificates/qlik/client.pem >> healthcheck.jl
    sleep 60
done

Each line of the file healthcheck.jl is a JSON object like this (truncated):

{
  "version": "12.763.10",
  "started": "20210915T165938.000+0200",
  "mem": {
    "committed": 72283.234375,
    "allocated": 118266.04296875,
    "free": 436586.79296875
  },
  "cpu": {
    "total": 0
  },
  "session": {
    "active": 1,
    "total": 63
  },
  "apps": {
    "active_docs": [
      "0599e6baa-3b4a-4648-bbdb-47013a02dc21"
    ],
    "loaded_docs": [
      "15e0547c-c4eb-4492-a1db-1603d8295423",
      "163777f8-9582-46b0-9418-a01f2d71c32d",
      "059e6baa-3b4a-4648-bbdb-47013a02dc21"
    ],
    "in_memory_docs": [
      "15e0547c-c4eb-4492-a1db-1603d8295423",
      "163777f8-9582-46b0-9418-a01f2d71c32d",
      "059e6baa-3b4a-4648-bbdb-47013a02dc21"
    ],
    "calls": 17126,
    "selections": 300
  },
  "users": {
    "active": 1,
    "total": 6
  },
  "cache": {
    "hits": 70,
    "lookups": 70,
    "added": 0,
    "replaced": 0,
    "bytes_added": 0
  },
  "saturated": false,

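Before writing any code, you can sanity-check the collected file with jq (assuming it is installed); for example, this counts the loaded documents in each snapshot:

jq '.apps.loaded_docs | length' healthcheck.jl
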
It can be useful to show application names instead of their IDs, so we can download the list of published applications with

qsense entity qlikhost1.redaelli.org ~/certificates/qlik/client.pem app --filter "published eq true" > app.json
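
The file app.json contains the full repository application objects; the only fields used below are id and name. Trimmed, it looks roughly like this (the application name here is invented):

[
  {
    "id": "15e0547c-c4eb-4492-a1db-1603d8295423",
    "name": "Sales Dashboard",
    ...
  },
  ...
]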

Extracting some info

With the following script

python healthcheck.py healthcheck.jl healthcheck-out

you can extract a semicolon-separated CSV with the columns

now; mem_free; session_active; session_total; users_active; users_total; app1; app2; app3; ...

Below is the source of the script:

# file: healthcheck.py
import sys

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

infile = sys.argv[1]
outfile = sys.argv[2]

spark = SparkSession.builder.getOrCreate()

# one healthcheck snapshot per line
df = spark.read.json(infile)

# published applications downloaded with qsense (id, name, ...)
apps = spark.read.json("app.json")

df.select(F.col("now"),
          F.col("mem.free").alias("mem_free"),
          F.col("session.active").alias("session_active"),
          F.col("session.total").alias("session_total"),
          F.col("users.active").alias("users_active"),
          F.col("users.total").alias("users_total"),
          F.col("apps.in_memory_docs"))\
    .withColumn("id", F.explode(F.col("in_memory_docs")))\
    .join(apps, "id", how="left")\
    .withColumn("fullname", F.concat("name", "id"))\
    .select(["now", "mem_free", "session_active", "session_total",
             "users_active", "users_total", "fullname"])\
    .groupBy(["now", "mem_free", "session_active", "session_total",
              "users_active", "users_total"])\
    .pivot("fullname").count()\
    .coalesce(1).write.mode("overwrite")\
    .option("sep", ";").option("header", "true").csv(outfile)
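
In the resulting CSV each published application becomes a column: since we group by the snapshot columns and pivot on fullname with a count, a cell contains 1 when that app was in memory at that timestamp and stays empty otherwise.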

When/how many times was the engine restarted? What happened just before?
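
The pandas script below (I will call it restarts.py, the name is arbitrary) scans the snapshots in chronological order, prints a banner every time the "started" timestamp changes, and shows which apps were active and loaded just before each restart:

python restarts.py healthcheck.jl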

import collections
import pprint
import sys

import pandas as pd


def load_data(infile):
    # one healthcheck snapshot (JSON object) per line
    return pd.read_json(infile, lines=True)


def load_apps():
    # published applications downloaded with qsense (not used in the main flow)
    return pd.read_json("app.json")


def parse_healthcheck(infile):
    def lists_diff(li1, li2):
        return list(set(li1) - set(li2)) + list(set(li2) - set(li1))

    df = load_data(infile)
    started = None
    active, loaded, last_active, last_loaded = [], [], [], []
    for _, row in df.iterrows():
        # a new "started" timestamp means the engine was restarted
        if row["started"] != started:
            print("******************************************")
            print("Engine started at {started}".format(started=row["started"]))
            print("******************************************")
            started = row["started"]
            print("Previous active:")
            print(active)
            print("Previous loaded:")
            print(loaded)
            last_active = last_active + active
            last_loaded = last_loaded + loaded
        new_active = row["apps"]["active_docs"]
        new_loaded = row["apps"]["loaded_docs"]

        delta_active = lists_diff(new_active, active)
        if delta_active != []:
            pprint.pprint(row["now"] + ": new active apps: " + str(delta_active))

        delta_loaded = lists_diff(new_loaded, loaded)
        if delta_loaded != []:
            pprint.pprint(row["now"] + ": new loaded apps: " + str(delta_loaded))

        active = new_active
        loaded = new_loaded
    pprint.pprint("Latest loaded apps: " + str(collections.Counter(last_loaded)))
    pprint.pprint("Latest active apps: " + str(collections.Counter(last_active)))
    pprint.pprint("Latest restarts: " + str(df.started.unique()))


if __name__ == "__main__":
    # execute only if run as a script
    infile = sys.argv[1]
    parse_healthcheck(infile)
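
The apps printed under "Previous active" and "Previous loaded" right before a restart banner are the best candidates for what the engine was doing when it went down, and the final Counter output shows which apps appear most often just before a restart.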
