Postgres SQL
Analyzing Postgres Queries Performance with Grafana (Part 1)
Postgres Performance Analysis
Monitoring database performance is never child’s play. The database sits at the lowest level of a system’s design, so it must be set up properly and work efficiently; a fast API response is generally backed by a fast query or an efficient database setup.
Postgres provides an excellent out-of-the-box way to monitor queries, indexes, and overall database performance.
Stats
There are two main stats tables (technically, views) that provide statistics about individual queries and overall database performance:
- pg_stat_activity
- pg_stat_statements
There are other stats tables as well, but the two above cover most of our monitoring use cases.
pg_stat_activity
This table reflects the processes and activity currently going on in the database. In one line: it shows which queries are running in the database in real time.
If we simply run SELECT on this table:
select * from pg_stat_activity
We get all the processes currently running in the database, both active and idle ones, along with every column of the table.
Since that output is raw data, a better query is:
SELECT now() - pg_stat_activity.query_start AS duration,
       usesysid,
       usename,
       client_addr,
       pid,
       state,
       datname,
       query,
       backend_type
FROM pg_stat_activity
ORDER BY duration DESC;
The result is far better and more meaningful:
Here is how to interpret the columns:
- duration: elapsed time since this query (process) started
- usesysid, usename: database user id and user name
- pid: unique process id
- state: current state of the process (active/idle)
- datname: name of the database the process is running in
- query: the actual query text (truncated)
There are several other columns as well (see the table's full definition), and you are always free to use them too.
Checkpoints/Best Practices
- Always maintain unique (separate) credentials for every consumer (client) of the database, and NEVER share the same credentials between two different clients. This way you can track which client is running long queries, if any.
- The duration of active queries must not be very long; ideally it should stay around 30 seconds to 1 minute, depending on your requirements. The sketch after this list shows one way to spot offenders.
- An idle state simply represents an idle connection. Too many idle connections also eat up your database's hardware resources.
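As an illustration, here is a minimal sketch that lists active queries running longer than a threshold; the 1-minute interval is only an example value, so tune it to your own limits:
-- Minimal sketch: list active queries that have been running for more
-- than 1 minute (the threshold is an example; adjust it to your limit).
SELECT pid,
       usename,
       datname,
       now() - query_start AS duration,
       query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 minute'
ORDER BY duration DESC;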
Bonus
You can even set up a Grafana dashboard for this query.
This way you can regularly watch the live status of your Postgres queries and hunt down the SLOW ones.
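For example, you could add a Table panel backed by a PostgreSQL data source and paste in a panel query along these lines; this is only a sketch, and the panel layout, row limit, and refresh interval are up to you:
-- Sketch of a panel query for a Grafana Table panel using a PostgreSQL
-- data source; it mirrors the pg_stat_activity query above.
SELECT now() - query_start AS duration,
       usename,
       client_addr,
       pid,
       state,
       datname,
       left(query, 100) AS query
FROM pg_stat_activity
ORDER BY duration DESC NULLS LAST;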
pg_stat_statements
Postgres stores query performance stats in aggregate form in this table. By aggregate we mean that the metrics are accumulated per query (totals, means, and call counts across all executions) rather than recorded for every individual execution.
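Note that pg_stat_statements ships as an extension and is not enabled by default. If the table is missing or empty, a typical setup (assuming you can edit postgresql.conf and restart the server) looks like this:
-- pg_stat_statements must be preloaded at server start.
-- In postgresql.conf (requires a restart):
--   shared_preload_libraries = 'pg_stat_statements'
-- Then create the extension once in the database you want to monitor:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;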
If we simply run SELECT on this table:
select * from pg_stat_statements
We get the raw output with all of the table's columns.
Again, the raw results are not very meaningful, so let's run a more realistic and meaningful version of this query:
SELECT pss.userid,
       pss.dbid,
       pd.datname AS db_name,
       round((pss.total_exec_time + pss.total_plan_time)::numeric, 2) AS total_time,
       pss.calls,
       round((pss.mean_exec_time + pss.mean_plan_time)::numeric, 2) AS mean,
       round((100 * (pss.total_exec_time + pss.total_plan_time) / sum((pss.total_exec_time + pss.total_plan_time)::numeric) OVER ())::numeric, 2) AS cpu_portion_pctg,
       pss.query
FROM pg_stat_statements pss
JOIN pg_database pd ON pd.oid = pss.dbid
ORDER BY (pss.total_exec_time + pss.total_plan_time) DESC
LIMIT 100;
Don't be intimidated by the query above; it simply performs some arithmetic such as sums, means, and a percentage of the total.
The output is in a much more meaningful format.
The results are fairly self-explanatory:
- The first query takes the largest share of CPU time (1.78%) among all queries. Its mean response time is 238 ms, and its total call/query count is 147,850.
- The first query is issued by the client with user id 6794153.
To get the username:
SELECT usename, usesysid FROM pg_catalog.pg_user ORDER BY usesysid DESC;
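If you prefer, you can resolve the user name directly in the stats query. The following is just a sketch that joins pg_stat_statements to pg_catalog.pg_user; the LIMIT of 20 is an arbitrary example:
-- Sketch: resolve user names while listing the heaviest queries.
SELECT pu.usename,
       pss.calls,
       round((pss.total_exec_time + pss.total_plan_time)::numeric, 2) AS total_time,
       pss.query
FROM pg_stat_statements pss
JOIN pg_catalog.pg_user pu ON pu.usesysid = pss.userid
ORDER BY total_time DESC
LIMIT 20;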
Bonus
You can try creating a Grafana dashboard for easy monitoring:
The queries with the highest CPU share are the first candidates for optimization, and so on.
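Keep in mind that these statistics accumulate from the time the extension was installed (or last reset). If you want a review period to start from a clean slate, you can reset them, assuming you have the required privileges:
-- Discard all statistics gathered so far by pg_stat_statements
-- (by default only superusers may execute this; EXECUTE can be granted to other roles).
SELECT pg_stat_statements_reset();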
Conclusion
Both the pg_stat_activity and pg_stat_statements tables can be used together to analyze query performance. One can easily work out which queries are the heaviest and optimize them to minimize CPU usage, and even track which clients are actually running the slow queries.
There are several other monitoring stats too, such as index stats, idle connections, and pending transactions; we shall talk about them in the next blog of this series.
Voila! Both you and I learned something new today. Congrats!
👏 👏 👏