Grafana

Repairing bitnami Grafana Database

Case study on how I repaired grafana.db

Ayush P Gupta
3 min read · Aug 14, 2024

So, I had the bitnami/grafana Helm chart deployed on my k8s cluster.
One day our Grafana stopped responding. The pod logs showed an error saying there was no space left on device.

Checking disk usage with du -ah --max-depth=1 showed that grafana.db was eating up the whole storage. A single alert rule had made the Grafana database grow quickly until the disk was full.
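To make the culprit stand out, the same du call can be sorted by size. This is only a small sketch: the helper name largest_entries is made up for illustration, and it assumes GNU du and sort (for --max-depth and -h).

```shell
# List everything directly under a directory, largest first.
# Usage: largest_entries DIR [COUNT]
# largest_entries is a hypothetical helper name, not a Grafana or Bitnami tool.
largest_entries() {
  du -ah --max-depth=1 "${1:-.}" | sort -rh | head -n "${2:-10}"
}
```

Inside the Grafana pod this would be called as largest_entries /opt/bitnami/grafana/data, the data path used later in this post.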

We can see grafana.db taking up 51 GB of storage, which is huge for a frontend application.

By default, Grafana internally uses SQLite as its database to record data such as users, alerts, permissions, etc.

One solution was to wipe everything and set it up again, but then my custom dashboard configuration would be lost, so I tried to fix the underlying issue instead.
Hence I decided to purge some tables to free up space.
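One way to decide which tables are worth purging is to count the rows in each of them. The sketch below is an assumption on my side rather than what was actually run: table_row_counts is a made-up helper, it needs the sqlite3 CLI, it only reads the database, and on a multi-GB file it can take a while.

```shell
# Print "table<TAB>rows" for every table in a SQLite file, biggest first.
# Usage: table_row_counts /path/to/grafana.db
# table_row_counts is a hypothetical helper; it only runs SELECT statements.
table_row_counts() {
  sqlite3 "$1" "SELECT name FROM sqlite_master WHERE type='table';" |
  while read -r table; do
    rows=$(sqlite3 "$1" "SELECT count(*) FROM \"$table\";" </dev/null)
    printf '%s\t%s\n' "$table" "$rows"
  done | sort -t "$(printf '\t')" -k2,2nr
}
```

Row counts are only a rough proxy for size on disk, but a table with millions of rows is usually the first place to look.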

I looked for a built-in tool or command from Grafana to clean things up, but couldn't find one. I even explored grafana-cli, but it didn't help.

But somehow I ended up frying my database, and the following error started appearing:

Error: database disk image is malformed

Upon searching, I found a way to back up the database using the sqlite3 CLI, which can be installed easily with:

apt update
apt install -y sqlite3

But wait. My Grafana pod didn't allow me to run these commands as root. I tried setting runAsUser: 0 in the pod's security context, but with that setting the Grafana pod failed to start.

So, after some ChatGPT-ing, I found a way: start a temporary pod as root, mount the same volume, then install and use sqlite3 there.

apiVersion: v1
kind: Pod
metadata:
  name: grafana-debug
  namespace: grafana
spec:
  containers:
    - name: debug-container
      image: bitnami/grafana:latest
      command: [ "sleep", "infinity" ]
      securityContext:
        runAsUser: 0
      volumeMounts:
        - name: grafana-data
          mountPath: /opt/bitnami/grafana/data
  volumes:
    - name: grafana-data
      persistentVolumeClaim:
        claimName: grafana
  restartPolicy: Never
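For completeness, here is roughly how such a pod can be started, entered and removed again. Treat it as a sketch: the file name grafana-debug.yaml and the two function names are my own placeholders, while the pod name and namespace come from the manifest above.

```shell
# Start the debug pod and open a root shell in it.
# Usage: start_debug_pod [MANIFEST]   (defaults to the hypothetical grafana-debug.yaml)
start_debug_pod() {
  kubectl apply -f "${1:-grafana-debug.yaml}" &&
  kubectl -n grafana wait --for=condition=Ready pod/grafana-debug --timeout=120s &&
  kubectl -n grafana exec -it grafana-debug -- bash
}

# Delete the debug pod once the repair is done.
remove_debug_pod() {
  kubectl -n grafana delete pod grafana-debug
}
```

Inside that shell, apt update and apt install -y sqlite3 work because the container runs as root. It is probably safer to have Grafana itself stopped while working on the file, so that nothing writes to grafana.db during the repair.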

With this, I first installed sqlite3 and then ran the following command:

sqlite3 database.db ".dump" | sqlite3 database.new

This command creates a new database by dumping the contents of the old one and replaying them into a fresh file. Some data may be lost due to malformed transactions, but for me it did the job. After this, I would rename the old database to grafana-old.db for safety and then rename database.new to grafana.db.

But there’s a catch: because my database was malformed, the dump script ended with a ROLLBACK statement, which left the whole new database at size 0.

To get around this, I came upon this Stack Overflow answer:

sqlite3 database.db ".dump" | sed -e 's|^ROLLBACK;\( -- due to errors\)*$|COMMIT;|g' | sqlite3 database.new

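This command adds sed, a stream editor, to the pipeline. It rewrites any line that consists only of the ROLLBACK statement into a COMMIT statement, so the rows that could be read are kept instead of being thrown away.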
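To see what the sed expression does on its own, it can be fed a couple of sample lines. The function name rollback_to_commit and the demo table are made up for illustration; the expression itself is the one from the command above.

```shell
# The sed expression from the recovery command, wrapped for reuse.
# rollback_to_commit is a hypothetical name used only in this example.
rollback_to_commit() {
  sed -e 's|^ROLLBACK;\( -- due to errors\)*$|COMMIT;|g'
}

# Feed it the tail of an imaginary dump from a damaged database.
printf '%s\n' \
  'INSERT INTO demo VALUES(1);' \
  'ROLLBACK; -- due to errors' | rollback_to_commit
# prints:
# INSERT INTO demo VALUES(1);
# COMMIT;
```

Because the pattern is anchored to the start and end of the line, a row that merely contains the word ROLLBACK in its data is left untouched.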

And guess what: after some time, my database was fully copied. I renamed the files as described above, and my Grafana was up and running again.
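Before doing the rename, it can be worth checking that the rebuilt file is not empty (the size 0 problem from earlier) and that SQLite considers it healthy. A minimal sketch, assuming both files sit in the same directory; swap_in_rebuilt_db is a made-up name, and PRAGMA integrity_check is a standard SQLite command that prints ok for a healthy database.

```shell
# Swap database.new in as grafana.db, keeping the old file as grafana-old.db.
# Usage: swap_in_rebuilt_db DATA_DIR
# swap_in_rebuilt_db is a hypothetical helper, not part of Grafana.
swap_in_rebuilt_db() {
  new_db="$1/database.new"
  # An empty file would still pass the integrity check, so test the size first.
  if [ ! -s "$new_db" ]; then
    echo "rebuilt database is missing or empty" >&2
    return 1
  fi
  result=$(sqlite3 "$new_db" "PRAGMA integrity_check;")
  if [ "$result" != "ok" ]; then
    echo "integrity check failed: $result" >&2
    return 1
  fi
  mv "$1/grafana.db" "$1/grafana-old.db" &&
  mv "$new_db" "$1/grafana.db"
}
```

In my setup the directory would be /opt/bitnami/grafana/data. The old file is kept on purpose, so there is something to fall back to if Grafana refuses the rebuilt database.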

Though this approach took a lot of time, I learned a lot about how to recover things after a disaster, which will definitely help me sometime in the future.

Voila! Both you and I learned something new today. Congrats
👏 👏 👏
