Cleanup Storage
Each recording exists as a file and an entry in the database. OpenReplay dumps what’s necessary to replay a session (DOM mutations, mouse coordinates, console logs, network activity and much more) into 3 files (2 for the replay itself and 1 for the DevTools data). By default, these files are stored on your instance, so they make up most of its storage. Session metadata is kept in the PostgreSQL database indefinitely, while the file containing the recording is expired (deleted) after 180 days through a MinIO lifecycle policy.
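If you want to see these files for yourself, they live in the mobs bucket of the bundled object store. The snippet below is a minimal sketch for a vanilla MinIO installation, run from a shell inside the MinIO container (the k9s steps for getting that shell are described further down this page):
# Register the local MinIO server under the alias "minio"
mc alias set minio http://localhost:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
# List a few of the recording files stored in the mobs bucket
mc ls minio/mobs | head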
Temporary storage cleanup
OpenReplay stores temporary data on the filesystem prior to processing and uploading it to the object storage service (MinIO or S3). Once processing is completed, this data is no longer needed, and a cron job will delete these files every 2nd day of the week.
If you wish to amend the cron job:
- Edit the configuration:
openreplay -e
- Change the cron job timing by appending the following lines:
utilities:
  # Clean up data every day at 3:05 am, server time
  cron: "5 3 * * *"
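Once the change has been applied, you can verify that the new schedule was picked up by listing the CronJobs on the cluster. This is only a sketch: the app namespace is the default one used by OpenReplay installations and the CronJob name can differ between releases, so adjust as needed:
# List CronJobs and their schedules in the application namespace
kubectl get cronjobs -n app
# Show the schedule field of every CronJob in that namespace
kubectl describe cronjob -n app | grep -i schedule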
There are two ways to clean up storage in your OpenReplay instance: automated (CLI) and manual.
Data cleanup (CLI)
This process is fully automated through our CLI. Simply run the command below to clean up your storage by removing data from both PostgreSQL (where events are stored) and MinIO (where recordings are saved):
# To clean data older than 14 days
openreplay --cleanup 14
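If you would rather not run this by hand, one option is to schedule the same command from the instance’s crontab. This is only a sketch: the log path is arbitrary, and cron usually needs the full path to the openreplay binary (check it with which openreplay):
# Edit the current user's crontab
crontab -e
# Example entry: every Sunday at 02:00, remove data older than 30 days
0 2 * * 0 openreplay --cleanup 30 >> /var/log/openreplay-cleanup.log 2>&1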
Data cleanup (Manual)
Data can be removed from both the database (where events are stored) and MinIO (where recordings are saved).
Recordings cleanup
If you ever need to free up some space, log in to your OpenReplay instance and follow the steps below:
- Run k9s -n db
- Use the keyboard arrows to navigate the list and get to the minio-* container
- Press s to get shell access to the MinIO (object storage) container
- Run mc alias set minio http://localhost:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
- Run mc rm --recursive --dangerous --force --older-than 7d minio/mobs (i.e. delete files that are older than 7 days)
- Use exit to exit the MinIO container
- Run :quit to exit the Kubernetes CLI
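Before running the destructive mc rm, it can be worth checking how much data the bucket holds and previewing what would be removed. The sketch below assumes a reasonably recent MinIO client; if your mc build does not support the --fake flag, mc find with --older-than gives a similar preview:
# Show the total size of the recordings bucket
mc du minio/mobs
# Dry run: list what would be removed, without deleting anything
mc rm --fake --recursive --dangerous --force --older-than 7d minio/mobs
# Alternative preview: list objects older than 7 days
mc find minio/mobs --older-than 7d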
Change default lifecycle policy
If you’re using MinIO (vanilla installation), you can change the default lifecycle policy this way:
- Run k9s -n db
- Use the keyboard arrows to navigate the list and get to the minio-* container
- Press s to get shell access to the MinIO (object storage) container
- Run mc alias set minio http://localhost:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
- To automatically clean recordings 14 days after creation, run:
export EXPIRATION_DAYS=14
export DELETE_JOB_DAYS=$((EXPIRATION_DAYS>30 ? 30 : EXPIRATION_DAYS))
cat <<EOF > /tmp/lifecycle.json
{
    "Rules": [
        {
            "Expiration": {
                "Days": $EXPIRATION_DAYS
            },
            "ID": "Delete old mob files",
            "Status": "Enabled"
        },
        {
            "Expiration": {
                "Days": $DELETE_JOB_DAYS
            },
            "ID": "Delete flagged mob files after ${DELETE_JOB_DAYS} days",
            "Filter": {
                "Tag": {
                    "Key": "to_delete_in_days",
                    "Value": "${DELETE_JOB_DAYS}"
                }
            },
            "Status": "Enabled"
        }
    ]
}
EOF
mc ilm import minio/mobs < /tmp/lifecycle.json
- Use exit to exit the MinIO container
- Run :quit to exit the Kubernetes CLI
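To double-check that the new rules are active, you can shell back into the MinIO container (same steps as above) and export the bucket’s lifecycle configuration. The exact subcommands vary between MinIO client versions; mc ilm ls, where available, prints a more readable table:
# Print the lifecycle rules currently attached to the recordings bucket
mc ilm export minio/mobs
# Or, on clients that support it, a tabular view
mc ilm ls minio/mobs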
Database cleanup (PostgreSQL)
Depending on your usage, data can be removed from various tables and in different ways.
Connect to PostgreSQL
Connect to your OpenReplay instance, then:
- Run k9s -n db
- Use the keyboard arrows to navigate the list and get to the postgresql-* container
- Press s to get shell access to the Postgres container
- Run PGPASSWORD=MY_PG_PASSWORD psql -U postgres (replace MY_PG_PASSWORD with the value of the postgresqlPassword variable from the /var/lib/openreplay/vars.yaml file)
- Execute your delete (or any other) query
- Type exit to exit the psql client
- Use exit to exit the Postgres container
- Run :quit to exit the Kubernetes CLI
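For one-off statements you don’t need an interactive session: psql -c runs a single query and exits. For example, to count how many sessions are currently stored (again replacing MY_PG_PASSWORD):
# Count recorded sessions without opening an interactive psql session
PGPASSWORD=MY_PG_PASSWORD psql -U postgres -c "SELECT count(*) FROM public.sessions;"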
Check table sizes
To check the table sizes, you can run the following query:
SELECT nspname AS "name_space",
relname AS "relation",
pg_size_pretty(
pg_total_relation_size(C.oid)
) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog','information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 20;
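To complement the per-table view, you can also check the size of the whole database, which is a convenient before/after measure for any cleanup. This sketch assumes the default database used by the OpenReplay installer; add -d if your data lives in a different one:
# Overall size of the current database, in a human-readable unit
PGPASSWORD=MY_PG_PASSWORD psql -U postgres -c "SELECT pg_size_pretty(pg_database_size(current_database()));"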
Delete specific events
We noticed that most OpenReplay users, after checking the results of the previous section, decide to remove specific events instead of cleaning up sessions (especially events.resources and events_common.requests).
To discard all event data, you can run any of the following queries, but keep in mind that this will affect card values, click maps, the events list and other features.
--- To delete all data related to a specific event type
-- The next 2 tables are usually the biggest ones, and they only affect some cards
TRUNCATE TABLE events.resources;
TRUNCATE TABLE events_common.requests;
-- The next table affects click maps and the events list in the session's replay
TRUNCATE TABLE events.clicks;
TRUNCATE TABLE events.errors;
TRUNCATE TABLE events.graphql;
TRUNCATE TABLE events.inputs;
TRUNCATE TABLE events.pages;
TRUNCATE TABLE events.performance;
TRUNCATE TABLE events.state_actions;
TRUNCATE TABLE events_common.customs;
TRUNCATE TABLE events_common.issues;
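If you would rather keep recent events and only drop old ones, you can delete by session age instead of truncating. The sketch below is not part of the official tooling: it assumes the events tables reference sessions through a session_id column and that start_ts is stored as epoch milliseconds (as in the query further down). It is slower than TRUNCATE, and the space is only reclaimed after a VACUUM (see below):
# Delete resource and request events belonging to sessions older than 90 days
PGPASSWORD=MY_PG_PASSWORD psql -U postgres <<'SQL'
DELETE FROM events.resources r
USING public.sessions s
WHERE r.session_id = s.session_id
  AND s.start_ts < extract(epoch from now() - interval '90 days') * 1000;
DELETE FROM events_common.requests r
USING public.sessions s
WHERE r.session_id = s.session_id
  AND s.start_ts < extract(epoch from now() - interval '90 days') * 1000;
SQL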
Delete specific sessions by time
If you want to clean all sessions, skip to the next part as it is faster and releases storage space instantly.
Use the SQL query below if you wish to clean up data from your database (PostgreSQL). Replace 2021-01-01 with the date from which to keep recordings. It’s a cascade delete, so all recordings as well as their corresponding events will be removed from the database.
--- Cascade delete all sessions and their related events captured before Jan 1st, 2021
DELETE FROM public.sessions WHERE start_ts < extract(epoch from '2021-01-01'::date) * 1000;
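If you prefer a rolling window over a fixed date, the same cutoff can be expressed relative to the current time; start_ts is stored as epoch milliseconds, hence the * 1000. A sketch that keeps only the last 90 days:
# Same cascade delete as above, but with a rolling 90-day window instead of a fixed date
PGPASSWORD=MY_PG_PASSWORD psql -U postgres -c "DELETE FROM public.sessions WHERE start_ts < extract(epoch from now() - interval '90 days') * 1000;"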
After running the previous query, the database will not release the storage space immediately; it schedules a cleanup for later. To force it to release the storage right away, you can run the following queries (note that VACUUM FULL takes an exclusive lock on each table while it runs):
--- Recreate indexes and free unused storage
VACUUM FULL public.sessions;
VACUUM FULL events_common.customs;
VACUUM FULL events_common.issues;
VACUUM FULL events_common.requests;
VACUUM FULL events.pages;
VACUUM FULL events.state_actions;
VACUUM FULL events.errors;
VACUUM FULL events.graphql;
VACUUM FULL events.performance;
VACUUM FULL events.resources;
VACUUM FULL events.inputs;
VACUUM FULL events.clicks;
Delete all sessions
Use the SQL query below if you wish to clean up all session data from your database (PostgreSQL). It’s a cascade delete, so all recordings as well as their corresponding events will be removed from the database.
--- Cascade delete all sessions and their related events
TRUNCATE TABLE public.sessions CASCADE;
TRUNCATE TABLE public.errors CASCADE;
TRUNCATE TABLE public.issues CASCADE;
TRUNCATE TABLE public.autocomplete;
Have questions?
If you have any questions about this process, feel free to reach out to us on our Slack or check out our Forum.