Migrating Plausible Analytics to a New Server

A step-by-step guide to migrating Plausible Analytics running inside Docker to a new server

Plausible Analytics is a simple, open-source web analytics tool designed to provide essential website metrics without compromising on user privacy. Unlike traditional analytics platforms, Plausible focuses on simplicity and ease of use, offering clear insights without tracking personal data or using cookies.

At Visini AG, we self-host Plausible Analytics for ourselves and our clients. This guide explains how we migrated Plausible from a multi-service server with virtual machines (VMs) to a dedicated machine that hosts Plausible exclusively via Docker, allowing for easier maintenance and version upgrades. In both the old and the new setup, Plausible and its associated services (Postgres, ClickHouse) run with Docker Compose.

Create a Backup of Plausible Data

Backing up Postgres and ClickHouse data ensures we preserve customer site configurations and historical analytics, which can be restored on the new machine. Data stored in Postgres includes settings and metadata, such as the sites and their configurations. ClickHouse stores the actual analytics data, such as page views and unique visitors.
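
Throughout this guide, containers are referenced by their IDs. On each machine, you can look up the IDs of the running Postgres, ClickHouse, and Plausible containers via docker ps:

sudo docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"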

Postgres Backup

First, let’s create a dump of the Postgres database. We need to connect to the Postgres container to create the dump. Let’s check which databases exist in the Postgres container, via the following command (replace the example container ID a52ab8083b6b with the one of your Postgres container):

sudo docker exec -t a52ab8083b6b psql -U postgres -l

This should provide an output similar to the following:

Name         | Owner    | Encoding |  Collate   |   Ctype    |
-------------+----------+----------+------------+------------+
plausible_db | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
postgres     | postgres | UTF8     | en_US.utf8 | en_US.utf8 |

Looks like we need to create a dump of plausible_db in order to save the configuration data. Let’s create a dump of the entire database, like so:

sudo docker exec -t a52ab8083b6b pg_dump -U postgres plausible_db > /home/ubuntu/plausible_db.sql
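
Before copying the dump anywhere, a quick sanity check ensures it is non-empty and contains the expected SQL:

# The dump should be non-empty and start with pg_dump header comments
ls -lh /home/ubuntu/plausible_db.sql
head -n 5 /home/ubuntu/plausible_db.sql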

Now let’s copy the dump from the virtual machine to the VM host, and then to the local machine (replace the placeholder addresses with the ones of your machines):

# First we copy the dump from the VM to the VM host
scp ubuntu@<vm-address>:/home/ubuntu/plausible_db.sql /home/ubuntu
 
# Now from the VM host to the local machine
scp ubuntu@<vm-host-address>:/home/ubuntu/plausible_db.sql ~/Downloads
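
Optionally, compare checksums to verify that the file survived both hops intact:

# On the VM
sha256sum /home/ubuntu/plausible_db.sql
# On the local machine (use shasum -a 256 on macOS)
sha256sum ~/Downloads/plausible_db.sql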

After completing the Postgres backup, the next step is to back up ClickHouse, which stores the analytics data.

ClickHouse Backup

Next, we need to create a dump of the ClickHouse data, i.e. the actual analytics data such as page views and unique visitors. The relevant tables are ingest_counters, sessions_v2, and events_v2.

Note: In case you’ve imported data from Universal Analytics (UA), please also migrate the imported_* tables accordingly; a scripted sketch follows the dump commands below. See a list of all tables via the following command (replace the example container ID aed6425a6303 with the one of your ClickHouse container):

sudo docker exec -it aed6425a6303 clickhouse-client --database plausible_events_db

Running the command show tables inside the client should provide an output similar to the following:

┌─name───────────────────────┐
│ events                     │
│ events_v2                  │
│ imported_browsers          │
│ imported_devices           │
│ imported_entry_pages       │
│ imported_exit_pages        │
│ imported_locations         │
│ imported_operating_systems │
│ imported_pages             │
│ imported_sources           │
│ imported_visitors          │
│ ingest_counters            │
│ schema_migrations          │
│ sessions                   │
│ sessions_v2                │
└────────────────────────────┘

Let’s focus on the ingest_counters, sessions_v2, and events_v2 tables. We can create dumps of these tables by running the following commands (replace the example container ID aed6425a6303 with the one of your ClickHouse container):

# Clean up any old dumps
sudo docker exec -it aed6425a6303 bash -c "rm -f /var/lib/clickhouse/user_files/*.native"
 
# Create dumps of the relevant tables
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT * FROM plausible_events_db.ingest_counters INTO OUTFILE '/var/lib/clickhouse/user_files/ingest_counters.native' FORMAT Native"
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT * FROM plausible_events_db.sessions_v2 INTO OUTFILE '/var/lib/clickhouse/user_files/sessions_v2.native' FORMAT Native"
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT * FROM plausible_events_db.events_v2 INTO OUTFILE '/var/lib/clickhouse/user_files/events_v2.native' FORMAT Native"
 
# Copy the dumps to the host
sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/ingest_counters.native ./ingest_counters.native
sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/sessions_v2.native ./sessions_v2.native
sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/events_v2.native ./events_v2.native
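
In case you did import data from Universal Analytics, the same pattern can be applied to the imported_* tables in a loop; a minimal sketch (the table list should match your show tables output):

# Dump and copy each imported_* table (adjust the list to your setup)
for table in imported_browsers imported_devices imported_entry_pages \
    imported_exit_pages imported_locations imported_operating_systems \
    imported_pages imported_sources imported_visitors; do
  sudo docker exec -it aed6425a6303 clickhouse-client --query \
    "SELECT * FROM plausible_events_db.$table INTO OUTFILE '/var/lib/clickhouse/user_files/$table.native' FORMAT Native"
  sudo docker cp aed6425a6303:/var/lib/clickhouse/user_files/$table.native ./$table.native
done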

Now let’s copy these files from the virtual machine to the VM host, and then to the local machine:

# First we copy the dumps from the VM to the VM host
scp ubuntu@<vm-address>:/home/ubuntu/ingest_counters.native /home/ubuntu
scp ubuntu@<vm-address>:/home/ubuntu/sessions_v2.native /home/ubuntu
scp ubuntu@<vm-address>:/home/ubuntu/events_v2.native /home/ubuntu
 
# Now from the VM host to the local machine
scp ubuntu@<vm-host-address>:/home/ubuntu/ingest_counters.native ~/Downloads
scp ubuntu@<vm-host-address>:/home/ubuntu/sessions_v2.native ~/Downloads
scp ubuntu@<vm-host-address>:/home/ubuntu/events_v2.native ~/Downloads
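
Before leaving the old server, it’s also worth recording row counts, so the import can be validated on the new server later:

sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT count() FROM plausible_events_db.events_v2"
sudo docker exec -it aed6425a6303 clickhouse-client --query "SELECT count() FROM plausible_events_db.sessions_v2"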

With the backups of the Postgres and ClickHouse data complete, we can now proceed to import them into the new server.

Import Backup Data into New Server

Now that we have the dumps of the Postgres and ClickHouse data, we can import them into the new machine, which should be running Plausible Analytics inside Docker. We first import the Postgres data, followed by the ClickHouse data.

Import Postgres Backup

First, let’s copy the dump of the Postgres database to the new server:

scp ~/Downloads/plausible_db.sql root@<new-server-address>:/root

Then we can import the dump into the Postgres container. Stop the Plausible container first to avoid conflicts, drop the existing database, and create a new one for a fresh start before importing. Use the following commands (replace the example container ID 5ab0dabcbaa4 with the one of your Postgres container, and 1455b8caae1c with the one of your Plausible container):

docker cp plausible_db.sql 5ab0dabcbaa4:/plausible_db.sql
docker stop 1455b8caae1c # Stop plausible container
sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -c "DROP DATABASE plausible_db"
sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -c "CREATE DATABASE plausible_db"
sudo docker exec -t 5ab0dabcbaa4 psql -U postgres -d plausible_db -f /plausible_db.sql
docker start 1455b8caae1c # Start plausible container
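
To confirm the import worked, we can list the tables and check that the site configurations are present (sites is one of the tables in Plausible’s Postgres schema):

docker exec -t 5ab0dabcbaa4 psql -U postgres -d plausible_db -c "\dt"
docker exec -t 5ab0dabcbaa4 psql -U postgres -d plausible_db -c "SELECT domain FROM sites;"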

Now the Postgres data has been imported into the new server. We can proceed to import the ClickHouse data.

Import ClickHouse Backup

Next, we need to import the ClickHouse data. We can copy the ClickHouse dumps to the new server:

scp ~/Downloads/ingest_counters.native root@<new-server-address>:/root
scp ~/Downloads/sessions_v2.native root@<new-server-address>:/root
scp ~/Downloads/events_v2.native root@<new-server-address>:/root

Then we can import the dumps into the ClickHouse container: truncate the existing tables, then import the data from the dumps. Use the following commands (replace the example container ID e92e926fb935 with the one of your ClickHouse container):

# Copy the dumps to the ClickHouse container
docker cp ingest_counters.native e92e926fb935:/root/ingest_counters.native
docker cp sessions_v2.native e92e926fb935:/root/sessions_v2.native
docker cp events_v2.native e92e926fb935:/root/events_v2.native
 
# Connect to the container and truncate the tables
docker exec -it e92e926fb935 clickhouse-client --query "TRUNCATE TABLE plausible_events_db.ingest_counters"
docker exec -it e92e926fb935 clickhouse-client --query "TRUNCATE TABLE plausible_events_db.sessions_v2"
docker exec -it e92e926fb935 clickhouse-client --query "TRUNCATE TABLE plausible_events_db.events_v2"
 
# Import the data from the dumps
docker exec -it e92e926fb935 bash -c "clickhouse-client --query 'INSERT INTO plausible_events_db.ingest_counters FORMAT Native' < /root/ingest_counters.native"
docker exec -it e92e926fb935 bash -c "clickhouse-client --query 'INSERT INTO plausible_events_db.sessions_v2 FORMAT Native' < /root/sessions_v2.native"
docker exec -it e92e926fb935 bash -c "clickhouse-client --query 'INSERT INTO plausible_events_db.events_v2 FORMAT Native' < /root/events_v2.native"

Now the ClickHouse data has been imported into the new server. We have successfully migrated Plausible Analytics to the new server.

Disable ClickHouse-internal Logging

In order to keep the ClickHouse-internal data (such as the trace_log and metric_log tables) from growing indefinitely, we can disable unnecessary logging.

Mount the following configuration file (clickhouse-config.xml) to the ClickHouse container:

<clickhouse>
    <logger>
        <level>warning</level>
        <console>true</console>
    </logger>
 
    <!-- Stop all the unnecessary logging -->
    <query_thread_log remove="remove"/>
    <query_log remove="remove"/>
    <text_log remove="remove"/>
    <trace_log remove="remove"/>
    <metric_log remove="remove"/>
    <asynchronous_metric_log remove="remove"/>
    <session_log remove="remove"/>
    <part_log remove="remove"/>
</clickhouse>

We can additionally disable logging of queries and query threads via clickhouse-user-config.xml, which should also be mounted to the ClickHouse container:

<clickhouse>
    <profiles>
        <default>
            <log_queries>0</log_queries>
            <log_query_threads>0</log_query_threads>
        </default>
    </profiles>
</clickhouse>
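
How the two files are mounted depends on your Compose file. A minimal sketch of the relevant service, assuming the directory layout of the plausible/hosting repository (service, volume, and file names may differ in your setup):

plausible_events_db:
  volumes:
    - event-data:/var/lib/clickhouse
    - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
    - ./clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro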

Clean up ClickHouse Data (Optional)

ClickHouse-internal data may also be cleaned up manually. This might be required periodically, especially if logging is not disabled as described above and the data grows to an unmanageable size (which happened in our case on the old server). We can connect to the container and run clickhouse-client to check the size of all tables, via the following command:

sudo docker exec -it aed6425a6303 clickhouse-client

Then we can run this query to get the size of all tables:

SELECT
    database,
    table,
    formatReadableSize(sum(bytes)) AS size
FROM system.parts
GROUP BY
    database,
    table
ORDER BY
    sum(bytes) DESC;

This should provide an output similar to the following:

┌─database────────────┬─table───────────────────┬─size───────┐
│ system              │ trace_log               │ 4.21 GiB   │
│ system              │ asynchronous_metric_log │ 2.89 GiB   │
│ system              │ metric_log              │ 2.45 GiB   │
│ system              │ query_log               │ 44.91 MiB  │
│ system              │ part_log                │ 27.36 MiB  │
│ plausible_events_db │ sessions_v2             │ 1.18 MiB   │
│ plausible_events_db │ events_v2               │ 1.17 MiB   │
│ plausible_events_db │ ingest_counters         │ 195.83 KiB │
│ system              │ asynchronous_insert_log │ 76.57 KiB  │
└─────────────────────┴─────────────────────────┴────────────┘

Wow! Almost 10GB of logs accumulated over the past 12 months. We can delete data from these tables by running the following queries:

aed6425a6303 :) TRUNCATE TABLE system.trace_log;
aed6425a6303 :) TRUNCATE TABLE system.asynchronous_metric_log;
aed6425a6303 :) TRUNCATE TABLE system.metric_log;
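
This cleanup can also be automated; a sketch of a weekly cron entry, assuming the ClickHouse container is named plausible_clickhouse (a hypothetical name, adjust to your setup):

# Truncate ClickHouse system logs every Sunday at 03:00 (container name is hypothetical)
0 3 * * 0 docker exec plausible_clickhouse clickhouse-client --multiquery --query "TRUNCATE TABLE system.trace_log; TRUNCATE TABLE system.asynchronous_metric_log; TRUNCATE TABLE system.metric_log"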

Conclusion

In this guide, we’ve covered the process of migrating Plausible Analytics to a new server. We created dumps of the Postgres and ClickHouse data, and then imported them into the new server. We also disabled ClickHouse-internal logging and cleaned up old data to ensure the system runs smoothly and doesn’t accumulate unnecessary data.

These steps will help you migrate Plausible Analytics seamlessly to a new server, ensuring no loss of configurations or historical data.
