PostgreSQL Disk Cleanup: Real-World Troubleshooting and Resolution

This document walks through a full session where we tackled extreme disk usage caused by bloated PostgreSQL tables and indexes. We used `pg_repack`, SQL archiving, and careful investigation of system metrics. Here’s a structured summary from problem to solution.


Initial Symptom

The PostgreSQL logs directory and main data directory were consuming over 555 GB:

/var/lib/postgresql/12/main

Step 1: Checking Table Sizes

SELECT 
    schemaname || '.' || relname AS table,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS index_and_toast_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;

Result: `job_jobarchive` had 451 GB total size, with 322 GB of that outside the main table data (indexes plus TOAST).

Step 2: Identify Individual Indexes

SELECT 
    indexrelname AS index_name,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'job_jobarchive'
ORDER BY pg_relation_size(indexrelid) DESC;

Findings: two large unused indexes, job_jobarchive_auto_anonymize and job_jobarchive_payed (~2 GB each).

Step 3: Using pg_repack to Rebuild Indexes

Because the host ran an outdated Ubuntu 20.10, we compiled and installed pg_repack from source, then ran:

/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive --only-indexes -U postgres -h localhost

Outcome: the indexes were rebuilt successfully, but disk usage did not drop because the old index files were still held open by running backends.

Step 4: Restart PostgreSQL

sudo systemctl restart postgresql

Restarting released those open file handles and freed the disk space left behind by pg_repack.
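The open-file-handle effect can be confirmed before restarting: on Linux, a file descriptor that still points at a deleted file shows up in /proc with a ` (deleted)` suffix. A minimal stdlib sketch (the function name is illustrative, not part of any tool used above):

```python
import os

def deleted_open_files(proc_root="/proc"):
    """Scan /proc/<pid>/fd for descriptors that point at deleted files.

    Returns (pid, target) tuples. Processes we cannot inspect are skipped,
    so run as root for a complete picture.
    """
    hits = []
    if not os.path.isdir(proc_root):
        return hits  # not a Linux /proc-style filesystem
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target.endswith(" (deleted)"):
                hits.append((int(pid), target))
    return hits

if __name__ == "__main__":
    for pid, path in deleted_open_files():
        print(pid, path)
```

This is roughly equivalent to scanning `lsof` output for "(deleted)" entries; if the list is empty, a restart will not reclaim anything further.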

Step 5: Investigating Table Content

We found 87M rows where payed = FALSE:

SELECT COUNT(*) FROM job_jobarchive WHERE payed = FALSE;

Step 6: Preserving Important Data

Rows with meaningful usage were marked as payed = TRUE:

UPDATE job_jobarchive SET payed = TRUE
WHERE job_id IN (SELECT job_id FROM viewjob_viewjob);
UPDATE job_jobarchive SET payed = TRUE
WHERE job_id IN (SELECT job_id FROM job_applicationclickscount);
UPDATE job_jobarchive SET payed = TRUE
WHERE job_id IN (SELECT job_id FROM spontaneous_spontaneousapplication);
UPDATE job_jobarchive SET payed = TRUE
WHERE email IS NOT NULL AND email != '';

Final count: 3.27M rows with payed = TRUE.

Step 7: Final Repack

/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost

Result

After this cleanup:

  • Table size dropped drastically
  • Disk space was recovered
  • Query performance improved

Conclusion

This session demonstrates how deep PostgreSQL bloat can go — and how you can safely reclaim space using a mix of query analysis, `pg_repack`, and smart archiving. Regular cleanup like this can prevent disk pressure and system crashes.

PostgreSQL Table Bloat & Storage Breakdown: job_jobarchive

This post demonstrates how to analyze and break down the physical storage of a bloated PostgreSQL table using real SQL commands. The example is based on a real-world table named job_jobarchive, which had grown to 469 GB in total size.

1. Bloat Overview

We begin with a CTE that sizes each table. Note that bloat_size here is pg_total_relation_size minus pg_table_size, which is really the index size rather than dead-tuple bloat:

WITH bloat_info AS (
  SELECT
    nspname AS schemaname,
    relname AS tablename,
    pg_size_pretty(pg_table_size(C.oid)) AS table_size,
    pg_size_pretty(pg_total_relation_size(C.oid) - pg_table_size(C.oid)) AS bloat_size,
    pg_size_pretty(pg_total_relation_size(C.oid)) AS total_size
  FROM pg_class C
  JOIN pg_namespace N ON (C.relnamespace = N.oid)
  WHERE C.relkind = 'r' AND N.nspname NOT IN ('pg_catalog', 'information_schema')
)
SELECT * FROM bloat_info WHERE tablename = 'job_jobarchive';

Result:

  • Table size: 462 GB
  • Bloat: 6.8 GB
  • Total size: 469 GB

2. Breakdown: Table Data vs Indexes vs TOAST

To understand where this size comes from:

SELECT
  pg_size_pretty(pg_relation_size('job_jobarchive')) AS table_data,
  pg_size_pretty(pg_indexes_size('job_jobarchive')) AS indexes,
  pg_size_pretty(
    pg_total_relation_size('job_jobarchive')
    - pg_relation_size('job_jobarchive')
    - pg_indexes_size('job_jobarchive')
  ) AS toast,
  pg_size_pretty(pg_total_relation_size('job_jobarchive')) AS total;
  

Results:

  • Table data: 148 GB
  • Indexes: 6.8 GB
  • TOAST: 314 GB (likely from large text or bytea fields)
  • Total: 469 GB

3. Reduce Size Safely

Option A: Repack without downtime

/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost

Option B: Vacuum FULL (downtime required)

VACUUM FULL job_jobarchive;

4. Optional: Clean or Archive Large Text Columns

-- For example, if raw_text2 is not needed for old rows:
UPDATE job_jobarchive SET raw_text2 = NULL WHERE payed = FALSE;

5. (Optional) Analyze Index + TOAST Overhead in All Tables

SELECT
  relname AS table,
  pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS index_and_toast_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) - pg_relation_size(relid) DESC
LIMIT 10;

Conclusion

This case study demonstrates how bloated PostgreSQL tables — especially those with large text fields — can dramatically affect storage. With simple SQL and tools like pg_repack, you can safely reclaim space and restore performance without service interruption.

Handling PostgreSQL Bloat When pg_repack Fails

pg_repack Crash Example

Attempting to repack a large table can fail if the database is low on disk space or gets interrupted:

/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
INFO: repacking table "public.job_jobarchive"
ERROR: query failed: SSL SYSCALL error: EOF detected
DETAIL: query was: INSERT INTO repack.table_70304821 SELECT job_id, ... FROM ONLY public.job_jobarchive
ERROR: could not connect to database: connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL:  the database system is in recovery mode

This indicates PostgreSQL crashed mid-repack, which can happen when there is insufficient disk space for the temporary shadow copy. Until crash recovery finishes, the server rejects new connections with the "recovery mode" error above.

Alternative: Row-by-Row Deletion via Python

If pg_repack is not viable and massive DELETE commands also fail (due to memory or corruption), a Python script is safer:

Step 1: Export job IDs to delete

\copy (SELECT job_id FROM job_jobarchive WHERE payed = FALSE) TO '/tmp/job_id_unpaid.csv' WITH CSV HEADER;
COPY 85677204

Step 2: Python deletion script (one row at a time)

import psycopg2

def delete_job_ids_one_by_one(file_path, db_config, error_log_path):
    conn = psycopg2.connect(**db_config)
    conn.autocommit = True
    cursor = conn.cursor()

    with open(file_path, 'r') as f, open(error_log_path, 'a') as error_log:
        next(f)  # skip CSV header
        for line in f:
            job_id = line.strip()
            if not job_id:
                continue
            print(f"Processing job_id: {job_id}")
            try:
                cursor.execute("DELETE FROM job_jobarchive WHERE job_id = %s", (job_id,))
                print(f"✅ Deleted: {job_id}")
            except Exception as e:
                print(f"❌ Error for job_id {job_id}: {e}")
                error_log.write(f"{job_id}, {e}\n")
                continue
    cursor.close()
    conn.close()
    print("🟢 Finished all deletions.")

if __name__ == "__main__":
    file_path = '/tmp/job_id_unpaid.csv'
    error_log_path = '/tmp/delete_errors.log'
    db_config = {
        'dbname': 'vindazo_fr',
        'user': 'vindazo_fr',
        'password': 'vindazo_fr',
        'host': 'localhost',
        'port': '5432'
    }
    delete_job_ids_one_by_one(file_path, db_config, error_log_path)
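One DELETE per row costs a network round trip and a commit per row, which adds up over ~85M rows. If the server is stable enough, deleting in batches with `job_id = ANY(%s)` (psycopg2 adapts a Python list to a PostgreSQL array) cuts that overhead substantially. A hedged stdlib sketch: `id_batches` and `delete_in_batches` are illustrative names, and the cursor can be any DB-API cursor such as the psycopg2 one above:

```python
import csv

def id_batches(csv_path, batch_size=1000):
    """Yield lists of job_ids from the exported CSV, batch_size at a time."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the CSV header
        batch = []
        for row in reader:
            if not row or not row[0].strip():
                continue
            batch.append(row[0].strip())
            if len(batch) >= batch_size:
                yield batch
                batch = []
        if batch:
            yield batch  # final partial batch

def delete_in_batches(cursor, csv_path, batch_size=1000):
    """One DELETE round trip per batch instead of per row."""
    for batch in id_batches(csv_path, batch_size):
        cursor.execute(
            "DELETE FROM job_jobarchive WHERE job_id = ANY(%s)", (batch,)
        )
```

With autocommit on, each batch still commits independently, so an interrupted run can be resumed from the remaining IDs just like the row-by-row version.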

Fixing pg_repack Leftovers

When pg_repack crashes, it leaves triggers and temporary types behind:

/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
INFO: repacking table "public.job_jobarchive"
WARNING: the table "public.job_jobarchive" already has a trigger called "repack_trigger"

Fix:

DROP TRIGGER IF EXISTS repack_trigger ON public.job_jobarchive;
DROP EXTENSION IF EXISTS pg_repack CASCADE;
CREATE EXTENSION pg_repack;

Freeing Up Space

After deleting rows, you may still need to free space elsewhere (especially on older systems). In our case, removing two old backups freed 130 GB:

rm -r 2025-04-25-weekly 2025-05-06-daily

Disk Usage After Cleanup

Filesystem      Size  Used Avail Use% Mounted on
/dev/md2        934G  678G  209G  77% /

We are now under 80% usage, which leaves safe headroom for PostgreSQL and tools like pg_repack.

Why PostgreSQL File Size Doesn't Shrink After DELETE

PostgreSQL deletes data logically, meaning it marks the rows as dead — but it doesn't immediately release the disk space. That space is only reused for future INSERT or UPDATE operations.

✅ Solution: How to Actually Reclaim Disk Space

🔹 1. VACUUM FULL (Requires Downtime and Table Lock)

VACUUM FULL job_jobarchive;
  • ✔️ Creates a fresh copy of the table without dead rows
  • ✔️ Frees up actual space on the file system

🔹 2. pg_repack (No Downtime)

/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
  • ✔️ Reclaims space just like VACUUM FULL, but online
  • ⚠️ Requires enough temporary disk space to hold a shadow copy

🔹 3. Re-check Table and Disk Usage

In PostgreSQL:

SELECT
  pg_size_pretty(pg_relation_size('job_jobarchive')) AS table_data,
  pg_size_pretty(pg_total_relation_size('job_jobarchive')) AS total_size;

On the OS:

du -sh /var/lib/postgresql/12/main/base/
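The `du` check can also be scripted if you want to track reclaimed space over time. A minimal stdlib sketch (the function name is illustrative):

```python
import os

def dir_size_bytes(path):
    """Recursively sum file sizes under path, tolerating files removed mid-walk."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file vanished (e.g. PostgreSQL recycled a segment)
    return total
```

Usage: `print(dir_size_bytes('/var/lib/postgresql/12/main/base') / 2**30, "GiB")` before and after the repack gives the same number `du -sh` reports, modulo filesystem block rounding.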

📌 Important

Even if you deleted 80 million rows, you won’t see any reduction in actual disk usage until you run VACUUM FULL or pg_repack.

💡 Automating the check

A small helper script can:

  • Checks if there's enough free disk space first
  • Then safely runs pg_repack with logging
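A minimal Python sketch of such a guard, using only the stdlib (`has_headroom` and `repack_command` are illustrative names; capturing pg_repack's output for logging is left to `subprocess`):

```python
import shutil

def has_headroom(path, required_bytes):
    """True if the filesystem holding `path` has at least `required_bytes` free.

    pg_repack needs room for a full shadow copy of the table, so require at
    least the table's current total size (plus a margin) before starting.
    """
    return shutil.disk_usage(path).free >= required_bytes

def repack_command(dbname, table,
                   user="postgres", host="localhost",
                   binary="/usr/lib/postgresql/12/bin/pg_repack"):
    """Build the same pg_repack invocation used throughout this post."""
    return [binary, "-d", dbname, "-t", table, "-U", user, "-h", host]

# Hypothetical gate for job_jobarchive (~470 GB shadow copy):
# import subprocess
# if has_headroom("/var/lib/postgresql", 470 * 2**30):
#     subprocess.run(repack_command("vindazo_fr", "job_jobarchive"), check=True)
```

Skipping the run when headroom is insufficient avoids exactly the mid-repack crash shown earlier.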

After running VACUUM FULL to completion:

VACUUM FULL job_jobarchive;
VACUUM
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            13G  956K   13G   1% /run
/dev/md2        934G  245G  642G  28% /
tmpfs            63G  4.3M   63G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           4.0M     0  4.0M   0% /sys/fs/cgroup
/dev/md1        487M   82M  381M  18% /boot
tmpfs            13G     0   13G   0% /run/user/0
