This document walks through a full session where we tackled extreme disk usage caused by bloated PostgreSQL tables and indexes. We used `pg_repack`, SQL archiving, and careful investigation of system metrics. Here’s a structured summary from problem to solution.
Initial Symptom
The PostgreSQL logs directory and main data directory were consuming over 555 GB:
/var/lib/postgresql/12/main
Step 1: Checking Table Sizes
SELECT
schemaname || '.' || relname AS table,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
pg_size_pretty(pg_relation_size(relid)) AS table_size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS index_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;
Result: `job_jobarchive` had 451 GB total size, with 322 GB in indexes.
Step 2: Identify Individual Indexes
SELECT
indexrelname AS index_name,
idx_scan,
idx_tup_read,
idx_tup_fetch,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'job_jobarchive'
ORDER BY pg_relation_size(indexrelid) DESC;
Findings: Two large unused indexes: job_jobarchive_auto_anonymize and job_jobarchive_payed (~2 GB each).
Step 3: Using pg_repack to Rebuild Indexes
We compiled and installed pg_repack manually due to an outdated Ubuntu 20.10 environment, then ran:
/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive --only-indexes -U postgres -h localhost
Outcome: The indexes were rebuilt successfully, but disk usage stayed high because open file handles still pinned the old files.
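To confirm that the space is pinned by open file handles rather than still allocated, you can list deleted files that postgres processes keep open. This is a quick check, assuming lsof is installed on the host:
# List open files whose link count is 0, i.e. deleted but still held open.
sudo lsof +L1 | grep postgres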
Step 4: Restart PostgreSQL
sudo systemctl restart postgresql
This freed up some of the disk space after pg_repack ran.
Step 5: Investigating Table Content
We found 87M rows where payed = FALSE:
SELECT COUNT(*) FROM job_jobarchive WHERE payed = FALSE;
Step 6: Preserving Important Data
Rows with meaningful usage were marked as payed = TRUE:
UPDATE job_jobarchive SET payed = TRUE
WHERE job_id IN (SELECT job_id FROM viewjob_viewjob);
UPDATE job_jobarchive SET payed = TRUE
WHERE job_id IN (SELECT job_id FROM job_applicationclickscount);
UPDATE job_jobarchive SET payed = TRUE
WHERE job_id IN (SELECT job_id FROM spontaneous_spontaneousapplication);
UPDATE job_jobarchive SET payed = TRUE
WHERE email IS NOT NULL AND email != '';
Final count: 3.27M rows with payed = TRUE.
Step 7: Final Repack
/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
Result
After this cleanup:
- Table size dropped drastically
- Disk space was recovered
- Query performance improved
Conclusion
This session demonstrates how deep PostgreSQL bloat can go — and how you can safely reclaim space using a mix of query analysis, `pg_repack`, and smart archiving. Regular cleanup like this can prevent disk pressure and system crashes.
Example: PostgreSQL Table Bloat & Storage Breakdown: job_jobarchive
This post demonstrates how to analyze and break down the physical storage of a bloated PostgreSQL table using real SQL commands. The example is based on a real-world table named job_jobarchive, which had grown to 469 GB in total size.
1. Bloat Overview
We begin by querying a CTE to calculate bloat per table:
WITH bloat_info AS (
SELECT
nspname AS schemaname,
relname AS tablename,
pg_size_pretty(pg_table_size(C.oid)) AS table_size,
pg_size_pretty(pg_total_relation_size(C.oid) - pg_table_size(C.oid)) AS bloat_size,
pg_size_pretty(pg_total_relation_size(C.oid)) AS total_size
FROM pg_class C
JOIN pg_namespace N ON (C.relnamespace = N.oid)
WHERE C.relkind = 'r' AND N.nspname NOT IN ('pg_catalog', 'information_schema')
)
SELECT * FROM bloat_info WHERE tablename = 'job_jobarchive';
Result:
- Table size: 462 GB
- Bloat: 6.8 GB
- Total size: 469 GB
2. Breakdown: Table Data vs Indexes vs TOAST
To understand where this size comes from:
SELECT
pg_size_pretty(pg_relation_size('job_jobarchive')) AS table_data,
pg_size_pretty(pg_indexes_size('job_jobarchive')) AS indexes,
pg_size_pretty(
pg_total_relation_size('job_jobarchive')
- pg_relation_size('job_jobarchive')
- pg_indexes_size('job_jobarchive')
) AS toast,
pg_size_pretty(pg_total_relation_size('job_jobarchive')) AS total;
Results:
- Table data: 148 GB
- Indexes: 6.8 GB
- TOAST: 314 GB (likely from large text or bytea fields)
- Total: 469 GB
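If you want to confirm that the TOAST portion really accounts for that much, you can look up the table's TOAST relation directly. This is a small verification sketch, reusing the same database and connection settings as elsewhere in this post:
# Show the TOAST table behind job_jobarchive and its on-disk size.
psql -d vindazo_fr -U postgres -h localhost -c "
  SELECT reltoastrelid::regclass AS toast_table,
         pg_size_pretty(pg_relation_size(reltoastrelid::regclass)) AS toast_size
  FROM pg_class
  WHERE relname = 'job_jobarchive';"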
3. Reduce Size Safely
Option A: Repack without downtime
/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
Option B: Vacuum FULL (downtime required)
VACUUM FULL job_jobarchive;
4. Optional: Clean or Archive Large Text Columns
-- For example, if raw_text2 is not needed for old rows:
UPDATE job_jobarchive SET raw_text2 = NULL WHERE payed = FALSE;
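The single UPDATE above touches tens of millions of rows in one long transaction. If that is a concern, a batched variant keeps each transaction short. This is only a sketch: it assumes job_id is the primary key of job_jobarchive, and the batch size of 10000 is arbitrary.
# Sketch: clear raw_text2 in batches of 10000 rows so each transaction stays short.
while true; do
  ROWS=$(psql -d vindazo_fr -U postgres -h localhost -At -c "
    WITH batch AS (
      SELECT job_id FROM job_jobarchive
      WHERE payed = FALSE AND raw_text2 IS NOT NULL
      LIMIT 10000),
    cleared AS (
      UPDATE job_jobarchive t SET raw_text2 = NULL
      FROM batch WHERE t.job_id = batch.job_id
      RETURNING 1)
    SELECT count(*) FROM cleared;")
  echo "Cleared raw_text2 on $ROWS rows"
  [ "$ROWS" -eq 0 ] && break   # stop when no unpaid rows with raw_text2 remain
done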
5. (Optional) Analyze Top Bloat in All Tables
SELECT
relname AS table,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS index_bloat
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) - pg_relation_size(relid) DESC
LIMIT 10;
Conclusion
This case study demonstrates how bloated PostgreSQL tables — especially those with large text fields — can dramatically affect storage. With simple SQL and tools like pg_repack, you can safely reclaim space and restore performance without service interruption.
Handling PostgreSQL Bloat When pg_repack Fails
pg_repack Crash Example
Attempting to repack a large table can fail if the database is low on disk space or gets interrupted:
/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
INFO: repacking table "public.job_jobarchive"
ERROR: query failed: SSL SYSCALL error: EOF detected
DETAIL: query was: INSERT INTO repack.table_70304821 SELECT job_id, ... FROM ONLY public.job_jobarchive
ERROR: could not connect to database: connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL: the database system is in recovery mode
This indicates PostgreSQL crashed mid-repack, which can happen when there isn't enough disk space for the temporary shadow copy. After the crash, the server goes through recovery mode before it accepts connections again.
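Before retrying, it is worth comparing the free space on the data volume with the table's total size, since the shadow copy needs roughly that much room. A rough pre-flight check, using the same paths and connection settings as above:
# Free space on the volume holding the data directory.
df -h /var/lib/postgresql/12/main
# Roughly how much space pg_repack's shadow copy of this table will need.
psql -d vindazo_fr -U postgres -h localhost -c \
  "SELECT pg_size_pretty(pg_total_relation_size('job_jobarchive'));"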
Alternative: Row-by-Row Deletion via Python
If pg_repack is not viable and massive DELETE commands also fail (due to memory or corruption), a Python script is safer:
Step 1: Export job IDs to delete
\copy (SELECT job_id FROM job_jobarchive WHERE payed = FALSE) TO '/tmp/job_id_unpaid.csv' WITH CSV HEADER;
COPY 85677204
Step 2: Python deletion script (one row at a time)
import psycopg2

def delete_job_ids_one_by_one(file_path, db_config, error_log_path):
    conn = psycopg2.connect(**db_config)
    conn.autocommit = True  # commit each DELETE individually
    cursor = conn.cursor()
    with open(file_path, 'r') as f, open(error_log_path, 'a') as error_log:
        next(f)  # skip CSV header
        for line in f:
            job_id = line.strip()
            if not job_id:
                continue
            print(f"Processing job_id: {job_id}")
            try:
                cursor.execute("DELETE FROM job_jobarchive WHERE job_id = %s", (job_id,))
                print(f"✅ Deleted: {job_id}")
            except Exception as e:
                print(f"❌ Error for job_id {job_id}: {e}")
                error_log.write(f"{job_id}, {e}\n")
                continue
    cursor.close()
    conn.close()
    print("🟢 Finished all deletions.")

if __name__ == "__main__":
    file_path = '/tmp/job_id_unpaid.csv'
    error_log_path = '/tmp/delete_errors.log'
    db_config = {
        'dbname': 'vindazo_fr',
        'user': 'vindazo_fr',
        'password': 'vindazo_fr',
        'host': 'localhost',
        'port': '5432'
    }
    delete_job_ids_one_by_one(file_path, db_config, error_log_path)
Fixing pg_repack Leftovers
When pg_repack crashes, it leaves triggers and temporary types behind:
/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
INFO: repacking table "public.job_jobarchive"
WARNING: the table "public.job_jobarchive" already has a trigger called "repack_trigger"
Fix:
DROP TRIGGER IF EXISTS repack_trigger ON public.job_jobarchive;
DROP EXTENSION IF EXISTS pg_repack CASCADE;
CREATE EXTENSION pg_repack;
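Before retrying the repack, a quick check (a verification sketch, not part of the original fix) confirms that no repack trigger or leftover repack schema remains:
# Any non-internal triggers still attached to the table?
psql -d vindazo_fr -U postgres -h localhost -c "
  SELECT tgname FROM pg_trigger
  WHERE tgrelid = 'public.job_jobarchive'::regclass AND NOT tgisinternal;"
# Any leftover repack schema from the crashed run?
psql -d vindazo_fr -U postgres -h localhost -c "
  SELECT nspname FROM pg_namespace WHERE nspname LIKE 'repack%';"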
Freeing Up Space
After deleting rows, make sure you free space manually (especially on older systems). In our case, we removed old backups to free 130 GB:
rm -r 2025-04-25-weekly 2025-05-06-daily
Disk Usage After Cleanup
Filesystem Size Used Avail Use% Mounted on
/dev/md2 934G 678G 209G 77%
We're now under 80% usage, which is a safe level for continuing to work with PostgreSQL and tools like pg_repack.
Why PostgreSQL File Size Doesn't Shrink After DELETE
PostgreSQL deletes data logically, meaning it marks the rows as dead — but it doesn't immediately release the disk space. That space is only reused for future INSERT or UPDATE operations.
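You can see those logically deleted rows as dead tuples in the statistics views. A quick check, using the same connection settings as elsewhere in this post:
# n_dead_tup counts rows DELETE has marked dead; their space is reusable by
# PostgreSQL but not yet returned to the operating system.
psql -d vindazo_fr -U postgres -h localhost -c "
  SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname = 'job_jobarchive';"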
✅ Solution: How to Actually Reclaim Disk Space
🔹 1. VACUUM FULL (Requires Downtime and Table Lock)
VACUUM FULL job_jobarchive;
- ✔️ Creates a fresh copy of the table without dead rows
- ✔️ Frees up actual space on the file system
🔹 2. pg_repack (No Downtime)
/usr/lib/postgresql/12/bin/pg_repack -d vindazo_fr -t job_jobarchive -U postgres -h localhost
- ✔️ Reclaims space just like VACUUM FULL, but online
- ⚠️ Requires enough temporary disk space to hold a shadow copy
🔹 3. Re-check Table and Disk Usage
In PostgreSQL:
SELECT
pg_size_pretty(pg_relation_size('job_jobarchive')) AS table_data,
pg_size_pretty(pg_total_relation_size('job_jobarchive')) AS total_size;
On the OS:
du -sh /var/lib/postgresql/12/main/base/
📌 Important
Even if you deleted 80 million rows, you won’t see any reduction in actual disk usage until you run VACUUM FULL or pg_repack.
💡 Want a shell script?
We can provide a script that:
- Checks if there's enough free disk space first
- Then safely runs pg_repack with logging (a sketch follows below)
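Here is a minimal sketch of such a script. The 1.5x safety margin, the log path under /var/log, and the connection settings are assumptions; adjust them to your environment.
#!/bin/bash
# Minimal sketch: check free disk space, then run pg_repack with logging.
set -euo pipefail

DB=vindazo_fr
TABLE=job_jobarchive
DATA_DIR=/var/lib/postgresql/12/main
LOG=/var/log/pg_repack_${TABLE}.log

# pg_repack needs roughly the table's total size (data + indexes) as free
# space for its shadow copy, so require 1.5x that amount to be safe.
NEEDED=$(psql -d "$DB" -U postgres -h localhost -At \
  -c "SELECT pg_total_relation_size('${TABLE}');")
AVAILABLE=$(df --output=avail -B1 "$DATA_DIR" | tail -1 | tr -d ' ')

if [ "$AVAILABLE" -lt $(( NEEDED * 3 / 2 )) ]; then
  echo "$(date) not enough free space: need ~$(( NEEDED * 3 / 2 )) bytes, have $AVAILABLE" | tee -a "$LOG"
  exit 1
fi

echo "$(date) starting pg_repack of ${TABLE}" | tee -a "$LOG"
/usr/lib/postgresql/12/bin/pg_repack -d "$DB" -t "$TABLE" -U postgres -h localhost 2>&1 | tee -a "$LOG"
echo "$(date) pg_repack finished" | tee -a "$LOG"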
If you then run VACUUM FULL and it completes:
VACUUM FULL job_jobarchive;
VACUUM
Filesystem Size Used Avail Use% Mounted on
tmpfs 13G 956K 13G 1% /run
/dev/md2 934G 245G 642G 28% /
tmpfs 63G 4.3M 63G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
/dev/md1 487M 82M 381M 18% /boot
tmpfs 13G 0 13G 0% /run/user/0