Error message from server: ERROR: compressed data is corrupted



When running the backup in the shell, I immediately saw that the problem is in the archive table.

Once again, archived jobs that arguably don't belong in the database at all are causing problems. It is a very specific problem and I have not found a standard solution for it on the internet, so I wrote a small management command in Python to find the corrupt records, and then removed them manually via SQL. Below is a description of the problem and the method to repair it.


First I describe the symptoms and the documentation I studied; for the fix itself, skip ahead to the Solution section.


Plain backup of vindazo_de

pg_dump: error: Dumping the contents of table "job_jobarchive" failed: PQgetResult() failed.

pg_dump: error: Error message from server: ERROR: compressed data is corrupted

pg_dump: error: The command was: COPY public.job_jobarchive (job_id, source_unique, company_name, company_name_slug, phone, email, url, sol_url, title, description, raw_text, address, city, country, zip_code, added, last_updated, status, for_index, slug, slug_description, view, click, has_checked_duplicate, job_type, contract, statute, logo, company_id, source_id, user_id, category_text, payed, redirect, raw_text2, auto_anonymize) TO stdout;

https://dba.stackexchange.com/questions/127846/dumping-database-with-compressed-data-is-corrupt-errors 




Corrupted table in PostgreSQL


Again a problem with the database: the job archive table broke. What exactly is the problem? See the error above.


I tried everything that could be found on the internet, without success:

REINDEX

ANALYZE

etc.

https://wiki.postgresql.org/wiki/Corruption

https://docs.google.com/document/d/1aaVA9MisQfCe-nb030rKlxluhM_d3iboilLjr1dO6o4/edit 

 

See the documents above for more background on this bugfix, if you are interested, or just read the solution below, because it is short and to the point.


Solution:

How did I fix this problem?
  1. First I saved all IDs of this table separately in a text file.



\copy (SELECT job_id FROM job_jobarchive) to 'job_id.csv' with csv; 


It took about two minutes to copy all IDs to a text file.
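Reading those IDs back in Python, with the ability to resume an interrupted scan from a given ID, only needs the standard library. A small sketch; `load_ids` is a hypothetical helper, not part of the project:

```python
from itertools import dropwhile

def load_ids(lines, resume_from=None):
    """Yield the IDs from the exported job_id.csv, one per line.
    If resume_from is given, skip ahead to that ID (inclusive),
    so an interrupted scan can be restarted where it stopped."""
    ids = (line.strip() for line in lines if line.strip())
    if resume_from is not None:
        # Drop everything before the resume ID; keep it and the rest.
        ids = dropwhile(lambda i: i != resume_from, ids)
    return ids
```

The same inclusive-resume behaviour is what the management command below implements with its `check_started` flag.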



Now we have to go through all the records one by one and try to SELECT each of them; if an error occurs, we save the error together with the ID of the record that could not be selected. That gives us the list of records that have to be deleted.
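That scan can also be factored into a small standalone helper. A minimal sketch, assuming any DB-API cursor (psycopg2 in practice); `find_corrupt_ids` is a hypothetical name, not something that exists in the project:

```python
def find_corrupt_ids(cursor, table, pk_column, ids):
    """Try to read each row individually and collect (ID, error)
    pairs for the rows whose SELECT fails, e.g. with
    'compressed data is corrupted'."""
    corrupt = []
    for row_id in ids:
        try:
            # Parameterised value; table/column names are trusted here.
            cursor.execute(
                "SELECT * FROM %s WHERE %s = %%s" % (table, pk_column),
                (row_id,),
            )
            cursor.fetchall()  # force the server to decompress the row
        except Exception as exc:
            # With psycopg2 the transaction is now aborted: call
            # connection.rollback() (or run in autocommit mode)
            # before the next SELECT can succeed.
            corrupt.append((row_id, str(exc)))
    return corrupt
```

Note that after a failed statement PostgreSQL aborts the current transaction, so in a real run you must roll back or use autocommit for the scan to continue past the first corrupt row.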


The idea is to select every record by its ID and watch for the error. I did it with the following management command:

 

 vim job/management/commands/find_corupted_archive_select.py




from django.core.management.base import BaseCommand

from django.db import connections


class Command(BaseCommand):

    def handle(self, *args, **options):
        """
        Entry point: SELECT every archived job by ID, one at a time.
        The ID is printed before each SELECT, so the last ID printed
        when the command crashes is the corrupt record.
        """
        with open("/home/sites/deployment/vindazo2/branches/de_vindazo/job_id.csv") as id_file:
            lines = id_file.readlines()

        # Resume from the corrupt ID found in a previous run.
        id_stopped = '15706374'
        check_started = False

        with connections['default'].cursor() as cursor:
            for line in lines:
                job_id = line.strip()
                if id_stopped == job_id:
                    check_started = True
                if check_started:
                    print(job_id)
                    # Parameterised query instead of string concatenation.
                    cursor.execute(
                        "SELECT * FROM job_jobarchive WHERE job_id = %s",
                        [job_id],
                    )



The last ID printed before the command crashed, i.e. the corrupt record: 15706374

 

You can find the command on the Vindazo Germany deployment server. It is not committed to Git at this time; I may add it to trunk later.

 

So, when you find a corrupt record, you can delete it via an SQL client, removing the related location rows first:


           delete from job_jobarchive_locations where jobarchive_id = '15706374'; 

           delete from job_jobarchive where job_id = '15706374';
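The order of the two DELETEs matters: the rows in job_jobarchive_locations reference job_jobarchive, so the child rows have to go first. Here is a runnable sketch of the same steps against an in-memory SQLite stand-in for the schema (illustration only; on the real database the two statements above are run through the SQL client):

```python
import sqlite3

# In-memory stand-in for the two tables; only the columns
# needed for the foreign key are modelled.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE job_jobarchive (job_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE job_jobarchive_locations ("
    " id INTEGER PRIMARY KEY,"
    " jobarchive_id INTEGER REFERENCES job_jobarchive(job_id))"
)
conn.execute("INSERT INTO job_jobarchive VALUES (15706374), (1)")
conn.execute("INSERT INTO job_jobarchive_locations VALUES (1, 15706374)")

for job_id in [15706374]:  # the corrupt IDs found by the scan
    # Child rows first, otherwise the FK blocks the parent delete.
    conn.execute(
        "DELETE FROM job_jobarchive_locations WHERE jobarchive_id = ?",
        (job_id,),
    )
    conn.execute("DELETE FROM job_jobarchive WHERE job_id = ?", (job_id,))
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT job_id FROM job_jobarchive")]
print(remaining)  # → [1]
```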


REINDEX TABLE job_jobarchive;


  
