Cleanup Script for Blocked Domains

As part of our ongoing data hygiene process, we've implemented a custom Django management command that removes the contacts and offers belonging to domains listed in BlockedDomain, while leaving the domain records themselves in place.

🔍 Purpose

Over time, some domains become inactive, spammy, or irrelevant. Instead of outright deleting them, we now track them in a BlockedDomain model. This allows us to:

  • Preserve domain references and metadata
  • Safely remove all offers and contacts tied to those domains
  • Keep the domain itself (for logging, recovery, or audit reasons)

⚙️ How It Works

  1. Fetch all BlockedDomain entries from the database.
  2. For each blocked domain that still has a linked Domain record (entries without one are logged and skipped):
    • Delete all offers linked to each contact under that domain
    • Delete the contacts themselves
    • Leave the domain record untouched
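
The steps above can be sketched with plain Python data structures, which makes the order of operations easy to check without a database. This is only an illustration: the dictionaries are hypothetical stand-ins for the BlockedDomain, Contact, and Offer tables, and clean_blocked is not part of the actual command.

```python
def clean_blocked(blocked_entries, contacts, offers):
    """Return (remaining_contacts, remaining_offers, skipped_urls).

    blocked_entries: dicts with "url" and "domain_id" (None if no Domain is linked)
    contacts:        dicts with "id" and "domain_id"
    offers:          dicts with "id" and "contact_id"
    """
    # Entries without a linked domain cannot be matched to contacts, so skip them.
    skipped_urls = [b["url"] for b in blocked_entries if b["domain_id"] is None]
    blocked_ids = set(
        b["domain_id"] for b in blocked_entries if b["domain_id"] is not None
    )

    # Contacts that belong to a blocked domain.
    doomed = set(c["id"] for c in contacts if c["domain_id"] in blocked_ids)

    # Step 1: remove offers linked to contacts of a blocked domain.
    remaining_offers = [o for o in offers if o["contact_id"] not in doomed]
    # Step 2: remove the contacts themselves. Domains are never touched.
    remaining_contacts = [c for c in contacts if c["id"] not in doomed]
    return remaining_contacts, remaining_offers, skipped_urls


# Hypothetical sample data.
blocked_entries = [
    {"url": "http://spam.example", "domain_id": 1},
    {"url": "http://gone.example", "domain_id": None},
]
contacts = [
    {"id": 10, "domain_id": 1},
    {"id": 11, "domain_id": 1},
    {"id": 12, "domain_id": 2},
]
offers = [
    {"id": 100, "contact_id": 10},
    {"id": 101, "contact_id": 12},
]

remaining_contacts, remaining_offers, skipped_urls = clean_blocked(
    blocked_entries, contacts, offers
)
print(remaining_contacts)
print(remaining_offers)
print(skipped_urls)
```

With this sample data only contact 12 and offer 101 survive, and http://gone.example is reported as skipped because it has no linked domain.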

📄 BlockedDomain Model Structure

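from django.db import models

from domain.models import Domain  # import path taken from the cleanup script


class BlockedDomain(models.Model):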
    domain = models.ForeignKey(Domain, null=True, blank=True, on_delete=models.SET_NULL)
    url = models.URLField(max_length=512, unique=True)
    permanently_deleted = models.BooleanField(default=False)
    comment = models.TextField(blank=True, null=True)
    created_at = models.DateTimeField(auto_now_add=True)

🚀 Command Execution

python manage.py clean_contacts_with_offers

The command prints a progress log for each domain, listing every contact it deletes and the number of offers removed along with it.

🐍 Python 2.7 Compatibility Note

Since we're running on Python 2.7, we added the following encoding declaration as the first line of the file so that the interpreter accepts the non-ASCII characters (such as ⚠ and ✅) used in the log messages:

# -*- coding: utf-8 -*-
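
The declaration only covers literals in the source file. As a related sketch, and assuming contact names may contain non-ASCII characters (the post does not say either way), building log messages from unicode templates avoids an implicit ASCII encode on Python 2.7. The helper name deletion_message is hypothetical.

```python
# -*- coding: utf-8 -*-


def deletion_message(contact_id, contact_name):
    # The u"" prefix makes the template a unicode string on Python 2.7, so a
    # non-ASCII name is interpolated without an implicit ASCII encode.
    return u"Deleting contact ID {} - {}".format(contact_id, contact_name)


# Hypothetical contact with an accented name.
message = deletion_message(7, u"José")
```

On Python 2.7, the same call with a plain byte-string template would raise UnicodeEncodeError for a name like José. The snippet also runs unchanged on Python 3, where the u prefix is accepted and has no effect.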

✅ Results

After running the command, all contacts and offers belonging to blocked domains are removed, while the domain records stay in place for reference or later review.

This cleanup helps keep our data lean, relevant, and ready for future processing or scraping tasks.


# -*- coding: utf-8 -*-
# Assumed location: <your_app>/management/commands/clean_contacts_with_offers.py
from __future__ import unicode_literals

from django.core.management.base import BaseCommand

from realty.models import Offer
from your_app.models import BlockedDomain  # Replace with your actual app name


class Command(BaseCommand):
    help = "Delete contacts and offers that belong to blocked domains."

    def handle(self, *args, **options):
        blocked_domains = BlockedDomain.objects.all()
        self.stdout.write("Found {} blocked domains.".format(blocked_domains.count()))

        for blocked in blocked_domains:
            domain = blocked.domain
            if not domain:
                # The foreign key uses SET_NULL, so the Domain row may be gone.
                self.stdout.write("⚠ No linked Domain object for BlockedDomain URL: {}".format(blocked.url))
                continue

            self.stdout.write("\nProcessing domain: {} (ID: {})".format(domain.url, domain.id))

            # Assumes Contact's ForeignKey to Domain sets related_name="contacts".
            for contact in domain.contacts.all():
                offers = Offer.objects.filter(contact=contact)
                offer_count = offers.count()

                # Delete the contact's offers first, then the contact itself.
                if offer_count:
                    self.stdout.write("Deleting {} offers for contact ID {} - {}".format(offer_count, contact.id, contact.name))
                    offers.delete()

                self.stdout.write("Deleting contact ID {} - {}".format(contact.id, contact.name))
                contact.delete()

            self.stdout.write("✅ Done cleaning contacts and offers for domain {}".format(domain.url))
