Privacy-First Analytics

Server-Side Web Analytics from Apache Logs

Looking for Google-Analytics-style insights without client-side tracking, cookies, or vendor lock-in? Below are practical options that read your Apache access logs and give you clear dashboards, trends, and error insights. All of them are self-hosted, and because they set no cookies and run no scripts in the browser, they simplify GDPR compliance.

Recommended Tools

1) GoAccess — Fast HTML Dashboard from Logs

CLI → Interactive HTML • Real-time mode • Lightweight • Open source

  • Generates a single interactive report (hits, unique visitors, bandwidth, top pages, referrers, 404s, countries, user agents).
  • Works directly on access.log (rotated .gz files can be piped in with zcat) and can run from cron or in real-time mode.
  • Ideal when you need quick, no-JS analytics on a server.

# Debian/Ubuntu
sudo apt install goaccess

# One-off HTML report (COMBINED log format)
zcat -f /var/log/apache2/access.log* \
| goaccess - --log-format=COMBINED -o /var/www/html/analytics.html

Or, for a live htop-style view in the terminal:

tail -f /var/log/apache2/vindazo_de_access.log | goaccess - --log-format=COMBINED
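
To sanity-check a GoAccess panel, the same counts can be approximated straight from the log. The sketch below is a minimal example, assuming the standard COMBINED format, where the request path is field 7 and the status code is field 9 when a line is split on whitespace. The function name top_404 is hypothetical and not part of GoAccess.

```shell
# top_404: read COMBINED-format log lines on stdin and print the most
# frequently requested paths that returned 404, as "count path".
# Assumes whitespace-split fields: $7 = request path, $9 = status code.
# Usage: top_404 [N]   (N = number of rows to show, default 10)
top_404() {
  awk '$9 == 404 { hits[$7]++ }
       END { for (p in hits) print hits[p], p }' |
    sort -k1,1nr -k2,2 |
    head -n "${1:-10}"
}
```

For example, zcat -f /var/log/apache2/access.log* | top_404 20 lists the twenty most common missing paths, which should roughly match the 404 panel in the GoAccess report.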

2) Matomo (formerly Piwik) — Full GA-Style Suite via Log Import

Self-hosted web UI • Segments • Goals • Campaigns • Dashboards

  • Use the import_logs.py tool to ingest Apache logs into Matomo for session-level analytics.
  • GA-like interface: channels, bounce rate, time on page, devices, countries, goals.
  • Scales well for organizations needing stakeholder dashboards and governance.

# Example import (adjust paths, URL, site ID)
python3 /var/www/matomo/misc/log-analytics/import_logs.py \
  --url=https://analytics.example.com \
  --idsite=1 --recorders=4 \
  /var/log/apache2/access.log

3) AWStats — Classic, Stable Log Analyzer

No DB • CGI/static reports • Low resource usage

  • Shows visitors, referrers, search engines, robots, countries, pages, bandwidth.
  • Interface is dated but reliable; excellent for low-maintenance environments.
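
AWStats is driven by one config file per site. As a rough sketch, the helper below prints the handful of directives that typically need changing for an Apache combined log. The function name awstats_conf, the example domain, and the Debian-style data directory are assumptions for illustration; in practice you would start from the sample config your distribution ships and adjust these lines in it.

```shell
# awstats_conf: print the site-specific AWStats directives for one site.
# Usage: awstats_conf DOMAIN LOGFILE
# LogFormat=1 selects the Apache combined log format; LogType=W means web log.
# The DirData path is an assumed Debian-style default; adjust as needed.
awstats_conf() {
  printf 'LogFile="%s"\n' "$2"
  printf 'LogType=W\n'
  printf 'LogFormat=1\n'
  printf 'SiteDomain="%s"\n' "$1"
  printf 'DirData="/var/lib/awstats"\n'
}
```

With the result saved as awstats.example.com.conf in the AWStats config directory, running awstats.pl with -config=example.com -update processes the log. The script's location varies by distribution.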

Quick Comparison

Tool | Closest to GA | Realtime | Hosting | Complexity | Best For
GoAccess | Medium (page/referrer focus) | Yes (WebSocket mode) | Self-hosted | Low | Instant, no-JS server analytics
Matomo (log importer) | High (sessions/goals/segments) | Batch (near-real-time via cron) | Self-hosted | Medium | Stakeholder dashboards & reporting
AWStats | Low–Medium | No (batch) | Self-hosted | Low | Simple, stable server reporting

Operational Setup (Practical)

A. Nightly GoAccess Report (HTML only)

# /etc/cron.d/goaccess
# cron has no line continuation, so the whole entry must sit on one line.
15 0 * * * root zcat -f /var/log/apache2/access.log* | goaccess - --log-format=COMBINED -o /var/www/html/analytics/$(date +\%F).html

B. Matomo Hourly Log Imports

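#!/bin/sh
# /etc/cron.hourly/matomo-import (must be executable: chmod 755)
# Re-importing lines that were already imported may double-count visits;
# consider pointing this at freshly rotated logs instead.
python3 /var/www/matomo/misc/log-analytics/import_logs.py \
  --url=https://analytics.example.com --idsite=1 \
  /var/log/apache2/access.log

Tip: If your Apache LogFormat records response time (%D in microseconds or %T in seconds) alongside status and referrer, you can build latency distributions, identify slow endpoints, and correlate 4xx/5xx spikes with deployments.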
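
The three analyses in the tip can be sketched with awk alone. This is a minimal example under one stated assumption: the log uses COMBINED with %D (microseconds) appended as the last field. The format name combined_time and the function names are hypothetical, and the bucket boundaries are arbitrary.

```shell
# Assumed Apache config (not from the article): COMBINED plus %D
# (request duration in microseconds) appended as the final field.
#   LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_time
#   CustomLog ${APACHE_LOG_DIR}/access.log combined_time

# latency_buckets: coarse response-time histogram from log lines on stdin.
# Lines whose last field is not a plain integer are skipped.
latency_buckets() {
  awk '$NF ~ /^[0-9]+$/ {
         ms = $NF / 1000
         if      (ms < 100)  fast++
         else if (ms < 1000) medium++
         else                slow++
       }
       END {
         printf "<100ms %d\n", fast
         printf "100ms-1s %d\n", medium
         printf ">=1s %d\n", slow
       }'
}

# slow_endpoints: mean response time in ms per request path, slowest first.
# Usage: slow_endpoints [N]   (default 10 rows)
slow_endpoints() {
  awk '$NF ~ /^[0-9]+$/ { sum[$7] += $NF; n[$7]++ }
       END { for (p in n) printf "%.1f %s\n", sum[p] / n[p] / 1000, p }' |
    sort -k1,1nr -k2,2 |
    head -n "${1:-10}"
}

# errors_per_hour: count of 4xx/5xx responses per "dd/Mon/yyyy:HH" hour.
# Output is sorted as text, which is chronological within a single month.
errors_per_hour() {
  awk '$9 ~ /^[45][0-9][0-9]$/ { c[substr($4, 2, 14)]++ }
       END { for (h in c) print h, c[h] }' |
    sort
}
```

Each function reads from stdin, so zcat -f /var/log/apache2/access.log* | errors_per_hour gives an hourly error series that can be compared against deployment times.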

When to Use Client-Side Tools

If you need funnel attribution, event tracking, or product analytics (e.g., feature usage), consider privacy-friendly front-end tools like Plausible or Umami. They complement server-side log analytics rather than replacing it.
