
Improve run history management for high-frequency jobs #416

@NikolayS

Problem

cron.job_run_details grows unbounded for high-frequency jobs. At 1-second scheduling (supported since pg_cron 1.5), this table accumulates 86,400 rows/day (~12 MiB/day) per job. There is no built-in mechanism to manage this growth — the only option is cron.log_run = off, which disables all logging.

This came up while building pg_ash, which uses pg_cron for 1-second sampling of pg_stat_activity. The sampling itself produces ~30 MiB/day, but pg_cron's own logging adds another ~12 MiB/day — a 40% overhead that provides no value for routine successful executions.
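For context, a 1-second sampler of this kind looks roughly as follows (the `ash_samples` table name is illustrative; pg_cron 1.5+ accepts an interval-style schedule string in place of a cron expression):

```sql
-- Illustrative pg_ash-style sampler: one INSERT per second.
-- Every execution also produces one row in cron.job_run_details.
SELECT cron.schedule(
  'ash-sample',
  '1 second',
  $$INSERT INTO ash_samples SELECT now(), * FROM pg_stat_activity$$
);
```

Each of these 86,400 daily executions is logged individually, which is where the ~12 MiB/day of `job_run_details` growth comes from.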

Proposed improvements

1. cron.log_run_errors_only (or cron.log_run = 'errors')

Only log failed executions. Successful runs are the common case for well-functioning jobs — logging them is pure overhead. This would reduce job_run_details growth to near-zero for healthy systems while preserving the ability to diagnose failures.

An enum-based cron.log_run would be clean:

  • all (current default, backward-compatible)
  • errors (only failures)
  • off (disable logging entirely, matching today's cron.log_run = off)
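As a sketch of the proposed (not yet existing) syntax in postgresql.conf:

```
# Hypothetical enum-valued GUC — not current pg_cron syntax
cron.log_run = 'errors'   # record failed executions only
```

Existing `cron.log_run = on/off` settings could be mapped to `'all'`/`'off'` for backward compatibility.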

2. cron.max_run_history — automatic purge with FIFO ring buffer

A GUC that caps the number of rows in job_run_details. When exceeded, old rows are purged. Two implementation approaches:

  • Simple DELETE — easy to implement but creates dead tuples requiring vacuum on every purge cycle
  • Ring buffer with TRUNCATE — use 2-3 partitions and TRUNCATE the oldest when full. Zero bloat, zero vacuum. This is the same approach pg_ash itself uses for sample storage.

The partition-based approach would be ideal for high-frequency jobs.
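A minimal sketch of the ring-buffer idea, assuming LIST partitioning on a slot column (all table and column names are illustrative, not pg_cron internals):

```sql
-- Three-slot ring buffer for run history.
CREATE TABLE run_history (
    jobid      bigint,
    status     text,
    start_time timestamptz NOT NULL,
    slot       int NOT NULL          -- which ring slot this row lives in
) PARTITION BY LIST (slot);

CREATE TABLE run_history_0 PARTITION OF run_history FOR VALUES IN (0);
CREATE TABLE run_history_1 PARTITION OF run_history FOR VALUES IN (1);
CREATE TABLE run_history_2 PARTITION OF run_history FOR VALUES IN (2);

-- When the active slot fills, writes advance to the next slot and the
-- oldest slot is reclaimed. TRUNCATE frees its space instantly with no
-- dead tuples, so purging never generates vacuum work:
TRUNCATE run_history_1;   -- e.g. slot 1 is oldest while slot 0 is active
```

With three slots, at least two slots' worth of history is always retained, and the cost of a purge is a single TRUNCATE rather than a large DELETE plus vacuum.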

3. Log to server log instead of table

For jobs that run every few seconds, writing to a table on every execution is expensive (WAL, indexes, vacuum). An option to log to the Postgres server log (log_destination) instead would:

  • Eliminate table bloat entirely
  • Integrate with existing log management infrastructure (pgBadger, CloudWatch, etc.)
  • Let users filter with standard log analysis tools
  • Keep job_run_details table for interactive queries when needed

This could be another enum value: cron.log_run = 'serverlog'.

Context

With pg_cron 1.5+ supporting sub-minute scheduling, high-frequency jobs (1-60 seconds) are a first-class use case. The current logging design assumes jobs run at most once per minute — at that rate, job_run_details growth is manageable. At 1-second frequency, it becomes a significant operational concern.

The combination of "errors only" logging + periodic TRUNCATE-based purge would make pg_cron suitable for high-frequency workloads without any manual cleanup overhead.
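Until such options exist, a workable interim approach is a purge job scheduled through pg_cron itself (the job name and retention interval below are illustrative):

```sql
-- Hourly cleanup keeping one day of run history.
SELECT cron.schedule(
  'purge-cron-history',
  '0 * * * *',
  $$DELETE FROM cron.job_run_details
    WHERE end_time < now() - interval '1 day'$$
);
```

This bounds the table size but still incurs the dead-tuple and vacuum overhead described above, which is what the TRUNCATE-based design would eliminate.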
