Race condition between backup and database deletion causes stuck resources #1535

@tartkri-bot

Summary

When a PostgreSQL database is deleted while its initial backup is still running, the operator removes the <db-name>-pgbackrest-secrets Secret before the backup completes. The backup fails, and the database gets stuck in the Deleting state.

Steps to Reproduce

  1. Provision a PostgreSQL database with backups enabled (scheduled + PITR)
  2. Wait for database to reach Ready state
  3. Delete the database immediately, while the initial backup is still In Progress
  4. The database gets stuck in Deleting state

Actual Behavior

  1. Database deletion is initiated
  2. Operator removes <db-name>-pgbackrest-secrets Secret
  3. Running backup pod fails with:
    MountVolume.SetUp failed for volume "pgbackrest-config": secret "<db-name>-pgbackrest-secrets" not found
  4. The backup remains stuck in Running and the database remains stuck in Deleting

Resource State When Stuck

$ kubectl get pg  # cluster gone
No resources found

$ kubectl get perconapgbackups
NAME                           STATUS    AGE
postgresql-wza-backup-lcmn     Running   5m13s

$ kubectl get perconapgcluster
NAME             STATUS     AGE
postgresql-wza   Deleting   10m

Root Cause

The deletion flow removes Secrets without first checking for, or waiting on, in-progress backups. The backup pod needs the <db-name>-pgbackrest-secrets Secret to mount its pgbackrest-config volume, but by that point the Secret has already been deleted.

Workaround

Manually recreate the <db-name>-pgbackrest-secrets Secret, for example by cloning it from another database that uses the same S3 bucket. This allows the backup to finish and the deletion to proceed.
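
For anyone applying the workaround, here is a minimal Python sketch of the cloning step. It assumes the source Secret was exported with kubectl get secret ... -o json and that only the name needs to change. The function name, the script name, and the list of dropped metadata fields are illustrative assumptions, not taken from the operator. Labels and annotations are copied unchanged and may need adjusting by hand.

```python
import copy
import json
import sys

# Metadata that should not be carried over to a new object. This list is an
# assumption based on common Kubernetes metadata fields, not on the operator.
DROPPED_METADATA = (
    "uid", "resourceVersion", "creationTimestamp", "generation",
    "managedFields", "ownerReferences", "selfLink",
)


def clone_pgbackrest_secret(source: dict, db_name: str) -> dict:
    """Return a copy of `source` renamed to <db_name>-pgbackrest-secrets."""
    clone = copy.deepcopy(source)  # leave the caller's manifest untouched
    metadata = clone.setdefault("metadata", {})
    for field in DROPPED_METADATA:
        metadata.pop(field, None)
    metadata["name"] = f"{db_name}-pgbackrest-secrets"
    return clone


def main(db_name: str) -> None:
    # Read the exported Secret from stdin, write the renamed manifest to stdout.
    manifest = clone_pgbackrest_secret(json.load(sys.stdin), db_name)
    json.dump(manifest, sys.stdout, indent=2)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under a hypothetical name such as clone_secret.py, it could sit in a pipeline like: kubectl get secret <other-db>-pgbackrest-secrets -o json | python clone_secret.py <db-name> | kubectl apply -f -. The namespace is kept from the source Secret, so this assumes both databases live in the same namespace.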

Proposed Fix

Ensure proper deletion ordering: wait for or cancel in-progress backups before deleting pgbackrest-secrets.
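
To make the proposed ordering concrete, here is a rough Python sketch of the decision the deletion flow would need to make before touching the Secret. It is not the operator's actual code. The Backup type, the Action values, and the IN_PROGRESS set are hypothetical, and "Running" is the only status string taken from this report.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT_FOR_BACKUPS = "wait_for_backups"  # keep the Secret, retry later
    DELETE_SECRET = "delete_secret"        # safe to remove pgbackrest-secrets


@dataclass(frozen=True)
class Backup:
    name: str
    status: str


# Assumption: only "Running" is known from this issue. A real implementation
# would list every status in which the backup pod still needs the Secret.
IN_PROGRESS = frozenset({"Running"})


def next_deletion_step(backups: list[Backup]) -> tuple[Action, list[str]]:
    """Decide whether the Secret may be deleted yet.

    Returns the action plus the names of the backups still blocking deletion.
    """
    blocking = sorted(b.name for b in backups if b.status in IN_PROGRESS)
    if blocking:
        return Action.WAIT_FOR_BACKUPS, blocking
    return Action.DELETE_SECRET, []


if __name__ == "__main__":
    stuck = [Backup("postgresql-wza-backup-lcmn", "Running")]
    print(next_deletion_step(stuck))
```

With the state captured in this report, the function asks the caller to wait rather than delete the Secret. Cancelling the backup instead of waiting would fit the same shape: the caller would stop the blocking backups and re-evaluate.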

Environment

  • Operator: percona-postgresql-operator (via everest-operator)
  • Storage: S3
  • Discovered during: automated health check workflows with rapid create/delete cycles
