Skip to content

Add celery and celery-beat to delete partial files every 48 hours #234

Merged
paigewilliams merged 23 commits intodevelopfrom
delete-partial-files
Mar 12, 2026
Merged

Add celery and celery-beat to delete partial files every 48 hours #234
paigewilliams merged 23 commits intodevelopfrom
delete-partial-files

Conversation

@paigewilliams
Copy link
Collaborator

@paigewilliams paigewilliams commented Mar 10, 2026

asana task

Adds celery, celery-beat and django-celery-results, django-celery-beat to delete partial uploads every 48 hours, with the goal of not bloating servers with partial uploads. Deletion logic exists in the delete_expired_chunks task. Uses redis as the message broker. django-celery-results displays the task results in the admin interface. This should help with transparency and observability. django-celery-beat also displays the intervals at which tasks run. The thinking here is that it allows tribes to have more control over background tasks, and if they want them or not.

For running with docker, adds a celery worker, celery-beat and redis container. For running with Vagrant it runs daemonized. The only step to add in the install instructions is sudo bash /usr/local/apps/TEKDB/TEKDB/scripts/provision_celery.sh to run after database migrations.

Screenshots

Screenshot 2026-03-11 at 11 25 27 AM Screenshot 2026-03-11 at 11 31 21 AM Screenshot 2026-03-11 at 11 23 41 AM

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces Celery (worker + beat) into the TEKDB Django project to periodically delete expired resumable-upload chunk files (intended every 48 hours), using Redis as the broker and django-celery-results / django-celery-beat for admin visibility and scheduling control.

Changes:

  • Add Celery app/task (delete_expired_chunks) and schedule it via Django settings; add a basic test for the cleanup behavior.
  • Add Docker compose services for redis, celery, and celery-beat, plus a web healthcheck used for startup ordering.
  • Add Vagrant/systemd provisioning artifacts to run Redis + Celery worker/beat as services.

Reviewed changes

Copilot reviewed 14 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
docker/docker-compose.yaml Adds redis/celery/celery-beat services to dev compose and wires broker env.
docker/docker-compose.prod.local.yaml Adds redis/celery/celery-beat services to prod-local compose.
docker/common.yaml Defines redis/celery/celery-beat service templates and adds a web healthcheck.
deployment/celery-worker.service New systemd unit for celery worker in Vagrant.
deployment/celery-beat.service New systemd unit for celery beat (DatabaseScheduler) in Vagrant.
TEKDB/scripts/provision_celery.sh Provision script to install Redis and enable/start celery systemd units.
TEKDB/requirements.txt Adds celery, redis, django-celery-results, django-celery-beat dependencies.
TEKDB/media/init.py Adds placeholder file under media directory.
TEKDB/TEKDB/tests/test_tasks.py Adds tests for chunk cleanup behavior.
TEKDB/TEKDB/tekdb_filebrowser.py Minor lambda formatting adjustment (parentheses).
TEKDB/TEKDB/tasks.py Adds delete_expired_chunks Celery task and helper formatting function.
TEKDB/TEKDB/settings.py Registers celery results/beat apps and adds Celery configuration + beat schedule.
TEKDB/TEKDB/celery.py Adds Celery app initialization and autodiscovery.
TEKDB/TEKDB/admin.py Adjusts django-celery-results admin registration (unregister GroupResult).
TEKDB/TEKDB/init.py Imports Celery app on Django startup.
TEKDB/TEKDB/management/commands/delete_expired_chunks.py Adds a management command module file (currently empty).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +17 to +18


Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytes_to_readable can return None for very large values because the loop has no fallback return after the last unit. Ensure the function always returns a string (e.g., add a final return for the largest unit).

Suggested change
# Fallback: for extremely large values, use the largest unit.
return f"{num_bytes:.2f} P{suffix}"

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +46
redis:
extends:
file: common.yaml
service: redis
celery:
extends:
file: common.yaml
service: celery
celery-beat:
extends:
file: common.yaml
service: celery-beat
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description indicates celery/beat should be part of the Docker deployment to run the 48h cleanup, but only docker-compose.yaml (dev) and docker-compose.prod.local.yaml include the redis/celery services. The actual deployed compose file (docker/docker-compose.prod.yaml, used in .github/workflows/deploy-ec2-staging.yml) currently does not include redis/celery/celery-beat, so the cleanup task won’t run in that environment.

Copilot uses AI. Check for mistakes.
path,
filter_func=lambda fo: not fo.has_media_record()
and fo.filename not in self.files_folders_to_ignore(),
filter_func=lambda fo: (
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes to this file are just ruff formatting fixes

@@ -0,0 +1,18 @@
[Unit]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RUN chmod +x /usr/local/bin/entrypoint.sh

# Create a non-root user and give it ownership of the app directory
RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser \
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to resolve warning of SecurityWarning: You're running the worker with superuser privileges: this is absolutely not recommended! in the celery container.

Copy link
Member

@pollardld pollardld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🎉 🎉

volumes:
- static_volume:/vol/static/static:ro
- media_volume:/vol/static/media:ro
redis:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that it is possible, and so easy, to extend into other yaml files.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed! makes it super easy to have different configs for different environments, but keep things DRY.

Copy link
Member

@rhodges rhodges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only is it a great new feature, but this makes an easy job of including several very powerful tools that will be welcome when adding features in the future! I am curious how this will impact hardware usage, but in past experience Redis and Celery have proven themselves to be quite performant.

context: ../TEKDB/
dockerfile: ../TEKDB/Dockerfile
restart: unless-stopped
depends_on:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this dependency removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see the inheritance now - cool!

file_path = os.path.join(root, filename)
try:
mtime = os.path.getmtime(file_path)
if mtime >= cutoff_timestamp:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the one piece of logic I most wanted to see in this review, and not only is it here, it's clear, readable, and well-written!

@paigewilliams paigewilliams merged commit efb9681 into develop Mar 12, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants