Refactor article cleanup into dedicated tool #495

Open
mircealungu wants to merge 1 commit into master from refactor-article-cleanup-tools

@mircealungu
Member

Summary

  • Removes article deletion code from anonymize_users.py (separation of concerns)
  • Rewrites remove_unreferenced_articles.py with comprehensive cleanup:
    • Deletes from all dependent tables first (tokenization_cache, cefr_assessment, fragments, etc.)
    • Cleans up orphaned source, source_text, and new_text records
    • Adds --days and --delete-from-es CLI options
    • Reports progress while deleting in batches
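
The dependent-first ordering described above can be sketched as follows. This is an illustration only, not the code in remove_unreferenced_articles.py: it uses an in-memory SQLite database, and the `article_id` column name, the `batch_size` parameter, and the helper names `delete_articles_in_batches` and `build_demo_db` are assumptions made for the example. Only the table names tokenization_cache and fragments come from the summary.

```python
import sqlite3

# Assumption: each dependent table references article via an "article_id" column.
# The real schema may use different column names and more tables.
DEPENDENT_TABLES = [("tokenization_cache", "article_id"), ("fragments", "article_id")]


def delete_articles_in_batches(conn, article_ids, batch_size=2):
    """Delete dependent rows first, then the articles, committing once per batch."""
    deleted = 0
    for start in range(0, len(article_ids), batch_size):
        batch = article_ids[start:start + batch_size]
        placeholders = ",".join("?" for _ in batch)
        # Children go first so the foreign-key constraints are never violated.
        for table, fk in DEPENDENT_TABLES:
            conn.execute(f"DELETE FROM {table} WHERE {fk} IN ({placeholders})", batch)
        cur = conn.execute(f"DELETE FROM article WHERE id IN ({placeholders})", batch)
        conn.commit()
        deleted += cur.rowcount
        print(f"Deleted {deleted}/{len(article_ids)} articles")
    return deleted


def build_demo_db():
    """Create a tiny throwaway schema with five articles and one child row each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY)")
    for table, fk in DEPENDENT_TABLES:
        conn.execute(
            f"CREATE TABLE {table} ("
            f"id INTEGER PRIMARY KEY, {fk} INTEGER NOT NULL REFERENCES article(id))"
        )
    conn.executemany("INSERT INTO article (id) VALUES (?)", [(i,) for i in range(1, 6)])
    for table, fk in DEPENDENT_TABLES:
        conn.executemany(f"INSERT INTO {table} ({fk}) VALUES (?)", [(i,) for i in range(1, 6)])
    conn.commit()
    return conn
```

Committing per batch keeps each transaction short, which is one plausible reason to pair batch processing with progress reporting; deleting the article rows before their children would fail here because foreign keys are enforced.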

Test plan

  • Run python tools/remove_unreferenced_articles.py --days 90 on staging
  • Verify dependent tables are cleaned up properly
  • Verify anonymize_users.py still works for user anonymization
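
For reviewers running the first step, here is a rough sketch of how the two options might be parsed. The flag names --days and --delete-from-es come from this PR; their types, the default of 90, the help texts, and the helpers `parse_args` and `cutoff_date` are assumptions for illustration and may not match the actual script.

```python
import argparse
from datetime import datetime, timedelta


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Remove unreferenced articles (sketch).")
    # Assumption: --days is an integer age threshold; the default here is made up.
    parser.add_argument("--days", type=int, default=90,
                        help="only consider articles older than this many days")
    # Assumption: --delete-from-es is a boolean switch that is off by default.
    parser.add_argument("--delete-from-es", action="store_true",
                        help="also remove the deleted articles from Elasticsearch")
    return parser.parse_args(argv)


def cutoff_date(days, now=None):
    """Return the timestamp before which an article counts as old enough."""
    now = now or datetime.now()
    return now - timedelta(days=days)


if __name__ == "__main__":
    args = parse_args()
    print(f"cutoff={cutoff_date(args.days):%Y-%m-%d} delete_from_es={args.delete_from_es}")
```

Under these assumptions, `--days 90` on its own would leave the Elasticsearch index untouched, so the staging run in the test plan would exercise only the database cleanup.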

🤖 Generated with Claude Code

- Remove article deletion code from anonymize_users.py (separation of concerns)
- Rewrite remove_unreferenced_articles.py with proper cleanup:
  - Deletes from all dependent tables first (tokenization_cache, fragments, etc.)
  - Cleans up orphaned source, source_text, and new_text records
  - Adds --days and --delete-from-es CLI options
  - Progress reporting with batch processing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions

ArchLens - No architecturally relevant changes to the existing views
