Description
#601 provided substantive increases in annotation performance, at the cost of stale data.
However, our esteemed @labradorite-dev has proposed a solution which could resolve this issue in turn!
I would be remiss not to include the other, much larger refactor I considered, which is to implement proper CQRS. The core of the issue here is that the data is structured for writes, NOT reads. Then, at query time, we pay the price and have to stitch all the data together. Materialized views shift that data stitching slightly left (once a day in a separate job), but a potentially more "complete" fix would be to add a new, fully denormalized table which looks like this:
    -- read model, updated on every write command
    annotation_queue (
        url_id            PRIMARY KEY,
        priority_rank     INT,   -- pre-computed: manual > followed > count > id
        total_count       INT,
        is_manual         BOOL,
        followed_by_user  BOOL,
        ...
    )

Then, each time we write a new annotation to be consumed by annotators, we also populate this table correctly. At query time we then have a single-row read, which is extremely cheap given an index on priority_rank:

    SELECT * FROM annotation_queue ORDER BY priority_rank LIMIT 1
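To make the proposal concrete, here is a minimal sketch of the read model using Python's built-in `sqlite3`. Everything beyond the column names quoted above is an assumption for illustration: the helper names (`priority_rank`, `create_read_model`, `record_annotation`, `next_url`), the `COUNT_CAP` bound, the integer encoding of the rank, the index, and the reading of `total_count` as "annotations recorded for this URL". The real schema and write path may well differ.

```python
import sqlite3

# Hypothetical upper bound on total_count, used only to pack the rank into one int.
COUNT_CAP = 1_000_000


def priority_rank(is_manual: bool, followed_by_user: bool, total_count: int) -> int:
    """Encode manual > followed > count as one integer; lower sorts first."""
    capped = min(total_count, COUNT_CAP - 1)
    return (
        (0 if is_manual else 1) * 2 * COUNT_CAP
        + (0 if followed_by_user else 1) * COUNT_CAP
        + (COUNT_CAP - 1 - capped)  # higher counts get a lower (better) rank
    )


def create_read_model(conn: sqlite3.Connection) -> None:
    """Create the denormalized table plus an index that keeps the read cheap."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS annotation_queue (
            url_id           INTEGER PRIMARY KEY,
            priority_rank    INTEGER NOT NULL,
            total_count      INTEGER NOT NULL,
            is_manual        INTEGER NOT NULL,
            followed_by_user INTEGER NOT NULL
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_annotation_queue_rank "
        "ON annotation_queue (priority_rank, url_id)"
    )


def record_annotation(conn, url_id, is_manual=False, followed_by_user=False):
    """Write-side command: bump the count and refresh the pre-computed rank."""
    row = conn.execute(
        "SELECT total_count, is_manual, followed_by_user "
        "FROM annotation_queue WHERE url_id = ?",
        (url_id,),
    ).fetchone()
    count = (row[0] if row else 0) + 1
    manual = is_manual or bool(row and row[1])
    followed = followed_by_user or bool(row and row[2])
    conn.execute(
        "INSERT OR REPLACE INTO annotation_queue "
        "(url_id, priority_rank, total_count, is_manual, followed_by_user) "
        "VALUES (?, ?, ?, ?, ?)",
        (url_id, priority_rank(manual, followed, count), count, int(manual), int(followed)),
    )
    conn.commit()


def next_url(conn):
    """Read side: one indexed lookup; url_id breaks ties between equal ranks."""
    row = conn.execute(
        "SELECT url_id FROM annotation_queue ORDER BY priority_rank, url_id LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

In a real implementation the read-model update would presumably run in the same transaction as the write to the source-of-truth tables, so the two cannot drift apart. That is the property that would address the stale data introduced by #601.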
This seems nifty, and so I propose we implement it!