Conversation
|
Over other channels we agreed we would try the migrate-early approach in this PR, rather than the migrate-late approach from #822 |
2591aff to
1bd2cdd
Compare
09072bb to
cb45455
Compare
cb45455 to
1b77fe6
Compare
1b77fe6 to
b53734f
Compare
jorisdral
left a comment
There was a problem hiding this comment.
Nice to see this take shape 😄
198da6b to
16e0b00
Compare
jorisdral
left a comment
There was a problem hiding this comment.
Looks good -- the only thing we should maybe reconsider a little bit more is how we handle empty union results
| -- In some cases we could even add it as one of the tiering runs of | ||
| -- the existing last regular level, but the difference is small and | ||
| -- we would need to relax the invariant that the last level uses | ||
| -- levelling. |
There was a problem hiding this comment.
This one I'm not sure I fully understand. When would it be added as a tiering run?
There was a problem hiding this comment.
Before migration, there usually is a last regular level, which uses tiering. If the union level is small enough, we could theoretically add it as a run into that tiering level (instead of creating a new levelling level for it), but that clashes with our current assertions about merge policy of course.
Maybe I should just remove the comment. I was trying to answer the question a reader might have: "Why do we always have to create a new level just for this union run, even if it's small?" But maybe that causes more confusion than it prevents.
There was a problem hiding this comment.
I think the first paragraph you wrote already cleared up my confusion. Just add some of those detail to the haddocks and then we should be fine. Maybe something along the lines of:
Before migrating the run that resulted from the union into the regular levels, there is usually a last "regular" level, which uses tiering. We could add the union-result run to the resident tiering runs of the last regular level, but that clashes with some invariants. In any case, the difference is small and we would need to relax the invariant that the last level uses levelling.
16e0b00 to
051001b
Compare
jorisdral
left a comment
There was a problem hiding this comment.
LGTM! Were there any changes since last time I reviewed that you think I should look at? If not, let's resolve the final few review comments, and then we can put this PR on the merge queue
No, I only slightly tweaked the comments: https://github.com/IntersectMBO/lsm-tree/compare/16e0b0070478b3b97f6d5a37e1aebd59a44d7f81..051001bb5634e1166f386d1c16fe6c406027ee4e So I will make some last documentation improvements as discussed, rebase and merge. |
051001b to
2671565
Compare
When creating the union of multiple tables, we create a special union level containing a merging tree. Once the tree has been completed, i.e. merged into a single run, we want to get rid of the union level by migrating the run to the regular levels. The main motivation is that we want it to become part of a last level merge, so compaction can occur. Last level merges are especially useful for compaction, since they can drop all delete entries.
This PR demonstrates one of two approaches for migration (#822 being the other one). As soon as the merging tree gets completed through
supplyUnionCredits, we migrate it to a new last regular level. Sometimes it would not fit directly after the existing levels, in which case we can insert a few empty levels.An interesting case to consider it that a union can get completed by an operation on another table that contains a shared reference to the merging tree. Therefore, at the beginning of any operation, a completed but not yet migrated union might be present. To ensure unions can become part of merges as soon as possible, we also check for possible migrations when flushing the write buffer. It might be beneficial to do the same when doing lookups.