Skip to content

Edit smallset to sample in tests#74

Open
borisdevos wants to merge 8 commits intomainfrom
bd/smallset
Open

Edit smallset to sample in tests#74
borisdevos wants to merge 8 commits intomainfrom
bd/smallset

Conversation

@borisdevos
Copy link
Member

@borisdevos borisdevos commented Mar 2, 2026

Part 3 of splitting #71 in pieces. The motivation for changing smallset like this is to at least be able to sample from the sectors when there's finitely many. Base.Iterators.take will always take the first n many sectors implemented in SectorValues{I}. The change to TimeReversed is needed to call collect on its values in product sectors.

@codecov
Copy link

codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
src/timereversed.jl 91.66% <100.00%> (+0.36%) ⬆️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@lkdvos lkdvos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle I think the addition of the StatsBase dependency is breaking, as this would now make downstream tests fail since they are missing the dependency...

@borisdevos
Copy link
Member Author

Are we okay with this, or is there an alternative? A very silly one is to manually sample with rand, check with unique and replace accordingly, but this will be inherently slower than StatsBase's sample.
I wish to have a new release after #71 anyways, so the version would be bumped either here or there.

@lkdvos
Copy link
Member

lkdvos commented Mar 3, 2026

It's not about a version bump, it would then have to be a major version bump. I am not necessarily opposed to this, but it's definitely a bit disruptive for a somewhat minor convenience update. How about just some manual intervention:

result = I[]
p = sz / length(values(I)) # sample on average sz / length(values(I)) 
for a in values(I)
    if rand() <= p
        push!(result, a)
    end
end
return result
end

The main idea is that while this is not guaranteed to have exactly sz values, it would sample every sector with equal probability?

[EDIT] since we are already depending on Random, I think this is exactly equal to Random.randsubseq!(collect(values(I)), p)

borisdevos and others added 2 commits March 4, 2026 08:32
Co-authored-by: Lukas Devos <ldevos98@gmail.com>
@lkdvos
Copy link
Member

lkdvos commented Mar 4, 2026

Can you explain a bit why the loop is still necessary? If the probabilities are mapped correctly and the default size is 5, should this not be selecting every sector with 100% chance if there are only 5 or less, and (5/6)^6% chance that the set is empty in the 6 sector case seems acceptable?

function Base.IteratorSize(::Type{SectorValues{TimeReversed{I}}}) where {I <: Sector}

# capture product sectors as well
function Base.IteratorSize(::Type{SectorValues{TimeReversed{I}}}) where {I}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, is ProductSector not a subtype of Sector?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but for some reason the time-reversed of a product sector didn't behave so nicely.

@borisdevos
Copy link
Member Author

LargeZNIrrep{200} has 200 sectors, so the probability of an empty smallset is realistic, and also got caught in one of the tests. Otherwise I wouldn't have thought of it 🫠

return Random.randsubseq(vals, min(size, L) / L) # returns all sectors if L < size
p = resize!(randperm(L), min(size, L)) # sampling with fixed size
return collect(vals)[p]
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this randperm solution, and would also put it into place for infinite sectors. Something closely similar can be obtained by using shuffle, so how about this implementation:

Suggested change
end
function smallset(::Type{I}, size::Int = 5, maxdim::Real = 10) where {I <: Sector}
vals = values(I)
L = Base.IteratorSize(vals) === Base.IsInfinite() ? 10 * size : min(10 * size, length(vals))
sectors = getindex.((vals,), 2:L) # exclude unit in simple-unit categories
sectors = shuffle!(filter!(s->dim(s) > maxdim, sectors))
return sectors[1:min(size, length(sectors))]
end

I think this should work for all sectors. It will first select the first 10 * size sectors (if there are that many), excluding the unit. Then it will possibly throw out sectors with a large quantum dimension. Then it will randomly permute them via shuffle!, and only keep the first size (unless fewer sectors were left).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants