Bug Summary
Searching "query": "time:>=T" returns documents with time < T, where:
T is a numeric literal containing a valid integer nanosecond unix timestamp where T % 1000 > 0
- Indexed document(s) exist with nanoseconds timestamp
x where T-N < x < T for "small N" (*)
- Field
time in the index is configured with fast_precision: nanoseconds and output_format: unix_timestamp_nanos.
(*) Black-box behavior probing gives the impression that timestamps are being erroneously rounded down to lower precisions internally at some point during query execution, but further probing shows that the "boundary" points (the smallest U>T for which time:>=U stops returning documents at time T) are very unusual and don't seem to obviously correspond to either the integer microsecond or floating point nanosecond rounding error boundaries we'd expect. See "Example Repro" section below for more details.
Repro Steps
- Ingest some data with nanosecond precision timestamps (reference index configuration is provided below)
- Select some sample timestamp
T from any document, as long as the last 3 decimal digits of T are not 999. It's fine if the last 3 digits are all 000.
- Use the REST API to search
"query": "time:>={T+1}", "sort_by":"-time", "max_hits":1" where T the sample timestamp selected above.
- You will find that the document at time
T is incorrectly included in the result!
- Optional: Experiment with different time offsets (incorrect behavior repros with offsets much larger than 1 nano, though exactly what offset ahead of
T stops incorrectly returning the document at time T seems unusual and doesn't align with most normal precision rounding boundaries one would expect).
Example Repro
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
-H "Content-Type: application/json" \
-d '{"max_hits":1,"query":"time:>=1751128201918247001","sort_by":"-time"}'
{
"num_hits": 7354717336,
"hits": [
{
"time": 1751128201918247000
/* ... */
}
],
"elapsed_time_micros": 291577,
"errors": []
}
Bisecting this "time:>=T+N" query to find when Quickwit stops returning the document at time T found this:
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
-H "Content-Type: application/json" \
-d '{"max_hits":1,"query":"time:>=1751128201918247595","sort_by":"-time"}'
...
"time": 1751128201918247000
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
-H "Content-Type: application/json" \
-d '{"max_hits":1,"query":"time:>=1751128201918247596","sort_by":"-time"}'
...
"time": 1751128201918249000
Notice how the query for time:>= (T + 595) still incorrectly returns the document at time T, while querying time:>=(T + 596) finally got it to move on to the (presumably) next document >= T.
Expected behavior
Quickwit should NOT have returned a "hit" document with "time": 1751128201918247000 in response to a query for time:>=1751128201918247595, since 1751128201918247000 < 1751128201918247595 (equivalently 000 < 595).
In general:
- Querying
time:>=T should never return documents with time < T.
- Querying
field:>=X should never return documents with field < X.
Some black-box brainstorming
This odd boundary at least suggests that whatever the issue is internally, it's not as simple as integer nanoseconds being rounded down to integer microseconds, and probably not even nanoseconds being simply rounded to a IEEE-753 64-bit float!
If the timestamps were just being rounded into f64, then we'd expect the boundary point to be N > 40, rather than the observed boundary of N > 595 (since (1751128201918247000.0 >= 1751128201918247040.0) == true and (1751128201918247000.0 >= 1751128201918247041.0) == false).
Configuration
> quickwit --version
Quickwit 0.8.2 (x86_64-unknown-linux-gnu 2024-09-03T11:26:51Z 0f28194)
index.yaml
version: 0.8
index_id: fluentbit-logs
doc_mapping:
mode: lenient
field_mappings:
- name: time
type: datetime
input_formats:
- iso8601
output_format: unix_timestamp_nanos
fast_precision: nanoseconds
fast: true
# ...
timestamp_field: time
# ...
Bug Summary
Searching
"query": "time:>=T"returns documents withtime < T, where:Tis a numeric literal containing a valid integer nanosecond unix timestamp whereT % 1000 > 0xwhereT-N < x < Tfor "small N" (*)timein the index is configured withfast_precision: nanosecondsandoutput_format: unix_timestamp_nanos.(*) Black-box behavior probing gives the impression that timestamps are being erroneously rounded down to lower precisions internally at some point during query execution, but further probing shows that the "boundary" points (the smallest
U>Tfor whichtime:>=Ustops returning documents at timeT) are very unusual and don't seem to obviously correspond to either the integer microsecond or floating point nanosecond rounding error boundaries we'd expect. See "Example Repro" section below for more details.Repro Steps
Tfrom any document, as long as the last 3 decimal digits ofTare not999. It's fine if the last 3 digits are all000."query": "time:>={T+1}", "sort_by":"-time", "max_hits":1"whereTthe sample timestamp selected above.Tis incorrectly included in the result!Tstops incorrectly returning the document at timeTseems unusual and doesn't align with most normal precision rounding boundaries one would expect).Example Repro
Bisecting this
"time:>=T+N"query to find when Quickwit stops returning the document at timeTfound this:Notice how the query for
time:>= (T + 595)still incorrectly returns the document at timeT, while queryingtime:>=(T + 596)finally got it to move on to the (presumably) next document>= T.Expected behavior
Quickwit should NOT have returned a "hit" document with
"time": 1751128201918247000in response to a query fortime:>=1751128201918247595, since1751128201918247000 < 1751128201918247595(equivalently000 < 595).In general:
time:>=Tshould never return documents withtime < T.field:>=Xshould never return documents withfield < X.Some black-box brainstorming
This odd boundary at least suggests that whatever the issue is internally, it's not as simple as integer nanoseconds being rounded down to integer microseconds, and probably not even nanoseconds being simply rounded to a IEEE-753 64-bit float!
If the timestamps were just being rounded into
f64, then we'd expect the boundary point to beN > 40, rather than the observed boundary ofN > 595(since(1751128201918247000.0 >= 1751128201918247040.0) == trueand(1751128201918247000.0 >= 1751128201918247041.0) == false).Configuration
index.yaml