Skip to content

Api v2 improvements#6164

Merged
RobertJoonas merged 5 commits intomasterfrom
api-v2-improvements
Mar 18, 2026
Merged

Api v2 improvements#6164
RobertJoonas merged 5 commits intomasterfrom
api-v2-improvements

Conversation

@RobertJoonas
Copy link
Contributor

@RobertJoonas RobertJoonas commented Mar 16, 2026

Changes

Extract 2 API v2 improvements from #6152, that will be necessary for the main graph -> API v2 migration.

  1. Bugfix: Do not return time buckets that are outside the queried range. See commit description for an explanation of the change.
  2. Allow querying views_per_visit with a time dimension. This is basically just getting rid of a QueryBuilder validation -- the logic for querying the metric is already in use by the main graph.

Tests

  • Automated tests have been added

Changelog

  • Entries has been added to changelog

Documentation

  • This change does not need a documentation update

Dark mode

  • This PR does not change the UI

* For time:hour and time:minute, sessions are smeared using time_slots.
  The fix is to filter out time_slots that fall outside of the utc boundaries

* For any other time dimension, there's no session smearing, but since
  sessions are put into time buckets by the last event timestamps, the
  query might return buckets that are outside of the query time range.
  The fix is to clamp those sessions into the last bucket instead.
@RobertJoonas RobertJoonas requested a review from a team March 16, 2026 16:12
Copy link
Contributor

@ukutaht ukutaht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple suggestions for organization, but looks good otherwise.

end

def select_dimension(q, key, "time:month", :sessions, query) do
{_first, last_datetime} = Time.utc_boundaries(query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of re-calculating, can we not access query.utc_time_range instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function itself uses query.utc_time_range under the hood. The function also evaluates start as the latest of site.native_stats_start_at and utc_time_range.first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a huge deal, and sort of micro-optimisation, but maybe it would be worth it to ensure Time.utc_boundaries is called once per query (given of course that none of the used parameters change. Perhaps its result could be stored under query.computed_utc_time_range or query.effective_utc_time_range?

BTW an absolutely out of scope nit but:

  defp beginning_of_time(candidate, native_stats_start_at) do
    if NaiveDateTime.after?(native_stats_start_at, candidate) do
      native_stats_start_at
    else
      candidate
    end
  end

could be rewritten as:

  defp beginning_of_time(candidate, native_stats_start_at) do
    Enum.max([candidate, native_stats_start_at], NaiveDateTime)
  end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a huge deal, and sort of micro-optimisation, but maybe it would be worth it to ensure Time.utc_boundaries is called once per query (given of course that none of the used parameters change. Perhaps its result could be stored under query.computed_utc_time_range or query.effective_utc_time_range?

Yeah, I guess the main difference between query.utc_time_range and Time.utc_boundaries is that utc_boundaries is strictly used for native queries, while utc_time_range also defines the range for imported data. So a fitting name would be native_utc_time_range.

There's also another difference -- one is NaiveDateTime and the other is DateTime.

)
|> where(
[s, time_slot: ts],
fragment("toStartOfHour(?)", ts) >=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be cleaner to change time_slots so it generates the correct time range, instead of letting it generate bad timeslots and then filtering them out later.

Copy link
Contributor

@zoldar zoldar Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be possible by wrapping timeSlots result with arrayFilter and filtering out timestamps outside the boundaries.

Sorry, another tangent here. Just realized that time_slots macro is "unhygienic" 🙈 in that it blindly assumes "s" relation alias binding in the scope it's used in. Ideally it would accept relation as another parameter, like:

  defmacrop time_slots(query, session, period_in_seconds) do
    quote do
      fragment(
        "timeSlots(toTimeZone(?, ?), toUInt32(timeDiff(?, ?)), toUInt32(?))",
        unquote(session).start,
        ^unquote(query).timezone,
        unquote(session).start,
        unquote(session).timestamp,
        ^unquote(period_in_seconds)
      )
    end
  end

(haven't tested it though - might require replacing unquote(session).field_name with field(unquote(session), :field_name))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! That way we don't need to worry about timezone conversions either, and can simply use greatest/least functions to trim session start/timestamp by UTC boundaries. 3d995d3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, @zoldar! I saw your comment only after merging. The suggestion to clean up the macro makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, your solution is better. I failed to register we have both boundary timestamps there (start <> timestamp).

@RobertJoonas RobertJoonas added this pull request to the merge queue Mar 18, 2026
Merged via the queue into master with commit 5d93835 Mar 18, 2026
24 checks passed
@RobertJoonas RobertJoonas deleted the api-v2-improvements branch March 18, 2026 11:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants