Core: Simplify RESTTableScan by removing catalog internals#15595
Core: Simplify RESTTableScan by removing catalog internals#15595singhpk234 wants to merge 3 commits intoapache:mainfrom
Conversation
| private final String fetchScanTasksPath; | ||
| private final boolean supportsAsync; | ||
| private final boolean supportsCancel; | ||
| private final boolean supportsFetchTasks; |
There was a problem hiding this comment.
Why is this a boolean? I think it should be required and planning should fail if task fetching is not supported. @nastra, what do you think? Is it worth supporting cases where plan is implemented but not fetch? That would mean that planning has to be all-at-once, which is a bad design!
There was a problem hiding this comment.
Is it worth supporting cases where plan is implemented but not fetch
It could be possible i planned once already and the plan is cached and i can send all-at-once it could be a use case, and i don;t need to do support fetchTask if my planning is fast or its only supported for small tables wdyt ?
There was a problem hiding this comment.
@rdblue I think it is reasonable to only support the plan endpoint without having immediate support for the tasks endpoint (independently from whether such a design results in good perf). It is likely that servers implementing remote scan planning will start with the easy endpoint first (plan) and later add the async planning part and I think we should at least not mandate that all endpoints must be supported immediately
There was a problem hiding this comment.
@singhpk234, wouldn't you cache the results of each request? The motivation for breaking scan planning into multiple requests isn't just the cost of reading manifests. It is also the cost of sending results (you could have thousands of matching files per plan task) and caching them (if you have thousands per request). I doubt that anyone would want to do the work of aggregating plan task results in a cache only to have a more coarse cache that invalidates an entire planning result rather that just parts of it.
My concern here is also motivated by being able to keep things simple. We don't want new REST catalog features that require gracefully falling back for every single endpoint. That's needless complexity. In this case, as we add planning we need to document which parts are required and which ones are optional. I think that supporting both plan and fetch is a minimum for getting this working and we can simplify by asserting that.
@nastra, this doesn't mean the fetch endpoint actually has to do anything. You could always return all results from plan and avoid hitting it. But we don't need to increase the complexity of all clients.
There was a problem hiding this comment.
One can prewarm the caches is one thing i was thinking for small tables ? orthogonally there may be cases where some one is producing a governed set in a sense they gave plan-id and took a while to materialize, until the client polls, one they are done producing they just give the whole list of materialized, then they give it one go (I totally agree, we are still constrained by size of the payload here), which would be ok to it in spark side semantics where we wait to collect all the splits on the driver before handling of to the executor, rather than having a streaming splits like Trino ?
I can say this falls under example of "bad architecture" but i believe these are possible as of today.
I think one other way to put is its the server who is orchestrating the client to come to them by giving them the planTask, they are in their full liberty to not do so and achieve the architecture they want and co-exit and from the spec / ref impl perspective we can take a stand that this is what we recommend, though i feel its a bit strict but i can understand how this simplifies things
There was a problem hiding this comment.
@nastra, this doesn't mean the fetch endpoint actually has to do anything. You could always return all results from plan and avoid hitting it. But we don't need to increase the complexity of all clients.
@rdblue I believe this should be acceptable for implementors as long as we clearly document the requirements, so I'm in favor of doing this
21e8da9 to
cffc400
Compare
| TableOperations ops, TableIdentifier finalIdentifier, RESTClient restClient) { | ||
| // server supports remote planning endpoint and server / client wants to do server side planning | ||
| if (endpoints.contains(Endpoint.V1_SUBMIT_TABLE_SCAN_PLAN) && restScanPlanningEnabled) { | ||
| boolean supportsAsync = endpoints.contains(Endpoint.V1_FETCH_TABLE_SCAN_PLAN); |
There was a problem hiding this comment.
should we maybe name this supportsFetchPlanningResult to be more precise? async could mean different things to different people and the endpoint that we're referring to here is the one that returns the planning result for a given planId
There was a problem hiding this comment.
I would say being able to handle the "submitted" but not "completed" status is a reasonable definition of "async". And I wouldn't say supportsFetchPlanningResult is actually more clear. In the planning APIs, we use "fetch" to refer to the endpoint where you fetch the result of a plan task. I personally find it hard when I see "fetch planning result" vs "fetch plan task result". These are too similar to be helpful.
There was a problem hiding this comment.
Another concern here is that we don't want people to think that a reasonable implementation may use the async endpoint to plan an entire scan and send it back as one result, rather than breaking up the planning operation and using multiple calls to fetch the results. That would be a really bad use of this API and is another good reason to make fetch required and async optional.
core/src/main/java/org/apache/iceberg/rest/RESTScanContext.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/rest/RESTScanContext.java
Outdated
Show resolved
Hide resolved
| TableIdentifier tableIdentifier, | ||
| boolean supportsAsync, | ||
| boolean supportsCancel, | ||
| boolean supportsFetchTasks, |
There was a problem hiding this comment.
I appreciate that you're turning around these updates so quickly! I was thinking about this a little differently, so I'm sorry that I should have been a bit more clear with my suggestion.
What I was thinking is a class that behaves like ResourcePaths, but at the table level to encapsulate logic that doesn't need to be spread out. Something like a TableResource, which would be constructed from ResourcePaths, TableIdentifier, and Set<Endpoint> for table-specific endpoint checks like supportsAsync.
tableResource.planPath()would return the planning submit pathtableResource.planPath(planId)would return the async endpoint to poll for the give plan IDtableResource.fetchPath()would return the endpoint for fetching plan task results
I think of this as a "table resource" because it could handle more than just scan planning. For instance, the code for endpoint checks is also littered throughout RESTTableOperations. I would want RESTTableOperations to also delegate as much as possible to the table resource object as well, rather than keeping a copy of the supported endpoints. This wouldn't need to happen right now, but it's how I think about what I'm proposing.
Does that make sense? Do you think it would be a good direction?
There was a problem hiding this comment.
I was just trying to get this in ASAP (keeping 1.11 in mind) :) !
This makes sense to me, over all ! Though some questions inline:
for table-specific endpoint checks like supportsAsync.
Do you envision these to be kind of return null when things are not applicable, we can add in the function contract.
Something like a TableResource, which would be constructed from
Do you think, we would need something like Namespace resource, i am mostly thinking if there is an oppurtunity for having a BaseResource class where i can put the CRUD related call in ti.
+1 on moving this stuff out of RESTTableOps, will think more
| .map(c -> StorageCredential.create(c.prefix(), c.config())) | ||
| .collect(Collectors.toList())); | ||
| private FileIO scanFileIO(List<Credential> credentials) { | ||
| FileIO ioForScan = scanContext.createFileIO(credentials, planId); |
| assertThatThrownBy(restTableScanFor(table)::planFiles) | ||
| .isInstanceOf(UnsupportedOperationException.class) | ||
| .hasMessage("Server does not support endpoint: %s", Endpoint.V1_FETCH_TABLE_SCAN_PLAN); | ||
| .isInstanceOf(IllegalStateException.class) |
There was a problem hiding this comment.
I think this is correct. If a service puts the client in a submitted state but can't complete the operation after that, it's an illegal state.
core/src/test/java/org/apache/iceberg/rest/TestRESTScanPlanning.java
Outdated
Show resolved
Hide resolved
- Remove redundant fields (table, operations, tableIO) from RESTTableScan - Replace catalogProperties + hadoopConf with a FileIO factory function - Pre-compute resource paths instead of passing ResourcePaths + TableIdentifier - Replace Set<Endpoint> with supportsAsync/supportsCancel/supportsFetchTasks booleans - Change headers from Map to Supplier for consistency with RESTTable - Fix newRefinedScan to not propagate plan-scoped FileIO - Fix cleanupPlanResources to always invalidate tracker
- Use PlanStatus.SUBMITTED in error message for consistency - Make ParserContext lazily initialized to avoid recreation on newRefinedScan() - Change fileIOFactory signature to accept Credential and planId directly, moving StorageCredential conversion into the RESTTable lambda - Introduce RESTScanContext to collapse path/boolean/factory constructor params
…back - Rename RESTScanContext to TableResource, accepting Set<Endpoint> instead of pre-resolved booleans for future reuse - Rename methods: planTableScanPath() to planPath(), fetchScanTasksPath() to fetchPath() - Rename parserContext field to lazyParserContext to clarify lazy init - Remove supportsFetchTasks check (fetch endpoint is now required) - Simplify RESTTable constructor from 12 to 10 params
cffc400 to
ff6de18
Compare
Address #15561 (comment)