fix: bundle Arrow deps to fix build on Debian 12#171
Conversation
…x build on Debian 12 Arrow requires Snappy, zstd, Thrift, xsimd, and RapidJSON which are not built by the contrib system and may not be available as system packages (e.g. on Debian 12). Use Arrow's per-dependency BUNDLED source mechanism to have Arrow download and build these during the contrib build. Dependencies already provided by contrib (zlib, bzip2, boost) continue to be found via CMAKE_PREFIX_PATH. Remove libthrift-dev/thrift from CI prerequisites since Thrift is now bundled. Fixes OpenMS/OpenMS#8797 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughRemoved system Thrift package dependencies from CI workflows while configuring Apache Arrow to build all required dependencies from bundled sources instead of relying on system installations across Linux, macOS, and Windows platforms. Changes
Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
libraries.cmake/arrow.cmake (1)
45-49: Centralize bundled dependency flags to avoid branch drift.The same
*_SOURCE=BUNDLEDflags are duplicated in both OS branches. Defining them once and reusing the list will make future updates safer.♻️ Suggested refactor
+ set(ARROW_BUNDLED_DEP_ARGS + "-DSnappy_SOURCE=BUNDLED" + "-Dzstd_SOURCE=BUNDLED" + "-DThrift_SOURCE=BUNDLED" + "-Dxsimd_SOURCE=BUNDLED" + "-DRapidJSON_SOURCE=BUNDLED") ... - -D Snappy_SOURCE=BUNDLED - -D zstd_SOURCE=BUNDLED - -D Thrift_SOURCE=BUNDLED - -D xsimd_SOURCE=BUNDLED - -D RapidJSON_SOURCE=BUNDLED + ${ARROW_BUNDLED_DEP_ARGS} ... - "-DSnappy_SOURCE=BUNDLED" - "-Dzstd_SOURCE=BUNDLED" - "-DThrift_SOURCE=BUNDLED" - "-Dxsimd_SOURCE=BUNDLED" - "-DRapidJSON_SOURCE=BUNDLED" + ${ARROW_BUNDLED_DEP_ARGS}Also applies to: 150-154
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@libraries.cmake/arrow.cmake` around lines 45 - 49, Extract the repeated -D Snappy_SOURCE=BUNDLED, -D zstd_SOURCE=BUNDLED, -D Thrift_SOURCE=BUNDLED, -D xsimd_SOURCE=BUNDLED and -D RapidJSON_SOURCE=BUNDLED flags into a single CMake variable (e.g., ARROW_BUNDLED_DEPS) and use that variable in both OS branches and the other duplicated location (the second block containing the same flags) so the flags are defined once and reused to avoid branch drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@libraries.cmake/arrow.cmake`:
- Around line 45-49: Extract the repeated -D Snappy_SOURCE=BUNDLED, -D
zstd_SOURCE=BUNDLED, -D Thrift_SOURCE=BUNDLED, -D xsimd_SOURCE=BUNDLED and -D
RapidJSON_SOURCE=BUNDLED flags into a single CMake variable (e.g.,
ARROW_BUNDLED_DEPS) and use that variable in both OS branches and the other
duplicated location (the second block containing the same flags) so the flags
are defined once and reused to avoid branch drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3b28de70-b1ca-4b3f-807c-6b2b3cc60ce7
📒 Files selected for processing (2)
.github/workflows/main.ymllibraries.cmake/arrow.cmake
|
You are not specifying where to get AWSSDK and others from in Arrow, which might differ from other systems like brew or conda that will include all optional dependencies. Just so you know. Not strictly required. |
|
Yeah it is still not 100% there... thanks |
Summary
_SOURCE=BUNDLEDmechanism instead of requiring them as system packageslibthrift-dev/thriftfrom CI prerequisites since Thrift is now bundledContext
Arrow from contrib does not build on Debian 12 (and likely other minimal Linux installations) because it requires Snappy, zstd, Thrift, xsimd, and RapidJSON which are not built by the contrib system and are not available as system packages.
Previously, the CI worked around this by installing
libthrift-dev(Ubuntu) andthrift(macOS), and relied on GitHub runners having other deps pre-installed. This is fragile and undocumented.The fix uses Arrow's built-in
<pkg>_SOURCE=BUNDLEDflags to tell Arrow to download and build these dependencies itself during the contrib build. Dependencies already provided by contrib (zlib, bzip2, boost) continue to be found viaCMAKE_PREFIX_PATH.Note: This requires internet access during the contrib build for the bundled dependency downloads.
Fixes OpenMS/OpenMS#8797
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit