Skip to content

feat: replace GPAC with FFmpeg for MP4 demuxing #2170

Open
DhanushVarma-2 wants to merge 5 commits intoCCExtractor:masterfrom
DhanushVarma-2:feat/replace-gpac-with-ffmpeg
Open

feat: replace GPAC with FFmpeg for MP4 demuxing #2170
DhanushVarma-2 wants to merge 5 commits intoCCExtractor:masterfrom
DhanushVarma-2:feat/replace-gpac-with-ffmpeg

Conversation

@DhanushVarma-2
Copy link
Copy Markdown
Contributor

@DhanushVarma-2 DhanushVarma-2 commented Mar 4, 2026

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

Adds a Rust FFmpeg-based MP4 demuxer (rsmpeg) as a compile-time alternative to GPAC. Enabled with cmake -DWITH_FFMPEG=ON.

Supported: AVC/H.264, HEVC/H.265, CEA-608, CEA-708, tx3g, chapter extraction
Known gap: VobSub detected but not decoded
GPAC path: Unchanged, still the default

Note: mp4_ffmpeg_exports.rs shows as binary in the diff due to null-terminated C string literals for FFI — the file is clean UTF-8.


@DhanushVarma-2 DhanushVarma-2 changed the title Feat/replace gpac with ffmpeg feat: replace GPAC with FFmpeg for MP4 demuxing Mar 5, 2026
@DhanushVarma-2 DhanushVarma-2 force-pushed the feat/replace-gpac-with-ffmpeg branch from 006130e to 4039844 Compare March 6, 2026 06:23
@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

The 9 Windows test failures (autoprogram, spupng, startcreditstext) are unrelated to MP4 demuxing — they involve subtitle encoding and credits detection, not the MP4 code path. These same tests pass on the Linux CI run. The Windows build itself was also delayed by a Chocolatey 503 outage when installing GPAC.

All 237 Linux tests pass, including the 3 MP4-specific tests. The Linux CI bot also notes this PR fixes 9 previously-broken tests that had never passed before.

@DhanushVarma-2 DhanushVarma-2 force-pushed the feat/replace-gpac-with-ffmpeg branch 13 times, most recently from 87f2178 to 9e95cd7 Compare March 7, 2026 06:59
@cfsmp3
Copy link
Copy Markdown
Contributor

cfsmp3 commented Mar 7, 2026

Before going deep into this.

It would be a lot more readable to separate both implementations. Have process_mp4_ffmpeg and process_mp4_gpac functions, possibly in separate files, so we only need a few #ifdef guards.

Minimize the changes in the existing code (which already is not super well organized)...

Even (much better), do the ffmpeg part in rust. We don't want to add more C to the code - we really want to switch to rust.

@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

Before going deep into this.

It would be a lot more readable to separate both implementations. Have process_mp4_ffmpeg and process_mp4_gpac functions, possibly in separate files, so we only need a few #ifdef guards.

Minimize the changes in the existing code (which already is not super well organized)...

Even (much better), do the ffmpeg part in rust. We don't want to add more C to the code - we really want to switch to rust.

yeah sure.

@DhanushVarma-2 DhanushVarma-2 force-pushed the feat/replace-gpac-with-ffmpeg branch 2 times, most recently from 513276d to aa75430 Compare March 10, 2026 11:03
@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

@cfsmp3 Addressed all review feedback.
Rust FFmpeg demuxer supports AVC/H.264, HEVC/H.265, CEA-608/708, tx3g, and chapter extraction via rsmpeg. Build system fixes (ENABLE_HARDSUBX, link ordering, compat defines, swresample, bindgen allowlist), code cleanup (dead declarations, variable naming, match arm allocations), squashed to 3 commits. ccxr_dumpchapters is fully implemented shows as binary in diff due to FFI null terminators. VobSub detected but not decoded (known gap). Tested locally on MP4 files, all CI green.

@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

Screenshot 2026-03-10 at 6 15 03 PM

@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

@cfsmp3 can you review it.

@cfsmp3
Copy link
Copy Markdown
Contributor

cfsmp3 commented Apr 4, 2026

Hi @DhanushVarma-2 — it looks like this branch was force-pushed and now only contains the MKV KATE extension fix (3 commits all with the same message). The original FFmpeg MP4 demuxer work is gone.

The KATE fix is already merged via #2250, so this PR currently has no diff against master.

Did you accidentally force-push the wrong branch? You'll need to restore the FFmpeg work or open a new PR. Let us know.

@DhanushVarma-2 DhanushVarma-2 force-pushed the feat/replace-gpac-with-ffmpeg branch from 814de82 to aa75430 Compare April 4, 2026 22:04
@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

HI @cfsmp3 Sorry about that , I accidentally force-pushed the wrong branch and overwrote the FFmpeg work. Restored it now and merged with upstream master. Should be good to review.

Copy link
Copy Markdown
Contributor

@cfsmp3 cfsmp3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review

Default build (no FFmpeg) — CLEAN

We built the PR without -DWITH_FFMPEG=ON and ran all 27 CI-flagged tests. Every single output file is byte-identical to master. The default GPAC path has zero regressions.

FFmpeg build — DOES NOT BUILD

We tried to build with -DWITH_FFMPEG=ON:

mkdir build_ffmpeg && cd build_ffmpeg
cmake ../src -DWITH_FFMPEG=ON
make -j$(nproc)

Error 1 (pre-existing, not your fault): ffmpeg_intgr.c:100av_find_best_stream incompatible pointer type. Our FFmpeg 7.x changed AVCodec** to const AVCodec**. We suppressed this with -Wno-incompatible-pointer-types to continue.

Error 2 (this PR): Linker fails with:

undefined reference to 'ccx_mp4_process_hevc_sample'

The symbol exists in mp4_rust_bridge.c → compiled into libccx.a. But libccx_rust.a (which needs it) comes LAST in the link order. The linker already consumed libccx.a and won't go back.

The existing codebase handles this with --undefined= flags (see --undefined=decode_vbi, --undefined=do_cb, --undefined=store_hdcc in the CMakeLists). You need the same for your new bridge symbols:

set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--undefined=ccx_mp4_process_hevc_sample -Wl,--undefined=ccx_mp4_process_avc_sample")

Error 3 (this PR): lib_ccx/CMakeLists.txt replaces ENABLE_HARDSUBX with ENABLE_FFMPEG_MP4:

-  set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -DENABLE_HARDSUBX")
+  set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -DENABLE_FFMPEG_MP4")

This means -DWITH_FFMPEG=ON no longer defines ENABLE_HARDSUBX, breaking hardsubx when FFmpeg is enabled. It should be added alongside, not replaced.

Code concerns

  1. ccextractor.c changes the default MP4 processing loop — it now iterates over all input files inside the MP4 case. On master, only ctx->inputfile[ctx->current_file] is processed here. This changes behavior in the default (GPAC) path even without FFmpeg. Our tests didn't catch a regression (single-file MP4 tests pass), but this is a behavioral change that needs justification.

  2. Unrelated cosmetic changes: ccx_decoders_isdb.c reformats the CLUT array (one value per line), teletext.rs reformats code from #2240. These don't belong in this PR — please remove them.

  3. Test files: tests/popon_test.scc and tests/popon_test.srt — what are these for? Not mentioned in the PR description.

Summary

Since the FFmpeg build doesn't link, we couldn't test the actual new feature. Please fix:

  1. Add --undefined linker flags for the bridge symbols
  2. Keep ENABLE_HARDSUBX alongside ENABLE_FFMPEG_MP4
  3. Remove unrelated cosmetic changes
  4. Explain or remove the test files

Once it builds, we'll do a full test with MP4 samples comparing GPAC vs FFmpeg output.

@DhanushVarma-2 DhanushVarma-2 force-pushed the feat/replace-gpac-with-ffmpeg branch from 3b702c0 to e3e0a27 Compare April 5, 2026 13:06
@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

@cfsmp3 can you review it again

@cfsmp3
Copy link
Copy Markdown
Contributor

cfsmp3 commented Apr 7, 2026

Follow-up Review (Apr 6)

Thanks for the quick turnaround on the review feedback — the commit message shows you addressed every point we raised. However, we found new issues.

isdb changes introduce bugs

The ccx_decoders_isdb.c diff introduces three bugs that were NOT in the previous version of this PR:

1. Removed malloc NULL check (line 348-352 deleted)

 text->buf = malloc(128);
-if (!text->buf)
-{
-    free(text);
-    return NULL;
-}

If malloc fails, *text->buf = 0 on the next line dereferences NULL → crash.

2. Bounds check changed from - 1 to + 1 (line 727)

-if (i >= sizeof(arg) - 1)
+if (i >= (sizeof(arg)) + 1)

This allows writing 2 extra bytes past the end of arg before the check triggers. Buffer overflow.

3. Removed bounds guard on write (line 735)

-if (i < sizeof(arg))
-    arg[i] = *buf++;
+arg[i] = *buf++;

Unconditional write without bounds check → potential OOB write.

These must be reverted. The isdb file should not be modified at all in this PR.

teletext.rs still has unrelated changes

The reformatting of code from #2240 (multi-line → single-line) is still present. Please revert — it doesn't belong in this PR.

FFmpeg build

We attempted to build with -DWITH_FFMPEG=ON but hit a pre-existing issue: master itself fails with -DWITH_FFMPEG=ON because it defines ENABLE_HARDSUBX without linking tesseract/OCR libs. This is NOT your fault — it's a pre-existing broken coupling in the build system. We can't verify the FFmpeg code path builds correctly until this is resolved separately.

Your --undefined linker flag additions and the ENABLE_HARDSUBX preservation look correct from code inspection, but we can't confirm at link time.

What's still needed

  1. Revert ALL changes to ccx_decoders_isdb.c — the file should not be in this PR
  2. Revert teletext.rs changes — unrelated reformatting
  3. Once those are clean, we'll work on getting the FFmpeg build path testable

The default (non-FFmpeg) build remains clean — 27 CI tests all byte-identical to master.

Dhanush Varma added 5 commits April 7, 2026 12:20
Replace GPAC with FFmpeg for MP4 demuxing via a Rust implementation
using rsmpeg. Supports AVC/H.264, HEVC/H.265, CEA-608, CEA-708,
and tx3g subtitle tracks. Includes chapter extraction via
ccxr_dumpchapters. VobSub tracks are detected but not yet supported.

Selected at compile time via ENABLE_FFMPEG_MP4 / enable_mp4_ffmpeg
Cargo feature.
- mp4_rust_bridge.c/.h: C-callable functions for AVC/HEVC/CC processing
- ccx_gpac_types.h: compat defines for GF_4CC, GF_ISOM_SUBTYPE_C708, etc.
- mp4.c: dispatch to Rust demuxer when ENABLE_FFMPEG_MP4 is set
- ccextractor.c: iterate all input files in MP4 mode
- Remove dead processmp4_ffmpeg/dumpchapters_ffmpeg declarations
- src/CMakeLists.txt: link swresample, re-add libs after ccx_rust for
  circular dependencies, --no-as-needed on Linux, ENABLE_HARDSUBX
- lib_ccx/CMakeLists.txt: target_compile_definitions for ENABLE_HARDSUBX
- rust/CMakeLists.txt: enable_mp4_ffmpeg Cargo feature
- Add --undefined= linker flags for bridge symbols (ccx_mp4_process_avc_sample,
  ccx_mp4_process_hevc_sample, ccx_mp4_process_cc_packet, ccx_mp4_flush_tx3g)
- Restore ENABLE_HARDSUBX alongside ENABLE_FFMPEG_MP4 in both CMakeLists
- Guard MP4 multi-file loop with #ifdef ENABLE_FFMPEG_MP4 (GPAC path unchanged)
- Revert unrelated cosmetic changes (isdb CLUT formatting, teletext.rs)
- Remove unrelated test files (popon_test.scc/srt)
- Revert .gitignore local dev additions
@DhanushVarma-2 DhanushVarma-2 force-pushed the feat/replace-gpac-with-ffmpeg branch from e3e0a27 to 864f86e Compare April 7, 2026 07:03
@DhanushVarma-2
Copy link
Copy Markdown
Contributor Author

@cfsmp3 , all three issues from the follow-up review are fixed now. Rebased onto latest master which also cleaned up the root cause.

@ccextractor-bot
Copy link
Copy Markdown
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 5fdd9b8...:
Report Name Tests Passed
Broken 10/13
CEA-708 1/14
DVB 4/7
DVD 3/3
DVR-MS 2/2
General 25/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 79/86
Teletext 20/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

NOTE: The following tests have been failing on the master branch as well as the PR:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 56c9f34548..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 7aad20907e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 5d3a29f9f8..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 01509e4d27..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b..., Last passed: Never
  • ccextractor --out=spupng c83f765c66..., Last passed: Never
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 4e56e88ba4..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 c0d2fba8c0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 27d7a43dd6..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 e2e2b501e0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla 85058ad37e..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7f41299cc7..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Copy link
Copy Markdown
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 5fdd9b8...:
Report Name Tests Passed
Broken 10/13
CEA-708 1/14
DVB 3/7
DVD 3/3
DVR-MS 2/2
General 25/27
Hardsubx 1/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 75/86
Teletext 20/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --out=spupng c83f765c66...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

NOTE: The following tests have been failing on the master branch as well as the PR:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 7aad20907e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 01509e4d27..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla 85058ad37e..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7f41299cc7..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@DhanushVarma-2 DhanushVarma-2 requested a review from cfsmp3 April 7, 2026 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants