fix: strict RFC 2397 regex in _parse_base64_data_uri to reject SSE data #1524

Open
MoonSangJin wants to merge 1 commit into langfuse:main from MoonSangJin:fix/5659-sse-bug

Conversation

@MoonSangJin MoonSangJin commented Feb 13, 2026

Summary

  • _parse_base64_data_uri previously used a loose startswith("data:") check, which misidentified SSE data (e.g., "data: {'foo': 'bar'}") as base64 data URIs, causing spurious error logs
  • Replace manual parsing with a strict RFC 2397 regex requiring the full data:[<mediatype>][;params];base64,<data> format
  • Non-matching inputs now return (None, None) cleanly without error logging
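The approach in the bullets above can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR's diff: the pattern, the standalone function name `parse_base64_data_uri`, and the fallback to `text/plain` when the media type is omitted are all assumptions made for the example.

```python
import base64
import binascii
import re
from typing import Optional, Tuple

# Hypothetical pattern (not taken from the PR diff) for
# data:[<mediatype>][;params];base64,<data>
_DATA_URI_RE = re.compile(
    r"data:"
    r"(?P<mime>[^;,\s]*)"            # optional media type, e.g. image/png
    r"(?P<params>(?:;[^;,\s]+)*?)"   # optional ;key=value parameters
    r";base64,"                      # the base64 marker is mandatory
    r"(?P<data>[A-Za-z0-9+/=]*)"     # payload limited to the base64 alphabet
)


def parse_base64_data_uri(value: str) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (content_bytes, content_type), or (None, None) if not a base64 data URI."""
    match = _DATA_URI_RE.fullmatch(value)
    if match is None:
        # SSE lines such as "data: {...}" end up here, with no error logging.
        return None, None
    try:
        content = base64.b64decode(match.group("data"), validate=True)
    except binascii.Error:
        # Right shape but undecodable payload (e.g. bad padding).
        return None, None
    # Assumption for this sketch: fall back to text/plain when no media type is given.
    content_type = match.group("mime") or "text/plain"
    return content, content_type
```

With this sketch, `parse_base64_data_uri("data: {'foo': 'bar'}")` returns `(None, None)` because the space after `data:` can never be followed by `;base64,`, while `parse_base64_data_uri("data:image/png;base64,AAAA")` returns `(b"\x00\x00\x00", "image/png")`.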

Test plan

  • Added 9 test cases in tests/test_issue_5659.py covering SSE data, valid data URIs, MIME params, missing MIME type, empty/invalid strings
  • All existing tests/test_media.py unit tests pass
  • ruff check, ruff format, mypy all pass

Closes langfuse/langfuse#5659

CLAassistant commented Feb 13, 2026

CLA assistant check
All committers have signed the CLA.

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

…identifying SSE data

_parse_base64_data_uri previously used a loose startswith("data:") check,
which caused SSE data (e.g., "data: {...}") to be incorrectly processed
as base64 data URIs, resulting in spurious error logs.

Replace the manual parsing with a strict regex that requires the full
data:[<mediatype>][;params];base64,<data> format. Non-matching inputs
now return (None, None) cleanly without error logging.

Closes langfuse/langfuse#5659

Development

Successfully merging this pull request may close these issues.

bug: python: "Data is not base64 encoded" on server sent events

2 participants