Skip to content

feat(keras): detect StringLookup external vocabulary paths#727

Merged
mldangelo-oai merged 6 commits intomainfrom
fix/keras-stringlookup-cve-2025-12058
Mar 20, 2026
Merged

feat(keras): detect StringLookup external vocabulary paths#727
mldangelo-oai merged 6 commits intomainfrom
fix/keras-stringlookup-cve-2025-12058

Conversation

@mldangelo-oai
Copy link
Contributor

@mldangelo-oai mldangelo-oai commented Mar 20, 2026

Summary

  • add a maintainer PR plan for the major model-format CVE gaps in docs/maintainers/cve-gap-pr-plan-2026-03-20.md
  • detect CVE-2025-12058 in .keras archives when StringLookup uses a scalar external vocabulary path or URL
  • keep inline vocabulary lists clean, and emit a passing version check for fixed Keras versions instead of a warning

False-positive controls

  • only flags scalar vocabulary strings that are clearly external: URL-like, absolute path, home-relative, or traversal-like
  • does not flag inline vocabulary lists/tuples

QA

  • uv run pytest tests/scanners/test_keras_zip_scanner.py -q
  • uv run ruff check modelaudit/ tests/
  • uv run ruff format --check modelaudit/ tests/
  • uv run mypy modelaudit/
  • direct CLI smoke on malicious .keras: exit 1, severities ['warning'], CVEs ['CVE-2025-12058']
  • direct CLI smoke on benign inline-vocabulary .keras: exit 0, no issues
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1

Summary by CodeRabbit

  • New Features

    • Added detection for CVE-2025-12058 in Keras StringLookup vocabularies, flagging external-path/URL cases and gating findings by Keras version (<3.12.0).
  • Tests

    • Added regression tests for external paths, remote URLs, Windows/home-relative paths, inline vocabularies, and version-gated outcomes.
  • Documentation

    • Updated changelog with the CVE entry and added a maintainer CVE-gap planning document plus a CVE-specific explanation entry.

Add CVE-2025-12058 coverage for .keras archives by flagging StringLookup
layers whose vocabulary is a scalar external path or URL, while keeping
inline vocabulary lists clean. Include a maintainer PR plan for the
remaining major model-format CVE gaps.

Co-authored-by: Codex <noreply@openai.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 20, 2026

Warning

Rate limit exceeded

@mldangelo-oai has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 33 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fb63ee39-dee3-4d1c-b085-76c5535421d6

📥 Commits

Reviewing files that changed from the base of the PR and between b849edd and 9ed2f31.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • modelaudit/config/explanations.py
  • modelaudit/scanners/keras_zip_scanner.py
  • tests/scanners/test_keras_zip_scanner.py

Walkthrough

Adds CVE-2025-12058 coverage: changelog entry and maintainer plan, a new explanation helper, scanner logic detecting external vocabulary references in Keras StringLookup with version gating, and regression tests covering vulnerable, fixed, benign, and Windows/path variants.

Changes

Cohort / File(s) Summary
Changelog & Maintainer Plan
CHANGELOG.md, docs/maintainers/cve-gap-pr-plan-2026-03-20.md
Added Unreleased changelog entry for CVE-2025-12058 and a dated maintainer PR plan specifying required malicious/benign fixtures, seven targeted PRs with CVE mappings, QA expectations, and acceptance criteria.
Explanations
modelaudit/config/explanations.py
Added get_cve_2025_12058_explanation(issue_type: str) returning CVE-2025-12058 specific messages and a fallback explanation referencing affected Keras versions.
Keras scanner
modelaudit/scanners/keras_zip_scanner.py
Introduced regex helpers (_URL_SCHEME_PATTERN, _WINDOWS_ABSOLUTE_PATH_PATTERN), _is_vulnerable_to_cve_2025_12058 predicate, and detection inside _scan_model_config for StringLookup vocabulary values (URLs, POSIX/Windows absolute paths, ~/, traversal patterns). Emits CVE-2025-12058 findings with severity/gating: FAILED for vulnerable versions, INFO/metadata-only for fixed versions, and version-unknown warnings when metadata missing.
Tests
tests/scanners/test_keras_zip_scanner.py
Added regression tests covering: absolute external path -> single CVE-2025-12058 FAILED/WARNING with layer/CWE details; remote URL -> CVE-tagged finding; inline list -> no CVE/WARNING/CRITICAL; Keras 3.12.0 -> metadata-only INFO finding; Windows home-relative path -> CVE FAILED case.

Sequence Diagram(s)

sequenceDiagram
  participant Runner as Scanner Runner
  participant KZ as KerasZipScanner
  participant Parser as Model Config Parser
  participant Verifier as VersionChecker
  participant Reporter as Findings Reporter

  Runner->>KZ: provide model archive + metadata
  KZ->>Parser: parse model_config.json
  Parser->>KZ: return layers list
  KZ->>KZ: identify layers where class_name == "StringLookup"
  KZ->>KZ: extract layer.config.vocabulary
  KZ->>Verifier: does vocabulary look like URL/absolute/traversal?
  Verifier->>KZ: yes/no
  KZ->>Verifier: check keras_version vulnerable (<3.12.0)?
  Verifier->>KZ: vulnerable / fixed / unknown
  KZ->>Reporter: emit CVE-2025-12058 finding (FAILED / INFO / WARNING) with explanation
  Reporter->>Runner: aggregated findings
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I found a string that wandered far and wide,
A lookup that tried to read from the outside.
I hopped, I sniffed, I left a tidy trace,
Tests and gates now keep the model in place. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the primary change: adding detection for StringLookup external vocabulary paths in Keras, which is the main objective of the CVE-2025-12058 security detection feature.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/keras-stringlookup-cve-2025-12058

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 651aa97ce9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1243 to +1246
patch_digits = "".join(ch for ch in parts[2] if ch.isdigit())
if patch_digits:
patch = int(patch_digits)
return (major, minor, patch) < (3, 12, 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Parse Keras versions with PEP 440 semantics

_is_vulnerable_to_cve_2025_12058 concatenates all digits from the patch segment, so prereleases like 3.12.0rc1/3.12.0.dev1 are parsed as patch 1 and treated as non-vulnerable. _check_stringlookup_vocabulary_path then emits a passing version check, creating a security false negative for vulnerable prerelease builds.

Useful? React with 👍 / 👎.

Co-authored-by: Codex <noreply@openai.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
CHANGELOG.md (1)

74-82: ⚠️ Potential issue | 🟡 Minor

Run the markdown formatter before merge.

The docs check is already failing on this file, so this entry should be included in the repo’s Prettier pass before merge. As per coding guidelines, "**/*.{md,yaml,yml,json}: When formatting markdown/json/yaml, use npm ci --ignore-scripts && npx prettier --write \"**/*.{md,yaml,yml,json}\" if instructed or if formatting drifts`."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 74 - 82, The CHANGELOG.md entry under "##
[Unreleased]" is not formatted per repo style; run the markdown formatter on
CHANGELOG.md before merging by following the repo's markdown/json/yaml
formatting step (run npm ci --ignore-scripts then npx prettier --write
"**/*.{md,yaml,yml,json}") so the added bullets (tests/security/keras lines)
conform to Prettier rules and the docs check passes; re-run the docs check and
commit the formatted CHANGELOG.md.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/maintainers/cve-gap-pr-plan-2026-03-20.md`:
- Around line 9-13: The QA recipe currently uses a host-specific example
(`/etc/passwd`) when demonstrating a malicious `.keras` with `StringLookup` and
config.vocabulary; replace that with a deterministic temporary-file created via
a tmp_path-style fixture (e.g., create a temp file under tmp_path and point
config.vocabulary to that file) so tests remain portable and CI-safe, update the
benign example note to explicitly mention inline vocabulary lists/tuples remain
allowed, and ensure the doc text around `StringLookup`, `config.json`, and
`fix/keras-stringlookup-cve-2025-12058` reflects the tmp_path usage.

In `@modelaudit/scanners/keras_zip_scanner.py`:
- Around line 986-994: The home-relative check currently uses
candidate.startswith("~/") which misses Windows-style "~\\..." paths; use the
already-computed normalized (normalized = candidate.replace("\\", "/")) for that
check instead—i.e., change the condition to normalized.startswith("~/") in the
boolean expression that also references _URL_SCHEME_PATTERN,
_WINDOWS_ABSOLUTE_PATH_PATTERN and the "../" checks so "~\\vocab.txt" is
detected correctly.
- Around line 951-962: The StringLookup CVE check currently emits passed=True
when keras_version is a string; change it to passed=False and update the message
to indicate "metadata-only assessment is inconclusive" (mirroring the
TorchModuleWrapper behavior) so archive-supplied metadata cannot produce a false
pass; keep details keys (layer_name, layer_class, keras_version) and location
but flip the verdict and message in the result.add_check call where
keras_version is a str. Also harden the external vocabulary path detection by
normalizing the candidate path before the tilde check (e.g., replace backslashes
with forward slashes or otherwise canonicalize) so candidate.startswith("~/")
correctly catches Windows-style "~\\..." variants in the external path check
that currently uses candidate.startswith("~/").

---

Outside diff comments:
In `@CHANGELOG.md`:
- Around line 74-82: The CHANGELOG.md entry under "## [Unreleased]" is not
formatted per repo style; run the markdown formatter on CHANGELOG.md before
merging by following the repo's markdown/json/yaml formatting step (run npm ci
--ignore-scripts then npx prettier --write "**/*.{md,yaml,yml,json}") so the
added bullets (tests/security/keras lines) conform to Prettier rules and the
docs check passes; re-run the docs check and commit the formatted CHANGELOG.md.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5da1a73a-3a86-48e3-a8c0-0036c092c777

📥 Commits

Reviewing files that changed from the base of the PR and between d19d6fd and 651aa97.

📒 Files selected for processing (5)
  • CHANGELOG.md
  • docs/maintainers/cve-gap-pr-plan-2026-03-20.md
  • modelaudit/config/explanations.py
  • modelaudit/scanners/keras_zip_scanner.py
  • tests/scanners/test_keras_zip_scanner.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
docs/maintainers/cve-gap-pr-plan-2026-03-20.md (1)

13-13: ⚠️ Potential issue | 🟡 Minor

Replace host-specific /etc/passwd example with a deterministic temp-path fixture.

Please update this QA example to use a portable fixture path (e.g., tmp_path / "vocab.txt") instead of /etc/passwd, so guidance stays CI-safe and deterministic.

Suggested wording
-   - QA: malicious `.keras` with `StringLookup` + `/etc/passwd`; benign `.keras` with inline vocabulary list
+   - QA: malicious `.keras` with `StringLookup` + a `tmp_path / "vocab.txt"` external fixture; benign `.keras` with inline vocabulary list/tuple

Based on learnings: "Use deterministic fixtures only in tests; never reference host paths like /etc/passwd; create all targets under tmp_path."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/maintainers/cve-gap-pr-plan-2026-03-20.md` at line 13, Update the QA
example that references a host-specific path by replacing the `/etc/passwd`
usage with a deterministic test fixture path (e.g., use tmp_path / "vocab.txt")
in both the "malicious `.keras` with `StringLookup`" and the "benign `.keras`
with inline vocabulary list" examples; ensure the prose and any sample commands
mention creating the vocab file under tmp_path (or similar CI-safe temp fixture)
so the example is portable and deterministic across environments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/maintainers/cve-gap-pr-plan-2026-03-20.md`:
- Line 13: Update the QA example that references a host-specific path by
replacing the `/etc/passwd` usage with a deterministic test fixture path (e.g.,
use tmp_path / "vocab.txt") in both the "malicious `.keras` with `StringLookup`"
and the "benign `.keras` with inline vocabulary list" examples; ensure the prose
and any sample commands mention creating the vocab file under tmp_path (or
similar CI-safe temp fixture) so the example is portable and deterministic
across environments.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 88221257-e71a-4f94-9fbc-5cb71e10e476

📥 Commits

Reviewing files that changed from the base of the PR and between 651aa97 and 770d0b0.

📒 Files selected for processing (1)
  • docs/maintainers/cve-gap-pr-plan-2026-03-20.md

Use normalized paths for Windows-style home-relative vocabulary detection,
replace the host-specific plan example with a tmp_path fixture, and downgrade
fixed-version archive metadata to an inconclusive INFO finding instead of a
false pass.

Co-authored-by: Codex <noreply@openai.com>
@mldangelo-oai
Copy link
Contributor Author

Addressed the actionable review points in b849edd:

  • replaced the host-specific plan example with a fixture path
  • normalized Windows-style vocabulary paths before the tilde check
  • changed fixed-version archive metadata from a false pass to an inconclusive INFO finding, so untrusted metadata does not prove safety

Validation:

  • ................................................................ [100%]
    64 passed in 0.14s
  • Success: no issues found in 192 source files
  • ============================= test session starts ==============================
    platform darwin -- Python 3.11.14, pytest-9.0.2, pluggy-1.6.0
    rootdir: /Users/mdangelo/code/modelaudit
    configfile: pyproject.toml
    testpaths: tests
    plugins: anyio-4.12.1, xdist-3.8.0, asyncio-1.3.0, cov-7.0.0
    asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
    created: 16/16 workers
    16 workers [2555 items]

........................................................................ [ 2%]
........................................................................ [ 5%]
........................................................................ [ 8%]
........................................................................ [ 11%]
........................................................................ [ 14%]
........................................................................ [ 16%]
........................................................................ [ 19%]
........................................................................ [ 22%]
........................................................................ [ 25%]
........................................................................ [ 28%]
........................................................................ [ 30%]
.....................s.s.........................................s...... [ 33%]
........................................................................ [ 36%]
........................................................................ [ 39%]
..................................s.....sss.s..........s..........s....s [ 42%]
........................................................................ [ 45%]
.............................................................s.......... [ 47%]
........................................................................ [ 50%]
........................................................................ [ 53%]
........................................................................ [ 56%]
........................................................................ [ 59%]
........................................................................ [ 61%]
........................................................................ [ 64%]
...........................s............................................ [ 67%]
........................................................................ [ 70%]
........................................................................ [ 73%]
........................................................................ [ 76%]
.ss.........ss.s.........................ssss........................... [ 78%]
........................................................................ [ 81%]
........................................................................ [ 84%]
...........................................s.s.......................... [ 87%]
........................................................................ [ 90%]
........................................................................ [ 92%]
........................................................................ [ 95%]
..........................................................s..........s.. [ 98%]
s.................................. [100%]
=========================== short test summary info ============================
SKIPPED [1] tests/scanners/test_pytorch_binary_scanner.py:79: ML context filtering now ignores executable signatures in weight-like data to reduce false positives
SKIPPED [1] tests/scanners/test_pytorch_binary_scanner.py:134: ML context filtering now ignores executable signatures in weight-like data to reduce false positives
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:615: py7zr not available for integration tests
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:107: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:126: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:145: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:182: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:248: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:324: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:444: py7zr not available
SKIPPED [1] tests/scanners/test_sevenzip_scanner.py:503: py7zr not available
SKIPPED [1] tests/scanners/test_weight_distribution_scanner.py:444: TensorFlow not installed
SKIPPED [1] tests/test_performance_benchmarks.py:195: psutil not available for memory testing
SKIPPED [8] tests/conftest.py:144: tensorflow is not installed
SKIPPED [1] tests/test_tensorflow_lambda_detection.py:218: TensorFlow not installed
SKIPPED [1] tests/utils/helpers/test_ml_context_false_positives.py:149: Skipping due to test environment differences - core functionality verified with real models
SKIPPED [1] tests/utils/helpers/test_ml_context_false_positives.py:199: Skipping due to test environment differences - core functionality verified with real models
SKIPPED [1] tests/test_performance_benchmarks.py:152: Not enough asset files for scaling test
SKIPPED [1] tests/test_performance_benchmarks.py:255: Skipping concurrency overhead check in local environment (overhead=3.69x)
SKIPPED [1] tests/test_performance_benchmarks.py:312: Skipping timeout performance test due to enhanced security scanning. The improved security detection now performs more thorough analysis, which introduces legitimate performance variance that makes timeout overhead measurements unreliable. The core security functionality has been verified to work correctly.
================= 2528 passed, 27 skipped in 69.83s (0:01:09) ==================

  • direct smoke: fixed-version external vocabulary now emits 0 warnings/criticals and 1 INFO

@mldangelo-oai
Copy link
Contributor Author

Corrected summary for the latest push in b849edd:

  • replaced the host-specific plan example with a tmp_path fixture path
  • normalized Windows-style ~\\... vocabulary paths before the tilde check
  • changed fixed-version archive metadata from a false pass to an inconclusive INFO finding, so untrusted metadata does not prove safety

Validation:

  • uv run pytest tests/scanners/test_keras_zip_scanner.py -q
  • uv run mypy modelaudit/
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1
  • direct smoke: fixed-version external vocabulary now emits 0 warnings/criticals and 1 info

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/keras_zip_scanner.py`:
- Around line 1236-1252: The _is_vulnerable_to_cve_2025_12058 function currently
strips non-digits from the patch and therefore treats prerelease tags like
"3.12.0rc1" as a released 3.12.0 (non-vulnerable); update its parsing to detect
PEP 440 prerelease suffixes (a, b, rc, dev) in parts[2] similar to
_is_vulnerable_keras_3_11_x: extract numeric patch and also capture any alpha
suffix, and if the numeric (major, minor, patch) equals (3, 12, 0) but a
prerelease suffix exists (or if a prerelease makes the version logically
earlier), return True (vulnerable); otherwise retain the existing numeric
comparison logic and keep the try/except ValueError behavior.

In `@tests/scanners/test_keras_zip_scanner.py`:
- Around line 289-320: Add a prerelease-version test for CVE-2025-12058 that
mirrors the existing fixed-version test: create a KerasZipScanner instance and
use create_configured_keras_zip with prerelease keras_version values (e.g.,
"3.12.0a0", "3.12.0rc1", "3.12.0.dev0") containing a StringLookup layer with an
external vocabulary path, call scanner.scan(...) and assert the CVE-2025-12058
check is present and has severity IssueSeverity.WARNING (and that
vulnerable-version tests still produce a failed check); place this as a new test
function named like test_stringlookup_prerelease_versions_treated_as_vulnerable
so it follows the existing pattern and references the same check/filter logic
used in test_stringlookup_external_vocabulary_path_is_passing_on_fixed_keras.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 49f8a326-e1a9-4905-bdce-20c1916b71f3

📥 Commits

Reviewing files that changed from the base of the PR and between 770d0b0 and b849edd.

📒 Files selected for processing (3)
  • docs/maintainers/cve-gap-pr-plan-2026-03-20.md
  • modelaudit/scanners/keras_zip_scanner.py
  • tests/scanners/test_keras_zip_scanner.py

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b849eddcca

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +991 to +998
return (
bool(_URL_SCHEME_PATTERN.match(candidate))
or candidate.startswith("/")
or normalized.startswith("~/")
or bool(_WINDOWS_ABSOLUTE_PATH_PATTERN.match(candidate))
or normalized.startswith("../")
or "/../" in normalized
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Detect relative StringLookup vocabulary paths

_is_external_stringlookup_vocabulary only matches URL/absolute/home/traversal forms and ignores plain relative filenames like "vocab.txt". StringLookup still treats those as filesystem paths, so a crafted vulnerable model can bypass CVE-2025-12058 detection by using a non-traversal relative path.

Useful? React with 👍 / 👎.

mldangelo-oai and others added 2 commits March 20, 2026 11:57
Co-authored-by: Codex <noreply@openai.com>
… CVE

Handle prerelease suffixes like rc, a, b, and dev in the CVE-2025-12058
version gate so 3.12.0 prereleases are not misclassified as the final fixed
release, and add regression coverage for prerelease metadata.

Co-authored-by: Codex <noreply@openai.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3fff611eba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1240 to +1241
if not version_match:
return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat unparseable Keras versions as unknown risk

_is_vulnerable_to_cve_2025_12058() returns False when version parsing fails, and _check_stringlookup_vocabulary_path() interprets that as “outside vulnerable range”. Non-canonical metadata like keras-3.11.3/v3.11.3 would therefore be downgraded to an INFO metadata check instead of a CVE warning, creating a false negative path for vulnerable archives.

Useful? React with 👍 / 👎.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9ed2f31aba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1007 to +1008
or bool(_WINDOWS_ABSOLUTE_PATH_PATTERN.match(candidate))
or normalized.startswith("../")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Flag rooted Windows StringLookup paths as external

_is_external_stringlookup_vocabulary only recognizes Windows drive-letter/UNC prefixes. Rooted paths like \vocab.txt are not matched, but on Windows they are absolute paths, so a malicious .keras can still trigger local file reads while bypassing CVE-2025-12058 detection.

Useful? React with 👍 / 👎.

@mldangelo-oai mldangelo-oai merged commit 20e9852 into main Mar 20, 2026
26 checks passed
@mldangelo-oai mldangelo-oai deleted the fix/keras-stringlookup-cve-2025-12058 branch March 20, 2026 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant