These utilities prepare smaller, model-friendly chunks of a MERN-stack codebase for LLM-based vulnerability scanning.
```text
/output/repo_id
    file_tree.json                      # raw file/folder structure of the target repo
    vuln_files_selection.json           # OpenAI-picked folders + standalone files
    vuln_file_metadata.json             # per-file metadata & summaries (backend + frontend)
    file_subsets.json                   # GPT-4-clustered file subsets
    subset_pipeline_suggestions.json    # suggested pipelines per subset
    pipeline_outputs/                   # LLM outputs for each pipeline stage
```
```text
get_file_struct.py          # quick console tree printer (no JSON)
get_file_struct_json.py     # writes /data/file_tree.json
select_vuln_files.py        # asks GPT-4 which parts look security-relevant
generate_metadata.py        # builds rich metadata + summaries for each path
group_subsets.py            # clusters files into logical subsets
pipeline_suggester.py       # suggests pipelines per subset
pipeline_executor.py        # executes pipelines on each subset
main.py                     # FastAPI server for pipeline automation
```
```text
OPENAI_API_KEY=<your key>
CODEBASE_PATH=<absolute path to the repo you want to analyse>
REPO_ID=<unique identifier for the repo>
METADATA_MAX_FILES=<n>   # optional – limits the number of files processed by generate_metadata.py
```

Create a `.env` file in the project root or export these variables in your shell.
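The scripts read these variables from the environment. As an illustration of what that lookup amounts to, here is a dependency-free `.env` loader sketch; the project itself may rely on a library such as `python-dotenv`, so treat `load_env_file` as a hypothetical helper:

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: reads KEY=VALUE lines and exports any
    key that is not already set in the environment."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # skip blanks, comments, and malformed lines
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

load_env_file()
CODEBASE_PATH = os.getenv("CODEBASE_PATH", ".")
REPO_ID = os.getenv("REPO_ID", "default")
```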
This repo uses Poetry:

```bash
poetry install
```
1. Scan the filesystem and build the JSON tree

   ```bash
   poetry run python get_file_struct_json.py   # -> writes data/file_tree.json
   ```
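The kind of recursive walk this step performs can be sketched as below; `EXCLUDE_DIRS`, `build_tree`, and the exact JSON shape are illustrative, not the script's real internals:

```python
import json
import os

# Illustrative skip list; the real EXCLUDE_DIRS lives in the script.
EXCLUDE_DIRS = {"node_modules", ".git", "dist", "build"}

def build_tree(root: str) -> dict:
    """Return a nested dict describing folders and files under `root`,
    skipping excluded directories."""
    tree = {"name": os.path.basename(root) or root, "type": "folder", "children": []}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            if entry in EXCLUDE_DIRS:
                continue
            tree["children"].append(build_tree(path))
        else:
            tree["children"].append({"name": entry, "type": "file"})
    return tree

if __name__ == "__main__":
    print(json.dumps(build_tree(os.getenv("CODEBASE_PATH", ".")), indent=2))
```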
2. Let GPT-4 decide which folders/files deserve security attention

   ```bash
   poetry run python select_vuln_files.py   # -> writes data/vuln_files_selection.json
   ```
3. Generate per-file metadata & natural-language summaries

   ```bash
   # full run
   poetry run python generate_metadata.py

   # or limit to N files for a cheap dry-run
   METADATA_MAX_FILES=10 poetry run python generate_metadata.py

   # or analyse a different repo root
   poetry run python generate_metadata.py --base /some/other/path
   ```
4. Group files into functional subsets

   ```bash
   poetry run python group_subsets.py   # -> writes data/file_subsets.json
   ```
5. Suggest analysis pipelines for each subset

   ```bash
   poetry run python pipeline_suggester.py   # -> writes data/subset_pipeline_suggestions.json
   ```
6. Execute pipelines on each subset

   ```bash
   poetry run python pipeline_executor.py   # -> writes results under output/REPO_ID_data/pipeline_outputs/
   ```
| Script | What it does | Key outputs |
|---|---|---|
| `get_file_struct.py` | Pretty-prints a depth-limited directory tree to stdout. Handy for a quick visual inspection. | – |
| `get_file_struct_json.py` | Recursively walks the repo (honours `EXCLUDE_DIRS`) and dumps a JSON object representing folders & files. Uses `CODEBASE_PATH` if set. | `data/file_tree.json` |
| `select_vuln_files.py` | Sends the JSON tree to GPT-4 with a prompt asking for potentially vulnerable areas. Stores the returned JSON lists. | `data/vuln_files_selection.json` |
| `generate_metadata.py` | Reads the selection, computes language/LOC/imports per file, calls GPT-4 for a 2-3 sentence summary (cached via SHA-1), and writes a consolidated metadata file. | `data/vuln_file_metadata.json` |
| `group_subsets.py` | Uses GPT-4 to cluster files into logical subsets based on functional connections (data flow, MVC, shared state). | `data/file_subsets.json` |
| `pipeline_suggester.py` | For each subset, asks GPT-4 which vulnerability analysis pipelines should run and stores the suggestions. | `data/subset_pipeline_suggestions.json` |
| `pipeline_executor.py` | Executes the suggested pipelines per subset and persists LLM outputs for each pipeline stage. | `output/REPO_ID_data/pipeline_outputs/` |
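The SHA-1 caching behaviour of `generate_metadata.py` can be sketched as below; `cached_summary` and the cache dict layout are illustrative names, not the script's real API:

```python
import hashlib

def file_sha1(path: str) -> str:
    """Hash a file's full content with SHA-1."""
    with open(path, "rb") as fh:
        return hashlib.sha1(fh.read()).hexdigest()

def cached_summary(path: str, cache: dict, summarise) -> str:
    """Only call the (expensive) summarise() function when the
    file's content hash has changed since the last run."""
    digest = file_sha1(path)
    entry = cache.get(path)
    if entry and entry["sha1"] == digest:
        return entry["summary"]      # unchanged file: reuse, save tokens
    summary = summarise(path)        # new or changed file: re-summarise
    cache[path] = {"sha1": digest, "summary": summary}
    return summary
```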
- If your codebase changes, re-run the scripts in order. `generate_metadata.py` only re-summarises files whose SHA-1 changed, saving tokens.
- Delete files in `/data` to force a full rebuild.
```bash
# Assume .env has OPENAI_API_KEY and CODEBASE_PATH already
poetry run python get_file_struct_json.py && \
poetry run python select_vuln_files.py && \
poetry run python generate_metadata.py && \
poetry run python group_subsets.py && \
poetry run python pipeline_suggester.py && \
poetry run python pipeline_executor.py
```

You now have everything needed to batch code & summaries into LLM-sized chunks for vulnerability analysis.
The FastAPI server (see `main.py`) exposes a single endpoint, `POST /llm/scan`, to automate the entire workflow from your CI/CD pipeline. It runs the standard six-step pipeline described above.
Request body (JSON):

```json
{
  "id": "a_unique_id",        // Arbitrary identifier – persisted as REPO_ID in .env
  "path": "/abs/path/to/repo" // Absolute path to the codebase – persisted as CODEBASE_PATH in .env
}
```

Behavior:
- Updates/creates `.env` with `REPO_ID` and `CODEBASE_PATH`.
- Executes the scripts in this order, aborting on the first failure: `get_file_struct_json.py`, `select_vuln_files.py`, `generate_metadata.py`, `group_subsets.py`, `pipeline_suggester.py`, `pipeline_executor.py`.
- Returns JSON:
  - If `pipeline_executor.py` produced an aggregated summary → `{ "success": true, "results": [...] }`
  - Otherwise → `{ "success": true, "output": "<stdout of last script>" }`
  - If a script fails, the API responds with HTTP 500 (see the error example below).
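The sequential, abort-on-failure execution described above can be sketched as follows; `run_scripts` is an illustrative stand-in for what `main.py` does, not its actual code:

```python
import subprocess
import sys

# Order matters: later scripts consume earlier scripts' outputs.
PIPELINE_SCRIPTS = [
    "get_file_struct_json.py",
    "select_vuln_files.py",
    "generate_metadata.py",
    "group_subsets.py",
    "pipeline_suggester.py",
    "pipeline_executor.py",
]

def run_scripts(scripts):
    """Run each script with the current interpreter; abort on the first
    failure. Returns the combined stdout/stderr of the last script."""
    output = ""
    for script in scripts:
        result = subprocess.run(
            [sys.executable, script], capture_output=True, text=True
        )
        output = result.stdout + result.stderr
        if result.returncode != 0:
            # Mirrors the message shape of the API's HTTP 500 body
            raise RuntimeError(
                f"Script '{script}' failed (exit code {result.returncode})"
            )
    return output
```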
Successful results example:

```json
{
  "success": true,
  "results": [
    {
      "subset_id": "subset-001",
      "pipeline_id": "pipeline_injection",
      "outputs": [
        "subset-001_pipeline_injection_vuln_report.json",
        "subset-001_pipeline_injection_owasp_only.json",
        "subset-001_pipeline_injection_remediation_suggestions.json"
      ]
    }
  ]
}
```

Successful plain-output example:
```json
{
  "success": true,
  "output": "Suggestions written to output/idurar-erp-crm_data/pipeline_outputs/..."
}
```

If a script fails, the API returns HTTP 500 with a body like:
```json
{
  "detail": {
    "message": "Script 'generate_metadata.py' failed (exit code 1)",
    "output": "Traceback …"
  }
}
```

Ensure dependencies are installed:

```bash
poetry install
```

Then launch FastAPI with live-reload:

```bash
poetry run uvicorn xployt_lvl2.main:app --reload
```

By default the docs are available at http://127.0.0.1:8003/docs.
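For calling the endpoint programmatically (e.g. from a CI job) rather than via `curl`, a minimal stdlib-only client might look like this; `build_scan_request` and `trigger_scan` are illustrative helpers, not part of the project:

```python
import json
import urllib.request

def build_scan_request(repo_id, repo_path, host="http://127.0.0.1:8003"):
    """Build the POST request the /llm/scan endpoint expects."""
    body = json.dumps({"id": repo_id, "path": repo_path}).encode()
    return urllib.request.Request(
        f"{host}/llm/scan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def trigger_scan(repo_id, repo_path):
    """Fire the scan and return the parsed JSON response."""
    with urllib.request.urlopen(build_scan_request(repo_id, repo_path)) as resp:
        return json.loads(resp.read())
```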
```bash
curl -X POST http://127.0.0.1:8003/llm/scan \
  -H "Content-Type: application/json" \
  -d '{
        "path": "E:/PROJECTS/ACADAMIC/Xployt-ai/REPOS/idurar-erp-crm-5"
      }'
```

```bash
poetry run uvicorn xployt_lvl2.main:app --reload
```

```bash
curl -X POST http://127.0.0.1:8003/llm/scan \
  -H "Content-Type: application/json" \
  -d '{
        "path": "E:/PROJECTS/ACADAMIC/Xployt-ai/REPOS/vuln_node_express"
      }'
```

```bash
curl -X POST "http://127.0.0.1:8003/execute-module" \
  -H "Content-Type: application/json" \
  -d '{"path": "E:/PROJECTS/ACADAMIC/Xployt-ai/REPOS/vuln_node_express", "module_number": 5}'
```

```bash
curl -X POST http://127.0.0.1:8003/llm/scan \
  -H "Content-Type: application/json" \
  -d '{
        "path": "E:/PROJECTS/ACADAMIC/Xployt-ai/REPOS/nodejs-goof"
      }'
```

```bash
curl -X POST http://127.0.0.1:8003/llm/scan \
  -H "Content-Type: application/json" \
  -d '{
        "path": "E:/PROJECTS/ACADAMIC/Xployt-ai/REPOS/Zero-Health"
      }'
```