PolicyGPT: Intelligent PDF Q&A System using RAG, OpenAI & ChromaDB

pdf-rag-assistant

A production-oriented retrieval-augmented generation (RAG) system for question answering over PDFs such as policies, contracts, and resumes. It uses OpenAI Structured Outputs with Pydantic for schema enforcement, ChromaDB for local vector storage, and a Streamlit UI. It targets enterprise use cases in legal, banking, and HR, returns schema-validated JSON answers, and is easy to deploy locally.

Architecture Overview

Ingestion: PDFs → Text Extraction → Chunking → Embeddings → ChromaDB
Query: User Query → Retrieval → Prompt → OpenAI → JSON Output → UI
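
The chunking and retrieval stages of the pipeline can be sketched roughly as follows. This is a minimal illustration, not the code in rag/ingest.py or rag/retrieve.py: the function names, the chunk sizes, and the bag-of-words `toy_embed` are all assumptions made for the example. In the real project, OpenAI embeddings and a ChromaDB collection would take the place of `toy_embed` and the in-memory ranking.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, toy_embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        "annual leave is 20 days",
        "expense claims need receipts",
        "remote work requires approval",
    ]
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
    print(retrieve("how many days of annual leave", docs, k=1))
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk; the retrieved chunks are what the prompt step would pass to the model as context.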

Project Structure:

pdf-rag-assistant/
│── app.py
│── main.py
│── rag/
│   ├── ingest.py
│   ├── retrieve.py
│   ├── prompt.py
│   ├── pdf_loader.py
│── models/
│   ├── schema.py
│── utils/
│   ├── openai_client.py
│── data/
│   ├── sample.pdf
│── chroma_db/
│── requirements.txt
│── README.md
│── .env

🔥 Key Features

📄 Upload and process PDF documents
🔍 Semantic search using embeddings (vector database)
🧠 RAG pipeline (Retrieval + Generation)
📊 Structured JSON output (Pydantic schema)
⚡ Fast Streamlit UI for interaction
🏗️ Modular and production-ready architecture
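
As an illustration of the structured JSON output feature, the sketch below validates a model response against a Pydantic schema. The `PolicyAnswer` class, its fields, and `parse_answer` are hypothetical and chosen for the example; the actual schema lives in models/schema.py and may look different.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class PolicyAnswer(BaseModel):
    # Hypothetical fields, not taken from the project's schema.py.
    answer: str
    source_pages: List[int]
    confidence: float


def parse_answer(raw_json: str) -> Optional[PolicyAnswer]:
    """Return a validated answer, or None if the JSON is malformed or off-schema."""
    try:
        return PolicyAnswer(**json.loads(raw_json))
    except (json.JSONDecodeError, ValidationError, TypeError):
        # TypeError covers valid JSON that is not an object (for example a list).
        return None


if __name__ == "__main__":
    raw = '{"answer": "20 days", "source_pages": [3], "confidence": 0.9}'
    print(parse_answer(raw))
    print(parse_answer('{"answer": "missing fields"}'))
```

Rejecting off-schema responses at this boundary means the UI only ever renders objects with known fields, which is the main reason to pair the model output with a schema.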

⚙️ Tech Stack

Python 3
OpenAI API (LLM + Embeddings)
ChromaDB (Vector Database)
Pydantic (Structured Output Validation)
Streamlit (Frontend UI)
PyPDF (PDF Parsing)

🚀 Installation & Setup

git clone https://github.com/NxtGenCodeBase/pdf-rag-assistant.git
cd pdf-rag-assistant

python -m venv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS/Linux

pip install -r requirements.txt
