An AI-powered application that transforms static PDF documents into interactive knowledge bases. Rebuilt as a decoupled client-server microservice, it safely handles multiple concurrent users and uses a dual-architecture approach: RAG (Retrieval-Augmented Generation) for precise Q&A and Context Stuffing for full-document summarization.
- Decoupled Architecture: A stateless FastAPI backend handles all heavy AI processing, keeping the Streamlit frontend fast and responsive.
- Multi-User Isolation: Generates unique UUIDs for every user session, mapping them to isolated ChromaDB instances to prevent data leaks between concurrent users.
- Cloud-Optimized Memory: Uses Google's Embedding APIs to offload heavy tensor computations, allowing the backend to run flawlessly on memory-constrained cloud tiers (like Render's 512MB limit).
- Automated Garbage Collection: Employs FastAPI Background Tasks to silently sweep and delete orphaned database folders when users abandon their sessions.
- Dual Intelligence Engine: Uses RAG (k=5 chunks) for accurate question answering and Context Stuffing for broad, bullet-point document summaries.
| Layer | Technology |
|---|---|
| Language | Python 3.10+ |
| Backend API | FastAPI, Uvicorn |
| Frontend UI | Streamlit |
| Orchestration | LangChain |
| Vector Database | ChromaDB (Local) |
| LLM | Google Gemma-3-27b-it (via Google Gemini API) |
| Embeddings | Google Gemini Embedding API (models/gemini-embedding-001) |
```bash
git clone https://github.com/Prerakshah98/Document-Analysis-Tool.git
cd Document-Analysis-Tool
pip install -r requirements.txt
```
Create a `.env` file in the root folder and add your Google API key:
```
GOOGLE_API_KEY=your_actual_api_key_here
```
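At startup the key is read from `.env` into the process environment. A minimal sketch of that loading step, assuming simple `KEY=value` lines (the project may well use `python-dotenv` for this instead):

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Parse simple KEY=value lines into os.environ; blanks and # comments are skipped."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())
```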
Start the FastAPI server:
```bash
uvicorn api:app --reload --port 8000
```
Note: Ensure `API_URL` in `app.py` is set to `http://localhost:8000` for local testing.
```bash
streamlit run app.py
```
Because HTTP is stateless, the backend doesn't remember users between clicks. This is solved by generating a UUID4 on the frontend and passing it as a query parameter in every REST API call. The backend uses this UUID as a key in a runtime dictionary to route queries to the correct local database folder.
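The routing just described can be sketched as follows; the dictionary name and the `chroma_dbs/<uuid>` path scheme are illustrative, not the project's actual identifiers:

```python
import uuid

# Frontend side: generate once per browser session and reuse for every call,
# e.g. requests.post(f"{API_URL}/ask?session_id={session_id}", ...).
session_id = str(uuid.uuid4())

# Backend side: runtime dictionary mapping each session UUID
# to its isolated ChromaDB folder.
SESSION_DBS: dict[str, str] = {}

def db_path_for(session_id: str) -> str:
    """Return (creating on first use) the per-session ChromaDB directory."""
    if session_id not in SESSION_DBS:
        SESSION_DBS[session_id] = f"chroma_dbs/{session_id}"
    return SESSION_DBS[session_id]
```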
LangChain's RecursiveCharacterTextSplitter is used with a chunk_size of 1000 and a chunk_overlap of 200. This creates a "sliding window" effect, ensuring sentences cut off at the end of one chunk are repeated at the start of the next, preventing context loss at chunk boundaries.
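A simplified, character-level sketch of the sliding window this produces (the real `RecursiveCharacterTextSplitter` additionally prefers splitting at paragraph and sentence boundaries rather than at fixed offsets):

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where the last `chunk_overlap`
    characters of one chunk reappear at the start of the next."""
    step = chunk_size - chunk_overlap  # advance 800 chars per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```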
Instead of using local HuggingFace embedding models (which require ~800MB of RAM for PyTorch), raw text chunks are routed to Google's Embedding API. This drastically reduces the server's memory footprint to under 150MB, making it perfectly suited for free-tier cloud deployments.
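A sketch of this offloading pattern: chunks are batched and handed to a remote embedding endpoint rather than a local model, so no PyTorch weights are ever held in server memory. The `embed_remote` callable is a stand-in for the Google Embedding API client (in LangChain terms, a `GoogleGenerativeAIEmbeddings` instance); the batch size is an assumption.

```python
from typing import Callable

def embed_chunks(
    chunks: list[str],
    embed_remote: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
) -> list[list[float]]:
    """Embed chunks in batches via a remote API; only the text and the
    returned vectors occupy server memory — no local model weights."""
    vectors: list[list[float]] = []
    for i in range(0, len(chunks), batch_size):
        vectors.extend(embed_remote(chunks[i:i + batch_size]))
    return vectors
```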
```
Document-Analysis-Tool/
├── api.py             # FastAPI backend — handles uploads, querying, and cleanup
├── app.py             # Streamlit frontend — user interface
├── rag_logic.py       # Core RAG logic — chunking, embedding, and retrieval
├── requirements.txt   # Python dependencies
└── .env               # Environment variables (not committed)
```
This project is designed to be deployed as two separate services:
- Backend (FastAPI): Deploy on Render, Railway, or any platform supporting Python.
- Frontend (Streamlit): Deploy on Streamlit Community Cloud.
After deploying the backend, update API_URL in app.py to point to your live backend URL before deploying the frontend.
This project is open-source and available under the MIT License.