A locally-hosted Retrieval Augmented Generation (RAG) agent designed to help you study for final exams. It ingests course materials (Slides, Textbooks, Papers, Notes), stores them in a graph-vector database (ArcadeDB), and allows you to chat with or search through your materials using advanced LLMs (Gemini, Gemma).
- Multi-Modal Ingestion: Automatically processes PDFs and Text files, categorizing them as Slides, Textbooks, or Papers.
- Hybrid Storage: Uses ArcadeDB to store document structure and vector embeddings (graph-based retrieval ready).
- Flexible LLM Support:
  - Google Gemini (via API Key): Access `gemini-2.0-flash` and `gemini-2.5-pro` without complex Cloud setup.
  - Ollama (Local): Run `gemma:2b` and other open models locally for privacy and offline usage.
- Smart Search: Client-side cosine similarity search to find relevant course concepts instantly.
- Interactive UI: Built with Streamlit for a seamless Chat and Search experience.
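The client-side cosine similarity search mentioned above can be sketched as follows. This is a minimal illustration with NumPy; the chunk structure (`embedding`, `text` fields) and function names are assumptions, not the project's actual code:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

def top_k(query_vec: np.ndarray, chunks: list[dict], k: int = 3) -> list[dict]:
    """Rank stored chunks by similarity to the query embedding, highest first."""
    scored = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, np.asarray(c["embedding"])),
        reverse=True,
    )
    return scored[:k]
```

Because the ranking happens client-side, it works the same regardless of which embedding provider produced the vectors — as long as query and stored embeddings come from the same model.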
- Python 3.9+
- Docker & Docker Compose (for ArcadeDB)
- Ollama (optional, for local models)
- Google API Key (recommended for best chat performance)
Clone the repository and create a `.env` file in the root directory:

```bash
touch .env
```

Add your Google API Key to `.env`:

```
GOOGLE_API_KEY=your_actual_api_key_here
```

Note: Get a key from Google AI Studio. If you prefer a local-only setup, you can skip this step, but you must then use Ollama.
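To sanity-check the key before starting the backend, a tiny `.env` reader like the one below can help. This is illustrative only — the project may well load the file via `python-dotenv` instead:

```python
from pathlib import Path

def read_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file, skipping comments and blanks."""
    env: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```

`read_env().get("GOOGLE_API_KEY")` returning `None` or an empty string is a quick explanation for authentication failures later.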
We use ArcadeDB running in Docker. Start it with:

```bash
docker compose up -d
```

Ports 2480 (HTTP) and 2424 (Binary) will be exposed.
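If the repository does not already ship a compose file, a minimal `docker-compose.yml` for ArcadeDB might look like this (the root password shown is a placeholder — change it, and check the ArcadeDB Docker docs for the current image options):

```yaml
services:
  arcadedb:
    image: arcadedata/arcadedb:latest
    ports:
      - "2480:2480"   # HTTP API / Studio
      - "2424:2424"   # Binary protocol
    environment:
      - JAVA_OPTS=-Darcadedb.server.rootPassword=playwithdata
```

Once up, ArcadeDB Studio is reachable at http://localhost:2480 for inspecting the ingested graph.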
If you want to use local embeddings (Gemma) or chat models:

- Install Ollama.
- Pull the required model:

  ```bash
  ollama pull gemma:2b
  ```

  Note: The system uses `gemma:2b` for both local embeddings and chat.

- Start the server:

  ```bash
  ollama serve
  ```
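To verify that the server is up and the model is pulled, you can query Ollama's standard `/api/tags` endpoint; the helper functions here are just a convenience, not part of this project:

```python
import json
import urllib.request

def model_available(tags: dict, name: str) -> bool:
    """Check whether a model name appears in an Ollama /api/tags response."""
    return any(m.get("name", "").startswith(name) for m in tags.get("models", []))

def ollama_has(name: str = "gemma:2b", host: str = "http://localhost:11434") -> bool:
    """Fetch the tag list from a running Ollama server and look for the model."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return model_available(json.load(resp), name)
```

`ollama_has()` returning `False` (or the request failing outright) is the usual cause of "Model not found" errors later in the Troubleshooting section.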
Create a virtual environment and install the Python packages:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

You need to run both the Backend (API) and the Frontend (UI).
Terminal 1: Backend (FastAPI)

```bash
source .venv/bin/activate
uvicorn app.langchain.main:app --host 0.0.0.0 --port 8000 --reload
```

On startup, the backend checks `database/PDFs` and automatically ingests any new files.
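The startup check can be pictured as a scan of `database/PDFs` against the set of already-ingested filenames. A hedged sketch — the real backend's logic and function names may differ:

```python
from pathlib import Path

def find_new_files(pdf_root: str, ingested: set[str]) -> list[Path]:
    """Return PDF/TXT files under pdf_root that have not been ingested yet."""
    new = []
    for path in sorted(Path(pdf_root).rglob("*")):
        if path.suffix.lower() in {".pdf", ".txt"} and path.name not in ingested:
            new.append(path)
    return new
```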
Terminal 2: Frontend (Streamlit)

```bash
source .venv/bin/activate
streamlit run app/streamlit/app.py
```

Access the UI at http://localhost:8501.
- Automatic: Place your PDF/TXT files in `database/PDFs/slides`, `database/PDFs/textbooks`, or `database/PDFs/papers`. Restart the backend to ingest them.
- Manual: Use the "Upload Document" sidebar in the Streamlit UI to add files dynamically.
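Since the category follows the folder name, deriving it from a file's path could look like this. The folder names come from the layout above, but the helper and its `"Notes"` fallback are hypothetical:

```python
from pathlib import Path

# Folder name -> document category (folder layout taken from database/PDFs/).
CATEGORIES = {"slides": "Slides", "textbooks": "Textbook", "papers": "Paper"}

def category_for(path: str) -> str:
    """Map a file under database/PDFs/<folder>/ to its document category."""
    for part in Path(path).parts:
        if part.lower() in CATEGORIES:
            return CATEGORIES[part.lower()]
    return "Notes"  # assumed fallback for files outside the known folders
```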
- Select Provider: Choose "Vertex" (Google API) or "Ollama" in the sidebar.
- Select Model:
  - Vertex: `gemini-2.0-flash` (Fast), `gemini-2.5-pro` (Reasoning), `gemma-3-12b-it`.
  - Ollama: `gemma:2b` (Local).
- Ask Questions: Type queries like "Explain Grover's Algorithm from the slides". The agent uses RAG to fetch context and cite sources (e.g., `(Page 3)`).
- Switch to the Search tab.
- Enter a concept (e.g., "Tensor Products").
- View exact text matches and their source pages.
- Note: For best results, ensure your selected Provider matches the one used for ingestion (Ollama `gemma:2b` is the current default for ingestion).
- "Model not found" error: Ensure you have pulled the model in Ollama: `ollama pull gemma:2b`.
- Search returns no results: If you switched embedding models (e.g., from Nomic to Gemma), you must re-ingest your data. Stop the app, clear `database/ArcadeDB/arcadedb-*/databases`, and restart the backend.
- 500 Error on Chat: Check the backend log terminal. Ensure your `GOOGLE_API_KEY` is valid if using Vertex/Gemini models.
- Backend: FastAPI
- LLM Orchestration: LangChain
- Database: ArcadeDB (Vertex-Type Schema)
- Frontend: Streamlit