RAG-Fundamentals 🚀

Comprehensive Collection of Retrieval-Augmented Generation Techniques

A complete hands-on guide to RAG implementations, from basic concepts to advanced techniques. This repository contains practical notebooks demonstrating various RAG approaches, retrieval strategies, and optimization methods.

📊 RAG Architecture Flowchart

Here's the complete RAG system architecture:

graph TD
    A[📄 DATA] --> B[🔪 CHUNK 1]
    A --> C[🔪 CHUNK 2] 
    A --> D[🔪 CHUNK 3]
    
    B --> E[🧠 EMBEDDING 1]
    C --> F[🧠 EMBEDDING 2]
    D --> G[🧠 EMBEDDING 3]
    
    E --> H[💾 DATABASE<br/>Vector Store]
    F --> H
    G --> H
    
    I[👤 USER] --> J[🔎 QUERY]
    J --> K[🧠 EMBEDDING<br/>Query]
    
    I -.->|Generation| L[🔍 SEMANTIC SEARCH]
    K --> L
    H --> L
    
    L --> M[🎯 RETRIEVAL]
    M --> N[📊 RANKED RESULTS]
    N --> O[🔄 RERANKED RESULTS]
    
    O --> P[🤖 LLM<br/>Query+Prompt+Context]
    P --> Q[💬 RESPONSE]
    
    style A fill:#000000,color:#ffffff
    style B fill:#000000,color:#ffffff
    style C fill:#000000,color:#ffffff
    style D fill:#000000,color:#ffffff
    style E fill:#000000,color:#ffffff
    style F fill:#000000,color:#ffffff
    style G fill:#000000,color:#ffffff
    style H fill:#000000,color:#ffffff
    style I fill:#000000,color:#ffffff
    style J fill:#000000,color:#ffffff
    style K fill:#000000,color:#ffffff
    style L fill:#000000,color:#ffffff
    style M fill:#000000,color:#ffffff
    style N fill:#000000,color:#ffffff
    style O fill:#000000,color:#ffffff
    style P fill:#000000,color:#ffffff
    style Q fill:#000000,color:#ffffff

Core RAG Components:

📄 Data Processing: Documents are split into manageable chunks
🧠 Embedding: Text chunks converted to vector representations
💾 Vector Database: Stores embeddings for efficient retrieval
🔍 Semantic Search: Finds relevant chunks using similarity
🎯 Retrieval: Returns top-k most relevant documents
🔄 Reranking: Reorders retrieved chunks by relevance to the query before they reach the LLM
🤖 LLM Generation: Produces final answer with retrieved context
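
The components above can be sketched end to end in a few lines. This is a minimal illustration, not the code from the notebooks: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, the in-memory list stands in for a vector database, and all function names are hypothetical.

```python
import math
from collections import Counter

def chunk(text, size=8):
    """Split text into chunks of `size` words (a simple fixed-size splitter)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: bag-of-words counts, standing in for a real model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, store, k=2):
    """Return the k chunk texts most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(query, context):
    """Combine the query and retrieved context into a prompt for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

document = ("Transformers rely on self attention to weigh tokens. "
            "Vector stores index embeddings for fast similarity search. "
            "Rerankers reorder retrieved chunks before generation.")
store = [(c, embed(c)) for c in chunk(document)]  # the "vector database"
question = "how do vector stores index embeddings"
top = retrieve(question, store, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline the prompt would be sent to an LLM; here it is only printed so the retrieval step can be inspected.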

🎯 What is RAG?

Retrieval-Augmented Generation combines the power of:

🔍 Information Retrieval: Finding relevant external knowledge
🤖 Language Generation: Creating contextual responses
📚 External Knowledge: Up-to-date, domain-specific information

📁 Repository Structure

🔰 Basic RAG

Foundation techniques for RAG implementation

Notebook	Purpose	Key Techniques
`basic_rag.ipynb`	Core RAG pipeline	Vector embeddings, semantic search, prompt engineering
`document_rag.ipynb`	PDF document processing	Local knowledge base, metadata handling
`url_rag.ipynb`	Web content RAG	Real-time retrieval, dynamic knowledge updates

🔍 Retriever Techniques

Advanced retrieval strategies beyond basic semantic search

Technique	Notebook	Innovation
Contextual Compression	`Contextual_Compression_Retriever.ipynb`	Compresses retrieved docs to extract only relevant portions
HyDE	`Hypothetical_Document_Embedding_(HyDE).ipynb`	Generates a hypothetical answer and embeds it in place of the raw query to improve retrieval accuracy
Multi-Hop Retrieval	`MultiHop_Query_Step_by_Step_Retrieval.ipynb`	Step-by-step retrieval for complex queries
Parent Document	`Parent_Document_Retriever.ipynb`	Retrieves small chunks but returns larger parent documents
Self-Query	`Self_Query_Retrieval.ipynb`	Natural language query parsing with metadata filtering
Sentence Window	`Sentence_Window_Retrieval.ipynb`	Expanded context windows around relevant sentences
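
As one illustration of these retrievers, sentence window retrieval can be sketched as follows. This is a simplified stand-in, assuming word overlap as the relevance score instead of embeddings; `split_sentences`, `overlap`, and `sentence_window` are hypothetical names and not taken from the notebook.

```python
import re

def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def overlap(query, sentence):
    """Number of distinct words shared by query and sentence."""
    q = set(re.findall(r"\w+", query.lower()))
    s = set(re.findall(r"\w+", sentence.lower()))
    return len(q & s)

def sentence_window(query, text, window=1):
    """Find the best-matching sentence and return it with `window` neighbours on each side."""
    sentences = split_sentences(text)
    best = max(range(len(sentences)), key=lambda i: overlap(query, sentences[i]))
    lo, hi = max(0, best - window), min(len(sentences), best + window + 1)
    return " ".join(sentences[lo:hi])

text = ("RAG retrieves documents. Chunking splits text into pieces. "
        "Embeddings map text to vectors. Rerankers reorder results. "
        "The LLM writes the answer.")
context = sentence_window("what do embeddings map", text, window=1)
print(context)
```

Matching happens at sentence granularity, while the LLM receives the surrounding sentences as well, which is the trade-off this technique is built around.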

🏆 ReRanking Techniques

Optimize retrieved document ordering for better LLM performance

Method	Notebook	Advantage
BM25 Rerank	`bm25_rerank_rag.ipynb`	Statistical keyword-based relevance scoring
Cohere Rerank	`cohere_rerank.ipynb`	State-of-the-art neural reranking API
Cross-Encoder	`cross_encoder_rerank.ipynb`	Joint query-document encoding for precise relevance
Flash ReRank	`Flash_ReRanking.ipynb`	Ultra-fast reranking for real-time applications
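
To make the keyword-based option concrete, here is a small BM25 reranker written from scratch. It is a sketch under assumptions: whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, and a smoothed IDF term. The notebook may rely on a library implementation instead.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def rerank(query, docs):
    """Return docs sorted by descending BM25 score."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order]

candidates = ["cats sleep all day",
              "vector search uses embeddings",
              "keyword search uses term frequency"]
reranked = rerank("keyword search", candidates)
print(reranked)
```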

🔗 Hybrid Search RAG

Combine multiple search methodologies for superior performance

Technique	Notebook	Combines
Cosine Similarity	`cosine.ipynb`	Vector similarity mathematics and optimization
Hybrid Search	`hybridsearch-rag.ipynb`	Semantic + keyword search with score fusion algorithms
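
One simple form of score fusion is a weighted sum of normalized scores. The sketch below assumes min-max normalization and a single mixing weight `alpha`; both are illustrative choices, and the input scores are made up for the example.

```python
def min_max(scores):
    """Rescale scores to the range [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(semantic, keyword, alpha=0.5):
    """Blend semantic and keyword scores; alpha weights the semantic side."""
    sem, kw = min_max(semantic), min_max(keyword)
    return [alpha * s + (1 - alpha) * k for s, k in zip(sem, kw)]

# Hypothetical scores for three documents from two retrievers.
semantic = [0.82, 0.75, 0.40]   # e.g. cosine similarities
keyword = [1.0, 4.0, 0.0]       # e.g. BM25 scores
fused = hybrid_scores(semantic, keyword, alpha=0.5)
best = fused.index(max(fused))
print(fused, best)
```

Normalizing first matters because cosine similarities and BM25 scores live on different scales; without it the keyword scores would dominate the sum.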

⚡ RAG Fusion

Fuse multiple retrieval methods for comprehensive results

Method	Notebook	Fusion Strategy
Reciprocal Rank Fusion	`ReciProcal_Rank_Fusion.ipynb`	Combines rankings from multiple retrievers using RRF algorithm
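
Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over every ranking in which it appears. A minimal sketch, assuming the commonly used constant `k=60` and hypothetical document ids:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

semantic_ranking = ["a", "b", "c"]
keyword_ranking = ["b", "d", "a"]
fused_ranking = reciprocal_rank_fusion([semantic_ranking, keyword_ranking])
print(fused_ranking)
```

Because only ranks are used, the retrievers do not need comparable score scales.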

🎯 Lost In Middle Problem

Address the positional bias in which information placed in the middle of a long context is overlooked by the LLM

Solution	Notebook	Approach
Merger & Reranking	`MergerRetriever_And_Reranking.ipynb`	Strategic document ordering and context reorganization
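
One way to reorganize context is to place the most relevant documents at the start and end of the prompt and the least relevant ones in the middle. The function below is a sketch of that idea with a hypothetical name; it assumes the input is already sorted from most to least relevant.

```python
def reorder_for_long_context(docs_by_relevance):
    """Alternate documents between the front and the back of the context.

    Even positions (0, 2, 4, ...) fill the front in order; odd positions
    fill the back in reverse, so the weakest documents end up in the middle.
    """
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

ranked = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
reordered = reorder_for_long_context(ranked)
print(reordered)
```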

🖼️ MultiModal RAG

Process and understand multiple data types including text, images, charts, and diagrams

Technique	Notebook	Innovation
MultiModal Processing	`MultiModal_RAG(IMG+TexT).ipynb`	CLIP-based unified embeddings for text and images, cross-modal retrieval

🤖 Agentic RAG

Intelligent agent-based RAG systems with autonomous decision making

Method	Notebook	Capability
Agentic RAG	`Agentic_RAG.ipynb`	Autonomous agent-based retrieval and generation
Memory RAG	`RAG_Agent_WIth_Memory.ipynb`	Conversation history tracking and contextual responses
LangGraph RAG	`Rag_with_langgraph.ipynb`	Graph-based workflow orchestration for complex RAG
Routed RAG	`Routed_RAG_With_LLM_Router.ipynb`	LLM-powered routing for dynamic retrieval strategies
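
The routing idea can be sketched without an LLM by swapping in a rule-based classifier. Everything here is an assumption for illustration: `keyword_classifier` stands in for the LLM call that would return a route label, and the three retrievers are stubs.

```python
def keyword_classifier(query):
    """Stand-in for an LLM router: returns a route label for the query."""
    q = query.lower()
    if any(w in q for w in ("latest", "today", "news")):
        return "web"
    if any(w in q for w in ("table", "how many", "average")):
        return "sql"
    return "vector"

def route(query, retrievers, classify=keyword_classifier):
    """Pick a retriever by label, falling back to the vector retriever."""
    label = classify(query)
    retriever = retrievers.get(label, retrievers["vector"])
    return label, retriever(query)

# Stub retrievers; real ones would query a search API, a database, or a vector store.
retrievers = {
    "web": lambda q: ["web result"],
    "sql": lambda q: ["sql rows"],
    "vector": lambda q: ["vector chunk"],
}
print(route("What is the latest news on RAG?", retrievers))
print(route("How many papers are in the table?", retrievers))
print(route("Explain self attention", retrievers))
```

Passing the classifier as a parameter keeps the routing logic testable and lets an LLM-backed function replace the rules later.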

🔬 Advanced RAG

Cutting-edge RAG techniques for specialized applications

Technique	Notebook	Advanced Feature
Agentic RAG	`Agentic_RAG.ipynb`	Agent-based autonomous reasoning with query rewriting
Corrective RAG (CRAG)	`Corrective_RAG_(CRAGgi).ipynb`	Self-correcting retrieval with web search fallback
MultiModal Agent RAG	`MultiModal_Agent_RAG.ipynb`	ArXiv papers and image analysis with academic citations
RAG Swarm Agent	`RAG_Swarm_Agent.ipynb`	Multi-domain expert agents with smart routing

👨‍💼 Supervisor Agent RAG

Intelligent routing across multiple knowledge domains with domain-specific vector stores

Technique	Notebook	Capability
Supervisor Agent	`SuperVisor_RAG_Agent.ipynb`	Multi-domain routing with automatic classification and caching
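
A supervisor that classifies a query, dispatches it to a domain store, and caches the result might look like the sketch below. The class name, the normalization of cache keys, and the two example domains are assumptions made for illustration.

```python
class Supervisor:
    """Routes queries to domain-specific stores and caches results."""

    def __init__(self, stores, classify):
        self.stores = stores        # domain name -> retrieval callable
        self.classify = classify    # query -> domain name
        self.cache = {}
        self.calls = 0              # number of uncached lookups performed

    def answer(self, query):
        key = query.strip().lower()  # normalize so trivial variants share a cache entry
        if key in self.cache:
            return self.cache[key]
        domain = self.classify(query)
        self.calls += 1
        result = (domain, self.stores[domain](query))
        self.cache[key] = result
        return result

# Stub classifier and stores; real ones would use an LLM and vector stores.
classify = lambda q: "legal" if "contract" in q.lower() else "medical"
stores = {"legal": lambda q: ["clause 4.2"], "medical": lambda q: ["dosage table"]}

supervisor = Supervisor(stores, classify)
first = supervisor.answer("Is this contract valid?")
second = supervisor.answer("  is this contract valid?  ")
print(first, supervisor.calls)
```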

💾 Structured Query RAG

Text-to-SQL RAG systems for natural language querying of structured databases

Technique	Notebook	Capability
Structured Retrieval	`Structured_Retrieval_RAG.ipynb`	Schema-aware natural language to SQL with result interpretation
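
A schema-aware Text-to-SQL flow can be sketched with the standard-library `sqlite3` module. The table, the sample rows, and the helper names are hypothetical, and `generated_sql` is hard-coded where an LLM's output would normally go.

```python
import sqlite3

def schema_text(conn):
    """Collect the CREATE TABLE statements so the LLM can see the schema."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(r[0] for r in rows)

def build_sql_prompt(question, schema):
    """Prompt asking the LLM for a single SELECT statement."""
    return f"Schema:\n{schema}\n\nWrite one SQLite SELECT statement answering: {question}"

def run_readonly(conn, sql):
    """Run generated SQL, rejecting anything that does not start with SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT, year INTEGER)")
conn.executemany("INSERT INTO papers VALUES (?, ?)",
                 [("Attention Is All You Need", 2017), ("BERT", 2018)])

prompt = build_sql_prompt("Which papers were published after 2017?", schema_text(conn))
generated_sql = "SELECT title FROM papers WHERE year > 2017"  # stands in for the LLM's output
rows = run_readonly(conn, generated_sql)
print(rows)
```

The prefix check is only a basic guard; opening the database in read-only mode would be a stronger safeguard when executing model-generated SQL.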

⏱️ Dynamic RAG

Self-updating knowledge bases with version control and real-time updates

Technique	Notebook	Capability
Dynamic Knowledge Update	`Dynamic_Knowledge_Update_RAG.ipynb`	Version-tracked document updates with timestamp awareness
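
Version tracking with timestamps can be illustrated with a small in-memory store. The `Version` and `VersionedStore` names are hypothetical; a real system would also re-embed the updated text and replace the old vectors.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Version:
    text: str
    number: int
    updated_at: datetime

@dataclass
class VersionedStore:
    history: dict = field(default_factory=dict)  # doc_id -> list of Version

    def upsert(self, doc_id, text, now=None):
        """Append a new version unless the text is unchanged."""
        versions = self.history.setdefault(doc_id, [])
        if versions and versions[-1].text == text:
            return versions[-1]
        version = Version(text, len(versions) + 1, now or datetime.now(timezone.utc))
        versions.append(version)
        return version

    def latest(self, doc_id):
        """Return the most recent version of a document."""
        return self.history[doc_id][-1]

kb = VersionedStore()
t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 6, 1, tzinfo=timezone.utc)
v1 = kb.upsert("policy", "Refunds within 14 days.", now=t1)
v2 = kb.upsert("policy", "Refunds within 30 days.", now=t2)
v3 = kb.upsert("policy", "Refunds within 30 days.", now=t2)  # unchanged, no new version
print(kb.latest("policy"))
```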

🕸️ Graph RAG

Knowledge graph-enhanced retrieval leveraging entity relationships and community structures

Technique	Notebook	Capability
Basic Graph RAG	`Graph_RAG.ipynb`	Entity extraction and document-entity graph construction
Hybrid Search Graph RAG	`Hybrid_Search_RAG.ipynb`	Community detection and advanced graph traversal techniques
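
A document-entity graph can be sketched with plain dictionaries. This is a toy version: capitalized words stand in for the entities a spaCy pipeline would extract, a dict of sets stands in for a NetworkX graph, and the sample documents are invented.

```python
import re
from collections import defaultdict

def extract_entities(text):
    """Toy entity extraction: every capitalized word counts as an entity."""
    return set(re.findall(r"\b[A-Z][a-zA-Z]+\b", text))

def build_graph(docs):
    """Map each entity to the set of document ids that mention it."""
    entity_to_docs = defaultdict(set)
    for doc_id, text in docs.items():
        for entity in extract_entities(text):
            entity_to_docs[entity].add(doc_id)
    return entity_to_docs

def related_docs(query, graph):
    """Return ids of documents sharing at least one entity with the query."""
    hits = set()
    for entity in extract_entities(query):
        hits |= graph.get(entity, set())
    return sorted(hits)

docs = {
    "d1": "Marie Curie worked in Paris.",
    "d2": "Paris hosts the Louvre.",
    "d3": "transformers use attention.",
}
graph = build_graph(docs)
print(related_docs("Where did Curie work?", graph))
print(related_docs("Tell me about Paris", graph))
```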

📊 RAG Evaluation

Comprehensive evaluation frameworks for RAG system performance

Framework	Notebook	Evaluation Focus
RAG Evaluation	`RAG_Evaluation.ipynb`	Custom evaluation metrics and benchmarking
RAGAs Evaluation	`RAGAs_Evaluation.ipynb`	Automated evaluation using RAGAs framework
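
Two common retrieval metrics, hit rate at k and mean reciprocal rank, show what a custom evaluation can look like. The sketch assumes one gold document id per query and uses made-up results; it does not reproduce the RAGAs metrics.

```python
def hit_rate_at_k(results, relevant, k=3):
    """Fraction of queries whose gold document appears in the top k results."""
    hits = sum(1 for ranked, gold in zip(results, relevant) if gold in ranked[:k])
    return hits / len(results)

def mean_reciprocal_rank(results, relevant):
    """Average of 1/rank of the gold document (0 when it is missing)."""
    total = 0.0
    for ranked, gold in zip(results, relevant):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(results)

# Hypothetical ranked results for three queries, with one gold id each.
results = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = ["a", "e", "z"]
print(hit_rate_at_k(results, relevant, k=2))
print(mean_reciprocal_rank(results, relevant))
```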

🔧 Key Technologies Used

🦜 LangChain: RAG framework and components
🤗 HuggingFace: Embeddings and transformers
📊 ChromaDB: Vector database storage
🔍 FAISS: Efficient similarity search
🌐 OpenAI: Embeddings and language models
⚡ Cohere: Hosted neural reranking API
🖼️ CLIP: Multimodal embeddings for text and images
🔄 LangGraph: Graph-based workflow orchestration
⚖️ RAGAs: Automated RAG evaluation framework
🧠 Groq: High-performance LLM inference
📝 Sentence Transformers: Lightweight embedding models
🗃️ SQLite: Structured database for Text-to-SQL RAG
📊 Tavily: Web search API for real-time information
🔍 spaCy: Named entity recognition and NLP processing
🕸️ NetworkX: Graph creation and analysis for knowledge graphs

🚀 Getting Started

Clone the repository
Install dependencies: `pip install -r requirements.txt`
Start with Basic RAG to understand fundamentals
Explore Retriever Techniques for advanced retrieval strategies
Implement ReRanking for better accuracy
Try Hybrid Approaches for production systems
Explore MultiModal RAG for documents with images and text
Advance to Agentic RAG for autonomous intelligent systems
Use RAG Evaluation to benchmark and optimize performance

🎯 Use Cases Covered

📖 Document Q&A: Academic papers, legal documents
🌐 Web Search: Real-time information retrieval
🏢 Enterprise: Knowledge management systems
🔬 Research: Multi-document analysis and synthesis
💬 Chatbots: Context-aware conversational AI
🖼️ Multimodal Analysis: Documents with text, images, and charts
🤖 Autonomous Systems: Agent-based intelligent retrieval
📊 Performance Evaluation: RAG system benchmarking and optimization
🔄 Self-Correcting Systems: Adaptive and corrective RAG implementations
💾 Structured Data: Natural language querying of databases with Text-to-SQL
🔄 Dynamic Knowledge: Self-updating, version-controlled knowledge bases
🧠 Multi-Domain Routing: Intelligent classification and retrieval across different knowledge domains
🕸️ Knowledge Graphs: Entity relationship-enhanced retrieval with graph structures

🎯 Master RAG: From basic retrieval to advanced fusion techniques, this repository provides hands-on notebooks for building development-ready RAG systems.

