Agent Cortex is a local, multi-tool AI assistant built with LangChain, local vector retrieval (ChromaDB), and a reasoning-capable language model (Mistral 7B). It can answer questions using a mix of:
- 🔍 Document retrieval (RAG)
- 💬 Short-term memory (chat history)
- 🧠 Long-term memory (fact retention across sessions)
- 🌐 Web search (DuckDuckGo)
- 🐍 Python interpreter tool
- 🧮 Math calculations
All running locally with no paid APIs.
In this terminal session, I tell Agent Cortex my name. It identifies the name as a fact worth keeping and saves it both to short-term memory and to the long-term vector store. When I ask again in the same session, it remembers my name because short-term memory injects it into the context. After shutting the system down and restarting, it can still recall my name by deciding to use the long-term memory tool, which retrieves the stored fact via RAG:
- **Retrieval (RAG)** — "Where are the fireworks on July 3rd?"
  → Retrieved from local documents indexed with Bristol, RI event information.
- **Web Search** — "What is the weather in Bristol, RI tomorrow?"
  → Uses DuckDuckGo to search live internet results and summarizes the forecast.
- **Calculator Tool** — "What is 5 * 7 + 15?"
  → Routes through a custom calculator tool to evaluate the expression.
- **Short-Term Memory** — "My name is Jacob." then later: "What is my name?"
  → The agent recalls the name from the current session's chat history.
- **Long-Term Memory** — The agent determines whether an input is worth storing long-term and, if so, saves it to a local vector database. After the agent is shut down and later restarted, it can still recall the saved facts about the user (see the memory sketch after this list).
- **Python REPL** — `sum([1, 2, 3])`
  → Executes Python code securely using a local interpreter.
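As a rough illustration of the memory mechanics, here is a minimal sketch of how short-term and long-term memory could be wired with the stack listed below (LangChain, ChromaDB, `all-MiniLM-L6-v2`); the paths and helper names are hypothetical, not the project's actual API:

```python
from langchain.memory import ConversationBufferMemory
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Short-term memory: chat history held in RAM, lost when the CLI exits.
short_term = ConversationBufferMemory(memory_key="chat_history")

# Long-term memory: a persisted vector store, which is what survives restarts.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
long_term = Chroma(
    collection_name="long_term_memory",
    embedding_function=embeddings,
    persist_directory="data/memory",  # hypothetical path
)

def remember_fact(fact: str) -> None:
    """Store a fact such as "The user's name is Jacob"."""
    long_term.add_texts([fact])

def recall(query: str, k: int = 3) -> list[str]:
    """Fetch the k stored facts most similar to a query like "What is my name?"."""
    return [d.page_content for d in long_term.similarity_search(query, k=k)]
```

Because the Chroma collection is written to disk, a fresh process can answer "What is my name?" by similarity search alone, which is the cross-session recall shown in the transcript above.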
Each query is interpreted by the ReAct-based agent and routed to the appropriate tool, all executed locally with no API calls to a hosted LLM and no usage billing. The web search does go out to the live internet, but no outside LLMs are used for reasoning.
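To make the routing concrete, here is a hedged sketch of how a custom calculator tool and a ReAct agent could be assembled with LangChain's classic agent initializer; the safe-AST evaluator is an assumption, not necessarily how the project's calculator works:

```python
import ast
import operator

from langchain.agents import AgentType, initialize_agent
from langchain.tools import Tool
from langchain_community.llms import Ollama

# Whitelist of arithmetic operators so we never eval() arbitrary code.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _eval_node(node: ast.AST) -> float:
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    raise ValueError("unsupported expression")

def calculate(expression: str) -> str:
    """Evaluate arithmetic like '5 * 7 + 15' -> '50'."""
    return str(_eval_node(ast.parse(expression, mode="eval").body))

calculator = Tool(name="calculator", func=calculate,
                  description="Evaluates basic arithmetic expressions.")

# The real agent would register retrieval, search, memory, and REPL tools too.
agent = initialize_agent(
    tools=[calculator],
    llm=Ollama(model="mistral"),  # served locally by Ollama
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,  # softens the ReAct formatting issue noted under limitations
)

agent.run("What is 5 * 7 + 15?")
```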
- Retrieval-Augmented Generation from `.txt` documents
- Short-term memory using chat history context
- Long-term memory — agent remembers facts across sessions
- Tool-based reasoning using LangChain's ReAct agent
- DuckDuckGo web search integration (see the sketch after this list)
- 🐍 Python REPL tool for executing code
- Calculator for numeric inputs
- Mistral 7B via Ollama (runs locally as CLI)
- Fallback + context injection for vague queries
- CLI-based agent chat interface loop
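Two of the items above, web search and the Python REPL, ship as off-the-shelf LangChain tools; a minimal sketch, assuming the `duckduckgo-search` and `langchain-experimental` packages are installed:

```python
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_experimental.tools import PythonREPLTool

# Live web search with no API key or billing.
search = DuckDuckGoSearchRun()
print(search.run("weather in Bristol, RI tomorrow"))

# Executes Python locally; note this runs real code, so treat input with care.
repl = PythonREPLTool()
print(repl.run("sum([1, 2, 3])"))
```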
- LangChain
- Ollama (local LLM hosting)
- Mistral 7B
- ChromaDB (vector store)
- HuggingFace Embeddings (`all-MiniLM-L6-v2`)
- Python 3.10
- Poetry for dependency management
This project uses Ollama to run the mistral model locally.
```bash
brew install ollama
```

Or download from ollama.com/download and install the desktop app.

```bash
ollama run mistral
```

This will download and launch the Mistral model. Leave it running.

Make sure `curl http://localhost:11434` returns `{"status":"ok"}`.
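Once the server is up, you can also sanity-check it from Python using LangChain's Ollama wrapper; a minimal sketch assuming the default port (11434):

```python
from langchain_community.llms import Ollama

# Connects to the local Ollama server on its default port; no remote API or key.
llm = Ollama(model="mistral")
print(llm.invoke("Reply with the single word: ready"))
```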
```bash
git clone https://github.com/YOUR_USERNAME/agent_cortex_v1.git
cd agent_cortex_v1
poetry install
```

Place `.txt` files under `data/documents`, then build the index:

```bash
PYTHONPATH=. poetry run python scripts/index_documents.py
```

Then start the agent:

```bash
poetry run python main.py
```

You'll be prompted with:

```
You:
```
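For reference, a minimal sketch of what an indexing script like `scripts/index_documents.py` could do; the output path is an assumption, and, matching the limitations below, each file is embedded whole with no chunking:

```python
from pathlib import Path

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Read each .txt file under data/documents as one document (no chunking).
texts = [p.read_text() for p in Path("data/documents").glob("*.txt")]

# Embed with all-MiniLM-L6-v2 and persist the Chroma index to disk
# so the agent's retriever can load it on startup.
Chroma.from_texts(
    texts,
    embedding=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),
    persist_directory="data/index",  # hypothetical location
)
```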
Try asking:
"What time is the Fourth of July parade?""My name is Jacob" followed by "What is my name?""sum([2, 4, 6])""Who is todays date and the weather look like in Boston?""What is 25 * 4 + 3?"
While Agent Cortex v1 is functional, it's an early prototype with several known limitations:
- Long-term memory is fact-based only: It stores facts like names and locations, not full conversations.
- Short-term memory is session-only: Once you close the CLI, short-term context is reset.
- No agent reflection or self-correction: It does not retry intelligently or summarize thoughts beyond what the base model provides.
- Inconsistent ReAct formatting: The LLM may sometimes fail to produce valid Thought / Action / Action Input format, causing parsing errors or retries.
- Fallbacks are basic: They do not yet include streaming or error correction.
- Only supports .txt files: No PDF, HTML, or Markdown parsing.
- No document metadata or filtering: The retriever does not rank sources by type, date, or confidence.
- No chunking or advanced preprocessing: Raw text is split into single documents without semantic boundaries.
- No multi-vector fusion: Only single-query similarity search; no query rewriting or reranking logic.
- Static index: You must manually re-index documents after any updates.
- No streaming output: The full response is printed only after the agent completes.
- Latency: Mistral via Ollama is slower than hosted APIs, especially on lower-spec machines.
- Ollama dependency: Requires installing and running the Ollama server separately, which some users may find nontrivial.
- No fine-tuning: The Mistral model is used out-of-the-box with no task-specific customization.
- No prompt injection prevention: User input is not sanitized or structured securely for prompt-based attacks.
- No multi-turn tool use: Tools are single-action only — no recursive or multi-step reasoning chains.
MIT
