This guide explains how to create and test a single agent using the Agentic Chatbot Accelerator. A single agent is the simplest architecture — one agent with direct access to tools, knowledge bases, and MCP servers that handles all user requests on its own.
A single agent configuration consists of:
- Model & Inference Parameters: The LLM model, temperature, max tokens, stop sequences, and optional reasoning budget for extended thinking
- Instructions: A system prompt defining the agent's role and behavior
- Tools: Function-based or object-based tools registered in the tool registry
- MCP Servers: External MCP servers providing additional tool capabilities (e.g. AWS Knowledge, custom APIs)
- Conversation Manager: How conversation history is managed (Sliding Window is recommended)
- Memory (optional): AgentCore Memory for persisting conversation context across sessions
- Structured Output (optional): A dynamic Pydantic model that validates and structures the agent's final response into typed fields
Unlike swarm, graph, or agents-as-tools architectures, a single agent handles everything directly — there is no delegation, handoff, or orchestration. This makes it ideal for focused tasks where one agent with the right tools is sufficient.
The single agent is built on Strands Agents and runs as a containerized runtime on Amazon Bedrock AgentCore.
Before creating a single agent, you need:
- The accelerator deployed (CDK or Terraform stack)
- At least one model available in Amazon Bedrock in your region (e.g. Claude Sonnet 4.6, Claude Haiku 4.5, Nova 2 Lite)
- Tools or MCP servers registered (optional — agents can work without tools)
Go to Agent Factory → Create Agent and select Single Agent as the architecture type.
In the Agent Configuration step:
- Agent Name: Enter a unique name for your agent (e.g. aws_expert)
- Model: Select the LLM model (e.g. Claude Sonnet 4.6, Claude Haiku 4.5)
- Temperature: Controls randomness (0–1). Use lower values (0.1–0.3) for factual tasks and higher values (0.7–0.9) for creative tasks
- Max Tokens: Maximum output tokens (e.g. 3000)
- Extended Thinking (optional): Enable for models that support reasoning. When enabled, select a Reasoning Effort level (Low, Medium, High)
- Agent Instructions: Write a system prompt that defines the agent's role, behavior, and how it should use its tools
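To see why lower temperature values suit factual tasks, the sketch below applies temperature scaling to a toy set of token scores. It is a generic illustration of temperature sampling, not code from the accelerator or from any model provider; the scores and the `softmax_with_temperature` helper are made up for the example.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw token scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

focused = softmax_with_temperature(toy_logits, 0.2)  # factual-style setting
varied = softmax_with_temperature(toy_logits, 0.9)   # creative-style setting

print([round(p, 3) for p in focused])
print([round(p, 3) for p in varied])
```

At the lower temperature almost all probability mass lands on the top-scoring token, so sampling becomes close to deterministic; at the higher temperature the alternatives keep a meaningful share.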
Select the Conversation Manager to control how conversation history is managed:
- Sliding Window (recommended): Maintains a rolling window of recent messages
- Null: No conversation history management
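The idea behind a sliding window can be sketched in a few lines of plain Python. This is a simplified illustration, not the Strands conversation manager itself: the `SlidingWindowHistory` class and its message format are hypothetical, and a real implementation also has to deal with details such as keeping tool calls and their results together.

```python
from collections import deque


class SlidingWindowHistory:
    """Keep only the most recent `window_size` messages."""

    def __init__(self, window_size: int) -> None:
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        # deque(maxlen=...) drops the oldest item automatically when full
        self._messages = deque(maxlen=window_size)

    def add(self, role: str, text: str) -> None:
        self._messages.append({"role": role, "text": text})

    def messages(self) -> list[dict]:
        return list(self._messages)


history = SlidingWindowHistory(window_size=3)
for i in range(5):
    history.add("user", f"message {i}")

print([m["text"] for m in history.messages()])
```

Only the last three messages survive, which keeps the prompt size bounded no matter how long the conversation runs.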
Check Enable AgentCore Memory to persist conversation context across sessions. When enabled, an AgentCore Memory resource is created and attached to your agent runtime, allowing it to maintain context even when sessions are terminated and resumed.
Check Enable Structured Output to have the agent return structured JSON responses validated against typed fields:
- Click + Add Field to define each output field
- For each field, specify:
  - Field Name: A valid Python identifier (e.g. aws_services, summary)
  - Python Type: The field type (str, int, float, bool, list[str], list[int], list[float], dict)
  - Description: A human-readable description of what this field contains
  - Optional: Check if the field can be None
The agent's final response will be automatically parsed into a Pydantic model with these fields.
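Before entering fields in the wizard, it can help to check them against the rules listed above. The sketch below is a hypothetical pre-check and not part of the accelerator: the `validate_field_spec` helper is made up, and the `name`, `pythonType`, and `description` keys are assumed to match the field spec format shown later in this guide.

```python
import keyword

# Type names offered in the wizard, as listed above.
SUPPORTED_TYPES = {
    "str", "int", "float", "bool",
    "list[str]", "list[int]", "list[float]", "dict",
}


def validate_field_spec(spec: dict) -> list[str]:
    """Return a list of problems found in one field spec (empty if none)."""
    problems = []
    name = spec.get("name", "")
    # A usable field name is an identifier that is not a reserved word.
    if not name.isidentifier() or keyword.iskeyword(name):
        problems.append(f"{name!r} is not a valid Python identifier")
    if spec.get("pythonType") not in SUPPORTED_TYPES:
        problems.append(f"unsupported type {spec.get('pythonType')!r}")
    if not spec.get("description"):
        problems.append("description is empty")
    return problems


bad_spec = {
    "name": "aws-services",  # hyphen makes this an invalid identifier
    "pythonType": "list[str]",
    "description": "AWS services cited in the response",
}
print(validate_field_spec(bad_spec))
```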
In the Tools step, extend your agent's capabilities:
- MCP Servers: Select from registered MCP servers (e.g. aws-knowledge for AWS documentation search)
- Custom Tools: Select from registered function-based or object-based tools
- Knowledge Bases: Attach Amazon Bedrock Knowledge Bases for RAG if enabled in the deployment configuration
You can add multiple tools — the agent will decide which one(s) to use based on the user's request and the tool descriptions.
The review step shows a JSON preview of the complete agent configuration. Click Create Runtime to submit. The agent goes through the creation pipeline (Step Function → AgentCore Runtime) and will reach "Ready" status when deployment is complete.
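As a rough idea of what such a preview contains, the sketch below assembles a configuration in Python and prints it as JSON. Every key name here is an assumption chosen for illustration, and the model ID is a placeholder; the real schema is whatever AgentConfiguration defines, so treat this as a reading aid rather than a template.

```python
import json

# All keys below are hypothetical; the actual schema lives in the
# accelerator's AgentConfiguration model.
agent_config = {
    "agentName": "aws_expert",
    "architectureType": "SINGLE",
    "model": {
        "modelId": "<bedrock-model-id>",  # placeholder, not a real ID
        "temperature": 0.2,
        "maxTokens": 3000,
        "reasoningEffort": "medium",
    },
    "instructions": "You are an AWS Solutions Architect assistant ...",
    "mcpServers": ["aws-knowledge"],
    "tools": [],
    "conversationManager": "sliding_window",
    "memoryEnabled": False,
    "structuredOutput": [
        {
            "name": "aws_services",
            "pythonType": "list[str]",
            "description": "AWS services cited in the response",
        },
    ],
}

preview = json.dumps(agent_config, indent=2)
print(preview)
```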
Once the agent reaches "Ready" status:
- Go to the Chat interface
- Select your agent's endpoint
- Send a message — the agent will process the request, optionally invoke tools, and return a response
This example builds a single agent that uses the AWS Knowledge MCP server to answer questions about AWS services with up-to-date documentation.
- Register the AWS Knowledge MCP server with:
  - Name: aws-knowledge
  - URL: https://knowledge-mcp.global.api.aws
  - Auth Type: NONE
  - Description: AWS documentation search, regional availability, and content recommendations
- Create the agent:
  - Agent Factory → Create Agent → Single Agent
  - Name: aws_expert
  - Model: Claude Sonnet 4.6 (or Nova 2 Lite for faster responses)
  - Temperature: 0.2
  - Max Tokens: 3000
  - Extended Thinking: Enabled, Reasoning Effort: Medium
  - Instructions:
You are an AWS Solutions Architect assistant with access to the official AWS documentation through the AWS Knowledge MCP server.
When a user asks about AWS services, features, configurations, or best practices:
1. ALWAYS use the aws___search_documentation tool first to find relevant, up-to-date information. Choose the appropriate topic(s):
- "general" for architecture, best practices, tutorials
- "reference_documentation" for API/SDK/CLI details
- "troubleshooting" for errors and debugging
- "current_awareness" for new features and announcements
- "cdk_docs" or "cdk_constructs" for CDK questions
- "cloudformation" for CloudFormation templates
2. If the search results reference a specific documentation page, use aws___read_documentation to fetch the full content for detailed answers.
3. Use aws___recommend to suggest related documentation the user might find helpful.
4. For region-specific questions, use aws___get_regional_availability to check service availability.
Your responses should:
- Cite the documentation URL for every claim
- Include code examples when available from the docs
- Be concise but thorough
- Clearly distinguish between what the docs say vs. your general knowledge
- If the docs don't cover something, say so explicitly
Never make up AWS service limits, pricing, or feature availability — always verify via the tools.
  - Tools: Select the aws-knowledge MCP server
  - Conversation Manager: Sliding Window
Open the chat interface, select the aws_expert endpoint, and try:
User: What are the best practices for S3 bucket security?
→ Agent searches AWS documentation for S3 security best practices
→ Reads relevant documentation pages for detailed guidance
→ Returns a comprehensive answer with citations and code examples
Other test prompts:
- "How do I set up a Lambda function with a DynamoDB trigger using CDK in TypeScript?"
- "Is Amazon Bedrock available in eu-west-1?"
- "What's new with Amazon ECS in 2025?"
- "I'm getting an AccessDenied error on S3 GetObject, how do I fix it?"
When Extended Thinking is enabled, the agent uses the model's reasoning capabilities to think through complex problems before responding. Configure the Reasoning Effort level:
| Level | Use Case |
|---|---|
| Low | Simple, factual questions |
| Medium | Multi-step reasoning, analysis |
| High | Complex problem-solving, architecture design |
The agent's reasoning process is captured and can be displayed alongside the final response. Extended thinking requires a model that supports it (e.g. Claude Sonnet 4.6).
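One plausible way to turn a Reasoning Effort level into a reasoning budget is sketched below. The budget numbers are invented, and the returned fragment is only loosely modelled on the `thinking` request field that Claude models accept; how the accelerator actually performs this translation is not covered in this guide. The one constraint the sketch enforces is that the thinking budget has to stay below the Max Tokens setting.

```python
# Hypothetical budgets: the accelerator's real mapping is not documented here.
EFFORT_BUDGETS = {"low": 1024, "medium": 2048, "high": 4096}


def thinking_fields(effort: str, max_tokens: int) -> dict:
    """Build an illustrative extended-thinking request fragment."""
    budget = EFFORT_BUDGETS[effort.lower()]
    if budget >= max_tokens:
        raise ValueError(
            f"thinking budget {budget} must stay below max_tokens {max_tokens}"
        )
    return {"thinking": {"type": "enabled", "budget_tokens": budget}}


print(thinking_fields("Medium", max_tokens=3000))
```

With these made-up numbers, a High effort level would not fit under a Max Tokens value of 3000, which is the kind of mismatch worth checking when responses are cut short.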
Structured output turns free-form agent responses into validated, typed JSON objects. This is useful when downstream systems need to consume the agent's output programmatically.
How it works:
- You define field specifications in the agent configuration (name, type, description)
- At runtime, a Pydantic model is dynamically created from these specs (via structured_output.py)
- The Strands agent parses its final LLM response into the model
- The validated structured output is included in the response alongside the text
Supported types: str, int, float, bool, list[str], list[int], list[float], dict
Example: A field spec like { name: "aws_services", pythonType: "list[str]", description: "AWS services cited in the response" } would extract a list of service names from the agent's answer.
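A minimal sketch of how field specs could become a Pydantic model is shown below, using Pydantic's `create_model`. This is not the contents of structured_output.py: the `build_output_model` helper, the `AgentOutput` class name, the `optional` key, and the `summary` description are all assumptions made for the example.

```python
from typing import Optional

from pydantic import Field, create_model

# Maps the type names offered in the wizard to Python types.
TYPE_MAP = {
    "str": str, "int": int, "float": float, "bool": bool,
    "list[str]": list[str], "list[int]": list[int],
    "list[float]": list[float], "dict": dict,
}


def build_output_model(specs: list[dict]):
    """Create a Pydantic model class from a list of field specs."""
    fields = {}
    for spec in specs:
        py_type = TYPE_MAP[spec["pythonType"]]
        if spec.get("optional", False):  # "optional" key is an assumption
            fields[spec["name"]] = (
                Optional[py_type],
                Field(default=None, description=spec["description"]),
            )
        else:
            # Ellipsis marks the field as required
            fields[spec["name"]] = (
                py_type,
                Field(..., description=spec["description"]),
            )
    return create_model("AgentOutput", **fields)


specs = [
    {"name": "aws_services", "pythonType": "list[str]",
     "description": "AWS services cited in the response"},
    {"name": "summary", "pythonType": "str",
     "description": "Short summary of the answer", "optional": True},
]

AgentOutput = build_output_model(specs)
result = AgentOutput(aws_services=["Amazon S3", "AWS IAM"])
print(result.aws_services, result.summary)
```

Passing a value of the wrong type, such as a number for `aws_services`, raises a validation error, which is how mismatches between the field types and the agent's output would surface.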
When enabled, AgentCore Memory provides persistent conversation context:
- Conversation state is stored both locally (in-memory) and remotely (AgentCore Memory)
- When a session is resumed, the session manager rehydrates state from AgentCore Memory
- This allows the agent to remember previous conversations even after the container restarts
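The local-plus-remote pattern described above can be illustrated with a toy session store. A plain dictionary stands in for AgentCore Memory here, and the `SessionStore` class is hypothetical; the real session manager and its API are not shown in this guide.

```python
class SessionStore:
    """Toy session manager: a local cache backed by a remote store."""

    def __init__(self, remote: dict) -> None:
        self._remote = remote  # stand-in for AgentCore Memory
        self._local = {}       # in-memory state, lost when the container restarts

    def history(self, session_id: str) -> list[str]:
        # Rehydrate from the remote store if the local cache is cold.
        if session_id not in self._local:
            self._local[session_id] = list(self._remote.get(session_id, []))
        return self._local[session_id]

    def append(self, session_id: str, message: str) -> None:
        # Write to both places so the remote copy survives a restart.
        self.history(session_id).append(message)
        self._remote.setdefault(session_id, []).append(message)


remote_memory = {}
first = SessionStore(remote_memory)
first.append("session-1", "My account is in eu-west-1.")

# Simulate a container restart: new object, empty local cache, same remote.
restarted = SessionStore(remote_memory)
print(restarted.history("session-1"))
```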
To inspect an existing single agent's configuration:
- Go to Agent Factory
- Find the agent in the table — the Architecture column shows "SINGLE"
- Click on a version to open the View Version modal
- The modal displays: model configuration, agent instructions, tools, MCP servers, conversation manager, and structured output settings
To update an agent's configuration:
- Select the agent in the Agent Factory table
- Click New version
- The wizard opens with the existing configuration pre-populated
- Modify model, instructions, tools, or other settings as needed
- Click Create Runtime to deploy the new version
- The UI sends a createAgentCoreRuntime mutation with architectureType: SINGLE and the agent config as configValue
- The Agent Factory Resolver validates the config against AgentConfiguration (Pydantic) and starts a Step Function
- The Step Function invokes the Create Runtime Version Lambda, which selects the single agent Docker container (docker/)
- At runtime, the container's modules do the following:
  - data_source.py loads the agent configuration from DynamoDB
  - registry.py loads available tools and MCP servers from DynamoDB at module initialization
  - factory.py creates a Strands Agent with the configured model, instructions, tools, conversation manager, and optional session manager
  - callbacks.py hooks into tool execution events: logging tool invocations, capturing tool arguments/results for trajectory evaluation, and tracking Knowledge Base references
  - app.py handles incoming requests: initializes the agent once per session, streams tokens via SSE, captures reasoning content, and returns the final response with optional structured output and trajectory data
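To make the SSE streaming step more concrete, the sketch below frames a sequence of tokens as Server-Sent Events. The `data: ...` line followed by a blank line is the standard SSE framing; the JSON payload shape (`type`, `text`) and the `sse_events` helper are assumptions and do not reflect the event format that app.py actually emits.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each token in a Server-Sent Events frame."""
    for token in tokens:
        # JSON-encode the payload so newlines inside a token cannot break the frame.
        payload = json.dumps({"type": "token", "text": token})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop.
    done = json.dumps({"type": "done"})
    yield f"data: {done}\n\n"


frames = list(sse_events(["Hello", ", ", "world"]))
print("".join(frames), end="")
```

A client reads the stream frame by frame, appends each `text` value to the displayed answer, and stops when it sees the final marker.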
| File | Role |
|---|---|
| docker/app.py | Entry point: handles requests, streams responses, manages sessions |
| docker/src/data_source.py | Loads agent configuration from DynamoDB |
| docker/src/factory.py | Creates the Strands Agent with model, tools, and callbacks |
| docker/src/types.py | Pydantic models for AgentConfiguration and StructuredOutputFieldSpec |
| docker/src/registry.py | Loads tool and MCP server registries from DynamoDB |
| docker/src/callbacks.py | Tool execution logging and trajectory capture for evaluations |
| docker/src/structured_output.py | Dynamically builds Pydantic models from field specifications |
Tips for writing agent instructions:
- Be specific about the agent's role and domain expertise
- Reference tools by name and explain when each should be used
- Include output formatting guidelines (e.g. "Use markdown headings", "Always cite sources")
- Set boundaries — tell the agent what it should not do or when to say "I don't know"
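A quick way to check that the instructions reference tools by name is to compare the prompt against the tool list. The helper below is a hypothetical convenience and not part of the accelerator; the tool names are the ones used in the AWS expert example.

```python
def missing_tool_mentions(instructions: str, tool_names: list[str]) -> list[str]:
    """Return the tool names that the instructions never mention."""
    return [name for name in tool_names if name not in instructions]


instructions = (
    "ALWAYS use the aws___search_documentation tool first. "
    "Use aws___read_documentation to fetch the full page."
)
tools = [
    "aws___search_documentation",
    "aws___read_documentation",
    "aws___recommend",
]

print(missing_tool_mentions(instructions, tools))
```

Any name in the result is a tool the agent has access to but has received no guidance on when to use.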
Tips for choosing tools:
- Start with MCP servers for external capabilities; they are the recommended approach
- Use Knowledge Bases for RAG over your own documents
- Use custom tools for quick prototyping or tightly-coupled logic
- Don't overload the agent with too many tools — the model performs better with a focused set
| Pattern | Best for |
|---|---|
| Single Agent | Focused tasks with direct tool access — the simplest and fastest to set up |
| Agents as Tools | Dynamic delegation where an orchestrator decides which specialists to invoke |
| Swarm | Collaborative workflows where agents hand off conversations to each other |
| Graph | Predefined workflows with fixed execution paths and conditional routing |
| Issue | Cause | Fix |
|---|---|---|
| Tools not appearing | Tool/MCP not registered | Check the tool registry in DynamoDB or register via the UI |
| MCP server connection fails | Wrong auth type or endpoint | Verify the MCP server configuration (SigV4 vs. NONE, correct Runtime/Gateway ID) |
| Structured output parsing fails | Field types don't match agent output | Adjust field descriptions to guide the model, or simplify field types |
| Extended thinking not working | Model doesn't support reasoning | Use a model that supports extended thinking (e.g. Claude Sonnet 4.6) |
| Session state lost | Memory not enabled | Enable AgentCore Memory in the agent configuration |
| Slow responses | High reasoning effort or many tools | Reduce reasoning effort, limit tool count, or use a faster model |
| "Configuration not found" error | DynamoDB item missing | Verify the agent was created successfully and check CloudWatch logs |

