Context Settings
The Context tab controls how your agent manages memory, processes conversation history, and reasons through problems. These settings directly impact response quality, cost, and agent capabilities.
Overview
Context settings determine:
- How much conversation history the agent remembers
- How intelligently that memory is managed
- How many reasoning steps the agent can take
- The total amount of information the agent can process
- Whether prompts are automatically optimized
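To make these settings concrete, the sketches in this section use plain Python with illustrative names; the fields below (smart_context, prompt_optimization, max_tokens, message_history_limit, max_reasoning_steps) are assumptions standing in for whatever your platform's configuration actually exposes, not a real SDK.

```python
# Hypothetical configuration sketch - field names are illustrative, not a real SDK.
# It mirrors the five settings covered in this guide, with the default values
# recommended below.
from dataclasses import dataclass

@dataclass
class ContextSettings:
    smart_context: bool = True          # intelligent memory selection
    prompt_optimization: bool = True    # automatic prompt enhancement
    max_tokens: int = 50_000            # total context budget
    message_history_limit: int = 50     # previous messages to include
    max_reasoning_steps: int = 10       # tool calls / reasoning iterations

settings = ContextSettings()
```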
Smart Context
Automatically manages conversation memory by intelligently selecting the most relevant previous messages.
What is Smart Context?
Instead of including the entire conversation history (which wastes tokens and can confuse the agent), Smart Context:
- Analyzes the current query and full conversation
- Selects the most relevant previous messages
- Includes only pertinent context for this specific response
- Reduces token usage while improving quality
How It Works
Without Smart Context, the entire conversation history is sent with every request. With Smart Context, only the messages relevant to the current query are selected and included.
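As a rough sketch of the selection idea (this is not the platform's actual Smart Context algorithm, just an illustration of relevance-based filtering), prior messages can be scored against the current query and only the top matches kept:

```python
# Illustrative sketch of relevance-based message selection - not the platform's
# actual Smart Context algorithm, just the general idea it implements.
def select_relevant_messages(history: list[str], query: str, keep: int = 10) -> list[str]:
    query_terms = set(query.lower().split())

    def relevance(message: str) -> int:
        # Naive relevance score: how many query terms appear in the message.
        return len(query_terms & set(message.lower().split()))

    top_scoring = set(sorted(history, key=relevance, reverse=True)[:keep])
    # Preserve the original conversation order for the selected messages.
    return [m for m in history if m in top_scoring]
```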
Benefits
Improved Quality
Agent focuses on relevant information, not distracted by unrelated history
Reduced Costs
Fewer tokens = lower costs per request
Longer Conversations
Stay within token limits even in extended conversations
Better Performance
Less context to process = faster responses
When to Enable
Enable (Recommended)
Use Smart Context for:
- Multi-turn conversations
- Customer service chatbots
- Long interactions
- Cost-sensitive applications
- Most production use cases
With Smart Context enabled, you get:
- ✅ Automatic memory optimization
- ✅ Lower token usage
- ✅ Better focus on relevant context
- ✅ No configuration needed
- ✅ Works automatically
Disable only if you have a specific reason to include the full, unfiltered conversation history with every request.
Prompt Optimization
Automatically enhances your agent’s prompts based on its configuration to achieve better results.
What is Prompt Optimization?
Prompt Optimization analyzes your agent’s:
- Instructions
- Tools and connectors
- Output schema
- Use case
Based on that analysis, it:
- Structures the prompt for better AI performance
- Emphasizes important instructions
- Optimizes for the specific model being used
- Improves reasoning and tool usage
How It Works
Without Prompt Optimization, your instructions are sent to the model exactly as you wrote them. With it enabled, the prompt is restructured and key instructions are emphasized for the model you selected.
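As a hand-rolled illustration of the kind of restructuring Prompt Optimization performs (the platform's actual optimization is internal and model-specific; build_structured_prompt and its parameters are hypothetical):

```python
# Illustrative sketch of manual prompt structuring - hypothetical helper, not the
# platform's API. It shows the kind of restructuring Prompt Optimization performs:
# clear sections, emphasized hard rules, and explicit tool guidance.
def build_structured_prompt(instructions: str, rules: list[str], tools: list[str]) -> str:
    sections = [
        "## Role and Instructions",
        instructions.strip(),
        "## Hard Rules (always follow)",
        *[f"- IMPORTANT: {rule}" for rule in rules],
        "## Available Tools",
        *[f"- {tool}" for tool in tools],
        "## Approach",
        "Think step by step. Consult tools before answering when relevant.",
    ]
    return "\n".join(sections)

prompt = build_structured_prompt(
    instructions="You are a customer support agent.",
    rules=["Only escalate refunds over $200."],
    tools=["search_knowledge_base: search help articles before answering"],
)
```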
Benefits
Better Instruction Following
Agent is more likely to follow complex or nuanced instructions correctly.
Example: Instructions about “only escalate refunds over $200” are emphasized in a way the model understands better.
Improved Tool Usage
Agent makes better decisions about when and how to use tools.
Example: “Search knowledge base before answering” becomes a stronger directive that the agent follows more consistently.
Better Reasoning
Agent thinks through problems more systematically.
Example: Multi-step problems are structured for step-by-step reasoning.
Model-Specific Optimization
Prompts are tailored to work best with the specific model you selected.
Example: GPT-4 and Claude respond best to different prompt formats; optimization handles this automatically.
When to Enable
Enable (Recommended)
Use Prompt Optimization for:
- Complex agent instructions
- Agents with multiple tools
- Production deployments
- When quality is critical
- Most use cases
With Prompt Optimization enabled, you get:
- ✅ Better agent performance
- ✅ More consistent results
- ✅ Improved tool usage
- ✅ No manual prompt engineering
- ✅ Model-specific tuning
Disable only if you are an experienced prompt engineer who wants full control over the exact prompt sent to the model.
Maximum Tokens
The maximum number of tokens the agent can use for context. This includes system instructions, conversation history, tool descriptions, and the agent’s reasoning.
What are Tokens?
Tokens are the basic units that AI models process:
- Roughly 4 characters = 1 token
- Roughly 0.75 words = 1 token
- “Hello world!” = ~3 tokens
- This paragraph = ~50 tokens
Token Calculator:
50,000 tokens ≈ 37,500 words ≈ 75 pages of text
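For a quick sanity check of these numbers in code, the 4-characters-per-token figure can be turned into a rough estimator; for exact counts you would use the model's real tokenizer (for example, tiktoken for OpenAI models):

```python
# Rough token estimate using the ~4 characters per token rule of thumb.
# For exact counts, use the model's real tokenizer (e.g. tiktoken for OpenAI models).
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

print(estimate_tokens("Hello world!"))  # ~3 tokens
print(int(50_000 * 0.75))               # 50,000 tokens ≈ 37,500 words
```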
What Counts Toward the Limit
All of these count toward your token limit:
System Instructions
Your agent instructions from the Information tab.
Typical size:
- Simple: 200-500 tokens
- Detailed: 500-1,500 tokens
- Very detailed: 1,500-3,000 tokens
Tool Descriptions
Descriptions of available tools and how to use them.
Typical size per tool:
- Simple tool: 100-300 tokens
- Complex tool: 300-800 tokens
- 5 tools ≈ 1,000-2,000 tokens
Conversation History
Previous messages (limited by the Message History setting).
Typical size:
- Short message: 50-150 tokens
- Long message: 150-500 tokens
- 50 messages ≈ 5,000-10,000 tokens
User Prompt
The current input to the agent.
Typical size:
- Simple question: 10-50 tokens
- Detailed request: 50-200 tokens
- Long document: 200-5,000+ tokens
Agent's Reasoning
Internal reasoning steps and tool usage.
Typical size:
- Simple response: 100-500 tokens
- Tool usage: 200-800 tokens per tool call
- Complex reasoning: 1,000-5,000+ tokens
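Putting those typical sizes together gives a rough budget check. The values below are illustrative mid-range picks from the lists above, not measurements:

```python
# Rough context budget using mid-range values from the typical sizes listed above.
budget = {
    "system_instructions": 1_000,    # detailed instructions
    "tool_descriptions": 1_500,      # ~5 tools
    "conversation_history": 7_500,   # ~50 messages
    "user_prompt": 200,              # detailed request
    "agent_reasoning": 3_000,        # a few tool calls plus reasoning
}

total = sum(budget.values())         # 13,200 tokens
print(f"Estimated usage: {total:,} of 50,000 tokens "
      f"({total / 50_000:.0%} of the default limit)")
```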
Setting the Limit
Default: 50,000 tokens
Ranges: Low (8K-16K), Medium (16K-50K), High (50K-128K), Very High (128K-200K)
Low (8K-16K)
Use for:
- Simple, single-turn interactions
- Minimal conversation history
- Cost-sensitive applications
- Fast responses needed
- Basic classification
- Simple Q&A
- One-shot processing
- Minimal tools
Trade-offs:
- ⚠️ Limited history
- ⚠️ Few tools available
- ⚠️ Can’t handle long inputs
Choosing the Right Limit
Decision Guide
Ask yourself:
1. How long are typical inputs?
   - Short (< 500 words) → 16K-50K
   - Medium (500-2,000 words) → 50K-128K
   - Long (2,000+ words) → 128K+
2. How many tools does the agent use?
   - None or 1-2 → 16K-50K
   - 3-5 → 50K
   - 6-10 → 50K-128K
   - 10+ → 128K+
3. How long are conversations?
   - Single turn → 16K
   - 5-20 turns → 50K
   - 20-50 turns → 50K-128K
   - 50+ turns → 128K+
4. What’s your budget?
   - Cost-sensitive → Use the minimum needed
   - Standard → 50K
   - Premium → 128K+
Message History Limit
Maximum number of previous messages to include in conversation context. Works with Smart Context to determine which messages the agent can access.
What is Message History?
The conversation history is the list of previous messages between the user and agent.
How It Works
With a Message History Limit of 50, only the 50 most recent messages are included in the agent’s context; older messages are dropped, and Smart Context (if enabled) then selects the most relevant of what remains.
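A minimal sketch of the trimming behavior, assuming a simple list of role/content messages (illustrative only, not the platform's internal implementation):

```python
# Illustrative sketch: keep only the most recent N messages, which is the effect
# of the Message History Limit before Smart Context selection runs.
def trim_history(messages: list[dict], limit: int = 50) -> list[dict]:
    return messages[-limit:] if limit else []

conversation = [{"role": "user", "content": f"message {i}"} for i in range(120)]
recent = trim_history(conversation, limit=50)
print(len(recent))  # 50 - older messages are no longer visible to the agent
```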
Setting the Limit
Default: 50 messages (approximately 25 turns)
Ranges: Low (10-20), Medium (20-50), High (50-100), Very High (100-200)
Low (10-20)
Use for:
- Short interactions
- Simple Q&A
- Cost optimization
- Single-topic conversations
- 5-10 conversation turns
- Basic customer service
- Simple automation
Trade-offs:
- ⚠️ Can’t reference older context
- ⚠️ Not good for complex conversations
Choosing the Right Limit
Decision Guide
Consider:
1. Typical conversation length?
   - 1-3 questions → 10-20 messages
   - 5-15 questions → 20-50 messages
   - 15-30 questions → 50-100 messages
   - 30+ questions → 100-200 messages
2. Need to reference old context?
   - Rarely → Lower limit
   - Sometimes → Medium limit
   - Frequently → Higher limit
3. Cost sensitivity?
   - Very sensitive → Lower limit
   - Standard → Medium limit
   - Not concerned → Higher limit
4. Smart Context enabled?
   - Yes → Can use higher limits (it optimizes)
   - No → Use lower limits to control costs
At 50 messages: You get approximately 25 conversation turns (each turn = 1 user message + 1 agent message). This is plenty for most customer service and automation scenarios.
Maximum Reasoning Steps
Limits how many times the agent can use tools or reason through a problem before providing a final response.
What are Reasoning Steps?
A reasoning step is any action the agent takes:
- Tool usage - Calling an API, searching a knowledge base, querying a database
- Internal reasoning - Thinking through a problem step-by-step
- Decision-making - Evaluating options and choosing a path
Why Limit Reasoning Steps?
Prevent Infinite Loops
Without limits, an agent could get stuck in a loop, for example repeatedly calling the same tool without making progress. The limit prevents this.
Control Costs
Each reasoning step uses tokens, so more steps mean higher costs. The limit caps this.
Ensure Timely Responses
Each step takes time, so more steps mean slower responses. The limit prevents excessive delays.
Encourage Efficiency
Limits encourage the agent to be efficient.
❌ With unlimited steps:
- Try every tool
- Excessive reasoning
- Redundant checks
✅ With a reasonable limit:
- Choose the best tool first
- Efficient reasoning
- Direct path to answer
Setting the Limit
Default: 10 steps
Ranges: Low (3-5), Medium (5-10), High (10-20), Very High (20-30)
Low (3-5)
Use for:
- Simple tasks
- Single tool usage
- Fast responses critical
- Cost-sensitive applications
- 1-2 tool calls
- Simple reasoning
- Basic automation
Trade-offs:
- ⚠️ Can’t handle complex tasks
- ⚠️ May fail on multi-step problems
Choosing the Right Limit
Decision Guide
Consider:
1. Task complexity?
   - Simple (1 tool) → 3-5 steps
   - Moderate (2-3 tools) → 5-10 steps
   - Complex (4-6 tools) → 10-20 steps
   - Very complex (7+ tools) → 20-30 steps
2. How many tools available?
   - 1-2 tools → 5 steps
   - 3-5 tools → 10 steps
   - 6-10 tools → 15 steps
   - 10+ tools → 20 steps
3. Response time requirements?
   - Must be fast → Lower limit
   - Standard → 10 steps
   - Can be slower → Higher limit
4. Cost sensitivity?
   - Very sensitive → Lower limit
   - Standard → 10 steps
   - Not concerned → Higher limit
What Happens When Limit is Reached
When the agent hits the reasoning step limit:
- Agent stops reasoning
- Returns best answer so far
- May include a note that it couldn’t complete
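A minimal sketch of how such a cap typically works in an agent loop; run_agent and the injected run_step callback are hypothetical stand-ins, not the platform's internals:

```python
from typing import Callable

# Illustrative agent loop with a reasoning-step cap. `run_step` is a caller-supplied
# function standing in for one tool call or internal reasoning pass.
def run_agent(task: str, run_step: Callable[[dict], dict], max_reasoning_steps: int = 10) -> str:
    state = {"task": task, "answer": None, "complete": False}
    for _ in range(max_reasoning_steps):
        state = run_step(state)              # one reasoning step or tool call
        if state["complete"]:
            return state["answer"]           # finished within the limit
    # Limit reached: stop reasoning and return the best answer so far, with a note.
    partial = state["answer"] or "No answer produced."
    return partial + "\n(Note: reasoning step limit reached before the task completed.)"

def demo_step(state: dict) -> dict:
    # Trivial stand-in that "solves" the task in one step.
    return {**state, "answer": f"Handled: {state['task']}", "complete": True}

print(run_agent("Check order status", demo_step, max_reasoning_steps=10))
```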
Best Practices
Keep Smart Context enabled
Smart Context is a free optimization that:
- Reduces costs
- Improves focus
- Enables longer conversations
- Works automatically
Use Prompt Optimization by default
Unless you’re an expert prompt engineer:
- Keep it enabled
- Let the system optimize
- Focus on clear instructions
- Don’t worry about prompt formatting
Start with default token limit (50K)
50,000 tokens is sufficient for most use cases:
- Standard conversations
- Multiple tools
- Reasonable history
- Good balance of cost/capability
Monitor token usage
Track how much context your agents actually use:
- Are they consistently near the limit?
- Are they using only a fraction?
- Adjust limits based on actual usage
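One way to act on this, assuming you can export per-request token counts from your platform's logs or analytics (the tokens_used field name is hypothetical):

```python
# Hypothetical usage-monitoring sketch - assumes you can export per-request token
# counts from your platform's logs or analytics; field names are illustrative.
def utilization_report(requests: list[dict], max_tokens: int = 50_000) -> None:
    ratios = [r["tokens_used"] / max_tokens for r in requests]
    high = sum(1 for x in ratios if x > 0.9)
    low = sum(1 for x in ratios if x < 0.25)
    print(f"Average utilization: {sum(ratios) / len(ratios):.0%}")
    print(f"Requests above 90% of the limit: {high} (consider raising the limit)")
    print(f"Requests below 25% of the limit: {low} (consider lowering the limit)")

utilization_report([{"tokens_used": 12_000}, {"tokens_used": 48_500}], max_tokens=50_000)
```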
Limit message history for cost control
More history = more tokens = higher costs:
- Most use cases: 50 messages is plenty
- Simple bots: 20 messages may be enough
- Complex conversations: 100 messages if needed
Set reasoning steps based on complexity
Match the limit to the task:
- Simple (1-2 tools) → 5 steps
- Standard (3-5 tools) → 10 steps
- Complex (6+ tools) → 15-20 steps
Test edge cases
Verify limits work for:
- Longest expected conversations
- Most complex tasks
- Edge cases with many tools
Optimize over time
As you learn your agents’ patterns:
- Lower unused capacity
- Increase where agents struggle
- Fine-tune for specific use cases
Troubleshooting
Agent hitting token limits
Symptoms:
- Error: “Token limit exceeded”
- Responses cut off
- Agent can’t complete tasks
Solutions:
- Increase Maximum Tokens limit
- Reduce Message History Limit
- Simplify agent instructions
- Remove unnecessary tools
- Enable Smart Context (if not already)
- Use a model with higher context (Claude, GPT-4 Turbo)
Agent forgetting previous context
Symptoms:
- Repeating questions
- Not remembering earlier conversation
- Losing track of context
Solutions:
- Increase Message History Limit
- Check Smart Context is enabled
- Verify conversation is actually multi-turn
- Ensure messages are being saved correctly
Agent hitting reasoning step limit
Symptoms:
- Incomplete answers
- “I couldn’t complete analysis”
- Tasks not finished
Solutions:
- Increase Maximum Reasoning Steps
- Simplify the task
- Reduce number of tools (remove unused ones)
- Break complex tasks into multiple agents
- Check agent isn’t stuck in loops
Responses are slow
Causes:
- High token limits
- Many reasoning steps
- Large message history
- Complex tools
Solutions:
- Reduce token limits (if not using full capacity)
- Lower reasoning step limit
- Reduce message history
- Use faster model (GPT-3.5 vs GPT-4)
- Optimize tool descriptions
Costs higher than expected
Check:
- Token limits set too high?
- Message history too long?
- Reasoning steps too high?
- Smart Context disabled?
- Using expensive model?
Solutions:
- Right-size token limits to actual usage
- Lower message history to minimum needed
- Reduce reasoning steps if not all used
- Enable Smart Context
- Consider Workforce model
- Monitor per-agent costs