Context Settings
The Context tab controls how your agent manages memory, processes conversation history, and reasons through problems. These settings directly impact response quality, cost, and agent capabilities.
Overview
Context settings determine:
- How much conversation history the agent remembers
- How intelligently that memory is managed
- How many reasoning steps the agent can take
- The total amount of information the agent can process
- Whether prompts are automatically optimized
Smart Context
Automatically manages conversation memory by intelligently selecting the most relevant previous messages.
What is Smart Context?
Instead of including the entire conversation history (which wastes tokens and can confuse the agent), Smart Context:
- Analyzes the current query and the full conversation
- Selects the most relevant previous messages
- Includes only pertinent context for this specific response
- Reduces token usage while improving quality
How It Works
Without Smart Context, every previous message is sent with each request. With it enabled, the agent receives only the messages relevant to the current query.
Benefits
- Improved Quality - the agent focuses on relevant context instead of noise
- Reduced Costs - fewer tokens are sent with each request
- Longer Conversations - history no longer exhausts the token limit
- Better Performance - less irrelevant context to process
When to Enable
Enable (recommended) for:
- Multi-turn conversations
- Customer service chatbots
- Long interactions
- Cost-sensitive applications
- Most production use cases
Why enable it:
- ✅ Automatic memory optimization
- ✅ Lower token usage
- ✅ Better focus on relevant context
- ✅ No configuration needed
- ✅ Works automatically
Disable only if you need the full, unfiltered history included in every request.
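The selection step can be pictured with a small sketch. This is purely illustrative: the real Smart Context feature is internal to the platform, and the word-overlap score below is a stand-in for whatever relevance model it actually uses.

```python
# Hypothetical sketch of relevance-based message selection, in the spirit of
# Smart Context. Uses simple word overlap as a stand-in relevance score.

def relevance(query: str, message: str) -> float:
    """Score a past message by word overlap with the current query (Jaccard)."""
    q_words = set(query.lower().split())
    m_words = set(message.lower().split())
    if not q_words or not m_words:
        return 0.0
    return len(q_words & m_words) / len(q_words | m_words)

def select_context(query: str, history: list[str], top_k: int = 3) -> list[str]:
    """Keep only the top_k most relevant past messages, in original order."""
    scored = sorted(history, key=lambda m: relevance(query, m), reverse=True)
    keep = set(scored[:top_k])
    return [m for m in history if m in keep]

history = [
    "How do I reset my password?",
    "What are your business hours?",
    "My password reset email never arrived.",
    "Do you ship internationally?",
]
# Only the two password-related messages make it into context:
print(select_context("I still can't reset my password", history, top_k=2))
```

Note how the selected messages stay in chronological order even though they are ranked by relevance; the off-topic shipping and business-hours messages are dropped entirely.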
Prompt Optimization
Automatically enhances your agent's prompts based on its configuration to achieve better results.
What is Prompt Optimization?
Prompt Optimization analyzes your agent's:
- Instructions
- Tools and connectors
- Output schema
- Use case
Based on this analysis, it:
- Structures the prompt for better AI performance
- Emphasizes important instructions
- Optimizes for the specific model being used
- Improves reasoning and tool usage
How It Works
Without Prompt Optimization, your instructions are sent to the model exactly as written. With it enabled, they are restructured and tuned for the selected model.
Benefits
- Better Instruction Following
- Improved Tool Usage
- Better Reasoning
- Model-Specific Optimization
When to Enable
Enable (recommended) for:
- Complex agent instructions
- Agents with multiple tools
- Production deployments
- When quality is critical
- Most use cases
Why enable it:
- ✅ Better agent performance
- ✅ More consistent results
- ✅ Improved tool usage
- ✅ No manual prompt engineering
- ✅ Model-specific tuning
Disable only if you need your prompts passed to the model exactly as written.
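To make "structures the prompt" concrete, here is a purely illustrative sketch of the kind of restructuring an optimizer might perform: ordered sections for instructions, tools, and output schema, with a key instruction emphasized at the end. The real Prompt Optimization feature is internal and model-specific; every name below is hypothetical.

```python
# Illustrative only: one plausible way to structure a prompt from an agent's
# configuration. Not the platform's actual optimizer.

def structure_prompt(instructions: str, tools: dict[str, str], output_schema: str) -> str:
    sections = [
        "# Role and Instructions",
        instructions.strip(),
        "# Available Tools",
        "\n".join(f"- {name}: {desc}" for name, desc in tools.items()),
        "# Output Format",
        output_schema.strip(),
        "# Important",
        "Follow the output format exactly.",  # emphasized instruction
    ]
    return "\n\n".join(sections)

prompt = structure_prompt(
    "You are a support agent. Be concise.",
    {"search_kb": "Search the knowledge base", "create_ticket": "Open a ticket"},
    'Respond with JSON: {"answer": string}',
)
print(prompt)
```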
Maximum Tokens
The maximum number of tokens the agent can use for context. This includes system instructions, conversation history, tool descriptions, and the agent's reasoning.
What are Tokens?
Tokens are the basic units that AI models process:
- Roughly 4 characters = 1 token
- Roughly 0.75 words = 1 token
- "Hello world!" = ~3 tokens
- A short paragraph = ~50 tokens
As a rule of thumb: 50,000 tokens ≈ 37,500 words ≈ 75 pages of text.
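The rules of thumb above can be turned into a quick estimator. This is only a heuristic; real token counts vary by model, and for exact numbers you should use the model's own tokenizer.

```python
# Rough token estimation from the rules of thumb above:
# ~4 characters per token and ~0.75 words per token.

def estimate_tokens(text: str) -> int:
    """Average the character-based and word-based heuristics."""
    by_chars = len(text) / 4          # ~4 characters per token
    by_words = len(text.split()) / 0.75  # 1 token is ~0.75 words
    return round((by_chars + by_words) / 2)

print(estimate_tokens("Hello world!"))  # → 3, matching the example above
```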
What Counts Toward the Limit
All of these count toward your token limit:
System Instructions
- Simple: 200-500 tokens
- Detailed: 500-1,500 tokens
- Very detailed: 1,500-3,000 tokens
Tool Descriptions
- Simple tool: 100-300 tokens
- Complex tool: 300-800 tokens
- 5 tools ≈ 1,000-2,000 tokens
Conversation History
- Short message: 50-150 tokens
- Long message: 150-500 tokens
- 50 messages ≈ 5,000-10,000 tokens
User Prompt
- Simple question: 10-50 tokens
- Detailed request: 50-200 tokens
- Long document: 200-5,000+ tokens
Agent's Reasoning
- Simple response: 100-500 tokens
- Tool usage: 200-800 tokens per tool call
- Complex reasoning: 1,000-5,000+ tokens
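Adding up the component estimates is a useful sanity check before choosing a limit. The numbers below are illustrative midpoints of the ranges above, not measurements.

```python
# Back-of-the-envelope context budget using midpoints of the ranges above.
# All values are illustrative estimates.

budget = {
    "system_instructions": 1_000,   # detailed instructions
    "tool_descriptions": 1_500,     # ~5 tools
    "conversation_history": 7_500,  # ~50 messages
    "user_prompt": 200,             # detailed request
    "agent_reasoning": 3_000,       # a few tool calls plus reasoning
}

total = sum(budget.values())
print(f"Estimated usage: {total:,} tokens")          # 13,200
print(f"Headroom at 50K limit: {50_000 - total:,}")  # 36,800
```

A workload like this fits comfortably in the 50K default; if your estimate lands near the limit, move up a tier.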
Setting the Limit
Default: 50,000 tokens
Low (8K-16K) is best for:
- Simple, single-turn interactions
- Minimal conversation history
- Cost-sensitive applications
- Fast responses needed
Typical workloads:
- Basic classification
- Simple Q&A
- One-shot processing
- Minimal tools
Trade-offs:
- ⚠️ Limited history
- ⚠️ Few tools available
- ⚠️ Can't handle long inputs
Medium (16K-50K), High (50K-128K), and Very High (128K-200K) suit correspondingly longer inputs, more tools, and longer conversations; see the decision guide below.
Choosing the Right Limit
Decision Guide
1. How long are typical inputs?
- Short (< 500 words) → 16K-50K
- Medium (500-2,000 words) → 50K-128K
- Long (2,000+ words) → 128K+
2. How many tools does the agent use?
- None or 1-2 → 16K-50K
- 3-5 → 50K
- 6-10 → 50K-128K
- 10+ → 128K+
3. How long are conversations?
- Single turn → 16K
- 5-20 turns → 50K
- 20-50 turns → 50K-128K
- 50+ turns → 128K+
4. What's your budget?
- Cost-sensitive → Use minimum needed
- Standard → 50K
- Premium → 128K+
Message History Limit
Maximum number of previous messages to include in conversation context. Works with Smart Context to determine which messages the agent can access.
What is Message History?
The conversation history is the list of previous messages exchanged between the user and the agent.
How It Works
With a Message History Limit of 50, only the 50 most recent messages are eligible for inclusion in context; older messages are dropped.
Setting the Limit
Default: 50 messages (approximately 25 turns)
Low (10-20) is best for:
- Short interactions
- Simple Q&A
- Cost optimization
- Single-topic conversations
Typical workloads:
- 5-10 conversation turns
- Basic customer service
- Simple automation
Trade-offs:
- ⚠️ Can't reference older context
- ⚠️ Not good for complex conversations
Medium (20-50), High (50-100), and Very High (100-200) suit correspondingly longer conversations; see the decision guide below.
Choosing the Right Limit
Decision Guide
1. Typical conversation length?
- 1-3 questions → 10-20 messages
- 5-15 questions → 20-50 messages
- 15-30 questions → 50-100 messages
- 30+ questions → 100-200 messages
2. Need to reference old context?
- Rarely → Lower limit
- Sometimes → Medium limit
- Frequently → Higher limit
3. Cost sensitivity?
- Very sensitive → Lower limit
- Standard → Medium limit
- Not concerned → Higher limit
4. Smart Context enabled?
- Yes → Can use higher limits (it optimizes)
- No → Use lower limits to control costs
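The effect of the setting can be sketched in a few lines. How the platform applies the limit internally (and how it interacts with Smart Context) is an implementation detail; this only illustrates what "keep the most recent N messages" means.

```python
# Minimal sketch of a message-history cap: keep only the most recent N
# messages, preserving chronological order.

def trim_history(messages: list[dict], limit: int = 50) -> list[dict]:
    """Return the last `limit` messages in chronological order."""
    return messages[-limit:]

# A 120-message conversation, alternating user/assistant turns:
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(120)
]

trimmed = trim_history(history, limit=50)
print(len(trimmed), trimmed[0]["content"])  # → 50 message 70
```

With a limit of 50, messages 0-69 fall out of context entirely; Smart Context then selects among the survivors.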
Maximum Reasoning Steps
Limits how many times the agent can use tools or reason through a problem before providing a final response.
What are Reasoning Steps?
A reasoning step is any action the agent takes:
- Tool usage - Calling an API, searching a knowledge base, or querying a database
- Internal reasoning - Thinking through a problem step-by-step
- Decision-making - Evaluating options and choosing a path
Why Limit Reasoning Steps?
- Prevent Infinite Loops
- Control Costs
- Ensure Timely Responses
- Encourage Efficiency
Without a limit, an agent may:
- Try every tool
- Reason excessively
- Run redundant checks
With a limit, the agent is encouraged to:
- Choose the best tool first
- Reason efficiently
- Take a direct path to the answer
Setting the Limit
Default: 10 steps
Low (3-5) is best for:
- Simple tasks
- Single tool usage
- Fast responses critical
- Cost-sensitive applications
Typical workloads:
- 1-2 tool calls
- Simple reasoning
- Basic automation
Trade-offs:
- ⚠️ Can't handle complex tasks
- ⚠️ May fail on multi-step problems
Medium (5-10), High (10-20), and Very High (20-30) suit correspondingly more complex, multi-tool tasks; see the decision guide below.
Choosing the Right Limit
Decision Guide
1. Task complexity?
- Simple (1 tool) → 3-5 steps
- Moderate (2-3 tools) → 5-10 steps
- Complex (4-6 tools) → 10-20 steps
- Very complex (7+ tools) → 20-30 steps
2. How many tools available?
- 1-2 tools → 5 steps
- 3-5 tools → 10 steps
- 6-10 tools → 15 steps
- 10+ tools → 20 steps
3. Response time requirements?
- Must be fast → Lower limit
- Standard → 10 steps
- Can be slower → Higher limit
4. Cost sensitivity?
- Very sensitive → Lower limit
- Standard → 10 steps
- Not concerned → Higher limit
What Happens When Limit is Reached
When the agent hits the reasoning step limit:
- Agent stops reasoning
- Returns best answer so far
- May include a note that it couldn’t complete
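The behavior above can be sketched as a capped agent loop. The platform's actual loop is internal; `plan_next_action` below is a hypothetical stand-in for real reasoning and tool selection.

```python
# Illustrative agent loop with a reasoning-step cap. `plan_next_action` is a
# toy stand-in: a "complex" task needs 15 steps, anything else needs 3.

def plan_next_action(task: str, step: int) -> str:
    needed = 15 if "complex" in task else 3
    return "answer" if step >= needed else "use_tool"

def run_agent(task: str, max_steps: int = 10) -> str:
    steps = 0
    while steps < max_steps:
        steps += 1
        action = plan_next_action(task, steps)
        if action == "answer":
            return f"Done after {steps} step(s)."
    # Limit reached: stop, return the best answer so far, and note
    # that the task could not be completed.
    return "Partial answer: step limit reached before completion."

print(run_agent("simple lookup"))     # → Done after 3 step(s).
print(run_agent("complex analysis"))  # → Partial answer: step limit reached before completion.
```

With the default limit of 10, the "complex" task (which would need 15 steps) stops early and returns a partial answer, which is exactly the trade-off to weigh when setting the limit.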
Best Practices
Keep Smart Context enabled
- Reduces costs
- Improves focus
- Enables longer conversations
- Works automatically
Use Prompt Optimization by default
- Keep it enabled
- Let the system optimize
- Focus on clear instructions
- Don’t worry about prompt formatting
Start with default token limit (50K)
- Standard conversations
- Multiple tools
- Reasonable history
- Good balance of cost/capability
Monitor token usage
- Are agents consistently near the limit?
- Are they using only a fraction of it?
- Adjust limits based on actual usage
Limit message history for cost control
- Most use cases: 50 messages is plenty
- Simple bots: 20 messages may be enough
- Complex conversations: 100 messages if needed
Set reasoning steps based on complexity
- Simple (1-2 tools) → 5 steps
- Standard (3-5 tools) → 10 steps
- Complex (6+ tools) → 15-20 steps
Test edge cases
- Longest expected conversations
- Most complex tasks
- Edge cases with many tools
Optimize over time
- Lower unused capacity
- Increase where agents struggle
- Fine-tune for specific use cases
Troubleshooting
Agent hitting token limits
Symptoms:
- Error: "Token limit exceeded"
- Responses cut off
- Agent can't complete tasks
Solutions:
- Increase the Maximum Tokens limit
- Reduce the Message History Limit
- Simplify agent instructions
- Remove unnecessary tools
- Enable Smart Context (if not already)
- Use a model with a larger context window (Claude, GPT-4 Turbo)
Agent forgetting previous context
Symptoms:
- Repeating questions
- Not remembering earlier conversation
- Losing track of context
Solutions:
- Increase the Message History Limit
- Check that Smart Context is enabled
- Verify the conversation is actually multi-turn
- Ensure messages are being saved correctly
Agent hitting reasoning step limit
Symptoms:
- Incomplete answers
- "I couldn't complete the analysis"
- Tasks not finished
Solutions:
- Increase Maximum Reasoning Steps
- Simplify the task
- Reduce the number of tools (remove unused ones)
- Break complex tasks into multiple agents
- Check that the agent isn't stuck in loops
Responses are slow
Common causes:
- High token limits
- Many reasoning steps
- Large message history
- Complex tools
Solutions:
- Reduce token limits (if not using full capacity)
- Lower the reasoning step limit
- Reduce message history
- Use a faster model (e.g., GPT-3.5 instead of GPT-4)
- Optimize tool descriptions
Costs higher than expected
Check whether:
- Token limits are set too high
- Message history is too long
- Reasoning step limits are too high
- Smart Context is disabled
- An expensive model is in use
Solutions:
- Right-size token limits to actual usage
- Lower message history to the minimum needed
- Reduce reasoning steps if not all are used
- Enable Smart Context
- Verify you're using the default model (Claude Haiku 4.5) unless a more capable model is needed
- Monitor per-agent costs