Context

Context is the working memory of a conversation. Everything the model knows when generating a response — your message, the conversation history, tool results — lives here.

What Is Context?

Think of context as a desk. Everything the AI model needs to think about sits on that desk: your question, the conversation so far, any instructions, tool definitions, and results from tool calls. The model reads everything on the desk before generating its response.

The bigger the desk, the more the model can work with. But every desk has an edge — that’s the context window, measured in tokens.

What’s on the desk — and where it comes from:

  • System prompt — instructions set by the tool creator (called “AI Instructions” in Gumloop)
  • User message — what you type in the chat
  • Conversation history — all previous messages in this thread
  • Tool definitions — names, descriptions, and parameters of available tools
  • Tool results — data returned from tool calls
  • Attachments — images, documents, or other files you’ve shared
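Putting the pieces together, the desk can be sketched as an ordered list of messages, in the style of common chat APIs. The role/content field names here are illustrative conventions, not any specific provider’s schema:

```python
# Sketch: assembling the "desk" into one ordered list the model reads
# top to bottom. Message shapes are illustrative, not a specific API.

system_prompt = "You are a support agent. Answer using the CRM data provided."

conversation_history = [
    {"role": "user", "content": "What's the status of order #1042?"},
    {"role": "assistant", "content": "Let me look that up."},
]

# Data returned from a tool call, now part of the working memory.
tool_result = {"role": "tool", "content": '{"order": 1042, "status": "shipped"}'}

user_message = {"role": "user", "content": "Has it arrived yet?"}

# Order matters: system prompt first, then history, tool results,
# and finally the newest user message.
context = (
    [{"role": "system", "content": system_prompt}]
    + conversation_history
    + [tool_result, user_message]
)
```

Every token in `context` counts against the window, which is why each item on the desk has a cost.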

System and User Prompts

Two types of text shape every AI response:

System prompt — set by the person who built the agent. It defines the agent’s identity, workflow, and boundaries. You never see it, but the model reads it first, before anything else. In Gumloop, this is called “AI Instructions.”

User prompt — this is what you actually type. It sits alongside the system prompt in the context. The model uses both to decide what to do next.

Multimodal context

Modern models can handle more than text. Images, PDFs, and even audio can be part of the context. When you upload a screenshot to an AI, it becomes tokens on the desk alongside your message.
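As a sketch, an uploaded screenshot might enter the context as an encoded content part alongside your text. The part structure below loosely follows common chat-API conventions; the exact field names vary by provider:

```python
import base64

# Sketch: an image becoming tokens "on the desk" next to a text part.
# Field names ("type", "data", "media_type") are illustrative only.

def image_part(image_bytes: bytes, media_type: str = "image/png") -> dict:
    # Binary image data is typically base64-encoded before being sent.
    return {
        "type": "image",
        "media_type": media_type,
        "data": base64.b64encode(image_bytes).decode("ascii"),
    }

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What does this error dialog say?"},
        image_part(b"\x89PNG\r\n"),  # placeholder bytes; a real upload goes here
    ],
}
```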

Including Context from Tools

Here’s where context gets powerful. Without tools, the model can only work with what it was trained on and what you tell it. With tools, the model can pull in real data and add it to the desk.

For example:

  • MCP tools — an agent calls your CRM and the contact details come back into the context. Now the model can reference actual data when composing its response.
  • Web search — the model searches the web, and the results land in the context. Now it has information beyond its training cutoff.
  • File reading — the model reads a document you’ve shared, and its contents become part of the working memory.

This is why tools reduce hallucination: the model doesn’t need to guess when it has real data on the desk.
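That round trip can be sketched as follows, with a stand-in `lookup_contact` function in place of a real CRM call (the message shapes are illustrative, not any specific API):

```python
import json

# Sketch: the model requests a tool call, the host runs it, and the
# raw result is appended to the context before the model answers.

def lookup_contact(email: str) -> dict:
    # Stand-in for a real CRM query.
    return {"email": email, "name": "Dana Reyes", "plan": "Enterprise"}

context = [
    {"role": "system", "content": "Answer using CRM data, not guesses."},
    {"role": "user", "content": "Which plan is dana@example.com on?"},
]

# Suppose the model responded with a tool-call request:
tool_call = {"name": "lookup_contact", "arguments": {"email": "dana@example.com"}}

# The host executes it and puts the result on the desk.
result = lookup_contact(**tool_call["arguments"])
context.append({"role": "tool", "content": json.dumps(result)})

# The model's next turn now has real data to cite instead of guessing.
```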

Context Limits

Every model has a maximum context window — the total number of tokens it can handle at once. Current models range from around 8,000 tokens (small) to over 1,000,000 tokens (very large).
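One way to make those numbers concrete is a quick budget check before sending a request. The 4-characters-per-token ratio below is only a rough heuristic for English text; real tokenizers vary by model:

```python
# Sketch: estimating whether a set of messages fits a context window.
# The characters/4 estimate is a crude rule of thumb, not a tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits(messages: list[dict], window: int = 8_000) -> bool:
    # Sum the estimated cost of every message on the desk.
    total = sum(estimate_tokens(m["content"]) for m in messages)
    return total <= window
```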

When you hit the limit, things start to break:

  • Older messages get dropped — the system starts removing earlier parts of the conversation to make room for new content
  • Quality degrades — even before hitting hard limits, models tend to “forget” details buried deep in long contexts
  • Tool definitions eat space — connecting many tools means their descriptions take up context that could be used for conversation
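The first failure mode above — dropping older messages — can be sketched as a simple eviction loop that always preserves the system prompt. The message shapes and the characters-per-token estimate are illustrative:

```python
# Sketch: evict the oldest non-system messages until the context fits.

def trim_context(messages: list[dict], budget: int) -> list[dict]:
    def cost(m: dict) -> int:
        # Crude estimate: ~4 characters per token.
        return max(1, len(m["content"]) // 4)

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # Drop from the front (oldest) until everything fits the budget.
    while rest and sum(map(cost, system + rest)) > budget:
        rest.pop(0)
    return system + rest
```

Real systems often use smarter strategies (summarizing old turns rather than deleting them), but the principle is the same: something has to leave the desk.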

Smart context management matters. This is why Gumloop’s MCP scripting approach is valuable — instead of passing all data through the model’s context, workflows process data independently and only pass the results back. More on this in the MCP Deep Dive.

Keep it lean

Shorter instructions, fewer unnecessary tools, and fresh conversations for new topics — these simple habits keep your context clean and your model sharp.

Quiz: Context

Why does giving an agent access to tools reduce hallucination?

Answer: Tools don’t make the model smarter — they give it real data to work with. When actual CRM records or database results are in the context, the model references those instead of generating plausible guesses.