Context
Context is the working memory of a conversation. Everything the model knows when generating a response — your message, the conversation history, tool results — lives here.
What Is Context?
Think of context as a desk. Everything the AI model needs to think about sits on that desk: your question, the conversation so far, any instructions, tool definitions, and results from tool calls. The model reads everything on the desk before generating its response.
The bigger the desk, the more the model can work with. But every desk has an edge — that’s the context window, measured in tokens. A token is a small chunk of text, roughly three-quarters of an English word.
| What’s on the desk | Where it comes from |
|---|---|
| System prompt | Instructions set by the tool creator (called “AI Instructions” in Gumloop) |
| User message | What you type in the chat |
| Conversation history | All previous messages in this thread |
| Tool definitions | Names, descriptions, and parameters of available tools |
| Tool results | Data returned from tool calls |
| Attachments | Images, documents, or other files you’ve shared |
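The pieces in the table above end up as one ordered list the model reads top to bottom. Here is a minimal sketch, assuming an OpenAI-style chat message format; the function name and field names are illustrative, not Gumloop’s API:

```python
def build_context(system_prompt, history, user_message, tool_results):
    """Assemble everything 'on the desk' into one ordered message list."""
    messages = [{"role": "system", "content": system_prompt}]  # read first
    messages.extend(history)                  # prior turns in this thread
    for result in tool_results:               # data returned from tool calls
        messages.append({"role": "tool", "content": result})
    messages.append({"role": "user", "content": user_message})
    return messages

context = build_context(
    system_prompt="You are a helpful support agent.",
    history=[{"role": "user", "content": "Hi"},
             {"role": "assistant", "content": "Hello! How can I help?"}],
    user_message="What did the CRM return?",
    tool_results=["Contact: Ada Lovelace <ada@example.com>"],
)
```

Everything in that list counts against the same token budget, which is why each row of the table competes for space.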
System and User Prompts
Two types of text shape every AI response:
System prompt (AI Instructions) — set by the person who built the agent. It defines the agent’s identity, workflow, and boundaries. You never see it in the chat, but the model reads it first, before anything else.
User prompt — this is what you actually type. It sits alongside the system prompt in the context. The model uses both to decide what to do next.
Modern models can handle more than text. Images, PDFs, and even audio can be part of the context. When you upload a screenshot to an AI, it becomes tokens on the desk alongside your message.
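An uploaded image typically enters the context as a content block inside a message, next to your text. This sketch follows the common chat-API block pattern; the exact field names vary by provider and are illustrative here:

```python
import base64

def message_with_image(text, image_bytes, media_type="image/png"):
    """Wrap text plus an image into a single multimodal user message."""
    encoded = base64.b64encode(image_bytes).decode("ascii")  # images travel as base64
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image", "media_type": media_type, "data": encoded},
        ],
    }

msg = message_with_image("What does this error mean?", b"\x89PNG...")
```

Once encoded, the image is tokenized like everything else and counts against the same window.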
Including Context from Tools
Here’s where context gets powerful. Without tools, the model can only work with what it was trained on and what you tell it. With tools, the model can pull in real data and add it to the desk.
For example:
- MCP tools — an agent calls your CRM and the contact details come back into the context. Now the model can reference actual data when composing its response.
- Web search — the model searches the web, and the results land in the context. Now it has information beyond its training cutoff.
- File reading — the model reads a document you’ve shared, and its contents become part of the working memory.
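The loop behind all three examples is the same: the model requests a tool, the runtime executes it, and the result lands back in the context before the model runs again. A hedged sketch, where `call_model` and the tool registry are illustrative stand-ins rather than a real SDK:

```python
def run_agent_turn(call_model, tools, messages):
    """Loop until the model answers without requesting a tool."""
    while True:
        reply = call_model(messages)
        if reply.get("tool_call") is None:
            return reply["text"]              # final answer, no more tools needed
        name = reply["tool_call"]["name"]
        args = reply["tool_call"]["args"]
        result = tools[name](**args)          # e.g. CRM lookup, web search
        # The result goes onto the desk, so the next model call can see it
        messages.append({"role": "tool", "content": str(result)})
```

Note that every tool result stays in the context for the rest of the turn, which is also why heavy tool use fills the window quickly.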
This is why tools can reduce hallucination: the model doesn’t need to guess when it has real data on the desk.
Context Limits
Every model has a maximum context window — the total number of tokens it can handle at once. Current models range from around 8,000 tokens (small) to over 1,000,000 tokens (very large).
When you hit the limit, things start to break:
- Older messages get dropped — the system starts removing earlier parts of the conversation to make room for new content
- Quality degrades — even before hitting hard limits, models tend to “forget” details buried deep in long contexts
- Tool definitions eat space — connecting many tools means their descriptions take up context that could be used for conversation
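The first failure mode above — dropping older messages — is often a simple trimming pass. Here is a minimal sketch; the characters-divided-by-four estimate is a rough heuristic for English text, not a real tokenizer, and the function names are illustrative:

```python
def estimate_tokens(message):
    """Rough English-text heuristic: about 4 characters per token."""
    return max(1, len(message["content"]) // 4)

def trim_to_fit(messages, max_tokens):
    """Keep the system prompt and the newest messages within the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m) for m in system)
    kept = []
    for m in reversed(rest):                  # walk newest-first
        cost = estimate_tokens(m)
        if cost > budget:
            break                             # oldest messages get dropped here
        kept.append(m)
        budget -= cost
    return system + list(reversed(kept))
```

Real systems use the model’s actual tokenizer and smarter strategies (such as summarizing old turns), but the shape is the same: something has to leave the desk to make room.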
Smart context management matters. This is why Gumloop’s MCP scripting approach is valuable — instead of passing all data through the model’s context, workflows process data independently and only pass the results back. More on this in the MCP Deep Dive.
Shorter instructions, fewer unnecessary tools, and fresh conversations for new topics — these simple habits keep your context clean and your model sharp.
