What is a model? Why are they large? Who is Claude?
At the core of most AI tools you’re using, whether that’s ChatGPT, Claude, or Gumloop, is a model. This is what processes the text, image, or audio you send in and generates a response.
How these models work is actually pretty simple to understand, but let’s be clear: they’re extremely hard to build, and the reality is vastly more complicated than what I’m about to explain.
Large language models are fundamentally next-word predictors. You write something like “Who is the first president of the United States?” and the model takes that, maps it against all the vast data it’s analyzed, and tries to predict the right next word, word by word. Every time it picks a word, it looks at your prompt plus what it’s written so far, and then predicts the next word. It does that until it’s answered your question.
This is how chatbots are built. You prompt them and they go word by word responding to you.
How do they go from one answer to a conversation? It’s actually just more of the same. You feed the whole conversation, every message you’ve sent so far, back to the model, and it predicts the next word. When you start a new conversation, the model starts fresh. It has no memory of what came before.
Now in Gumloop, you can pick from most of the models out there. How are they different? Well, models sit on a spectrum, with intelligence and speed inversely related. The more capable the model, the slower you should expect a response, but with fewer mistakes. And the further up the curve, the more you should expect to pay.
Anthropic, the creator of Claude models, has three options. Opus sits on one end, thinks deeply, responds slowly, like your grandpa. Then there’s Sonnet in the middle, your capable coworker. And finally Haiku, your eager teenager ready for a quick answer. OpenAI and Google’s Gemini models have similar offerings, each along this axis.
So what should you pick in Gumloop? Well, that depends on your task, but what I always recommend is to start with an advanced model and then move down to simpler ones. As long as you’re still happy with the results, keep stepping down until you find that perfect balance: quality you’re satisfied with, from the least capable model that delivers it.
So now we understand what models are. They’re next-word predictors. We can go back and forth with them. And chatbots are simply large language models that keep the conversation going. But how do we give AI access to our day-to-day tools so these large language models can actually do things for us, so we don’t just receive text? That’s in the next lesson.
What is an AI Model?
How large language models work under the hood — next-word prediction, conversation context, the intelligence vs. speed tradeoff, and how to choose the right model.
Underneath every AI tool (ChatGPT, Claude, Gumloop) is a model. This is the engine that processes the text, image, or audio you send and generates a response.
How Large Language Models (LLMs) Actually Work
AI models predict the next word in a sequence based on the previous words. They’re called “large” because they’re trained on massive datasets (billions of web pages, books, and articles) and contain billions or even trillions of parameters that help them understand language patterns.
The prediction process works like this:
- A user provides input: “Who was the first president of the United States?”
- The model maps the input against its training data and predicts the most likely next word
- After selecting a word, the model considers both the original prompt and its generated text to predict the next word
- This repeats until a complete answer is formed
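The steps above can be sketched as a loop. This is a toy illustration, not a real model: `predict_next_word` and the hard-coded `FAKE_MODEL` lookup are hypothetical stand-ins, where a real LLM would score every word in its vocabulary based on patterns learned in training.

```python
# Toy sketch of next-word prediction. FAKE_MODEL maps the words
# generated so far to the next word; a real model would compute
# this from the prompt plus billions of learned parameters.
FAKE_MODEL = {
    (): "George",
    ("George",): "Washington",
    ("George", "Washington"): "<end>",
}

def predict_next_word(prompt, generated):
    # A real model conditions on the prompt AND everything generated
    # so far; this toy version only looks at the generated words.
    return FAKE_MODEL[tuple(generated)]

def generate(prompt):
    generated = []
    while True:
        word = predict_next_word(prompt, generated)
        if word == "<end>":  # the model decides the answer is complete
            break
        generated.append(word)
    return " ".join(generated)

print(generate("Who was the first president of the United States?"))
# "George Washington"
```

The important part is the loop: every iteration re-reads the prompt plus everything written so far, then appends exactly one word.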
From Single Response to Conversation
When you send a follow-up message, the chatbot feeds the entire conversation (every message exchanged so far) back to the model. The model then predicts the next word based on all that context.
When you start a new conversation, the model starts completely fresh. It has no memory of previous chats. Each conversation is independent: the model only knows what’s in the current thread.
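Conceptually, a chatbot is a thin wrapper that keeps a message list and re-sends it in full on every turn. A minimal sketch, where `model_reply` is a hypothetical stand-in for the actual model call:

```python
def model_reply(messages):
    # Stand-in for a real model call: a real LLM would predict a
    # reply word by word from the full message list it receives.
    return f"(reply based on {len(messages)} messages of context)"

class Conversation:
    def __init__(self):
        # A new conversation starts with an empty history: the model
        # has no memory of any previous chats.
        self.messages = []

    def send(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        reply = model_reply(self.messages)  # entire history, every turn
        self.messages.append({"role": "assistant", "content": reply})
        return reply

chat = Conversation()
chat.send("Who was the first president of the United States?")
chat.send("When was he born?")  # the model sees both questions
```

The second question only makes sense because the first exchange is re-sent along with it; create a second `Conversation` and that context is gone.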
The Intelligence vs. Speed Tradeoff
Models sit on a spectrum where intelligence and speed are inversely related:
- More capable models: slower responses, fewer mistakes, higher cost
- Faster models: quicker responses, more potential for errors, lower cost
Each major provider (Anthropic, OpenAI, Google) offers models across this spectrum:
| Use case | Anthropic | OpenAI | Google |
|---|---|---|---|
| Complex reasoning | Claude Opus | GPT-5.2 | Gemini 3.0 |
| Most business tasks | Claude Sonnet | GPT-5 | Gemini 2.5 Pro |
| Simple, high volume | Claude Haiku | GPT-4.1 Mini | Gemini 2.5 Flash |
How to Choose the Right Model
Start with an advanced model. Begin with a more capable model (like Claude Sonnet or GPT-5) to establish a quality baseline. Test your workflow and evaluate the results.
If the results are good, move down. Try a faster, cheaper model and test again. Keep iterating until you find the fastest, most affordable model that still delivers the quality you need.
It’s much easier to identify when a simpler model is “good enough” than to debug why your automation is producing mediocre results.
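This “start advanced, then step down” process can be sketched as a loop over models ordered from most to least capable. Everything here is hypothetical: `quality_of` stands in for running your workflow and judging the output, and the fake scores exist only to make the sketch runnable.

```python
# Hypothetical sketch of "start advanced, then optimize".
# Models ordered from most to least capable (matching the table above).
MODELS = ["claude-opus", "claude-sonnet", "claude-haiku"]
FAKE_QUALITY = {"claude-opus": 0.95, "claude-sonnet": 0.92, "claude-haiku": 0.70}

def quality_of(model, task):
    # Replace with your own evaluation: run the workflow on `model`
    # and score the output against your baseline.
    return FAKE_QUALITY[model]

def cheapest_good_model(task, bar=0.9):
    chosen = MODELS[0]  # establish a baseline with the strongest model
    for model in MODELS:  # move down the capability spectrum
        if quality_of(model, task) >= bar:
            chosen = model  # still good enough: keep the cheaper model
        else:
            break  # quality dropped below the bar: stop stepping down
    return chosen

print(cheapest_good_model("summarize support tickets"))
# "claude-sonnet"
```

The design choice worth noting: the loop stops at the first model that fails your quality bar, so you end up with the least capable model that still passes, not the cheapest model overall.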
Key Takeaways
- Models are next-word predictors: they generate responses by predicting one word at a time based on patterns in their training data
- Chatbots are LLMs with context: they maintain conversations by feeding the entire chat history back to the model
- No persistent memory: each new conversation starts fresh
- Intelligence vs. speed tradeoff: more capable models are slower and costlier, faster models may make more mistakes
- Start advanced, then optimize: begin with a capable model and work your way down