
AI Explainer - Part 2: Deeper Dive

Eric Thanenthiran·12 March 2026·8 min read

This is Part 2 of our Explainer series to demystify some of the jargon used in the AI space. This edition shines a light on slightly more technical concepts used when talking about the creation and operation of LLMs. It builds on the concepts and definitions of Part 1, which you can find here.

AI Technical Glossary

Training

The process of teaching an AI model by providing it with large amounts of data so it learns patterns and relationships. Training a Large Language Model (LLM) is done by feeding it a significant chunk of the internet: billions of documents, books, articles, code repositories and more. Through this process, the model gradually adjusts itself until it gets good at predicting and generating language.

Training is extraordinarily expensive. Running the computational resources (large computer servers in data centres) required to train a Frontier Model (like the models behind ChatGPT) costs hundreds of millions of dollars and takes months. It also requires specialist infrastructure and teams that most organisations simply don't have. This is why training is almost exclusively done by the major providers, and why most businesses build on top of models that already exist rather than starting from scratch.

The key thing to understand is that training is a one-off process. Once a model is trained, that knowledge is baked in. It doesn't keep learning from conversations after the fact unless it goes through another round of training.

Fine-tuning

Taking an existing Foundation Model and training it further on a smaller, specific dataset so it becomes better at a particular task or adopts a particular style or tone. Think of it as taking a well-educated generalist and giving them a focused apprenticeship in your industry.

A legal firm might fine-tune a model on thousands of contracts so it gets better at drafting and reviewing legal language. A retailer might fine-tune on their product catalogue and customer service history so the model handles queries and uses language and tone in a way that feels on-brand. A manufacturer might fine-tune on technical manuals so the model gives accurate, specific answers to engineering questions.

Fine-tuning is significantly cheaper than training from scratch, but it still requires clean, well-structured data and some technical expertise to do properly. It's also not always necessary. For many use cases, giving the model the right instructions and context gets you most of the way there without fine-tuning at all.

Prompts

The instructions or questions you give an AI model. The wording, structure, and detail of a prompt have a significant impact on the quality of the response; more than most people expect when they first start working with these models.

A vague prompt gets an answer that might be fully correct, or only 80% of the way there; your use case determines whether that is good enough. A well-constructed prompt that gives the model clear instructions, relevant context, and a defined format to respond in will produce something far more useful. The difference between "write me a summary of this report" and a prompt that specifies the audience, the desired length, the tone, and the key points to focus on is substantial.

In a business context, prompts are rarely just one-off questions. They are carefully crafted instructions built into products and workflows, sitting invisibly behind customer-facing tools, internal assistants, and automated processes. Getting them right is a discipline in its own right.
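As a sketch of what "carefully crafted instructions built into products" can look like, the snippet below assembles a summarisation prompt from explicit parts. The function name, placeholders, and example values are all hypothetical, purely to show the shape of a structured prompt rather than any particular product's implementation.

```python
def build_summary_prompt(report_text, audience, max_words, tone, key_points):
    """Assemble a structured summarisation prompt from explicit parts:
    audience, length, tone, and the points to focus on."""
    points = "\n".join(f"- {p}" for p in key_points)
    return (
        f"Summarise the report below for {audience}.\n"
        f"Keep it under {max_words} words, in a {tone} tone.\n"
        f"Focus on these points:\n{points}\n\n"
        f"Report:\n{report_text}"
    )

# Hypothetical usage: the same report, but with the constraints spelled out.
prompt = build_summary_prompt(
    report_text="Q3 revenue rose 12% on the back of...",
    audience="the executive team",
    max_words=150,
    tone="direct, plain-English",
    key_points=["revenue drivers", "risks", "next quarter outlook"],
)
print(prompt)
```

The point is not the code itself but the discipline: every constraint the model should honour is stated explicitly rather than left for it to guess.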

Context

The information available to an AI model during a conversation or task, including previous messages, uploaded documents, system instructions, and any other data passed to it. Models have no memory between separate conversations, so everything they need to know to give a useful response has to be in the context.

This is an important distinction from how humans work. If you start a fresh conversation with an AI assistant, it has no recollection of anything you discussed previously unless that information is explicitly included again. Within a single conversation though, it can refer back to everything that has been said and take it all into account when responding.

In practice this means that well-designed AI applications are careful about what they put into the context, making sure the model has the right information to do its job without being overwhelmed by irrelevant noise.
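One way to picture "what they put into the context" is as a single explicit list handed to the model on every call. The sketch below uses the common chat-message convention of role/content pairs; the function, company name, and selection logic are illustrative assumptions, not a real application's code.

```python
def build_context(system_instructions, documents, history, user_message):
    """Collect system rules, reference documents, prior turns in this
    conversation, and the new question into one explicit context."""
    messages = [{"role": "system", "content": system_instructions}]
    for doc in documents:  # include only the documents relevant to this task
        messages.append({"role": "system", "content": f"Reference:\n{doc}"})
    messages.extend(history)  # earlier turns from this conversation only
    messages.append({"role": "user", "content": user_message})
    return messages

# Hypothetical usage for a customer-support assistant.
ctx = build_context(
    system_instructions="You are a support assistant for Acme Ltd.",
    documents=["Returns policy: items may be returned within 30 days."],
    history=[{"role": "user", "content": "Hi"},
             {"role": "assistant", "content": "Hello! How can I help?"}],
    user_message="Can I return a jacket I bought last week?",
)
```

Anything left out of that list, such as a previous conversation, simply does not exist for the model on this call.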

Context Window

The maximum amount of text a model can process at once, usually measured in tokens. Anything outside the context window simply doesn't exist as far as the model is concerned.

Early models had quite short windows, enough for a brief conversation but not much more. Modern Frontier Models can handle the equivalent of several novels in a single context, which opens up genuinely useful applications like summarising lengthy reports, analysing large codebases, or working through extensive customer histories in one go.

Context window size matters practically when you are building AI applications. If you need a model to reason across a large document, or maintain coherence across a long workflow, you need a model with a window big enough to hold it all.
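A fit-check like the one described above can be sketched in a few lines. This uses the common rough heuristic of about four characters of English text per token; the window size, reply budget, and inputs are hypothetical, and real tokenisers will give different counts per model.

```python
def rough_token_count(text):
    """Very approximate: assumes ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_window(texts, window_tokens, reserve_for_reply=1000):
    """Check whether all inputs, plus room for the model's reply,
    fit inside the context window."""
    used = sum(rough_token_count(t) for t in texts)
    return used + reserve_for_reply <= window_tokens

# Hypothetical check against an 8,000-token window.
report = "word " * 3000  # ~15,000 characters of input
print(fits_in_window([report], window_tokens=8000))       # fits
print(fits_in_window([report] * 10, window_tokens=8000))  # does not fit
```

Applications that must handle arbitrarily long inputs typically add a step here: summarise, chunk, or retrieve only the relevant parts before calling the model.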

Inference

The act of running an AI model to generate a response. Every time you send a message to Gemini, trigger an automated AI workflow, or get a response from an AI-powered product, that's inference happening in real time.

Inference is a separate phase of a model's lifecycle from training. Training happens once and is enormously expensive. Inference happens continuously, every single time the model is used, and while it's much cheaper per interaction, at scale it adds up. For businesses building AI products, inference costs are a real operational consideration.

Inference speed also matters. A model that takes ten seconds to respond might be fine for a research task but completely unusable in a customer-facing product where people expect near-instant replies.
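The "at scale it adds up" point is easy to see with a back-of-envelope calculation. The per-token prices below are hypothetical placeholders (not any provider's real pricing); most providers do charge separately for input and output tokens, which is the structure the sketch assumes.

```python
# Assumed, illustrative prices in dollars per 1,000 tokens.
PRICE_IN_PER_1K = 0.003   # input tokens
PRICE_OUT_PER_1K = 0.015  # output tokens

def cost_per_call(input_tokens, output_tokens):
    """Cost of a single inference call under the assumed prices."""
    return (input_tokens / 1000) * PRICE_IN_PER_1K \
         + (output_tokens / 1000) * PRICE_OUT_PER_1K

# A single call is cheap; 100,000 calls a month is a real line item.
one_call = cost_per_call(input_tokens=1500, output_tokens=500)
monthly = one_call * 100_000
print(f"per call: ${one_call:.4f}, per month: ${monthly:,.2f}")
```

Fractions of a cent per call, yet the monthly figure lands in the thousands of dollars, which is why prompt length and output length get actively managed in production systems.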

Token

The basic unit AI models use to process input (it could be text, images, audio, etc.). Using text as an example, rather than reading word by word, models break text down into tokens, which are roughly equivalent to a word or part of a word. "Running" might be one token, while "unbelievable" might be split into two or three.

Tokens matter for two practical reasons. First, pricing. Most providers charge per token, both for the text you send in and the text the model generates back. Understanding token usage helps with cost planning, particularly at scale. Second, context limits. When providers say a model has a 200,000 token context window, they mean the total amount of text in and out of the model in one go, measured in tokens.

As a rough guide, 1,000 tokens is around 750 words. It's not a number you need to obsess over day to day, but it's useful to have a feel for when you're thinking about costs or the scale of what a model can handle.
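The 1,000-tokens-to-750-words rule of thumb can be turned into a tiny pair of helpers for quick sizing. This is purely illustrative; real tokenisers split text differently for each model, so treat these as estimates only.

```python
def tokens_to_words(tokens):
    """Rough guide: 1,000 tokens is around 750 words."""
    return int(tokens * 0.75)

def words_to_tokens(words):
    """The same heuristic in reverse."""
    return int(words / 0.75)

# A 200,000-token context window holds roughly 150,000 words of text.
print(tokens_to_words(200_000))
# An 80,000-word novel is roughly 106,000 tokens.
print(words_to_tokens(80_000))
```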

Hallucination

When an AI model generates something that sounds entirely plausible but is factually wrong, made up, or simply doesn't exist. It might cite a research paper that was never written, quote a statistic it invented, or confidently give an incorrect answer to a factual question, all in the same assured tone it uses when it's completely right.

Hallucinations happen because language models are fundamentally pattern-completion machines. They generate the most plausible-sounding next word or sentence based on their training, and sometimes that process produces something convincing but wrong. They don't have a built-in alarm that fires when they're uncertain. This happens more often when these models are working in an area where they have not seen much training data.

This is one of the most important things to understand when deploying AI in a business context. It doesn't mean these tools aren't useful, they clearly are. But it does mean that outputs need appropriate review, particularly in high-stakes situations involving legal, financial, medical, or factual content. Good AI system design builds in checks rather than assuming the model is always correct.

Wrap Up

This edition covered slightly more technical but important terms that are used often when describing the operation or usage of LLMs. You might not use words like inference or context window in everyday conversation, but if you are evaluating an AI product, sitting in a vendor meeting, or working with a team building something with AI, you will hear them. Having a working understanding of what they mean makes those conversations easier and helps you ask better questions.

Next time we will shift from how models work to what you can actually build with them, covering the key terms that come up when designing and deploying AI applications. That's where things start to get practical.
