What is a Large Language Model?
The basic idea behind models like GPT and Claude.
A large language model (LLM) is a neural network trained on a very large corpus of text to predict the next token in a sequence. That single, simple objective — predict what comes next — turns out to be enough to produce systems that can summarize, translate, write code, and hold a conversation.
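To make "predict the next token" concrete, here is a minimal sketch of the same objective at toy scale. It is not how GPT or Claude work internally: it swaps the neural network for a simple bigram count table, and the function names (`train_bigram`, `next_token_probs`, `generate`) and the tiny corpus are invented for illustration. The loop is the shared part: estimate a distribution over the next token, pick one, append it, and repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the raw counts for `prev` into a probability distribution."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    if total == 0:
        return {}  # this token was never seen with a successor
    return {tok: n / total for tok, n in following.items()}


def generate(counts, start, max_tokens):
    """Greedy generation: always append the most frequent next token."""
    out = [start]
    for _ in range(max_tokens):
        following = counts.get(out[-1])
        if not following:
            break  # nothing ever followed this token in the corpus
        out.append(following.most_common(1)[0][0])
    return out


# Hypothetical toy corpus, split on whitespace as a stand-in for real tokenization.
corpus = "the cat sat on the mat . the cat ate .".split()
model = train_bigram(corpus)

print(next_token_probs(model, "the"))
print(generate(model, "the", 1))
```

In this corpus "the" is followed by "cat" twice and "mat" once, so the sketch assigns "cat" a probability of 2/3 and picks it first. A real LLM conditions on a long context rather than just the previous token, and it learns its probabilities by gradient descent instead of counting.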
The word "large" is doing a lot of work. These models have billions or even trillions of parameters, and they are trained on enormous amounts of text, often trillions of tokens, much of it drawn from the public web. Scale is what separates a toy from something that feels genuinely capable.
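A rough back-of-the-envelope calculation gives a feel for what these parameter counts mean in practice. The sketch below assumes each parameter is stored as a 16-bit number (2 bytes), which is a common choice but not the only one. The helper name `weight_memory_gb` and the example sizes are illustrative, not the specifications of any particular model.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate storage for the weights alone, in gigabytes (10**9 bytes).

    Assumes 2 bytes per parameter (16-bit floats) unless told otherwise.
    """
    return num_params * bytes_per_param / 1e9


# Illustrative sizes only: 7 billion, 70 billion, and 1 trillion parameters.
for n in (7_000_000_000, 70_000_000_000, 1_000_000_000_000):
    print(f"{n:,} parameters -> about {weight_memory_gb(n):,.0f} GB of weights")
```

Under that assumption, a 7-billion-parameter model needs about 14 GB just to hold its weights, before counting any memory used while running it.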
It helps to remember what an LLM is not: it is not a database, and on its own it does not look things up. Everything it "knows" is compressed into its weights during training, which is why models can be confidently wrong about facts that were rare, inconsistent, or missing in their training data.