This is a very brief summary of the impressively deep essay “What Is ChatGPT Doing … and Why Does It Work?” by Stephen Wolfram.
How does ChatGPT make artificially generated text look human-written? The author sets out to explain how this tool works and what is going on inside the AI. The essence of his explanation applies to other LLMs (large language models) as well.
The most basic mechanism behind ChatGPT is the search for a “reasonable continuation”, or “what one might expect someone to write after seeing what people have written on billions of webpages, etc.” Put simply, the AI has scanned billions of examples of human-written text and takes into account which words usually follow the ones before. It produces a ranked list of candidate next words, each with a “probability”. However, to generate more creative text, the AI does not always choose the highest-ranked word.
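A minimal sketch of this sampling idea, with an invented toy word list and probabilities (a real model computes such numbers over tens of thousands of tokens). The essay calls the knob controlling this randomness “temperature”:

```python
import random

# Hypothetical next-word probabilities after some prompt
# (invented numbers purely for illustration).
candidates = {"mat": 0.40, "floor": 0.25, "sofa": 0.20, "keyboard": 0.10, "moon": 0.05}

def sample_next_word(probs, temperature=0.8):
    """Sample a next word: low temperature favors the top-ranked word,
    higher temperature makes lower-ranked, more 'creative' picks likelier."""
    words = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    total = sum(weights)
    return random.choices(words, [w / total for w in weights])[0]

print(sample_next_word(candidates))
```

With temperature near zero this always returns “mat”; raising it lets the lower-ranked words through occasionally, which is exactly why the AI does not simply pick the top word every time.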
As the author states, there are about 40,000 commonly used words in English. By analysing a large body of text, we can calculate how often each of them appears. But picking words by those individual frequencies alone produces gibberish; to create a more meaningful sentence, the same calculation has to be done for pairs of words (and, ideally, for ever longer sequences).
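A sketch of estimating such pair (bigram) probabilities from a tiny invented corpus (a real estimate would use billions of words):

```python
from collections import Counter, defaultdict

# A tiny invented corpus standing in for "a large body of text".
corpus = "the cat sat on the mat and the cat slept on the sofa".split()

# Count how often each word follows each other word.
pair_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    pair_counts[prev][nxt] += 1

# P(next | prev) = count(prev, next) / count(prev, anything)
def next_word_probs(prev):
    total = sum(pair_counts[prev].values())
    return {w: c / total for w, c in pair_counts[prev].items()}

print(next_word_probs("the"))  # {'cat': 0.5, 'mat': 0.25, 'sofa': 0.25}
```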
Neural nets are simple idealizations of how brains seem to work. Consider, for example, a neural net trained to identify the object in a picture: it picks out certain features of the object and matches them against what it has learned from its training examples. The neural net of ChatGPT likewise just corresponds to a mathematical function, one with billions of terms. The size of the neural net and its ability to learn help with generation, but both have their limits.
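As a sketch of what “a mathematical function with billions of terms” means here: each artificial neuron just computes a weighted sum followed by a simple nonlinearity, and a net is layers of such neurons. The weights below are invented for illustration; in a real net they come from training:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: tanh(w . x + b)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(total)

# A tiny two-layer "net": just more of the same function, composed.
def tiny_net(x):
    h1 = neuron(x, [0.5, -1.2], 0.1)    # invented weights
    h2 = neuron(x, [0.9, 0.3], -0.4)
    return neuron([h1, h2], [1.1, -0.7], 0.2)

print(tiny_net([0.3, 0.8]))
```

ChatGPT's function has the same basic shape, just with billions of such weighted terms instead of a handful.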
ChatGPT assigns a number to every word in its dictionary. But there is an important idea, one that is central to ChatGPT, that goes beyond that: the idea of “embeddings”. Embeddings are a way to represent the essence of something by an array of numbers, with the property that nearby things are represented by nearby numbers. The idea is to look at large amounts of text and see how similar the environments are in which different words appear, and how the words connect to each other. For example, some words are nearly interchangeable, like “alligator” and “crocodile”, while others are rarely used together.
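A sketch of the “nearby things get nearby numbers” property, using invented 3-dimensional vectors (real embeddings have hundreds of dimensions, and their values are learned, not hand-written):

```python
import math

# Invented toy embeddings purely for illustration.
embeddings = {
    "alligator": [0.90, 0.80, 0.10],
    "crocodile": [0.88, 0.79, 0.15],
    "banana":    [0.10, 0.20, 0.90],
}

def cosine_similarity(a, b):
    """Near 1.0 means the vectors point the same way; near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity(embeddings["alligator"], embeddings["crocodile"]))  # high
print(cosine_similarity(embeddings["alligator"], embeddings["banana"]))     # much lower
```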
Inside, ChatGPT is a giant neural net, currently a version of the so-called GPT-3 network with 175 billion weights. Those weights are the result of very large-scale training on a huge corpus of human-written text: the web, books, etc. The net is particularly set up for dealing with language, and its overall goal is to continue text in a reasonable way, based on what it has seen during training.
It operates in three basic stages (a code sketch follows the list):
1. It takes the sequence of tokens corresponding to the text so far and finds an embedding, an array of numbers, that represents them.
2. It operates on this embedding “in a standard neural net way”, with values rippling through successive layers of the network, to produce a new embedding.
3. From the last part of that array it generates roughly 50,000 values that become probabilities for each possible next token.
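A highly simplified sketch of those three stages, assuming tiny placeholder matrices where the real network has billions of trained weights (every name and size here is hypothetical):

```python
import numpy as np

def next_token_probs(tokens, embed_matrix, layers, output_matrix):
    """Toy stand-in for the three stages; not the real architecture."""
    # Stage 1: turn the token sequence into an embedding (array of numbers).
    x = embed_matrix[tokens].mean(axis=0)      # crude stand-in for a real embedding
    # Stage 2: ripple the values through successive layers.
    for w in layers:
        x = np.tanh(w @ x)
    # Stage 3: one value per vocabulary token, turned into probabilities.
    logits = output_matrix @ x
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

rng = np.random.default_rng(0)
vocab, dim = 50, 8                             # tiny stand-ins for 50,000 tokens
probs = next_token_probs(
    tokens=[3, 17, 42],
    embed_matrix=rng.normal(size=(vocab, dim)),
    layers=[rng.normal(size=(dim, dim)) for _ in range(3)],
    output_matrix=rng.normal(size=(vocab, dim)),
)
print(probs.argmax())  # index of the most likely "next token"
```

The real network's layers are far more elaborate (attention, in particular), but the overall flow of embedding in, layers in between, probabilities out is the same.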
ChatGPT successfully captures the essence of human language and the thinking behind it. In its training, it has somehow implicitly discovered whatever regularities in language make this possible. Its success gives us evidence that there are major new laws of language out there to discover.