Instead of trying to break down a problem into a set of logical instructions for a computer to follow, we can give it a ton of data and let it figure out how to solve the problem itself.

This SciShow video is supported by Linode! You can get a $100 60-day credit on a new Linode account at linode.com/scishow.

ChatGPT can write sonnets about dinosaurs on skateboards and handle basic addition, but when asked to do that math, it can sometimes get it wrong. It’s strange that computers can make mistakes on grade school math, but there’s a reason for it.

Modern computers have special components called arithmetic logic units (ALUs) which do all the number-crunching. The basic building block of an ALU is a kind of electronic circuit called a logic gate. Logic gates receive a set of input values and apply rigid, logical operations to produce an output.

By combining logic gates, computers can do math. For example, you can create a circuit called an adder that can add two binary numbers together. But these rigid, logical operations make it hard to use computers for more complex tasks.
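To make that concrete, here’s a minimal sketch in Python of an adder built from logic-gate operations, with bitwise XOR, AND, and OR standing in for the physical gates (the function names are just for illustration):

```python
# A half adder built from logic gates: XOR gives the sum bit, AND gives the carry.
def half_adder(a, b):
    return a ^ b, a & b  # (sum, carry)

# A full adder chains two half adders so it can also accept a carry-in bit.
def full_adder(a, b, carry_in):
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2

# Ripple-carry adder: add two binary numbers bit by bit,
# passing the carry along, just like pencil-and-paper addition.
def add_binary(x_bits, y_bits):
    result, carry = [], 0
    for a, b in zip(x_bits, y_bits):  # bits listed least-significant first
        s, carry = full_adder(a, b, carry)
        result.append(s)
    result.append(carry)
    return result

# 6 (0110) + 7 (0111) = 13 (01101), least-significant bit first:
print(add_binary([0, 1, 1, 0], [1, 1, 1, 0]))  # [1, 0, 1, 1, 0]
```

Every step here is rigid and deterministic, which is exactly why an ALU never gets addition wrong.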

In the last decade, a different approach has become popular for tackling these tasks. Instead of trying to break down a problem into instructions for a computer to follow, we can give it a ton of data and let it figure out how to solve the problem itself. A neural network is an algorithm that connects thousands or even millions of mathematical components called neurons, each a bit like a logic gate. But unlike logic gates, a neural network’s inputs and outputs can be any number the computer’s hardware can represent, not just one or zero. That makes it easier to build reliable solutions to problems that require more nuance. Neural networks are also trained, so they can learn what output to produce for a given input.
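A single artificial neuron can be sketched in a few lines of Python. The weights and inputs below are made-up numbers purely for illustration; in a real network, training adjusts the weights:

```python
import math

# One artificial neuron: a weighted sum of the inputs plus a bias,
# squashed through an activation function (here, the sigmoid).
def neuron(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid: any real input -> output in (0, 1)

# Unlike a logic gate, inputs and outputs aren't limited to 0 and 1.
print(neuron([0.5, -1.2, 3.0], [0.8, 0.1, -0.4], 0.2))
```

Wiring millions of these together, and nudging the weights during training, is what lets a network handle fuzzy, nuanced inputs that would be painful to express as rigid logic.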

ChatGPT is a large language model (LLM) that was trained on huge bodies of writing from the internet, like Wikipedia, to take a piece of text as input and produce a piece of text in response. It can craft long sentences that make far more sense than typical auto-complete suggestions, and it has been trained with human-assisted feedback to specifically curate outputs that its trainers consider “high quality”. That has blown open the door to how we interact with computers: we can now make requests in natural language, including asking the computer to do math.

However, ChatGPT sometimes gives wrong answers to even simple questions, like adding two large numbers. That’s unlike a cheap, plastic, solar-powered calculator, which will never make that mistake as long as the numbers fit on its screen. Many people have noticed that ChatGPT often fails at reasonably straightforward math when larger numbers are involved, but there’s no fully transparent study of why, because researchers don’t have direct access to the model. One likely explanation is that the model is trained to basically regurgitate a collage of words that closely resembles the patterns in its training data, which includes examples of adding numbers, as well as the broader structures of how people talk about numbers and the functions we perform with them.

A recent preprint by Chinese researchers found that the latest model of ChatGPT could accurately add and subtract numbers under one trillion about 99% of the time. However, the accuracy drops when it comes to multiplication, with the model only managing to get the right answer about two thirds of the time. This suggests the model isn’t performing perfect, logic-gate-style math with unfaltering accuracy, the way an ALU does.

Interestingly, people have been able to coax ChatGPT into becoming more accurate by prompting it to explain its logic step by step, including when adding up those large numbers. So with some prodding, ChatGPT has the potential to be reliable, but ultimately its answers can’t be guaranteed to be accurate the way an old-school ALU’s can.
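The kind of step-by-step reasoning that helps is essentially the long-addition procedure from grade school. Here’s a minimal Python sketch of that procedure; to be clear, this is an illustration of the digit-by-digit steps, not how the model actually works internally:

```python
# Long addition, one digit at a time with carries --
# the same procedure careful prompting encourages the model to spell out.
def long_addition(a: int, b: int) -> int:
    x, y = str(a)[::-1], str(b)[::-1]  # reverse so we start from the ones digit
    digits, carry = [], 0
    for i in range(max(len(x), len(y))):
        dx = int(x[i]) if i < len(x) else 0
        dy = int(y[i]) if i < len(y) else 0
        carry, d = divmod(dx + dy + carry, 10)  # split into carry and digit
        digits.append(str(d))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))

print(long_addition(123456789, 987654321))  # 1111111110
```

A program like this follows the rules every time; a language model merely imitates them, which is why its accuracy can waver.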

The best way to use ChatGPT might be to take its suggestions as a creative starting point, or even to combine it with more reliable code that can produce precise outputs. If you’re looking to crunch numbers rather than draft emails, though, you might be better off with the calculator. By the time you’re watching this video, ChatGPT may have learned to solve these math problems correctly. But you may still be able to find a problem it can’t solve yet.

Thanks for watching this SciShow video, which is supported by Linode! Linode is a cloud computing company from Akamai that provides storage space, databases, analytics, and more. User reviews rate Linode above average in ease of use and setup, as well as quality of support. In addition, Linode has won awards for their customer support, and you can talk to a real person at any time of day, any day of the year.

If you’re interested in trying out Linode, you can click the link in the description or go to linode.com/scishow for a $100 60-day credit on a new Linode account. Thanks to Linode for supporting this SciShow video!