A 5-Minute History of AI
AI is all the rage nowadays, and it appears to be taking over every aspect of our lives. Even though it seems like AI boomed only recently, it has actually been around for 74 years, so let’s rewind.
Back in 1950, this super-smart guy named Alan Turing, who was like the Einstein of computer science, published a paper called “Computing Machinery and Intelligence”. In this paper he came up with a wild idea called the Turing Test. It’s basically a game of “Guess Who?” with a twist. Here’s how it goes: imagine you’re chatting with two parties through text, but plot twist: one of them is actually a computer pretending to be human. Your job is to figure out which one is the real deal and which one is the computer in disguise. If you can’t tell the difference between the human and the computer, then the computer has passed the Turing Test and proven it can talk just like a person.
Since Turing’s time, AI has evolved beyond our imagination, boosted by powerful GPUs, advances in software engineering, and our perpetual, narcissistic desire to replicate the human brain.
So if AI has been around for a while, why does it feel like it boomed in the headlines only a year ago? Part of the boom is due to our profit-loving corporations and VC-funded talent shows, where hype begets hype. But the other part is due to a fascinating discovery by a few brilliant scientists.
In 2017, a team of computer science researchers led by Ashish Vaswani introduced a game-changing neural network architecture called the Transformer in their breakthrough research paper “Attention Is All You Need.” The Transformer created a new way to train AI models by “paying attention” to some words more than others, in return giving us coherent, human-like text.
To put this discovery into perspective let’s compare AI before and after this attention mechanism. Imagine you’re chatting with an AI chatbot before the Transformer was introduced. You ask, “What’s the weather like in New York City today? I want to plan my outfit for the trip.” The AI might respond with something like, “The weather in New York City is sunny. An outfit is clothing worn by a person.” The response is disjointed, and the AI fails to understand the context of your question.
Now, let’s see how an AI powered by the Transformer would handle the same conversation:
You: “What’s the weather like in New York City today? I want to plan my outfit for the trip.”
Transformer AI: “The weather in New York City today is mostly sunny with a high of 75°F (24°C). Since you’re planning your outfit for the trip, I suggest packing light, comfortable clothing suitable for warm weather, such as shorts, t-shirts, and sundresses. Don’t forget to bring a light jacket or sweater for cooler evenings and comfortable walking shoes for exploring the city!”
The Transformer completely changed the game in the Natural Language Processing arena, allowing humans to talk to computers easily and, as Andrej Karpathy put it, making English the hottest new programming language.
So how does this transformer actually “pay attention”? Technically, the transformer model processes sequences of tokens — such as words or subwords — by assigning importance to each token’s relationships with others through self-attention mechanisms.
To get an intuition for how it works, imagine you’re at a busy conference where everyone is talking at once. It’s a bit overwhelming at first, like trying to make sense of a jumbled mess. But as you walk around, you start to focus on individual conversations: you tune out the background noise and pay attention to specific words and phrases. That is roughly how the attention mechanism in Vaswani’s architecture works. It helps the model focus on the important words and understand what’s being said, just like you do when following a conversation in a noisy room. This breakthrough attention mechanism put the “T” (Transformer) into “ChatGPT” and allowed us to speak to a computer in plain English instead of Python code. This caused the big boom in AI, with unlimited possibilities, many of which we are still discovering every day.
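For the curious, the “assigning importance to each token’s relationships with others” idea can be sketched in a few lines of code. This is a minimal, simplified illustration of scaled dot-product self-attention using NumPy, with made-up toy embeddings; a real Transformer adds learned query/key/value projections, multiple heads, and many stacked layers:

```python
import numpy as np

def self_attention(X):
    """Minimal scaled dot-product self-attention.

    X: (seq_len, d) array, one embedding row per token.
    Returns: (seq_len, d) array where each output row is a weighted
    mix of all input rows -- the weights are the "attention".
    """
    d = X.shape[-1]
    # For simplicity we use X itself as queries, keys, and values;
    # a real Transformer applies learned linear projections first.
    scores = X @ X.T / np.sqrt(d)  # how strongly each token relates to every other
    # Softmax over each row so the attention weights sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X  # each output token is a blend of the tokens it attends to

# Three toy 4-dimensional token embeddings (hypothetical values).
tokens = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],  # similar to the first token, so they attend to each other
    [0.0, 0.0, 1.0, 0.0],
])
out = self_attention(tokens)
print(out.shape)  # (3, 4)
```

The key move is the `scores` matrix: every token scores its relationship with every other token at once, which is what lets the model connect “outfit” back to “weather” no matter how far apart they sit in the sentence.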
Your 3 key takeaways are:
1) AI has been around since the 1950s.
2) AI blew up because it learned to talk to us in plain English, via a Transformer.
3) The key to unlocking AI’s full potential lies in the art of prompting — providing clear, specific instructions that guide the AI to deliver the desired output.
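To make takeaway 3 concrete, here is a small sketch contrasting a vague prompt with a clear, specific one. Both prompts are hypothetical examples, not from any particular product:

```python
# A vague prompt leaves the AI guessing about what you actually want.
vague_prompt = "Tell me about the weather."

# A clear, specific prompt states the location, the goal, and the
# shape of the answer you expect.
specific_prompt = (
    "What's the weather like in New York City today? "
    "I want to plan my outfit for the trip, so suggest clothing "
    "that fits the forecast, including an option for cooler evenings."
)

print(len(specific_prompt) > len(vague_prompt))  # True
```

The specific version is longer, but every extra word narrows the space of possible answers, which is exactly what guides the AI toward a useful response.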