
OpenAI’s New Language Model Is a Polymath—Without a Mind

Abhinav Raj, Writer
@uxconnections

The GPT-3 AI can generate code like a programmer, write couplets like Keats and report news like CNN—without any thought

OpenAI released its third-generation language model in a closed beta this week, and it’s something out of a sci-fi flick. 

GPT-3, or ‘Generative Pre-trained Transformer 3’, is the quintessential time-traveller’s gimmick: it can act as a translator when required, write code in CSS, JSX or Python, and weave a story with masterful rhetoric and vivid characters.

In the domain of linguistics, GPT-3 is a jack of all trades, and justifiably so: it excels at Natural Language Processing tasks.

Natural Language Processing, or NLP, is a sub-discipline of artificial intelligence that enables computers to comprehend and interpret human language. Its focus is on quantifying human language so that it becomes intelligible to machines, allowing for meaningful understanding and analysis of text and speech.

Now, three things make GPT-3 a moon-shot development in AI.

The first: It’s the largest and most powerful language model created to date.

Its size exceeds that of its nearest competitor by an order of magnitude. With about 175 billion parameters, the model is enormous, and that scale lets it pick up specific tasks with remarkably little instruction. Given fewer than ten examples, GPT-3 can write JSX or CSS from nothing more than a plain-language description of what the code should do.
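
For a sense of what that looks like in practice, here is a minimal sketch of how a developer with beta access might request CSS from the model using the openai Python client available at the time; the engine name, prompt, parameters and placeholder key are illustrative assumptions rather than an official recipe.

```python
# A rough sketch of code generation via the GPT-3 beta API (2020-era openai client).
# The engine name, prompt and parameters are illustrative, not an official recipe.
import openai

openai.api_key = "YOUR_BETA_API_KEY"  # placeholder; real keys came with beta access

prompt = (
    "Description: a centred card with rounded corners and a soft drop shadow\n"
    "CSS:"
)

response = openai.Completion.create(
    engine="davinci",       # base GPT-3 engine exposed during the beta
    prompt=prompt,
    max_tokens=80,
    temperature=0,          # keep the output as deterministic as possible
    stop=["Description:"],  # stop before the model invents another example
)

print(response["choices"][0]["text"])
```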

The second: The GPT-3 is task-agnostic.

What does that mean? 

GPT-3 is able to perform a diverse range of tasks in the hands of different users. From generating news articles to translating languages, and even penning reimagined Romantic classics in the style of Coleridge, OpenAI’s language model is giving writers from the 18th to the 21st century a run for their money. The language model is essentially a blank canvas on which developers and machine learning practitioners can exercise their wildest imaginations.

The final: It doesn’t require fine-tuning.

While other available language models (such as BERT, or Bidirectional Encoder Representations from Transformers) demand elaborate fine-tuning for specific tasks such as spam detection, question answering and translation, which means accumulating hundreds or thousands of training examples, GPT-3 can take on those tasks with minimal or no fine-tuning at all. With a handful of examples supplied in the prompt (if not right off the bat), the model is ready to leap into the task.
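
To make the contrast with fine-tuning concrete, here is a minimal sketch of what “a handful of examples” means in practice: the demonstrations are simply written into the prompt itself, and the model infers the task from the pattern. The translation pairs and helper function below are invented purely for illustration.

```python
# Few-shot prompting: instead of fine-tuning on thousands of labelled examples,
# a handful of demonstrations are packed straight into the prompt text.
few_shot_examples = [
    ("cheese", "fromage"),
    ("good morning", "bonjour"),
    ("thank you very much", "merci beaucoup"),
]

def build_prompt(query: str) -> str:
    """Assemble an English-to-French prompt from in-context examples."""
    lines = ["Translate English to French."]
    for english, french in few_shot_examples:
        lines.append(f"English: {english}\nFrench: {french}")
    lines.append(f"English: {query}\nFrench:")
    return "\n\n".join(lines)

# The assembled text would be sent to the completion endpoint as-is;
# the model works out the task purely from the pattern of examples.
print(build_prompt("where is the library?"))
```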

However, as impressive as it may seem, we’re not quite in the age of Skynet yet.

Artificial Intelligence—Sans Intelligence

It’s easy to get overenthusiastic about something the internet claims is ‘ground-breaking’ and feels as if it has arrived ‘from the future’, especially when everyone on Twitter is boarding the hype train.

We’re all guilty of going overboard with our rhetoric, but dubbing GPT-3 a form of ‘artificial general intelligence’ is a misleading practice that obscures its real shortcomings. Such a claim would carry barely a modicum of truth and would gloss over the model’s flaws.

To substantiate: a good analogy for the language model is a sophisticated text predictor. Trained on a vast repository of data, the model simply predicts the next word, choosing the most statistically plausible continuation of whatever text it is given to work with.
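
As a loose illustration of that idea (and nothing like GPT-3’s actual architecture or scale), here is a toy bigram predictor that always emits the most frequent next word it has seen during “training”; the corpus is made up for the example.

```python
# A toy "sophisticated text predictor": a bigram model that always picks the
# statistically most likely next word. GPT-3 is vastly larger and uses a
# transformer over subword tokens, but the core objective is the same.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation seen in training."""
    candidates = next_word_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unk>"

# Greedily extend a prompt one word at a time, with no understanding involved.
text = ["the"]
for _ in range(5):
    text.append(predict_next(text[-1]))

print(" ".join(text))  # e.g. "the cat sat on the mat"
```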

The model itself is not cognisant of what its response means, nor is it capable of genuine extrapolation. Put simply, it lacks sentience, like any other language model; it relies on pattern recognition to accomplish a given task.

To quote François Chollet, Google software engineer and creator of the Keras deep learning library:

“Any problem can be treated as a pattern recognition problem if your training data covers a sufficiently dense sampling of the problem space. What’s interesting is what happens when your training data is a sparse sampling of the space — to extrapolate, you will need intelligence.” 

Perhaps AI will rely on our intelligence for a while yet.
