Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series (and the successor to GPT-2) created by OpenAI, a San Francisco-based artificial intelligence research laboratory.
GPT-3’s full version has a capacity of 175 billion machine learning parameters. Before its release, the largest language model was Microsoft’s Turing NLG, introduced several months earlier with a capacity of 17 billion parameters, less than a tenth of GPT-3’s. GPT-3, which was introduced in 2020, is part of a trend in natural language processing (NLP) toward systems built on pre-trained language representations.
The quality of the text generated by GPT-3 is so high that it is difficult to distinguish from that written by a human, which has both benefits and risks. Thirty-one OpenAI researchers and engineers presented the original paper introducing GPT-3. In their paper, they warned of GPT-3’s potential dangers and called for research to mitigate risk.
Microsoft announced on September 22, 2020, that it had licensed ‘exclusive’ use of GPT-3; others can still use the public API to receive output, but only Microsoft has access to GPT-3’s underlying code.
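For readers curious what using the public API to receive output looks like in practice, the sketch below sends a prompt to OpenAI’s completions endpoint over HTTP. The endpoint path, model name, and parameters are assumptions based on OpenAI’s publicly documented API and may differ from current offerings; the API key is read from a placeholder environment variable.

```python
# Minimal sketch: querying GPT-3-style models through OpenAI's public HTTP API.
# Assumes an API key in the OPENAI_API_KEY environment variable; the model name
# and sampling parameters are illustrative placeholders.
import os
import requests

def complete(prompt: str, max_tokens: int = 64) -> str:
    response = requests.post(
        "https://api.openai.com/v1/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "davinci",        # a GPT-3 base model name (assumed)
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": 0.7,
        },
        timeout=30,
    )
    response.raise_for_status()
    # The API returns generated text in the first element of "choices".
    return response.json()["choices"][0]["text"]

if __name__ == "__main__":
    print(complete("Explain what a language model is in one sentence:"))
```

Note that this is exactly the split the licensing arrangement describes: third parties receive generated output over the API, while the model weights and code remain inaccessible to them.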
In the original paper, the authors described how natural language processing (NLP) was improved in GPT-n through a process of ‘generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.’ This eliminated the need for human supervision and for time-intensive hand-labeling.
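As a rough illustration of the two-stage recipe quoted above, the sketch below pre-trains a toy Transformer with a next-token prediction objective on unlabeled token sequences, then fine-tunes it with a task-specific classification head on labeled examples. The model, data, and hyperparameters are tiny placeholders for illustration only and bear no relation to GPT-3’s actual architecture or training setup.

```python
# Minimal sketch (PyTorch) of the two-stage recipe described above:
#   stage 1: generative pre-training = next-token prediction on unlabeled text
#   stage 2: discriminative fine-tuning = supervised training on a labeled task
# All sizes and data below are toy placeholders, not GPT-3's.
import torch
import torch.nn as nn

VOCAB, DIM, NUM_CLASSES = 1000, 64, 2

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(DIM, VOCAB)          # used during pre-training
        self.cls_head = nn.Linear(DIM, NUM_CLASSES)   # used during fine-tuning

    def forward(self, tokens, causal_mask=None):
        return self.encoder(self.embed(tokens), mask=causal_mask)

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stage 1: predict each next token from the preceding context (no labels needed).
unlabeled = torch.randint(0, VOCAB, (8, 32))           # placeholder token ids
inputs, targets = unlabeled[:, :-1], unlabeled[:, 1:]
mask = torch.triu(torch.ones(inputs.size(1), inputs.size(1), dtype=torch.bool), 1)
hidden = model(inputs, causal_mask=mask)
lm_loss = loss_fn(model.lm_head(hidden).reshape(-1, VOCAB), targets.reshape(-1))
lm_loss.backward()
opt.step()
opt.zero_grad()

# Stage 2: reuse the pre-trained encoder; fit a small head on a labeled task.
task_tokens = torch.randint(0, VOCAB, (8, 32))         # placeholder task inputs
task_labels = torch.randint(0, NUM_CLASSES, (8,))      # placeholder task labels
logits = model.cls_head(model(task_tokens).mean(dim=1))
cls_loss = loss_fn(logits, task_labels)
cls_loss.backward()
opt.step()
opt.zero_grad()
```

The point of the split is that the expensive stage, learning general language patterns, needs only raw text, while the labeled data is only required for the much smaller fine-tuning stage.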
GPT-3’s training data contains occasional toxic language, and the model sometimes generates toxic language as a result of mimicking it. A study from the University of Washington found that GPT-3 produced toxic language at a level comparable to the similar natural language processing models GPT-2 and CTRL. GPT-3 produced less toxic language than its predecessor model, GPT-1, although compared to CTRL Wiki, a language model trained entirely on Wikipedia data, it produced both more toxic generations and generations of higher toxicity.
GPT-3’s builder, OpenAI, was initially founded as a non-profit in 2015. In 2019, OpenAI broke from its previous open-source practices and did not publicly release GPT-2, GPT-3’s precursor model, citing concerns that the model would perpetuate fake news; it eventually released a version of GPT-2 that was 8% of the original model’s size. In the same year, OpenAI restructured to be a for-profit company. In 2020, Microsoft announced that it had exclusive licensing of GPT-3 for Microsoft’s products and services following a multi-billion dollar investment in OpenAI.
Large language models, such as GPT-3, have come under criticism from Google’s AI ethics researchers for the environmental impact of training and storing the models, detailed in a paper co-authored by Timnit Gebru and Emily M. Bender in 2021.