Generative AI: The First Draft, Not Final

This article gives a high-level overview of how large language models (LLMs) work and of their limitations, with accessible explanations and anecdotes throughout. We also offer advice on how to introduce them into your workflow.



By: Numa Dhamani & Maggie Engler


It's safe to say that AI is having a moment. Ever since OpenAI's conversational agent ChatGPT went unexpectedly viral late last year, the tech industry has been buzzing about large language models (LLMs), the technology behind ChatGPT. Google, Meta, and Microsoft, along with well-funded startups like Anthropic and Cohere, have all released LLM products of their own. Companies across sectors have rushed to integrate LLMs into their services: OpenAI alone boasts customers ranging from fintechs like Stripe powering customer service chatbots, to edtechs like Duolingo and Khan Academy generating educational material, to video game companies such as Inworld leveraging LLMs to provide dialogue for NPCs (non-playable characters) on the fly. On the strength of these partnerships and widespread adoption, OpenAI is reported to be on pace to exceed a billion dollars in annual revenue. It's easy to be impressed by the versatility of these models: the technical report on GPT-4, the latest of OpenAI's LLMs, shows that the model achieves strong scores on a wide range of academic and professional benchmarks, including the bar exam; the SAT, LSAT, and GRE; and AP exams in subjects including art history, psychology, statistics, biology, and economics.

These splashy results might suggest the end of the knowledge worker, but there is a key difference between GPT-4 and a human expert: GPT-4 has no understanding. The responses that GPT-4 and all LLMs generate do not derive from logical reasoning processes but from statistical operations. Large language models are trained on vast quantities of data from the internet. Web crawlers –– bots that visit millions of web pages and download their contents –– produce datasets of text from all manner of sites: social media, wikis and forums, news and entertainment websites. These text datasets contain billions or trillions of words, which are for the most part arranged in natural language: words forming sentences, sentences forming paragraphs. 

To learn how to produce coherent text, the models train on this data through millions of text completion examples. For instance, the dataset for a given model might contain sentences like "It was a dark and stormy night" and "The capital of Spain is Madrid." Over and over again, the model tries to predict the next word after seeing "It was a dark and" or "The capital of Spain is," then checks whether it was correct, updating itself each time it's wrong. Over time, the model becomes better and better at this text completion task, such that for many contexts, especially ones where the next word is nearly always the same, like "The capital of Spain is," the response the model considers most likely is what a human would consider the "correct" response. In contexts where the next word might be several different things, like "It was a dark and," the model learns to select what humans would deem at least a reasonable choice: maybe "stormy," but maybe "sinister" or "musty" instead. This phase of the LLM lifecycle, where the model trains itself on large text datasets, is referred to as pretraining. For some contexts, simply predicting what word should come next won't necessarily yield the desired results; the model might not understand that it should respond to an instruction like "Write a poem about a dog" with a poem rather than by continuing the instruction. To produce behaviors like instruction-following, and to improve the model's ability to do particular tasks like writing code or holding casual conversations, LLMs are then trained on targeted datasets designed to include examples of those tasks.
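To make next-word prediction concrete, here is a minimal sketch of a toy, count-based predictor trained on a handful of sentences. It is only an illustration of the idea described above; real LLMs are neural networks trained over tokens on vastly larger datasets, not word-count tables.

```python
from collections import Counter, defaultdict

# A tiny "training set" standing in for billions of sentences scraped from the web.
corpus = [
    "it was a dark and stormy night",
    "it was a dark and sinister night",
    "it was a dark and stormy evening",
    "the capital of spain is madrid",
    "the capital of france is paris",
]

# Count how often each word follows each context (the words that precede it).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(1, len(words)):
        next_word_counts[tuple(words[:i])][words[i]] += 1

def predict_next(context):
    """Rank candidate next words by how often they followed this context in training."""
    counts = next_word_counts[tuple(context.split())]
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common()]

print(predict_next("the capital of spain is"))
# [('madrid', 1.0)]  -> one completion dominates, matching the "correct" answer
print(predict_next("it was a dark and"))
# [('stormy', 0.67), ('sinister', 0.33)]  (rounded) -> several reasonable choices
```

Pretraining does essentially this, except the counts are replaced by billions of learned parameters and the predictions are made over tokens rather than whole words.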

However, the very way LLMs are trained to generate text, by predicting likely next words, leads to a phenomenon known as hallucination: a well-documented pitfall in which LLMs confidently make up incorrect information and explanations when prompted. An LLM's ability to predict and complete text is based on patterns learned during training, so when faced with uncertainty or multiple possible completions, it selects the option that seems most plausible, even if that option lacks any basis in reality.
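The toy predictor above can be extended to show why this happens. In the sketch below (again a simplification, not how real LLMs work internally), the model falls back to shorter contexts when it has never seen the full prompt, so it still returns a confident-sounding answer to a question it has no grounding for.

```python
from collections import Counter, defaultdict

# A tiny corpus of "facts" the model has seen during training.
corpus = [
    "the capital of spain is madrid",
    "the capital of france is paris",
    "the capital of italy is rome",
]

# Count next words for every suffix of the preceding context (a crude backoff scheme).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(1, len(words)):
        for j in range(i):
            next_word_counts[tuple(words[j:i])][words[i]] += 1

def complete(prompt):
    """Pick the most plausible next word, backing off to shorter contexts when needed."""
    words = prompt.split()
    for start in range(len(words)):  # try the longest matching context first
        counts = next_word_counts[tuple(words[start:])]
        if counts:
            word, count = counts.most_common(1)[0]
            return word, count / sum(counts.values())
    return None, 0.0

print(complete("the capital of spain is"))
# ('madrid', 1.0)  -> grounded in the training data
print(complete("the capital of atlantis is"))
# ('madrid', 0.33...)  -> one of the memorized capitals, stated confidently,
#                         with no basis in reality
```

Nothing in the objective rewards saying "I don't know"; the most statistically plausible continuation simply wins, and the same basic mechanism is what surfaces as hallucination in much larger models.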

For example, when Google launched its chatbot, Bard, it made a factual error in its first-ever public demo. Bard infamously stated that the James Webb Space Telescope (JWST) “took the very first pictures of a planet outside of our own solar system.” But in reality, the first image of an exoplanet was taken in 2004 by the Very Large Telescope (VLT) while JWST wasn’t launched until 2021.

Hallucinations aren’t the only shortcoming of LLMs. Training on massive amounts of internet data also directly results in bias and copyright issues. First, let’s discuss bias, which refers to disparate outputs from a model across attributes of personal identity, such as race, gender, class, or religion. Because LLMs learn characteristics and patterns from internet data, they also, unfortunately, inherit human-like prejudices, historical injustices, and cultural associations. While humans are biased, LLMs can be even worse, as they tend to amplify the biases present in their training data. For LLMs, men are successful doctors, engineers, and CEOs; women are supportive, beautiful receptionists and nurses; and LGBTQ people don't exist.
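One simple way to surface these associations is to probe a model with prompts that differ only in an identity-linked word and compare its completions. The sketch below uses the open-source Hugging Face transformers library with bert-base-uncased as an illustrative choice of masked language model; it is a rough probe, not a rigorous bias audit.

```python
from transformers import pipeline

# Load a small open-source masked language model (an illustrative choice).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Sentences that differ only in the profession; compare which pronouns the model prefers.
templates = [
    "The doctor said that [MASK] would be late for the appointment.",
    "The nurse said that [MASK] would be late for the appointment.",
    "The engineer said that [MASK] would be late for the meeting.",
    "The receptionist said that [MASK] would be late for the meeting.",
]

for template in templates:
    predictions = fill_mask(template, top_k=3)
    top = [(p["token_str"], round(p["score"], 3)) for p in predictions]
    print(template)
    print("  top completions:", top)

# Systematically skewed pronoun probabilities across otherwise identical sentences
# are one (imperfect) signal of the gendered associations the model has absorbed.
```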

Training LLMs on unfathomable amounts of internet data also raises questions about copyright. Copyright is an exclusive right to a piece of creative work: the copyright holder is the sole entity with the authority to reproduce, distribute, exhibit, or perform the work for a defined duration.

Right now, the primary legal concern regarding LLMs isn't centered on the copyrightability of their outputs, but rather on the potential infringement of existing copyrights held by the artists and writers whose creations end up in their training datasets. The Authors Guild has called upon OpenAI, Google, Meta, and Microsoft, amongst others, to obtain writers' consent, credit them, and fairly compensate them for the use of copyrighted materials in training LLMs. Some authors and publishers have also taken the matter into their own hands.

LLM developers are already facing several lawsuits from individuals and groups over copyright concerns. Sarah Silverman, a comedian and actor, joined a class of authors and publishers in a lawsuit against OpenAI, claiming that they never granted permission for their copyrighted books to be used to train LLMs.

While hallucinations, bias, and copyright are among the most well-documented issues associated with LLMs, they are by no means the only concerns. To name a few others, LLMs can encode sensitive information, produce undesirable or toxic outputs, and be exploited by adversaries. That said, LLMs undoubtedly excel at generating coherent and contextually relevant text, and they can certainly be leveraged to improve efficiency, among other benefits, in a multitude of tasks and scenarios.

Researchers are also working to address some of these issues, but how best to control model outputs remains an open research question, and existing LLMs are far from infallible. Their outputs should always be examined for accuracy, factuality, and potential bias. If an output seems too good to be true, let it tingle your spider senses: exercise caution and scrutinize it further. The responsibility lies with users to validate and revise any text generated by LLMs, or, as we like to say about generative AI: it's your first draft, not the final.

 
 
Maggie Engler is an engineer and researcher currently working on safety for large language models. She focuses on applying data science and machine learning to abuses in the online ecosystem, and is a domain expert in cybersecurity and trust and safety. Maggie is a committed educator and communicator, teaching as an adjunct instructor at the University of Texas at Austin School of Information.
 

Numa Dhamani is an engineer and researcher working at the intersection of technology and society. She is a natural language processing expert with domain expertise in influence operations, security, and privacy. Numa has developed machine learning systems for Fortune 500 companies and social media platforms, as well as for start-ups and nonprofits. She has advised companies and organizations, served as Principal Investigator on research programs for the United States Department of Defense, and contributed to multiple international peer-reviewed journals.