Why most AI writing can’t get its facts straight
The Business Standard
SUNDAY, MAY 22, 2022

Thoughts

Leonid Bershidsky, Bloomberg
05 May, 2021, 11:55 am
Last modified: 05 May, 2021, 12:06 pm


The brute force approach to artificial intelligence writing still works better on fiction than on factual, data-based reporting

Leonid Bershidsky. Illustration: TBS

It's been almost a year since OpenAI, the San Francisco lab co-founded by Elon Musk, released Generative Pre-trained Transformer 3, the language model that can produce astoundingly coherent text with minimal human prompting — enough time to draw some conclusions on whether its brute-force approach to artificial intelligence can in time allow most writing to be delegated to machines. In my current job at Bloomberg News Automation, I'm in the business of such delegation, and I have my doubts that the trail blazed by GPT-3 leads in the right direction.

In these past months, lots of people have tested GPT-3, often with surprising results like these fake Neil Gaiman and Terry Pratchett stories or these "Dr Seuss" poems about Elon Musk — or these perfectly readable newspaper columns, clearly published by editors both in awe of the new technology and relieved that AI wouldn't be taking away their jobs any time soon.

It's taken me a while to figure out what all these GPT-3 products resemble, and now I know: A monologue from the classic play by Nikolai Gogol, "The Inspector General." The central character, a complete nonentity named Ivan Khlestakov, arrives in a provincial town and is taken by its elite for a high-ranking government inspector about to conduct a secret investigation into their shady affairs. Khlestakov, fired up by the red carpet treatment, the free-flowing champagne and the attentions of the town's eligible ladies, lets loose a self-aggrandizing tirade (here in Arthur Sykes' translation):

On one occasion I took charge of a Department. It was a funny story: the Director went off somewhere— nobody knew where. So, naturally, people began to ask how was his place to be taken? Who was to fill it? Any number of generals coveted the post and tried it, but they soon gave the thing up—too difficult for 'em! It looked easy enough, but, on closer inspection, it proved a devil of a business! There was nothing to be done, but come to me. In a twinkling the streets were choke-full of couriers, couriers after couriers. Just picture to yourselves thirty-five thousand couriers! How's that for a situation, I ask you?

Nobody could say Khlestakov's boast was incoherent or grammatically incorrect; it was just buffoonishly untrue in every detail. I was reminded of this when testing the Bloomberg version of GPT-3, trained on a body of text produced by our thousands of writers. Fed the first sentence of this column by my Opinion colleague Anjani Trivedi, dealing with the semiconductor shortage, the model continued the narrative by citing a report by "Mark A Rosenbaum, director of research and the president of Computer Strategies Consulting and author of The Chip Shortage"; to the best of my knowledge, a firm called Computer Strategies does exist, but both the person and the book are fictitious. The key to Rosenbaum's report, according to GPT-3, "is something called the 'R&D capitalization method,' which he says allows you to 'make an apples to apples look like apples to oranges comparison.'" The model went on:

While the details for the numbers are a little fuzzy, Rosenbaum has made several calculations, based on those numbers, that are all fairly conservative and suggest that the chip shortage is real and very real. What I want to take a moment to say here is that if you own shares of anyone you care about, you should read this analysis and be sure to discuss it with your clients. Because while Intel, Apple and Qualcomm are probably going to take a beating, they may end up having a good week.

The text generator's political commentary is no less colorful and imaginative. Given the first line of this column by Zev Chafets about Palestinian Authority President Mahmoud Abbas, it proceeded to report the appointment of Hillary Clinton as the Palestinian Secretary of State "under a presidential nomination system that had been ruled unconstitutional more than a decade ago." Abbas, GPT-3 added, "threatened to call Clinton's father, former president Jimmy Carter, 'one of those thugs who put the Jews in prison.'"

The AI model imbibed billions and billions of lines of text to mature as an artificial Khlestakov. Its capacity for invention — or let's be tech optimists and call it imagination — appears to exceed that of many humans; the Abbas-Clinton-Carter connection is certainly beyond my modest imaginative powers. That's why GPT-3 can be good at literary parody, a genre that requires a well-developed sense of the absurd. Nothing can develop that quality better than an inordinate amount of chaotic reading, which is the method used to train models such as GPT-3. 

Get me rewrite! Photo: Bloomberg/ Getty images

What the most spectacular GPT-3 products prove is that pure literary creativity, especially the derivative kind, is fungible. Surprisingly, the flight of fancy is the easiest part of writing to hand over to a machine; just train it on more obscure style and content examples than the work of Gaiman or Dr Seuss, and few people will wince at its poetry published in literary journals or its paperback fantasy or science fiction — as long as these contributions are carefully edited for traces of bias that "stochastic parrots" like GPT-3 can inherit from the data used to train them.

I could even imagine some heir to GPT-3 being used by news organizations or, say, Substack writers to produce opinion columns. A lot of these — though none written by my Bloomberg Opinion colleagues — are relatively predictable: You more or less know in advance what a specific writer will say on any issue. So if a language model is trained on a specific columnist's body of work, you might get a well-honed engine that can opine on anything in that writer's voice given just the first line. Again, the output would need an edit to avoid reputation-killing errors. But if a columnist gets something wrong, hey, in the end it's just an opinion and everybody's got one. The ritual column, which readers scan to be stroked or triggered and the columnist writes to put in their obligatory two cents, is a clear use case.

Paradoxically, it's the most technical, formulaic stories — those dealing with market signals, deal announcements, statistical releases — that a GPT-3-like engine can't be trusted to handle, because no matter how often we repeat that it's a text engine, not a knowledge one, text is always only a means to an end. It always delivers a message, imparts knowledge, even if it's only trying to create coherent sentences based on a statistical model. In news automation, voice and style — which a well-trained model is demonstrably able to imitate — are not needed, but it's important to rule out invention, minimize interpretation and stick to the data from which the story is built. People, and sometimes robots, make trading decisions based on these stories, and an error in a potentially market-moving story can be costly. We can't use a "stochastic parrot," an AI Khlestakov — or, to be more generous, a fount of derivative creativity — to produce this kind of text. As GPT-3's developers from OpenAI have pointed out,

In the long term, as machine learning systems become more capable it will likely become increasingly difficult to ensure that they are behaving safely: the mistakes they make might be more difficult to spot, and the consequences will be more severe. 

To minimize the potential for errors, the OpenAI team showed that excellent results can be achieved when the model is trained with human feedback: Human labelers rate the outputs to tell the models which ones are acceptable and which are not. The example used in the OpenAI paper was summarizing Reddit posts, but theoretically, it could be applied to factual, data-based stories, too. Yet the amount of human labor necessary to train the model so it never strays from the facts and draws safe and relevant conclusions from them is much greater than the amount of work it takes to write a simple program that would produce the text based on a set of rules. Brute-forcing the task also requires considerable computing resources and consumes a fair amount of energy. Replacing the human labor of coders writing simple story scripts with the human labor of labelers plus the necessary processing power may not be worth it.
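To make the comparison concrete, here is a minimal sketch of the kind of "simple program that would produce the text based on a set of rules" described above. The field names, thresholds, and phrasing are illustrative assumptions, not Bloomberg's actual system; the point is that every word in the output is derived directly from the input data, so invention is ruled out by construction.

```python
# A toy rule-based story generator: every sentence is built
# deterministically from structured input data, so the output
# can never "hallucinate" a fact the data does not contain.
# All field names and wording are illustrative, not any real newsroom's.

def earnings_story(data: dict) -> str:
    """Build a short earnings lede strictly from the supplied figures."""
    direction = "rose" if data["eps"] > data["eps_prior"] else "fell"
    change_pct = abs(data["eps"] - data["eps_prior"]) / data["eps_prior"] * 100
    verdict = "beating" if data["eps"] > data["eps_estimate"] else "missing"
    return (
        f"{data['company']} reported earnings of ${data['eps']:.2f} per share "
        f"for {data['period']}, {verdict} the average analyst estimate of "
        f"${data['eps_estimate']:.2f}. Earnings {direction} {change_pct:.0f}% "
        f"from a year earlier."
    )

print(earnings_story({
    "company": "Example Corp",
    "period": "the first quarter",
    "eps": 1.25,
    "eps_prior": 1.00,
    "eps_estimate": 1.10,
}))
# → Example Corp reported earnings of $1.25 per share for the first quarter,
#   beating the average analyst estimate of $1.10. Earnings rose 25% from a
#   year earlier.
```

A script like this is cheap to write and audit, which is the trade-off the paragraph above weighs against the labeling effort and computing power a model-based approach demands.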

If AI is the future of writing, I certainly hope it's not the kind of AI that needs to burn the equivalent of a coal mine as it ingests hundreds of gigabytes of data and then uses dozens of exhausted workers on minimum wage to label outputs to complete its training. Gogol's play ends as a real government investigator arrives and Khlestakov's moment of glory ends abruptly amid stunned silence; it's unlikely he needs much training never to accept free drinks in a similar situation again. Humans are, in general, flexible and capable of learning from their mistakes; they can be held responsible for their errors, and those who like writing can sometimes produce truly original work — something today's AI is unable, and not even really trying, to do. And humans who earn their living by writing aren't begging to be replaced.

Let's accept that even the discussion of a text-generating AI as a competitor to humans is an astounding development. The progress made in this area in recent years is undeniable. But whether humans can be outcompeted when it comes to writing remains an open question. At least with existing techniques, an AI victory in this race is unlikely.


Leonid Bershidsky is a member of the Bloomberg News Automation team based in Berlin. He was previously Bloomberg Opinion's Europe columnist. He recently authored a Russian translation of George Orwell's "1984."


Disclaimer: This article first appeared on Bloomberg, and is published by special syndication arrangement

