OpenAI's 'dangerous' AI text generator is out: People find GPT-2's words 'convincing'
November 4, 2019
OpenAI, the research organization co-founded by Elon Musk in 2015 – he's no longer part of it – has released the biggest and final version of the GPT-2 text-generating language model, which it has admitted could be dangerous in the wrong hands.
However, it says the newly released full model's output is only slightly more convincing to humans than the previous version's.
The organization released the first portion of the model in February as part of a staged process, beginning with just 124 million parameters. It held back the full 1.5-billion-parameter model because its researchers believed it was too dangerous and could be exploited by malicious actors, such as terrorists and state-sponsored hackers.
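To give a sense of what the staged releases actually produce, here is a minimal sketch of sampling text from the full model. It assumes the Hugging Face transformers library, which hosts the released 1.5-billion-parameter weights under the checkpoint name "gpt2-xl" – the library and checkpoint name are our assumptions, as the article doesn't specify a distribution channel:

from transformers import pipeline

# Load the full 1.5-billion-parameter GPT-2 checkpoint, hosted by the
# transformers library as "gpt2-xl" (name assumed, not from the article).
generator = pipeline("text-generation", model="gpt2-xl")

# Sample a continuation of a short news-style prompt.
prompt = "A team of researchers announced on Monday that"
result = generator(prompt, max_new_tokens=40, do_sample=True, top_k=50)

print(result[0]["generated_text"])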
SEE: How to implement AI and machine learning (ZDNet special report) | Download the report as a PDF (TechRepublic)
Among the malicious purposes for which OpenAI admitted GPT-2 might be used are generating misleading news articles, impersonating others online, automating the production of abusive or fake content for social media, and automating the creation of spam and phishing content.
The model was the leading text-generating AI until Google researchers developed BERT, or Bidirectional Encoder Representations from Transformers, which Google is now using to better interpret what people are searching for.
According to OpenAI, humans find output from the 1.5-billion-parameter GPT-2 model "convincing", but only marginally more so than output from the 774-million-parameter model it released in August. Cornell University, working with OpenAI, surveyed people to gauge how credible they found the model's output.
"These results make us more inclined to release the 1.5-billion model, as the incremental increase in human-perceived credibility relative to 774 million seems low," OpenAI noted.
However, OpenAI says GPT-2 can be "fine-tuned" for misuse. The Middlebury Institute of International Studies' Center on Terrorism, Extremism, and Counterterrorism (CTEC) reckons GPT-2 can be tweaked to generate synthetic propaganda to support white supremacy, Marxism, jihadist Islamism, and anarchism.
In a paper accompanying the 1.5-billion-parameter GPT-2 release, OpenAI says it believes advanced persistent threat (APT) groups "are most likely to have the resources and motivation to misuse GPT-2", but admits it lacks the resources to counter such threats on its own – hence its partnerships with CTEC and Cornell University.
SEE: Sleight of hand: OpenAI's trick to make its Rubik's robot hand work
However, OpenAI believes the danger from lower-tier threats, such as financially motivated cybercriminals, is not as immediate as it previously thought. It has also seen no evidence of GPT-2 being misused so far.
OpenAI has also developed a detection model that identifies text generated by the 1.5-billion-parameter GPT-2 model with about 95% accuracy. While that rate is high, it argues detection systems should be augmented by metadata-based approaches and human judgment to be more effective.
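As an illustration of what such detection looks like in practice, here is a minimal sketch using the RoBERTa-based GPT-2 output detector OpenAI released alongside the model. It assumes the Hugging Face transformers library and the publicly hosted "roberta-base-openai-detector" checkpoint – the library and checkpoint name are our assumptions, not details given in the article:

from transformers import pipeline

# Load OpenAI's RoBERTa-based GPT-2 output detector (checkpoint name assumed).
detector = pipeline("text-classification", model="roberta-base-openai-detector")

# The classifier labels a passage as "Real" (human-written) or "Fake"
# (machine-generated), along with a confidence score.
passage = "Scientists have discovered a herd of unicorns living in a remote valley."
result = detector(passage)[0]

print(f"{result['label']} ({result['score']:.1%})")

Note that a classifier like this is exactly the kind of purely statistical system OpenAI says should be paired with metadata and human review rather than trusted on its own.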