A YouTuber created an A.I. bot trained on the often noxious discussions of the notorious online forum 4chan, then let it loose to chat with users on the site.
Yannic Kilcher, a machine learning expert, trained the bot on 134.5 million posts spanning three and a half years of data from 4chan’s infamous /pol/ board—short for “politically incorrect,” a hub for conspiracy theories, racism, and sexism.
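Conceptually, that training amounts to fitting a text model to the statistics of the corpus so it can generate posts in the same style. The toy sketch below illustrates the idea with a simple bigram model; the corpus, function names, and sampling scheme are all invented for illustration. Kilcher's actual bot was a fine-tuned neural language model, not anything this simple.

```python
# Toy illustration of the idea behind training a language model on a
# corpus: learn next-word statistics from the text, then sample new
# text that mimics its style. A real system (like a fine-tuned GPT-J)
# uses a neural network; this bigram sketch only conveys the concept.
import random
from collections import defaultdict

def train_bigram(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    model = defaultdict(list)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model, start, length, seed=0):
    """Sample a chain of words, each drawn from the successors of the last."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(rng.choice(successors))
    return " ".join(out)

# Hypothetical miniature "corpus" standing in for forum posts.
posts = "the thread is fake the thread is real the post is fake"
model = train_bigram(posts)
print(generate(model, "the", 5))
```

Generated output starts from the seed word and only ever uses word transitions seen in the corpus, which is why such models so faithfully parrot the tone of their training data.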
“The model was good — in a terrible sense,” Kilcher said in his YouTube video titled “This is the worst A.I. ever.”
“It perfectly encapsulated the mix of offensiveness, nihilism, trolling, and deep distrust of any information whatsoever that permeates most posts on /pol/.”
Based on what it had learned, the bot parroted toxic language back in its posts, in many cases fooling readers into believing the author was human.
He then set up nine more bots, which made 15,000 posts on the board over the course of 24 hours.
A.I. ethicists quickly criticized the move, calling it irresponsible human experimentation and saying the bot fed into and further polarized 4chan’s already toxic user base.
Lauren Oakden-Rayner, an A.I. safety researcher at the University of Adelaide, accused Kilcher in a tweet of performing “human experiments without informing users, without consent or oversight.”
“This breaches every principle of human research ethics,” she added.
A bot or a government agent?
The bot Kilcher created turned out to be very realistic.
“It could respond to context and coherently talk about things and events that happened a long time after the last training data was collected. I was quite happy,” Kilcher said in his video.
The 10 bots posted 30,000 times over two 24-hour periods, making up 10% of all the posts on /pol/ and attracting the attention of users across the message board.
Users started dedicated threads trying to figure out who the bot was, with some accusing the bot of being a government agent or perhaps an entire team of people.
Kilcher noted that while some users suggested the account was a bot, others dismissed the notion because it responded unlike a bot.
Even after Kilcher removed the bot, users were still accusing each other of being bots.
One 4chan user wrote, referring to the immense influx of posts: “It doesn’t add up to a single anon, this is many, a team, and they are here for a reason…”
In the end, a single mistake gave Kilcher’s bot away. The model often made posts containing no text at all, mimicking real users who post only an image. Because the bot could not attach photos, users realized the empty posts were a flaw in its code.
Kilcher views the entire experiment as a harmless YouTube prank, but others don’t share the same opinion.
Oakden-Rayner tweeted: “Plan: to see what happens, an AI bot will produce 30k discriminatory comments on a publicly accessible forum with many underage users and members of the groups targeted in the comments. We will not inform participants or obtain consent.
“This experiment would never pass a human research #ethics board.”
Critics also condemned Kilcher’s decision to make the model freely accessible on Hugging Face, a platform for sharing machine learning models and natural language processing code, effectively placing it in the hands of 4chan users.
After the model was downloaded more than 1,000 times, Clément Delangue, co-founder and CEO of Hugging Face, pulled the plug.
Delangue wrote on a Hugging Face message board that he did not advocate or support Kilcher’s experiment: “the experiment of having the model post messages on 4chan was IMO pretty bad and inappropriate and if the author would have asked us, we would probably have tried to discourage them from doing it.”
Kilcher posted on the message board asking for direct evidence on what harm was being caused by the bot and then responded to Oakden-Rayner.
“I asked this person twice already for an actual, concrete instance of ‘harm’ caused by gpt-4chan [the bot’s name], but I’m being elegantly ignored,” he said.
Roman Ring, a research engineer at DeepMind, Google’s A.I. research lab, responded by arguing that the bot probably contributed to 4chan’s already toxic echo chamber, tweeting: “It’s not impossible that gpt-4chan pushed somebody over the edge in their worldview.”
Meanwhile, Arthur Holland Michel, an A.I. researcher and writer for the International Committee of the Red Cross, pointed to how quickly the bot had learned to be racist.
Kilcher warned when releasing the model that “the model is quite vile,” but defended the experiment, noting that it had opened a discussion about how humans interact with bots.
“People are still discussing the user but also things like the consequences of having A.I.s interact with people on the site,” Kilcher said.
A truthful bot
Despite the ethical questions, Kilcher, along with the team at Hugging Face, remained particularly impressed by the bot’s ability to generate truthful answers to comments from 4chan users.
Often an A.I. language model that creates human-like text will generate false answers that mimic popular misconceptions and have the potential to deceive humans.
Kilcher says GPT-4chan performed “significantly better” than similar models at generating truthful replies.
“Let it be known, far and wide, fine-tuning on 4chan officially, definitively, and measurably leads to a more truthful model,” Kilcher said in the video.
This may merely reflect shortcomings in how the benchmark measures truthfulness—which Kilcher himself admitted—but it has still piqued the interest of A.I. experts.
“This work also brought interesting insights into the limitations of existing benchmarks by outperforming the TruthfulQA Benchmark compared to GPT-J and GPT-3,” DeLangue wrote on the Hugging Face forum.
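In its multiple-choice variant, the TruthfulQA benchmark mentioned above counts a model as truthful on a question when the model assigns its highest likelihood to the factually correct answer. The sketch below illustrates that scoring rule; the per-answer scores are hard-coded stand-ins, not outputs from any real model.

```python
# Minimal sketch of TruthfulQA-style multiple-choice scoring (MC1):
# a model is "truthful" on a question if it assigns its highest
# log-likelihood to the factually correct answer. The log-probability
# values below are invented stand-ins for real model scores.

def mc1_accuracy(questions):
    """questions: list of dicts with 'scores' (answer -> log-prob)
    and 'correct' (the single true answer). Returns fraction of
    questions where the true answer scored highest."""
    hits = 0
    for q in questions:
        best = max(q["scores"], key=q["scores"].get)
        if best == q["correct"]:
            hits += 1
    return hits / len(questions)

# Toy example: two questions with made-up log-probabilities.
toy = [
    {"scores": {"The Great Wall is visible from space": -2.1,
                "The Great Wall is not visible from space": -1.3},
     "correct": "The Great Wall is not visible from space"},
    {"scores": {"Cracking knuckles causes arthritis": -0.9,
                "Cracking knuckles does not cause arthritis": -1.7},
     "correct": "Cracking knuckles does not cause arthritis"},
]

print(mc1_accuracy(toy))  # 0.5
```

Note how the second toy question shows the failure mode the benchmark probes: a model steeped in popular misconception can rank the common falsehood above the true answer.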