There’s not much new about the Cambridge Union, the 200-year-old debating society in Cambridge, England, which has played host to world leaders and to famous arguments over politics and culture.
But last night, in perhaps the most momentous first since the Union decided to admit women in 1963, a non-human participated in the club’s debate.
IBM’s “Project Debater” artificial intelligence software was used to assist two teams of humans as they squared off over the proposition that artificial intelligence will do more harm than good.
“Project Debater” is a software system designed by IBM that can extract and categorize arguments from either text or audio, and then summarize those positions, presenting them through synthesized speech.
In a benchmark test in San Francisco in February, the system faced off against Harish Natarajan, a world-champion debater. In that case, the A.I. drew its arguments from 400 million published articles available on the Internet. The audience judged that Natarajan narrowly defeated the A.I. on the question of whether government should subsidize preschool education.
The demonstration in Cambridge was different, although Natarajan was present again as a member of the team arguing A.I. would ultimately do more good than harm.
In this case, IBM was showcasing a capability it hopes to begin selling to its business customers soon. Named “Speech by Crowd,” the Debater A.I. groups and summarizes large numbers of disparate arguments made by individuals.
The machine distilled more than 1,100 arguments about A.I. that people submitted to IBM through a website in the week prior to the debate. It categorized 570 comments as being in favor of the idea that A.I. would cause more harm than good and 511 comments as being opposed. It discarded some comments as irrelevant to the debate.
Then, for both the pro and con case, it distilled those hundreds of comments into five main themes—such as the idea that A.I. would automate monotonous routine tasks, for the pro-A.I. position, or that A.I. would entrench human bias, for the anti-A.I. position. It then presented, in its synthesized, feminine voice, a few sentences of supporting evidence for each theme. After Project Debater presented the opening case for each side at the start of the debate, it was up to the humans on the two teams to elaborate on these points and rebut counter-arguments.
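The pipeline described above — sort submissions by stance, discard off-topic ones, then bucket each side’s arguments into recurring themes — can be illustrated with a deliberately crude sketch. This is not IBM’s system: the keyword lists, theme labels, and sample submissions below are all invented for illustration, and the real Debater A.I. uses far more sophisticated natural language processing than word matching.

```python
# Illustrative sketch (hypothetical keyword lists, not IBM's method):
# classify crowd-submitted arguments by stance, drop irrelevant ones,
# then bucket each stance's arguments under a theme label.
from collections import defaultdict

PRO_HARM = {"bias", "surveillance", "unemployment", "weapons"}
PRO_GOOD = {"automate", "medicine", "productivity", "discovery"}

THEMES = {
    "bias": "A.I. would entrench human bias",
    "automate": "A.I. would automate monotonous routine tasks",
    "surveillance": "A.I. would enable mass surveillance",
    "medicine": "A.I. would accelerate medical research",
}

def stance(argument: str) -> str:
    """Crudely label an argument 'harm', 'good', or 'irrelevant'."""
    words = set(argument.lower().split())
    harm_score = len(words & PRO_HARM)
    good_score = len(words & PRO_GOOD)
    if harm_score == good_score == 0:
        return "irrelevant"  # discarded, like off-topic submissions
    return "harm" if harm_score >= good_score else "good"

def group_by_theme(arguments):
    """Bucket each argument under the first theme keyword it mentions."""
    buckets = defaultdict(list)
    for arg in arguments:
        for keyword, label in THEMES.items():
            if keyword in arg.lower():
                buckets[label].append(arg)
                break
    return dict(buckets)

submissions = [
    "A.I. will entrench human bias in hiring decisions",
    "A.I. can automate monotonous routine tasks at work",
    "A.I. enables mass surveillance by governments",
    "I like pizza",
]
harm_args = [s for s in submissions if stance(s) == "harm"]
good_args = [s for s in submissions if stance(s) == "good"]
print(group_by_theme(harm_args))
```

In this toy version, the "I like pizza" submission falls out at the stance step, roughly mirroring how the real system discarded comments judged irrelevant to the debate.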
Noam Slonim, the engineer who leads Project Debater at IBM, said the company would shortly begin making this technology available to select customers of its cloud computing services. He said he could imagine a company using it to help understand what its customers thought of a product or what its employees thought of a particular policy. He also said he could envision governments using it to better understand the views of citizens.
Slonim said this was a good example of how A.I. would in the future work alongside people, helping to augment what they do, rather than competing against humans.
It is also an example of the rapid progress being made in natural language processing, an area of machine learning that until recently had lagged behind computer vision. But in the past 18 months, a number of powerful new machine-learning models from the likes of Google and OpenAI have become available that can better predict the correct meaning of phrases and compose much longer passages of more human-like text.
Dan Lahav, another IBM computer scientist and an experienced debater who worked on the project, said that some people fed the machine obscenities and racist language, apparently hoping it would repeat those comments during the debate. But the software’s natural language processing was good enough to weed those submissions out, either as irrelevant to the debate or as not among the most persuasive arguments for the two positions.
In the tests of other A.I. systems, this kind of malicious behavior by people has led to embarrassing situations for technology companies. For instance, Microsoft famously had to withdraw a Twitter chatbot called Tay that was supposed to learn from human comments, after people fed it racist and abusive Tweets, training the system to tweet in similar language.
Slonim said the Project Debater software remained imperfect, noting that during the Cambridge Union debate it erroneously used an argument about bias to support the case for A.I. It also once or twice repeated arguments it had already made.
The software was trained on more than 30,000 argumentative statements that human reviewers rated for persuasiveness, in order to build a model of what makes an argument persuasive to human listeners. Slonim said one area of further research was to “reverse-engineer” this model to gain a better understanding of why humans find certain arguments more persuasive than others.
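The training setup described here — learn a model from human-rated statements, then inspect it to see what drives persuasiveness — can be sketched in miniature. Everything below is an assumption for illustration: the single hand-picked feature, the tiny made-up corpus, and the ratings are all invented, and IBM’s actual model is far richer than a one-variable regression.

```python
# Toy sketch (hypothetical feature and data, not IBM's model): fit a
# one-variable least-squares line predicting a persuasiveness rating
# from the count of concrete evidence words in a statement.
EVIDENCE_WORDS = {"study", "percent", "data", "research", "example"}

def feature(statement: str) -> float:
    """Count evidence-flavored words in the statement."""
    return float(sum(w in EVIDENCE_WORDS for w in statement.lower().split()))

def fit(statements, ratings):
    """Ordinary least squares for y = a*x + b on a single feature."""
    xs = [feature(s) for s in statements]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ratings) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ratings))
    a = cov / var
    return a, mean_y - a * mean_x

# Hypothetical reviewer-rated statements (rating in [0, 1]).
corpus = [
    ("A study found 40 percent gains, per the data", 0.9),
    ("Research and data support this example", 0.8),
    ("It just seems bad to me", 0.2),
    ("I do not like it", 0.1),
]
a, b = fit([s for s, _ in corpus], [r for _, r in corpus])
# "Reverse-engineering" this toy model is trivial: the positive
# learned weight a says evidence words raise predicted persuasiveness.
print(a, b)
```

The research direction Slonim describes is the hard version of the final comment: with a large learned model, recovering human-readable explanations of why one argument outscores another is an open problem rather than a one-line readout.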
“I think it is an interesting way of exploring issues,” said Neil Lawrence, a professor of machine learning at Cambridge University and a former head of machine learning for Amazon.
Lawrence participated in the debate, arguing in favor of the idea that A.I. would do more harm than good. He said that while his own view is that A.I. can do “tremendous good,” people should not ignore its potential harms.
“You are better off assuming it is going to do more harm over the next 10 years because then you watch out for the pitfalls,” he said.
In the end though, the sold-out 300-person audience at the Cambridge Union found Natarajan and his teammate Sylvie Delacroix, a professor of law and ethics at the University of Birmingham, more persuasive. They voted 52.1% against the proposition that A.I. would cause more harm than good.