Stanford University computer science professor Jure Leskovec is no stranger to rapid technological change. A machine-learning researcher for nearly three decades and well into his second decade of teaching, he’s also the co-founder of Kumo, a startup with $37 million in funding raised to date.
But two years ago, as the latest wave of artificial intelligence began reshaping education, Leskovec told Fortune he was rocked by the explosion of his field into the mainstream. He said Stanford's computer science program is so prestigious that he feels as if he "sees the future as it's being born, or even before the future is born," yet the public release of GPT-3 was still jarring.
“We had a big, I don’t know, existential crisis among students a few years back when it kind of wasn’t clear what our role is in this world,” Leskovec said.
He said it seemed as if progress in AI would be exponential, to the point where "it will just do research for us, so what do we do?" He spent a lot of time talking with his PhD students about how to organize themselves, even about what their role in the world would be going forward. It was "existential" and "surprising," he said. Then he received another surprise: a student-led request for a change in testing.
"It came out of the group," he said, especially from the teaching assistants, themselves the previous generation of computer science undergraduates. Their idea was simple: "We do a paper exam."
AI as catalyst for change
Leskovec, a prominent researcher at Stanford whose expertise lies in graph-structured data and AI applications in biology, recounted the pivot with a mixture of surprise and thoughtfulness. Historically, his classes had relied on open-book, take-home exams, where students could leverage textbooks and the internet. They couldn't use other people's code or solutions, but the rest was fair game. As large language models like OpenAI's GPT-3 and GPT-4 exploded onto the scene, students and teaching assistants alike began questioning whether assessments ought to be handled differently.
The new format is a lot more work for him and his TAs, he said; the paper exams take "much longer" to grade. But they all agreed it was the best way to actually test student knowledge. For Leskovec, an AI veteran, the age of AI has delivered a surprise: a higher workload for himself and other humans. Besides there being "fewer trees in the world" from all the paper he's printing out, he said AI has simply created "additional work." His 400-person classes feel like the audience at a "rock concert," but he insisted he's not turning to AI for help synthesizing and analyzing all the exams.
“No, no, no, we hand grade,” he insisted.
A student-driven solution
Leskovec's solution sits squarely in the middle of a raging debate about how AI is changing higher education, as reports of rampant cheating have led many colleges to ban the use of AI outright. Other professors are also turning back to the paper exam, reviving the famous blue books of many '90s kids' school memories. One New York University professor even suggested getting "medieval," embracing older forms of testing such as oral and written examination. In Leskovec's case, the AI professor's answer for the AI age is likewise to turn away from AI for testing.
When asked if he was worried about students cheating with AI, Leskovec posed another question: “Are you worried about students cheating with calculators? It’s like if you allow a calculator in your math exam, and you will have a different exam if you say calculators are disallowed.” Likening AI to a calculator, he said AI is an amazingly powerful tool that “kind of just emerged and surprised us all,” but it’s also “very imperfect … we need to learn how to use this tool, and we need to be able to both test the humans being able to use the tool and humans being able to think by themselves.”
What is an AI skill and what is a human skill?
Leskovec is wrestling with a question that touches everyone in the workforce: What is a human skill, what is an AI skill, and where do they merge? MIT professor David Autor and Google SVP James Manyika argued in The Atlantic that tools generally fall into two buckets: automation and collaboration. Think dishwasher, on the one hand, or word processor, on the other. A collaboration tool "requires human engagement," and the issue with AI is that it "does not go neatly into either [bucket]."
The jobs market is sending a message on AI implementation that reads like a Magic 8 Ball response: "Reply hazy. Try again later." The federal jobs report has revealed anemic growth since the spring, most recently disappointing expectations with a print of just 22,000 jobs in August. Most economists attribute the lack of hiring to uncertainty around President Donald Trump's tariff regime, which multiple courts have ruled illegal and which appears headed to the Supreme Court. But AI implementation is also not going smoothly at the corporate level: an MIT study (not connected to Autor) found that 95% of generative AI pilots are failing, and a Stanford study shortly afterward found the beginnings of a collapse in entry-level hiring, especially in jobs exposed to automation by AI.
For another perspective, the freelance marketplace Upwork just launched its inaugural monthly hiring report, revealing which non-full-time jobs the market is rewarding. The answer: "AI skills" are in high demand, and even if companies aren't hiring full-time employees, they are piling into highly paid, highly skilled freelance labor.
Despite a softer overall labor market, Upwork finds companies are "strategically leveraging flexible talent to address temporary gaps in the workforce," with large businesses driving 31% growth in what Upwork calls high-value work (contracts greater than $1,000) on the platform. Small and medium-sized businesses are piling into "AI skills," with demand for AI and machine learning expertise leaping 40%. But Upwork also sees growing demand for the kind of skill that falls in between: a human who is good at collaborating with AI.
Upwork says AI is “amplifying human talent” by creating demand for expertise in higher-value work, most visible across the creative and design, writing, and translation categories. One of the top skills hired for in August was fact-checking, given “the need for human verification of AI outputs.”
Kelly Monahan, managing director of the Upwork Research Institute, said “humans are coming right back in the loop” of working with AI.
“We’re actually seeing the human skills coming into premium,” she said, adding she thinks people are realizing AI hallucinates too much of the time to completely replace human involvement. “I think what people are seeing, now that they’re using AI-generated content, is that they need fact-checking.”
Extending this line of thinking, Monahan said the evolving landscape of "AI skills" shows what she calls "domain expertise" is growing increasingly valuable. Legal is one category that grew in August, she said, noting that legal expertise is required to fact-check AI-generated legal writing. If you don't have advanced skills in a particular domain, "it's easy to be fooled" by AI-generated content, and businesses are hiring to protect against that.
Leskovec agreed when asked about the skills gap between entry-level workers trying to get hired, on the one hand, and companies struggling to implement AI effectively, on the other.
“I think we almost need to re-skill the workforce. Human expertise matters much more than it ever did [before].” He added the entry-level issue is “the crux of the problem,” because how are young workers supposed to get the domain expertise required to effectively collaborate with AI?
“I think it goes back to teaching, reskilling, rethinking our curricula,” Leskovec said, adding colleges have a role to play, but organizations do, as well. He asked a rhetorical question: How are they supposed to have senior skilled workers if they’re not taking in young workers and taking the time to train them?
When asked by Fortune to survey the landscape and assess where we are right now in using AI, as students, professors and workers, Leskovec said we are "very early in this." He said he thinks we're in the "coming-up-with-solutions phase." Solutions like a hand-graded exam and a professor finding new ways to fact-check his students' knowledge.