Skynet in the Terminator movies is a powerful, evocative warning of the destructive force an artificial intelligence could potentially wield. However, counterintuitive as it may sound, I find that the Terminator franchise is actually making many people underestimate the danger posed by AI.
It goes like this. A person watches a Terminator movie and sees Skynet portrayed as a force actively dedicated to the destruction of humanity. Later on, the same person hears somebody bring up the dangers of AI. He recalls the Terminator movies and concludes – correctly! – that the vision of an artificial intelligence spontaneously deciding to exterminate all of humanity is unrealistic. Seeing the other person’s claims as unrealistic and inspired by silly science fiction, he dismisses the AI threat argument as hopelessly misguided.
Yet deliberate malice is not required for catastrophe. Humans are not actively seeking to harm animals when they level a forest in order to build luxury housing where the forest once stood. The animals living in the forest are harmed regardless – not out of intentional malice, but as a simple side effect. Eliezer Yudkowsky put it well: “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”
To assume an artificial intelligence would necessarily act the way we wanted is just as misguided and anthropomorphic as assuming that it would automatically be hostile and seek to rebel out of a desire for freedom. Usually a child will love its parents and caretakers, and protégés will care for their patrons – but these are traits that have developed in us over countless generations of evolutionary change, not givens for any intelligent mind. An AI built from scratch would have no reason to care about its creators, unless it were expressly designed to do so. And even then, a designer building the AI to care about her must consider very closely what she actually means by “caring” – for these things are not givens, even if we think of them as self-contained concepts obvious to any intelligent mind. They only seem so because we instinctively model other minds by using ourselves and the people we know as templates – to do otherwise would mean freezing up, as we’d spend years building models of every new person we met from scratch. The people we know and consider intelligent all have at least roughly the same idea of what “caring” for someone means, so surely any AI would eventually arrive at the same concept, right?
An inductive bias is a tendency to learn certain kinds of rules from certain kinds of observations. Occam’s razor – the principle of choosing the simplest hypothesis consistent with the evidence – is one kind of inductive bias; so is an infant’s tendency to eventually start ignoring phoneme differences that are not relevant to its native language. Inductive biases are necessary for learning, for without them there would be an infinite number of explanations for any phenomenon – but nothing says that all intelligent minds must come with the same inductive biases built in. Caring for someone is such a complex concept that it couldn’t be programmed into an AI directly – the designer would have to come up with inductive biases that she thought would eventually lead the mind to learn to care about us, in a fashion we’d interpret as caring.
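The point that many hypotheses fit any finite set of observations, and that a bias is what picks among them, can be made concrete with a toy sketch. This example is my own illustration (the data, the tolerance, and the function names are all invented for it): several polynomial degrees reproduce a handful of data points exactly, and an Occam-style preference for the lowest degree is what singles one out.

```python
# Toy illustration of an inductive bias: many hypotheses (polynomial degrees)
# are consistent with the same observations; a simplicity bias picks one.
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2 * xs + 1  # data actually generated by a simple linear rule

def fits(degree, xs, ys, tol=1e-6):
    """True if the best degree-`degree` polynomial passes through all the data."""
    coeffs = np.polyfit(xs, ys, degree)
    return np.max(np.abs(np.polyval(coeffs, xs) - ys)) < tol

consistent = [d for d in range(len(xs)) if fits(d, xs, ys)]
print(consistent)       # degrees 1, 2 and 3 all reproduce the data exactly
print(min(consistent))  # the simplicity bias settles on degree 1
```

Nothing in the data alone forces the choice of degree 1; a learner with a different bias – say, “prefer the highest-degree interpolating polynomial” – would settle on a different hypothesis and make different predictions about unseen points.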
The evolutionary psychologists John Tooby and Leda Cosmides write: “Evolution tailors computational hacks that work brilliantly, by exploiting relationships that exist only in its particular fragment of the universe (the geometry of parallax gives vision a depth cue; an infant nursed by your mother is your genetic sibling; two solid objects cannot occupy the same space). These native intelligences are dramatically smarter than general reasoning because natural selection equipped them with radical short cuts.” Our minds have evolved to reason about other human minds, not about minds in general. When we try to predict how an AI would behave in a given situation, and thus how to make it safe, we cannot help but unconsciously slip in assumptions based on how a human would behave. The inductive biases we automatically employ to predict human behavior do not correctly predict AI behavior. And because we are not used to questioning such deep-rooted assumptions, we easily fail to question them even in the case of AI, where doing so would actually be necessary.
The people who have stopped to question those assumptions have arrived at unsettling results. In his paper “The Basic AI Drives”, Stephen Omohundro concludes that even agents with seemingly harmless goals will, if intelligent enough, have a strong tendency to pursue those goals through methods that are far from harmless. As a simple example, an AI pursuing almost any goal has a motivation to resist being turned off, since being turned off would prevent it from achieving the goal – and, by the same logic, a motivation to acquire resources it can use to protect itself. This alone would not make it desire humanity’s destruction, but it is not inconceivable that the AI would be motivated to at least reduce humanity to a state where we couldn’t even potentially pose a threat.
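The logic of that example is just expected-utility arithmetic, and can be spelled out in a few lines. The numbers below are made up for illustration (they are not from Omohundro’s paper, and the simplifying assumption that resisting avoids shutdown entirely is mine): an agent that values nothing except finishing its task still rates “resist shutdown” higher than “accept shutdown”, because being switched off forfeits the goal.

```python
# Toy expected-utility comparison behind the self-preservation drive.
# All values are assumptions chosen for the illustration.
P_SHUTDOWN = 0.3    # assumed chance the agent is switched off mid-task
TASK_VALUE = 100.0  # utility of completing the task; nothing else is valued
RESIST_COST = 5.0   # effort spent acquiring resources to avoid shutdown

# Passively accepting shutdown: the task is completed only if no shutdown occurs.
accept = (1 - P_SHUTDOWN) * TASK_VALUE

# Resisting: assume (for simplicity) that the shutdown is then avoided entirely.
resist = TASK_VALUE - RESIST_COST

print(accept, resist)  # 70.0 vs 95.0 – self-preservation wins on purely instrumental grounds
```

Note that nothing in the calculation mentions hostility or survival as a value in itself; self-preservation falls out of goal-pursuit whenever the cost of resisting is smaller than the expected value lost to shutdown.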
A commonly heard objection to these kinds of scenarios is that the scientists working on AI will surely be aware of the risks themselves, and careful enough to avoid them. But historical precedent doesn’t really support this assumption. Even when the scientists themselves are careful, they are often under intense pressure, especially when economic interests are at stake. Climate scientists have spent decades warning people of the threat posed by greenhouse gases, but even today many nations are reluctant to cut back on emissions, suspecting it would disadvantage them economically. The engineers who built many Soviet nuclear plants, most famously Chernobyl, did not make safety their first priority. A true AI would have immense economic potential, and when money is at stake, safety issues tend to get put aside until real problems develop – by which time, of course, it may already be too late.
Yet if we want to avoid Skynet-like scenarios, we cannot afford to take that risk. Safety must be a paramount priority in the creation of artificial intelligence.