Can we build the Good Samaritan AI? Three guidelines for teaching morality to machines

For years, experts have warned about the unwanted effects of artificial intelligence (AI) on society. Ray Kurzweil predicts that by 2029 intelligent machines will be able to outsmart human beings. Stephen Hawking argues that “once humans develop full AI, it will take off on its own and redesign itself at an ever-increasing rate”. Elon Musk warns that AI may constitute a “fundamental risk to the existence of human civilisation”.

These dystopian prophecies have often been met with calls for more ethical AI systems: the idea that engineers should somehow imbue autonomous systems with a sense of ethics. According to some AI experts, we could teach our future robot overlords to tell right from wrong, creating a kind of “good Samaritan AI” that will always act justly and help humans in distress.

This future may still be decades away, and there is much uncertainty about how, or whether, we will ever reach this level of general machine intelligence. More crucial at the moment is that even today’s narrow AI applications require us to attend urgently to the ways in which they make moral decisions in practical, day-to-day situations: for example, when algorithms decide who gets access to loans, or when self-driving cars have to calculate the value of a human life in hazardous situations.

A question of morality

Teaching morality to machines is hard because humans cannot convey morality objectively, in metrics that a computer can easily process. In fact, it is questionable whether we humans even share a sound understanding of morality that we can all agree on.

When facing moral dilemmas, humans tend to rely on gut feeling instead of elaborate cost-benefit calculations. Machines, on the other hand, need explicit and objective instructions that can be clearly measured and optimised.

For example, an AI player can excel in games with clear rules and boundaries by learning how to optimise the score through repeated playthroughs. After its experiments with deep reinforcement learning on Atari video games, Alphabet’s DeepMind went on to beat the world’s leading masters of Go. Meanwhile, OpenAI amassed “lifetimes” of experience to beat the best human players at Valve’s Dota 2 tournament, one of the world’s most popular e-sports competitions.
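The core idea behind these systems can be shown in miniature. The sketch below is a didactic toy, not DeepMind’s or OpenAI’s actual setup: a tabular Q-learning agent on a five-state corridor, where only the rightmost state yields a “score”. Through repeated playthroughs it learns a policy that maximises that score.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = [-1, +1]    # move left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

random.seed(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(s):
    # Pick the highest-valued action, breaking ties at random.
    best = max(q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q[(s, a)] == best])

for _ in range(500):                       # repeated playthroughs
    s = 0
    while s != N_STATES - 1:
        # Mostly exploit what has been learned; occasionally explore.
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(s)
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0      # the "score" to optimise
        future = 0.0 if s2 == N_STATES - 1 else GAMMA * max(q[(s2, x)] for x in ACTIONS)
        q[(s, a)] += ALPHA * (r + future - q[(s, a)])
        s = s2

# The learned policy: which action each non-terminal state prefers.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

Because the score is explicit and measurable, the agent converges on moving right from every state. Real-life moral decisions, as the next paragraphs argue, offer no such clean reward signal.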

But in real-life situations, optimisation problems are vastly more complex. For example, how do you teach a machine to maximise fairness algorithmically or to overcome racial and gender biases in its training data? A machine cannot be taught what is fair unless the engineers designing the AI system have a precise conception of what fairness is.
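To see why fairness resists a single formula, consider one of many possible formalisations: demographic parity, the gap in approval rates between groups. The data below is invented for illustration; other definitions of fairness (equal error rates, for instance) can conflict with this one on the same data.

```python
# Invented toy data: (group, loan_decision) pairs, 1 = approved.
decisions = [("A", 1), ("A", 1), ("A", 0),
             ("B", 1), ("B", 0), ("B", 0)]

def approval_rate(group):
    """Share of applicants in `group` whose loans were approved."""
    outcomes = [d for g, d in decisions if g == group]
    return sum(outcomes) / len(outcomes)

# Demographic parity gap: 0.0 would mean identical approval rates.
parity_gap = abs(approval_rate("A") - approval_rate("B"))
```

Here group A is approved two-thirds of the time and group B one-third, giving a gap of about 0.33. The point is that engineers must first commit to a precise metric like this before a machine can be told to “be fair”.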

This has led some authors to worry that a naïve application of algorithms to everyday problems could amplify structural discrimination and reproduce biases in the data they are based on. In the worst case, algorithms could deny services to minorities, impede people’s employment opportunities, or get the wrong political candidate elected. So what can we do to design more ethically aligned machines?

Three golden rules

Firstly, AI researchers and ethicists need to formulate ethical values as quantifiable parameters. In other words, they need to provide a machine with explicit answers and decision rules to any potential ethical dilemmas it might encounter. This would require humans to agree amongst themselves on the most ethical course of action in any given situation – a challenging but not impossible task.

For example, Germany’s Ethics Commission on Automated and Connected Driving has recommended that we explicitly program ethical values into self-driving cars in order to prioritise the protection of human life above all else. In the event of an unavoidable accident, the car should be “prohibited from weighing victims against each other”. In other words, a car shouldn’t be able to choose whether to kill one person rather than another – based on factors such as age, gender or mental or physical constitution – when a crash is inescapable.
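The commission’s principle can be expressed as an explicit decision rule. The sketch below is a hypothetical illustration (manoeuvre names and numbers are invented): candidate manoeuvres are compared only on impersonal risk estimates, and personal attributes such as age or gender are deliberately excluded from the scoring function.

```python
# Attributes the scoring function must never see, per the principle
# that victims may not be weighed against each other.
FORBIDDEN_FACTORS = {"age", "gender", "physical_condition"}

def score(manoeuvre):
    """Rank a manoeuvre by impersonal harm estimates only."""
    assert FORBIDDEN_FACTORS.isdisjoint(manoeuvre), "must not weigh victims"
    return manoeuvre["harm_probability"]

# Invented candidate manoeuvres in an unavoidable-crash scenario.
options = [
    {"name": "brake_straight", "harm_probability": 0.4},
    {"name": "swerve_left", "harm_probability": 0.7},
]
choice = min(options, key=score)
```

The design choice here is that the ethical constraint is enforced structurally: a manoeuvre description containing a forbidden attribute cannot even be scored.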

Secondly, engineers need to collect enough data on explicit ethical measures to train AI algorithms appropriately. Even after we have defined specific metrics for our ethical values, an AI system might still struggle to adopt them if there is not enough unbiased data to train the models.

Getting appropriate data is challenging, because ethical norms cannot always be clearly standardised. Different situations require different ethical approaches, and in some situations there may not be a single ethical course of action at all – just think about lethal autonomous weapons that are currently being developed for military applications.

One way of solving this would be to crowdsource potential solutions to moral dilemmas from millions of humans. For instance, MIT’s Moral Machine project shows how crowdsourced data can be used to train machines to make better moral decisions in the context of self-driving cars.
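In its simplest form, crowdsourced judgments can be aggregated by majority vote to produce training labels. The scenario names and votes below are invented for illustration; the Moral Machine project’s actual analysis is considerably more sophisticated.

```python
from collections import Counter

# Hypothetical crowdsourced votes on invented dilemma scenarios.
votes = {
    "child_on_road": ["brake", "brake", "swerve", "brake"],
    "barrier_vs_pedestrian": ["hit_barrier", "hit_barrier", "continue"],
}

def majority_label(ballot):
    """Return the most common answer for one scenario."""
    return Counter(ballot).most_common(1)[0][0]

# Majority answers become training labels for the decision model.
labels = {scenario: majority_label(v) for scenario, v in votes.items()}
```

Majority aggregation is only a starting point: it bakes the biases of the voting population into the labels, which is precisely why the quality of the underlying data matters.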

And thirdly, policymakers need to implement guidelines that make ethical decision-making by AI systems more transparent, especially with regard to ethical metrics and outcomes. If AI systems make mistakes or have undesired consequences, we cannot accept “the algorithm did it” as an adequate excuse. But we also know that demanding full algorithmic transparency is technically untenable (and, quite frankly, not very useful).

Neural networks are simply too complex to be scrutinised by human inspectors. Instead, there should be more transparency about how engineers quantified ethical values before programming them, as well as the outcomes that the AI has produced as a result of these choices. For self-driving cars, for instance, this could imply that detailed logs of automated decisions are kept at all times to ensure their ethical accountability.
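One minimal sketch of such logging, with invented field names and no claim to being a production standard: every automated decision is recorded alongside the ethical parameters in force at the time, so outcomes can later be audited against the values the engineers chose.

```python
import time

# Append-only record of automated decisions for later audit.
decision_log = []

def record_decision(situation, ethical_params, action):
    """Log one automated decision with the ethical settings used."""
    entry = {
        "timestamp": time.time(),
        "situation": situation,
        "ethical_params": ethical_params,
        "action": action,
    }
    decision_log.append(entry)
    return entry

# Hypothetical example: an emergency-braking decision is logged.
record_decision(
    situation={"obstacle": "pedestrian", "speed_kmh": 48},
    ethical_params={"protect_human_life_first": True},
    action="emergency_brake",
)
```

Transparency here comes not from inspecting the neural network itself, but from keeping the quantified values and the resulting actions open to scrutiny.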

Next steps

These guidelines could be a starting point for developing ethically aligned AI systems. By failing to imbue ethics into AI systems, we may be placing ourselves in the dangerous situation of allowing algorithms to decide what’s best for us.

So, in an unavoidable accident, a self-driving car will have to make some decision, for better or worse. But if the car’s designers fail to specify a set of ethical values that could act as decision guides, the AI system may come up with a solution that causes even greater harm. This means that we cannot simply refuse to quantify our values. By walking away from this critical ethical discussion, we are making an implicit moral choice. And as machine intelligence becomes increasingly pervasive in society, the price of inaction could be enormous – it could negatively affect the lives of billions of people.

We cannot assume that machines are inherently capable of behaving morally. Humans must teach them what morality is, and how it can be measured and optimised. For AI engineers, this may seem like a daunting task. After all, defining moral values is a challenge mankind has struggled with throughout its history.

Nevertheless, the state of AI research requires us to finally define morality and to quantify it in explicit terms. Engineers cannot build a “good Samaritan AI” as long as they lack a formula for the Good Samaritan human.


The Centre for Public Impact is investigating the way in which artificial intelligence can improve outcomes for citizens.

Are you working in government and interested in how AI applies to your practice? Or are you an AI practitioner who thinks your tools could have an application in government? If so, please get in touch.
