Article | January 18th, 2018

AI's sight test: open up to power up

Article highlights

As data is the foundation of artificial intelligence, we need to see what is feeding the machines, says @LizCarolan

Ensuring that the data behind AI systems is visible is crucial for identifying problems and building trust, says @LizCarolan

The question of large tech giants opening or sharing their data stores needs to be on the table, says @LizCarolan


Data is the foundation of artificial intelligence (AI). It is what machines ingest in order to learn, and what they process in order to make decisions or deliver predictions.

The coming of the age of Big Data has driven a fresh wave of AI investment, a wave that has brought with it challenges from racial bias in algorithms to monopolistic behaviour by tech giants. Could some of the concerns arising from this resurgence be addressed by Big Data's idealistic cousin: open data?

Open data is the idea that, wherever possible, data should be made available so anyone can access, use and share it. It asserts that the potential for data to improve our world can only be realised if there is a level playing field when it comes to accessing the economy's "new oil".

This level playing field can allow more people to create new and improved insights, products and services. We have seen evidence of this in the use of open satellite and weather system data in everything from new crop insurance in Africa, to the apps we use every day.

Transparency is also a way to get systems to work better and to change behaviour. Making contracting processes visible makes corruption harder and builds trust in government, while projects that open up school data empower parents to make data-based choices and demand improvements. Transparency is also increasingly seen as an alternative to hard regulation, with measures like publishing information on organisations' pay gaps creating incentives for behaviour change.

So how does open data interact with the latest wave of AI and in particular with some of the concerns filtering up as AI is deployed in the real world?

Bias in AI systems

Training data fundamentally shapes AI systems. It creates the assumptions a system will use when it goes on to make decisions and predictions in the real world. Yet data is not neutral. As the Centre for Public Impact has pointed out, "the veneer of objectivity that algorithms provide [can] mask the very real subjectivity that lies underneath all data".

This is especially risky in complex systems like criminal justice, an area where AI is already in place. A 2016 report by ProPublica showed that algorithms used in the US to predict recidivism were biased against black people, affecting the sentencing of offenders.

In this case, it appears that biases in the data used to train the algorithm ended up embedded in the AI system, with life-altering consequences. As the AI Now Institute points out, machine learning can "reinforce existing inequalities, regardless of the intentions of the technical developers".

The ability to scrutinise AI systems and products relies on the ability to scrutinise the data on which they have been trained. Just as with government procurement and spending, ensuring that the data behind these systems is open to interrogation is crucial both for identifying problems and for building trust.

Much of this will depend on the willingness of companies to share the data underlying their AI systems.

AI monopolies

The quality of AI systems depends on the quality and quantity of available data. So it follows that the large tech giants have a substantial head start over new or emerging innovators. The Googles and Amazons of the world hold huge stores of data on everything from our movements to our shopping habits, and as their AI systems acquire more data-producing users, this advantage will only strengthen.

At first glance, the widespread adoption of open data initiatives by governments and other organisations may be widening the gap. Opening up data benefits the larger tech firms to a greater extent than smaller ones, according to Jeni Tennison, CEO of the Open Data Institute.

Larger firms have greater capacity, skills and scale to make use of publicly available data. They can also potentially use it to generate more valuable insights than smaller rivals, as they can combine it with their own large stores of data.

However, the alternative to this is keeping data closed, or charging a fee to access or use it. And failing to open data disproportionately damages smaller companies, according to Tennison.

While freely available data allows smaller firms to experiment and innovate, paywall and licensing restrictions make developing and testing new ideas prohibitively expensive or risky.

Larger firms, on the other hand, have the cash to buy access, or the lawyers to negotiate it. They also have the scale and capacity to generate their own (private) version of closed or poorly licensed data from what they already know, and to improve this through their systems and products.

Here, Tennison gives the example of Google's own version of the UK address file. This is less reliable than the strictly licensed official dataset, but it is constantly improving as it receives consumer feedback - though disproportionately in wealthier areas.

So if we want to avoid the over-concentration of market share and power in a few large firms, the question of large tech giants opening or sharing their data stores needs to be on the table.

Getting this to happen will not be easy - as The Economist pointed out, "incentives to share valuable data and algorithms are weak". Indeed, in roughly ten years of open data publication, only a tiny fraction has come from the private sector.

But as governments consider their role in relation to AI, they will need to think about the levers at their disposal to make sure these incentives exist. While transparency has sometimes been used as a softer form of regulation, on this point we may need to consider good old-fashioned hard regulation to, as The Economist puts it, get firms to "prise open their grip".

The World Wide Web Foundation is conducting exploratory research on the use of data and algorithms by national and local governments. Click here to take part in our short survey.

***

The Centre for Public Impact is investigating the way in which artificial intelligence can improve outcomes for citizens.

Are you working in government and interested in how AI applies to your practice? Or are you an AI practitioner who thinks your tools can have an application in government? If so, please get in touch.


Written by:

Liz Carolan, Senior Advisor, Open Data Charter