As we go about our lives, we generate vast quantities of data. We produce so much, in fact, that it is impossible to store it all, let alone analyse it. The proliferation of this data will continue as our use of digital technology expands.
Analysing all this data to derive powerful insights into people’s lives is the sort of task that artificial intelligence (AI) is already very good at. Businesses like Amazon use this strength of AI to tailor their offerings to customers, recommending books that people like them are reading. The power of AI to crunch data is also being put to use to improve public services and healthcare, and in some places it can even determine whether you get a visit from the police.
There are massive potential benefits to be had for consumers and citizens as increasingly powerful AI is used to personalise and target business and government services. But there are also risks, and these risks will grow as the amount of information held about us grows and the ability of AI to analyse it improves.
One risk is that information people would rather keep confidential will be revealed. This was the case when a father accidentally opened coupons for baby supplies mailed to his daughter by a store, apparently based on an algorithmic prediction that she was pregnant. Advances in AI also make anonymity increasingly fragile: because AI can cross-reference vast quantities of data across multiple data sets, it may become possible to re-attach identities to supposedly anonymised information.
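To make the re-identification risk concrete, here is a minimal sketch in Python using pandas. The data sets, column names, and values are entirely invented for illustration: it shows how an ‘anonymised’ data set can be linked back to named individuals through quasi-identifiers such as postcode, birth date, and gender.

```python
import pandas as pd

# An "anonymised" health data set: names removed, but quasi-identifiers kept.
health = pd.DataFrame({
    "postcode":   ["SW1A 1AA", "M1 1AE", "EH1 1YZ"],
    "birth_date": ["1984-03-12", "1990-07-30", "1975-11-02"],
    "gender":     ["F", "M", "F"],
    "diagnosis":  ["diabetes", "asthma", "hypertension"],
})

# A public data set (an electoral roll, say) with names alongside
# the same quasi-identifiers.
public = pd.DataFrame({
    "name":       ["Jane Doe", "John Smith", "Mary Jones"],
    "postcode":   ["SW1A 1AA", "M1 1AE", "EH1 1YZ"],
    "birth_date": ["1984-03-12", "1990-07-30", "1975-11-02"],
    "gender":     ["F", "M", "F"],
})

# Cross-referencing the two re-attaches identities to the medical records.
reidentified = health.merge(public, on=["postcode", "birth_date", "gender"])
print(reidentified[["name", "diagnosis"]])
```

When a combination of quasi-identifiers is unique to one person, as it often is, a single join is enough to undo the anonymisation; AI simply performs this kind of cross-referencing at far greater scale.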
Most of us are clueless about what data is collected about us, by whom, and for what purpose. Both parties tacitly acknowledge this when obtaining the necessary consent is reduced, literally and figuratively, to a farcical box-ticking exercise. What a strange contract it is when the customer signing it doesn’t read it, the company knows its customer doesn’t read it, and the customer knows that the company knows that the customer doesn’t read it. We need to consider whether access to our personal data remains a reasonable condition of use for everyday services, from email to Facebook.
AI systems depend on the data on which they are trained and the data they are given to assess, and they may reflect back biases in that data in the actions they recommend. Biases may exist in data because of poor data collection. They may also occur when the process being modelled itself exhibits unfairness.
If data on job applications were gathered from an industry that systematically hired men over women, and that data were then used to help select likely strong candidates, it could reinforce sexism in future hiring decisions. There have recently been a number of high-profile cases of data bias and machine prejudice in the USA. Google ads promising jobs paying more than $200,000 were shown to significantly fewer women than men. And recidivism software widely used by American courts to assess the likelihood of an individual re-offending was found to falsely flag black people at roughly twice the rate of white people.
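The hiring example can be illustrated in a few lines of code. The sketch below uses invented data and scikit-learn’s LogisticRegression purely for illustration: a model trained on historical decisions in which equally experienced women were hired less often than men learns to score women lower too.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented historical data: each row is [years_of_experience, is_male].
X = np.array([
    [5, 1], [6, 1], [4, 1], [7, 1],   # male applicants
    [5, 0], [6, 0], [4, 0], [7, 0],   # equally experienced female applicants
])
# Historical outcomes: the men were all hired, the women mostly were not.
y = np.array([1, 1, 1, 1,
              0, 1, 0, 0])

model = LogisticRegression().fit(X, y)

# Two identical candidates who differ only in gender.
man, woman = [[6, 1]], [[6, 0]]
print("P(hire | man):  ", model.predict_proba(man)[0, 1])
print("P(hire | woman):", model.predict_proba(woman)[0, 1])
# The model reproduces the historical bias: the woman scores lower,
# despite identical experience.
```

Nothing in the code is malicious; the model is doing exactly what it was asked to do, which is how biased training data quietly becomes biased decisions.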
Compounding these problems is the fact that AI decision-making systems are often deployed as background processes, unknown and unseen by those they affect. It is in any case often hard to know how an AI system arrives at the decisions it makes. This ‘black box’ problem is most acute in complicated machine learning algorithms that evolve over time.
It is because of the great opportunities and important risks associated with AI’s extraordinary capacity to crunch data that one of the most urgent recommendations in Future Advocacy’s report “An Intelligent Future?” is for a ‘New Deal On Data’ in the UK. We are not the first to propose this and we hope that many others will join the call. Such a deal would agree best practice on privacy, consent, transparency, and accountability.
A ‘New Deal On Data’ between citizens, business, and governments will be good for us ordinary people. But it is in the interests of business and government too, because it will build trust. If we do not do this, we risk undermining public confidence in AI technology and sparking opposition to its uptake.
The only way to develop such a new deal is through a loud and lively public debate. People need greater clarity about who collects what, and for what purpose. We need to argue about the rights of various parties and understand how to access information about how our own personal data is stored and used. Public debate should also focus on the uncertainties around how data might be used in the future.
Government is not well positioned to lead this and nor is business. It feels like a job for a respected and impartial national treasure to convene this crucial debate. But sadly national treasures feel a bit thin on the ground at the moment. Does anyone have Stephen Fry’s number?
This blog is based upon one of our concrete recommendations to the UK government on how it can maximise the benefits and minimise the risks of artificial intelligence, from our recent report, “An Intelligent Future?”