AI Without Data is Artificial Ignorance
Many years ago, I attended a seminar in Prague on the state of credit scoring in numerous locations in what had been Soviet Europe. I was taken aback by the struggle to find any data that would facilitate the growth of consumer markets through credit. This followed a session on credit scoring at the International Conference of Data Protection and Privacy Commissioners in Santiago de Compostela that focused on two case studies. The first was the UK, where adequate data was available, but decisions were the product of profiling. The second was France, where having only negative data led to flawed, inaccurate scores.
When thinking through conflicting policy objectives, we like to say on one hand this, on the other hand that. When dealing with information policy conundrums, we need three hands for the balancing process. On the first hand, we want better outcomes. But better outcomes, be it in health, credit, education or economic growth, require the constant evolution of advanced analytic tools fed by data. That was the dilemma facing the World Bank in modernizing Eastern European economies. On the second hand, we are concerned that probability-based decision making has impacts on both autonomy and the humanity of the decisions we make. So, data protection authorities were concerned that credit scoring is profiling. On the third hand, the decisions made by analytic tools are only as good as the quality and quantity of the data available to the tools. So, authorities were concerned that credit scoring was compelling but that the data was not adequate. With these conflicting policy drivers, balancing is best handled by a three-handed mythical being, not by mere human policy makers who only have two hands.
Artificial intelligence (AI), the next step in advanced analytics, is already here. I am not talking about interesting consumer applications, like Alexa, but rather the machine learning tools already in place to curb fraud and make networks safe. These are not controversial applications, but the controversial ones will follow. For example, a major application of AI in cancer treatment and research recently failed, in part because of a lack of sufficient data. This failure led to this blog’s title quote, “AI without data is artificial ignorance,” coined by my colleague Stan Crosley.
Over the past few weeks, I have been absorbing the impact of the proposed EU ePrivacy Regulation on the development of advanced analytics. The proposed ePrivacy Regulation is grounded in the fundamental right to respect for private life, in particular with regard to communications. This means it revolves around the individual’s ability to be the controller of the data pertaining to his or her communications from his or her terminals, be they a PC, phone, smart car, or smart medical device.
The EU General Data Protection Regulation (GDPR) links to the fundamental right to data protection, which looks for guidance to the full range of fundamental rights impacted by the processing of data. For the past four years, the IAF has been looking at means of governing advanced analytics by looking to the full range of rights and interests of all stakeholders, putting the individual first. AI has caused the IAF to focus more on how ethics impacts both analysis and the application of that analysis. All of this is based on the breadth of data protection as a fundamental right.
The GDPR governs both thinking and acting with data. IAF increasingly understands how to synthesize the three-handed mythical being in conducting assessments in a manner that aligns with the GDPR. However, the proposed ePrivacy Regulation is so broad in scope that it may very well govern all the data touched by communication coming off the many terminals we all touch in our daily routine. The proposed ePrivacy Regulation covers not just the browser on our PC but the apps on our phone and every IoT device as well. Governing research based on data coming from all those sensors and how they impact each other will be very hard to do based on explicit informed consent, as required by the proposed ePrivacy Regulation.
The IAF has formed Discussion Groups based on the challenges raised by AI, with a new focus on life sciences industries. AI in health sciences will rely on the ability to make use of data coming from different types of sensors governed by different privacy protocols. I have great confidence that the IAF will be able to create assessment processes that will handle the three-handed balancing process. But that will only be true if the data is governed in a manner other than explicit consent. I would truly hate for AI, in the end, to be artificial ignorance.