No one gets angry at a mathematician or a physicist whom he or she doesn’t understand, or at someone who speaks a foreign language, but rather at someone who tampers with your own language — Jacques Derrida
Almost every Artificial Intelligence (AI) expert agrees on three things. First, an AI algorithm is only as good as the dataset it is trained on. Second, the larger, more diverse and more global a dataset is, the better the algorithm performs. Third, skewed datasets can introduce bias into algorithms, and that bias can wreak havoc in some cases. However, almost everyone disagrees on how privacy should be defined, what level of access should be granted to whom and what constitutes good data.
Normally, centralization is a sign of crisis. In the past, resources have been pooled to respond to industry regulations. For example, the Basel III accord and the COSO and COBIT frameworks are all good cases of standardization and pooling of resources to deal with regulations. AI is not yet subject to stringent regulations, and rightfully so, since strict laws can stifle innovation. Still, this article makes a case for pooling data resources and harmonizing approaches to make data less private and more accessible, with rules built in. The idea is to create a global, unbiased dataset that can fuel the development of AI.
The World Economic Forum meets annually to discuss challenges facing the world. This year, a broad swathe of people met to discuss everything from climate leadership to inclusive finance, the future of the world economy to the future of exponential technologies. I have always been intrigued by what the future state of AI would look like.
How Do We Deal With Disparate and Global Efforts At Developing AI Systems?
As I watched the panel, I could not help but notice a few common themes that everyone agreed on and many issues on which the panel did not agree. While AI as a concept has old origins, most people understand that its development has happened in waves: there were two dry periods, called ‘AI Winters’, when research and funding for the technology dried up. Today, most of the western world looks to AI to drive efficiency and productivity. Of course, the bigger demographic trends are helping accelerate research in AI.
If we are planning to model AI systems on the human brain, it’s because the brain is the best model of intelligence we have. We can always look at different ways of developing AI, but for now, the human neural network is a good place to start. This is something most AI developers agree on. The dilemma, however, is that different companies and individuals are looking to optimize different objective functions.
For example, auto manufacturers are looking to create a safe and comfortable driving experience, so their experiments with AI deal with collision detection, lane changes, navigation and so on. The researchers at DeepMind, by contrast, were training algorithms to beat humans, and their own older versions, at chess and the Chinese game of Go. Since the applications of AI are all-pervasive, many different efforts are under way in different parts of the world. The most visible manifestations are digital voice assistants, biometric identification systems and Level 3 to Level 5 autonomous driving.
There are many ways in which coordination can be achieved. First, multinational fora such as the United Nations and the World Economic Forum can be leveraged to check in with peers annually. Second, industry alliances such as the Partnership on AI can help. As per its website, its stated mission is:
The Partnership on AI (PAI) is a multistakeholder organization that brings together academics, researchers, civil society organizations, companies building and utilizing AI technology, and other groups working to better understand AI’s impacts. The Partnership was established to study and formulate best practices on AI technologies, to advance the public’s understanding of AI, and to serve as an open platform for discussion and engagement about AI and its influences on people and society.
Third, public-private partnerships that include governments can also help. In countries such as India and Singapore, the government is trying to create a uniform, centralized database that anyone can leverage through open APIs. These are good examples of governments proactively fostering the development of AI to achieve their own goals, such as financial inclusion, education and literacy among their populations.
Of course, there is always a challenge with rogue actors, or networks of actors, outside the formal world. Bitcoin is a great example of a coordinated effort across countries to build a working proof of concept of a currency on a blockchain. Similarly, thousands of developers will be working on their own to develop AI applications, for better or for worse. Therefore, the next step could be a global developer conference for AI, at least for the developers with good intentions.
The ImageNet Model
One of the biggest examples of pooling data to train image recognition AI is the annual ImageNet contest. Hosted since 2010, it is a competition in which the research team whose algorithm recognizes images with the lowest error rate wins. ImageNet crowdsources its annotation process. The 2017 challenge evaluates algorithms for object localization and detection from images and videos at scale.
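The scoring rule behind such a contest is simple enough to sketch. The Python below is a minimal illustration, not the contest's actual evaluation code: it assumes each team submits a ranked list of guesses per image and scores teams by top-k error, the share of images whose true label is missing from the first k guesses. The function names and team names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def top_k_error(predictions: Sequence[Sequence[str]],
                labels: Sequence[str],
                k: int = 5) -> float:
    """Fraction of images whose true label is not among the first k guesses."""
    if len(predictions) != len(labels):
        raise ValueError("need exactly one list of guesses per label")
    if not labels:
        raise ValueError("cannot score an empty test set")
    misses = sum(
        1 for guesses, truth in zip(predictions, labels)
        if truth not in guesses[:k]
    )
    return misses / len(labels)


def rank_teams(submissions: Dict[str, List[List[str]]],
               labels: Sequence[str],
               k: int = 5) -> List[Tuple[str, float]]:
    """Return (team, error) pairs sorted so the lowest error comes first."""
    scores = {
        team: top_k_error(guesses, labels, k)
        for team, guesses in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical ground truth and submissions, for illustration only.
    labels = ["cat", "dog", "car", "tree"]
    submissions = {
        "team_a": [["cat", "dog"], ["dog", "cat"], ["tree", "car"], ["car", "tree"]],
        "team_b": [["cat"], ["cat"], ["car"], ["tree"]],
    }
    for team, error in rank_teams(submissions, labels, k=1):
        print(f"{team}: {error:.2f}")
```

Because every team is scored against the same held-out labels, the shared dataset is what makes the results comparable. That is the property an industry-wide contest would need to reproduce.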
For any industry group with global ambitions, and there are many, there is a great incentive to collaborate on a centralized database that can be leveraged to create a contest like ImageNet. For example, let’s say the auto industry collects data through millions of sensors and in-car devices and pools it all together. It could host a global contest for teams to optimize the objective functions they are most interested in. The advantage of pooling data and learning together is that the group can work through ethical and moral dilemmas jointly and then deal proactively with legislation that could come up. Also, weather conditions, roads and a million other variables vary by location, so no single company’s data covers them all. There is a great incentive in pooling data, subject to creating the right privacy guidelines. The industries have to proactively inform their clients about the data they are collecting, assure them of anonymity and follow through on that promise to create an optimal outcome for everyone.
This database can be funded by subscription fees paid by members of the centralized pool. One of the advantages of creating a Database as a Service (DaaS) is that data governance is built in, with data definitions and clear standards for retrieving data and maintaining privacy.
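To make the idea of pooling with built-in privacy rules a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the `pool_records` helper and the use of a keyed hash (HMAC-SHA256) to replace vehicle identifiers before data leaves a member company. A keyed hash gives pseudonymity rather than full anonymity, so a real pool would need stronger safeguards and legal review.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SensorRecord:
    """Raw record as held by one member company (hypothetical schema)."""
    vehicle_id: str
    region: str
    speed_kmh: float
    hard_brake: bool


@dataclass(frozen=True)
class PooledRecord:
    """Shared data definition for the pool: no raw identifier is kept."""
    vehicle_token: str
    source: str
    region: str
    speed_kmh: float
    hard_brake: bool


def pseudonymize(vehicle_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash that is stable per key."""
    digest = hmac.new(secret_key, vehicle_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pool_records(source: str,
                 records: Iterable[SensorRecord],
                 secret_key: bytes) -> List[PooledRecord]:
    """Convert one member's raw records into the pooled, tokenized format."""
    return [
        PooledRecord(
            vehicle_token=pseudonymize(r.vehicle_id, secret_key),
            source=source,
            region=r.region,
            speed_kmh=r.speed_kmh,
            hard_brake=r.hard_brake,
        )
        for r in records
    ]


if __name__ == "__main__":
    # Hypothetical member data; each member keeps its own secret key.
    raw = [
        SensorRecord("VIN123", "Oslo", 42.0, True),
        SensorRecord("VIN123", "Oslo", 50.0, False),
    ]
    for record in pool_records("maker_a", raw, secret_key=b"member-secret"):
        print(record)
```

The same vehicle maps to the same token within one member's data, so trips can still be linked for training, while the pool never sees the original identifier.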
One thing is clear: the best AI players in the industry will need global data.
Don’t Miss This Opportunity
There are no clear answers to many questions around AI. When will AI achieve Singularity? How many jobs will be displaced? Do we have any ethical standards? Do we have uniform privacy laws? Europe implemented the GDPR, while the rest of the western world and Asia have taken a different approach. The trick is how to drive innovation while protecting the interests of people. There is no better example of the need for global coordination than AI. In the past, global coordination was inspired by crisis. This time, we have an opportunity to avert one.