* By Amanda Matos Cavalcante
Without an adequate volume of data, it is simply impossible to train algorithms with actual company data and carry out predictive analyses… much less work with Artificial Intelligence.
A few days back, while browsing LinkedIn, I came across an article¹ by Cezar Taurion, one of the gurus of Digital Transformation. The subject intrigued me as much as he does: “It’s a misconception that sophisticated AI algorithms alone can provide valuable business solutions without an adequate volume of data.”
I have noticed huge interest from the market when it comes to Artificial Intelligence. More than a few companies are seeking to include AI in their portfolio in one way or another. However, not everyone understands that there is a right path to take, and that data is the basis of AI algorithm training and implementation.
It was not by chance that Gartner designed a pyramid that serves as a guide to Digital Transformation. According to this pyramid, Data is the base of everything; it is only possible to climb the “Analytics” and “Machine Learning” steps and reach “AI” at the top when the base (the data) exists in sufficient volume and consistency. Without an adequate volume of data, it is simply impossible to train algorithms with actual company data or to carry out predictive analytics… much less work with Artificial Intelligence.
Why did combustion happen only now, if this topic has been studied for over 50 years?
For a long time, I wondered why Artificial Intelligence had only now become a reality. Fortunately, I found the answer to this question and I would like to share it with you.
To understand why, it helps to recall the history. In 1950, English mathematician Alan Turing published the article “Computing Machinery and Intelligence,” a landmark study for the area. The term “Artificial Intelligence” came about in 1956, introduced by professor and scientist John McCarthy at the Dartmouth conference. In the 1980s, the theme surfaced again with renewed interest in neural networks and connectionism, but it cooled off after a while. It was only in the last few years that AI underwent the “boom” that we have been witnessing. And the question that remains is: why only now?
My research on the subject led me to an article posted on the Salesforce² website, which explains that although theoretical models about AI have been around for many years, evolving from simple computing to a real and consistent AI requires three key items that weren’t yet a reality a short time ago: (1) Highly powerful and affordable computing, to enable fast and efficient processing; (2) Good data models to intelligently classify, process and analyze the data; (3) Access to a large quantity of raw data to feed the models so that they continue to improve.
In other words, we can consider that Artificial Intelligence finally became a reality because of the following advances:
– A volume of data that is only available through big data, which grows every day particularly because of the explosion of data coming from (for example) social media, IoT technologies such as sensors and wearable devices, as well as public and private companies that now provide data and information previously not so accessible;
– Well-founded data models that today are prepared to process and analyze a terabyte (or more!) per second and deliver consistent insights for business strategies;
– Growth of infrastructure, such as more robust processors, storage technologies and cloud computing, coupled with virtualization capabilities, all of which allow for dynamic sizing that can support and handle large volumes of processing and storage.
It is also worth noting an inherent interconnection with the Science of Complexity, which unites the Exact Sciences with the Natural Sciences, and addresses (among other aspects) mathematical models and information compression.
Without oxygen there is no fire
I have turned my attention to the issue of AI particularly because the company where I work has been developing the largest data lake in Latin America. Working with data and creating the base of the pyramid that will lead one of Brazil’s largest telecommunication operators to the top through Artificial Intelligence has been a non-trivial task: tremendously challenging, but rewarding. We are also developing a tool that speeds up the ingestion of data into the operator’s data lake, with better time-to-market and quality, so that data scientists can explore Analytics.
I understand that companies need to take a closer look at the base of the pyramid so that, with qualified and higher-volume data, they can create predictive analyses, develop machine learning, deep learning, and natural language processing (NLP), and then offer Artificial Intelligence solutions that are relevant to their business market. The data is precisely what supports and enables the inference of patterns and the prediction of behavior, for example, which is essential to the AI process.
Just as there is no fire without oxygen, there is no successful Artificial Intelligence without the data. In the words of Cezar Taurion himself: “no algorithmic sophistication will overcome a lack of data.” We return to the Harvard phrase that I quoted in one of the articles I wrote last year: “Very soon, data will have the same importance for us as oil had for 20th century companies.” Looks like this “very soon” has arrived.
* Amanda Matos Cavalcante is Marketing Manager at Triad Systems and a specialist in Digital Strategies Management at Harvard Business School.
In the press
May 9, 2018 – ComputerWorld