A good friend of mine called my attention to a terrific article about the impact of the COVID pandemic on data science and predictive models, recently published by MIT Sloan Management Review (Data Science, Quarantined; Jeffrey D. Camm and Thomas H. Davenport; July 15, 2020).
In it, the authors note that existing machine learning (ML) models failed spectacularly during the crisis. As a result, companies quickly shifted away from predictive ML models, which were trained on past data from a world that no longer matched business reality, and toward descriptive analytics, in order to understand what was happening in the real world based on what was now showing up in their data. One manager quoted in the article put it bluntly: “Our demand-forecasting automated machine learning models didn’t handle eight weeks of zeros very well.” As someone who has spent years touting the benefits of automated machine learning, my immediate reaction to that quip was, “Ouch.”
But what struck me even more was the following quote, from Lydia Hassell of Hanesbrands, which emphasized the need to monitor ML model exceptions more frequently to uncover patterns in the new normal, to test the robustness of existing models, and to train new models:
“While we would normally run these reports on a monthly basis, we began running these weekly, or even more frequently, to better monitor what was happening to the machine learning models.”
From months to weeks sounds promising, but... wait, what? Since when is waiting weeks for new insights acceptable to a business, particularly when conditions in the real, COVID-era world are changing daily?
Sadly, this is the reality of ML today for most organizations. Despite the advances of automated machine learning, ML is still painfully slow to deliver new insights. Yes, once you have a model, it can predict very rapidly, and it will work well while conditions are relatively stable. But what about training a new model to react to changing conditions, particularly a forecasting model with hundreds or thousands of time series features, lag variables and the like? You can bet conditions will change again before you have your new model.
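To make "lag variables" concrete, here is a minimal Python sketch of how a single demand series is turned into a training table of lagged values. The helper name `make_lag_features`, the choice of lags, and the toy demand numbers are all illustrative assumptions; none of them come from the article or from any particular product.

```python
from typing import List, Sequence, Tuple


def make_lag_features(
    series: Sequence[float], lags: Sequence[int]
) -> Tuple[List[List[float]], List[float]]:
    """Turn one time series into (feature rows, targets) using lagged values.

    Hypothetical helper for illustration: each row holds the values observed
    `lag` steps before the target, for every lag in `lags`.
    """
    if not lags or min(lags) < 1:
        raise ValueError("lags must be positive integers")
    max_lag = max(lags)
    rows: List[List[float]] = []
    targets: List[float] = []
    # Start at max_lag so every requested lag has a value to look back on.
    for t in range(max_lag, len(series)):
        rows.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return rows, targets


# Toy weekly demand (made-up numbers): stable sales, then a sudden run of zeros.
demand = [100, 104, 98, 101, 0, 0, 0, 0]
X, y = make_lag_features(demand, lags=[1, 2])
for features, target in zip(X, y):
    print(features, "->", target)
```

Even in this toy case, the rows built after the drop look nothing like the rows a model would have been trained on before it. With hundreds of series and many lags per series, the table has to be rebuilt and the model refit each time conditions shift, which is where the waiting comes from.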
It sounds a bit like the dark ages to me, akin to when business teams used to ask their IT department for a new report or dashboard, and IT would come back with something static and mediocre after weeks or months. Self-service business intelligence (BI) changed all that. Now business teams can use tools like Qlik, Tableau or Power BI to create their own reports and insights with a few clicks. There are literally millions of BI users within our businesses who wouldn’t dream of waiting weeks or months for insights. Once sparked, this BI revolution occurred very rapidly.
I believe strongly that ML is in the process of undergoing a similar revolution.
The first era of ML, call it ML 1.0, was all about handcrafted models built in the backroom by teams of experts. It took weeks, months or even years, as these data scientists and engineers gathered their training data, engineered their features, selected algorithms with intimidating names like recurrent neural nets, support vector machines and extreme gradient boosted trees (that of course only true “experts” like them could understand... cue evil laugh), tuned hyperparameters, tested, validated and then iterated through it all until, eureka, they had a model that could then be deployed by another set of experts. No wonder 95% or more of the models never made it all the way through this process.
In the era of ML 2.0, automated machine learning arrived, promising to automate the above process. It truly was a great advancement. You could now load your data into an AutoML system, which through brute force, heuristics, and a boatload of compute power, could crank through dozens of feature engineering and pre-processing steps, try tens or even hundreds of fancy algorithms, perform a grid search to find the optimal hyperparameters, then retrain and validate your model across multiple data partitions, all while you sat back and enjoyed a cup of coffee or two. Well, maybe. If your data was small enough (say, a few thousand rows, like the datasets the automated ML vendors are always demoing), it would take tens of minutes to get a model. Not too shabby. If your training data was larger, as is almost always the case in the real world, your waiting time would be a few hours. Again, that is a huge improvement over the weeks and months of ML 1.0, but nowhere near the speed to insights that we are used to in the BI world.
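The brute-force flavor of that loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it searches a single hyperparameter (the smoothing factor `alpha` of simple exponential smoothing, swapped in here as a deliberately tiny stand-in for a real algorithm), scores each candidate on a couple of hold-out windows, and uses made-up data. The names `evaluate` and `grid_search` are hypothetical and do not reflect how any specific AutoML product works.

```python
from typing import List, Sequence, Tuple


def evaluate(
    series: Sequence[float], alpha: float, windows: Sequence[Tuple[int, int]]
) -> float:
    """Mean absolute one-step-ahead error, averaged over evaluation windows.

    Each window is a half-open index range (start, end) into the series.
    """
    level = series[0]
    abs_err: List[float] = [0.0]  # no forecast exists for index 0
    for t in range(1, len(series)):
        abs_err.append(abs(series[t] - level))  # forecast for t is the current level
        level = alpha * series[t] + (1 - alpha) * level
    fold_scores = [sum(abs_err[a:b]) / (b - a) for a, b in windows]
    return sum(fold_scores) / len(fold_scores)


def grid_search(
    series: Sequence[float],
    alphas: Sequence[float],
    windows: Sequence[Tuple[int, int]],
) -> Tuple[float, float]:
    """Try every candidate alpha and return (best_alpha, best_score)."""
    scored = [(evaluate(series, a, windows), a) for a in alphas]
    best_score, best_alpha = min(scored)  # lowest error wins
    return best_alpha, best_score


# Made-up, steadily growing series and two hold-out windows.
series = [10, 12, 13, 15, 16, 18, 19, 21, 22, 24]
windows = [(4, 7), (7, 10)]
best_alpha, best_score = grid_search(series, [0.1, 0.5, 0.9], windows)
print("best alpha:", best_alpha, "score:", round(best_score, 3))
```

The cost is easy to see: every candidate is re-scored on every partition. Three candidates and two windows is trivial, but multiply that by many algorithms, many hyperparameters, many feature pipelines and a large training set, and the minutes-to-hours wait described above follows naturally.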
This brings us to InstantML, ML 3.0, which I believe can spark a revolution in the way that ML is used within organizations, similar to the shift that we have seen in BI. InstantML has been introduced by Tangent Works as a completely new machine learning paradigm. It doesn’t just automate the process of a data scientist; it flips the process on its head. In short, instead of the time- and resource-intensive multi-step process of automated machine learning, InstantML engineers features and applies a highly efficient algorithm in a single step. This yields a model and insights in seconds. It is by far the fastest path from data to predictive value that I have seen.
InstantML allows business users to create new forecasting models on the fly with virtually no wait time. When benchmarked against other approaches, the engine (which Tangent Works calls TIM) is at least 100 times faster than common ML platforms in producing a model. And it does so without compromising accuracy.
Now imagine your BI users with InstantML at their fingertips, able to experiment, create new models, iterate and gain insights in real time as real-world conditions and data change. This is a game changer.
I will leave you with one more quote from the article. In closing, the authors put it well, writing that the new normal for data science will be “all about agility and speed”:
The ability to generate customized and adaptive models quickly will be a key determinant of success: It’s a different world from the relatively stable data and analytics world of the past.
Welcome to the new normal. Welcome to the InstantML revolution.
To see the power of InstantML within Qlik, watch the on-demand webinar:
The Critical Need for Predicting the Future in Uncertain Times
Scott Bergquist is the VP of Business Development for Tangent Works US. He has a 20+ year track record spanning new business development, sales channel strategy and management, organic and inorganic growth strategy development and execution, and leading consultative value creation in the areas of AI / ML, data platforms, analytics-driven organizations, IoT, and cloud / PaaS / SaaS consumption models.
Prior to Tangent Works, Scott led large business development and sales teams at Cisco and IBM, and was one of the first BD hires at DataRobot, a pioneer of automated machine learning.