The data science pipeline is complicated, with many manual steps and a fair amount of intuition. The basic pipeline includes the following steps:
1) data cleaning
2) feature selection/engineering
3) model selection
4) parameter optimization
5) model validation
AutoML systems attempt to automate everything from data cleaning to parameter optimization.
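To make the steps above concrete, here is a minimal sketch of what automating steps 1 through 4 can look like. It is a toy under stated assumptions, not how any particular AutoML product works: the dataset is made up, and the names `clean`, `select_feature`, `CATALOG`, and `auto_search` are hypothetical helpers written for this illustration. It uses only the Python standard library.

```python
from statistics import mean

def clean(rows):
    """Step 1, data cleaning: drop rows that contain a missing value."""
    return [r for r in rows if None not in r]

def select_feature(rows, col):
    """Step 2, feature selection: keep one input column plus the target."""
    return [(r[col], r[-1]) for r in rows]

def fit_mean(train, _param):
    """Baseline model: always predict the average training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg

def fit_knn(train, k):
    """k-nearest-neighbours regression on a single feature."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

# Step 3, model selection: a fixed catalog of models people already invented,
# each paired with the parameter values to try in step 4.
CATALOG = {"mean": (fit_mean, [None]), "knn": (fit_knn, [1, 3, 5])}

def mse(model, data):
    """Mean squared error of a fitted model on (x, y) pairs."""
    return mean((model(x) - y) ** 2 for x, y in data)

def auto_search(train_rows, tune_rows, n_features):
    """Steps 2-4 automated: try every feature/model/parameter combination
    and keep the one with the lowest error on the tuning split."""
    best = None
    for col in range(n_features):
        train = select_feature(train_rows, col)
        tune = select_feature(tune_rows, col)
        for name, (fit, params) in CATALOG.items():
            for param in params:
                model = fit(train, param)
                score = mse(model, tune)
                if best is None or score < best["tune_mse"]:
                    best = {"feature": col, "model": name, "param": param,
                            "tune_mse": score, "predict": model}
    return best

# Made-up data: column 0 drives the target, column 1 is a distractor,
# and two rows have missing values.
rows = [(i, (i * 7) % 11, 2.0 * i) for i in range(30)]
rows += [(None, 3, 5.0), (4, None, 8.0)]

cleaned = clean(rows)
train_rows, tune_rows, test_rows = cleaned[0::3], cleaned[1::3], cleaned[2::3]

best = auto_search(train_rows, tune_rows, n_features=2)

# Step 5, model validation: score the winner on rows the search never saw.
test_mse = mse(best["predict"], select_feature(test_rows, best["feature"]))
print(best["feature"], best["model"], best["param"], best["tune_mse"], test_mse)
```

Notice that the search can only ever return something already listed in `CATALOG`. It is an exhaustive loop over choices a person wrote down in advance, which is the tedious part of the job rather than the inventive part.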
Of course, the question is basically a red herring. AutoML systems aren’t coming up with novel algorithms. These aren’t AGIs. They’re simply drawing on techniques humans have already invented. They can certainly speed up parameter optimization, which is a slow and counterintuitive process that involves a little luck and a little deep thinking on the part of a very smart human. But will these systems replace data scientists any time soon?
Not likely. What they’ll do is make it easier for programmers who aren’t data scientists to follow well-worn paths in narrow AI development without needing many of the basic data science skills. Advanced data science teams will benefit too, since these systems cut down on the boring manual steps that make a data scientist’s day-to-day life tedious, freeing them to work on new ideas and theories.
In this article, I break down why AutoML isn’t replacing data scientists any time soon, and why it will still be a rising force in the years to come.