You are a data scientist or machine learning engineer. You’ve spent months getting together a model to satisfy some business requirement. You scoped the problem, gathered the data, prepared it, trained the model, iterated on a few versions until you hit the business’s performance requirements, and even managed to get the model into production. Now you’re done, right?

Unfortunately not. The truth is, the work has just begun. Now that your model is in production, you need to be able to ensure that it’s still meeting the previously mentioned performance requirements. If you simply deploy your model and walk away to your next project, your model performance can degrade (or even fail entirely) and you won’t have any way of knowing.

Machine learning models are becoming key to businesses of all shapes and sizes, performing myriad functions. As businesses come to depend more and more on machine learning, the need to ensure that models keep performing well grows too. A model only provides value to a business for as long as its performance holds up.

You may be asking yourself “But why would my model performance degrade? What would cause my model to fail? It worked in my development environment, so why wouldn’t it work in production?”

The answer is simple, and well stated by Cassie Kozyrkov, Chief Decision Scientist at Google:

“The world represented by your training data is the only world you can expect to succeed in.”

If the data being passed to your machine learning model for inference differs from the data that the model was trained on, your model won’t perform as expected. There are two categories of issues that cause your production data to be different from your training data: data quality issues and data change issues.

Data quality issues

Data quality issues occur when instrumentation around data collection, processing, or storage breaks down. The most obvious signs of data quality issues are lots of null or missing values, but there are other, more insidious problems with data quality that can arise as well.

One example of a data quality issue is type mismatch. If American ZIP codes (five digits) are encoded as strings when a model is trained but some upstream process causes them to be encoded as integers when the model is in production, the model will likely be unable to make heads or tails of them. Leading zeros are lost in the conversion as well, so a code such as “02134” arrives as 2134. If you are lucky, the model will break loudly and you can rectify the problem swiftly. Otherwise, the model will fail silently, at most logging a warning while it makes a sub-par prediction.
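To make the ZIP code example concrete, here is a minimal sketch of a check you could run on incoming values before they reach the model. The function name `validate_zip_codes` and the sample batches are hypothetical, made up for illustration; this is not the API of any particular monitoring tool.

```python
def validate_zip_codes(values):
    """Return (index, value, reason) for every entry that is not a five-digit string."""
    problems = []
    for i, value in enumerate(values):
        if not isinstance(value, str):
            # An integer such as 2134 may be a ZIP code that lost its leading zero.
            problems.append((i, value, f"expected str, got {type(value).__name__}"))
        elif not (len(value) == 5 and value.isascii() and value.isdigit()):
            problems.append((i, value, "not a five-digit string"))
    return problems


# Hypothetical data: what the model saw in training vs. what arrives in production.
training_batch = ["02134", "94105", "10001"]
production_batch = [2134, 94105, "10001"]  # an upstream process cast some values to int

print(validate_zip_codes(training_batch))    # no problems expected
print(validate_zip_codes(production_batch))  # flags the two integer entries
```

A check like this turns a silent failure into a loud one: instead of a quietly worse prediction, you get a list of offending rows that you can alert on.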

Data quality issues can occur in both batch and streaming data pipelines, and only by monitoring the data being passed to the model can a machine learning engineer ensure that their model is relying on high quality data.
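As a rough illustration of monitoring the data being passed to the model, the sketch below compares the share of missing values in a production batch against a baseline taken from the training data. The helper names, the field names, and the 5% tolerance are all assumptions chosen for the example, and in practice you would tune the tolerance per feature.

```python
def missing_rate(records, field):
    """Fraction of records in which `field` is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for record in records if record.get(field) is None)
    return missing / len(records)


def check_missing_rates(baseline, batch, fields, tolerance=0.05):
    """Return {field: (baseline_rate, batch_rate)} where the rate rose by more than `tolerance`."""
    alerts = {}
    for field in fields:
        base = missing_rate(baseline, field)
        current = missing_rate(batch, field)
        if current - base > tolerance:
            alerts[field] = (base, current)
    return alerts


# Hypothetical records: "age" starts going missing in production, "income" does not change.
baseline = [
    {"age": 34, "income": 52000},
    {"age": 51, "income": None},
    {"age": 29, "income": 61000},
    {"age": 42, "income": 48000},
]
batch = [
    {"age": None, "income": 58000},
    {"age": 37, "income": None},
    {"income": 45000},  # "age" key is absent entirely
    {"age": 63, "income": 70000},
]

print(check_missing_rates(baseline, batch, ["age", "income"]))
```

The same function works for a nightly batch job or for a window of records collected from a stream, which is why the check is written against a plain list of records rather than a specific pipeline.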

Data change issues

Training-serving skew, concept drift, data drift, covariate shift, oh my…

The breadth of data change issues is wide, and an exhaustive examination deserves its own blog post. By and large, data change issues can be summarized by the statement: “the real world processes generating the data have changed between when the training data was captured and now that the model is in production.”

This change might be a covariate shift (a change in the distribution of the independent input variables being fed to the model), a prior probability shift (a change in the distribution of the dependent target variable being predicted by the model), or a concept shift (a change in the relationship between the independent and dependent variables). In all likelihood, some combination of these three issues will occur with your model, though over what time period and to what extent is highly variable.
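One common way to look for covariate shift in a single numeric feature is to compare its distribution in the training data with its distribution in recent production data. The sketch below uses the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the two empirical cumulative distribution functions. It is only one of several possible drift measures, the sample values are invented, and any alerting threshold you put on the result would be your own choice.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical CDFs."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Both CDFs are step functions, so the largest gap occurs at an observed value.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


# Hypothetical feature values captured at training time and in production.
training_values = [1, 2, 3, 4]
unchanged_values = [1, 2, 3, 4]
shifted_values = [3, 4, 5, 6]

print(ks_statistic(training_values, unchanged_values))  # 0.0, identical distributions
print(ks_statistic(training_values, shifted_values))    # 0.5, a sizeable shift
```

The statistic ranges from 0 (the samples look the same) to 1 (the samples do not overlap at all), which makes it easy to track over time for each feature.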

It’s important to note that all data change issues can cause model performance degradation. Fortunately, by monitoring the input data and the predictions made by the model, it’s possible to be alerted when any of these data change issues arise.
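Monitoring the predictions themselves can be sketched in the same spirit. For a classifier, one simple option is to compare the mix of predicted classes in production with the mix seen on a reference set, here using the total variation distance (half the sum of the absolute differences in class proportions). The labels and helper names below are hypothetical, and a change in the prediction mix is a hint that something has shifted rather than proof of which kind of shift occurred.

```python
from collections import Counter


def class_proportions(labels):
    """Map each label to its share of the given predictions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation_distance(reference_labels, current_labels):
    """Half the sum of absolute differences between the two class distributions."""
    p = class_proportions(reference_labels)
    q = class_proportions(current_labels)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in set(p) | set(q))


# Hypothetical predictions: 80% "approve" on the reference set, 50% in production.
reference_predictions = ["approve"] * 8 + ["deny"] * 2
production_predictions = ["approve"] * 5 + ["deny"] * 5

distance = total_variation_distance(reference_predictions, production_predictions)
print(f"{distance:.2f}")
```

A distance of 0 means the class mix is unchanged and a distance of 1 means the two sets share no classes at all, so the value can be charted per day or per batch and alerted on once it crosses a level you consider unusual.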

But how?

Hopefully, you come away from this post with a thorough understanding of the problems that can arise if you don’t monitor your ML models. But, now that we’ve answered the question “Why should I monitor my machine learning models?”, we also need to answer “How do I monitor my machine learning models?”

Monitoring machine learning models in production can be daunting, but WhyLabs makes it easy. With our fully self-serve signup flow and zero configuration setup, you can start monitoring your models right away. Better still, the Starter tier, which gives you all of the features of the platform for a single model for free, allows you to trial the system without even entering your credit card info.

This blog has been republished by AIIA.