Last Updated on September 30, 2016
5 Best Practices For Operationalizing Machine Learning.
Not all predictive models are at Google-scale.
Sometimes you develop a small predictive model that you want to put in your software.
I recently received this reader question:
Actually, there is a part that is missing in my knowledge about machine learning. All tutorials give you the steps up until you build your machine learning model. How could you use this model?
In this post, we look at some best practices to ease the transition of your model into production and ensure that you get the most out of it.
I Have a Model. Now What?
So you have been through a systematic process and created a reliable and accurate model that can make predictions for your problem.
You want to use this model somehow.
- Maybe you want to create a standalone program that can make ad hoc predictions.
- Maybe you want to incorporate the model into your existing software.
Let’s assume that your software is modest. You are not looking for Google-sized scale deployment. Maybe it’s just for you, maybe just a client or maybe for a few workstations.
So far so good?
Now we need to look at some best practices to put your accurate and reliable model into operations.
5 Model Deployment Best Practices
Why not just slap the model into your software and release?
You could. But by adding a few additional steps you can build confidence that the model that you’re deploying is maintainable and remains accurate over the long term.
Have you put a model into production?
Please leave a comment and share your experiences.
Below are five best-practice steps that you can take when deploying your predictive model into production.
1. Specify Performance Requirements
You need to clearly spell out what constitutes good and bad performance.
This may be expressed as accuracy, false positives or whatever metrics are important to the business.
Spell the requirements out, and use the current model you have developed to supply the baseline numbers.
These numbers may be increased over time as you improve the system.
Performance requirements are important. Without them, you will not be able to set up the tests you will need to determine if the system is behaving as expected.
Do not proceed until you have agreed upon a minimum, a mean or a range for expected performance.
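One way to make the agreed requirements concrete is to write them down as data that your tests can read. The snippet below is a minimal sketch, assuming a binary classification problem. The class name, field names and threshold values are all hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceRequirements:
    """Agreed minimum performance for the model (hypothetical fields)."""
    min_accuracy: float
    max_false_positive_rate: float


def compute_metrics(y_true, y_pred):
    """Return accuracy and false positive rate for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    false_positives = sum(1 for t, p in pairs if t == 0 and p == 1)
    negatives = sum(1 for t, _ in pairs if t == 0)
    return {
        "accuracy": correct / len(pairs),
        "false_positive_rate": false_positives / negatives if negatives else 0.0,
    }


def meets_requirements(metrics, req):
    """True only if every agreed requirement is satisfied."""
    return (metrics["accuracy"] >= req.min_accuracy
            and metrics["false_positive_rate"] <= req.max_false_positive_rate)


# Illustrative thresholds only; in practice they come from your baseline model.
requirements = PerformanceRequirements(min_accuracy=0.75,
                                       max_false_positive_rate=0.25)
metrics = compute_metrics([0, 0, 0, 0, 1, 1, 1, 1],
                          [0, 0, 0, 1, 1, 1, 1, 0])
print(metrics, meets_requirements(metrics, requirements))
```

Keeping the thresholds in one place means the same numbers can be reused by every later check.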
2. Separate Prediction Algorithm From Model Coefficients
You may have used a library to create your predictive model. For example, R, scikit-learn or Weka.
You can choose to deploy your model using that library or re-implement the predictive aspect of the model in your software. You may even want to set up your model as a web service.
Regardless, it is good practice to separate the algorithm that makes predictions from the model internals, that is, the specific coefficients or structure within the model learned from your training data.
2a. Select or Implement The Prediction Algorithm
Often the complexity of a machine learning algorithm is in the model training, not in making predictions.
For example, making predictions with a regression algorithm is quite straightforward and easy to implement in your language of choice. This is an obvious candidate to re-implement yourself rather than depending on the library used to train the model.
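To show how little code this can take, here is a minimal sketch of prediction for a linear regression model. The function name and the example numbers are assumptions made for illustration; the coefficients would come from your own trained model.

```python
def predict(intercept, weights, features):
    """Linear regression prediction: intercept + sum(weight_i * feature_i)."""
    if len(weights) != len(features):
        raise ValueError(
            "expected %d features, got %d" % (len(weights), len(features)))
    return intercept + sum(w * x for w, x in zip(weights, features))


# Made-up coefficients and input, purely for illustration.
prediction = predict(1.0, [2.0, -0.5], [3.0, 4.0])
print(prediction)
```

Note that the coefficients are passed in rather than hard-coded, which keeps the algorithm separate from what the model learned.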
If you decide to use the library to make predictions, get familiar with the API and with the dependencies.
The software used to make predictions is just like all the other software in your application.
Treat it like software.
Implement it well, write unit tests, make it robust.
2b. Serialize Your Model Coefficients
Let’s call the numbers or structure learned by the model: coefficients.
These data are not ordinary settings for your application, but you should treat them like software configuration.
Store it in an external file with the software project. Version it. Treat configuration like code because it can just as easily break your project.
You very likely will need to update this configuration in the future as you improve your model.
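As a rough sketch of what this could look like, the snippet below writes coefficients to a JSON file and reads them back. JSON is just one possible choice of format, and the file name, field names and values are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical coefficients for a small linear model.
coefficients = {"model_version": "v1", "intercept": 1.0, "weights": [2.0, -0.5]}


def save_coefficients(path, coeffs):
    """Write coefficients as readable, diff-friendly JSON."""
    with open(path, "w") as f:
        json.dump(coeffs, f, indent=2, sort_keys=True)


def load_coefficients(path):
    """Read coefficients and fail fast if a required field is missing."""
    with open(path) as f:
        coeffs = json.load(f)
    for key in ("model_version", "intercept", "weights"):
        if key not in coeffs:
            raise KeyError("missing coefficient field: " + key)
    return coeffs


# A temporary directory stands in for your project directory here.
path = os.path.join(tempfile.mkdtemp(), "model_coefficients.json")
save_coefficients(path, coefficients)
loaded = load_coefficients(path)
print(loaded)
```

A sorted, indented text file produces small diffs under version control, which makes changes to the model easy to review.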
3. Develop Automated Tests For Your Model
You need automated tests to prove that your model works as you expect.
In software land, we call these regression tests. They ensure the software has not regressed in its behavior as we make changes to different parts of the system.
Write regression tests for your model.
- Collect or contrive a small sample of data on which to make predictions.
- Use the production algorithm code and configuration to make predictions.
- Confirm the results are expected in the test.
These tests are your early warning alarm. If they fail, your model is broken and you can’t release the software or the features that use the model.
Make the tests strictly enforce the minimum performance requirements of the model.
I strongly recommend contriving test cases that you understand well, in addition to any raw datasets from the domain you want to include.
I also strongly recommend gathering outlier and interesting cases from operations over time that produce unexpected results (or break the system). These should be understood and added to the regression test suite.
Run the regression tests after each code change and before each release. Run them nightly.
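The steps above can be sketched with Python's built-in unittest module. This is only an illustration: the coefficients, the contrived cases and the error threshold are made up, and in a real project the prediction code and coefficients would be the production ones rather than the stand-ins defined here.

```python
import unittest

# Stand-ins for the production coefficients and prediction code.
COEFFICIENTS = {"intercept": 1.0, "weights": [2.0, -0.5]}


def predict(coeffs, features):
    return coeffs["intercept"] + sum(
        w * x for w, x in zip(coeffs["weights"], features))


# Contrived cases that are easy to verify by hand: (features, expected).
CONTRIVED_CASES = [([0.0, 0.0], 1.0), ([3.0, 4.0], 5.0), ([1.0, 2.0], 2.0)]
MAX_MEAN_ABS_ERROR = 0.01  # hypothetical minimum performance requirement


class ModelRegressionTest(unittest.TestCase):
    def test_contrived_cases(self):
        for features, expected in CONTRIVED_CASES:
            self.assertAlmostEqual(
                predict(COEFFICIENTS, features), expected, places=6)

    def test_meets_minimum_performance(self):
        errors = [abs(predict(COEFFICIENTS, f) - e) for f, e in CONTRIVED_CASES]
        self.assertLessEqual(sum(errors) / len(errors), MAX_MEAN_ABS_ERROR)


def run_regression_tests():
    """Run the suite and return the result so a build script can check it."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ModelRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_regression_tests()
print("passed" if result.wasSuccessful() else "FAILED")
```

As outlier cases turn up in operations, they can be appended to the list of cases once they are understood.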
4. Develop Back-Testing and Now-Testing Infrastructure
The model will change, as will the software and the data on which predictions are being made.
You want to automate the evaluation of the production model with a specified configuration on a large corpus of data.
This will allow you to efficiently back-test changes to the model on historical data and determine if you have truly made an improvement or not.
This is not the small dataset that you may use for hyperparameter tuning. This is the full suite of data available, perhaps partitioned by month, year or some other important demarcation.
- Run the current operational model to baseline performance.
- Run new models, competing for a place to enter operations.
Once set up, run it nightly or weekly and have it spit out automatic reports.
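A back-test harness does not need to be elaborate. The following is a minimal sketch, assuming a regression problem scored by mean absolute error and history partitioned by month. The two models, the record layout and the data values are all invented for illustration.

```python
from collections import defaultdict


def mean_absolute_error(model, rows):
    return sum(abs(model(r["features"]) - r["target"]) for r in rows) / len(rows)


def back_test(models, history):
    """Score every model on every monthly partition of the historical data."""
    partitions = defaultdict(list)
    for row in history:
        partitions[row["month"]].append(row)
    report = {}
    for month in sorted(partitions):
        report[month] = {name: mean_absolute_error(model, partitions[month])
                         for name, model in models.items()}
    return report


# Hypothetical operational model and a challenger.
def current_model(features):
    return 1.0 + 2.0 * features[0]


def candidate_model(features):
    return 0.5 + 2.0 * features[0]


# Made-up historical records.
history = [
    {"month": "2016-07", "features": [1.0], "target": 3.0},
    {"month": "2016-07", "features": [2.0], "target": 5.0},
    {"month": "2016-08", "features": [1.0], "target": 2.5},
    {"month": "2016-08", "features": [3.0], "target": 6.5},
]

report = back_test({"current": current_model, "candidate": candidate_model},
                   history)
for month, scores in report.items():
    print(month, scores)
```

Reporting per partition rather than one overall number makes it easier to see whether a candidate is better everywhere or only in some periods.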
Next, add a Now-Test.
This is a test of the production model on the latest data.
Perhaps it’s the data from today, this week or this month. The idea is to get an early warning that the production model may be faltering.
This can be caused by concept drift, where the relationships in the data exploited by your model change subtly over time.
This Now-Test can also spit out reports and raise an alarm (by email) if performance drops below minimum performance requirements.
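Here is one way a Now-Test might be sketched, assuming a classifier scored by accuracy. The notify argument is a hypothetical hook: in practice it could send an email, but here it just collects messages in a list so the example stays self-contained. The model, data and threshold are made up.

```python
def now_test(model, latest_rows, min_accuracy, notify):
    """Score the production model on the latest data and alert on a drop."""
    correct = sum(1 for r in latest_rows if model(r["features"]) == r["label"])
    accuracy = correct / len(latest_rows)
    if accuracy < min_accuracy:
        notify("Now-Test alarm: accuracy %.2f is below the minimum %.2f"
               % (accuracy, min_accuracy))
    return accuracy


# Hypothetical production model: a simple threshold on one feature.
def threshold_model(features):
    return 1 if features[0] > 0.5 else 0


# Made-up "latest" data in which the old relationship no longer holds well.
latest_rows = [
    {"features": [0.9], "label": 1},
    {"features": [0.8], "label": 0},
    {"features": [0.2], "label": 0},
    {"features": [0.1], "label": 1},
]

alerts = []  # stands in for an email or paging system
accuracy = now_test(threshold_model, latest_rows,
                    min_accuracy=0.75, notify=alerts.append)
print(accuracy, alerts)
```

Passing the alert mechanism in as a function keeps the check itself easy to test without sending real messages.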
5. Challenge Then Trial Model Updates
You will need to update the model.
Maybe you devise a whole new algorithm which requires new code and new config. Revisit all of the above points.
A smaller and more manageable change would be to the model coefficients. For example, perhaps you set up a grid or random search of model hyperparameters that runs every night and spits out new candidate models.
You should do this.
Test the model and be highly critical. Give a new model every chance to slip up.
Evaluate the performance of the new model using the Back-Test and Now-Test infrastructure in Point 4 above. Review the results carefully.
Evaluate the change using the regression test, as a final automated check.
Test the features of the software that make use of the model.
Perhaps roll the change out to some locations or in a beta release for feedback, again for risk mitigation.
Accept your new model once you are satisfied that it meets the minimum performance requirements and betters prior results.
Like a ratchet, consider incrementally updating performance requirements as model performance improves.
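The acceptance rule and the ratchet can be written down in a few lines. This is a sketch under stated assumptions: a single score where higher is better, and a hypothetical safety margin below the accepted score when raising the minimum. The function names and numbers are illustrative only.

```python
def accept_challenger(champion_score, challenger_score, min_required,
                      regression_tests_pass):
    """Accept only if the tests pass, the minimum is met and the champion is beaten."""
    return (regression_tests_pass
            and challenger_score >= min_required
            and challenger_score > champion_score)


def ratchet(min_required, accepted_score, margin=0.02):
    """Raise the minimum toward the accepted score; never lower it."""
    return max(min_required, accepted_score - margin)


# Made-up scores from the back-test and now-test reports.
min_required = 0.80
champion_score = 0.83
challenger_score = 0.86

if accept_challenger(champion_score, challenger_score, min_required,
                     regression_tests_pass=True):
    min_required = ratchet(min_required, challenger_score)

print(min_required)
```

The margin is a design choice: without it, ordinary noise in the next evaluation could trip the alarm on a perfectly good model.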
Adding a small model to operational software is very achievable.
In this post, you discovered 5 steps to make sure you cover your bases and are following good engineering practices.
In summary, these steps were:
- Specify Performance Requirements.
- Separate Prediction Algorithm From Model Coefficients.
- Develop Automated Tests For Your Model.
- Develop Back-Testing and Now-Testing Infrastructure.
- Challenge Then Trial Model Updates.
If you’re interested in more information on operationalizing machine learning models check out the post:
This is more on the Google-scale machine learning model deployment. Watch the video mentioned and review the great links to both the AirBnB and Etsy production pipelines.
Do you have any questions about this post or putting your model into production?
Ask your question in the comments and I will do my best to answer.