Last Updated on June 7, 2016
Once you have found and tuned a viable model of your problem it is time to make use of that model. You may need to revisit your why and remind yourself what form you need a solution for the problem you are solving.
The problem is not addressed until you do something with the results. In this post you will learn tactics for presenting your results in answer to a question and considerations when turning your prototype model into a production system.
Depending on the type of problem you are trying to solve, the presentation of results will be very different. There are two main facets to making use of the results of your machine learning endeavor:
- Report the results
- Operationalize the system
Once you have discovered a good model and a good enough result (or not, as the case may be), you will want to summarize what was learned and present it to stakeholders. This may be yourself, a client or the company for which you work.
Use a powerpoint template and address the sections listed below. You might like to write up a one-pager and use part section as a section header. Try to follow this process even on small experimental projects you do for yourself such as tutorials and competitions. It is easy to spend an inordinate number of hours on a project and you want to make sure to capture all the great things you learn along the way.
Below are the sections you can complete when reporting the results for a project.
- Context (Why): Define the environment in which the problem exists and set up the motivation for the research question.
- Problem (Question): Concisely describe the problem as a question that you went out and answered.
- Solution (Answer): Concisely describe the solution as an answer to the question you posed in the previous section. Be specific.
- Findings: Bulleted lists of discoveries you made along the way that interest the audience. They may be discoveries in the data, methods that did or did not work or the model performance benefits you achieved along your journey.
- Limitations: Consider where the model does not work or questions that the model does not answer. Do not shy away from these questions, defining where the model excels is more trusted if you can define where it does not excel.
- Conclusions (Why+Question+Answer): Revisit the why, research question and the answer you discovered in a tight little package that is easy to remember and repeat for yourself and others.
The type of audience you are presenting to will define the amount of detail you go into. Having the discipline to complete your projects with a report on results, even on small side projects will accelerate your learning in field. On such small side projects, I highly recommend sharing results of side projects on blogs or with communities and elicit feedback that you can capture and take with you into the start of your next project.
You have found a model that is good enough at addressing the problem you face that you would like to put it into production. This may be an operational installation on your workstation in the case of a fun side project, all the way up to integrating the model into an existing enterprise application. The scope is enormous. In this section you will learn three key aspects to operationalizing a model that you could consider carefully before putting a system into production.
Three areas that you should think carefully about are the algorithm implementation, the automated testing of your model and the tracking of the models performance through time. These three issues will very likely influence the type of model you choose.
It is likely that you were using a research library to discover the best performing method. The algorithm implementations in research libraries can be excellent, but they can also be written for the general case of the problem rather than the specific case with which you are working.
Think very hard about the dependencies and technical debt you may be creating by putting such an implementation directly into production. Consider locating a production-level library that supports the method you wish to use. You may have to repeat the process of algorithm tuning if you switch to a production level library at this point.
You may also consider implementing the algorithm yourself. This option may introduce risk depending on the complexity of the algorithm you have chosen and the implementation tricks it uses. Even with open source code, there may be a number of complex operations that may be very difficult to internalize and reproduce confidently.
Write automated tests that verify that the model can be constructed and achieve a minimum level of performance repeatedly. Also write tests for any data preparation steps. You may wish to control the randomness used by the algorithm (random number seeds) for each unit test run so that tests are 100% reproducible.
Add infrastructure to monitor the performance of the model over time and raise alarms if accuracy drops below a minimum level. Tracking may occur in real-time or with samples of live data on a re-created model in a separate environment. A raised alarm may be an indication that that structure learned by the model in the data have changed (concept drift) and that the model may need to be updated or tuned.
There are some model types that can perform online learning and update themselves. Think carefully in allowing models to update themselves in a production environment. In some circumstances, it can be wiser to manage the model update process and switch out models (their internal configuration) as they are verified to be more performant.
In this post you learned that a project is not considered complete until you deliver the results. Results may be presented to yourself or to your clients and there is a minimum structure to follow when presenting results.
You also learned three concerns when using a learned model in a production environment, specifically the nature of the algorithm implementation, model tests and on going tracking.
Can you share a sample PPT regarding presenting the results.
Perhaps in the future.
Hi Jason, Well done, very insightful.
Can you suggest places where I can see a case study of taking an ML pipeline to production
I have some notes here that might help as a start:
I think that making the transition from a prototype to a production ready machine learning system is quite daunting. In your experience, tools as sklearn or Keras are used in production systems or they belong only to the prototyping phase?
They can be, it really depends on the production environment.
The prediction side of ML methods is often trivially simple, so much so that you can save the “model” (e.g. coefficients) and load it with your own code for use in ops. I have done this myself (e.g. model in Java, save coefficients to ascii, load and make predictions in FORTRAN on big computers).
I write more about this here:
I hope that helps.
Excellent. Thank you very much!
pls con you share a PPt presentation you ever do.
Sorry, I cannot. All presentations that I have prepared are confidential.
Hello our bro. Jason, Thank you for your open minded and help every developers like me; but still I am scratch beginner for machine learning but doing my research thesis using ML so what do i do? specially on word embedding neural network on local language.
Perhaps start right here:
Hi Jason, do you have an idea on how to use the save model without going to fit the model?Really need an explanation on this. Thank you
Sorry, I don’t understand.
You must fit the model before it can be used to make predictions, regardless of saving it or not.
what I mean is the fit take some time in running the algorithm. Lets say I already fit n save the model. Then I open up the another page to load the save model n do the prediction and skip the fit part.
My question is, even if the model has been saved and load, I still need the fit part?Lets say I want to proceed to the production, does I still need to train all the data which it takes time to run?
Sorry for asking Im really confused.
No, the model is only fit once.
Hi Jason. Do you have an idea to make a realtime prediction?
For more help see this:
Hi Jason .
How different are the ‘production level models’ to that we use for estimating the model or predicting. ?? For eg , if I use sklearn library for estimating/modelling , how do I know its a good fit with a production-level model ? Can you please throw some light on which are some of the libraries I can look at , for production-level models . Thanks !
Sorry, I don’t understand your question, perhaps you can elaborate.
We use the same library and same process to fit a model for production. Perhaps this will help:
The good results of model machine learning can become a data train?. It means when I get the results I will retrain the model with old data and new data(the results). I have tried to find information about this on the internet but I can’t get the answer.
Thank you so much!
Sorry I don’t understand your question, perhaps you can elaborate or rephrase?