XenonStack

A Stack Innovator

Thursday, 12 December 2019

DevOps for Machine Learning, TensorFlow and PyTorch

Why Continuous Integration and Deployment?

Modern web applications need agile delivery because the requirements of clients and consumers keep changing. Machine Learning faces the same problem in a sharper form: the challenge is to build a system that works well in the real world, and real-world conditions change continuously, so the system needs to keep learning from new data. The solution is DevOps for Machine Learning and Deep Learning: a pipeline that periodically retrains the model on new data, then validates and tests its accuracy to make sure it still works well under current real-world conditions.
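The retrain-then-validate step described above can be sketched as a simple promotion gate. This is a minimal illustration, not a prescribed implementation: the names EvalResult and should_promote and the 0.80 accuracy floor are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Accuracy of one model version on a fresh validation set."""
    version: str
    accuracy: float


def should_promote(candidate: EvalResult,
                   production: EvalResult,
                   min_accuracy: float = 0.80) -> bool:
    """Decide whether a retrained model may replace the production model.

    The candidate must clear an absolute accuracy floor (an assumed
    threshold) and be at least as accurate as the current model.
    """
    if candidate.accuracy < min_accuracy:
        return False
    return candidate.accuracy >= production.accuracy
```

A CI job could call should_promote after every scheduled retraining run and stop the deployment stage when it returns False.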

DevOps for Machine Learning, TensorFlow and PyTorch

TensorFlow and PyTorch are open source frameworks commonly used in DevOps for Machine Learning. TensorFlow is developed by Google and follows a dataflow-graph approach similar to Theano's, whereas PyTorch is developed by Facebook and is based on Torch. Both frameworks define computational graphs. In TensorFlow 1.x, the entire computational graph has to be defined before the ML algorithm is run (TensorFlow 2.x executes eagerly by default). PyTorch uses dynamic graphs and builds them on the go.
TensorFlow has TensorBoard for visualization, which runs directly in the browser. PyTorch has no visualization tool of its own, but it can write TensorBoard logs through torch.utils.tensorboard, and Matplotlib can also be used with it. TensorFlow has more community support and online solutions than PyTorch.
Whichever framework is used to build the Machine Learning model, CI/CD is essential. Developers and data scientists spend much of their time managing and deploying models to production manually, which introduces a lot of human error. This needs to be an automated process with a well-defined pipeline and model versioning.
The skills needed by a data scientist are changing: the role is less about visualization and statistics and is moving closer to engineering. Continuous Integration and Deployment of Machine Learning models is the real challenge in the data science world, because productionizing models requires a proper integration and deployment pipeline. As the real world changes continuously, the system should be able to learn over time. A Continuous Integration and Deployment pipeline makes this happen.
Today, when a modern application needs a continuous pipeline, tools like Git, Bitbucket, GitLab CI and Jenkins are used for versioning and managing the code. In a typical application, only the codebase has to be managed and versioned, but in Machine Learning and AI applications things are more iterative and complex, and the data has to be managed as well. A system is required that can version and manage the data, the models and the intermediate data.

Continuous Development life cycle

Git is a source code management and version control tool. Versioning the code is critical for product releases, and Git keeps a history for every file, which makes it possible to explore changes and review the code.

Git for versioning and managing code

In Machine Learning and AI systems, the model code needs to be managed so that releases and changes can be tracked. Git is the most widely used source code management tool. In Continuous Integration and Continuous Deployment, versions are managed with Git tags and branches, and git-flow can be used for feature branches.

DVC for versioning models and data

Unlike source code, models and data are large files, and Git is not suitable for cases like this. DVC (Data Version Control) is a version control system for data science that provides end-to-end support for managing training data, intermediate data and model data.

Version control

DVC provides Git-like commands to add, commit and push models and data to remote storage such as S3, Azure, GCP, MinIO or SSH. It also includes data provenance for tracking the evolution of Machine Learning models. DVC helps with reproducibility when there is a need to go back to a particular experiment.
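The general idea behind versioning large files outside Git is to commit a small pointer that records a content hash, while the data itself lives in remote storage. The sketch below illustrates only that idea: the function names and the .pointer.json format are hypothetical, and DVC's own metafiles and cache layout differ.

```python
import hashlib
import json
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small JSON pointer next to the data file.

    The pointer (hypothetical format) is what would be committed to Git;
    the data file itself would be pushed to remote storage.
    """
    pointer = {
        "path": data_path.name,
        "md5": file_md5(data_path),
        "size": data_path.stat().st_size,
    }
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path
```

Because the pointer changes whenever the data changes, checking out an old Git commit identifies exactly which version of the data belongs to that experiment.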

Experiment management

Metric tracking is easy with DVC. Its metric tracking feature lists all branches along with their metric values, which makes it easy to pick the best version of the experiment.
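Picking the best experiment from per-branch metrics can be sketched as follows. This is an illustration of the selection step only: the function best_experiment and the shape of the metrics dictionary are assumptions, not DVC output or a DVC API.

```python
from typing import Dict


def best_experiment(metrics_by_branch: Dict[str, Dict[str, float]],
                    metric: str = "accuracy",
                    higher_is_better: bool = True) -> str:
    """Return the branch whose recorded metric is best.

    Branches that did not record the requested metric are ignored.
    """
    scored = {
        branch: values[metric]
        for branch, values in metrics_by_branch.items()
        if metric in values
    }
    if not scored:
        raise ValueError(f"no branch reports the metric {metric!r}")
    chooser = max if higher_is_better else min
    return chooser(scored, key=scored.get)
```

For a metric such as loss, where lower is better, pass higher_is_better=False.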

Deployment and Collaboration

The dvc push and dvc pull commands move changes to production or staging. DVC also has a built-in way to build a DAG out of ML steps: the dvc run command is used to create the pipeline stages. This streamlines the work into a single, reproducible environment and makes that environment easy to share.
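A DAG of ML steps simply records which stage depends on which, so that stages run in a valid order. The sketch below shows the concept with Python's standard graphlib module (Python 3.9 or later). The stage names are hypothetical, and this is not how DVC stores or executes its pipelines.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical stages; each stage maps to the stages it depends on.
PIPELINE: Dict[str, Set[str]] = {
    "prepare": set(),
    "featurize": {"prepare"},
    "train": {"featurize"},
    "evaluate": {"train", "featurize"},
}


def execution_order(pipeline: Dict[str, Set[str]]) -> List[str]:
    """Return the stages in an order that respects every dependency.

    Raises graphlib.CycleError if the stages depend on each other
    in a cycle, which means the pipeline is not a valid DAG.
    """
    return list(TopologicalSorter(pipeline).static_order())
```

Reproducing an experiment then amounts to re-running the stages in this order, starting from the first stage whose inputs changed.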

Packaging Models

There are many ways to package models, but the most convenient and automated is Docker on Kubernetes. Docker serves not only for packaging but also as a development environment, and it handles dependency versions as well. It is more reliable than running Flask directly on a virtual machine.
In this approach, Nginx, Gunicorn and Docker Compose are used to create a scalable, repeatable template that is easy to run with continuous integration and deployment.

Directory Structure

├── README.md
├── nginx/
│   ├── Dockerfile
│   └── nginx.conf
├── api/
│   ├── Dockerfile
│   ├── app.py
│   ├── __init__.py
│   └── models/
├── docker-compose.yml
└── run_docker.sh
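To give an idea of what api/app.py might contain, here is a minimal sketch of a prediction endpoint. The text suggests the template uses Flask; this example swaps in a plain WSGI callable from the standard library so that it runs without third-party packages. The /predict route, the JSON payload shape and the placeholder predict function are all assumptions.

```python
import json


def predict(features):
    """Placeholder for a real model loaded from models/; returns the mean."""
    return sum(features) / len(features)


def _respond(start_response, status, payload):
    """Serialize a JSON payload and send it with the given HTTP status."""
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def app(environ, start_response):
    """WSGI callable; a WSGI server such as Gunicorn can serve it as app:app."""
    if (environ.get("REQUEST_METHOD") != "POST"
            or environ.get("PATH_INFO") != "/predict"):
        return _respond(start_response, "404 Not Found", {"error": "not found"})
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
    features = payload.get("features") or []
    if not features:
        return _respond(start_response, "400 Bad Request",
                        {"error": "features must be a non-empty list"})
    return _respond(start_response, "200 OK",
                    {"prediction": predict(features)})
```

In the layout above, Gunicorn would run this callable inside the api container, with Nginx in front of it as a reverse proxy.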

How to Perform Continuous Model Testing for PyTorch and TensorFlow?

Feature Test

  • The value of each feature lies between its threshold values.
  • Whether feature importance has changed compared with the previous version.
  • The feature's relationship with the outcome variable, in terms of correlation coefficients.
  • Feature unsuitability, checked by testing RAM usage, inference latency, etc.
  • Whether a generated feature violates data compliance requirements.
  • Code coverage of the feature-generating code.
  • Static code analysis results for the feature-generating code.
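Two of the checks listed above, the threshold test and the correlation test, can be sketched in plain Python. The helper names and the 0.1 drift tolerance are assumptions chosen for illustration. In a real pipeline these checks would typically run as automated tests on each new batch of data.

```python
import math
from typing import Sequence


def within_thresholds(values: Sequence[float], low: float, high: float) -> bool:
    """Check that every feature value lies in the closed range [low, high]."""
    return all(low <= value <= high for value in values)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between a feature and the outcome."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_drifted(current: float, previous: float,
                        tolerance: float = 0.1) -> bool:
    """Flag a feature whose correlation with the outcome moved too far
    from the value recorded for the previous model version."""
    return abs(current - previous) > tolerance
```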

