Facebook partners with AWS on PyTorch 1.5 updates, such as TorchServe to serve models




Facebook’s PyTorch has grown into one of the most popular deep learning frameworks in the world, and today it’s getting a round of updates, including a stable C++ frontend API and new libraries such as TorchServe, a model serving library developed in collaboration with Amazon Web Services (AWS).

The TorchServe library supports both Python and TorchScript models, and it can run multiple versions of a model at the same time or even roll back to a previous version from a single model archive file. More than 80% of cloud machine learning projects using PyTorch run on AWS, Amazon engineers said in a blog post today.
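At a high level, serving a model with TorchServe involves packaging it into an archive and then registering that archive with the server. The sketch below uses the `torch-model-archiver` and `torchserve` command-line tools that ship with the library; the model name, file paths, and image file are illustrative, and the workflow assumes TorchServe is installed and a trained model file is on hand.

```shell
# Package a trained model into a .mar archive for TorchServe.
# densenet161.pt and the model_store directory are illustrative.
torch-model-archiver --model-name densenet161 \
  --version 1.0 \
  --serialized-file densenet161.pt \
  --handler image_classifier \
  --export-path model_store

# Start TorchServe and register the archived model.
torchserve --start --model-store model_store --models densenet161=densenet161.mar

# Send an image to the inference endpoint (default port 8080).
curl http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg
```

The `--version` flag on the archiver is what enables the versioning described above: several archives of the same model can sit in the model store, and the management API can shift traffic between them or roll back.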

PyTorch 1.5 also includes TorchElastic, a library that lets AI practitioners scale cloud training resources up or down as needed, or recover when things go wrong.

An integration of TorchElastic with Kubernetes on AWS provides container orchestration and fault tolerance, which means Kubernetes users no longer have to manually manage the services associated with model training in order to use TorchElastic.

TorchElastic is designed for use in large distributed machine learning projects. PyTorch product manager Joe Spisak told VentureBeat that TorchElastic is used for large-scale NLP and computer vision projects at Facebook and is now being brought to public cloud environments.


“What TorchElastic does is basically allow you to vary your training across multiple nodes without the training work really failing; it will continue gracefully, and once those nodes come back online, you can basically restart training and start calculating variants on those nodes as they arise,” said Spisak. “We saw [elastic fault tolerance] as an opportunity to partner with Amazon again, and we also have some Microsoft pull requests that we have merged. So basically we expect virtually all three major cloud providers to natively support that so that users can tolerate elastic failure in Kubernetes in their clouds.”
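The elastic behavior Spisak describes is driven by how a job is launched: instead of a fixed node count, TorchElastic takes a minimum-to-maximum range and a rendezvous backend that workers use to rejoin after failures. A minimal launch sketch, using the `torchelastic.distributed.launch` module from the TorchElastic release of this era; the etcd endpoint, job ID, and training script are illustrative:

```shell
# Launch an elastic training job that can run on anywhere from 1 to 4 nodes,
# with 8 workers per node. etcd-server:2379 and train.py are placeholders.
python -m torchelastic.distributed.launch \
  --nnodes=1:4 \
  --nproc_per_node=8 \
  --rdzv_id=my_training_job \
  --rdzv_backend=etcd \
  --rdzv_endpoint=etcd-server:2379 \
  train.py
```

The `--nnodes=1:4` range is the key to the graceful behavior described above: if a node drops out, the job keeps running as long as at least one node remains, and when nodes return they re-rendezvous through etcd and rejoin the computation.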

Work between AWS and Facebook on the libraries started in mid-2019, Spisak said.

Also new today: a stable version of the C++ frontend API for PyTorch, which makes it possible to move models between the Python API and the C++ API.

“The big deal here is that with the upgrade to C++, with this version, we are now at full parity with Python. Basically, you can use all the packages you can use in Python, all the modules, optim, etc. All of these are now available in C++; they’re at full parity, documentation included. And this is something that researchers have been waiting for and, frankly, production users have been waiting for, and it basically gives everyone the ability to move between Python and C++,” said Spisak.

An experimental version of custom C++ classes was also introduced today. PyTorch’s C++ implementations have been particularly important to makers of reinforcement learning models, Spisak said.

PyTorch 1.5 also brings updates to the torchvision, torchtext, and torchaudio libraries, alongside the new TorchElastic and TorchServe libraries.

Version 1.5 also includes updates to the torch_xla package for using PyTorch with Google Cloud TPUs or TPU Pods. Work on the XLA compiler dates back to conversations between employees of the two companies that began in late 2017.

The release of PyTorch 1.5 today follows the release of version 1.4 in January, which added Java support and mobile customization options. Facebook first introduced Google Cloud TPU support, quantization, and PyTorch Mobile at its annual PyTorch developer conference, held in San Francisco in October 2019.

PyTorch 1.5 supports only Python 3; Python 2 is no longer supported.
