Azure ML Managed Online Endpoints

Secure, Scalable, and Reliable Model Deployment

Manu Bhardwaj
4 min read · Apr 19, 2023

By Manu Bhardwaj | Day 18 of #30DaysOfAzureAI

Machine learning model deployment can be a daunting task, especially when it comes to ensuring security, scalability, and reliability in production. As an AI enthusiast, I’m always on the lookout for solutions that simplify this process. Today, I’m excited to share my experience with Azure Machine Learning and its new managed online endpoints feature.

Managed online endpoints in Azure ML offer a host of benefits that have made me a huge fan. I’ve worked with Azure Container Instances (ACIs) before, but after exploring managed online endpoints, I discovered key advantages that make them a superior choice:

🔐 Built-in Security

Managed online endpoints use key-based authentication out of the box, with the key passed as a Bearer token. If you need finer control over token expiration, Azure ML access tokens are also supported.

🔵 Native Blue/Green Deployments

You can create multiple deployments behind a single endpoint and split traffic between them by percentage. You can also mirror traffic to a shadow deployment to compare model performance without affecting clients.

🚀 Auto-Scaling with Azure Monitor

Endpoints can scale automatically based on traffic using Azure Monitor autoscale rules, keeping your service responsive during traffic spikes.

While I faced some initial challenges due to differences from ACI, I’m here to provide a comprehensive quickstart guide to help you leverage the capabilities of Azure ML managed online endpoints. Special thanks to Vlad Iliescu for his valuable insights and quickstart on this topic!

Let’s get started.

Prerequisites

Ensure you have the following tools installed:

On macOS, install the Azure CLI and the Azure ML extension using the commands below:

# Install Azure CLI
brew update && brew install azure-cli
# Add the Azure ML extension
az extension add -n ml
# Log in with your Azure account and set the active subscription
az login
az account set --subscription "<your-subscription-id>"

Managed Online Endpoints

Managed online endpoints consist of three key components: the inference script, the endpoint, and the deployment. Let’s go through each.

The Inference Script

The inference script translates client requests into model inputs and model outputs back into responses. Azure ML requires it to define two functions, init and run:

  • init: Initializes the container, loads the model into memory, and performs one-time operations.
  • run: Invoked for each API call; translates the input, invokes the model, and returns a formatted result. Keep it fast, since it runs on every request.

The Endpoint

The endpoint is an HTTPS path providing an interface for clients to send requests and receive model predictions. It offers authentication, SSL termination, and a stable scoring URI.

Create an endpoint using a YAML file and the following command:

az ml online-endpoint create -f endpoint.yml -g "<your-resource-group>" -w "<your-workspace>"
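For reference, the endpoint.yml referenced above can be as small as a name and an auth mode. This is a minimal sketch using the endpoint name from the examples later in this post:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: whats-in-a-name
auth_mode: key
```

auth_mode: key gives you the default key-based (Bearer token) authentication described earlier; set it to aml_token if you want expiring Azure ML access tokens instead.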

The Deployment

A deployment is a containerized environment that runs your inference script: a Docker image with a web server, your dependencies, and your model.

Create a deployment using a YAML file and the following commands:

az ml online-deployment create -f blue-deployment.yml -g "<your-resource-group>" -w "<your-workspace>"
az ml online-endpoint update -n whats-in-a-name --traffic "blue=100" -g "<your-resource-group>" -w "<your-workspace>"
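A blue-deployment.yml like the one referenced above might look as follows. The model and environment names, code path, and VM size are placeholders; substitute your own registered assets:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: whats-in-a-name
model: azureml:my-model:1            # placeholder: your registered model
code_configuration:
  code: ./src                        # folder containing the scoring script
  scoring_script: score.py
environment: azureml:my-environment:1  # placeholder: your registered environment
instance_type: Standard_DS2_v2
instance_count: 1
```

The second command above then routes 100% of the endpoint's traffic to this "blue" deployment; for a blue/green rollout you would later add a "green" deployment and shift the traffic split gradually.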

Connecting to a Managed Endpoint

With the endpoint and deployment ready, retrieve the inference URL and authentication token using the commands below:

az ml online-endpoint show -n whats-in-a-name -g "<your-resource-group>" -w "<your-workspace>" --query "scoring_uri"
az ml online-endpoint get-credentials -n whats-in-a-name -g "<your-resource-group>" -w "<your-workspace>" --query "primaryKey"

With these credentials, you can send requests to the endpoint using curl or any other tool of your choice:

curl <your-inference-url> -H 'Authorization: Bearer <your-token>' -H 'Content-Type: application/json' --data-binary @sample-request.json
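The shape of sample-request.json depends entirely on what your run function expects. Assuming the {"data": [...]} convention from the scoring-script sketch earlier, it could look like this:

```json
{
  "data": [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0]
  ]
}
```

Each inner list is one row of features, and the endpoint returns one prediction per row.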

Conclusion

Azure ML managed online endpoints provide a powerful solution for deploying machine learning models to production. With built-in security, scalability, and reliability, you can confidently deploy models as accessible web APIs for clients.

I hope this guide has been helpful, and I encourage you to explore Azure ML managed online endpoints for your own machine learning projects. A special thank you to Vlad Iliescu for his insightful quickstart on deploying models with Azure ML managed online endpoints.

As we continue our journey through #30DaysOfAzureAI, stay tuned for more topics and insights into the exciting world of Azure AI. If you have questions or feedback, feel free to share in the comments below.

🙏 Thank you for joining us on Day 18 of #30DaysOfAzureAI!

#AzureAiDevs #Roadmap #30DaysOfAzureAI #AiMonthly #Community
