Marek Linka, Senior Software Developer


Operationalizing machine learning models 2/3 – From model to service

The first part of this series of articles introduced the Azure Machine Learning Workbench and showed us how to set up the resources required to operationalize our machine learning models. In this article, we will see how to get our model into Azure and run it locally for testing purposes. Let's get to it!

File and folder structure

From this point on, we will work with several files (the ML model, the scoring file, the web service schema, etc.). I recommend putting all of these files under a single root directory, similar to this structure:
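A possible layout might look like this (the file names are illustrative and match the ones used later in this article):

```
<root>/
├── model/
│   └── model.h5      # the exported ML model
├── score.py          # the scoring / web service file
├── schema.json       # the generated web service schema
└── conda.yml         # the Conda dependencies file
```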

I will cover the specific files as we go through the deployment configuration process. The bottom line is that everything works best when it's in one place. In the CLI, navigate to this directory.

Model registration

To get our machine learning model into Azure, we first need to export it to a file (or several). This will differ from framework to framework, but for our example model written in Keras, it's as simple as model.save("model.h5").


When you have the file, add it to the target directory (chosen above) under a "model" subdirectory. We now need to set the target model management account (remember, we named it "amlwmodelmanagement" in the previous article):

az ml account modelmanagement set -n amlwmodelmanagement -g amlw


To register the model with the management account, run the following:

az ml model register --model model/model.h5 --name AMLWModel

You should see a confirmation like this one:

Mark down the ID of the model – we will need to specify it later when creating the web service manifest.

Notice that the model is referred to by the "model/model.h5" path – the CLI registers it under the path relative to the current directory. We will need to remember this later when we attempt to load and use the model.

If you now go to the Azure portal and look into the model management UI, you should be able to see your model:

So the model is ready – let’s start creating the web service to make use of it.

Web service schema

We are going to write some Python code next. Open the root directory containing our model's directory and create a file called "score.py". Put the following code inside the file:

model = None

def init():
    # this will load and initialize our model
    pass

def run(inputs):
    # this will call the loaded model and do the scoring
    pass

def generate_api_schema():
    # this creates an input and output schema for our web service and saves it into a JSON file
    from azureml.api.schema.dataTypes import DataTypes
    from azureml.api.schema.sampleDefinition import SampleDefinition
    from azureml.api.realtime.services import generate_schema

    inputs = {"input_array": SampleDefinition(DataTypes.STANDARD, "dGVzdA==")}
    outputs = {"coordinates": SampleDefinition(DataTypes.STANDARD, [1, 2, 4, 5, 8, 9, 5, 6])}
    print(generate_schema(inputs=inputs, filepath="schema.json", run_func=run, outputs=outputs))

generate_api_schema()


The actual content for "inputs" and "outputs" depends on your model and use case. For our example model (a computer vision task), the "run" function takes a base64-encoded byte array representing an image, transforms it into a valid input for our model, then calls the model. We'll get to the implementation in short order. There are various configuration options available to customize the schema; you can find details in the documentation.
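As a side note, sample values like "dGVzdA==" are just base64-encoded strings. A minimal sketch of producing such a value from raw bytes (the helper name is illustrative, not part of the Azure ML SDK):

```python
import base64

def encode_image_bytes(image_bytes: bytes) -> str:
    """Encode raw image bytes into the base64 string the "input_array" field expects."""
    return base64.b64encode(image_bytes).decode("ascii")

# "dGVzdA==", the sample used in the schema above, is simply base64 of b"test"
print(encode_image_bytes(b"test"))  # dGVzdA==
```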

Now run the code:

python score.py
This will create a "schema.json" file in our working directory. This file will be used to describe our web service endpoint and validate inputs. When you are satisfied with your schema, delete the last line of the file – it's not necessary anymore as the schema is already created.

Now let’s actually implement the web service!

Web service endpoint implementation

First off, the whole implementation will live in the score.py file. Loading the model is easy – it will be provided automatically under the relative path under which it was registered – in our case, this path is "model/model.h5":

def init():
    from keras.models import load_model as load_keras_model
    from PIL import Image

    local_path = 'model/model.h5'
    global model
    model = load_keras_model(local_path)

Again, the actual implementation will differ depending on what you want to do. At any rate, our model is loaded and ready to be used.

The run function takes in the arguments defined by the schema, does any pre-processing we need, triggers the model, then returns the prediction:

def run(input_array):
    import base64
    import io
    import numpy as np
    from PIL import Image

    img = Image.open(io.BytesIO(base64.b64decode(input_array)))
    data = np.array(img)
    data = data / 255
    data = np.expand_dims(data, axis=0)

    prediction = model.predict(data)
    return { "coordinates": prediction.tolist()[0] }

Since our example model takes an image as input, we convert the base64-encoded string into a byte array, turn it into an image, normalize it, then feed it into the model. We take the prediction (an array of 8 numbers) and wrap it in a result object.
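To make the transformation concrete, here is a minimal sketch of the same steps using plain Python lists instead of PIL and numpy (the pixel values are hypothetical):

```python
import base64

def preprocess(input_array: str) -> list:
    raw = base64.b64decode(input_array)   # base64 string -> raw bytes
    pixels = [b / 255 for b in raw]       # normalize each byte into [0, 1]
    return [pixels]                       # add a batch dimension (axis 0)

sample = base64.b64encode(bytes([0, 51, 255])).decode("ascii")
batch = preprocess(sample)
print(batch)  # [[0.0, 0.2, 1.0]]
```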

Once again, this code can do whatever your model needs it to do. Mileage may vary. It's a good idea to keep the code in the score.py file as short as possible so as not to hinder performance. If I needed to do some preprocessing, such as resizing the input picture to the correct size and converting it to grayscale, I could do that on the caller's side or maybe use a serverless function as a proxy.

With this, the hard parts are over – we can now package everything into a Docker image.

Web service manifest

To package the web service, we first need to create a manifest file for it. This manifest is used by Azure when creating the Docker image. The CLI will help us:

az ml manifest create --manifest-name AMLWManifest -f score.py -r python -i 819cfa3a21614bc5bc56cbc5746343e5 -s schema.json -c conda.yml

There is a lot to parse here, so let's break it down:

  • --manifest-name = the name of the manifest, nothing special
  • -f = the file that will serve incoming requests
  • -r = runtime, either “python” or “spark-py”
  • -i = the ID of the model that will be made available to the service (we received this when we registered the model)
  • -s = the schema.json describing the service interface
  • -c = the Anaconda environment configuration file – this file is used to define what non-standard Python libraries should be installed in the Docker image

The Anaconda file is crucial here – without it, most of our required libraries would be missing from our Docker image and the web service would fail to execute.

Our model requires Keras, TensorFlow, Pillow, and some other dependencies to be installed. A conda file for this setup would look similar to this:

# Version of this configuration file's structure and semantics in AzureML.
# This directive is stored in a comment to preserve the Conda file structure.
# [AzureMlVersion] = 2
name: project_environment
dependencies:
  - python=3.5.2
  - pip:
    - --index-url
    - --extra-index-url
    - azureml-requirements
    - numpy
    - tensorflow
    - keras
    - Pillow
    - azure-ml-api-sdk==0.1.0a10

This file will be parsed by Azure when creating the Docker image, and the indicated packages will be installed. We just need to remember to reference the file in the manifest creation command.

The command will also return some information about the manifest, including its ID. Mark it down for the next step.

Create a Docker image

Now that we have a manifest, we can create the Docker image. Or, more precisely, let Azure create and store it for us (we need to provide the manifest ID here):

az ml image create -n amlwimage --manifest-id 7c8efb6f-823a-4a6a-812f-401c1d13b898

Note that this operation will take some time, depending on how much data (models, packages etc.) is required. When the operation completes, you will get the ID of the image back. This image is automatically stored in the Azure Container Registry instance created within your environment.

Test the service locally

The last step to test and deploy the image locally is to pull and run it – this is where the Docker installation comes into play.

az ml service create realtime --image-id 49c07d92-c12c-42c8-90ba-1821b5be90aa -n AMLWService

This command will pull the image from the Azure Container Registry (it's usually a few GB, so it will take some time) and start it on our local machine. After the container starts, you should see something similar to this:

To see what the actual address of the service is, we can run the following:

az ml service usage realtime -i AMLWService

The command gives us some details about the local instance of the service:

As you can see, the service is listening on port 32770. You can visit the Swagger URL to see the description of the API and use the "score" endpoint to do the actual prediction. The output also gives you example CLI and curl commands.

Ok, we have a service, and everything runs. Now let’s see if something actually happens when we call it!

Since our example takes in base64-encoded image data, it's best to use something other than a command line interface. I prefer Postman, but your tastes might differ.

Notice that the JSON payload conforms to the schema.json we generated and included in the service manifest – it has an "input_array" field containing a base64-encoded string with our test image.
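For completeness, here is a sketch of building such a payload programmatically (the image bytes and the endpoint URL are placeholders – use the address reported by az ml service usage):

```python
import base64
import json

# Placeholder image bytes - in practice, read your test image from a file
image_bytes = b"\x89PNG-placeholder"
payload = json.dumps({"input_array": base64.b64encode(image_bytes).decode("ascii")})
print(payload)

# The payload can then be POSTed to the scoring endpoint, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:32770/score",  # hypothetical local address
#     data=payload.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read())
```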

The service seems to be responding:

That’s it. We now have a self-contained Docker image with our ML model, a web service exposing said model, and a local deployment of the image for testing. And everything seems to be performing well!

Some interesting details

All artifacts created using the Azure ML CLI are stored in the Model Management Account and versioned – if you realize you made a mistake (forgot to add something to your conda file, left a bug in the code, etc.), you can always create a new artifact (model, manifest, image, etc.) with the same name. Unfortunately, all artifacts depending on the one you changed need to be rebuilt – so if you create a new version of the service manifest, you need to create a new Docker image. If you want to modify the model, you need to rebuild both the manifest and the image (not to mention update any and all deployments).

We can see the various versions in the Azure portal, under Model Management.

Doing this often can be very time-consuming. For any kind of production-grade deployment, it would be best to create a set of scripts to automate the process as much as possible. To make this easier, the whole process, from model registration to image creation, can be done in a single az ml command:

az ml service create realtime --model-file [model file/folder path] -f [scoring file, e.g. score.py] -n [your service name] -s [schema file, e.g. service_schema.json] -r [runtime for the Docker container, e.g. spark-py or python] -c [conda dependencies file for additional Python packages]

The fact that the whole service lives in a Docker container makes testing it extremely easy – just set up a test environment with Docker and you can pull the image, start a container, and run automated tests against the service from within your normal CI pipeline.


This marks the end of the second part in the series. We saw how to register models with Azure, how to describe the exposing service, and how to create a Docker image with the service. We also managed to pull the Docker image, start it, and test that our service is actually doing what we want it to do.

At this point we are ready to take the last step – deploying the service into production and consuming it over the internet.

Read previous article: Operationalizing machine learning models 1/3 – Azure ML Workbench

