#VS2019 – ML.NET Model Builder training using GPU, CPU and … Azure !

Hi !

In my previous posts on Model Builder, I showed the differences between using the CPU and the GPU to train a model. There is a 3rd option, which uses an Azure ML Experiment and performs the training in the cloud.

It took me some time to set up this environment, mostly because I tried to use an existing Azure Compute Instance that I already had, and Model Builder needs a Compute Cluster.

It is also important to note that you need to create a dedicated GPU-based Compute Cluster. There are costs associated with these resources, so run your numbers before you start.

And here we go: now we can move forward with the Model Builder assistant.

I ran some tests using a small image data set, and it was awesome. Training a 24-image dataset took between 8 and 9 minutes, and the results were very good. A nice bonus here is the chance to get more details directly in the Azure Machine Learning portal.

We can dig into each experiment and take a look at metrics like F1 Score, Precision, Recall and more.
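For readers who want to relate those portal metrics to raw predictions, here is a minimal sketch of how Precision, Recall and F1 Score are usually computed for one class. The label lists and the function name are made up for illustration; this is not the code Azure ML runs, only the standard definitions.

```python
def precision_recall_f1(true_labels, predicted_labels, positive_label):
    """Compute precision, recall and F1 for a single class (one-vs-rest)."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: the class was predicted and it was correct.
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    # False positives: the class was predicted but the truth was something else.
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    # False negatives: the truth was this class but something else was predicted.
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions, for illustration only.
truth = ["fish", "fish", "flower", "human", "human", "flower"]
predicted = ["fish", "flower", "flower", "human", "human", "fish"]

for label in ("fish", "flower", "human"):
    print(label, precision_recall_f1(truth, predicted, label))
```

F1 is the harmonic mean of Precision and Recall, so it only gets high when both are high.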

Each Model Builder Image Classification project triggers several Azure ML Experiments:

  • Automated ML
  • HyperDrive
  • Preparation
  • Script

The Script experiment is the one we can open to get access to some detailed logs, and also to the ONNX model.

So, I decided to go big and test this using the set of images from a Kaggle challenge [State Farm Distracted Driver Detection] (see references). This is a 1 GB image set: 22424 images in ten categories.

The 1st step is to upload the 22424 images to an Azure resource, which took some time.

And then I started tracking the progress in the Azure Machine Learning portal.

And after some time, the process triggered a timeout.

The details on the 4 experiments suggest that some limit was exceeded. I’m not sure whether it was on the IDE side or on the Azure side.

However, the experiment in charge of training the model [Run 12] produced some successful models. Accuracy, F1 and precision were getting better.

Reading some logs, I can see that the error was triggered on Epoch 8. I need to spend more time here to figure out what happened!

Note: I already reported the issue to the GitHub Repo.

As a final thought: using Azure as the training environment in Model Builder is an amazing option. A big dataset may be a problem, or maybe my quota is the problem. Anyway, with smaller datasets it worked great. I’ll keep an eye on this issue and update the blog with any news.

Happy coding!

Greetings

El Bruno

References

#VS2019 – ML.NET Model Builder GPU vs CPU test: 4 times faster !

Hi !

Yesterday I wrote about the new options that we have to train models in ML.Net Model Builder. The main news is that we have the option to use our GPU to train models.

Quick recap: Model Builder supports 3 specific training environments:

  • Local (CPU)
  • Local (GPU)
  • Azure

Yesterday I tested training a small image recognition model using the CPU and the GPU, and the training times were very similar. The image training set was small, and I also hadn’t configured my GPU & CUDA environment, so I decided to raise the stakes and test with something a little more challenging.

For this new test, I’ll use a set of images from a Kaggle challenge [State Farm Distracted Driver Detection] (see references). This is a 1 GB image set: 22424 images in ten categories.

Of course, I used Model Builder to train an Image Classification scenario. Here is a preview of the configuration:

CPU Train

This training scenario was much more resource-heavy than yesterday’s easy test. The total time was 39.2 minutes. Here are more details:

Total experiment time : 2353.6729442 Secs
------------------------------------------------------------------------------------------------------------------
|                                                     Summary                                                    |
------------------------------------------------------------------------------------------------------------------
|ML Task: image-classification                                                                                   |
|Dataset: C:\Users\bruno\AppData\Local\Temp\5e873581-2dab-4d46-911d-cfc0a0455eb1.tsv                             |
|Label : Label                                                                                                   |
|Total experiment time : 2353.6729442 Secs                                                                       |
|Total number of models explored: 1                                                                              |
------------------------------------------------------------------------------------------------------------------

GPU Train

Using the GPU, training took about 1/4 of the CPU time: 9.7 minutes!

Total experiment time : 581.1946062 Secs
------------------------------------------------------------------------------------------------------------------
|                                                     Summary                                                    |
------------------------------------------------------------------------------------------------------------------
|ML Task: image-classification                                                                                   |
|Dataset: C:\Users\bruno\AppData\Local\Temp\cccb2b3f-dbce-45e5-b17e-872b6cc3f116.tsv                             |
|Label : Label                                                                                                   |
|Total experiment time : 581.1946062 Secs                                                                        |
|Total number of models explored: 1                                                                              |
------------------------------------------------------------------------------------------------------------------
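To double-check the comparison, here is a small sketch that converts the two "Total experiment time" values from the logs above into minutes and computes the speedup. Only the two timings come from the logs; the variable and function names are mine.

```python
# Timings copied from the two Model Builder summaries above.
cpu_seconds = 2353.6729442
gpu_seconds = 581.1946062


def to_minutes(seconds):
    """Convert a duration in seconds to minutes."""
    return seconds / 60


# How many times faster the GPU run finished compared to the CPU run.
speedup = cpu_seconds / gpu_seconds

print(f"CPU: {to_minutes(cpu_seconds):.1f} minutes")
print(f"GPU: {to_minutes(gpu_seconds):.1f} minutes")
print(f"Speedup: {speedup:.2f}x")
```

This is a single run per environment, so the ratio is an indication and not a benchmark.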

Conclusion

GPUs are great for deep learning because the kinds of calculations they were designed to process are the same as those encountered in deep learning. Images, videos, and other graphics are represented as matrices, so when you perform any operation, such as a zoom-in effect or a camera rotation, all you are doing is applying a mathematical transformation to a matrix.
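To make the "transformation on a matrix" idea concrete, here is a minimal sketch in plain Python, assuming 2D points and 2x2 matrices. The helper names are made up for this example, and a GPU would run this kind of arithmetic on many points in parallel instead of one by one.

```python
import math


def apply_matrix(matrix, point):
    """Multiply a 2x2 matrix by a 2D point (matrix-vector product)."""
    return tuple(sum(m * p for m, p in zip(row, point)) for row in matrix)


def scale_matrix(factor):
    """Matrix for a uniform zoom by the given factor."""
    return [[factor, 0.0], [0.0, factor]]


def rotation_matrix(degrees):
    """Matrix for a counter-clockwise rotation around the origin."""
    angle = math.radians(degrees)
    return [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]


# Three hypothetical points of a shape.
corners = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

zoomed = [apply_matrix(scale_matrix(2.0), p) for p in corners]
rotated = [apply_matrix(rotation_matrix(90), p) for p in corners]

print("zoomed :", zoomed)
print("rotated:", rotated)
```

The rotated coordinates come out as floating point values very close to whole numbers, because of how sine and cosine are computed.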

Even if you don’t have a powerful GPU (like me!), you may want to give it a try when you are training a model. The ML.Net Model Builder documentation includes a series of steps to configure a CUDA 10.0 environment, which is good enough for most NVIDIA graphics cards.

Happy coding!

Greetings

El Bruno

References

#VS2019 – ML.NET Model Builder GPU Support, awesome !

Hi !

Machine Learning.Net (ML.Net) includes a visual / step-by-step option for Auto ML: Model Builder. Even though ML.Net is based on .Net Core, and we can use it on Windows, Linux or Mac, Model Builder is only available for Visual Studio 2019.

And, in the latest preview version, Model Builder supports 3 specific training environments:

  • Local (CPU)
  • Local (GPU)
  • Azure

And as you can imagine, if you have a decent GPU, the 2nd option is a must. So, let’s review it.

The first step is to install the [ML.NET Model Builder GPU Support (Preview)] extension (see references).

This will take a couple of minutes, and it will add the GPU option to our Model Builder projects. Once we have the extension installed, when we create a new ML.Net Model Builder scenario, we can choose the CPU, GPU, or Azure environment.

In each one, we can see the different options available.

For this test, I used a sample Image Recognition scenario with 24 images for 3 different labels: fish, flower and human. This is a very simple scenario and ResNet will easily handle it.

CPU Train

So, the total time to train an Image Recognition model with my CPU is:

Total experiment time : 230.8386996 Secs

Here are some more details:

------------------------------------------------------------------------------------------------------------------
|                                                     Summary                                                    |
------------------------------------------------------------------------------------------------------------------
|ML Task: image-classification                                                                                   |
|Dataset: C:\Users\bruno\AppData\Local\Temp\81efe1ab-c776-4071-b0ea-b7c93c65b239.tsv                             |
|Label : Label                                                                                                   |
|Total experiment time : 230.8386996 Secs                                                                        |
|Total number of models explored: 1                                                                              |
------------------------------------------------------------------------------------------------------------------

GPU Train

So, the total time to train an Image Recognition model with my GPU is:

Total experiment time : 228.1201648 Secs

More Details

------------------------------------------------------------------------------------------------------------------
|                                                     Summary                                                    |
------------------------------------------------------------------------------------------------------------------
|ML Task: image-classification                                                                                   |
|Dataset: C:\Users\bruno\AppData\Local\Temp\727e5bf8-bbe0-4d13-9513-043453a06bec.tsv                             |
|Label : Label                                                                                                   |
|Total experiment time : 228.1201648 Secs                                                                        |
|Total number of models explored: 1                                                                              |
------------------------------------------------------------------------------------------------------------------

CUDA and GPU must be configured

As you can see, the time is very similar in both scenarios, and there is a good reason for this. I just installed a brand new Windows 10 environment, and I haven’t configured my GPU / CUDA.

Luckily for us, there is an option in the Environment step which allows us to check whether our GPU will work.

Last time, it took me almost a day to configure my NVIDIA CUDA environment. So, as soon as I get this up and running, I’ll update this post!

Happy coding!

Greetings

El Bruno

References

#VS2019 – Let’s do some image classification with #MLNET Model Builder! (AKA, let’s create an image classifier model without a line of code)

Hi!

I’m getting ready for my last event of the year, and I just realized that in the latest update of Model Builder, we have the chance to build our own Image Classifier scenario. Let’s start with the official Model Builder definition (see references):

ML.NET Model Builder provides an easy to understand visual interface to build, train, and deploy custom machine learning models. Prior machine learning expertise is not required. Model Builder supports AutoML, which automatically explores different machine learning algorithms and settings to help you find the one that best suits your scenario.

Working with images has been supported for a while in Machine Learning.Net. In the Machine Learning.Net Samples, we have sample scenarios like:

Image Classification Model Training – Preferred API (Based on native TensorFlow transfer learning)

In this sample app you create your own custom image classifier model by natively training a TensorFlow model from ML.NET API with your own images.

We even have an amazing tutorial to create our own image classification model from scratch:

Tutorial: Generate an ML.NET image classification model from a pre-trained TensorFlow model

Learn how to transfer the knowledge from an existing TensorFlow model into a new ML.NET image classification model. The TensorFlow model was trained to classify images into a thousand categories. The ML.NET model makes use of part of the TensorFlow model in its pipeline to train a model to classify images into 3 categories.

Training an Image Classification model from scratch requires setting millions of parameters, a ton of labeled training data and a vast amount of compute resources (hundreds of GPU hours). While not as effective as training a custom model from scratch, transfer learning allows you to shortcut this process by working with thousands of images vs. millions of labeled images and build a customized model fairly quickly (within an hour on a machine without a GPU). This tutorial scales that process down even further, using only a dozen training images.

And now, I found that Model Builder also supports an Image Classification scenario.

It follows the Model Builder standard workflow, starting with the selection of the scenario:

[Image: Model Builder, select scenario]

And then selecting a folder with the Images.

[Image: Model Builder, images for training]

Important: Model Builder expects image data to be JPG or PNG files organized in folders that correspond to the categories of the classification.

To load images into Model Builder, provide the path to a single top-level directory:

  • This top-level directory contains one subfolder for each of the categories to predict.
  • Each subfolder contains the image files belonging to its category.
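The folder layout described above can be checked before opening Model Builder. Below is a minimal sketch that walks a top-level directory and uses each subfolder name as the label. The function name and the demo files are hypothetical, and accepting `.jpeg` next to `.jpg` and `.png` is my assumption; this is not how Model Builder itself loads the data.

```python
import tempfile
from pathlib import Path

# Assumption: .jpeg is treated like .jpg; the post only mentions JPG and PNG.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def load_labeled_images(top_level_dir):
    """Return {label: [image paths]}, using each subfolder name as the label."""
    labeled = {}
    for category_dir in sorted(Path(top_level_dir).iterdir()):
        if not category_dir.is_dir():
            continue  # loose files in the top-level folder have no label
        labeled[category_dir.name] = sorted(
            path for path in category_dir.iterdir()
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
        )
    return labeled


# Demo with a throwaway folder tree (empty placeholder files, not real images).
with tempfile.TemporaryDirectory() as root:
    demo_files = {"fish": ["a.jpg", "b.png"], "flower": ["c.jpg", "notes.txt"]}
    for label, names in demo_files.items():
        folder = Path(root) / label
        folder.mkdir()
        for name in names:
            (folder / name).touch()
    counts = {label: len(paths) for label, paths in load_labeled_images(root).items()}

print(counts)
```

A quick count per label like this helps to spot empty or unbalanced categories before starting a long training run.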

Once the folder is selected, we can see a preview of the images and labels loaded from the folder.

[Image: Model Builder, folder selected with image preview]

For more information about how to organize images for this scenario, refer to Load training data into Model Builder.

And now we start the training process. This may take a while, depending on your hardware. I’m using the sample set of drawings that we used on the InsiderDev Tour for Custom Vision. These are 24 drawings with 3 labels, and on a PC with an i7, 32 GB of RAM and an SSD, the training process took a little longer than 2 minutes.

[Image: Model Builder, training complete]

Once the training is complete, we have decent accuracy in our model, so it’s time to test. Before Model Builder’s last step, we have the chance to test the model with some test images.

Using one of the images that I created at Ignite in Orlando, the trained model predicts a human with a 99% score.

[Image: Model Builder, trained model tested with an image]

And the final step is to add the generated model and code to our project. I’ll write about how to use this generated code in the near future.

[Image: Model Builder, code generated]

Happy Coding!

Greetings @ Burlington

El Bruno

References

#MLNET – Testing Machine Learning Model Builder preview. It’s so cool !

Hi !

Last week Machine Learning.Net 1.0 was officially announced during Build 2019, and the ML.Net team also announced a set of ML tools related to ML.Net.

One of the most interesting ones is Machine Learning Model Builder. You can get more information about Model Builder on the official website.

ML.NET Model Builder provides an easy to understand visual interface to build, train, and deploy custom machine learning models. Prior machine learning expertise is not required. Model Builder supports AutoML, which automatically explores different machine learning algorithms and settings to help you find the one that best suits your scenario.

Machine Learning Model Builder

The tool is in Preview, but it’s still an amazing one for playing around with ML. So I decided to give it a try with my small data set of kids, the one I use in the Machine Learning.Net demos.

The structure of my CSV file is very simple with just 3 columns: Age, Gender and Label.

However, the first time I ran the scenario I got the following error.

Inferring Columns ...
Creating Data loader ...
Loading data ...
Exploring multiple ML algorithms and settings to find you the best model for ML task: regression
For further learning check: https://aka.ms/mlnet-cli
|     Trainer                             RSquared Absolute-loss Squared-loss RMS-loss  Duration #Iteration      |
[Source=AutoML, Kind=Trace] Channel started
Exception occured while exploring pipelines:
Provided label column 'Label' was of type String, but only type Single is allowed.
System.ArgumentException: Provided label column 'Label' was of type String, but only type Single is allowed.
   at Microsoft.ML.CLI.Program.<>c__DisplayClass1_0.<Main>b__0(NewCommandSettings options)
   at Microsoft.ML.CLI.CodeGenerator.CodeGenerationHelper.GenerateCode()
Please see the log file for more info.
Exiting ...

Which makes a lot of sense: my Label column is a String and Model Builder expects a Single data type. So, I updated my data file, replacing the labels with numbers, and I was ready for a 2nd test.
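The label replacement can be scripted instead of edited by hand. Here is a minimal sketch, assuming a CSV with the Age, Gender and Label columns; the sample rows and label values are invented for illustration. In .NET, Single is a 32-bit floating point number, so each text label is mapped to a float.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the post.
raw_csv = """Age,Gender,Label
3,F,toddler
8,M,kid
2,M,toddler
"""


def encode_labels(csv_text, label_column="Label"):
    """Replace each distinct text label with a number, in order of first appearance."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    mapping = {}
    for row in rows:
        label = row[label_column]
        if label not in mapping:
            mapping[label] = float(len(mapping))  # 0.0, 1.0, 2.0, ...
        row[label_column] = mapping[label]
    return rows, mapping


rows, mapping = encode_labels(raw_csv)
print(mapping)
for row in rows:
    print(row)
```

Keep the mapping around, because it is the only way to translate a numeric prediction back to the original label.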

This time the training process started fine; however, I noticed that using just a small training dataset didn’t trigger any comparison between different algorithms. So I created a much bigger training dataset, and then I got the training process up and running.

In the end, the results are the ones below, and they are very interesting. I do most of my demos using a MultiClass SDCA trainer, and AutoML suggests that I use a LightGBM trainer. This will be part of my Machine Learning.Net talks in the future for sure.

You can download the Visual Studio extension from https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet/model-builder and remember that we can talk about this on the Visual Studio 2019 event with the Mississauga .Net User Group in a couple of weeks!

Happy Coding!

Greetings @ Toronto

El Bruno