Hi!
A few days ago, the ML.Net team released version 0.6.0 of Machine Learning.Net, and one of the most important changes is in the way we use the ML.Net API.
In my ML.Net sessions I usually walk through a prediction scenario that assigns a Label to a person: baby, child or teenager. It uses a small data set with information like name, age and gender.
As you can see from the previous image, my training data set has the columns Name, Age, Gender and Label. One important detail is that I also need 2 .Net classes with fields to represent the rows of my training data set and the expected prediction.
```csharp
public class AgeRangeData
{
    [Column(ordinal: "0")]
    public float AgeStart;

    [Column(ordinal: "1")]
    public float AgeEnd;

    [Column(ordinal: "2", name: "Label")]
    public string Label;
}

public class AgeRangePrediction
{
    [ColumnName("PredictedLabel")]
    public string PredictedLabels;
}
```
In version 0.5.0 of ML.Net, the way of working was based on creating a pipeline with the following steps:
- Define the data model and load training data
- Define the features and labels
- Select a Trainer and train the model
The generated model could then be used to make predictions. The following example shows it in 20 lines of code:
```csharp
static void Main(string[] args)
{
    var fileName = "AgeRanges.csv";

    var pipeline = new LearningPipeline();
    pipeline.Add(new TextLoader(fileName).CreateFrom<AgeRange>(separator: ',', useHeader: true));
    pipeline.Add(new Dictionarizer("Label"));
    pipeline.Add(new ColumnConcatenator("Features", "Age"));
    pipeline.Add(new StochasticDualCoordinateAscentClassifier());
    pipeline.Add(new PredictedLabelColumnOriginalValueConverter { PredictedLabelColumn = "PredictedLabel" });

    var model = pipeline.Train<AgeRange, AgeRangePrediction>();

    Predict(model, "john", 9, "M");
    Predict(model, "mary", 14, "M");
    Predict(model, "laura", 2, "M");
    Console.ReadLine();
}
```
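The Main method above calls a Predict helper and uses an AgeRange class, and neither is shown in this post. What follows is only a sketch of what they could look like, not the code from the repository. It assumes the ML.Net 0.5.0 NuGet package, the column order Name, Age, Gender, Label in the CSV file, and a Predict method that lives in the same class as Main. AgeRangePrediction is repeated from above so the fragment compiles on its own.

```csharp
using System;
using Microsoft.ML;              // PredictionModel<TInput, TOutput> (ML.Net 0.5.0)
using Microsoft.ML.Runtime.Api;  // [Column] and [ColumnName] attributes

// ASSUMPTION: shape of one training row; the column order is a guess
// based on the columns Name, Age, Gender and Label described in the post.
public class AgeRange
{
    [Column(ordinal: "0")]
    public string Name;

    [Column(ordinal: "1")]
    public float Age;

    [Column(ordinal: "2")]
    public string Gender;

    [Column(ordinal: "3", name: "Label")]
    public string Label;
}

// Repeated from the post so this fragment is complete.
public class AgeRangePrediction
{
    [ColumnName("PredictedLabel")]
    public string PredictedLabels;
}

static class Program
{
    // The Main method from the listing above would go here.

    // ASSUMPTION: hypothetical helper matching the calls Predict(model, "john", 9, "M").
    static void Predict(PredictionModel<AgeRange, AgeRangePrediction> model,
                        string name, float age, string gender)
    {
        // Build a single input row; Label is left empty because it is what we predict.
        var input = new AgeRange { Name = name, Age = age, Gender = gender };

        // Run the trained pipeline on that row.
        AgeRangePrediction prediction = model.Predict(input);

        Console.WriteLine($"{name} ({age}, {gender}) -> {prediction.PredictedLabels}");
    }
}
```

Since the pipeline only concatenates Age into Features, the Name and Gender values are carried along in the input row but do not influence the prediction.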
With version 0.6.0, the way the API is used has changed a lot. The best thing to do at this point is to read the original post by Cesar De la Torre, where he explains what is new in this version (see references). However, I think the same code, adapted to version 0.6.0, is a good way to give an idea of how flexible the new API is:
```csharp
static void Main(string[] args)
{
    var dataPath = "AgeRangeData.csv";
    var env = new LocalEnvironment();
    var reader = TextLoader.CreateReader(env, ctx => (
        Age: ctx.LoadFloat(1),
        Label: ctx.LoadText(3)),
        separator: ',', hasHeader: true);
    var trainData = reader.Read(new MultiFileSource(dataPath));

    var classification = new MulticlassClassificationContext(env);
    var learningPipeline = reader.MakeNewEstimator()
        .Append(r => (
            r.Label,
            Predictions: classification.Trainers.Sdca
                (label: r.Label.ToKey(),
                 features: r.Age.AsVector())))
        .Append(r => r.Predictions.predictedLabel.ToValue());

    var model = learningPipeline.Fit(trainData);

    var predictionFunc = model.AsDynamic.MakePredictionFunction<AgeRangeNewApi, AgeRangePredictionNewApi>(env);

    var example = new AgeRangeNewApi()
    {
        Age = 6,
        Name = "John",
        Gender = "M"
    };
    var prediction = predictionFunc.Predict(example);
    Console.WriteLine("prediction: " + prediction.PredictedLabel);
    Console.ReadLine();
}
```
The main changes:
- Lines 5 to 9: loading the initial data file for training (TextLoader.CreateReader and reader.Read)
- Lines 11 to 18: defining the features and label for training (MakeNewEstimator and the two Append calls)
- Line 20: training and creating the model (learningPipeline.Fit)
- Line 22: creating a function to make predictions (MakePredictionFunction)
- Lines 24 to 32: an example of a prediction
The complete example is available at https://github.com/elbruno/Blog/tree/master/20181011%20MLNET%200.6%20NewAPI
Happy coding!
Greetings @ Toronto
El Bruno
References
My Posts
- Fix the error [System. InvalidOperationException, Entry Point ‘ Not found] when you train a pipeline
- Adding NuGet Packages in Preview mode from MyGet, ie: Microsoft.ML-0.6.0 Version
- ML.Net 0.5 initial support for TensorFlow
- New version 0.4, news Improvements in Text analysis using Word Embedding
- Error ‘Entry point ‘Trainers.LightGbmClassifier’ not found’ and how to fix it
- Machine Learning Glossary of terms
- Export Machine Learning.Net models to ONNX format
- Loading Data In our Learning Pipeline With List (Lists for ever!)
- What’s new in version 0.2.0
- What’s a Machine Learning model? A 7 minute video as the best possible explanation
- Write and Load models using Machine Learning .Net
- Understanding the step by step of Hello World
- Hello World in ML.Net, Machine Learning for .Net !
I like this coding style better than the samples! In this example, are you only adding “Age” as the training parameter? If so, how do I add more? Gender, for example.