Nat Elkins
@NatElkins
And the error messages don't provide much help
For example, when I call LoadFromTextFile, I get the error "Bad item type for Key"
Which column? What item? What type is bad? What are the valid types? And why does the API even allow me to do this in the first place?
To find these answers you need to go hunting through the source
Relevant code here:
        public static bool IsValidDataType(Type type)
        {
            return type == typeof(uint) || type == typeof(ulong) || type == typeof(ushort) || type == typeof(byte);
        }
But why only these types? Why not strings, if they're keys and therefore arbitrary identifiers?
Nat Elkins
@NatElkins
Some other questions, if anyone from the .NET team drops by...I notice in this sample, the record type is annotated with column information, but it's also explicitly created in code later on. Why is that?
I've been having a crazy amount of trouble getting this working, progress here if anyone feels like taking a look
Kevin Malenfant
@kevmal
@NatElkins The easiest thing is to append MatrixFactorization to your pipeline. Right now you transform your data with the pipeline and pass the transformed data, mappedDataView, to the trainer estimator. So if you want to use your trained model, you would need to apply the same transform first, which is why you get a "UserIdEncoded column 'MatrixColumnIndex' not found" exception.
Re: "type is annotated with column information, but it's also explicitly created in code later on. Why is that?"... IIRC there was a point where samples got converted to use member attributes for data loading but in this particular case there was an issue that meant it had to remain using Column descriptors... The fact the LoadColumn attributes are there is just an artifact.... Most likely at this point it could use either method without issue.
Nat Elkins
@NatElkins
@kevmal Thank you so much for helping! Trying to digest what you've said.
Nat Elkins
@NatElkins
I've got it running without crashing now, unfortunately my "Score" is returning as NaN
Thank you again!
Nat Elkins
@NatElkins
Does anyone know if a prediction score of NaN can result from a dataset being too small?
Or what other reasons might cause it?
Also, @kevmal , if you post your answer on my SO question I'll give you those sweet, sweet upvotes
Veikko Eeva
@veikkoeeva
Re: August 25, 2019 6:16 PM, @jwood803: using compute sticks with ML.NET (transfer learning?) could be interesting. There was something about this on June 26, 2019 4:00 PM.
Edge is all the rage, so to speak.
Mark Whiting
@whitmark
@eerhardt Thanks for the assist on removing extra columns. Unfortunately, that technique does not play nice with saving the model in onnx format.
Using the AutoML API, I did get it to run just fine, but I would like to know, for each of the trainers selected, which options and option values are being used. Is it possible to dump this info while the experiment is running? I'm doing binary classification using LightGBM, for example.
Eric Erhardt
@eerhardt

@whitmark - if you look at the logs\debug_log.txt file that is outputted by mlnet auto-train, you will see all the trainers and options used. You will see something like:

[Source=AutoML, Kind=Trace] Evaluating pipeline xf=ValueToKeyMapping{ col=Iris_setosa:Iris_setosa} xf=ColumnConcatenating{ col=Features:5_1,3_5,1_4,0_2} xf=Normalizing{ col=Features:Features} tr=SdcaMaximumEntropyMulti{L2Regularization:0.01, L1Regularization:<Auto>, ConvergenceTolerance:0.001, MaximumNumberOfIterations:20, Shuffle:False, BiasLearningRate:0} xf=KeyToValueMapping{ col=PredictedLabel:PredictedLabel} cache=+

Note that this part, tr=SdcaMaximumEntropyMulti{L2Regularization:0.01, L1Regularization:<Auto>, ConvergenceTolerance:0.001, MaximumNumberOfIterations:20, Shuffle:False, BiasLearningRate:0}, describes the trainer and the hyperparameter values being used. If a parameter isn't specified, it is left at its default value.
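
As a rough illustration of reading that segment programmatically, here is a minimal C# sketch that pulls the trainer name and its options out of one trace line. The class name TrainerLogParser and the regular expression are hypothetical, not part of ML.NET or the mlnet CLI, and the sketch assumes option values contain no commas or closing braces, which holds for the sample line above but may not hold in general.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of ML.NET: parses the "tr=Name{...}" segment
// of an AutoML pipeline trace line.
public static class TrainerLogParser
{
    // Matches "tr=<TrainerName>{<options>}".
    // Assumes the options text contains no '}' characters.
    private static readonly Regex TrainerPattern =
        new Regex(@"tr=(?<name>\w+)\{(?<options>[^}]*)\}");

    public static bool TryParse(
        string logLine, out string name, out Dictionary<string, string> options)
    {
        name = null;
        options = new Dictionary<string, string>();

        Match match = TrainerPattern.Match(logLine);
        if (!match.Success)
        {
            return false;
        }

        name = match.Groups["name"].Value;
        foreach (string pair in match.Groups["options"].Value.Split(','))
        {
            // Split on the first ':' only, so a value containing ':' stays intact.
            string[] parts = pair.Split(new[] { ':' }, 2);
            if (parts.Length == 2)
            {
                options[parts[0].Trim()] = parts[1].Trim();
            }
        }
        return true;
    }

    public static void Main()
    {
        // Excerpt of the trace line quoted above.
        string line =
            "xf=Normalizing{ col=Features:Features} " +
            "tr=SdcaMaximumEntropyMulti{L2Regularization:0.01, L1Regularization:<Auto>, " +
            "ConvergenceTolerance:0.001, MaximumNumberOfIterations:20, Shuffle:False, " +
            "BiasLearningRate:0} xf=KeyToValueMapping{ col=PredictedLabel:PredictedLabel}";

        if (TryParse(line, out string trainer, out Dictionary<string, string> options))
        {
            Console.WriteLine("Trainer: " + trainer);
            foreach (KeyValuePair<string, string> option in options)
            {
                Console.WriteLine("  " + option.Key + " = " + option.Value);
            }
        }
    }
}
```

Options that do not appear in the line are simply absent from the dictionary, which matches the note above that unspecified parameters keep their defaults.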

Mark Whiting
@whitmark
@eerhardt - thanks, that's useful. I see the log is autogenerated with mlnet auto-train using the CLI, but am not seeing an option to enable the same using the C# AutoML API.
Mark Whiting
@whitmark
@eerhardt. Never mind, when I reran, the log popped up. Thanks!
Mark Whiting
@whitmark
@eerhardt. Upon closer inspection, the debug_log.txt I found was from an earlier test I ran using the Model Builder and not from my subsequent C# AutoML run so it's still an open question as to whether the C# API can support this type of output. Also, is there an option to override the defaults used for any hyperparameter or change up which ones are <Auto>? Would be nice to specify where the log is written to as well. Seems like Azure has some of these AutoMLConfig capabilities but not for local operations?
Eric Erhardt
@eerhardt

so it's still an open question as to whether the C# API can support this type of output.

From the AutoML C# API, you get these outputs from handling the MLContext.Log event. You can then do whatever you'd like with the log message. That is the way the mlnet auto-train code works -

https://github.com/dotnet/machinelearning/blob/6f3d26c8ec247881aef8541d8c1ac5816c9d9f77/src/mlnet/CodeGenerator/CodeGenerationHelper.cs#L39-L40
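
A minimal sketch of what handling that event might look like from your own C# code, assuming the Microsoft.ML package is referenced. The class name AutoMLLogCapture and the output path automl_debug_log.txt are made up for illustration, and filtering on the text "AutoML" assumes the messages carry a "Source=AutoML" prefix like the trace line quoted earlier.

```csharp
using System;
using System.IO;
using Microsoft.ML;

public static class AutoMLLogCapture
{
    public static void Main()
    {
        var mlContext = new MLContext();

        // "automl_debug_log.txt" is an arbitrary path chosen for this sketch.
        using (var writer = new StreamWriter("automl_debug_log.txt", append: true))
        {
            // MLContext.Log is raised for log messages produced by components
            // that run under this context.
            mlContext.Log += (sender, e) =>
            {
                // Assumption: AutoML messages mention "AutoML" in their source
                // prefix, as in the sample trace line.
                if (e.Message.Contains("AutoML"))
                {
                    writer.WriteLine(e.Message);
                }
            };

            // Create and execute the AutoML experiment here, inside the using
            // block, so the writer is still open while messages arrive.
        }
    }
}
```

Because the handler receives every message as a plain string, the same hook can write to the console, a file of your choosing, or any other logger.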

Eric Erhardt
@eerhardt

is there an option to override the defaults used for any hyperparameter or change up which ones are <Auto>?

Not currently. Check out all the settings you are able to provide on https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.automl.experimentsettings and its derived classes. If you feel a setting is missing, feel free to open an issue on https://github.com/dotnet/machinelearning

marcnet80
@marcnet80
Hello, everyone. I want to clarify: has this bug been fixed? dotnet/machinelearning#3992
Jon Wood
@jwood803
@marcnet80 I believe this PR would fix it - dotnet/machinelearning#4010
marcnet80
@marcnet80
@jwood803, OK, do you have a current list of bugs/features for newcomers? I see most of these issues have already been committed: dotnet/machinelearning#830
Jon Wood
@jwood803
@marcnet80 Definitely! The team does a great job at labeling issues, and the "good first issue" label should be a good place to start - https://github.com/dotnet/machinelearning/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+label%3A%22good+first+issue%22
Alberto Gonzalez
@agonzalezm
is there any plan to have deep learning in ML.NET?
Zeeshan Siddiqui
@codemzs
yea we already have it in preview
check out Microsoft.ML.DNN
Alberto Gonzalez
@agonzalezm
ok thanks!
Alberto Gonzalez
@agonzalezm
does this use tensorflow under it?
ML.NET Build 1.3.1 introduced a preview of Deep Neural Network training using C# bindings for Tensorflow
Zeeshan Siddiqui
@codemzs
yea
but I would recommend you use the latest
1.4.0-preview
Alberto Gonzalez
@agonzalezm
okay, but do you have to use the TensorFlow APIs, or do you provide higher-level APIs on top of that?
what is the difference between using TensorFlowSharp directly and using this?
okay found a blog post explaining it
Zeeshan Siddiqui
@codemzs
yea
Haiping
@Oceania2018
@agonzalezm We recommend the high-level APIs provided by ML.NET. You can also use the low-level APIs in the C# binding, but those may not be as friendly.
Zeeshan Siddiqui
@codemzs
besides, we are also building an AutoML layer in the advanced API
so you will not need to tune parameters
Alberto Gonzalez
@agonzalezm
understood, thanks!