Developed with Antonio Salomao, building off his independent work here.
A lot of the growth in machine learning involves learning from loosely structured data such as text and images. The following analysis provides a light introduction to learning from textual data. We will see if it is possible to identify whether a firm is reporting good or bad results from the text of their quarterly earnings conference call. This is something that can be learned from looking at financial statements, but we want to see if we can train a model to learn similar information from the words spoken during the call. This is known as sentiment analysis and has been explored in finance contexts by Tetlock (2007), Loughran and McDonald (2011), and others.
We're going to use ML.NET, which provides a production-ready API for training and deploying machine learning models.
To start we'll load some libraries.
#r "nuget:FSharp.Stats"
#r "nuget: Microsoft.ML, 1.7.*"
#r "nuget: Microsoft.ML.FastTree"
#r "nuget: FSharp.Data, 5.0.2"
#r "nuget: Plotly.NET, 3.*"
#r "nuget: Plotly.NET.Interactive, 3.*"
#time "on"
open System
open System.IO
open System.IO.Compression
open System.Text.Json
open System.Net
open FSharp.Data
open FSharp.Stats
open Plotly.NET
open Microsoft.ML
open Microsoft.ML.Data
open Microsoft.ML.Transforms.Text
Environment.CurrentDirectory <- __SOURCE_DIRECTORY__
Data
We'll use a dataset containing transcripts of quarterly earnings conference calls from NASDAQ 100 companies from 2018 to 2021. Let's download it.
let download (inputUrl:string) (outputFile:string) =
    Directory.CreateDirectory(Path.GetDirectoryName(outputFile)) |> ignore
    if IO.File.Exists(outputFile) then
        printfn $"The file {outputFile} already exists. Skipping download"
    else
        let web = Http.RequestStream(inputUrl)
        use fileStream = IO.File.Create(outputFile)
        web.ResponseStream.CopyTo(fileStream)
        fileStream.Close()
// Decompress a gzip file
let gunzip (inputFile:string) (outputFile:string) =
    Directory.CreateDirectory(Path.GetDirectoryName(outputFile)) |> ignore
    // Delete any existing output file before decompressing
    if File.Exists(outputFile) then File.Delete(outputFile)
    use inputStream = File.OpenRead(inputFile)
    use outputStream = File.Create(outputFile)
    use gzipStream = new GZipStream(inputStream, CompressionMode.Decompress)
    gzipStream.CopyTo(outputStream)
let nq100FullUrl = "https://www.dropbox.com/s/izcsjp06lgwbauu/Nasdaq100CallFull.json.gz?dl=1"
let dataFolder = "data"
let nqFullFile = Path.Combine(dataFolder, "Nasdaq100CallFull.json")
let nq100FullFileGz = nqFullFile.Replace(".json", ".json.gz")
download nq100FullUrl nq100FullFileGz
gunzip nq100FullFileGz nqFullFile
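A quick sanity check that the download and decompression worked: the decompressed JSON file should exist and be much larger than the compressed one.
// Size of the decompressed file in bytes (FileInfo is in System.IO, which is already open)
FileInfo(nqFullFile).Length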
You should now have a file called Nasdaq100CallFull.json in the data folder.
Let's read it into a list.
// Types - Earnings Announcement
type CallId =
    { Ticker: string
      Exchange: string
      FiscalQuarter: int
      Date: DateTime }

type CallFull =
    { CallId: CallId
      Header: string
      PreparedRemarks: string
      QuestionsAndAnswers: string
      Label: float }
let nq100Full =
    File.ReadAllText(nqFullFile)
    |> JsonSerializer.Deserialize<List<CallFull>>
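How many transcripts did we load?
nq100Full |> List.length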
Let's look at a call.
let tsla2021q4 =
    nq100Full
    |> List.find (fun x ->
        x.CallId.Ticker = "TSLA" &&
        x.CallId.Date.Year = 2021 &&
        x.CallId.FiscalQuarter = 4)
The opening of the prepared remarks section.
let elonStarts = tsla2021q4.PreparedRemarks.IndexOf("Elon has some opening remarks. Elon?")
tsla2021q4.PreparedRemarks[elonStarts..elonStarts+1_000]
Opening of the Q&A section.
let firstAnalystQuestion = tsla2021q4.QuestionsAndAnswers.IndexOf("Please go ahead.")
tsla2021q4.QuestionsAndAnswers[firstAnalystQuestion..firstAnalystQuestion+1_000]
And the market-adjusted stock return from the day before to the day after the call.
tsla2021q4.Label
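Before modeling, it's worth knowing the base rate of positive reactions. A classifier that always predicts "positive" achieves this accuracy, so it's the benchmark our models need to beat.
// Fraction of calls with a positive market-adjusted return
nq100Full
|> List.filter (fun x -> x.Label > 0.0)
|> fun pos -> float pos.Length / float nq100Full.Length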
Typical word counts of the prepared remarks and Q&A sections, along with the distribution of the return label.
let preparedLengthChart =
    nq100Full
    |> Seq.map (fun x -> x.PreparedRemarks.Split([|' '|]).Length)
    |> Chart.Histogram
    |> Chart.withTraceInfo(Name = "Prepared Remarks Length")

let qaLengthChart =
    nq100Full
    |> Seq.map (fun x -> x.QuestionsAndAnswers.Split([|' '|]).Length)
    |> Chart.Histogram
    |> Chart.withTraceInfo(Name = "Q&A Length")

let returnChart =
    nq100Full
    |> Seq.map (fun x -> x.Label)
    |> Chart.Histogram
    |> Chart.withTraceInfo(Name = "Return")
[ preparedLengthChart; qaLengthChart; returnChart ]
|> Chart.SingleStack()
Binary sentiment model
Is the market's reaction to the call correlated with the text of the call?
We need some types that work with ML.NET.
[<CLIMutable>]
type BinarySentimentInput =
    { Label: bool
      Text: string }

[<CLIMutable>]
type BinarySentimentOutput =
    { PredictedLabel: bool
      Probability: single
      Score: single }
// ML.NET context
let ctx = new MLContext(seed = 1)
A train and test split of the data.
let nq100FullSentiment =
    nq100Full
    |> Seq.map (fun x ->
        { Label = x.Label > 0.0
          Text = x.QuestionsAndAnswers })
    |> ctx.Data.LoadFromEnumerable
let nq100FullSplits =
    ctx.Data.TrainTestSplit(nq100FullSentiment,
                            testFraction = 0.2,
                            seed = 1)
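If you want to check the split sizes, one way (a small sketch; enumerating an IDataView back into our input type materializes every row, so it's not free) is:
let countRows (dv: IDataView) =
    ctx.Data.CreateEnumerable<BinarySentimentInput>(dv, reuseRowObject = false)
    |> Seq.length
countRows nq100FullSplits.TrainSet, countRows nq100FullSplits.TestSet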
ML.NET has some built-in featurization transforms that we can use to prepare the data.
FeaturizeText converts the text into vectors of normalized word and character n-grams.
let featurizePipeline =
    ctx.Transforms.Text.FeaturizeText(
        outputColumnName = "Features",
        inputColumnName = "Text")
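As an optional diagnostic, we can fit the featurizer by itself and inspect the transformed schema; the type of the "Features" column shows how large the default feature space is. A sketch (fitting on the full dataset takes a little while):
// Fit the featurizer alone and look at the type of the resulting Features column
let defaultFeaturesType =
    let fitted = featurizePipeline.Fit(nq100FullSentiment)
    fitted.Transform(nq100FullSentiment).Schema.["Features"].Type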
There are many different trainers.
let treeTrainer =
    ctx.BinaryClassification.Trainers.FastTree(
        labelColumnName = "Label",
        featureColumnName = "Features")
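FastTree is just one choice, and swapping trainers is a one-line change. For example, a linear logistic regression trained with SDCA is often a fast, strong baseline for sparse text features. A sketch (not used below):
let linearTrainer =
    ctx.BinaryClassification.Trainers.SdcaLogisticRegression(
        labelColumnName = "Label",
        featureColumnName = "Features")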
We can put the featurization and the trainer together into a pipeline.
let treePipeline = featurizePipeline.Append(treeTrainer)
Trained model.
let binaryTreeModel = treePipeline.Fit(nq100FullSplits.TrainSet)
Model performance. First some functions to compute metrics.
let computeMetrics (model: TransformerChain<_>) iDataView =
    let predictions = model.Transform iDataView
    ctx.BinaryClassification
        .Evaluate(predictions,
                  labelColumnName = "Label",
                  scoreColumnName = "Score")
let printBinaryClassificationMetrics name (metrics: CalibratedBinaryClassificationMetrics) =
    printfn "************************************************************"
    printfn "* Metrics for %s binary classification model " name
    printfn "*-----------------------------------------------------------"
    printfn "* Accuracy: %.2f%%" (metrics.Accuracy * 100.)
    printfn "* Area Under Curve: %.2f%%" (metrics.AreaUnderRocCurve * 100.)
    printfn "* Area under Precision recall Curve: %.2f%%" (metrics.AreaUnderPrecisionRecallCurve * 100.)
    printfn "* F1Score: %.2f%%" (metrics.F1Score * 100.)
    printfn "* LogLoss: %.2f" metrics.LogLoss
    printfn "* LogLossReduction: %.2f" metrics.LogLossReduction
    printfn "* PositivePrecision: %.2f" metrics.PositivePrecision
    printfn "* PositiveRecall: %.2f" metrics.PositiveRecall
    printfn "* NegativePrecision: %.2f" metrics.NegativePrecision
    printfn "* NegativeRecall: %.2f" metrics.NegativeRecall
    printfn "*-----------------------------------------------------------"
    printfn "* Confusion matrix for %s binary classification model " name
    printfn "*-----------------------------------------------------------"
    printfn $"{metrics.ConfusionMatrix.GetFormattedConfusionTable()}"
    printfn "************************************************************"
Now let's actually look at the model performance.
Performance should be good in the training set, since that's the data the model was fit on.
nq100FullSplits.TrainSet
|> computeMetrics binaryTreeModel
|> printBinaryClassificationMetrics "Train set"
The real test is how well we do in the test set.
nq100FullSplits.TestSet
|> computeMetrics binaryTreeModel
|> printBinaryClassificationMetrics "Test set"
That looks pretty good. But maybe there's something special about our train/test sample.
Let's try k-fold cross validation. If we do 5 folds, that means that we split the data into 5 random groups. Then we train the model on 4/5 of the data and test on the remaining 1/5. We do this 5 times, cycling through the data.
let downcastPipeline (pipeline: IEstimator<'a>) =
    match pipeline with
    | :? IEstimator<ITransformer> as p -> p
    | _ -> failwith "The pipeline has to be an instance of IEstimator<ITransformer>."
//https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/train-machine-learning-model-cross-validation-ml-net
let cvResults =
    ctx.BinaryClassification
        .CrossValidate(data = nq100FullSentiment,
                       estimator = downcastPipeline treePipeline,
                       numberOfFolds = 5,
                       seed = 1)
Results.
cvResults
|> Seq.iteri (fun i x -> printfn $"Fold {i+1}: {x.Metrics.Accuracy}")
cvResults
|> Seq.averageBy (fun x -> x.Metrics.Accuracy)
|> printfn "Average accuracy: %.3f"
That's actually pretty good.
Let's see if there's much cost to simplifying the text featurization pipeline. Currently we're using the defaults, which use all word and character n-grams. We have relatively few observations compared to our vocabulary size, so it might make sense to use fewer features.
let textFeatureOptions =
    // Set up word n-gram options
    // https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.text.wordbagestimator.options?view=ml-dotnet
    let wordOptions = new WordBagEstimator.Options()
    wordOptions.NgramLength <- 2
    // Keep at most 1,000 n-grams of each length (here, unigrams and bigrams)
    wordOptions.MaximumNgramsCount <- Array.create wordOptions.NgramLength 1_000
    // Set up stop word options
    // https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.text.textfeaturizingestimator.options.stopwordsremoveroptions?view=ml-dotnet
    let stopOptions = new StopWordsRemovingEstimator.Options()
    // Setting the char n-gram extractor to null turns off character n-grams
    // https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.text.charbagestimator.options?view=ml-dotnet
    let charOptions: WordBagEstimator.Options = null
    // Set the text options
    // https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.text.textfeaturizingestimator?view=ml-dotnet
    let textOptions = new Transforms.Text.TextFeaturizingEstimator.Options()
    textOptions.CharFeatureExtractor <- charOptions
    textOptions.WordFeatureExtractor <- wordOptions
    textOptions.StopWordsRemoverOptions <- stopOptions
    // Return the configured options
    textOptions
let featurizePipelineSimple =
    ctx.Transforms.Text
        .FeaturizeText("Features",
                       "Text",
                       options = textFeatureOptions)
let treeSimple = featurizePipelineSimple.Append(treeTrainer)
Now try cross-validation on the simpler example.
let cvSimple =
    ctx.BinaryClassification
        .CrossValidate(data = nq100FullSentiment,
                       estimator = downcastPipeline treeSimple,
                       numberOfFolds = 5,
                       seed = 1)
cvSimple
|> Seq.iteri (fun i x -> printfn $"Fold {i+1}: {x.Metrics.Accuracy}")
cvSimple
|> Seq.averageBy (fun x -> x.Metrics.Accuracy)
|> printfn "Average accuracy: %.3f"
That's pretty good, and much faster than the default featurization pipeline. Let's use it going forward.
Now let's try a model trained on the prepared remarks.
let nq100PreparedSentiment =
    nq100Full
    |> Seq.map (fun x ->
        { Label = x.Label > 0.0
          Text = x.PreparedRemarks })
    |> ctx.Data.LoadFromEnumerable
let cvPreparedRemarks =
    ctx.BinaryClassification
        .CrossValidate(data = nq100PreparedSentiment,
                       estimator = downcastPipeline treeSimple,
                       numberOfFolds = 5,
                       seed = 1)
Look at the cross validation performance.
cvPreparedRemarks
|> Seq.iteri (fun i x -> printfn $"Fold {i+1}: {x.Metrics.Accuracy}")
cvPreparedRemarks
|> Seq.averageBy (fun x -> x.Metrics.Accuracy)
|> printfn "Average accuracy: %.3f"
It's not a huge difference, but the prepared remarks are not as informative as the Q&A. That can make sense if management uses the prepared remarks to put a positive spin on things.
Is the model improved by training it with extreme events?
let nq100ExtremeSentiment =
    nq100Full
    |> Seq.filter (fun x -> abs x.Label > 0.075)
    |> Seq.map (fun x ->
        { Label = x.Label > 0.0
          Text = x.QuestionsAndAnswers })
    |> ctx.Data.LoadFromEnumerable
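Filtering on returns greater than 7.5% in absolute value shrinks the sample considerably, which is worth keeping in mind when comparing accuracies:
// Number of calls that survive the extreme-return filter
nq100Full
|> List.filter (fun x -> abs x.Label > 0.075)
|> List.length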
let cvExtreme =
    ctx.BinaryClassification
        .CrossValidate(data = nq100ExtremeSentiment,
                       estimator = downcastPipeline treeSimple,
                       numberOfFolds = 5,
                       seed = 1)
Look at the cross validation performance.
cvExtreme
|> Seq.iteri (fun i x -> printfn $"Fold {i+1}: {x.Metrics.Accuracy}")
cvExtreme
|> Seq.averageBy (fun x -> x.Metrics.Accuracy)
|> printfn "Average accuracy: %.3f"
Multiclass model
The multiple classes will be "positive", "negative", and "neutral".
[<CLIMutable>]
type MulticlassSentimentInput =
    { Label: string
      Text: string }

[<CLIMutable>]
type MulticlassSentimentOutput =
    { PredictedLabel: string
      // For multiclass models, the Score column is a vector of per-class scores.
      Score: single[] }
let nq100MulticlassSentiment =
    nq100Full
    |> Seq.map (fun x ->
        { Label =
            if x.Label < -0.05 then "neg"
            elif x.Label > 0.05 then "pos"
            else "neutral"
          Text = x.QuestionsAndAnswers })
    |> ctx.Data.LoadFromEnumerable
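Let's check how many calls land in each class. With the +/- 5% cutoffs the classes may be imbalanced, which matters when interpreting accuracy.
nq100Full
|> List.countBy (fun x ->
    if x.Label < -0.05 then "neg"
    elif x.Label > 0.05 then "pos"
    else "neutral")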
let nq100MulticlassSplits =
    ctx.Data.TrainTestSplit(nq100MulticlassSentiment,
                            testFraction = 0.2,
                            seed = 1)
Picking a multi-class trainer.
let multiTrainer =
    ctx.MulticlassClassification.Trainers.SdcaMaximumEntropy(
        labelColumnName = "Label",
        featureColumnName = "Features")
The finished pipeline.
let multiPipeline =
    // An EstimatorChain with a cache checkpoint seems to speed this up.
    EstimatorChain()
        .Append(featurizePipelineSimple)
        // For multiclass, the string label has to be converted to a key type.
        .Append(ctx.Transforms.Conversion.MapValueToKey("Label"))
        .AppendCacheCheckpoint(ctx)
        .Append(multiTrainer)
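One wrinkle with the key conversion: the model's PredictedLabel column comes out as a key, not a string. If you later want string predictions (from a prediction engine, say), a MapKeyToValue transform can be appended after the trainer; a sketch (not used below):
// Hypothetical extension: map the predicted key back to "pos"/"neg"/"neutral".
let multiPipelineWithLabels =
    multiPipeline.Append(ctx.Transforms.Conversion.MapKeyToValue("PredictedLabel"))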
The model (this can be slow to train).
let multiModel = multiPipeline.Fit(nq100MulticlassSplits.TrainSet)
Evaluating the model.
let computeMultiClassMetrics (model: TransformerChain<_>) iDataView =
    let predictions = model.Transform iDataView
    ctx.MulticlassClassification
        .Evaluate(predictions,
                  labelColumnName = "Label",
                  scoreColumnName = "Score")
let printMultiClassClassificationMetrics name (metrics: MulticlassClassificationMetrics) =
    printfn "************************************************************"
    printfn "* Metrics for %s multi-class classification model " name
    printfn "*-----------------------------------------------------------"
    printfn " AccuracyMacro = %.4f, a value between 0 and 1, the closer to 1, the better" metrics.MacroAccuracy
    printfn " AccuracyMicro = %.4f, a value between 0 and 1, the closer to 1, the better" metrics.MicroAccuracy
    printfn " LogLoss = %.4f, the closer to 0, the better" metrics.LogLoss
    printfn " LogLoss for class 1 = %.4f, the closer to 0, the better" metrics.PerClassLogLoss.[0]
    printfn " LogLoss for class 2 = %.4f, the closer to 0, the better" metrics.PerClassLogLoss.[1]
    printfn " LogLoss for class 3 = %.4f, the closer to 0, the better" metrics.PerClassLogLoss.[2]
    printfn "************************************************************"
    printfn $"{metrics.ConfusionMatrix.GetFormattedConfusionTable()}"
nq100MulticlassSplits.TrainSet
|> computeMultiClassMetrics multiModel
|> printMultiClassClassificationMetrics "Multi-class: TrainSet"
nq100MulticlassSplits.TestSet
|> computeMultiClassMetrics multiModel
|> printMultiClassClassificationMetrics "Multi-class: TestSet"
A prediction engine to make predictions on individual examples.
let binaryTreePredictions =
    ctx.Model.CreatePredictionEngine<BinarySentimentInput, BinarySentimentOutput>(binaryTreeModel)
Look at test output for a negative call.
let sampleNegCall: BinarySentimentInput = {
    Label = false
    Text = "Our earnings are terrible. All our customers are leaving
            and our profits and free cash flow are falling." }
binaryTreePredictions.Predict(sampleNegCall)
Look at test output for a positive call.
let samplePosCall: BinarySentimentInput = {
    Label = true
    Text = "We had very high free cash flow. Sales were up, profits were up,
            we paid down debt. We expect to beat expectations." }
binaryTreePredictions.Predict(samplePosCall)
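Finally, a minimal sketch of persisting the trained binary model so it can be reloaded later without retraining (the file name is arbitrary):
// Save the model along with its input schema, then load it back.
ctx.Model.Save(binaryTreeModel, nq100FullSentiment.Schema, "binarySentimentModel.zip")
let loadedModel, loadedSchema = ctx.Model.Load("binarySentimentModel.zip")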