…`1 var_list, Nullable`1 aggregation_method, GateGradientType gate_gradients, Boolean colocate_gradients_with_ops, Tensor grad_loss)
…`1 var_list, GateGradientType gate_gradients, Nullable`1 aggregation_method, Boolean colocate_gradients_with_ops, String name, Tensor grad_loss)
Here's a fun one: Conv2DTranspose declares a default value of
bool use_bias = true,
but the actual concrete class contains:
if (use_bias)
throw new NotImplementedException("");
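Until that's fixed, one possible dodge (a sketch only, with argument names assumed from the quoted default, and note that a later comment in this thread reports the layer isn't functional beyond this anyway) is to construct the layer with use_bias: false and add the bias yourself:
//Sketch of a possible dodge, not verified against the current API: disable the built-in bias
//so the exception path is never hit, then apply a bias manually. Argument names are assumptions.
var deconv = tf.keras.layers.Conv2DTranspose(
    filters: 32,
    kernel_size: new Shape(3, 3),
    strides: new Shape(2, 2),
    padding: "same",
    use_bias: false); //skips the unimplemented bias branch
Tensor y = deconv.Apply(x); //x: an NHWC float tensor from earlier layers
ResourceVariable manualBias = tf.Variable(tf.zeros(new Shape(32)), name: "deconv_bias");
y = y + manualBias; //broadcast over the batch and spatial dimensions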
case int[] shape2:
if (shape.ndim != shape2.Length)
return false;
int[] newDims;
newDims = new int[shape.dims.Length];
for (int i = 0; i < shape.dims.Length; i++)
{
newDims[i] = (int)shape.dims[i];
}
return Enumerable.SequenceEqual(newDims, shape2);
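For context, the case above compares a Shape against a plain int[]. Here is a minimal standalone restatement of the same check (the helper name is mine, and it assumes Shape.dims is a long[] as in Tensorflow.Net 0.60.x):
using System.Linq;
using Tensorflow;

//Illustrative helper only: a Shape equals an int[] when the ranks match and every
//dimension matches after casting long -> int.
static bool ShapeMatchesInts(Shape shape, int[] expected)
{
    if (shape.ndim != expected.Length)
        return false;
    return shape.dims.Select(d => (int)d).SequenceEqual(expected);
}
//e.g. ShapeMatchesInts(new Shape(3, 8), new[] { 3, 8 }) returns true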
@Craigjw @SuperDaveOsbourne @AndreasHogstrandUltromics_gitlab Sadly, after over a year of banging my head against the wall I've finally given up and moved across to TorchSharp. Unfortunately, for non-trivial models there is some kind of bug in the C++ build of Tensorflow itself that is not present in the Python version.
What I've found is that it doesn't matter which version you use (I have tried TF 1.3, TF 1.15.5, TF 2.3 and TF 2.6): the C++ build of Tensorflow has some kind of weird bug in it. I've tried both the nuget packaged versions provided here and compiling from source using bazel; it makes no difference. The bug is silent corruption of data. What this means is that when building non-trivial models (more complex than the samples or the unit tests), in other words deep networks of more than 2 or 3 layers, your model will train up to a certain point and then stop improving. You can try Adam, AdaGrad, RMSProp or whatever you want; your model will not train beyond a certain level. From what I can see it is related to the floating point precision of the weight updates from the gradients.
@Craigjw May I suggest just building it all in a console app. Much easier and a lot less pain. Also, don't bother with eager mode if you're planning on using the GPU; actually, don't bother either way, which unfortunately means that all of Keras is out the window. Here is the reason: when running in eager mode you have to use GradientTape, and while the Python version of GradientTape might be computationally efficient, the .NET translation is anything but. Since most of this framework is tested on CPU rather than GPU the issue is hidden and not that obvious, but if you're planning on doing any real training you will need the GPU, and let me put it this way: I'm running a Threadripper with 4 x 3090 cards, the GradientTape implementation is single threaded, and I can barely get one GPU to just over 2% CUDA utilization.
Your alternative is to use graph mode, which does work quite well. This however means Tensorflow native, which is really not that big a deal. Most of the operations are fully implemented, except in the area of convolutions: Conv2D is implemented and does work, but Conv2DTranspose is not implemented to any functional level. Having said that, it's also not that big a deal because you can get close to the same result using a dilated Conv2D followed by a standard dense layer to expand the shape. I've tested this approach and it works decently.
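For what it's worth, here is a rough sketch of that substitution (illustrative only, not the poster's code): a dilated Conv2D over the input, flattened and run through a dense layer that expands back to the target spatial size. It assumes NHWC input with statically known spatial dimensions, reuses the FullyConnectedDense helper posted further down, and assumes the graph-mode tf.nn.conv2d overload accepts strides/padding/dilations; check the exact signatures against your Tensorflow.Net version.
//Illustrative upsampling block standing in for Conv2DTranspose.
//Assumes NHWC input with known height/width and FullyConnectedDense as defined below.
Tensor UpsampleBlock(Tensor input, int filters, int targetH, int targetW)
{
    int inH = (int)input.shape[1];
    int inW = (int)input.shape[2];
    int inC = (int)input.shape[3];
    //Dilated convolution in place of the missing transposed convolution.
    ResourceVariable kernel = tf.Variable(
        tf.truncated_normal_initializer(0, 1),
        name: "Upsample_Kernel",
        dtype: TF_DataType.TF_FLOAT,
        shape: new int[4] { 3, 3, inC, filters },
        trainable: true,
        validate_shape: false);
    Tensor conv = tf.nn.conv2d(input, kernel, strides: new int[4] { 1, 1, 1, 1 },
        padding: "SAME", dilations: new int[4] { 1, 2, 2, 1 });
    conv = tf.nn.leaky_relu(conv);
    //Flatten, expand with a dense layer, then reshape to the target spatial size.
    Tensor flat = tf.reshape(conv, new int[2] { -1, inH * inW * filters });
    Tensor expanded = FullyConnectedDense(flat, targetH * targetW * filters);
    return tf.reshape(expanded, new int[4] { -1, targetH, targetW, filters });
}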
@AndreasHogstrandUltromics_gitlab May I suggest using NetMQ for your transport along with MessagePack and MessagePack.Annotations for serialization. I can get near wire-speed (1Gbps) serialization of floats from multiple agents to the central neural network (a multi-agent reinforcement learning scenario). Note: the Numpy implementation in Tensorflow.Net is extremely performant and blazing fast for converting float shape structures, much faster than the methods used in TorchSharp, so I'm continuing to use the Numpy implementation even though I'm now using TorchSharp as the neural network backend.
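For anyone wanting a starting point, here is a minimal sketch of that transport pairing (the DTO layout, class names and endpoint are mine, not the poster's): a MessagePack-annotated message, a PushSocket on the agent side and a PullSocket on the learner side.
using MessagePack;
using NetMQ;
using NetMQ.Sockets;

[MessagePackObject]
public class AgentObservation
{
    [Key(0)] public int AgentId { get; set; }
    [Key(1)] public float[] Features { get; set; }
}

public static class TransportSketch
{
    //Agent side: serialize one observation and push it to the learner.
    //In real use you would keep the socket alive rather than recreate it per message.
    public static void SendObservation(AgentObservation obs)
    {
        using (var push = new PushSocket(">tcp://127.0.0.1:5556")) //">" = connect; address is illustrative
        {
            push.SendFrame(MessagePackSerializer.Serialize(obs));
        }
    }

    //Learner side: pull one frame and deserialize it back into the DTO.
    public static AgentObservation ReceiveObservation()
    {
        using (var pull = new PullSocket("@tcp://127.0.0.1:5556")) //"@" = bind
        {
            return MessagePackSerializer.Deserialize<AgentObservation>(pull.ReceiveFrameBytes());
        }
    }
}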
public static Tensor FullyConnectedDense(Tensor _Input, int _Num_Units, TF_DataType dataType = TF_DataType.TF_FLOAT, bool Trainable = true, bool Normalize = false,
IInitializer InitializeVariant = null, string _Name = null)
{
if (InitializeVariant == null)
{
InitializeVariant = tf.truncated_normal_initializer(0, 1);
}
int in_dim = (int)_Input.shape[1];
string WeightsName;
string BiasName;
WeightsName = _Name.ConcatIfNotNullOrEmptyElseNull(@"_Weights");
BiasName = _Name.ConcatIfNotNullOrEmptyElseNull(@"_Bias");
if (in_dim == 0)
{
in_dim = -1;
}
ResourceVariable Weights;
ResourceVariable Bias;
Weights = tf.Variable(
InitializeVariant,
name: WeightsName,
dtype: dataType,
shape: new int[2] { in_dim, _Num_Units },
trainable: Trainable,
validate_shape: false);
Bias = tf.Variable(
InitializeVariant,
name: BiasName,
dtype: dataType,
shape: new int[1] { _Num_Units },
trainable: Trainable,
validate_shape: false);
Tensor layer = tf.matmul(_Input, Weights) + Bias;
return layer;
}
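Usage is just repeated calls, for example (a sketch only; the names refer to the placeholders created in BuildTrainExample below):
//Two stacked layers on a [batch, features] input; the layer sizes are arbitrary.
Tensor hidden = tf.nn.leaky_relu(FullyConnectedDense(InputPlaceHolder, 64, _Name: "Hidden"));
Tensor output = FullyConnectedDense(hidden, NumOutputVectors, _Name: "Output");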
public static void BuildTrainExample(bool IsNormalizing)
{
//These values are not normalized so they will perform badly.
//Unless they were intended as categorical classes
//in which case they should be 1 Hot encoded instead of normalized
NDArray InputData = np.array(-7.0f, -4.0f, -1.0f, 2.0f, 5.0f, 8.0f, 11.0f, 14.0f,
-7.0f, -4.0f, -1.0f, 2.0f, 5.0f, 8.0f, 11.0f, 14.0f,
-7.0f, -4.0f, -1.0f, 2.0f, 5.0f, 8.0f, 11.0f, 14.0f).reshape(new Shape(3, 8));
NDArray OutputLabels = np.array(3.0f, 6.0f, 9.0f, 12.0f, 15.0f, 18.10f, 21.0f, 24.0f,
3.0f, 6.0f, 9.0f, 12.0f, 15.0f, 18.10f, 21.0f, 24.0f,
3.0f, 6.0f, 9.0f, 12.0f, 15.0f, 18.10f, 21.0f, 24.0f).reshape(new Shape(3, 8));
// Have to do this early because it is eager by default and placeholders cannot be used
tf.compat.v1.disable_eager_execution();
Graph MainGraph = new Graph().as_default(); // To reset the graph etc. you can call tf.reset_default_graph();
Tensor InputPlaceHolder = tf.placeholder(tf.float32, shape: new int[2] { -1, (int)InputData.shape[1] }, name: "Input");
Tensor LabelPlaceHolder = tf.placeholder(tf.float32, shape: new int[2] { -1, (int)OutputLabels.shape[1] }, name: "Labels");
int NormalizationScaleOrNumClasses = 24;
Tensor NormalizationFactor = tf.constant((float)NormalizationScaleOrNumClasses, TF_DataType.TF_FLOAT);
Tensor LabelsNormalized = tf.div(LabelPlaceHolder, NormalizationFactor);
int NumInputVectors = (int)InputData.shape[1];
int NumOutputVectors = (int)OutputLabels.shape[1];
Tensor InputRemap;
Tensor DenseLayerLogits;
Tensor DenseLayerActivated;
Tensor DenseLayerFinal;
Tensor Loss_MSELoss;
Tensor Loss_SSELoss;
if (IsNormalizing)
{
InputRemap = tf.div(InputPlaceHolder, NormalizationFactor);
DenseLayerLogits = FullyConnectedDense(InputRemap, NumOutputVectors);
DenseLayerActivated = tf.nn.leaky_relu(DenseLayerLogits);
//or whatever other activation
//Tensor Activation = tf.nn.sigmoid(DenseLayer);
//Tensor Activation = tf.nn.relu(DenseLayer);
//Tensor Activation = tf.nn.tanh(DenseLayer);
}
else
{
//Instead if you were doing 1 Hot
InputRemap = tf.one_hot(InputPlaceHolder, NormalizationScaleOrNumClasses);
InputRemap = tf.reshape(InputRemap, new int[2] { -1, (NumInputVectors * NormalizationScaleOrNumClasses) });
//This will not work because argmax has no gradient implemented here, so it breaks the optimizer / gradient flow
//DenseLayerLogits = FullyConnectedDense(InputRemap, NumOutputVectors * NormalizationScaleOrNumClasses);
//DenseLayerActivated = tf.nn.sigmoid(DenseLayerLogits);
//DenseLayerActivated = tf.reshape(DenseLayerActivated, new int[3] { -1, NumInputVectors, NormalizationScaleOrNumClasses });
//DenseLayerActivated = tf.arg_max(DenseLayerActivated, 2);
//DenseLayerActivated = tf.cast(DenseLayerActivated, TF_DataType.TF_FLOAT);
DenseLayerLogits = FullyConnectedDense(InputRemap, NumOutputVectors);
DenseLayerActivated = tf.nn.leaky_relu(DenseLayerLogits);
}
Tensor LearningRate = tf.placeholder(tf.float32, shape: new int[0], name: "LearningRate");
//MSE Loss
Loss_MSELoss = tf.reshape(tf.reduce_mean(tf.square(LabelsNormalized - DenseLayerActivated), axis: 1), new int[2] { -1, 1 });
//SSE Loss
Loss_SSELoss = tf.reshape(tf.reduce_sum(tf.square(LabelsNormalized - DenseLayerActivated), axis: 1), new int[2] { -1, 1 });
Operation NetworkOptimizer = new Tensorflow.Train.AdamOptimizer(LearningRate).minimize(Loss_MSELoss);
//Operation NetworkOptimizer = new Tensorflow.Train.AdamOptimizer(LearningRate).minimize(Loss_SSELoss);
Operation Init = tf.global_variables_initializer();
//various Config option examples
var TFConfig = new ConfigProto();
TFConfig.GpuOptions = new GPUOptions();
TFConfig.GpuOptions.AllowGrowth = true; //Prevents Tensorflow swallowing all GPU memory
//TFConfig.GpuOptions.PerProcessGpuMemoryFraction = 20;
//TFConfig.GpuOptions.Experimental = new GPUOptions.Types.Experimental();
//TFConfig.GpuOptions.Experimental.UseUnifiedMemory = true;
//TFConfig.IntraOpParallelismThreads = 10; //C# thread count
//TFConfig.InterOpParallelismThreads = 2;
//TFConfig.LogDevicePlacement = true; //Writes a hell of a lot to the console
//This is how you can grab a reference to all the variables if you want to.
List<ResourceVariable> AllVars = tf.get_collection<ResourceVariable>(tf.GraphKeys.GLOBAL_VARIABLES);
Saver TFSaver = tf.train.Saver();
using (Session Sess = tf.Session(MainGraph, config: TFConfig))
{
Sess.run(Init); // Initializes the global variables.
Sess.graph.as_default();
//Only needed if your training code lives in some other method somewhere else; it is already the default here.
//Of course create a proper training loop instead of this for loop that just repeats the same thing.
for (int Epoch = 1; Epoch <= 20; Epoch++)
{
for (int I = 0; I < 50; I++)
{
float MyAdjustableLearningRate;
MyAdjustableLearningRate = 0.001f;
//The verbose, readable way: build up a list of FeedItems
List<FeedItem> FeedList = new List<FeedItem>();
FeedList.Add((InputPlaceHolder, InputData));
FeedList.Add((LabelPlaceHolder, OutputLabels));
FeedList.Add((LearningRate, MyAdjustableLearningRate));
FeedItem[] FeedArray;
FeedArray = FeedList.ToArray();
Sess.run(NetworkOptimizer, FeedArray);
//Or the shortcut way (note: as written, this runs a second training step per loop iteration)
Sess.run(NetworkOptimizer, (InputPlaceHolder, InputData), (LabelPlaceHolder, OutputLabels), (LearningRate, MyAdjustableLearningRate));
}
float SSELoss, MSELoss;
(MSELoss, SSELoss) = Sess.run((Loss_MSELoss, Loss_SSELoss), (InputPlaceHolder, InputData), (LabelPlaceHolder, OutputLabels));
StringBuilder sb;
sb = new StringBuilder();
if (IsNormalizing)
{
sb.Append(@"Normalize Inputs Version: ");
}
else
{
sb.Append(@"One Hot Version: ");
}
sb.Append(@"Epoch: ").Append(Epoch.ToString(@"00"));
sb.Append(@" - Itteration: ").Append((Epoch * 50).ToString(@"0000"));
sb.Append(@" - MSE: ").Append(MSELoss.ToString(@"0.000000000000"));
sb.Append(@" - SSE: ").Append(SSELoss.ToString(@"0.000000000000"));
Console.WriteLine(sb.ToString());
}
}
}
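One thing the example wires up but never calls is TFSaver. If you want checkpoints, something along these lines goes inside the using (Session ...) block, for example after the epoch loop (the path is illustrative):
//Save all GLOBAL_VARIABLES to a checkpoint; restore() reloads them into a session
//built over a compatible graph. The path is illustrative.
TFSaver.save(Sess, @"./checkpoints/model.ckpt");
//TFSaver.restore(Sess, @"./checkpoints/model.ckpt");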
@Craigjw Here is what the output looks like.
Note: results will vary from run to run because the example weights use just a standard truncated normal initializer.
Normalize Inputs Version: Epoch: 01 - Itteration: 0050 - MSE: 0.585179900000 - SSE: 4.681439000000
Normalize Inputs Version: Epoch: 02 - Itteration: 0100 - MSE: 0.454372400000 - SSE: 3.634979000000
Normalize Inputs Version: Epoch: 03 - Itteration: 0150 - MSE: 0.307974100000 - SSE: 2.463793000000
Normalize Inputs Version: Epoch: 04 - Itteration: 0200 - MSE: 0.184113500000 - SSE: 1.472908000000
Normalize Inputs Version: Epoch: 05 - Itteration: 0250 - MSE: 0.104549100000 - SSE: 0.836393200000
Normalize Inputs Version: Epoch: 06 - Itteration: 0300 - MSE: 0.078881500000 - SSE: 0.631052000000
Normalize Inputs Version: Epoch: 07 - Itteration: 0350 - MSE: 0.065381990000 - SSE: 0.523055900000
Normalize Inputs Version: Epoch: 08 - Itteration: 0400 - MSE: 0.055645780000 - SSE: 0.445166200000
Normalize Inputs Version: Epoch: 09 - Itteration: 0450 - MSE: 0.031652850000 - SSE: 0.253222800000
Normalize Inputs Version: Epoch: 10 - Itteration: 0500 - MSE: 0.000863294100 - SSE: 0.006906353000
Normalize Inputs Version: Epoch: 11 - Itteration: 0550 - MSE: 0.000022935460 - SSE: 0.000183483600
Normalize Inputs Version: Epoch: 12 - Itteration: 0600 - MSE: 0.000001567331 - SSE: 0.000012538650
Normalize Inputs Version: Epoch: 13 - Itteration: 0650 - MSE: 0.000000238269 - SSE: 0.000001906155
Normalize Inputs Version: Epoch: 14 - Itteration: 0700 - MSE: 0.000000035869 - SSE: 0.000000286954
Normalize Inputs Version: Epoch: 15 - Itteration: 0750 - MSE: 0.000000004706 - SSE: 0.000000037646
Normalize Inputs Version: Epoch: 16 - Itteration: 0800 - MSE: 0.000000000532 - SSE: 0.000000004254
Normalize Inputs Version: Epoch: 17 - Itteration: 0850 - MSE: 0.000000000051 - SSE: 0.000000000409
Normalize Inputs Version: Epoch: 18 - Itteration: 0900 - MSE: 0.000000000004 - SSE: 0.000000000035
Normalize Inputs Version: Epoch: 19 - Itteration: 0950 - MSE: 0.000000000001 - SSE: 0.000000000009
Normalize Inputs Version: Epoch: 20 - Itteration: 1000 - MSE: 0.000000000000 - SSE: 0.000000000004
One Hot Version: Epoch: 01 - Itteration: 0050 - MSE: 1.136966000000 - SSE: 9.095726000000
One Hot Version: Epoch: 02 - Itteration: 0100 - MSE: 0.663073900000 - SSE: 5.304591000000
One Hot Version: Epoch: 03 - Itteration: 0150 - MSE: 0.416191000000 - SSE: 3.329528000000
One Hot Version: Epoch: 04 - Itteration: 0200 - MSE: 0.265655000000 - SSE: 2.125240000000
One Hot Version: Epoch: 05 - Itteration: 0250 - MSE: 0.059993860000 - SSE: 0.479950800000
One Hot Version: Epoch: 06 - Itteration: 0300 - MSE: 0.027356000000 - SSE: 0.218848000000
One Hot Version: Epoch: 07 - Itteration: 0350 - MSE: 0.018255970000 - SSE: 0.146047700000
One Hot Version: Epoch: 08 - Itteration: 0400 - MSE: 0.000066116740 - SSE: 0.000528933900
One Hot Version: Epoch: 09 - Itteration: 0450 - MSE: 0.000009215314 - SSE: 0.000073722510
One Hot Version: Epoch: 10 - Itteration: 0500 - MSE: 0.000001104103 - SSE: 0.000008832826
One Hot Version: Epoch: 11 - Itteration: 0550 - MSE: 0.000000109968 - SSE: 0.000000879746
One Hot Version: Epoch: 12 - Itteration: 0600 - MSE: 0.000000009018 - SSE: 0.000000072147
One Hot Version: Epoch: 13 - Itteration: 0650 - MSE: 0.000000000602 - SSE: 0.000000004818
One Hot Version: Epoch: 14 - Itteration: 0700 - MSE: 0.000000000035 - SSE: 0.000000000280
One Hot Version: Epoch: 15 - Itteration: 0750 - MSE: 0.000000000004 - SSE: 0.000000000035
One Hot Version: Epoch: 16 - Itteration: 0800 - MSE: 0.000000000002 - SSE: 0.000000000018
One Hot Version: Epoch: 17 - Itteration: 0850 - MSE: 0.000000000001 - SSE: 0.000000000011
One Hot Version: Epoch: 18 - Itteration: 0900 - MSE: 0.000000000001 - SSE: 0.000000000009
One Hot Version: Epoch: 19 - Itteration: 0950 - MSE: 0.000000000001 - SSE: 0.000000000008
One Hot Version: Epoch: 20 - Itteration: 1000 - MSE: 0.000000000001 - SSE: 0.000000000006
Note: all of the example code above is against Tensorflow.Net 0.60.4, which corresponds to Tensorflow 2.6.
So, just to clarify: when I said earlier that I'm having issues with Tensorflow.Net and had to move across to TorchSharp, that is specific to my use cases, which trigger some rare edge case in the C++ build of Tensorflow.
For most use cases Tensorflow.Net will work just as well as TorchSharp; it all depends on what your use cases require.
In the example code above I also put in a few explanations of things that took me a while to figure out (like how to get around the ArgMax gradient issue), plus the small string extension helper that FullyConnectedDense uses for naming variables:
public static string ConcatIfNotNullOrEmptyElseNull(this string Value, string Append = @"", string Prepend = @"")
{
if (string.IsNullOrEmpty(Value)) { return null; }
else { return string.Concat(Prepend, Value, Append); }
}