Hi All! Can someone please help me with an issue? I'm training a recurrent neural network (with GRU) for a classification problem using rmsprop as an optimizer.
Training loss goes down for the first ~1 million examples, but then starts going up again.
Why could that be?
The only reason I can think of for something like that is misclassification.
How sure are you that your dataset is right?
And also the model might be too small.
(I am a student, so please take it with a grain of salt.)
The dataset is probably noisy. If I reduce the dataset size to a few hundred thousand examples, I get training accuracy above 90%. But even if the dataset is noisy, can that lead to training error increasing over time? I thought that if model capacity is large enough, training error should decrease to near 0 (memorizing the training set), and if capacity is small, it should flatten out at some point.
The "Understanding Deep Learning Requires Rethinking Generalization" paper shows how neural networks can memorize even random labels.
Assuming you reduce it in a random manner, I can only assume that the noise is made by a model to throw off other models (GANs by Ian Goodfellow).
Just dropping in a hello in case I sleep and miss the chat again.
Sorry I couldn't make it Friday night guys, I was at the Machine Learning in Finance conference
Feel free to read through my notes and if you have any questions, let me know.
The Goldman Sachs senior data scientist I talked to was a really cool guy.
Great story about how he went from sleeping in his car, to winning data hackathons in San Fran, to working at GS
Hi, I want to classify multi-label data using deep learning techniques like CNNs, without building a separate classifier for each label. From what I've read, I should use multiple sigmoid units in the last layer with a binary cross-entropy loss function. I don't really understand why this would work. Is there a better way to do this?
I'm available for at least the next hour to help out with whatever.
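On why sigmoid units plus binary cross-entropy fit multi-label data: each sigmoid output is an independent yes/no probability for one label, so several labels can be "on" at once, whereas softmax forces the outputs to sum to 1 and so assumes exactly one label. Here is a minimal NumPy sketch of just the output layer and loss. The logits, targets, and the 0.5 threshold are made-up values for illustration, not anything from a real model.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function: maps each logit to (0, 1) independently."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over all (example, label) pairs."""
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    per_label = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return per_label.mean()

# Hypothetical last-layer outputs (logits) for 2 examples and 3 labels.
logits = np.array([[2.0, -1.0, 0.5],
                   [-3.0, 4.0, -0.5]])

# Multi-hot targets: the first example carries labels 0 and 2 at the same time.
targets = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])

probs = sigmoid(logits)                      # one independent probability per label
loss = binary_cross_entropy(probs, targets)  # scalar loss to minimize
predicted = (probs >= 0.5).astype(int)       # assumed threshold of 0.5 per label
```

Note that the rows of `probs` do not need to sum to 1, which is exactly what lets one example have more than one label.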
And if any of you know an easy way to parallelize my R scripts I'd love to hear it. ;)
@rawan_la_twitter I haven't done multi-label classification with a CNN, but I have done it with neural nets in general.
The first step is to change your label data into a binary set.
Once you get the data out of one field with multiple values and into multiple fields with binary values, it's much easier.
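A rough sketch of that conversion, assuming the labels arrive as one comma-separated string per example. The separator, the sample tags, and the helper names `build_vocab` and `multi_hot` are all hypothetical, just for illustration.

```python
def build_vocab(samples, sep=","):
    """Map every distinct label to a column index (sorted for a stable order)."""
    labels = sorted({tag.strip() for s in samples for tag in s.split(sep) if tag.strip()})
    return {label: i for i, label in enumerate(labels)}

def multi_hot(sample, vocab, sep=","):
    """Turn one multi-valued field into a row of 0/1 values, one per label."""
    row = [0] * len(vocab)
    for tag in sample.split(sep):
        tag = tag.strip()
        if tag:
            row[vocab[tag]] = 1
    return row

# Made-up raw label field: one string per example, several values in one field.
raw = ["cat,outdoor", "dog", "cat,dog,indoor"]

vocab = build_vocab(raw)                        # columns: cat, dog, indoor, outdoor
encoded = [multi_hot(s, vocab) for s in raw]    # one binary column per label
```

Each row of `encoded` can then be used directly as the target vector for a network with one sigmoid output per label.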