These are chat archives for FreeCodeCamp/DataScience

18th
Sep 2016
Alice Jiang
@becausealice2
Sep 18 2016 06:08
If anyone has any advice or knows of any resources off hand that might be of use for a ML project composing music based off a million billion GB library of music, my brother and I are talking about seeing if we can't join forces on a project :D
Eric Leung
@erictleung
Sep 18 2016 08:01

@alicejiang1 sorry, I haven't heard of any. Are you wanting to generate music based on already available music? It would be interesting for something to generate music an artist has never created but is in the same style.

Someone was able to do something tangential with art: you sketch something yourself, run the drawing through an algorithm, and it pops out a drawing in the style of one of the famous artists.

I couldn't find the software that does that. However, while trying to look for that, I stumbled upon something even cooler! Someone's been able to take pre-existing art and extend it! Here's the site with famous art pieces extended as if the original artists painted more.

Alice Jiang
@becausealice2
Sep 18 2016 08:08
@erictleung That's awesome! That's actually where I got the idea, the famous artist versions of whatever pictures you have. My brother is a DJ and has 300GB of music loaded on hard drives ready to use with another couple hundred GB at least that he has on CDs, so I have plenty to work with... I'm just gonna figure out how to do the thing...
I don't need anyone to go out of their way, this was honestly just thought up a few hours ago so it's in very early planning stages, and since I'm the housewife with nothing else going on who had the idea I should be doing all the hard labor, but if anyone thinks of something that might be of use I'd be eternally grateful :)
Samuel Antonioli
@samuelantonioli
Sep 18 2016 11:00

Sorry if this seems off-topic, but I would like to know if the GTX 1060 6GB suffices to replicate some recent things like WaveNet or matching muted videos with sound without spending weeks training the model - I don't want to waste money on a graphics card which is too weak for these purposes.
I've read a very comprehensive guide but I'm still unsure what accuracy I have to expect after several pooling layers. Has anyone tried TensorFlow implementations of those projects on this GPU (although I know that TF may be slower)? Would be glad to hear from some GTX 1060 users.

@alicejiang1
And the music project sounds like fun, this project seems to do the same. Maybe it's easier to train two models: one for generating sheets of music based on training data (e.g. outputting MIDI files which encode relevant information for synthesis, like DeepJazz which uses MIDI files for training) and another one which synthesizes the sound. This may enable the second model to take music by e.g. Mozart and synthesize it using modern/rock sounds (like "A Neural Algorithm of Artistic Style" for music, although I think that artistic videos are conceptually more similar to music generation than generating one image), and you may profit from TTS research. Google Brain with Magenta seems to be very interested in that topic, too.

David Cope's research, "Experiments in Musical Intelligence", may be interesting for you. I was stunned when I first heard his creation Emily Howell, which is able to analyze structure and repetitions in its training data.

Maybe it's a funny idea to use DQN and Reinforcement Learning techniques, but I wouldn't know what the reward function should look like.
Another spontaneous idea would be to encode the pitch as a number and use an n-gram Markov model to create music based on the training music - I'm not sure if this is a good idea or would generate good music, but I've seen it used to create poems. Music is normally very structured and this may help to assign appropriate probabilities and imitate that structure of chorus and verse.
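A minimal sketch of the n-gram idea above, assuming pitches are encoded as MIDI note numbers (60 = middle C). The toy melody and the function names `build_ngram_model` and `generate` are made up for illustration; a real project would train on pitch sequences extracted from actual MIDI files and would probably also need to model note durations.

```python
import random
from collections import defaultdict

def build_ngram_model(pitches, n=2):
    """Map each context of n consecutive pitches to the pitches that followed it."""
    model = defaultdict(list)
    for i in range(len(pitches) - n):
        context = tuple(pitches[i:i + n])
        model[context].append(pitches[i + n])
    return model

def generate(model, seed, length, rng=None):
    """Extend `seed` to `length` pitches by sampling observed continuations."""
    rng = rng or random.Random()
    n = len(seed)
    out = list(seed)
    while len(out) < length:
        choices = model.get(tuple(out[-n:]))
        if not choices:  # context never seen in training: stop early
            break
        # Repeated continuations appear several times in the list,
        # so a uniform choice follows the observed frequencies.
        out.append(rng.choice(choices))
    return out

# Short toy melody as MIDI note numbers (illustrative training data only).
melody = [60, 62, 64, 60, 60, 62, 64, 60, 64, 65, 67, 64, 65, 67]
model = build_ngram_model(melody, n=2)
new_melody = generate(model, seed=(60, 62), length=12, rng=random.Random(0))
print(new_melody)
```

Every three-note window in the output also occurs somewhere in the training melody, which is why a larger `n` tends to copy the source more closely while a smaller `n` wanders more.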

evaristoc
@evaristoc
Sep 18 2016 11:39
@samuelantonioli Interesting question. I haven't tried any GPU programming so far, but I am not sure those 6GB are enough for quick training. Also, the first thing that came to my mind was the Tesla cards, which until last year were more popular for ML. Of course, it depends on the size of the training/test datasets you have available. Maybe someone else here can give you a better answer?
The article is really a good introduction. If you eventually write some code, please share it with us here. I would like to know more about GPU programming myself...
@alicejiang1 the size of the data is HUGE. You should start thinking about how to get enough hardware for that. As @samuelantonioli is pointing out, some idea of GPU programming might come in handy for this one too. Actually, especially for this one. Of course there are ways to find solutions to this...
This kind of project sounds like an interesting challenge for skale.me, by the way...
evaristoc
@evaristoc
Sep 18 2016 11:55
@alicejiang1 Convolutional NNs are the way to go for this project IMO. Check the guide that @samuelantonioli is providing. @koustuvsinha already did a small project with FCC data using convolutional networks: excellent for getting an idea of the structures. I know Python is not your thing, but check the following material we have looked at here before: http://cs231n.stanford.edu/. Andrej Karpathy is the person you have to check out if you want to know more.
However, be aware that Conv-NNs are REALLY tricky to tune. I was in a short master class about their use in Text Mining and you have several parameters to take into consideration in order to come up with something decent, which means a long grid search, which means a lot of training. Being careful in selecting how to implement an appropriate Error Function for your case is also relevant (L1 or L2 or both?). I also remember we were talking in that class about bootstrap functions that re-normalise the values of the solutions between layers to be applicable for the case study we were evaluating, so... ufff!
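To give a rough feel for why a grid search means a lot of training, here is a small sketch. The hyperparameter names and values are hypothetical, and it reads "L1 or L2" as the choice between mean absolute error and mean squared error, which is an assumption about what was meant.

```python
from itertools import product

# Hypothetical search space; the names and values are illustrative only.
grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "num_filters": [16, 32, 64],
    "kernel_size": [3, 5],
    "loss": ["l1", "l2"],
}

def l1_loss(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def l2_loss(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Every combination in the grid is one full training run.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 3 * 3 * 2 * 2 = 36

# One large error among otherwise exact predictions.
pred, target = [1.0, 2.0, 5.0], [1.0, 2.0, 2.0]
print(l1_loss(pred, target))  # 1.0
print(l2_loss(pred, target))  # 3.0
```

Adding one more parameter with three values would already triple the number of runs, and the squared error punishes the single large miss much harder than the absolute error does.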
evaristoc
@evaristoc
Sep 18 2016 12:04
Right now I am having a problem with a project that, because it is not even optimised for CPU, takes a lot of time to run even though it is VERY small...
@alicejiang1 : I was commenting more on the use of Conv-NNs in case your project gets close to @erictleung's proposal. @samuelantonioli commented on that too.
evaristoc
@evaristoc
Sep 18 2016 12:13
@erictleung I saw that project before: forgot to mention it here... REALLY nice. I think there is also something written about the algo on the internet, with a Python version... don't remember...
evaristoc
@evaristoc
Sep 18 2016 12:19
@alicejiang1 The project I wanted to mention, which you could consider for Emotion Analysis, is OpenCV. Here is an elementary example (it seems more like a kind of test project). For a nice project you need a HUGE annotated library of emotions.
https://www.youtube.com/watch?v=pPclypFDcrk
Victor Durte Diniz Monteiro
@victorddiniz
Sep 18 2016 16:16
Hi everyone! I am new to data science. I read the Free Code Camp article about this project and am really interested in contributing. Any suggestions on how I can start?
evaristoc
@evaristoc
Sep 18 2016 19:45
@victorddiniz nice to have you around! Checking your site: really nice that you are so much into algorithms! Interesting problem proposals for implementing the usual algos. And a lot of practice with C! I don't know much about it (I am more into high-level languages). Stay around!