These are chat archives for deeplearning4j/deeplearning4j/earlyadopters

11th
May 2016
Patrick Skjennum
@Habitats
May 11 2016 07:36
@treo @raver119 no overflow of memory after changing to -XX:+UseConcMarkSweepGC, and i cannot notice any difference in training time either. maybe that's something you could mention in the docs?
Paul Dubs
@treo
May 11 2016 08:05
I'd rather not mention changing the GC in the docs; as raver said, it would be better not to rely on that
Patrick Skjennum
@Habitats
May 11 2016 08:28
well it seriously did wonders to my program
i can run 3 spark jobs simultaneously now, even
Paul Dubs
@treo
May 11 2016 08:49
have you tried the g1 gc yet? :D
Patrick Skjennum
@Habitats
May 11 2016 08:51
nope, i don't want to change now:P
i can finally do stuff
raver119
@raver119
May 11 2016 08:53
heh
but i still think that it's wrong if you have to rely on a specific gc for something
Patrick Skjennum
@Habitats
May 11 2016 08:53
these algorithms exist for a reason
default gc behaviour is pretty shitty for high mem, high cpu environments
raver119
@raver119
May 11 2016 08:54
well, luckily i know why they do exist, and i know that our usecase shouldn't involve them :(
Patrick Skjennum
@Habitats
May 11 2016 08:55
dl4j alone, maybe not, but together with spark?
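For anyone trying the same flag: a minimal sketch, using only the standard java.lang.management API, that lists the collectors the running JVM actually uses, so you can confirm that -XX:+UseConcMarkSweepGC (or the G1 alternative mentioned above) took effect. The class name GcCheck is made up for illustration, and the collector names printed vary by JVM vendor and version. Note that the CMS collector was removed in JDK 14, so the flag only applies to older JVMs such as the ones discussed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcCheck {
    /** Names of the garbage collectors registered in this JVM. */
    static List<String> activeCollectors() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with e.g.: java -XX:+UseConcMarkSweepGC GcCheck
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Comparing the collection counts and times between two runs is a cheap way to see whether a GC change is really what made the difference.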
Paul Dubs
@treo
May 11 2016 09:23
damn... jsoup is efficient, just parsed 11k files (2.4gb) in 2 or 3 seconds
Patrick Skjennum
@Habitats
May 11 2016 10:00
but does it beat jackson native?
Paul Dubs
@treo
May 11 2016 10:01
jsoup is for parsing html
not json :)
Patrick Skjennum
@Habitats
May 11 2016 10:01
yeah i just realized
Paul Dubs
@treo
May 11 2016 10:01
given that jsoup implements the WHATWG HTML5 specification and parses HTML to the same DOM as modern browsers do, it is really impressive
Patrick Skjennum
@Habitats
May 11 2016 10:02
yeah i used jsoup for my web crawler
forgot the name, apparently
Michael Sch.
@mschaars
May 11 2016 11:30
@crockpotveggies hehe zoubin is getting more interested in deep learning
Patrick Skjennum
@Habitats
May 11 2016 11:47
any pointers regarding the size of a fatjar with dl4j++? i guess it could get huge?
Michael Sch.
@mschaars
May 11 2016 11:48
someone just asked me whether nd4j/libnd4j uses AVX, do you?
Paul Dubs
@treo
May 11 2016 11:48
it depends on the system it is compiled on
and the blas lib
I'm using it on a skylake cpu with mkl, so it even uses avx2 :)
@Habitats With 3 trained models my fatjar is 200mb
Michael Sch.
@mschaars
May 11 2016 11:49
so it always uses the flag?
Paul Dubs
@treo
May 11 2016 11:49
it tries to vectorize where it can, and is currently compiled with -march=native
Michael Sch.
@mschaars
May 11 2016 11:49
ok, thanks!
Patrick Skjennum
@Habitats
May 11 2016 11:58
alright @treo
@crockpotveggies you're using gradle+scala right?
Justin Long
@crockpotveggies
May 11 2016 14:10
@Habitats correct! Downgraded to Scala 2.10 and targeting jvm 1.7
@mschaars Hehe not too familiar with his history but I do find it very interesting
Patrick Skjennum
@Habitats
May 11 2016 14:28
@crockpotveggies ever experience that dl4j/spark swallows exceptions?
Justin Long
@crockpotveggies
May 11 2016 14:53
mostly when Spark is in cluster mode. Usually Spark will output errors to individual slave logs (not sure how it works when using Spark local) however I'm currently setting up YARN client so I can get those exceptions ;) @Habitats
ME
@enache2004
May 11 2016 15:18
Hi..I want to use the dl4j in the fastest way that is available now
Adam Gibson
@agibsonccc
May 11 2016 15:19
cool so go here
ME
@enache2004
May 11 2016 15:19
:))
start setting up the c++
follow the windows.md
For the release we're going to setup an msi executable
ME
@enache2004
May 11 2016 15:20
it would be great
Adam Gibson
@agibsonccc
May 11 2016 15:20
yeah we appreciate feedback now though
we have people testing with mkl and openblas
@treo has done a lot with that
he runs windows
ME
@enache2004
May 11 2016 15:20
I prefer windows too..for the moment
I'm reading the windows.md and I will come with other questions
Adam Gibson
@agibsonccc
May 11 2016 15:22
cool @treo did a really good job with it
he wrote it
k it's 12:30am time zones etc you'll be in good hands from here
Paul Dubs
@treo
May 11 2016 15:25
@enache2004 hurry, I'm leaving in about an hour as well :P
ME
@enache2004
May 11 2016 15:26
well ...I'm not too experienced with C++ compilers
yesterday I tried to compile the BLAS
Paul Dubs
@treo
May 11 2016 15:26
just follow the windows.md
and skip cuda
ME
@enache2004
May 11 2016 15:26
I've created the PATH but nothing works
ok..I will leave cuda for the next weeks
Paul Dubs
@treo
May 11 2016 15:27
the windows.md instructions should cover pretty much every step, especially if you skip cuda :)
also: that is the only thing you should skip, read everything carefully and don't try to skip ahead
ME
@enache2004
May 11 2016 15:28
oh..
sometimes I don't find patience to read lines of instructions..my bad :D
Paul Dubs
@treo
May 11 2016 15:29
everyone who doesn't read the instructions carefully deserves to waste their time :P
ME
@enache2004
May 11 2016 15:30
I agree
ok...so how much time do you have to support me ?
Patrick Skjennum
@Habitats
May 11 2016 15:32
try the instructions and post gists of whatever errors you get
and take it from there
ME
@enache2004
May 11 2016 15:32
ok...now I'm going to solve the part with Msys2
Paul Dubs
@treo
May 11 2016 15:39
it should take you about half an hour to do everything
(if you have a decent internet connection)
ME
@enache2004
May 11 2016 15:40
I hope
ME
@enache2004
May 11 2016 15:45
ok
I set the mingw64\bin to path
I should test it from cmd ?
Patrick Skjennum
@Habitats
May 11 2016 15:47
use msys2 shell
ME
@enache2004
May 11 2016 15:54
the buildnativeoperations.sh was run
now skipping the cuda I'm going to build nd4j
Paul Dubs
@treo
May 11 2016 15:55
don't forget to export LIBND4J_HOME
ME
@enache2004
May 11 2016 15:55
I'm here
I'm a little bit confused
how to export? it's a keyword for msys2 ?
I've created a variable LIBND4J in system path and user path
Paul Dubs
@treo
May 11 2016 15:59
you literally just have to run export LIBND4J_HOME=`pwd` while in the libnd4j folder
ME
@enache2004
May 11 2016 16:00
using msys2 right ?
Paul Dubs
@treo
May 11 2016 16:00
note that those aren't ' but ` instead
yes while still in the msys2 shell
everything you do should be in the same shell session
ME
@enache2004
May 11 2016 16:01
ok..I've closed them
Paul Dubs
@treo
May 11 2016 16:01
why would you do that?
ME
@enache2004
May 11 2016 16:01
when I run the buildnative.sh
I've just run it from cmd
how to open msys2 in that folder?
Paul Dubs
@treo
May 11 2016 16:02
you can open the msys2 shell from the start menu
ME
@enache2004
May 11 2016 16:02
ok
Paul Dubs
@treo
May 11 2016 16:02
and then you simply cd into the directory
ME
@enache2004
May 11 2016 16:02
and then I have a msys2 shell with cmd like
and another one with fancy colors
Paul Dubs
@treo
May 11 2016 16:03
you take the one with fancy colors
ME
@enache2004
May 11 2016 16:03
but that one doesn't let me navigate
with cd etc.
aa
sorry
Paul Dubs
@treo
May 11 2016 16:04
it does let you navigate, but it is more like a unix shell
if you want to switch drives you cd /d/ instead of D:
everything else should be pretty much the same
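The drive mapping described above can be sketched in a few lines. This is only an illustration of the convention (a path on drive D: starts with /d/ in the msys2 shell); the class and method names are hypothetical, and inside msys2 itself the bundled cygpath tool performs this conversion for you.

```java
public class MsysPath {
    /** Converts a Windows-style path with a drive letter to the msys2 form, e.g. /d/... */
    static String toMsysPath(String windowsPath) {
        String forward = windowsPath.replace('\\', '/');
        boolean hasDrive = forward.length() >= 2
                && Character.isLetter(forward.charAt(0))
                && forward.charAt(1) == ':';
        if (!hasDrive) {
            return forward; // relative or already unix-style: only the slashes change
        }
        char drive = Character.toLowerCase(forward.charAt(0));
        return "/" + drive + forward.substring(2);
    }

    public static void main(String[] args) {
        // The directory from the build log later in this chat
        System.out.println(toMsysPath("D:\\RESEARCH\\nd4j-master"));
    }
}
```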
ME
@enache2004
May 11 2016 16:05
yes
ok so the first option export will create that variable for me ?
I ran it and nothing was shown ..probably it's ok
Paul Dubs
@treo
May 11 2016 16:06
yes
but only for the current session
I prefer that to putting it forever in the environment variables of the system
anyway, now you can build nd4j
ME
@enache2004
May 11 2016 16:07
I did that before running this export option
I've created a variable x_HOME in my session and also in system path
anyway
Paul Dubs
@treo
May 11 2016 16:08
then you should have it for certain now :)
ME
@enache2004
May 11 2016 16:09
ok..I'm cloning the nd4j now
oops...it will take a while to download
Paul Dubs
@treo
May 11 2016 16:12
you can also download only the current state for now: https://github.com/deeplearning4j/nd4j/archive/master.zip
that should be faster than getting all of the history
isn't great for updating though
ME
@enache2004
May 11 2016 16:16
I've downloaded the nd4j-master.zip
but some problems encountered when I run the second option for building the code
Paul Dubs
@treo
May 11 2016 16:19
gist of that?
ME
@enache2004
May 11 2016 16:22
D:\RESEARCH\nd4j-master>mvn clean install -DskipTests -Dmaven.javadoc.skip=true -pl '!org.nd4j:nd4j-cuda-7.5,!org.nd4j:nd4j-tests'
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.nd4j:nd4j-perf:jar:0.4-rc3.9-SNAPSHOT
[WARNING] The expression ${version} is deprecated. Please use ${project.version} instead.
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.nd4j:nd4j-cuda-7.5:jar:0.4-rc3.9-SNAPSHOT
[WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: junit:junit:jar -> version ${junit.version} vs (?) @ org.nd4j:nd4j-cuda-7.5:[unknown-version], D:\RESEARCH\nd4j-master\nd4j-backends\nd4j-backend-impls\nd4j-cuda-7.5\pom.xml, line 188, column 21
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[ERROR] [ERROR] Could not find the selected project in the reactor: '!org.nd4j:nd4j-cuda-7.5 @
[ERROR] Could not find the selected project in the reactor: '!org.nd4j:nd4j-cuda-7.5 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MavenExecutionException
Paul Dubs
@treo
May 11 2016 16:23
you have to update your maven version
install the newest
devlop16
@devlop16
May 11 2016 16:23
Back again for more testing, built current sources correctly, trying cuda. is there a way to disable the log spamming of XLenght & Shapeinfo? thanks
Paul Dubs
@treo
May 11 2016 16:23
also: I've got to leave now
raver119
@raver119
May 11 2016 16:24
@devlop16 on master you can only disable that manually
ME
@enache2004
May 11 2016 16:24
i have the latest 3.3.9
devlop16
@devlop16
May 11 2016 16:24
manually you mean in the sources before building?
Paul Dubs
@treo
May 11 2016 16:25
it isn't using that... and from your prompt I gather that you are out of msys again... as I said earlier: you have to do EVERYTHING in the msys2 shell
raver119
@raver119
May 11 2016 16:25
@devlop16 yes. it's just 2 places in source
Paul Dubs
@treo
May 11 2016 16:25
anyway, I've got to leave now
ME
@enache2004
May 11 2016 16:25
ok..thank you Paul
raver119
@raver119
May 11 2016 16:25
however it still can be useful for detecting bad launch params
devlop16
@devlop16
May 11 2016 16:26
thanks @raver119 will try to find where it is, in libnd4j?
raver119
@raver119
May 11 2016 16:26
just use search. it's all in NativeOps.cu at the beginning of the file
devlop16
@devlop16
May 11 2016 16:26
ok
raver119
@raver119
May 11 2016 16:27
but without that, i dunno what exactly you're going to test :)
it's all about occupancy right now
so launch params is the most important thing there
devlop16
@devlop16
May 11 2016 16:27
trying to train a full network
raver119
@raver119
May 11 2016 16:28
heh
devlop16
@devlop16
May 11 2016 16:28
not yet ready for that?
raver119
@raver119
May 11 2016 16:29
it will work, but it's slow on master
my branch is faster, but right now it's disassembled to atoms
devlop16
@devlop16
May 11 2016 16:29
testing vs the same network in 3.8
better wait then?
raver119
@raver119
May 11 2016 16:30
better use cpu
or test launch params :)
devlop16
@devlop16
May 11 2016 16:31
OK :-)
raver119
@raver119
May 11 2016 16:40
seriously, launch params testing is really important these days
devlop16
@devlop16
May 11 2016 16:56
Ok, I notice that now Useregularization fails if it is false and l1 and l2 are set
AkshitaT
@AkshitaT
May 11 2016 19:08

Hi, I was implementing Canova-cli examples. Both of the text vectorization examples (Children’s Book Example and Tweets Example) are giving me the following error:
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.canova.cli.transforms.text.nlp.TfidfTextVectorizerTransform.convertTextRecordToTFIDFVector(TfidfTextVectorizerTransform.java:192)
at org.canova.cli.transforms.text.nlp.TfidfTextVectorizerTransform.transform(TfidfTextVectorizerTransform.java:350)
at org.canova.cli.vectorization.TextVectorizationEngine.execute(TextVectorizationEngine.java:85)
at org.canova.cli.subcommands.Vectorize.execute(Vectorize.java:286)
at org.canova.cli.driver.CommandLineInterfaceDriver.doMain(CommandLineInterfaceDriver.java:56)
at org.canova.cli.driver.CommandLineInterfaceDriver.main(CommandLineInterfaceDriver.java:79)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:4788)
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:4716)
at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:148)
... 6 more
Caused by: java.lang.NullPointerException
at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:4749)
... 8 more

Paul Dubs
@treo
May 11 2016 19:09
are you sure you have switched to nd4j-native?
raver119
@raver119
May 11 2016 19:09
as usual - check your pom.xml
AkshitaT
@AkshitaT
May 11 2016 19:12
I will check my pom!
Patrick Skjennum
@Habitats
May 11 2016 20:41
any idea of how to persist a dataset in ram between jvm restarts:p? haven't found any suitable database and http is way too slow
Paul Dubs
@treo
May 11 2016 20:43
persist a dataset in ram between jvm restarts? what exactly are you trying to do?
Patrick Skjennum
@Habitats
May 11 2016 20:43
i don't want to reload my dataset every time i change my neural config, and i'm too lazy to make a gui
but i guess my life would be easier if i just made one
Paul Dubs
@treo
May 11 2016 20:44
Your life would be easier if you could load your data faster :P
Patrick Skjennum
@Habitats
May 11 2016 20:44
yeah but i cannot load faster than the read speeds on my ssd
Paul Dubs
@treo
May 11 2016 20:44
get a faster ssd :P
Patrick Skjennum
@Habitats
May 11 2016 20:44
you're not helping
Paul Dubs
@treo
May 11 2016 20:45
I know :)
Patrick Skjennum
@Habitats
May 11 2016 20:45
i just want to be able to tune hyperparameters quickly
Paul Dubs
@treo
May 11 2016 20:45
but really, 512GB@2gb/s is pretty cheap
Patrick Skjennum
@Habitats
May 11 2016 20:46
i bought a samsung 512gb 840 PRO when it was new
that thing cost a fucking fortune back then. i'm not throwing it out:P
Paul Dubs
@treo
May 11 2016 20:46
:D
you are still using the text based wordvectors though?
Patrick Skjennum
@Habitats
May 11 2016 20:48
i could create a ramdisk i guess
Paul Dubs
@treo
May 11 2016 20:48
you could also reduce the size of your data
that's why I ask how you are currently storing it
Patrick Skjennum
@Habitats
May 11 2016 20:49
massive waste though:P
how big did you say was the file, 17GB?
Or, better yet, how many documents do you have there? I just want to calculate how small the file may be if you save it as a binary
Paul Dubs
@treo
May 11 2016 20:58
scrolled all the way back up, to find it, so the file doesn't need to be any larger than 5.5gb
how does a 3x speedup sound?
actually it should load in about 10 seconds on a 840 Pro
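A rough sketch of the kind of size estimate being made here: in a binary file each float takes 4 bytes, so the payload is roughly vectors x dimensions x 4 bytes, while a text file spends several characters per number. The layout below (two int header fields followed by raw floats) and the names BinaryVectors, encode and decode are assumptions for illustration, not the format dl4j uses, and the vector count and dimension in main are made-up numbers.

```java
import java.nio.ByteBuffer;

public class BinaryVectors {
    static final int HEADER_BYTES = 2 * Integer.BYTES; // row count + dimension

    /** Payload size in bytes for the given number of float vectors. */
    static long payloadBytes(long vectors, int dim) {
        return vectors * dim * Float.BYTES;
    }

    /** Packs equally sized vectors into: [rows][dim][floats...]. */
    static byte[] encode(float[][] vectors) {
        int rows = vectors.length;
        int dim = rows == 0 ? 0 : vectors[0].length;
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + (int) payloadBytes(rows, dim));
        buf.putInt(rows).putInt(dim);
        for (float[] v : vectors) {
            for (float x : v) {
                buf.putFloat(x);
            }
        }
        return buf.array();
    }

    /** Reads back what encode produced. */
    static float[][] decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int rows = buf.getInt();
        int dim = buf.getInt();
        float[][] vectors = new float[rows][dim];
        for (float[] v : vectors) {
            for (int i = 0; i < dim; i++) {
                v[i] = buf.getFloat();
            }
        }
        return vectors;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 1,000,000 vectors of 300 floats each
        System.out.println(payloadBytes(1_000_000L, 300) + " bytes");
    }
}
```

For a multi-gigabyte file you would stream through a FileChannel or DataOutputStream rather than build a single in-memory array; the in-memory version just keeps the sketch short.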
Paul Dubs
@treo
May 11 2016 21:12
looks like putScalar is slow once again
raver119
@raver119
May 11 2016 21:19
putScalar actually shouldn't be fast, and shouldn't be used
1) build java array locally
2) create your vector from that java array
3) put it as a row in your vocab
that'll be way faster
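The three steps above boil down to: fill a plain Java array first, then hand it over in one bulk copy instead of making one call per element. The sketch below illustrates that idea with a direct java.nio.FloatBuffer standing in for the off-heap storage. It is not nd4j code, and the class and method names are invented for this example; in nd4j the corresponding step would be creating the vector from the float[] and putting it as a row.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class BulkRowCopy {
    /** Off-heap table of rows x dim floats, zero-initialized. */
    static FloatBuffer allocate(int rows, int dim) {
        return ByteBuffer.allocateDirect(rows * dim * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Slow pattern: one call into the buffer per element (the putScalar style). */
    static void putElementwise(FloatBuffer table, int row, int dim, float[] vector) {
        for (int i = 0; i < dim; i++) {
            table.put(row * dim + i, vector[i]);
        }
    }

    /** Preferred pattern: build the array locally, then copy the whole row at once. */
    static void putRow(FloatBuffer table, int row, int dim, float[] vector) {
        FloatBuffer view = table.duplicate(); // shares content, has its own position
        view.position(row * dim);
        view.put(vector, 0, dim);             // single bulk transfer
    }

    public static void main(String[] args) {
        int dim = 4;
        FloatBuffer table = allocate(3, dim);
        float[] vector = {1f, 2f, 3f, 4f};    // step 1: local java array
        putRow(table, 1, dim, vector);        // steps 2 and 3: one copy into row 1
        System.out.println(table.get(dim) + " ... " + table.get(2 * dim - 1));
    }
}
```

Both methods leave the table in the same state; the difference is only in how many calls cross into the buffer per row.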
raver119
@raver119
May 11 2016 21:21
i know
i hadn't got there yet
check JCublasNDArray, and you'll see it's not the case there
i'll propagate the same changes to cpu as well, when i'm done here
cpu backend will use multiple optimizations that were implemented in cuda backend, and the one you've mentioned - is one of them :)
raver119
@raver119
May 11 2016 21:24
yes
trace it
Paul Dubs
@treo
May 11 2016 21:24
just calls super
raver119
@raver119
May 11 2016 21:24
trace to super
and you'll see that it comes to createBuffer call
that gets routed to appropriate bufferFactory
it gets memcpy'd in 1 call
instead of multiple cycled jni calls for put