These are chat archives for deeplearning4j/deeplearning4j/earlyadopters

29th
Apr 2016
Abdullah-Al-Nahid
@lasin02_twitter
Apr 29 2016 01:37
Hi, does anyone have the code for "IRIS Classified With a DBN"?
Adam Gibson
@agibsonccc
Apr 29 2016 01:40
@lasin02_twitter please move to the tuning help channel
This channel isn't for beginners
Justin Long
@crockpotveggies
Apr 29 2016 02:43
@agibsonccc @ds923y I have Anaconda installed on my system and didn't experience that problem
with both 3.8 and 3.9-SNAPSHOT
Adam Gibson
@agibsonccc
Apr 29 2016 02:43
Mind posting your env?
It's looking for anaconda with mkl
so maybe that's the diff?
Most people don't have mkl with anaconda
Alex Black
@AlexDBlack
Apr 29 2016 04:14
@dkmisra new putScalar overrides are merged: https://github.com/deeplearning4j/nd4j/pull/871/files
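For reference, a minimal sketch of putScalar usage (the exact new overload signatures are in the linked PR; the calls below are the long-standing ND4J ones):
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class PutScalarExample {
    public static void main(String[] args) {
        INDArray arr = Nd4j.zeros(3, 3);
        arr.putScalar(new int[]{1, 2}, 5.0); // set the element at row 1, column 2
        arr.putScalar(4, 7.0);               // set via flat (linear) index
        System.out.println(arr);
    }
}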
Justin Long
@crockpotveggies
Apr 29 2016 04:20
@agibsonccc sorry, went for a run. I'm posting it now
justin$ export
declare -x Apple_PubSub_Socket_Render="/private/tmp/com.apple.launchd.GfAj2sI2S8/Render"
declare -x COLORFGBG="7;0"
declare -x GRADLE_OPTS="-Xmx4g"
declare -x HOME="/Users/justin"
declare -x ITERM_PROFILE="Default"
declare -x ITERM_SESSION_ID="w0t4p0"
declare -x JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home"
declare -x LANG="en_CA.UTF-8"
declare -x LIBND4J_HOME="/Users/justin/Projects/libnd4j"
declare -x LOGNAME="justin"
declare -x OLDPWD="/Users/justin/Projects/dl4j-convolutional-net-scala"
declare -x PATH="/Users/justin/anaconda/bin:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin"
declare -x PWD="/Users/justin/Projects/dl4j-test"
declare -x SHELL="/bin/bash"
declare -x SHLVL="1"
declare -x SSH_AUTH_SOCK="/private/tmp/com.apple.launchd.eiEmDLS9sB/Listeners"
declare -x TERM="xterm-256color"
declare -x TERM_PROGRAM="iTerm.app"
declare -x TMPDIR="/var/folders/wt/1lx3xvdj78l9gy6d8xqxb0z80000gn/T/"
declare -x USER="justin"
declare -x XPC_FLAGS="0x0"
declare -x XPC_SERVICE_NAME="0"
declare -x __CF_USER_TEXT_ENCODING="0x1F5:0x0:0x0"
Adam Gibson
@agibsonccc
Apr 29 2016 04:22
oh so.. I'm talking about the anaconda thing
Is your anaconda an MKL build?
Justin Long
@crockpotveggies
Apr 29 2016 04:30
mkl? to be honest I don't use it. downloaded it, ran a single notebook once, and then never touched it after that
Adam Gibson
@agibsonccc
Apr 29 2016 04:31
yeah so I'm wondering if it's because he has an anaconda distro with mkl in it
it doesn't find veclib on osx
for some reason it uses mkl in anaconda
but cmake can't use anaconda's setup without some hacks
at least from what I'm gathering here
Justin Long
@crockpotveggies
Apr 29 2016 04:33
any idea how Anaconda was installed?
IIRC, if you do the brew install I think it isolates it
or vice-versa
looking up that filename, I found an issue in Caffe that mentions mkl as part of CUDA
Justin Long
@crockpotveggies
Apr 29 2016 04:41
This message was deleted
deleted my last message since this seems more relevant: https://cmake.org/pipermail/cmake/2012-April/050068.html
Valerio Zamboni
@vzamboni
Apr 29 2016 08:29
guys, I'm struggling to load a JPG image dataset with a RecordReaderDataSetIterator (I'm using this code: http://deeplearning4j.org/image-data-pipeline.html ) on the latest build of the master branch, but I keep getting a NullPointerException every time I call .next() on the iterator. Do you have any idea why?
Exception in thread "main" java.lang.NullPointerException
at org.nd4j.linalg.dataset.DataSet.<init>(DataSet.java:85)
at org.nd4j.linalg.dataset.DataSet.<init>(DataSet.java:74)
at org.deeplearning4j.datasets.canova.RecordReaderDataSetIterator.getDataSet(RecordReaderDataSetIterator.java:267)
at org.deeplearning4j.datasets.canova.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:170)
at org.deeplearning4j.datasets.canova.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:335)
at org.deeplearning4j.datasets.canova.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:47)
Adam Gibson
@agibsonccc
Apr 29 2016 08:31
@vzamboni file an issue on nd4j thanks!
Patrick Skjennum
@Habitats
Apr 29 2016 08:32
any neat way to cut off all values in an INDArray that match some filter? e.g. all values below zero?
Samuel Audet
@saudet
Apr 29 2016 08:56
@vzamboni Please try to create your reader this way: DataSetIterator iter = new RecordReaderDataSetIterator(recordReader, 1, labels.size());
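For context, a rough sketch of the full pipeline being discussed (Canova-era API; the ImageRecordReader constructor arguments, package paths, label set, and image path are assumptions, not the exact code from the linked page):
import java.io.File;
import java.util.Arrays;
import java.util.List;
import org.canova.api.records.reader.RecordReader;
import org.canova.api.split.FileSplit;
import org.canova.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.canova.RecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class ImagePipelineSketch {
    public static void main(String[] args) throws Exception {
        List<String> labels = Arrays.asList("cat", "dog"); // hypothetical label set
        // height, width, channels, appendLabel - argument order assumed for the Canova-era reader
        RecordReader recordReader = new ImageRecordReader(28, 28, 3, true, labels);
        recordReader.initialize(new FileSplit(new File("/path/to/images")));
        // the non-deprecated constructor from the message above: (reader, batchSize, numPossibleLabels)
        DataSetIterator iter = new RecordReaderDataSetIterator(recordReader, 1, labels.size());
        while (iter.hasNext()) {
            DataSet ds = iter.next();
        }
    }
}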
Adam Gibson
@agibsonccc
Apr 29 2016 09:00
@Habitats the max op does that
Look at Transforms.java for how to use the OpExecutioner directly
Sadat Anwar
@SadatAnwar
Apr 29 2016 09:12
@treo remember my issue with the dl4j training crashing the JVM? Well, I took your advice and made it use very large arrays, and now it crashes for sure as soon as training starts... So how do I explain the problem to you, or anyone else up for helping me sort this out? What I'm trying to do is kind of complicated...
Paul Dubs
@treo
Apr 29 2016 09:14
Now that it crashes fast, you can find the point right before it crashes and isolate it
Sadat Anwar
@SadatAnwar
Apr 29 2016 09:16
so, it's surely crashing somewhere inside the train method. But like you said, I think the problem is created when I am creating my DataSet object
is there a possibility that if my DataSet contains multiple references to the same INDArray, it would cause the crash?
I'll try a step-by-step debug and see where it is really crashing
or can I set a breakpoint in IntelliJ to break when the JVM crashes? (it's not really an exception, is it?)
Paul Dubs
@treo
Apr 29 2016 09:21
no, when the JVM crashes it is too late, but with the larger arrays it should produce a log that gives you a stack trace
Patrick Skjennum
@Habitats
Apr 29 2016 09:22
@agibsonccc ah, awesome. I was confused by the name.
Adam Gibson
@agibsonccc
Apr 29 2016 09:22
There are different max operations
max(0,x) is the cutoff activation for neural nets
there's also the max over all values, which is a reduce
both make sense :(
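Concretely, the two side by side (a minimal sketch; method names such as maxNumber are taken from the ND4J API and should be verified against the version in use):
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

public class MaxOpsSketch {
    public static void main(String[] args) {
        INDArray x = Nd4j.create(new double[]{-2.0, -0.5, 1.0, 3.0});
        INDArray clipped = Transforms.max(x, 0.0);    // element-wise max(0, x) -> [0, 0, 1, 3]
        double largest = x.maxNumber().doubleValue(); // reduce -> single largest value, 3.0
        System.out.println(clipped + " " + largest);
    }
}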
Patrick Skjennum
@Habitats
Apr 29 2016 09:24
yeah, now they do :P I just assumed max was something else. didn't look at the source.
Valerio Zamboni
@vzamboni
Apr 29 2016 09:29
@saudet thanks a lot, it's working that way. I had changed the constructor because I saw it was deprecated
Sadat Anwar
@SadatAnwar
Apr 29 2016 09:32
@treo a log? where would it be? there is nothing printed to the console, and I don't see any new files in the project that I am working with. :(
Paul Dubs
@treo
Apr 29 2016 09:54
then make your arrays even larger :) it should produce a crash log
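(For reference: a HotSpot crash writes a fatal-error log named hs_err_pid<pid>.log into the JVM's working directory by default; the location can be pinned with the -XX:ErrorFile=/some/path/hs_err_%p.log flag, which makes the resulting stack trace easy to find.)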
Justin Long
@crockpotveggies
Apr 29 2016 15:28
@agibsonccc saw your comment here deeplearning4j/nd4j#869 I assume you mean re-open it under libnd4j repo? https://github.com/deeplearning4j/libnd4j
Justin Long
@crockpotveggies
Apr 29 2016 16:00
@agibsonccc never mind, already did it and filed it in that repo: deeplearning4j/libnd4j#153
Paul Dubs
@treo
Apr 29 2016 16:17
the load average now also goes where I expect it to be: right now it's already at 31 and rising, so I expect it to go all the way to 35
ok, now it is at 35 :D
An interesting thing to consider: rc3.9 would have taken 14 days to finish at the rate it was going
On my desktop it would have taken 9
raver119
@raver119
Apr 29 2016 16:20
bring hot spots on
:)
i'm 99.9% sure there's the same issue, with getRow() at the top
Paul Dubs
@treo
Apr 29 2016 16:20
rc3.8 will probably take only 1 day as it looks right now
raver119
@raver119
Apr 29 2016 16:20
I've told you before - it has almost linear scaling
Paul Dubs
@treo
Apr 29 2016 16:21
:)
If it works as it is supposed to :D
raver119
@raver119
Apr 29 2016 16:21
leave Java aside :)
it's definitely related to nd4j internals now
and after you show me the hot spots - we'll both know that :)
Paul Dubs
@treo
Apr 29 2016 16:22
sure :D
you want those for rc3.9, I guess :D
I'll start it with a 100x smaller corpus, so we don't get old waiting for it
raver119
@raver119
Apr 29 2016 16:24
sure
yea, sry, for 3.9. going crazy here with byte offsets
Paul Dubs
@treo
Apr 29 2016 16:25
[screenshot: profiler hot spots]
raver119
@raver119
Apr 29 2016 16:27
yea, ty
exactly the same issue, with a bit of a different flavour
like a banana taste now
Paul Dubs
@treo
Apr 29 2016 16:29
I should get a large aws compute instance to do the training more often... it's noticeably cooler in here :D
Paul Dubs
@treo
Apr 29 2016 17:08
also, really nice speed - 13 seconds per 100k sentences
raver119
@raver119
Apr 29 2016 17:09
thanks
Paul Dubs
@treo
Apr 29 2016 17:09
You have some work to do, to get back to that speed :P
raver119
@raver119
Apr 29 2016 17:10
i doubt that's me
right now it sounds like saudet
:)
Paul Dubs
@treo
Apr 29 2016 17:11
really? Do you think it is something in javacpp?
raver119
@raver119
Apr 29 2016 17:11
open your own hotspots screenshot
two methods topped
buffer.asNio
and pointer.asDirectBuffer
if you check the BaseDataBuffer.asNio method, you'll see it's just a wrapper over the JavaCPP pointer.asByteBuffer call
so, in other words: yes, i'm pretty sure that's jcpp
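The wrapper in question is roughly this shape (a simplified sketch of what's being described, not the actual BaseDataBuffer source):
// simplified sketch: BaseDataBuffer.asNio just delegates to the JavaCPP Pointer,
// so its profiler cost is really the cost of pointer.asByteBuffer
public java.nio.ByteBuffer asNio() {
    return pointer().asByteBuffer();
}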
Paul Dubs
@treo
Apr 29 2016 17:16
Do I read the backtraces there correctly that it pretty much comes down to memory allocation?
Paul Dubs
@treo
Apr 29 2016 17:38
I've run it with tracing instead of sampling again, and to me it looks like working with the shape information is the main reason for the slowdown
[screenshot: profiler trace, shape-info handling at the top]
raver119
@raver119
Apr 29 2016 17:41
expand everything to the bottom please :)
Paul Dubs
@treo
Apr 29 2016 17:41
I know that it goes on to create arrays then, but that isn't my point
my point is that nd4j creates a LOT of C arrays
raver119
@raver119
Apr 29 2016 17:43
yes.
the solution is really simple there - cache them.
that's what I did for GPU
but I'm not sure if it's really worth the hassle on CPU
on GPU, mallocs are REALLY expensive
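The caching idea, sketched in Java (a hypothetical helper; ShapeDescriptor, DataBuffer usage, and createShapeBuffer are stand-in names, not the actual libnd4j/GPU cache):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// keep one immutable shape-info buffer per distinct shape,
// instead of a fresh native allocation per view
private static final Map<ShapeDescriptor, DataBuffer> SHAPE_CACHE = new ConcurrentHashMap<>();

static DataBuffer shapeInfo(ShapeDescriptor key) {
    return SHAPE_CACHE.computeIfAbsent(key, k -> createShapeBuffer(k));
}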
Paul Dubs
@treo
Apr 29 2016 17:45
I guess lots of mallocs on cpu aren't great either
raver119
@raver119
Apr 29 2016 17:46
well
show me expanded hot spots
and we'll see :)
if malloc is the reason
but i doubt that
Paul Dubs
@treo
Apr 29 2016 17:48
See my last hotspot screenshot: all those calls to asDirectBuffer are coming from the constructors of FloatBuffer and IntBuffer
raver119
@raver119
Apr 29 2016 17:49
that's great
but that doesn't have to mean there's a malloc at the bottom
I just want to see what the final leaf of that tree is :)
Paul Dubs
@treo
Apr 29 2016 17:52
[screenshot: hot-spot tree expanded to the leaves]
raver119
@raver119
Apr 29 2016 17:53
pew
allocateArray
Paul Dubs
@treo
Apr 29 2016 18:31
interesting... when I disable JavaCPP garbage collection, creating int buffers takes as long as creating float buffers; if it is enabled, int buffers take 3 times as long (but I run out of memory eventually :D)
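For reference, the toggle being described (the property name is an assumption from the JavaCPP of that era; it must be set before any Pointer is created):
// disable JavaCPP's GC-based deallocator; native memory then has to be freed manually
System.setProperty("org.bytedeco.javacpp.nopointergc", "true");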
raver119
@raver119
Apr 29 2016 18:32
so the deallocator part carries a high price.
Paul Dubs
@treo
Apr 29 2016 18:32
looks like it... I'm not sure it was ever intended to have that many tiny arrays around
raver119
@raver119
Apr 29 2016 18:33
don't forget to include that into your issue :)
Paul Dubs
@treo
Apr 29 2016 18:34
you're right, I should update that issue with the new findings
Paul Dubs
@treo
Apr 29 2016 19:15
say, if I create a new buffer with Nd4j.createBuffer, will that use GPU memory?
raver119
@raver119
Apr 29 2016 19:17
yes
if it's not an IntBuffer, though
but I'm going to lift that limitation
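A minimal sketch of the call being asked about (factory overloads assumed from the ND4J API of the time):
import org.nd4j.linalg.api.buffer.DataBuffer;
import org.nd4j.linalg.factory.Nd4j;

DataBuffer floats = Nd4j.createBuffer(new float[]{1f, 2f, 3f}); // on the CUDA backend this is allocated in GPU memory
DataBuffer ints   = Nd4j.createBuffer(new int[]{0, 1, 2});      // per the above, int buffers were the exception at the time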
Paul Dubs
@treo
Apr 29 2016 19:19
I'm just wondering where all of those int buffers are coming from... and from the signature, they are actually supposed to be views
raver119
@raver119
Apr 29 2016 19:20
yes
but each view
has its own shapeInfoBuffer
describing the view
raver119
@raver119
Apr 29 2016 22:44
@treo deeplearning4j/libnd4j#141 last comment
that will bring us some boost as well.
finally I got there :)