These are chat archives for deeplearning4j/deeplearning4j/earlyadopters

26th
Apr 2016
Justin Long
@crockpotveggies
Apr 26 2016 00:27
@agibsonccc ah thanks for that, just looking at this now
Adam Gibson
@agibsonccc
Apr 26 2016 00:27
finally getting to the new cuda reduce with TAD :D
Samuel Audet
@saudet
Apr 26 2016 00:32
C++ demangling is available on the command line:
$ c++filt _ZNSt12length_errorD1Ev
std::length_error::~length_error()
Justin Long
@crockpotveggies
Apr 26 2016 00:33
ah sweet even better :)
Adam Gibson
@agibsonccc
Apr 26 2016 00:33
we got rid of that
it's not even used anymore
Justin Long
@crockpotveggies
Apr 26 2016 00:35
ah good to know. I've been trying to follow along and understand these concepts. I've been doing a lot of my work in Scala and trying to understand the backend implementations and whatnot
I'm a C++ n00b but thought I'd invest the time in it
Samuel Audet
@saudet
Apr 26 2016 00:35
@crockpotveggies GCC isn't very well supported on the OS X platform. Any reason for not using Clang?
Adam Gibson
@agibsonccc
Apr 26 2016 00:35
TAD is my invention
It's a bit..interesting
A lot of the details from it were me empirically trying stuff and seeing what stuck
lol
Justin Long
@crockpotveggies
Apr 26 2016 00:37
@saudet if I run gcc --version it gives me
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.3.0 (clang-703.0.29)
Target: x86_64-apple-darwin15.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Adam Gibson
@agibsonccc
Apr 26 2016 00:37
Basic idea: given k dimensions (eg rows/columns) of an ndarray, what is the shape/stride of each sub-array along those dimensions
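A Python sketch of that idea (illustrative only, not the libnd4j implementation): the tensor along k chosen dimensions has the shape of those dimensions, and there is one such tensor for every combination of the remaining indices.

```python
# Sketch of the TAD ("tensor along dimension") idea: given an ndarray shape
# and k target dimensions, the shape of each sub-tensor is the extents of
# those dimensions, and the number of sub-tensors is the product of the rest.

def tad_shape(shape, dims):
    return [shape[d] for d in dims]

def num_tads(shape, dims):
    n = 1
    for i, extent in enumerate(shape):
        if i not in dims:
            n *= extent
    return n

shape = [6, 1, 1, 2, 2]   # the "weird shape" mentioned later in the chat
dims = [3, 4]             # tensors along the last two dimensions
print(tad_shape(shape, dims))  # [2, 2]
print(num_tads(shape, dims))   # 6
```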
Justin Long
@crockpotveggies
Apr 26 2016 00:37
@saudet so looks like clang to me?
Adam Gibson
@agibsonccc
Apr 26 2016 00:37
I would try to "use" that not "understand" it
heh
it's a core abstraction we use to parallelize problems though
not much really existed to do that before
Samuel Audet
@saudet
Apr 26 2016 00:38
/usr/include/c++/4.2.1 that looks suspicious
Justin Long
@crockpotveggies
Apr 26 2016 00:38
@agibsonccc ah if I am "getting" you I ran into similar problems when trying to parallelize graph computation problems
Adam Gibson
@agibsonccc
Apr 26 2016 00:39
right
I broke that open
at least for ndarays
ndarrays*
It was highly tedious to find something that worked that was ACTUALLY general purpose
rather than the stupid benchmarks you typically see
"oh hey guyz I can sum a whole buffer!"
Justin Long
@crockpotveggies
Apr 26 2016 00:40
@agibsonccc I found it extremely difficult to parallelize an algorithm such as Markov Centrality
Adam Gibson
@agibsonccc
Apr 26 2016 00:40
vs "oh this needs to work on real problems and less than ideal circumstances"
Justin Long
@crockpotveggies
Apr 26 2016 00:40
@agibsonccc heh ;)
Adam Gibson
@agibsonccc
Apr 26 2016 00:40
"on arbitrary dimensions including weird shapes like 6,1,1,2,2"
numpy said "screw it, everything's an iterator"
Justin Long
@crockpotveggies
Apr 26 2016 00:40
@saudet what should I be seeing there? if not at all?
Samuel Audet
@saudet
Apr 26 2016 00:41
But that's what I get here as well, so that's alright
Justin Long
@crockpotveggies
Apr 26 2016 00:41
@saudet ah ok
Samuel Audet
@saudet
Apr 26 2016 00:45
@crockpotveggies Could you copy/paste the whole console output of the build? (in a gist or something to not spam this channel)
@saudet naive thought: is it possible that libc++ is being used and not libstdc++?
Samuel Audet
@saudet
Apr 26 2016 00:49
Yes, that's normal. Apple doesn't want to use libstdc++, they use libc++
I get pretty much the same output, but I use a slightly older version of Xcode, so it's possible that for very recent versions of Xcode the binaries for clang-omp are not up-to-date. We might need to rebuild from source...
Justin Long
@crockpotveggies
Apr 26 2016 00:51
ha oh shit :P
@saudet just so I understand you correctly, you mean rebuild clang-omp from source?
Samuel Audet
@saudet
Apr 26 2016 00:55
Yeah... sounds complicated. Maybe the binary is available, and just reinstalling everything will get things in order
Justin Long
@crockpotveggies
Apr 26 2016 00:55
going to try brew reinstall clang-omp here we go...
nope :(
Samuel Audet
@saudet
Apr 26 2016 00:57
I mean, it's possible for brew's database to be corrupted etc
Justin Long
@crockpotveggies
Apr 26 2016 01:04
@saudet what's weird is nd4j would "start" to compile and would throw an error at a missing omp.h include. I then ran xcode-select --install and it finally made it to that most recent error. any chance they're related?
@saudet I'm just starting to think my local is messy and perhaps Homebrew is referencing different compilers, etc.
Samuel Audet
@saudet
Apr 26 2016 01:09
Yeah, that's what it sounds like. It seems like the clang-omp version is too old for your libs
Justin Long
@crockpotveggies
Apr 26 2016 01:11
I'm also going to delete the project and re-download for sanity check
negiyas
@negiyas
Apr 26 2016 01:43
It seems that the latest version causes compiler errors on src/main/java/org/deeplearning4j/examples/feedforward/mnist/MLPMnistTwoLayerExample.java
Is my tentative patch right?

$ diff dl4j-0.4-examples/src/main/java/org/deeplearning4j/examples/feedforward/mnist/MLPMnistTwoLayerExample.java{,.org}
24c24
< public class MLPMnistTwoLayerExample {
---
> public class MLPMnistSingleLayerExample {
26c26
< private static Logger log = LoggerFactory.getLogger(MLPMnistTwoLayerExample.class);
---
> private static Logger log = LoggerFactory.getLogger(MLPMnistSingleLayerExample.class);
Alex Black
@AlexDBlack
Apr 26 2016 02:16
ah, missed that when I merged it. yeah, that looks correct. send us a PR if you don't mind :)
Adam Gibson
@agibsonccc
Apr 26 2016 02:17
fixed
negiyas
@negiyas
Apr 26 2016 02:17
thx
Cryddit
@Cryddit
Apr 26 2016 05:19
Wups. Why did it work for me then?
Damn, that's embarrassing.
Andreas Eberle
@andreas-eberle
Apr 26 2016 08:01
Hi guys, I just wanted to build cuda with the current master but got the following error https://gist.github.com/andreas-eberle/abb91372508225cdbf853bc5579c6a06
any ideas?
raver119
@raver119
Apr 26 2016 08:40
yea. don't build cuda until adam and me finish that today
we're both changing stuff there, adam improves TAD and i'm passing proper memory there
Andreas Eberle
@andreas-eberle
Apr 26 2016 08:45
k
Adam Gibson
@agibsonccc
Apr 26 2016 09:36
No but seriously, all that matters is actual data
anything else is useless noise
I mean beyond it happening
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:36
what should I provide you?
Adam Gibson
@agibsonccc
Apr 26 2016 09:36
at minimum jvisualvm and how you're compiling libnd4j
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:36
I mean in the task manager, I can see that my CPU isn't fully utilized... 25% is far away from 100
Adam Gibson
@agibsonccc
Apr 26 2016 09:37
feel free to file issues with an info dump and we can start from there
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:37
libnd4j compiled with release
Adam Gibson
@agibsonccc
Apr 26 2016 09:37
I mean it doesn't do us much good without some sort of diagnostics
Paul Dubs
@treo
Apr 26 2016 09:37
Most of the time I run my i7-6700k maxed out, so it isn't just an issue with dl4j
Alex Black
@AlexDBlack
Apr 26 2016 09:37
we already know CNNs are slow, I haven't merged my new branch yet
due to subsampling layer
Paul Dubs
@treo
Apr 26 2016 09:37
It depends on what you are actually doing
Adam Gibson
@agibsonccc
Apr 26 2016 09:37
right
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:38
what I'm doing: I just run the LenetMnistExample with nd4j-native build from last weeks master with release build
and what I expect is to see all CPUs running maxed out.
Adam Gibson
@agibsonccc
Apr 26 2016 09:38
That's nice
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:38
Is there anything I need to configure to get this?
Adam Gibson
@agibsonccc
Apr 26 2016 09:38
But data
"Data"
It should auto configure
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:39
ok
Alex Black
@AlexDBlack
Apr 26 2016 09:39
tl;dr is there's a bottleneck in CNNs currently that I'm working to eliminate
Adam Gibson
@agibsonccc
Apr 26 2016 09:39
Try other neural nets too
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:39
which example should I run?
Alex Black
@AlexDBlack
Apr 26 2016 09:39
right, MLPs and RNNs should be pretty good
MLP mnist single layer is a good start
or the LSTM character one
Paul Dubs
@treo
Apr 26 2016 09:40
the mlp mnist single layer one runs at 100% utilization for me
(using MKL and MKL_DYNAMIC=false)
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:41
with hyper threads included or not? With hyper threads I have 60% utilized
Paul Dubs
@treo
Apr 26 2016 09:41
that's where the MKL_DYNAMIC thing comes in
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:42
where do I have to add this?
Paul Dubs
@treo
Apr 26 2016 09:42
are you using mkl?
if not, then you need another env variable
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:43
what's MKL? Do I have to install it first?
Adam Gibson
@agibsonccc
Apr 26 2016 09:43
uhhh
it's the only fast cpu blas impl
Paul Dubs
@treo
Apr 26 2016 09:44
it is the only fast one that you don't have to compile yourself :D
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:44
I compiled openblas... but I guess I will use that one then
Paul Dubs
@treo
Apr 26 2016 09:45
https://software.intel.com/sites/campaigns/nest/ get it from there, it is free :)
Alex Black
@AlexDBlack
Apr 26 2016 09:45
if you are compiling openblas, be careful about the number of cores... defaulted to 2 cores for me, even on my 8 core machine
but yeah, mkl is better
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:58
is there only a linux installation?
Paul Dubs
@treo
Apr 26 2016 09:58
no
it is available for windows as well
Andreas Eberle
@andreas-eberle
Apr 26 2016 09:59
hmm, must have missed that option..
k, found it
badend
@badend
Apr 26 2016 10:06
This message was deleted
Patrick Skjennum
@Habitats
Apr 26 2016 10:33
@AlexDBlack i think the display for mean magnitudes is too small when the values become tiny?
Alex Black
@AlexDBlack
Apr 26 2016 10:34
you mean the values on the left? or bottom? (though neither are great in that image)
open an issue I guess :)
Patrick Skjennum
@Habitats
Apr 26 2016 10:35
was referring to the y-axis
Paul Dubs
@treo
Apr 26 2016 10:36
you mean that the text doesn't fit?
Patrick Skjennum
@Habitats
Apr 26 2016 10:36
yeah i have no idea if it's 0.0001, or 0.0000001 :P
no way to judge by the GUI
Paul Dubs
@treo
Apr 26 2016 10:36
it should probably switch to 1e-X notation after 0.001
after that counting the zeros is tedious :D
Patrick Skjennum
@Habitats
Apr 26 2016 10:40
issue created
Alex Black
@AlexDBlack
Apr 26 2016 10:41
thanks
Patrick Skjennum
@Habitats
Apr 26 2016 10:41
@treo should probably, but it doesn't:P
@treo you're not in the tuning channel so i goto bug you here:P
did you say i could easily use minibatch of 1k for that ffn?
wait you're in the channel but didn't show up when i did @t wut:s
Paul Dubs
@treo
Apr 26 2016 10:45
I am in the tuning channel :D
Patrick Skjennum
@Habitats
Apr 26 2016 10:45
yeah gitter is fuckin with me
Paul Dubs
@treo
Apr 26 2016 10:45
it sometimes takes a while to show up
Yes, I did say that you can use a minibatch of 1k for that ffn
Alex Black
@AlexDBlack
Apr 26 2016 11:02
heads up for anyone that cares (@raver119, @treo, @eraly ?)
just merged the CNN subsampling layer optimizations (ismax) stuff
seeing >20% improvement on the current LenetMnistExample with that... haven't looked into new bottlenecks etc though
raver119
@raver119
Apr 26 2016 11:03
cool
Paul Dubs
@treo
Apr 26 2016 11:04
on the topic of bottlenecks, though: By moving the examples order to 'f' in SyntheticRNN I could go back to the same speed that I had before the TAD merge
Alex Black
@AlexDBlack
Apr 26 2016 11:04
examples as in the examples repo?
yeah, should switch those DataSetIterators to f order...
already did for SequenceRecordReaderDataSetIterator, but nothing else
Paul Dubs
@treo
Apr 26 2016 11:05
no, Examples as in generated stuff for the Benchmarks :)
Alex Black
@AlexDBlack
Apr 26 2016 11:05
oh, yeah, that too
switch all the examples!
Paul Dubs
@treo
Apr 26 2016 11:05
:D
Sometime we should document that somewhere: "If you build your own DataSetIterator you should make sure your features / labels are 'f' ordered"
Alex Black
@AlexDBlack
Apr 26 2016 11:07
yeah, probably on the using rnns page I think
good idea
Paul Dubs
@treo
Apr 26 2016 11:08
Is that only for RNNs though?
Alex Black
@AlexDBlack
Apr 26 2016 11:08
yep
with f order the buffer of the 3d time series data is basically [time step 1][time step 2][time step 3][...]
where each time step is 2d matrix
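A quick sketch confirming that layout (shapes are illustrative): in 'f' (column-major) order the offset of index (i, j, k) in an [M, N, T] array is i + j*M + k*M*N, so each time step owns one contiguous block of M*N elements.

```python
# Column-major ('f') offset for index (i, j, k) in a [M, N, T] array:
# offset = i + j*M + k*M*N, so each time step k owns one contiguous
# block of M*N elements -- exactly the [step 1][step 2][...] layout.

def f_offset(i, j, k, M, N):
    return i + j * M + k * M * N

M, N, T = 2, 3, 4  # miniBatch, featureSize, timeSteps (illustrative sizes)
step_blocks = []
for k in range(T):
    offs = sorted(f_offset(i, j, k, M, N) for i in range(M) for j in range(N))
    step_blocks.append(offs)

print(step_blocks[0])  # [0, 1, 2, 3, 4, 5] -- time step 0 is contiguous
print(step_blocks[1])  # [6, 7, 8, 9, 10, 11]
```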
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:20
@treo: I now have MKL installed, what environment variables do I have to set?
Paul Dubs
@treo
Apr 26 2016 11:20
have you rebuilt libnd4j?
you have to rebuild it, so it links to mkl
And you have to make sure that you have the mkl libs on your path
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:21
k
will it automatically link to mkl instead of openblas?
Or do I have to remove openblas from path
?
Paul Dubs
@treo
Apr 26 2016 11:22
it tries to find mkl first
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:22
k
Paul Dubs
@treo
Apr 26 2016 11:22
you know that it worked, when after building libnd4j you see mkl_rt.dll at the end of the output
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:25
I just saw that libnd4j cpu builds with DEBUG, although I used ./buildnativeoperations.sh blas cpu Release
Paul Dubs
@treo
Apr 26 2016 11:25
don't care about that right now
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:25
k
which libnd4j commit is still building? The current master does not build...
Paul Dubs
@treo
Apr 26 2016 11:29
it does build...
or are you talking about cuda?
but if you are looking for speed, you can't be currently talking about cuda
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:30
cpu did build in libnd4j, but nd4j didn't build
Paul Dubs
@treo
Apr 26 2016 11:31
nd4j does also build just fine
so, what is your error?
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:35
I had to run the mvn in a bash... didn't know that
This message was deleted
of course it fails on cuda now... but running mvn clean install -DskipTests -Dmaven.javadoc.skip=true -pl '!org.nd4j:nd4j-cuda-7.5' as written in windows.md fails with https://gist.github.com/andreas-eberle/39c257bd374610bd3ae883a26dc9c66a
Paul Dubs
@treo
Apr 26 2016 11:37
404
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:38
gist doesn't work...
Paul Dubs
@treo
Apr 26 2016 11:38
use pastebin
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:38
Paul Dubs
@treo
Apr 26 2016 11:39
whats your pwd
i.e. in which path are you running that?
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:40
in the nd4j path
Paul Dubs
@treo
Apr 26 2016 11:40
also, try to run it with -pl '!:nd4j-cuda-7.5'
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:40
same error
pwd is /d/test/Beispielproject/nd4j
Alex Black
@AlexDBlack
Apr 26 2016 11:41
what's your maven version?
"mvn --version"
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:41
Apache Maven 3.0.5 (r01de14724cdef164cd33c7c8c2fe155faf9602da; 2013-02-19 14:51:28+0100)
Maven home: D:\programs\IntelliJ\plugins\maven\lib\maven3
Java version: 1.8.0_77, vendor: Oracle Corporation
Java home: C:\Program Files\Java\jdk1.8.0_77\jre
Default locale: de_DE, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "dos"
Paul Dubs
@treo
Apr 26 2016 11:41
oh, right, this only works on newer maven versions
Alex Black
@AlexDBlack
Apr 26 2016 11:41
hm, not super recent, but not super old, either
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:42
it worked last week XD
Paul Dubs
@treo
Apr 26 2016 11:42
you were probably not skipping cuda there
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:42
yes, that's true
Paul Dubs
@treo
Apr 26 2016 11:42
update your maven
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:42
k
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:50
ok, that helped
Which is the minimum maven version required for this? This should be added to the windows.md
Paul Dubs
@treo
Apr 26 2016 11:53
Not sure, I just noticed it doesn't work on the 3.0 series, and updated to the 3.3 series
Alex Black
@AlexDBlack
Apr 26 2016 11:54
works on 3.2.3
Andreas Eberle
@andreas-eberle
Apr 26 2016 11:55
so, I built libnd4j and nd4j. Do I have to add any more environment variables?
how can I see that mkl is actually used?
Paul Dubs
@treo
Apr 26 2016 11:56
did it have mkl_rt.dll at the end of the output when you built libnd4j?
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:00
no, don't think so
Paul Dubs
@treo
Apr 26 2016 12:00
-lopenblas, so it didn't pick up mkl
post the output from running export on your shell
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:01
Paul Dubs
@treo
Apr 26 2016 12:02
have you restarted your shell after installing mkl?
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:02
yes
Paul Dubs
@treo
Apr 26 2016 12:03
ok, then add the following manually to your path:
C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\redist\intel64_win\mkl
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:06
that helped
ok, CPU is at 80% now and it "looks" faster regarding the debug output
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:13
is it a problem that libnd4j only compiled with debug ?
raver119
@raver119
Apr 26 2016 12:13
no, it's not a problem, since it's not released yet :)
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:14
;) will it slow the calculations down a lot?
Paul Dubs
@treo
Apr 26 2016 12:19
not really
raver119
@raver119
Apr 26 2016 12:19
that's just debug information compiled in
Paul Dubs
@treo
Apr 26 2016 12:19
Also, if you want 100% utilisation, you can set the MKL_DYNAMIC=false environment variable
BUT, you will get only a minor speedup
because MKL usually knows what it is doing
Patrick Skjennum
@Habitats
Apr 26 2016 12:24
btw; how can i get the UIServer to not hijack my logger?
because atm it just hijacks everything and spams
INFO [2016-04-26 12:25:24,867] org.deeplearning4j.ui.weights.HistogramIterationListener: InboundJaxrsResponse{context=ClientResponse{method=POST, uri=http://localhost:50507/weights/update?sid=72100f72-beaa-4498-97fe-e065e6ccb7f1, status=200, reason=OK}}
INFO [2016-04-26 12:25:25,901] org.deeplearning4j.ui.weights.HistogramIterationListener: InboundJaxrsResponse{context=ClientResponse{method=POST, uri=http://localhost:50507/weights/update?sid=72100f72-beaa-4498-97fe-e065e6ccb7f1, status=200, reason=OK}}
Alex Black
@AlexDBlack
Apr 26 2016 12:26
dropwizard.yml: add the following under "server:"
requestLog:
  appenders: []
if you don't have a dropwizard.yml, get the one from the examples repo
Patrick Skjennum
@Habitats
Apr 26 2016 12:27
i just put it in resources?
just needs to be on classpath?
Alex Black
@AlexDBlack
Apr 26 2016 12:27
yeah, should be fine there
resources that is
Patrick Skjennum
@Habitats
Apr 26 2016 12:28
got one from example repo and it already has that stuff, neat
Andreas Eberle
@andreas-eberle
Apr 26 2016 12:39
@treo: Before, you talked about the 'f' order of the DataSetIterators, where can I change that?
Paul Dubs
@treo
Apr 26 2016 12:39
it is just a parameter to Nd4j.create
but that only helps for RNNs
(says Alex, I haven't tried it for fnns yet)
Alex Black
@AlexDBlack
Apr 26 2016 12:42
@andreas-eberle if you are using SequenceRecordReaderDataSetIterator as of current dl4j master, it's done for you
raver119
@raver119
Apr 26 2016 12:56
how do you think, emulating singleton using oldschool pointer-style - bad idea? :)
Patrick Skjennum
@Habitats
Apr 26 2016 13:14
adding appenders: [] didn't remove the spam btw @AlexDBlack
Paul Dubs
@treo
Apr 26 2016 13:20
@raver119 depends on what you want to do
@Habitats you can always configure in your logback.xml that you don't want messages lower than WARN from org.deeplearning4j.ui.weights.HistogramIterationListener
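A minimal logback.xml along those lines (a sketch; the logger name is taken from the spam above, the appender and pattern are placeholders):

```xml
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder><pattern>%d %-5level %logger{36} - %msg%n</pattern></encoder>
  </appender>

  <!-- drop anything below WARN from the spammy listener -->
  <logger name="org.deeplearning4j.ui.weights.HistogramIterationListener" level="WARN"/>

  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```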
Alex Black
@AlexDBlack
Apr 26 2016 13:21
I think dropwizard won't pay attention to that though
not 100% sure
Paul Dubs
@treo
Apr 26 2016 13:21
it grabs an available logger, so I would guess configuring the logger should work
Alex Black
@AlexDBlack
Apr 26 2016 13:21
anyway, the dropwizard.yml thing should do it, if it's properly formatted and in the right place
Patrick Skjennum
@Habitats
Apr 26 2016 13:22
i just copied the file from the examples
Alex Black
@AlexDBlack
Apr 26 2016 13:24
hm, I'd have to look into that. might be as simple as not the right directory though...
try adding it to the root directory of your project
Patrick Skjennum
@Habitats
Apr 26 2016 13:26
all of my other resources are loaded correctly
Andreas Eberle
@andreas-eberle
Apr 26 2016 13:26
@AlexDBlack @treo : ok, thanks
Patrick Skjennum
@Habitats
Apr 26 2016 13:33
@AlexDBlack changing the log level for dl4j in dropwizard.yml worked
Sadat Anwar
@SadatAnwar
Apr 26 2016 14:41
hey guys, any one noticed the JVM crash while using dl4j?
its been happening a bit here with me, and there is no explanation, and at times it wont crash, but at times it does, even when running the same program
Adam Gibson
@agibsonccc
Apr 26 2016 14:42
@sadatanwer we need minimal programs that reproduce this stuff
try to track it down and file an issue
Sadat Anwar
@SadatAnwar
Apr 26 2016 14:44
Yeah, that is what I am trying, but it's so inconsistent. I had a similar issue once when I was using a dll with JNI and it kept crashing. I think it was because the dll process would die (or throw an exception, just a theory) and that would kill the JVM. Maybe something similar is happening here?
@agibsonccc have you ever experienced this? this is the only message I get after a box saying the JVM crashed: Process finished with exit code -1073741819 (0xC0000005)
Adam Gibson
@agibsonccc
Apr 26 2016 14:47
just give us as much info as you can
it's better knowing it exists
Sadat Anwar
@SadatAnwar
Apr 26 2016 14:48
sure... :D
(if I can figure it out my self!)
Paul Dubs
@treo
Apr 26 2016 14:50
This looks pretty much like a smashed stack
But first of all: make sure you are running the current master
Sadat Anwar
@SadatAnwar
Apr 26 2016 15:01
@treo I think so too, do you know what could cause this? I compiled libnd4j and nd4j just yesterday, so not sure if its current master, but its not very old for sure. Also after a few crashed attempts it seems to be running fine now (fingers crossed)
(without any changes to the code)
Paul Dubs
@treo
Apr 26 2016 15:02
It is usually caused by shape mismatches that aren't caught early enough
When it tries to write somewhere where it shouldn't, it overwrites your stack
sometimes it just writes somewhere else in memory
writing there might not overwrite the stack, and so it runs on happily corrupting memory until it writes somewhere where it crashes the jvm
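A sketch of the kind of early check that avoids this (hypothetical helper, not nd4j's actual validation): rejecting mismatched shapes in the JVM before any native buffer is touched turns a memory-corrupting write into an ordinary exception.

```python
def check_elementwise(shape_a, shape_b):
    # An op writing len(a) elements into b's buffer with mismatched shapes
    # is exactly the out-of-bounds write described above; catching the
    # mismatch early keeps it a recoverable error instead of a crash.
    if shape_a != shape_b:
        raise ValueError("shape mismatch: %s vs %s" % (shape_a, shape_b))

check_elementwise([2, 3], [2, 3])      # fine, shapes agree
try:
    check_elementwise([2, 3], [3, 2])  # caught here instead of crashing the JVM
except ValueError as e:
    print(e)  # shape mismatch: [2, 3] vs [3, 2]
```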
Sadat Anwar
@SadatAnwar
Apr 26 2016 15:07
interesting.... and that is why I don't like c++ hahaha! No, but what you are saying seems plausible, I could have some stuff like that in here, I am testing some really crazy ideas. I'll take a closer look at the arrays....
Paul Dubs
@treo
Apr 26 2016 15:07
make your arrays larger
it will crash faster
:)
Cryddit
@Cryddit
Apr 26 2016 17:46
I have a question about the class library. What does LoggerFactory do and why is it needed?
Paul Dubs
@treo
Apr 26 2016 17:49
for some reason Java has a metric ton of logging libraries, the LoggerFactory produces one that you can use depending on the logging library you actually use
The <code>LoggerFactory</code> is a utility class producing Loggers for various logging APIs, most notably for log4j, logback and JDK 1.4 logging. Other implementations such as {@link org.slf4j.impl.NOPLogger NOPLogger} and {@link org.slf4j.impl.SimpleLogger SimpleLogger} are also supported.
Cryddit
@Cryddit
Apr 26 2016 17:51
Yesterday I submitted a pull request for a new MNIST example but I used LoggerFactory wrongly.
I needed to pass it the new class that the example created, instead of the class of the MNIST data itself?
How does it even know about the new class that the example created?
Paul Dubs
@treo
Apr 26 2016 17:53
You pass it the class where it is used, so when you log something, you know where it comes from or can tell a class to shut up
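The same pattern in Python's stdlib logging, for comparison: the name you pass only tags the messages and lets you filter that one source; the logger knows nothing else about the class it is named after.

```python
import logging

# The name passed to getLogger only labels the output and lets you target
# this one source with level filters -- the logger knows nothing else about
# the class/module it is named after.
log = logging.getLogger("examples.MLPMnistTwoLayerExample")
logging.basicConfig(level=logging.INFO, format="%(name)s %(levelname)s %(message)s")

log.info("training started")  # emitted, tagged with the logger's name

# telling that one source to shut up, via its parent logger:
logging.getLogger("examples").setLevel(logging.WARNING)
log.info("now suppressed")    # filtered out: effective level is WARNING
```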
Cryddit
@Cryddit
Apr 26 2016 17:53
Ah. So it doesn't know anything about the class, it just needs to know where to pass messages.
I understand now. Sorry I messed up yesterday.
Cryddit
@Cryddit
Apr 26 2016 18:18
FWIW, I find the interpretation of momentum in this library surprising. It apparently multiplies the maximum update/learning rate by the reciprocal of (1-momentum) instead of stating what portion of the maximum update is determined by earlier feedback. Whatever, I can deal with it. Just need to adjust learning rate down accordingly whenever adjusting momentum up.
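That reading checks out for classical momentum: with v updated as v = mu*v + lr*g and a constant gradient g, the accumulated update converges to lr*g/(1-mu), i.e. 1/(1-momentum) times the plain step. A quick numerical check:

```python
# Classical momentum accumulation: with a constant gradient, the geometric
# series mu^0 + mu^1 + ... makes the steady-state update lr*g/(1-mu).
def steady_update(lr, mu, g, steps=1000):
    v = 0.0
    for _ in range(steps):
        v = mu * v + lr * g
    return v

lr, g = 0.1, 1.0
print(round(steady_update(lr, 0.90, g), 2))  # ~1.0  == lr*g/(1-0.90)
print(round(steady_update(lr, 0.99, g), 2))  # ~10.0 == lr*g/(1-0.99)
```

So raising momentum from 0.9 to 0.99 multiplies the effective step by ten unless the learning rate is scaled down to match, which is the adjustment described above.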
Justin Long
@crockpotveggies
Apr 26 2016 18:50
okay guys a follow up on my previous issues. I've tried recompiling to get both OpenMP support and libstdc++ so I can finally get libnd4j compiled...
cmake ../llvm
dyld: Symbol not found: ___cxa_bad_typeid
  Referenced from: /usr/lib/libicucore.A.dylib
  Expected in: /usr/local/Cellar/clang-omp/2015-04-01/libexec/lib/libc++.1.dylib
 in /usr/lib/libicucore.A.dylib
Trace/BPT trap: 5
looks like I've got some issues locally. Anyone who's a C++ expert recognize this?
Paul Dubs
@treo
Apr 26 2016 18:59
you still have those exports from @eraly's gist, right?
Justin Long
@crockpotveggies
Apr 26 2016 19:01
@treo correct
I've got them saved in my .bash_profile
export PATH=/usr/local/bin:$PATH
export OPENMP_HOME=/usr/local/Cellar/libiomp/20150701
export CLANGOMP_HOME=/usr/local/Cellar/clang-omp/2015-04-01
export PATH=$CLANGOMP_HOME/bin:$PATH
export C_INCLUDE_PATH=$CLANGOMP_HOME/libexec/include/clang-c:$OPENMP_HOME/include/libiomp:$C_INCLUDE_PATH
export CXX_INCLUDE_PATH=$CLANGOMP_HOME/libexec/include/clang-c:$OPENMP_HOME/include/libiomp:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=$CLANGOMP_HOME/libexec/include/c++/v1:$OPENMP_HOME/include/libiomp:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=$CLANGOMP_HOME/libexec/lib:$OPENMP_HOME/include/libiomp:$LIBRARY_PATH
export LD_LIBRARY_PATH=$CLANGOMP_HOME/libexec/lib:$OPENMP_HOME/include/libiomp:$LD_LIBRARY_PATH
#export DYLD_LIBRARY_PATH=$OPENMP_HOME/include:$LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CLANGOMP_HOME/libexec/lib:$OPENMP_HOME/include/libiomp:$LD_LIBRARY_PATH
the only change I had to make was that it appears libiomp had been upgraded to 20150701
Justin Long
@crockpotveggies
Apr 26 2016 19:11
@sadatanwer any chance you fixed the building issues at compile time with libnd4j?
Paul Dubs
@treo
Apr 26 2016 19:14
@crockpotveggies lets try it without the exports once again
Justin Long
@crockpotveggies
Apr 26 2016 19:14
okay let me remove it and create a clean shell
one sec
oh my god get out of here
it works without the exports
GET OUTTA HERE
[100%] Linking CXX shared library libnd4j.dylib
cd /Users/justin/Projects/libnd4j/blasbuild/cpu/blas && /usr/local/Cellar/cmake/3.4.1/bin/cmake -E cmake_link_script CMakeFiles/nd4j.dir/link.txt --verbose=1
clang-omp++   -march=native -Wall -g -Wall -fopenmp -std=c++11 -fassociative-math -funsafe-math-optimizations -dynamiclib -Wl,-headerpad_max_install_names  -o libnd4j.dylib -install_name @rpath/libnd4j.dylib CMakeFiles/nd4j.dir/cpu/NativeBlas.cpp.o CMakeFiles/nd4j.dir/cpu/NativeOps.cpp.o -framework Accelerate -framework Accelerate
[100%] Built target nd4j
/usr/local/Cellar/cmake/3.4.1/bin/cmake -E cmake_progress_start /Users/justin/Projects/libnd4j/blasbuild/cpu/CMakeFiles 0
okay looks like I'm back on track...so here's a recap and I suspect it's going to impact lots of OS X users
1) I ran xcode-select --install which solved the initial missing omp.h includes
2) I removed any exports in current shell and .bash_profile that referenced paths to openmp etc.
Justin Long
@crockpotveggies
Apr 26 2016 19:19
3) deleted and re-cloned a fresh repo of libnd4j, and started up a fresh Terminal
nice thinking @treo
Paul Dubs
@treo
Apr 26 2016 19:20
so, all that probably needs to be run is xcode-select --install and a fresh shell
@crockpotveggies the exports have been a problem for several people with your symptoms, and on my test osx I have never needed them
Justin Long
@crockpotveggies
Apr 26 2016 19:21
yea, even though some users might have Xcode installed it would appear that the clang we want is not the default in the path until xcode-select --install
Sadat Anwar
@SadatAnwar
Apr 26 2016 19:22
@crockpotveggies no you should not use that gist
Paul Dubs
@treo
Apr 26 2016 19:22
I've commented on it, that it shouldn't be used
Justin Long
@crockpotveggies
Apr 26 2016 19:22
going to celebrate with a muffin. thanks @treo @agibsonccc @saudet @eraly for helping out on this
Paul Dubs
@treo
Apr 26 2016 19:23
@crockpotveggies can you add something about it to https://github.com/deeplearning4j/libnd4j/blob/master/macOSx10%20(CPU%20only).md as a pull request?
Sadat Anwar
@SadatAnwar
Apr 26 2016 19:23
@crockpotveggies I also uploaded an OSX setup instruction, did you try that?
Justin Long
@crockpotveggies
Apr 26 2016 19:24
@treo @sadatanwer ah did not see that particular guide, but yes will modify it to include the xcode bit
done
Paul Dubs
@treo
Apr 26 2016 19:32
merged
Romeo Kienzler
@romeokienzler
Apr 26 2016 19:37
Hi, just wanna try out libnd4j and the docs say the following: "Depends on the distro - ask in the earlyadopters channel for specifics on distro"
Where can I find instructions for Ubuntu 14.04?
Paul Dubs
@treo
Apr 26 2016 19:38
use the search in the channel, that question has been answered already several times... We should really update that
Romeo Kienzler
@romeokienzler
Apr 26 2016 19:38
ok, thanks a lot, if I find it I'll do it :)
Set a LIBND4J_HOME as an environment variable. This is required for building nd4j as well. => Where shall it point to?
Paul Dubs
@treo
Apr 26 2016 19:44
to the libnd4j folder
Romeo Kienzler
@romeokienzler
Apr 26 2016 19:51
ah thanks!
./buildnativeoperations.sh blas cpu Debug
eval cmake
Running blas
CPU BUILD DEBUG
RUNNING COMMAND cmake
./buildnativeoperations.sh: line 124: cmake: command not found
make: *** No targets specified and no makefile found. Stop.
(On Ubuntu 14.04)
raver119
@raver119
Apr 26 2016 19:55
check cmake & gcc versions
but in your case - you dont have cmake
Romeo Kienzler
@romeokienzler
Apr 26 2016 19:56
ok, thanks now have the wrong version, will fix...
Justin Long
@crockpotveggies
Apr 26 2016 19:58
I assume everything is fine at this stage, and enforcer didn't backtrack anything?
!!! You have to compile libnd4j with cuda support first!
Some required files are missing:
/Users/justin/Projects/libnd4j/blasbuild/cuda/blas
will ND4J complain if it can't find cuda?
I'm using CPU
Paul Dubs
@treo
Apr 26 2016 19:59
the simple mvn command tries to install everything, even cuda
if you want so skip cuda, add -pl '!:nd4j-cuda-7.5' to the command
Justin Long
@crockpotveggies
Apr 26 2016 19:59
ah perfect thanks!
interesting, it looks like enforcer still tries to test it even when I add the -DskipTests flag
[WARNING] The POM for org.nd4j:nd4j-cuda-7.5:jar:0.4-rc3.9-SNAPSHOT is missing, no dependency information available
Paul Dubs
@treo
Apr 26 2016 20:03
you are probably not on the current master for nd4j
Justin Long
@crockpotveggies
Apr 26 2016 20:03
just cloned it yesterday but let me give it a pull
yea says I'm up to date
mvn install -DskipTests -pl '!:nd4j-cuda-7.5' is what I'm running
Paul Dubs
@treo
Apr 26 2016 20:04
anyway, you can also skip nd4j-tests: -pl '!:nd4j-cuda-7.5,!:nd4j-tests'
Justin Long
@crockpotveggies
Apr 26 2016 20:40
have there been significant changes on Canova? specifically canova-nd4j-codec and canova-nd4j-image?
since 1-2 months ago
ah looks like so (checking the logs)
oh and is Scala 2.10 going to remain the default for mixed-in Akka libraries? having conflicts while importing DL4J dependencies
Paul Dubs
@treo
Apr 26 2016 20:48
akka is going to be dropped as far as I know
Romeo Kienzler
@romeokienzler
Apr 26 2016 20:53
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
OpenBLAS_LIB (ADVANCED)
linked by target "nd4j" in directory /home/romeokienzler/dl4j/libnd4j/blas
So I installed cmake > 3.2 & libblas-dev, any ideas?
Adam Gibson
@agibsonccc
Apr 26 2016 20:54
@romeokienzler just clone openblas and run make && sudo make install.. it might be easier
Either that or try libopenblas
libblas is insanely slow for anything
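A sketch of the from-source route Adam describes (the repo URL and default install prefix are my assumptions; a local build also compiles CPU-specific kernels, which is why it beats the generic repo packages):

```shell
# Sketch: build OpenBLAS from source so the CMake OpenBLAS_LIB check
# can find a real (and fast) BLAS. Wrapped in a function so nothing
# runs until you call it; assumes git, make, and a C/Fortran toolchain.
build_openblas() {
  git clone https://github.com/xianyi/OpenBLAS.git &&
  cd OpenBLAS &&
  make -j"$(nproc)" &&   # auto-detects the CPU and builds tuned kernels
  sudo make install      # default prefix is /opt/OpenBLAS
}
# Usage: build_openblas
# You may then need to point cmake at the library explicitly, e.g.
# -DOpenBLAS_LIB=/opt/OpenBLAS/lib/libopenblas.so (path is an example).
```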
Romeo Kienzler
@romeokienzler
Apr 26 2016 20:55
@agibsonccc thanks!!
Paul Dubs
@treo
Apr 26 2016 20:56
also, building it yourself will provide you with a faster version than what you can get from the repos (as it will use all your cpu's features)
Adam Gibson
@agibsonccc
Apr 26 2016 20:57
right
Justin Long
@crockpotveggies
Apr 26 2016 21:00
@treo yea I was having a hard time seeing the value of Akka in the project, unless it became a part of DL4S, which is pretty much abandoned, correct?
@treo having Spark integration makes Akka moot (was it being used for streaming? was that why Canova took over?)
Paul Dubs
@treo
Apr 26 2016 21:01
akka is/was used in the nlp part, but for something that actually doesn't need it
Justin Long
@crockpotveggies
Apr 26 2016 21:02
ah yea drop it while it's cold :snoop:
Paul Dubs
@treo
Apr 26 2016 21:02
raver119
@raver119
Apr 26 2016 21:03
yea, the last 2 things that were actually using akka were rewritten a month+ ago
Justin Long
@crockpotveggies
Apr 26 2016 21:03
was it in ND4J repo and not DL4J?
raver119
@raver119
Apr 26 2016 21:03
no, dl4j
dl4j-nlp
Justin Long
@crockpotveggies
Apr 26 2016 21:04
so basically rm -rf deeplearning4j-scaleout-akka?
I'll do it right now :P
Romeo Kienzler
@romeokienzler
Apr 26 2016 21:15
ok got it compiled under Ubuntu 16.04, updated the documentation accordingly: deeplearning4j/libnd4j#147
raver119
@raver119
Apr 26 2016 21:15
@crockpotveggies that's hadoop/spark/yarn, not dl4j-nlp
Adam Gibson
@agibsonccc
Apr 26 2016 21:16
@romeokienzler thank you sir merged
Romeo Kienzler
@romeokienzler
Apr 26 2016 21:16
@agibsonccc tnx 4 accepting :)
Justin Long
@crockpotveggies
Apr 26 2016 21:18
@raver119 ah I see my bad, there's no Job class in scaleout-akka
@raver119 okay let me see if I can remove it and submit a PR
@raver119 would be nice to get some of my Scala compatibility back in 2.11
raver119
@raver119
Apr 26 2016 21:20
for deeplearning4j-nlp we have planned akka removal; that's why stuff was rewritten there - to get rid of it
Adam Gibson
@agibsonccc
Apr 26 2016 21:21
right
Justin Long
@crockpotveggies
Apr 26 2016 21:30
@raver119 I'm seeing a lot of akka in deeplearning4j-aws so I assume it will be left alone?
raver119
@raver119
Apr 26 2016 21:31
@crockpotveggies I was talking ONLY about deeplearning4j-nlp, sorry
Adam Gibson
@agibsonccc
Apr 26 2016 21:31
yeah
raver119
@raver119
Apr 26 2016 21:31
aws is probably different story
Romeo Kienzler
@romeokienzler
Apr 26 2016 21:31
./buildnativeoperations.sh blas cuda
eval cmake
Running blas
CUDA BUILD RELEASE
CMake Error: The source directory "/home/romeokienzler/dl4j/libnd4j/blasbuild/cuda" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
make: *** No targets specified and no makefile found.  Stop.
I assume I have to install some CUDA packages?
Adam Gibson
@agibsonccc
Apr 26 2016 21:32
yeah
Romeo Kienzler
@romeokienzler
Apr 26 2016 21:32
ok tnx
raver119
@raver119
Apr 26 2016 21:32
./buildnativeoperations.sh blas cuda debug
Adam Gibson
@agibsonccc
Apr 26 2016 21:32
or: release
You DO need to install cuda though
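A quick way to confirm the CUDA toolkit is actually visible before retrying the cuda build (nvcc is the toolkit's compiler driver; install paths and package names vary by distro, so this only checks the PATH):

```shell
# Sketch: verify the CUDA toolkit is installed and on the PATH before
# running ./buildnativeoperations.sh blas cuda release
cuda_toolkit_present() {
  if command -v nvcc >/dev/null 2>&1; then
    nvcc --version   # prints the installed toolkit version
  else
    echo "nvcc not found - install the CUDA toolkit first" >&2
    return 1
  fi
}
# Usage: cuda_toolkit_present && ./buildnativeoperations.sh blas cuda release
```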
Romeo Kienzler
@romeokienzler
Apr 26 2016 21:33
@raver119 @agibsonccc ok tnx I'll do so and check...
Adam Gibson
@agibsonccc
Apr 26 2016 21:33
Yeah I don't think I updated that script to default to release yet..
Justin Long
@crockpotveggies
Apr 26 2016 21:34
@raver119 okay just being thorough here. VocabActor doesn't have a deprecated flag but I assume it's also going to go?
in deeplearning4j-nlp ;)
Adam Gibson
@agibsonccc
Apr 26 2016 21:35
@crockpotveggies PRs :D
I don't want raver doing anything but cuda right now :P
Justin Long
@crockpotveggies
Apr 26 2016 21:35
@agibsonccc @raver119 yes sir!
Adam Gibson
@agibsonccc
Apr 26 2016 21:36
Thanks!
There's a lot of low hanging fruit like that
Justin Long
@crockpotveggies
Apr 26 2016 22:21
@agibsonccc left some questions in there that you can address later: deeplearning4j/deeplearning4j#1459
Paul Dubs
@treo
Apr 26 2016 22:22
@crockpotveggies replace # with * in your issue
Justin Long
@crockpotveggies
Apr 26 2016 22:22
ha just saw that myself
done
Paul Dubs
@treo
Apr 26 2016 22:22
:+1:
Justin Long
@crockpotveggies
Apr 26 2016 23:00
here's an interesting question: even though I'm developing in Scala, do I still need to call JavaRDD instead of RDD when parallelizing a dataset?
Adam Gibson
@agibsonccc
Apr 26 2016 23:00
interchangeable
just call toRdd or toJavaRdd
if something requires either
Justin Long
@crockpotveggies
Apr 26 2016 23:01
gotcha thanks!
Justin Long
@crockpotveggies
Apr 26 2016 23:22
is LoggingEarlyStoppingListener an ideal way to "listen" in when deploying on Spark?
or is it just smarter to fire up the Histogram server? (I'm not sure which is better when training)
Justin Long
@crockpotveggies
Apr 26 2016 23:32
okay after looking deeper probably best to reframe my question: is it possible at all to boot up the Histogram listener server? Or is that implemented within EarlyStoppingListener?
Justin Long
@crockpotveggies
Apr 26 2016 23:39
ah figured it out, all good