Justin Payan
@justinpayan
@elbamos Wow, thank you!
Nilesh Kulkarni
@nileshkulkarni

Hey,
I am trying to run the example recurrentlanguagemodel.lua using the Penn Treebank dataset.
When I train it with the cuda flag set to false, it runs perfectly fine.
But when I try to run it with cuda, it gets a segmentation fault.

Following is my log:
http://pastebin.com/PH9R0QMF

Any debugging help would be great. How should I go about solving this?

Thanks,
Nilesh

arunpatala
@arunpatala
Hi, is there a way to do data augmentation (such as rotation, crop, etc.) with dpnn? Any example code would also be helpful. Thanks
elbamos
@elbamos
@arunpatala it's absolutely possible, I do it.
arunpatala
@arunpatala
Any pointers on how to approach that? @elbamos
elbamos
@elbamos
Be careful with allocating your buffers if you plan to multithread.
cjviper
@cjviper
@arunpatala I perform data augmentation explicitly in advance, using GraphicsMagick commands. I wrote simple shell scripts that perform the initial resizing along with the crops/rotations etc.
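(If you'd rather augment on the fly inside the training loop instead of pre-generating copies, here is a minimal sketch using Torch's stock image package; the augment helper and the crop/rotation parameters are illustrative assumptions, not dpnn or GraphicsMagick API.)

local image = require 'image'

-- Hypothetical helper: randomly flip, slightly rotate, and crop a CxHxW
-- image tensor down to outH x outW. Parameters are illustrative.
local function augment(img, outH, outW)
    if math.random() > 0.5 then
        img = image.hflip(img)                            -- random horizontal flip
    end
    img = image.rotate(img, (math.random() - 0.5) * 0.2)  -- small rotation, about +/-0.1 rad
    local h, w = img:size(2), img:size(3)
    local x = math.random(0, w - outW)                    -- random crop origin (0-based)
    local y = math.random(0, h - outH)
    return image.crop(img, x, y, x + outW, y + outH)
end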
Sanuj Sharma
@sanuj
@elbamos do you know how to change when the model is saved while training a neural net? Currently it happens when the validation error reaches a minimum.
elbamos
@elbamos
Yes, I do. And no, it doesn't.
Sanuj Sharma
@sanuj
@elbamos can you point me to the code that controls the saving of the model? What's the default criterion for saving the model, and how do I find out about it?
elbamos
@elbamos
@sanuj Read the documentation for the Observer class and for the ErrorMinima class and its superclasses and subclasses.
Sanuj Sharma
@sanuj
Thanks @elbamos. That was helpful.
Sanuj Sharma
@sanuj
Due to limited RAM I have multiple datasets for training. I want to change the dataset after each epoch, but the Lua garbage collector is not able to clear the old dataset to make way for the new one.
The code looks like:
function loadData(train_data, validate_data)
    -- these are intentionally globals; nil them so the GC can reclaim the
    -- previously loaded dataset
    ds = nil
    train = nil
    valid = nil
    train_target = nil
    valid_target = nil
    train_input = nil
    valid_input = nil
    n_valid = nil
    n_train = nil
    nuclei_train = nil
    nuclei_valid = nil
    nuclei_train = torch.load(train_data)
    nuclei_valid = torch.load(validate_data)
    nuclei_train.data = nuclei_train.data:double()
    nuclei_valid.data = nuclei_valid.data:double()
    n_valid = (#nuclei_valid.label)[1]
    n_train = (#nuclei_train.label)[1]

    train_input = dp.ImageView('bchw', nuclei_train.data:narrow(1, 1, n_train))
    train_target = dp.ClassView('b', nuclei_train.label:narrow(1, 1, n_train))
    valid_input = dp.ImageView('bchw', nuclei_valid.data:narrow(1, 1, n_valid))
    valid_target = dp.ClassView('b', nuclei_valid.label:narrow(1, 1, n_valid))

    train_target:setClasses({0, 1, 2})
    valid_target:setClasses({0, 1, 2})

    -- 3. wrap views into datasets

    train = dp.DataSet{inputs=train_input, targets=train_target, which_set='train'}
    valid = dp.DataSet{inputs=valid_input, targets=valid_target, which_set='valid'}

    -- 4. wrap datasets into datasource

    ds = dp.DataSource{train_set=train, valid_set=valid}
    ds:classes{0, 1, 2}
end
while true do
    train_data = '/home/sanuj/Projects/nuclei-net-data/fine-tune/1/train.t7'
    validate_data = '/home/sanuj/Projects/nuclei-net-data/fine-tune/1/validate.t7'
    loadData(train_data, validate_data)
    print 'Using data-set 1.'
    xp:run(ds)
    train_data = '/home/sanuj/Projects/nuclei-net-data/fine-tune/2/train.t7'
    validate_data = '/home/sanuj/Projects/nuclei-net-data/fine-tune/2/validate.t7'
    loadData(train_data, validate_data)
    print 'Using data-set 2.'
    xp:run(ds)
end
Sanuj Sharma
@sanuj
It is able to clear ds if I don't call xp:run(ds) between two loadData calls, but xp:run(ds) adds more references to ds, I guess, which stops the garbage collector from clearing it. I don't know how to fix it. @elbamos can you help me with this?
elbamos
@elbamos
Instead of changing datasets after each epoch, what you want to do is write a subclass of DataSet that produces the data you want after each epoch. Handling the memory considerations of moving training data in and out of RAM is a responsibility of the DataSet and DataSource objects.
Sanuj Sharma
@sanuj
@elbamos how will the subclass change the dataset after every epoch? Does it need to subscribe to some event to get notified?
elbamos
@elbamos
Are you asking me how it will know when one epoch ends and another begins?
It could subscribe if you wanted to do it that way, but there's an easier way: depending on the Sampler you use, the system decides which rows to ask for and in what order. So you tell the dataset to tell dp there are however many rows there actually are, and when that number of rows has been processed, it won't ask for any more and the epoch will be over. If different parts of your dataset have different numbers of rows, then the simplest thing is to ignore what is one epoch and what is another: you just produce batches in whatever order you want them processed, and when one source runs out it starts taking batches from someplace else. Then the length of an epoch is just however often you want to see feedback reports.
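(For what it's worth, a rough skeleton of that idea; dp.ChunkedDataSet, chunk_paths and nextChunk are made-up names, and the Sampler-facing methods still have to be filled in by studying how dp's own datasets are implemented, as suggested below.)

require 'dp'

-- Skeleton only: everything here is an assumption to be checked against how
-- dp's bundled datasets implement the Sampler interface.
local ChunkedDataSet, parent = torch.class("dp.ChunkedDataSet", "dp.DataSet")

function ChunkedDataSet:__init(config)
    parent.__init(self, config)
    self._chunk_paths = config.chunk_paths  -- list of .t7 chunk files on disk
    self._chunk_idx = 0
end

-- Drop the old chunk and force a collection *before* torch.load, so that two
-- chunks are never resident in RAM at the same time.
function ChunkedDataSet:nextChunk()
    self._chunk = nil
    collectgarbage(); collectgarbage()
    self._chunk_idx = (self._chunk_idx % #self._chunk_paths) + 1
    self._chunk = torch.load(self._chunk_paths[self._chunk_idx])
end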
Sanuj Sharma
@sanuj
I think I would have to understand the internals of dp.
elbamos
@elbamos
No, just study the way ImageDataSet etc. are implemented.
Sanuj Sharma
@sanuj
Thanks @elbamos :smile:, I'll try this tomorrow. I hope it works.
elbamos
@elbamos
@sanuj It definitely works. There are examples included with dp specifically to show how to handle a dataset that's too large to fit in memory.
Jacky Yang
@anguoyang

Hi all,
Could anyone kindly help me with this issue? Thanks:

We have lots of photos/images, say 10 million or more. They are original photos/images from our customers which need to be protected (to prevent plagiarism); we call this dataset A.
We have also gathered lots of images with a web crawler, from blogs, websites, forums, etc. Some of these images are simply copied from dataset A, and some carry an additional watermark; we call this dataset B. It currently contains about 300,000 images, but will grow day by day.
We will take one image or several images from dataset A, which we call dataset C, and we want to search for images in B that are similar to C and list all similar images.

We want to use deep learning for similarity search, but most of the images in dataset A have no tags. Could we still train a specific model on these images, so that we get more accurate results when searching for similar images?

Thanks a lot for your patience in reading this long requirement, and have a nice day!

elbamos
@elbamos
@anguoyang that's very similar to work I've done - you can PM me
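(A common baseline for this kind of search, independent of elbamos's approach: use a trained convnet as a fixed feature extractor, embed every image in B once, and rank by cosine similarity to the query embedding. The net and the usage lines below are assumptions.)

require 'nn'

-- Embed one image with a trained net and L2-normalise the result, so that
-- cosine similarity reduces to a dot product. 'net' is assumed to be a
-- trained convnet with its classification layer removed.
local function embed(net, img)
    local f = net:forward(img):clone():view(-1)
    return f:div(f:norm() + 1e-8)
end

-- Offline: embed every image in B once into an N x d matrix of unit-norm rows.
-- Online: embed the query from C and rank all of B by dot product, e.g.:
-- local q = embed(net, queryImage)
-- local scores = torch.mv(embeddings, q)     -- cosine similarity to all of B
-- local _, order = scores:sort(1, true)      -- most similar images first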
Sanuj Sharma
@sanuj
Hey @elbamos, I was trying to do transfer learning with dp. I want a different learning rate for each layer in my CNN. How can I do that? Here is the script that I'm using.
elbamos
@elbamos
@sanuj In dp, when you create your dp.Optimizer object, you define a function called callback. The callback function is executed after every batch on the training set, and it performs the actual parameter updates. Your script uses the simple callback from one of the dp examples. If you want a learning pattern other than simple SGD - like adding momentum, norms, cutoffs, etc. - you do it in dp by modifying the callback function.
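(A sketch of what a per-layer-rate callback could look like under that pattern; it assumes the model is an nn.Sequential, and the lrs table is made up. Verify the callback signature against the one already in your script.)

-- Hypothetical per-layer learning rates, one entry per parameterised layer.
local lrs = {1e-4, 1e-4, 1e-3, 1e-2}

local callback = function(model, report)
    local i = 0
    for _, layer in ipairs(model.modules) do       -- assumes nn.Sequential
        local params, gradParams = layer:parameters()
        if params then                             -- skip parameter-free layers
            i = i + 1
            local lr = lrs[i] or lrs[#lrs]
            for k, p in ipairs(params) do
                p:add(-lr, gradParams[k])          -- plain SGD step, per-layer rate
            end
        end
    end
    model:zeroGradParameters()
end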
@sanuj Can we assume that you were able to resolve the large-data issue you were having a few weeks ago?
Sanuj Sharma
@sanuj
@elbamos thanks for your reply. I had used ImageSource, which allows reading each batch from the hard drive, but it was slow as I don't have an SSD. I couldn't make it read data per epoch instead of per batch, but I don't need it anymore, so I didn't try further.
Jay Devassy
@jaydevassy
I have a trained convnet for object identification. I need to "run" it on a larger image to locate the target object in the larger image. How do I leverage the built-in convolution operation/module in Torch to do this? Not worried about scale or rotation invariance at this point. Basically I'm trying to avoid a sliding-window approach over the larger image, which would be inefficient (most of the computation would be thrown away at the next window position). Any ideas or pointers? Thx
elbamos
@elbamos
@jaydevassy train a conv net as a pixel classifier or build an attention model
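(One standard way to turn an already trained convnet into a dense pixel classifier, for reference and not necessarily what elbamos has in mind: convolutionalise the fully-connected head so that all window positions share the convolutional work. The layer sizes below are illustrative assumptions.)

require 'nn'

-- Suppose the trained net's conv trunk ends in 64 feature maps of size 5x5
-- that feed an nn.Linear(64*5*5, nClasses). The equivalent convolution:
local nClasses = 10
local fc = nn.Linear(64 * 5 * 5, nClasses)          -- trained head (illustrative)

local head = nn.SpatialConvolution(64, nClasses, 5, 5)
head.weight:copy(fc.weight)                         -- same weights, reshaped
head.bias:copy(fc.bias)

-- Replace the Reshape/View + Linear at the end of the trunk with 'head'.
-- Forwarding a larger image through the resulting net yields an
-- nClasses x H' x W' score map: one classifier output per window position,
-- at the stride of the trunk, with no recomputation between windows.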
Lior Uzan
@ghostcow
Hey guys, anyone know why this was merged into dp?
nicholas-leonard/dp#197
The commit msg says it's to circumvent the Lua 2GB address-space limit, but it was a Tensor before as well, and Tensor memory isn't stored in the Lua heap, so there shouldn't be a problem using >2GB tensors at all. So what's the idea here?
Jin Hwa Kim
@jnhwkim
Somewhat harsh question: how does torchnet compare with dp? A straightforward implementation, but young for wild cases.
Soumith Chintala
@soumith
@jnhwkim both are very similar
Nicholas Léonard
@nicholas-leonard
@ghostcow I just merged it because Float is indeed more efficient than Double. But you are right that it does not circumvent the 2GB limit. For that you should install torch with Lua instead of LuaJIT (see torch.ch getting started).
I would recommend taking a look at torchnet if you like dp's style. For myself, I now prefer to do without either, and just write my own training scripts (more flexible in the end).
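(For anyone wondering what "just write my own training scripts" amounts to in practice, a minimal hand-rolled loop with nn + optim; the model, sizes, and data tensors are illustrative assumptions, not a prescription.)

require 'nn'
require 'optim'

-- Illustrative model and data; swap in your own.
local model = nn.Sequential()
    :add(nn.Linear(100, 50)):add(nn.ReLU())
    :add(nn.Linear(50, 10)):add(nn.LogSoftMax())
local criterion = nn.ClassNLLCriterion()
local inputs = torch.randn(256, 100)
local targets = torch.Tensor(256):random(1, 10)

local params, gradParams = model:getParameters()
local state = {learningRate = 0.01}

for epoch = 1, 10 do
    for i = 1, inputs:size(1), 32 do
        local n = math.min(32, inputs:size(1) - i + 1)
        local input = inputs:narrow(1, i, n)
        local target = targets:narrow(1, i, n)
        local feval = function()
            gradParams:zero()
            local output = model:forward(input)
            local loss = criterion:forward(output, target)
            model:backward(input, criterion:backward(output, target))
            return loss, gradParams
        end
        optim.sgd(feval, params, state)   -- one SGD step per mini-batch
    end
end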
elbamos
@elbamos
@nicholas-leonard Maybe dp2 could wrap/extend torchnet? I continue to find considerable benefit to the design pattern framework implemented by dp. But I'm wondering if your interests have really just progressed at this point?
Nicholas Léonard
@nicholas-leonard
Yes in a way they have progressed. But you are right that torchnet could definitely benefit from some extensions
Remi
@Cadene

@nicholas-leonard

I would recommend taking a look at torchnet if you like dp's style. For myself, I now prefer to do without either, and just write my own training scripts (more flexible in the end).

Could you please be more explicit and include examples to illustrate your statement?
It must be easier to train non-standard models such as adversarial networks on a dedicated architecture, but it takes a lot of time to code/test.
Would it be possible to add a plugin to dp/torchnet for doing that?

biterbilen
@biterbilen
Any experience with anomaly detection (one-class classifiers)?
Abdullah Jamal
@abdullahjamal
Hi guys, can we use dp.GCN or dp.ZCA for any other dataset? In the examples, it looks like ZCA or GCN can only be used if the dataset comes from the dp library.
AnkurRaj
@AnkurRaj
autoencoder source code