    Yonatan Koren
Just gonna use Google Cloud because per-hour billing is annoying
    Andre Bieler
Is it possible to resize the kernel size of a pooling layer at run time? I want to use this to go from variable-sized inputs to a fixed-size output for the fully connected layer in a CNN. E.g. the idea is to fix the number of pooling regions, say 10, and then have the kernel size (width, height/10), where width and height differ per input.
Resizing the kernel size of a pooling layer at run time is extremely hard for the devs, I think. The better practice would be to link multiple pooling layers to the conv layer.
    Andre Bieler
@skywalkerytx I do not quite follow your argument. Are you saying to connect N fixed-size pooling layers to a conv layer? How does that handle the variable output size of the conv layer?
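For reference, the kernel-size arithmetic the question describes can be sketched in plain Python. This only shows the sizing logic, not an actual run-time-resizable layer (standard mx.sym.Pooling fixes its kernel at graph-construction time); the function name is illustrative:

```python
import math

def pool_kernel(height, width, n_regions=10):
    """Kernel size that collapses a (height, width) feature map into
    roughly n_regions vertical strips spanning the full width."""
    kh = int(math.ceil(height / float(n_regions)))
    return (kh, width)

print(pool_kernel(100, 37))  # (10, 37): exactly 10 regions of height 10
print(pool_kernel(64, 80))   # (7, 80): ceil(64/7) regions, padding needed
```

This per-input arithmetic is essentially what spatial pyramid pooling does; with a fixed symbolic graph, one common workaround is to rebuild the graph per input shape, as MXNet's bucketing does for variable-length sequences.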
Has NDArray.argmaxChannel changed in the Scala package's 0.9.3a? Which method can I use to replace it now? I'm running MNIST from the tutorial.
I'm trying MXNet with Scala. I'm attempting to include shared objects (i.e. libcblas.so.3, ...) in the jar file, and I expect them to be read from the jar when the code runs. Is this possible?
Couldn't we get something like NDArray.show?
In Python I would use NDArray.asnumpy, but what should I do in Scala?
Or is there no such method?
Can we use NDArrayIter as the DataIter for FeedForward in Scala? I got an ERROR in fit() with NDArrayIter.
At the moment, I have made an XOR FeedForward network with Scala, and it runs.
MXNet for Scala does not have enough error checks yet. I made an effort to use NDArrayIter; it crashes in libmxnet-scala.so when given a wrong NDArray shape...
Hi everybody! Does anyone have any resources about transfer learning in MXNet for Julia? All I have found is about either the Python API, learning from scratch, or using a pretrained network without retraining.
    Daniel Haehn
Hi there! My X_train are "images" with 6 features per pixel. How can I use an mx.io.NDArrayIter with them?
    error is TypeError: Invalid type '<type 'numpy.ndarray'>' for data, should be NDArray or numpy.ndarray
    X_train.shape = (80000, 119, 119, 6)
    Daniel Haehn
    I was hoping that the NDArrayIter takes care of it
    Andre Bieler
    train_iter = mx.io.NDArrayIter(data=X_train, label=your_label_array, batch_size=your_batch_size, shuffle=True/False)
Be aware that the correct format for a convolution layer is data: (batch_size, channel, height, width); I think in your case you have the channels as the last index.
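The axis swap Andre describes would look something like this (a NumPy sketch; the array here is a small stand-in for the real X_train):

```python
import numpy as np

# Data arrives as (batch, height, width, channels), i.e. NHWC
X_train = np.zeros((8, 119, 119, 6), dtype=np.float32)

# Convolution layers expect (batch, channels, height, width), i.e. NCHW
X_nchw = np.transpose(X_train, (0, 3, 1, 2))
print(X_nchw.shape)  # (8, 6, 119, 119)
```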
    Daniel Haehn

    @abieler thank you! I swapped the axis but still no luck:

    train_iter = mx.io.NDArrayIter(data=X_train, label=Y_train, batch_size=batch_size, shuffle=True)


    TypeError: Invalid type '<type 'numpy.ndarray'>' for data, should be NDArray or numpy.ndarray
    print type(X_train)
    print type(Y_train)
    print X_train.shape
    print Y_train.shape
    <type 'numpy.ndarray'>
    <type 'numpy.ndarray'>
    (80000, 6, 119, 119)
    mx.__version__ = '0.10.0'
    Daniel Haehn

    Funny, if I restrict the size

    train_iter = mx.io.NDArrayIter(data=X_train[0:50000], label=Y_train, batch_size=batch_size, shuffle=True)

it works... I did some tests with the NDArray constructor to figure this out:

    a = mx.nd.array(X_train[0:50000]) # no problem
    a = mx.nd.array(X_train[0:60000]) # fails
    MXNetError: [16:37:29] include/mxnet/././tensor_blob.h:247: Check failed: this->shape_.Size() == shape.Size() (5097960000 vs. 802992704) TBlob.get_with_shape: new and old shape do not match total elements
    Stack trace returned 10 entries:
    [bt] (0) /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x18b0dc) [0x7f655a9510dc]
    [bt] (1) /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x27d680) [0x7f655aa43680]
    [bt] (2) /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x27db65) [0x7f655aa43b65]
    [bt] (3) /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/libmxnet.so(+0xc8047d) [0x7f655b44647d]
    [bt] (4) /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/libmxnet.so(+0xc5d29b) [0x7f655b42329b]
    [bt] (5) /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/libmxnet.so(MXNDArraySyncCopyFromCPU+0xa) [0x7f655b2f0a1a]
    [bt] (6) /lib64/libffi.so.6(ffi_call_unix64+0x4c) [0x7f66065e3dcc]
    [bt] (7) /lib64/libffi.so.6(ffi_call+0x1f5) [0x7f66065e36f5]
    [bt] (8) /home/dhaehn/D1/lib64/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x30b) [0x7f66067f6c8b]
    [bt] (9) /home/dhaehn/D1/lib64/python2.7/lib-dynload/_ctypes.so(+0xaa85) [0x7f66067f0a85]
    # but..
    a = mx.nd.array(X_train[50000:80000]) # no problem
I have 200 GB of free memory, so there shouldn't be any issue
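As an aside, the two numbers in the Check failed message above are consistent with the element count overflowing an unsigned 32-bit integer on the C++ side, which would also explain why 50000 rows work but 60000 fail. A quick check in plain Python:

```python
elements_60k = 60000 * 6 * 119 * 119   # 5097960000: the first number in the error
elements_50k = 50000 * 6 * 119 * 119   # 4248300000: still below 2**32

print(elements_60k % 2**32)  # 802992704: the second number in the error
print(elements_50k < 2**32)  # True, so the smaller slice survives the truncation
```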
    Andre Bieler
Well, it is a big matrix (> 25 GB for 32-bit floats); are you sure you have that much RAM available (not disk space)?
    Daniel Haehn
yeah, I've got 512 GB RAM
it's 16-bit data
and also, the error should be different then, no?
    print X_train.shape, Y_train.shape, X_train.nbytes
    print X_val.shape, Y_val.shape, X_val.nbytes
    print X_test.shape, Y_test.shape, X_test.nbytes
    (212700, 6, 119, 119) (212700,) 36144536400
    (70900, 6, 119, 119) (70900,) 12048178800
    (70900, 6, 119, 119) (70900,) 12048178800
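A quick consistency check on the "16-bit" claim: the nbytes values printed above match a 2-byte element size (float16) exactly:

```python
shape = (212700, 6, 119, 119)
n_elements = 1
for d in shape:
    n_elements *= d

bytes_per_element = 2  # float16
print(n_elements * bytes_per_element)  # 36144536400, matching X_train.nbytes above
```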
    t0 = time.time()
    batch_size = 100
    train_iter = mx.io.NDArrayIter(data=X_train, label=Y_train, batch_size=batch_size)
    val_iter = mx.io.NDArrayIter(data=X_val, label=Y_val, batch_size=batch_size)
    test_iter = mx.io.NDArrayIter(data=X_test, label=Y_test, batch_size=batch_size)
    print 'iterators configured', time.time()-t0, 'seconds'
    TypeError                                 Traceback (most recent call last)
    <ipython-input-6-713633301713> in <module>()
          4 t0 = time.time()
          5 batch_size = 100
    ----> 6 train_iter = mx.io.NDArrayIter(data=X_train, label=Y_train, batch_size=batch_size)
          7 val_iter = mx.io.NDArrayIter(data=X_val, label=Y_val, batch_size=batch_size)
          8 test_iter = mx.io.NDArrayIter(data=X_test, label=Y_test, batch_size=batch_size)
    /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/io.pyc in __init__(self, data, label, batch_size, shuffle, last_batch_handle, data_name, label_name)
        577         super(NDArrayIter, self).__init__(batch_size)
    --> 579         self.data = _init_data(data, allow_empty=False, default_name=data_name)
        580         self.label = _init_data(label, allow_empty=True, default_name=label_name)
    /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/io.pyc in _init_data(data, allow_empty, default_name)
        485             except:
        486                 raise TypeError(("Invalid type '%s' for %s, "  % (type(v), k)) + \
    --> 487                     "should be NDArray or numpy.ndarray")
        489     return list(data.items())
    TypeError: Invalid type '<type 'numpy.ndarray'>' for data, should be NDArray or numpy.ndarray
    $ free -h
                  total        used        free      shared  buff/cache   available
    Mem:           503G        118G        156G        695M        228G        384G
    Swap:          2.0G        1.1G        955M
    Daniel Haehn
@abieler what do you suggest to work around this?
If I pass only 50000 elements to the iterators at once, it works.
    Daniel Haehn
    this is the problem: apache/incubator-mxnet#6195
    Andre Bieler
I do not have an answer... For what it's worth, I wrote an HDF5 data iterator; you can find the code here: apache/incubator-mxnet#6872
Maybe you can write your data to an HDF5 file and use this iterator, which only loads data into memory batch by batch.
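The idea behind such an iterator — keep the full dataset on disk and only materialize one batch at a time — can be sketched independently of h5py. Here `data` can be any sliceable object (an h5py Dataset, a memory-mapped array, or, as in this toy demo, a plain list); names are illustrative, not the API from the issue:

```python
def batch_iter(data, labels, batch_size):
    """Yield (data_batch, label_batch) pairs one slice at a time, so only
    batch_size rows are ever resident in memory at once."""
    n = len(data)
    for start in range(0, n - batch_size + 1, batch_size):
        stop = start + batch_size
        yield data[start:stop], labels[start:stop]

# Toy demo with lists standing in for disk-backed datasets:
data = list(range(10))
labels = [x % 2 for x in data]
batches = list(batch_iter(data, labels, batch_size=4))
print(len(batches))  # 2 full batches; the incomplete tail is dropped
print(batches[0])    # ([0, 1, 2, 3], [0, 1, 0, 1])
```

Dropping the incomplete tail mirrors NDArrayIter's last_batch_handle='discard' behavior; a real DataIter subclass would wrap each slice in an mx.io.DataBatch.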
    Daniel Haehn
    ok great, I will try that. Thank you!
    Daniel Haehn
It seemed to work, but then this happens:
    TypeError                                 Traceback (most recent call last)
    <ipython-input-17-ae9607bb45a5> in <module>()
         10           force_init = True,
         11           begin_epoch=0,
    ---> 12           num_epoch=1000
         13           )
         14 print 'training complete', time.time()-t0, 'seconds'
    /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/module/base_module.pyc in fit(self, train_data, eval_data, eval_metric, epoch_end_callback, batch_end_callback, kvstore, optimizer, optimizer_params, eval_end_callback, eval_batch_end_callback, initializer, arg_params, aux_params, allow_missing, force_rebind, force_init, begin_epoch, num_epoch, validation_metric, monitor)
        493                     end_of_batch = True
    --> 495                 self.update_metric(eval_metric, data_batch.label)
        497                 if monitor is not None:
    /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/module/module.pyc in update_metric(self, eval_metric, labels)
        678             Typically ``data_batch.label``.
        679         """
    --> 680         self._exec_group.update_metric(eval_metric, labels)
        682     def _sync_params_from_devices(self):
    /home/dhaehn/D1/lib/python2.7/site-packages/mxnet/module/executor_group.pyc in update_metric(self, eval_metric, labels)
        546         for texec, islice in zip(self.execs, self.slices):
        547             labels_slice = []
    --> 548             for label, axis in zip(labels, self.label_layouts):
        549                 if axis == 0:
        550                     # slicing NDArray along axis 0 can avoid copying
    TypeError: zip argument #1 must support iteration
    train_iter = HDF5ArrayIter(data=X_train, label=Y_train, batch_size=batch_size)
    X_train.dtype, Y_train.dtype
    (dtype('float32'), dtype('bool'))
    @abieler any idea?
    Andre Bieler
I updated the getlabel() function in the code from the GitHub issue mentioned above. I didn't test with bool-type labels, only integers; maybe try ints as labels if the new version still fails.
    Andre Bieler
    Any luck @haehn ?
    Daniel Haehn
I didn't try any further and switched to Keras/TF for now. It just worked. Once I have more time, I will try MXNet again.
    Jordan Green
Hi all, has anyone got a process they used to successfully build on the NVIDIA TK1?
    Jordan Green
    I managed to get it working by using the mxnetOnACL fork
    Andre Bieler
FYI @haehn, I saw last week that MXNet now supports h5py datasets with NDArrayIter: https://mxnet.incubator.apache.org/api/python/io.html#mxnet.io.NDArrayIter

    On a different topic:

I tried to put together a linear regression toy example, which is easy enough. However, I would like to replace mx.sym.FullyConnected() by the dot product mx.sym.dot(X, w) for educational purposes. I am not sure how to get the optimizer to recognize that it should optimize w in this case. Full example below:

    m = 1000
    batch_size = 100
    nVars = 4
    data = np.random.normal(0,1, (m, nVars))
    labels = -10 * data[:,0] + data[:,1]*np.pi + 5 * np.sqrt(abs(data[:,2])) - data[:,3] + np.random.normal(0,1, m)*2
    train_iter = mx.io.NDArrayIter(data={'data':data}, label={'labels':labels}, batch_size=batch_size)
    X = mx.sym.Variable('data', shape=(batch_size, nVars))
    y = mx.sym.Variable('labels', shape=(batch_size))
    w = mx.sym.var(name='theta', shape=(nVars), init=mx.initializer.Normal())
    fc = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1)
    fc_dot = mx.sym.dot(X, w)
    yhat = mx.sym.LinearRegressionOutput(fc, label=y, name='yhat')
    yhat_dot = mx.sym.LinearRegressionOutput(fc_dot, label=y, name='yhat')
    model = mx.mod.Module(symbol=yhat, data_names=['data'], label_names=['labels'])
    model.fit(train_iter, num_epoch=10)
    pred = model.predict(train_iter).asnumpy().flatten()
    model_dot = mx.mod.Module(symbol=yhat_dot, data_names=['data'], label_names=['labels'])
    model_dot.fit(train_iter, num_epoch=10)
    pred_dot = model_dot.predict(train_iter).asnumpy().flatten()
    np.mean(pred - labels)
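As a sanity check independent of MXNet, the linear part of those coefficients can be recovered with an ordinary least-squares fit (NumPy sketch; the 5·sqrt(|x|) term is an even function of its input, so its fitted linear weight lands near zero):

```python
import numpy as np

np.random.seed(0)
m, nVars = 1000, 4
data = np.random.normal(0, 1, (m, nVars))
labels = (-10 * data[:, 0] + data[:, 1] * np.pi
          + 5 * np.sqrt(np.abs(data[:, 2])) - data[:, 3]
          + np.random.normal(0, 1, m) * 2)

# Least-squares solution of data @ coef ~= labels
coef, _, _, _ = np.linalg.lstsq(data, labels, rcond=None)
print(np.round(coef, 1))  # approximately [-10, pi, 0, -1]
```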
Is it appropriate to ask Gluon questions here?