    Sameroom
    @sameroom-bot
    [nishino, chainer] This was not accurate. It does not work for links, but does work for other callables.
    # Imports assumed for the snippet below:
    import chainer.functions as F
    import chainer.links as L
    from chainer import Sequential

    # error
    seq = Sequential(L.Linear(3))
    seq.insert(100, L.Linear(4))

    # works
    seq = Sequential(F.sin)
    seq.insert(100, F.cos)
    Sameroom
    @sameroom-bot

    [nishino, chainer] To me, this comment looks worth reconsidering.
    https://github.com/chainer/chainer/pull/2918#pullrequestreview-104821928

    For this point, we can just override namedlinks() and skip dummy links.
    https://github.com/chainer/chainer/pull/2918#issuecomment-375222516
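    A minimal sketch of what such an override could look like (illustrative only, not Chainer's actual implementation; the is_dummy marker is hypothetical):

    import chainer

    class MySequential(chainer.ChainList):
        def namedlinks(self, skipself=False):
            # Yield all named links, skipping placeholder "dummy" links
            # that merely wrap plain callables.
            for name, link in super(MySequential, self).namedlinks(skipself):
                if getattr(link, 'is_dummy', False):  # hypothetical marker
                    continue
                yield name, link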

    Sameroom
    @sameroom-bot
    [himkt, chainer] I'd like to know in detail, if possible, what the dev team discussed about chainer/chainer#6351. :eyes:
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] We are discussing to introduce a new type that combines a padded ndarray (sorted by lengths) and the lengths information. cuDNN changes its main RNN APIs toward this direction, so we want to follow that. We are currently considering NStepRNN and some basic attention mechanism as an example usage, where some reduction and elementwise operations should be supported for this new type (e.g. softmax, sum, multiply, etc.). There is no concrete design plan yet.
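    There was no concrete design plan at this point, but a purely illustrative sketch of such a padded-plus-lengths container (the class and method names are assumptions, not Chainer API) might look like:

    import numpy as np

    class PaddedSequence:
        """Padded batch of shape (batch, max_len), rows sorted by decreasing length."""

        def __init__(self, data, lengths):
            self.data = data                    # padded ndarray
            self.lengths = np.asarray(lengths)  # valid length of each row

        def mask(self):
            # True where an entry lies inside its row's valid length
            return np.arange(self.data.shape[1])[None, :] < self.lengths[:, None]

        def sum(self):
            # Masked reduction that ignores the padding
            return (self.data * self.mask()).sum()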
    Sameroom
    @sameroom-bot
    [himkt, chainer] Thanks for the comprehensive and detailed information! So does that indicate NStepRNN may become like PyTorch's RNN?
    (Personally, I really love the current interface of NStepRNN, which takes a list of variable-length sequences.)
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] We have not reached a conclusion on whether to overload the interface of the existing functions/links or to create new functions/links for the new interface.
    Sameroom
    @sameroom-bot

    [Masaki Kozuki, chainer] Question about how to use attr.cudnn in link tests.

    The current GPU/cuDNN-related tests of L.BatchNormalization (https://github.com/chainer/chainer/blob/master/tests/chainer_tests/links_tests/normalization_tests/test_batch_normalization.py#L119-L122) seem to contradict those of L.Convolution2D (https://github.com/chainer/chainer/blob/master/tests/chainer_tests/links_tests/connection_tests/test_convolution_2d.py#L166-L169).
    The former seem to check whether link.forward invokes the CuPy implementation when cuDNN is available, while the latter seem to check whether link.forward correctly invokes the cuDNN implementation when cuDNN is available.

    I know the upcoming LinkTestCase will ease writing link tests in a unified way, but *which is more appropriate?*
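    A hedged sketch contrasting the two patterns (illustrative only, not the actual test code; the mock patch target depends on the CuPy version and is an assumption here):

    import unittest

    import mock
    import numpy as np

    import chainer
    import chainer.links as L
    from chainer.backends import cuda
    from chainer.testing import attr

    class TestConvCudnnPatterns(unittest.TestCase):
        def setUp(self):
            self.link = L.Convolution2D(3, 2, 3)
            self.x = np.random.uniform(-1, 1, (1, 3, 8, 8)).astype(np.float32)

        @attr.cudnn
        def test_forward_runs_on_gpu(self):
            # Pattern A: only check that forward runs on GPU.
            self.link.to_gpu()
            with chainer.using_config('use_cudnn', 'always'):
                self.link(cuda.to_gpu(self.x))

        @attr.cudnn
        def test_forward_calls_cudnn(self):
            # Pattern B: additionally assert that the cuDNN path is taken.
            self.link.to_gpu()
            with chainer.using_config('use_cudnn', 'always'):
                with mock.patch('cupy.cudnn.convolution_forward') as func:
                    self.link(cuda.to_gpu(self.x))
                self.assertTrue(func.called)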

    Sameroom
    @sameroom-bot

    [Masaki Kozuki, chainer] Currently, F.concat, F.dstack, F.hstack, F.stack, and F.vstack can take a single ndarray as input, even though they assume xs to be a list of Variables or ndarrays, because they don't check isinstance(xs, list).
    Is this expected?

    e.g.

    In [12]: x = [np.random.randn(1, 2) for _ in range(3)]

    In [13]: F.dstack(x)
    Out[13]:
    variable([[[ 1.06418134, -1.1030954 , -1.77550052],
               [ 0.91533154,  1.22747268, -0.84523645]]])

    In [14]: F.dstack(np.asarray(x))
    Out[14]:
    variable([[[ 1.06418134, -1.1030954 , -1.77550052],
               [ 0.91533154,  1.22747268, -0.84523645]]])
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] Not sure about the intention of the current code, but maybe it's safe to just assume that xs is a "sequence" of Variables or ndarrays. In that case, an ndarray with ndim > 0 is valid as xs.
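    A minimal sketch of that interpretation (a hypothetical helper, not the actual Chainer code): accept any sequence, where an ndarray with ndim > 0 counts because iterating it yields its sub-arrays.

    import numpy

    def _as_list(xs):
        # Treat xs as a sequence of Variables/ndarrays; an ndarray with
        # ndim > 0 is a valid sequence, a 0-d array is not.
        if isinstance(xs, numpy.ndarray) and xs.ndim == 0:
            raise TypeError('a 0-d array is not a sequence')
        return list(xs)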
    Sameroom
    @sameroom-bot

    [Kshiteej K, chainer] Hi,
    Going through the codebase, I saw:

    kMaxNdim = 10
    I am curious about the reason for this choice.
    Thank you.

    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] @tos Regarding the link below, I wonder whether there's a consensus about eps values for each dtype. This applies to some methods like l2normalize, I think.
    https://github.com/chainer/chainer/pull/6655#issuecomment-480648016
    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] Untitled
    [Masaki Kozuki, chainer] Is this expected behavior?
    In [1]: import numpy as np, chainer, chainer.links as L
    
    In [2]: chainer.print_runtime_info()
    Platform: Linux-4.4.0-141-generic-x86_64-with-debian-stretch-sid
    Chainer: 6.0.0rc1
    NumPy: 1.15.4
    CuPy:
      CuPy Version          : 6.0.0rc1
      CUDA Root             : /usr/local/cuda-9.2
      CUDA Build Version    : 9020
      CUDA Driver Version   : 10000
      CUDA Runtime Version  : 9020
      cuDNN Build Version   : 7004
      cuDNN Version         : 7004
      NCCL Build Version    : None
      NCCL Runtime Version  : None
    iDeep: 2.0.0.post3
    
    In [3]: D = chainer.mixed16
    
    In [4]: initialW, initial_bias = np.random.uniform(-1, 1, (20, 10)).astype(D), np.random.uniform(-1, 1, (20,)).astype(D)
    
    In [5]: linear = L.Linear(10, 20, initialW=initialW, initial_bias=initial_bias)
    
    In [6]: linear.W.dtype, linear.b.dtype
    Out[6]: (dtype('float32'), dtype('float32'))
    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] I guess the cause is that chainer.initializers.Constant ignores dtype here: https://github.com/chainer/chainer/blob/master/chainer/initializers/init.py#L96-L104 and https://github.com/chainer/chainer/blob/master/chainer/initializers/constant.py#L57.
    Also, I'm wondering about the priority order of CHAINER_DTYPE and initializer.dtype.
    Sameroom
    @sameroom-bot
    [tos, chainer] It's unexpected to me.
    Sameroom
    @sameroom-bot
    [tos, chainer] There was an offline discussion on chainer/chainer#6116 on Feb 14, but dtype wasn't discussed.
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] Hmm
    [Seiya Tokui, chainer] I originally intended that the default dtype is used whenever there is no information to decide the dtype of a new array (thus the name "default dtype").
    [Seiya Tokui, chainer] So I personally think the above case should be float16 instead of float32 (because the given initial array carries dtype information). Not sure if there is a subtle case where this behavior would not be natural...
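    A minimal sketch of that intended rule (resolve_dtype is a hypothetical helper, not Chainer's actual code path):

    import numpy as np
    import chainer

    def resolve_dtype(initial_array=None, explicit_dtype=None):
        # Precedence: explicit dtype > dtype carried by an initial array >
        # the configured default dtype.
        if explicit_dtype is not None:
            return np.dtype(explicit_dtype)
        if initial_array is not None:
            return initial_array.dtype
        return chainer.get_dtype()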
    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] I will consider this more later, but one corner case that comes to mind is BatchNormalization (this always puzzles us).
    So, if the default dtype is float32 (untouched by the user) and the initializer arrays for gamma and beta are half, then it's almost impossible to correctly infer the dtype the user wants to use.
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] I expect that the user intentionally passed fp16 arrays in that case (thus it's ok to me that these parameters are initialized in fp16).
    Sameroom
    @sameroom-bot

    [Masaki Kozuki, chainer] Agree.

    Aside from that, I'm curious why L.BatchNormalization allows a dtype argument in its constructor.

    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] I think it's a historical argument.
    [Seiya Tokui, chainer] It was added before the following options (use_gamma/beta, initial_***, ...).
    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] As to parameter initializers with dtype, I have some thoughts.
    1. I prefer specifying the dtype for the whole network over doing it layer-wise. I mean, it would be much more intuitive and user-friendly if Link had methods such as to_half() and cast(mixed16).
    2. The current initializers support too many patterns. So, if the fill_value can be a parameter as-is, we should treat that case separately. One example is implementing a class method like L.Convolution2D.from_params(cls, W, b=None, stride=1, pad=0, groups=1, dilate=1), because if a fill_value can be a parameter as-is, in_size, out_size, and equivalent arguments become redundant (see the sketch below).
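    A hedged sketch of idea 2 (from_params was only a proposal at this point, so the name, signature, and shape handling are illustrative):

    import chainer.links as L

    def conv2d_from_params(W, b=None, stride=1, pad=0, groups=1, dilate=1):
        # Infer in/out channels and kernel size from W itself, so the
        # usual size arguments become redundant.
        out_channels = W.shape[0]
        in_channels = W.shape[1] * groups
        ksize = W.shape[2:]
        return L.Convolution2D(
            in_channels, out_channels, ksize,
            stride=stride, pad=pad, nobias=(b is None),
            initialW=W, initial_bias=b, groups=groups, dilate=dilate)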
    Sameroom
    @sameroom-bot

    [Seiya Tokui, chainer] 1. That sounds good for single-device models. I'm not sure what the desirable behavior is for multi-device models (model parallelism). One idea is to let the user write a mapping of devices (e.g., for moving a CPU/GPU hybrid model to multi-GPU, write a mapping like cpu->gpu1, gpu0->gpu0). I think such a case is currently rare, so just starting from cast(dtype) for single-device models is also ok.

    2. The simplified creation function sounds interesting. We can start from major links like Linear and Convolution*D.

    Could you make issues for them? I think both ideas are reasonable to implement.

    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] Sure!
    Sameroom
    @sameroom-bot

    [Masaki Kozuki, chainer] So, I tentatively filed an issue that describes the background of the changes I want to make.
    chainer/chainer#7040

    I’ll add detailed issues for my 2 ideas.

    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] I want the chainer core dev team to give contributors more information. There are some chainer-specific conventions regarding PRs/issues. For example, regarding the "pfnCI, test this please." command and its results/failures, what the command is and who can use it are unclear. Also, since some tests outside the scope of a PR are flaky, an :x: appears on GitHub and might confuse contributors about what to do next, for instance, whether they are responsible for fixing those tests. Another example is implementing a new feature that is already supported by cuDNN: do we have to prioritize supporting the cuDNN implementation in CuPy over implementing it with existing Chainer features, even though the latter works out of the box?
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] As for CI, I thought it was obvious from the sentence that this command is to kick CI. The CI runs on our in-house cluster, so for security only github.com/chainer members can kick it. I agree that this kind of information is better documented in the contribution guide, but the details change over time, so I'm hesitant to write them into the document.
    As for flaky tests, we are working hard on eliminating all of them, so I believe the situation will get better in the near future (sorry for the inconvenience for now).
    As for cuDNN... I don't think I have anything new to reveal to contributors about it (nothing secret there); let's just discuss such things in issues/PRs.
    Sameroom
    @sameroom-bot

    [Masaki Kozuki, chainer] 1. As to CIs: I got your point that the details change. Still, IMO GitHub's wiki is a more appropriate place for such frequently changing information (with a link to the wiki from the README), since updating a wiki does not require filing PRs.

    2. Flaky tests: I do know the team has been working very hard to improve the situation. But since different repositories have different rules/conventions for PRs and their reviews, it would be better to state explicitly in each PR that failures unrelated to the PR can be ignored.
    3. cuDNN: I'm sorry for being ambiguous here.
      What I write hereafter is about __the style of review__ and differs from what I initially said :bow:. I was implicitly referring to chainer/chainer#7879. I now understand the feature should be marked as experimental since CuPy doesn't support multi-head attention yet. But the original PR got no reaction for a while even though it had an assignee. I initially submitted the PR as WIP (a draft PR), so the assignee may not have noticed that I marked it ready for review. Since GitHub officially supports draft PRs, we should have some rules about them, e.g., whether to mention the assignee when marking a PR ready for review.
      Also, while I know the team has a ton to do and it is sometimes inevitable to suspend reviews and leave PRs unreviewed, in such situations I would like some word on the status, because I have no way to tell whether the reviewers have forgotten the PR or are just too busy.

    Thanks.

    Sameroom
    @sameroom-bot
    [nishino, chainer] +1 for writing such information in the GitHub wiki.
    [nishino, chainer] I think the reviewer guide (currently internal) can be moved to the Wiki, too
    Sameroom
    @sameroom-bot
    [himkt, chainer] chainer/chainer#7882
    Does anyone have an opinion on the naming of count variables?
    I think it is good to unify on n_{plural of noun}, since it seems to be the most frequently used at a glance.
    Sameroom
    @sameroom-bot
    [Masaki Kozuki, chainer] +1 for n_<plural>
    Sameroom
    @sameroom-bot
    [Seiya Tokui, chainer] I left some comments. +1 for n_{plural}, too!
    Sameroom
    @sameroom-bot
    [himkt, chainer] Thanks @crcrpar and @beam2d!
    We will work based on this policy. :thumbsup:
    Sameroom
    @sameroom-bot
    [Ishan Rai, chainer] I am getting these errors with the latest version of pytest. The older version seems to work, though.
    Sameroom
    @sameroom-bot
    [Ishan Rai, chainer] Error with pytest
    Sameroom
    @sameroom-bot
    [UmashankarTriforce, chainer] Can someone explain what backward or backward_gpu aim to do in Chainer functions?
    Sameroom
    @sameroom-bot
    [Do Anh Tuan, chainer] hi everyone
    [Do Anh Tuan, chainer] I am investigating YOLOv2 and Chainer.
    [Do Anh Tuan, chainer] How can I load a custom dataset for training with Chainer?
    Sameroom
    @sameroom-bot
    [UmashankarTriforce, chainer] Hey guys! I have a basic question
    How do I convert chainer.Variable to cupy.ndarray?
    Sameroom
    @sameroom-bot
    [Do Anh Tuan, chainer] I have 3 Nvidia Titan V GPUs. How can I use all 3 GPUs for training with Chainer?
    Sameroom
    @sameroom-bot
    [Ishan Rai, chainer] Is there a way to print a summary of a model in Chainer?
    Sameroom
    @sameroom-bot
    [Ishan Rai, chainer] Should we add a function like Keras's model.summary() that prints a summary representation of the model, or simply support print(model) like PyTorch, which gives some idea of the layers involved and their specifications?
    I understand that we can always use graphviz, but that involves a number of steps. Something like a quick summary would definitely make it easier to analyze the network.
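    A minimal sketch of what such a summary could print (summarize is a hypothetical helper, not an existing Chainer API), walking the model's namedlinks():

    def summarize(model):
        # Walk the link hierarchy and print each sub-link's path,
        # class name, and parameter count.
        for name, link in model.namedlinks(skipself=True):
            n_params = sum(p.size for p in link.params(include_uninit=False))
            print('{}: {} ({} params)'.format(
                name, link.__class__.__name__, n_params))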
    Sameroom
    @sameroom-bot
    [Cloud Han, chainer] ChainerX only supports cudaDeviceSynchronize at the moment. Is there any plan to add CUDA stream related APIs?