    JSong-Jia
    @JSong-Jia

    Hello everyone! Sorry to bother you. The NNI team would like to learn more about our users' usage and scenarios, and to collect feedback about NNI, so that we can assess user needs and provide effective solutions.

    If any member of our Gitter chat group would like to join an interview, please send a message in the Gitter chat. We really appreciate your support. Thank you so much!

    wei_xin1
    @Wastexin
    image.png
    When I try to run the example I get an error like this.
    How can I fix it?
    Thank you.
    wei_xin1
    @Wastexin
    I have fixed it. I fixed my PowerShell and it works.
    Scarlett Li
    @scarlett2018

    I have fixed it. I fixed my PowerShell and it works.

    😁👍

    Dassdy
    @Dassdy

    Hello! When creating a job on Kubeflow, I hit this error:

    "error":"TrainingService setClusterMetadata timeout. Please check your config file."

    My exp_config.yml file is like below.

    authorName: default
    experimentName: example_dist
    trialConcurrency: 1
    maxExecDuration: 1h
    maxTrialNum: 9
    trainingServicePlatform: kubeflow
    searchSpacePath: search_space.json
    nniManagerIp: 10.50.200.190
    useAnnotation: false
    tuner:
      builtinTunerName: TPE
      classArgs:
        optimize_mode: maximize
    trial:
      codeDir: .
      worker:
        replicas: 1
        command: python3 mnist.py
        gpuNum: 0
        cpuNum: 1
        memoryMB: 8192
        image: msranni/nni:latest
    kubeflowConfig:
      operator: tf-operator
      apiVersion: v1
      storage: nfs
      nfs:
        server: 10.50.200.200
        path: /home/admin/nfs

    And then start the experiment

    nnictl create --config exp_config.yml

    my nni Environment:

    nni version: latest
    OS: centos7
    python version: 3.6
    is conda or virtualenv used?: NO
    k8s version: 1.18.2
    kubeflow version 1.0.1
    my k8s cluster with 2 nodes (IP 10.50.200.190 as master and IP 10.50.200.200 as slave)
    nfs server also running on 10.50.200.200
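
    A timeout like this can have several causes, and the message above does not say which. One cheap thing to rule out is basic network reachability between the machine running nnictl and the hosts named in the config. The sketch below is a generic TCP check using only the Python standard library; it is not part of NNI. The ports mentioned afterwards are assumptions (2049 is the standard NFS port, 6443 is a common default for the Kubernetes API server), so adjust them to your cluster.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def report(checks: dict) -> dict:
    """Map each label in checks ({label: (host, port)}) to a reachability flag."""
    return {label: can_connect(host, port) for label, (host, port) in checks.items()}
```

    For the setup above one might call can_connect("10.50.200.200", 2049) for the NFS server and can_connect("10.50.200.190", 6443) for the Kubernetes API. A False result suggests a firewall or routing problem rather than a mistake in the experiment config.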

    godfather
    @curiousjit
    Hello,
    I want to use my own dataset while running the NAS tuners.
    Running CIFAR10 is straightforward since it uses a torch dataset.
    Can someone help me use my own dataset with the DARTS or ENAS tuner?
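
    A minimal sketch of a map-style dataset that could stand in for the CIFAR10 one, assuming the example's trainer accepts any object with __len__ and __getitem__ in place of the torchvision dataset. The class name ListDataset and its arguments are hypothetical, not part of NNI. The import falls back to a plain object so the sketch also runs without PyTorch installed.

```python
try:
    from torch.utils.data import Dataset  # the usual base class for custom data
except ImportError:
    Dataset = object  # fallback so the sketch runs without PyTorch


class ListDataset(Dataset):
    """Map-style dataset over (sample, label) pairs held in memory."""

    def __init__(self, samples, labels, transform=None):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = list(samples)
        self.labels = list(labels)
        self.transform = transform  # optional callable applied to each sample

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        x = self.samples[index]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[index]
```

    In practice __getitem__ would load an image or tensor from disk instead of reading from a list, and the transform would do the same preprocessing the CIFAR10 pipeline does.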
    joe807191330
    @joe807191330
    Hello everyone!
    Can someone add me to the WeChat group?
    Ragav Venkatesan
    @ragavvenkatesan

    Hi, I just remembered this Gitter exists; I had forgotten about it.
    Please refer to my Stack Overflow question here: https://stackoverflow.com/questions/62403788/multiple-host-multiple-gpu-trials

    Basically, I am wondering if I can have distributed trials.
    akchoi
    @akchoi
    Hi, I'm running NNI on AWS EC2, but the web UI doesn't work and all my trials end up as "failed". When I run NNI locally (MacBook Air) the trials run, but on AWS EC2 they fail. Does anyone know how to fix this problem?
    Scarlett Li
    @scarlett2018
    @joe807191330 - the WeChat QR code is on the home page.
    @akchoi can you share the error logs?
    akchoi
    @akchoi
    @scarlett2018 Turns out it is more of an AWS EC2 problem. I am using the free-trial version, so GPU is very limited.
    Pratima
    @pratzz
    Hi, is nnictl the only way to create an experiment or is there any other option like an API to start an experiment?
    chicm-ms
    @chicm-ms
    @pratzz There is a Python API, nnicli.start_nni, which is a wrapper around nnictl. You can check this notebook: https://github.com/microsoft/nni/blob/master/examples/notebooks/retrieve_nni_info_with_python.ipynb
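
    To illustrate what "a wrapper of nnictl" can look like, here is a minimal sketch that launches an experiment from Python by shelling out to the nnictl command line. It does not use nnicli; the function names build_create_command and start_experiment are hypothetical, and only the nnictl create command with its --config and --port options is assumed.

```python
import shutil
import subprocess


def build_create_command(config_path: str, port: int = 8080) -> list:
    """Return the argv list for `nnictl create` with the given config and port."""
    return ["nnictl", "create", "--config", config_path, "--port", str(port)]


def start_experiment(config_path: str, port: int = 8080) -> int:
    """Run `nnictl create` and return its exit code."""
    if shutil.which("nnictl") is None:
        raise RuntimeError("nnictl not found on PATH; is NNI installed?")
    completed = subprocess.run(build_create_command(config_path, port))
    return completed.returncode
```

    Calling start_experiment("config.yml") behaves like typing the command in a shell, so anything nnictl prints still goes to the console.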
    Pratima
    @pratzz
    @chicm-ms Thank you so much!
    Laughingz23
    @Laughingz23
    Can someone add me to the WeChat group? It has over 100 members. My WeChat is LaughingZ23.
    Ken
    @HongzoengNg
    May I ask a question? I am writing my own classic NAS search program with the built-in PPO tuner, but when I run "nnictl create --config config.yaml", I get this tuner error: "ERROR: builtinTunerName should be in range of ('TPE', 'Random', 'Anneal', 'Evolution', 'BatchTuner', 'GridSearch', 'NetworkMorphism', 'MetisTuner', 'GPTuner', 'PBTTuner')!"
    Hi, can anyone help? Thanks :)
    Scarlett Li
    @scarlett2018
    @Laughingz23 - Junwei will add you later.
    Scarlett Li
    @scarlett2018
    @HongzoengNg PPO needs to be installed first:
    nnictl package install --name=PPOTuner
    LovPe
    @LovPe
    Could you also add me to the WeChat group? The QR code is out of date. My WeChat is lovpeChen.
    marsggbo
    @marsggbo
    The WeChat group QR code has expired. Could you share it again?
    Junwei Sun
    @JunweiSUN
    @marsggbo Sorry about that, it has been updated~
    @LovPe I’ll add you and pull you into the group
    DanceInDark
    @DanceInDark
    When I use NNI with a DDP (DistributedDataParallel) distributed program, the DDP processes fail to start. What could be the cause?
    Has anyone run into a similar problem?
    Ce Gao
    @gaocegege
    What is the error message?
    Earl
    @Aierhaimian
    Please add me to the WeChat group.
    Junwei Sun
    @JunweiSUN
    @Aierhaimian The WeChat QR code on the home page has been updated; you can scan it again.
    Earl
    @Aierhaimian
    OK, got it.
    PeijieSun
    @PeiJieSun
    @JunweiSUN Hello, where can I find the WeChat QR code? Thanks.
    @JunweiSUN Sorry, I have already joined the relay group. Thanks.
    michele fraccaroli
    @micheleFraccaroli
    Good morning, how can I log all ENAS training with TensorBoard? Is it possible?
    Or with another option like Weights & Biases, if not TensorBoard?
    HeekangPark
    @HeekangPark
    Hi, I believe there was a branch implementing FairNAS. Is there any branch or repo implementing FairNAS?
    Oli2
    @Oli2
    Hello, are any of you aware if there is any documentation or set of guidelines to help run NAS experiments with my own data sets rather than CIFAR and ImageNet ?
    Scarlett Li
    @scarlett2018
    @Oli2 - the data processing part is handled by your own Python code, so it is very similar to how you work with CIFAR or ImageNet. Or are you asking how to upload and store data if the data is not on a public website like ImageNet?
    @HeekangPark - we don't have a FairNAS reference implementation yet. You are highly encouraged to contribute one, or to submit an issue requesting it.
    We are working on enabling TensorBoard. @ultmaster - any suggestions for @micheleFraccaroli?
    Oli2
    @Oli2
    @scarlett2018 Thank you Scarlett. I am rather asking about the case where I want to re-define the search space. E.g. in CDARTS, the operations available to build a cell are defined in ops.py. I also see that the class Model in model.py should be adjusted to my specific case but I am not clear where else I need to customize the code to make it work ?
    hamideh-h
    @hamideh-h

    Hi everybody, I have a problem running "nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml". It gives me this error:
    TypeError: expected str, bytes or os.PathLike object, not NoneType

    Can anyone help me?
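
    This TypeError is what Python raises whenever None is passed where a file path is expected, for example when a lookup such as an environment variable or an executable search returns None and the result goes straight into os.path.join. The sketch below only reproduces the message and shows a guard; it does not identify which path is None inside nnictl, and the helper names are hypothetical.

```python
import os


def reproduce_error():
    """Trigger the same TypeError by passing None where a path is expected."""
    try:
        os.path.join(None, "config_windows.yml")
    except TypeError as exc:
        return str(exc)
    return None


def join_under(base, name):
    """Join name under base, failing early with a readable message if base is unset."""
    if base is None:
        raise ValueError("base directory is not set (got None)")
    return os.path.join(base, name)
```

    Reading the full traceback to find which variable is None is usually the quickest way to narrow this down.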

    Savan Visalpara
    @savan77

    Hi,

    We are experimenting with NNI, specifically hyperparameter tuning, and running into one issue. I would appreciate any help.
    We are running hyperparameter tuning with the --foreground flag because we don't want it to shut down before all the trials have completed. However, with the --foreground flag set, it never terminates. The log says "Experiment done", but the process never exits.

    Any help will be appreciated.
    Thanks
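
    One possible workaround, sketched under assumptions: start the experiment without --foreground and let a small script block until the experiment reports a terminal status, then stop it. The REST path /api/v1/nni/check-status, the shape of its JSON response ({"status": ...}), and the status names below are assumptions about the NNI REST server and should be checked against your version; the helper names are hypothetical.

```python
import json
import subprocess
import time
import urllib.request

# Assumed names of statuses after which no more trials will run.
TERMINAL_STATUSES = {"DONE", "ERROR", "STOPPED"}


def is_finished(payload: dict) -> bool:
    """Return True if the status payload reports a terminal status."""
    return payload.get("status") in TERMINAL_STATUSES


def fetch_status(base_url: str = "http://localhost:8080") -> dict:
    """Fetch the experiment status from the (assumed) check-status endpoint."""
    url = f"{base_url}/api/v1/nni/check-status"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_until_finished(fetch=fetch_status, poll_seconds=30.0, sleep=time.sleep):
    """Call fetch() until it reports a terminal status; return that payload."""
    while True:
        payload = fetch()
        if is_finished(payload):
            return payload
        sleep(poll_seconds)


def wait_then_stop():
    """Block until the experiment finishes, then run `nnictl stop`."""
    final = wait_until_finished()
    subprocess.run(["nnictl", "stop"], check=False)
    return final
```

    Running wait_then_stop() in the foreground gives a process that exits once the experiment is done, which is the behaviour the --foreground flag was being used for.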

    LuckyGuySam
    @LuckGuySam
    Hello everybody! I have a problem.
    I use NNI to run the MNIST example with PyTorch, but the final result doesn't show on the WebUI. The nnimanager.log and dispatcher.log look normal, but trial.log and the metrics file, when opened in Notepad, contain a lot of NULNULNULNUL... I think this is why the WebUI can't get the information, like this:
    image.png
    It succeeds on CPU, but when it runs on GPU the metrics don't show on the web.
    My environment:
    pytorch version: 1.5.0
    nni version: v1.9
    nni mode(local|pai|remote): local
    OS: windows10
    python version: 3.7
    is conda or virtualenv used?: conda virtualenv
    is running in docker?: no
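
    To see how much of the log is actually NUL bytes, and to get a readable copy, a small standard-library script is enough. This is a generic inspection sketch, not an NNI tool, and it does not explain why the bytes were written. One general point worth checking: if roughly every other byte is NUL, the file may simply be UTF-16 encoded rather than corrupted.

```python
from pathlib import Path


def nul_report(path):
    """Return (total_bytes, nul_bytes) for the file at path."""
    data = Path(path).read_bytes()
    return len(data), data.count(b"\x00")


def strip_nuls(path, out_path):
    """Write a copy of the file with NUL bytes removed; return bytes written."""
    data = Path(path).read_bytes().replace(b"\x00", b"")
    Path(out_path).write_bytes(data)
    return len(data)
```

    Comparing nul_report on trial.log from a CPU run and a GPU run would show whether the NUL bytes appear only in the GPU case.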