    Israel Saeta Pérez
    @dukebody
    I read that many groupbys in dask are performant, but jseabold's notebook says they are inefficient
    ramdandi
    @ramdandi
    i see it after cluster is up
    Israel Saeta Pérez
    @dukebody
    groupby - This requires a full on-disk shuffle and is very inefficient
    (sorry, not attending the conference, but wanted to clarify this in my brain)
    sorry, I mixed up dataframes with bags! forget about it
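To illustrate the distinction raised above: a general groupby has to bring every record with the same key together, which is what forces a shuffle, whereas a per-key reduction can summarize each partition locally and then merge the small summaries. The sketch below shows that second idea in plain Python only. It is not dask code; the `partitions` data and the helper names `fold_partition` and `combine` are made up for illustration. dask bags offer a `foldby` method built on the same idea.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical data: two partitions of (key, value) records.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("a", 5)],
]

def fold_partition(records):
    """Sum values per key inside one partition; no data leaves the partition."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)

def combine(left, right):
    """Merge two per-partition summaries into one."""
    merged = dict(left)
    for key, value in right.items():
        merged[key] = merged.get(key, 0) + value
    return merged

# Only the small per-partition summaries are moved and merged.
partials = [fold_partition(p) for p in partitions]
totals = reduce(combine, partials, {})
print(totals)
```

Only one small dictionary per partition is exchanged here, rather than every record, which is why this pattern tends to be much cheaper than a full shuffle.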
    Aron Ahmadia
    @ahmadia
    @quasiben - I'm seeing a few people having difficulty getting a cluster up, long wait times after clicking the "start" button.
    Don't know if there's anything you can check/adjust on your end.
    Benjamin Zaitlen
    @quasiben
    It could be that the node they are hitting wasn’t populated with the image yet (though this is unlikely)
    Aron Ahmadia
    @ahmadia
    okay
    I've been telling folks to refresh for now, obviously if you've got a better troubleshooting idea please post here.
    Benjamin Zaitlen
    @quasiben
    Can you have them send me the cluster id (cluster-136017)...
    Aron Ahmadia
    @ahmadia
    the cluster id isn't coming up.
    Benjamin Zaitlen
    @quasiben
    yeah, re-launching is a good answer
    Aron Ahmadia
    @ahmadia
    this is before that step.
    okay
    Dan Coates
    @dan-coates
    I think the %%snakeviz code won't work on the cluster: jiffyclub/snakeviz#59
    Benjamin Zaitlen
    @quasiben
    @ahmadia on older browsers the http redirect might not work. In that case, spin up a cluster on their behalf and give them the URL
    I recall seeing something like that on old machines and on samsung tablets at scipy
    Aron Ahmadia
    @ahmadia
    ok
    thanks @dan-coates - I'll make sure Matt mentions that.
    he's running off the cluster so he'll encounter it himself :)
    Thomas Smith
    @tgs
    On this slide, it should probably be 'futures.add' not 'outputs.add' in the parallel version
    Souheil Inati
    @inati
    For submit, are these threads with shared memory?
    Thomas Smith
    @tgs
    Yes, the threads will have shared memory
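A minimal sketch of what "shared memory" means here, assuming the executor on the slide is thread-based. It uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a stand-in, and the names `shared`, `lock`, and `record` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

shared = []              # a single list object visible to every worker thread
lock = threading.Lock()  # guards concurrent appends

def record(x):
    """Append to the shared list and report which list object was used."""
    with lock:
        shared.append(x * x)
    return id(shared)

with ThreadPoolExecutor(max_workers=4) as pool:
    # Collect the futures first, then gather their results.
    futures = [pool.submit(record, i) for i in range(8)]
    seen_ids = {f.result() for f in futures}

# Every task wrote into the same list in the parent process.
print(sorted(shared))
print(seen_ids == {id(shared)})
```

This only holds for threads inside one process. Tasks sent to separate worker processes or machines work on copies of their inputs and do not see each other's in-memory changes.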
    Abhijit Dasgupta
    @webbedfeet
    no cv_params_demo module available?
    Souheil Inati
    @inati
    looks like cv_params_demo is not installed on the cluster.
    Chris White
    @cicdw
    i got the same thing; here's a link to the code that you can copy / paste in a cell at the top :P
    Souheil Inati
    @inati
    works, i just created a text file named cv_params_demo.py in the same folder as my notebooks.
    Matthew Rocklin
    @mrocklin
    Our fault. Use the 05_ notebook in notebooks/05_... instead of pydata-dc-2016/04_...
    Wave me over if you have trouble with this
    Aron Ahmadia
    @ahmadia
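    from dask.distributed import Executor, progress

    # Connect to the scheduler running on the tutorial cluster
    e = Executor('schedulers:9000')

    # Ship the missing module to the workers so they can import it
    e.upload_file('cv_params_demo.py')

    # Submit one task per (split, params) pair; evaluate_one, SVC, cv_splits
    # and param_samples are assumed to be defined earlier in the notebook
    futures = []
    for split in cv_splits:
        for params in param_samples:
            future = e.submit(evaluate_one, SVC, params, split)
            futures.append(future)

    # Show a progress bar while the tasks run
    progress(futures)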
    Dan Coates
    @dan-coates
    Can I just say how sweet progress bars are? I love progress bars. OK, I'm done.
    jc-healy
    @jc-healy
    Yep, those are awesome progress bars. They bring me joy.
    Benjamin Zaitlen
    @quasiben
    Is the tutorial finished? I can leave the cluster up for the day, but I'd like to scale it back a bit.
    Benjamin Zaitlen
    @quasiben
    @/all I’m scaling the cluster back — apologies if you’re in the middle of work
    Benjamin Zaitlen
    @quasiben
    out for a bit
    vav1288
    @vav1288
    Do we have a URL for the data pipeline tutorial?
    Taylor Terry
    @taylorterry3
    I left a usb key in the dask room this morning, anybody grab extras?
    Matthew Rocklin
    @mrocklin
    If you like the progress bars you might like the dask.distributed web ui: http://distributed.readthedocs.io/en/latest/web.html
    Minh Mai
    @minh5
    I found a Thinking Fast and Slow book in the 3rd floor room, in case you're missing it. Let me know if it's yours!
    adriannr
    @adriannr
    Hello, I am having trouble running prep.py. It is only generating empty folders. Does anybody have a hunch what the problem might be?
    Jasdeep Singh
    @jay-dee7
    I'm executing some tasks on my nodes using Parallel Python.
    Can we somehow find out which node is computing a given task,
    or have a node send a message back to the master saying that it is executing?
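One generic way to approach this question, independent of any particular library, is to have each task return the host name and process id alongside its result, so the master can see where the work ran. The sketch below is plain Python and does not use Parallel Python's API. The helper `run_and_report` and the example task `square` are hypothetical names.

```python
import os
import socket

def run_and_report(func, *args):
    """Run func(*args) and tag the result with where it was executed."""
    return {
        "host": socket.gethostname(),  # name of the machine running the task
        "pid": os.getpid(),            # process id on that machine
        "result": func(*args),
    }

def square(x):
    return x * x

# Called locally here; submitted through a job server, the same function
# would report the host and pid of whichever node executed it.
info = run_and_report(square, 7)
print(info["host"], info["pid"], info["result"])
```

Because the location travels back with the result, no separate messaging channel between node and master is needed for this basic case.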