Martin Durant
@martindurant
Oh, seems like slack is having issues in general https://status.slack.com/
Julia Signell
@jsignell
yep
Martin Durant
@martindurant
@dharhas @mmccarty - we’re supposed to meet to finalise the CFP! We could talk about it at tomorrow’s conference meet, but I think we should have a time specifically set aside.
Wow, everything went from orange to red at https://status.slack.com/ :)
Martin Durant
@martindurant
So hello gitter! Happy new year to dask, may it be far better than the last (not that Dask had a bad year of it)
Matthew Rocklin
@mrocklin
Woo!
GFleishman
@GFleishman
Hello Dask devs.
I recently tried installing dask via pip: pip install dask[complete]
This somehow fetched click: v8.0.0a1 (released on PyPI on Nov 25th last year), which seems to have removed a function used in dask.distributed: click._unicodefun._verify_python3_env
I'm not sure why setuptools would allow installing a pre-release alpha version, but it happened.
Reverting to click v7.1.2 worked fine.
Is this better as a github issue?
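For anyone checking whether a resolved version is a pre-release like click 8.0.0a1, here is a minimal sketch. The helper name `is_prerelease` is hypothetical, and the regex is a simplified stand-in for full PEP 440 parsing (it only recognises the a/b/rc and .dev forms), so treat it as an illustration rather than what pip itself does.

```python
import re

# Simplified pattern (an assumption, not full PEP 440): release digits
# followed by an a/b/rc tag with a number, or by a .devN segment.
_PRE_RELEASE = re.compile(
    r"^\d+(?:\.\d+)*(?:(?:a|b|rc)\d+|\.dev\d+)", re.IGNORECASE
)


def is_prerelease(version: str) -> bool:
    """Return True if the version string looks like an alpha/beta/rc/dev release."""
    return _PRE_RELEASE.match(version.strip()) is not None


# The two click versions mentioned in the messages above.
for candidate in ("8.0.0a1", "7.1.2"):
    print(candidate, is_prerelease(candidate))
```

A check like this could be run over the output of `pip freeze` to spot pre-releases that slipped into an environment.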
Martin Durant
@martindurant
I have heard many reports of pip doing interesting things. I’m not sure there’s much Dask can do about it.
GFleishman
@GFleishman
I thought this might boil down to pinning a click version (or range) in some part of the dask setuptools infrastructure; dask should also be aware that it seems this function will be removed from the click library - meaning either sticking to an old version of click or modifying this part of dask.distributed. Those are dask issues.
Martin Durant
@martindurant
If it’s marked for deprecation, then I totally agree, and an issue would be appropriate. We could add a version pin for click for now.
GFleishman
@GFleishman
:thumbsup:
I don't remember any deprecation warnings about this function previously and there doesn't seem to be anything about it in the source. I guess it's weird that click would remove it (or relocate it, not sure) in v8.0 w/o having given any warnings? I guess I can ask them about it.
Martin Durant
@martindurant
That’s a good idea. Maybe post the issue just with your original situation and link to it too - then we can find out if pip is doing this to others too.
Matthew Rocklin
@mrocklin
Hrm, I'm apparently double booked during the maintenance meeting this morning with something that is somewhat important. My apologies. I'll try to sneak out of my other meeting, but it's looking unlikely.
Martin Durant
@martindurant
Is anybody interested in getting involved in Google Summer of Code? NumFOCUS is a host organisation, so we could enter a project, if we can come up with a good idea of what that project (dask or dask-adjacent) might be. I don’t have such an idea yet! I would gladly try to brainstorm one though, and co-mentor. In my experience, you shouldn’t expect to get much work out of GSoC (on average, the coder takes as much effort to train as it would have taken to get the original thing done), but it’s a good way to widen our contributor pool, be forced to come up with ideas, and practice mentorship.
Julia Signell
@jsignell
I'm going to miss the maintenance meeting this morning. My report is same as usual though
kpasko
@kpasko
anyone have issues getting any of the docker images to read parquet? it seems they don't include the fastparquet or pyarrow packages by default, though it could of course just be my naivete in deployment
Matthew Rocklin
@mrocklin
Dask array team ^ ??
Benjamin Zaitlen
@quasiben
We're still planning on a release this Friday, correct ?
Benjamin Zaitlen
@quasiben
Came across a paper on folks trying to build a multi-backend execution engine for Python:
http://cidrdb.org/cidr2021/papers/cidr2021_paper08.pdf
Martin Durant
@martindurant
Yes saw it. Feels like Blaze… (joking, a little)
James Bourbeau
@jrbourbeau
I'm gonna have to miss the meeting today. Last week: released Dask and Distributed 2021.02.0, continued master->main changes with @jsignell, tried to be active on new issues / PRs. This week: similar things.
kirikov
@kirikov
Hello Dask community, my name is Kirill and I'm the CTO of http://datrics.ai/. We're developing a no-code data-science platform built on dask. We have problems with memory leaks and performance, and we need help or some consultation from folks who know dask deeply.
If you're interested, please DM me. Thank you!
Martin Durant
@martindurant
Some of the dask-involved companies such as Anaconda and Quansight might be interested in a consulting contract, if you were to get in touch with them directly. Obviously, it’s in the interests of Dask in general to solve memory issues, so if you can make your situation, or a good proxy for it, public, you might get more help.
Itamar Turner-Trauring
@itamarst
so I'm looking at this bug where there's a dataframe with a column with dtype object, and it stores datetime objects rather than strings
but the metadata heuristics just assume that dtype("O") means str
my first thought is that meta generation should be given some data from the underlying series, so it can guess based on real data
which might be intrusive, but might work
Julia Signell
@jsignell
Try setting the _meta by hand and see if it works
I think I did try that on the MRE and it didn't quite fix it
Itamar Turner-Trauring
@itamarst
(I have a smaller reproducer now BTW, the arrow stuff was a distraction, just how the end user found it)
Julia Signell
@jsignell
you can set the meta using ddf._meta = ddf.head()
Can we move this discussion to the issue actually?
Itamar Turner-Trauring
@itamarst
will do
Ali Kefia
@alikefia
Hello, I am new to dask, quick question : The nanny and worker classes are sharing some logic to check the parameters, is there any reason to not delegate this to ONE side ?
One side effect shows up with temporary-directory: the suffix dask-worker-space gets appended twice to the configured tmp folder.
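To illustrate the double suffix described above, here is a minimal sketch. The function names are hypothetical and this is not the actual Nanny or Worker code from distributed; it only shows how two layers that each append the same suffix produce a nested path, and how an idempotent check avoids it.

```python
import posixpath

SUFFIX = "dask-worker-space"  # suffix named in the message above


def append_suffix(base: str) -> str:
    """Naive version: always appends the suffix."""
    return posixpath.join(base, SUFFIX)


def append_suffix_once(base: str) -> str:
    """Idempotent version: appends only if the suffix is not already the last component."""
    if posixpath.basename(posixpath.normpath(base)) == SUFFIX:
        return base
    return posixpath.join(base, SUFFIX)


# Two layers (think: nanny, then worker) each applying the naive version.
nanny_dir = append_suffix("/tmp")
worker_dir = append_suffix(nanny_dir)
print(worker_dir)

# The idempotent version gives the same path however many layers apply it.
print(append_suffix_once(append_suffix_once("/tmp")))
```

Delegating the check to one side, as the question suggests, would remove the need for the idempotent guard altogether.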
jakirkham
@jakirkham
Hi Ali, could you please raise this as an issue on Distributed? Also please feel free to drop that link here so people can follow the conversation. Thanks! 😀
Ali Kefia
@alikefia
Sure @jakirkham !
Ali Kefia
@alikefia
I will work on a PR and ask for reviews :)
Sebastian Berg
@seberg
I think I had asked before, and pretty sure it is fine. But Dask does not expect numpy to forward "invalid" arguments to ufuncs, right? Something like np.add(dask_arr1, dask_arr2, dask_specific_argument="value")? I am cleaning out the code in NumPy and that includes checking argument names (not the actual values) up-front before dispatching to __array_ufunc__.
Matthew Rocklin
@mrocklin
Hrm, I'm not sure I know of a good use case for this currently. In general I would not expect Numpy to properly handle Dask specific keywords, so maybe this question is moot?
Sebastian Berg
@seberg
Yeah, should be moot and I was just being paranoid. In any case, if anyone notices a change just let me know.
Doug Friedman
@realdoug_twitter

Hello, apologies if i'm not asking in the right place here, but:

I noticed that running mypy on a file that uses dask results in: "Skipping analyzing 'dask.bag': found module but no type hints or library stubs".
Assuming it's done with consideration and very incrementally, is there openness to receiving PRs that add support for the typing module and related tooling to dask?

I am a relative newcomer to dask so I could be totally off base here but figured i'd ask! Thanks!

Martin Durant
@martindurant
I think that falls under “we haven’t got around to it”. Certainly, typing has not been a priority. I expect that implementing it would be quite an undertaking. We do not run mypy in CI. Probably no one is outright opposed, so long as it doesn’t complicate the code too much.
Doug Friedman
@realdoug_twitter
numpy supports it now so that might be a starting point for the portions that mimic the numpy api
Jason Wagner
@keegean1_gitlab
hello
I'm trying to use dask to load a csv file, and I've specified the dtypes and the column names. When I call n = df['item'] and then call n.compute(), I get an error saying that it's trying to convert 'item' from object to float. The item column is actually a float. I know all the data in my csv file are valid numbers.
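One way to double-check that every value in the column really parses as a float is a minimal sketch using only the standard library. The helper name `find_non_float_values` and the sample data are hypothetical, and the sketch assumes the file has a header row. If column names are passed explicitly while the file also contains a header line, that header text can end up being read as data, which is one possible cause of this kind of error.

```python
import csv
import io


def find_non_float_values(lines, column):
    """Return (line_number, value) pairs whose value in `column` does not parse as a float."""
    bad = []
    reader = csv.DictReader(lines)
    # The header is line 1, so the first data row is line 2
    # (assuming no quoted fields span multiple lines).
    for line_number, row in enumerate(reader, start=2):
        value = row[column]
        try:
            float(value)
        except (TypeError, ValueError):
            bad.append((line_number, value))
    return bad


# Hypothetical sample standing in for the real file.
sample = io.StringIO("item,qty\n1.5,2\nN/A,3\n2.25,4\n")
print(find_non_float_values(sample, "item"))
```

For a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline="")`.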