Tom Augspurger
@jreback we had the call. Notes at https://docs.google.com/document/d/1k_E_1oSV9VNHgGzepdeCyFdju8ZaXDwI3CvDYkhOtQ8/edit?usp=sharing. Summary at the bottom is where I’d recommend focusing.
Jeff Reback
onboard with the summary - i’d say go ahead with a PR
William Ayd
@datapythonista cool, thanks for the info. I am also on board with maybe trying it, and if we don't like it we can always revert
Where in what you showed me did you see the info around a stable release?
Marc Garcia
GitHub actions itself is in beta, not sure when it'll be stable: https://github.com/features/actions
I am having trouble with a scatter plot. I have posted my problem on Stack Overflow.
I need some hints on my mistake.
I am not sure if this is my mistake or some kind of bug.
hi there.
i want to store a class or @dataclass as the values of one of my dataframe columns
i was just wondering if you can access member functions or attributes, particularly when doing df filtering
like df[df['MYCLASSCOL'].attribute == 'helloworld']
or df[df['MYCLASSCOL'].someBoolMethod(1,2,3)]
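A Series with object dtype doesn't forward attribute access to its elements, so the expressions above raise AttributeError; mapping over the elements works instead. A minimal sketch with a hypothetical dataclass and method:

```python
import pandas as pd
from dataclasses import dataclass

@dataclass
class MyClass:
    attribute: str

    def some_bool_method(self, a, b, c):
        # hypothetical predicate for illustration
        return a + b + c > 0

df = pd.DataFrame({'MYCLASSCOL': [MyClass('helloworld'), MyClass('other')]})

# map each stored object to its attribute, then build the boolean mask
mask = df['MYCLASSCOL'].map(lambda x: x.attribute) == 'helloworld'
filtered = df[mask]

# same idea for a method call
mask2 = df['MYCLASSCOL'].map(lambda x: x.some_bool_method(1, 2, 3))
```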
When parsing a dictionary containing a list of datetime objects, pandas converts them to numpy.datetime64. Is this normal? Reproducible example:
>>>  from pandas import DataFrame as df
>>>  import datetime
>>>  inp = {'date':[datetime.datetime.now() for val in range(3)]}
>>>  inp_tbl = df(inp)
>>>  inp_tbl
0  2019-09-20 15:00:35.080488
1  2019-09-20 15:00:35.080525
2  2019-09-20 15:00:35.080527
>>>  inp_tbl.dtypes
date    datetime64[ns]
dtype: object
>>>  inp_tbl['date'].values[0]
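Yes, this is expected: pandas stores datetime columns as `datetime64[ns]`, and `.values` exposes the raw NumPy array, so element access there yields `numpy.datetime64` scalars. Scalar access through the Series returns a pandas `Timestamp`, which converts back to a plain datetime; a small sketch:

```python
import datetime
import pandas as pd

inp_tbl = pd.DataFrame({'date': [datetime.datetime(2019, 9, 20, 15, 0, 35)]})

raw = inp_tbl['date'].values[0]   # numpy.datetime64, from the raw array
ts = inp_tbl['date'].iloc[0]      # pandas.Timestamp, via Series access
py_dt = ts.to_pydatetime()        # back to datetime.datetime
```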
Felipe Catão do Nascimento
Please, can someone follow me on GitHub? I wrote a project in Python, a new framework, and my profile needs one more follower so I can participate in a competition. To anyone who can help, thanks: https://github.com/FelipeKatao
Ian Horsman
Hey guys, I'm interested in contributing to pandas. I have a lot of experience in Python, I've used pandas on a few projects, and I'd be interested in spending some of my free time helping it grow, whether that's committing to first-timer issues or developing new features. If an active contributor would like to help me get started, or even provide some mentorship, I would appreciate the opportunity.
William Ayd
Hi @Ianphorsman - that’s great. If looking to contribute we have quite a few things labeled “good first issue” on GitHub. I’d suggest finding one that may interest you and working from there
Marc Garcia
@Ianphorsman I'd suggest having a look at this issue: pandas-dev/pandas#27977, and fixing a few of the problems mentioned there, to get familiar with the project in a simple way
feel free to DM after that, and I'll try to give you advice on what else you could work on, based on your interests
Tom Augspurger
This message was deleted
William Ayd
Does anyone know offhand what our requirements are to add a method to an accessor?
Specifically looking at the .dt accessor, I thought it might be driven by some of the delegate decorators and class variables like _datetimelike_methods
But I see stuff exposed on the accessor that doesn't appear in those variables (ex: to_pydatetime for a DTA)
Tom Augspurger
I think you’re correct about that. It’s on DatetimeProperties, TimedeltaProperties, etc.
That class also defines a to_pydatetime. But the majority come from _datetimelike_methods I think
William Ayd
OK cool
Yea just not sure what makes a method like to_pydatetime appear but not to_perioddelta
Will keep digging though; sounds like I’m in the ballpark
Thanks @TomAugspurger
Tom Augspurger
I don’t think to_perioddelta is in DatetimeArray._datetimelike_methods
William Ayd
Neither is to_pydatetime
So I think there’s just some more logic than looking at that variable
While uploading a CSV file to Google Drive, it is automatically converted to Google Sheets. How do I save it as a CSV file in Drive? Or can I read a Google Sheet into a pandas DataFrame in Google Colab?
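One common workaround, assuming the sheet is link-shared (the sheet id below is a placeholder): Google serves any sheet as plain CSV through its `export?format=csv` endpoint, which `pd.read_csv` can consume directly in Colab:

```python
import pandas as pd

sheet_id = 'YOUR_SHEET_ID'  # placeholder: copy it from the sheet's URL
url = f'https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv'
# df = pd.read_csv(url)  # uncomment once sheet_id points at a real, shared sheet
```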
Trying to replicate some Excel conditional formatting in pandas. I have used the .background_gradient method, but I would like to replicate the Excel three-colour formatting where you have two colour scales (one +ve and one -ve) and you define the centre value (e.g. 0).
Anyone achieved anything similar? Essentially a 3 colour diverging map where you specify the value for the centre colour.
Josiah Baker
hey all, been working on pandas-dev/pandas#27977 as a first time contributor. first off, super excited to get involved. i have two pull requests open and was curious if there is a formal process for requesting a review? thanks!
William Ayd
No formal request process - someone will review as they come across
I’ll try to take a look later today if no one else gets to it in the meantime
Josiah Baker
@WillAyd awesome, good to know! and thanks
Joshua Wilson
Hi, I've been using Pandas (and GeoPandas) for several years and just now realized how many open issues (3000+) and PRs (~150) there are! Is there a roadmap / master plan somewhere? Does the project just move slow/carefully, or are y'all drowning in more work than you can handle?
Marc Garcia
@jwilson8767 we're all volunteers, surely doing more than we can, and help is always welcome. The project is big and stable enough that it moves slowly, and the roadmap is here: https://pandas.pydata.org/pandas-docs/stable/development/roadmap.html
Irv Lustig
@andymcarter Use xlsxwriter to do the conditional formatting. More info here: https://xlsxwriter.readthedocs.io/working_with_conditional_formats.html
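For reference, a minimal sketch of that three-colour scale via pandas' xlsxwriter engine (the filename and cell range are illustrative; requires the xlsxwriter package):

```python
import os
import pandas as pd

df = pd.DataFrame({'delta': [-0.4, -0.1, 0.0, 0.2, 0.5]})

with pd.ExcelWriter('report.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)
    worksheet = writer.sheets['Sheet1']
    # Excel-native three-colour scale with the midpoint pinned at 0,
    # applied to the data cells below the header row
    worksheet.conditional_format('A2:A6', {
        'type': '3_color_scale',
        'mid_type': 'num',
        'mid_value': 0,
    })

written = os.path.exists('report.xlsx')
```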
Thomas Havlik
Hey folks! When using either df.drop(df.index[i], inplace=True) or df = pd.concat([df[:i], df[i+1:]]), some cells become NaN
df.isna()['Timestamp'].sum() == 0 until I try and remove rows from df. At some point, df.at[i+1, 'Timestamp'] becomes nan. The loop for this is range(0, len(df)-1)
i.e. df.at[i+1, :] is always accessible
start, repeat = (0, True)
initial_rows = len(df)
num_dupes = 0
while repeat:
    repeat = False
    for i in range(start, len(df) - 1):
        c = df.iloc[i]
        n = df.iloc[i + 1]
        ct = int(c['Timestamp'])
        nt = int(n['Timestamp'])
        diff = nt - ct
        if diff < dupe_threshold:
            # remove the latter sample, logically ORing
            # this sample with it before removal
            if bool(n['Label']):
                df.at[df.index[i], 'Label'] = True
            # drop the row positionally, then rebuild a clean 0..n-1
            # index: without reset_index, later .at[i, ...] lookups hit
            # missing labels and silently enlarge the frame with NaN rows
            df = pd.concat([df[:i], df[i + 1:]]).reset_index(drop=True)
            start, repeat = (i, True)
            num_dupes += 1
            total_num_dupes += 1
            # restart the scan: positions past i are now stale
            break
nt = int(n['Timestamp']) throws "cannot convert nan to int"
Thomas Havlik
I think when I am setting the value with df.at[i, 'Label'] = True, I am losing those rows' data
Thomas Havlik
This is nonsense. Why is it that df.loc/iloc modify the entire row even if you only want to modify a single column from the row?
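The row-by-row drop loop can also be avoided entirely. A vectorized sketch of the same merge-duplicates logic (data and threshold here are hypothetical): group consecutive rows whose timestamp gaps stay under the threshold into runs, keep each run's first row, and OR the labels across the run.

```python
import pandas as pd

df = pd.DataFrame({'Timestamp': [0, 5, 100, 103, 104, 200],
                   'Label': [False, True, False, False, True, False]})
dupe_threshold = 10

# a new run starts wherever the gap to the previous row is >= threshold
new_run = df['Timestamp'].diff().fillna(dupe_threshold) >= dupe_threshold
run_id = new_run.cumsum()

# keep each run's first timestamp and OR the labels across the run
dedup = (df.groupby(run_id)
           .agg(Timestamp=('Timestamp', 'first'), Label=('Label', 'any'))
           .reset_index(drop=True))
```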