sabquat
@sabquat
I am having trouble with a scatter plot. I have posted my problem on Stack Overflow:
https://stackoverflow.com/q/57988832/3862410
I need some hints on my mistake.
I am not sure if this is my mistake or some kind of bug
Chris
@0x5f3759df_gitlab
hi there.
i want to store a class or @dataclass as the values for one of my dataframe columns
i was just wondering if you can access member functions or attributes particularly like when you do the df filtering
like df[df['MYCLASSCOL'].attribute == 'helloworld']
or df[df['MYCLASSCOL'].someBoolMethod(1,2,3)]
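(Attribute access is not forwarded through a Series, so `df['MYCLASSCOL'].attribute` raises `AttributeError` rather than doing an element-wise lookup. The usual workaround is to map each stored object to the value you want and build a boolean mask from that. A minimal sketch, using a hypothetical `Record` dataclass standing in for the class in `MYCLASSCOL`:)

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Record:  # hypothetical stand-in for the class stored in MYCLASSCOL
    attribute: str

    def some_bool_method(self, a, b, c):
        return a + b + c > 0


df = pd.DataFrame({"MYCLASSCOL": [Record("helloworld"), Record("other")]})

# Map each object to the attribute value, then filter with the mask:
mask = df["MYCLASSCOL"].map(lambda r: r.attribute == "helloworld")
filtered = df[mask]

# The same pattern works for method calls:
mask2 = df["MYCLASSCOL"].map(lambda r: r.some_bool_method(1, 2, 3))
```

The per-element `map` call is slow compared to native dtypes, so for large frames it is usually better to unpack the attributes you need into ordinary columns.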
killerontherun1
@killerontherun1
When parsing a dictionary containing a list of datetime objects, pandas converts them to numpy.datetime64. Is this normal? Reproducible example:
>>> from pandas import DataFrame as df
>>> import datetime
>>> inp = {'date': [datetime.datetime.now() for val in range(3)]}
>>> inp_tbl = df(inp)
>>> inp_tbl
                        date
0  2019-09-20 15:00:35.080488
1  2019-09-20 15:00:35.080525
2  2019-09-20 15:00:35.080527
>>> inp_tbl.dtypes
date    datetime64[ns]
dtype: object
>>> inp_tbl['date'].values[0]
numpy.datetime64('2019-09-20T15:00:35.080488000')
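(Yes, this is normal: pandas coerces a list of `datetime.datetime` into a `datetime64[ns]` column for speed. If you need stdlib datetime objects back, `.dt.to_pydatetime()` converts the column, or you can opt out of the coercion by forcing object dtype. A short sketch:)

```python
import datetime

import pandas as pd

inp = {"date": [datetime.datetime(2019, 9, 20, 15, 0, 35) for _ in range(3)]}

# pandas coerces a list of datetime.datetime into a datetime64[ns] column
tbl = pd.DataFrame(inp)

# Convert back to stdlib datetime objects when needed
py_dates = tbl["date"].dt.to_pydatetime()

# Or keep the original objects by forcing object dtype (loses the
# fast vectorized datetime operations)
obj_col = pd.Series(inp["date"], dtype=object)
```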
Felipe Catão do Nascimento
@FelipeKatao
Please, could someone follow me on GitHub? I wrote a project in Python, a new framework, and my profile needs one more follower so I can take part in a competition. Anyone who could help, thank you: https://github.com/FelipeKatao
Ian Horsman
@Ianphorsman
Hey guys, I'm interested in contributing to pandas. I have a lot of experience in Python. I've used pandas on a few projects, and I'd be interested in spending some of my free time helping it grow, whether that's committing to first-timer issues or developing new features. If an active contributor would like to help me get started, or even provide some mentorship, I would appreciate the opportunity.
William Ayd
@WillAyd
Hi @Ianphorsman - that’s great. If you're looking to contribute, we have quite a few things labeled “good first issue” on GitHub. I’d suggest finding one that interests you and working from there
Marc Garcia
@datapythonista
@Ianphorsman I'd suggest having a look at this issue: pandas-dev/pandas#27977, and fixing a few of the problems mentioned there, to get familiar with the project in a simple way
feel free to DM after that, and I'll try to give you advice on what else you could work on, based on your interests
William Ayd
@WillAyd
Does anyone know offhand what our requirements are to add a method to an accessor?
Specifically looking at the .dt accessor, I thought it might be driven by some of the delegate decorators and class variables like _datetimelike_methods
But I see stuff exposed on the accessor that doesn’t appear in those variables (ex: to_pydatetime for a DTA)
Tom Augspurger
@TomAugspurger
I think you’re correct about that. It’s on DatetimeProperties, TimedeltaProperties, etc.
That class also defines a to_pydatetime. But the majority come from _datetimelike_methods I think
William Ayd
@WillAyd
OK cool
Yea just not sure what makes a method like to_pydatetime appear but not to_perioddelta
Will keep digging though; sounds like I’m in the ballpark
Thanks @TomAugspurger
Tom Augspurger
@TomAugspurger
I don’t think to_perioddelta is in DatetimeArray._datetimelike_methods
William Ayd
@WillAyd
Yea
Neither is to_pydatetime
So I think there’s just some more logic than looking at that variable
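(A quick way to see what the `.dt` accessor actually exposes, without tracing the delegation internals, is to inspect the accessor object itself. `_datetimelike_methods` and the delegate machinery are private and version-dependent, so this sketch sticks to the public surface:)

```python
import pandas as pd

s = pd.Series(pd.date_range("2019-12-01", periods=3))

# Public surface of the .dt accessor for a datetime64 Series
exposed = {name for name in dir(s.dt) if not name.startswith("_")}

# Per the discussion above, to_pydatetime shows up here even though it
# is not listed in DatetimeArray._datetimelike_methods -- some methods
# are defined directly on the accessor class rather than delegated.
```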
Phaneendra
@PhaneendraGunda
When uploading a CSV file to Google Drive, it is automatically converted to a Google Sheet. How can I save it as a CSV file in Drive? Or can I read a Google Sheet into a pandas DataFrame in Google Colab?
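(One common approach, assuming the sheet is shared as "anyone with the link can view": Google Sheets exposes a CSV export endpoint, and `pd.read_csv` can read straight from that URL. The sheet id below is hypothetical; replace it with your own. A sketch:)

```python
import pandas as pd


def sheet_csv_url(sheet_id: str, gid: int = 0) -> str:
    """Build the CSV export URL for a link-shared Google Sheet.

    `gid` selects the worksheet tab (0 is the first tab).
    """
    return (
        f"https://docs.google.com/spreadsheets/d/{sheet_id}"
        f"/export?format=csv&gid={gid}"
    )


# Hypothetical sheet id -- replace with the id from your sheet's URL:
url = sheet_csv_url("1AbCdEfGhIjKlMnOp")
# df = pd.read_csv(url)  # works in Colab if the sheet is link-shared
```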
andymcarter
@andymcarter
Trying to replicate some Excel conditional formatting in pandas. I have used the .background_gradient method, but I would like to replicate Excel's three-colour formatting, where you have two colour scales (one positive and one negative) and you define the centre value (e.g. 0).
Has anyone achieved anything similar? Essentially a three-colour diverging map where you specify the value for the centre colour.
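(One way to get a centred diverging scale without Excel is a per-cell style function passed to `Styler.applymap` (renamed `Styler.map` in recent pandas). A minimal sketch that interpolates red below the centre and green above it, fading to white at the centre; the colour choices and the linear fade are assumptions, not a pandas built-in:)

```python
import pandas as pd


def diverging_style(value, center=0.0, vmax=1.0):
    """Per-cell CSS: red below `center`, green above, white at the centre."""
    frac = min(abs(value - center) / vmax, 1.0)  # distance from centre, clipped
    fade = int(255 * (1 - frac))
    if value >= center:
        r, g, b = fade, 255, fade
    else:
        r, g, b = 255, fade, fade
    return f"background-color: rgb({r},{g},{b})"


df = pd.DataFrame({"delta": [-1.0, -0.5, 0.0, 0.5, 1.0]})
# In a notebook, df.style.applymap(diverging_style) renders the gradient
css = [diverging_style(v) for v in df["delta"]]
```

Setting `vmax` to the largest absolute value in the data keeps the two scales symmetric around the centre, mirroring Excel's three-colour rule.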
Josiah Baker
@josibake
hey all, been working on pandas-dev/pandas#27977 as a first-time contributor. first off, super excited to get involved. i have two pull requests open and was curious if there is a formal process for requesting a review? thanks!
William Ayd
@WillAyd
No formal request process - someone will review as they come across
I’ll try to take a look later today if no one else gets to it in the meantime
Josiah Baker
@josibake
@WillAyd awesome, good to know! and thanks
Joshua Wilson
@jwilson8767
Hi, I've been using Pandas (and GeoPandas) for several years and just now realized how many open issues (3000+) and PRs (~150) there are! Is there a roadmap / master plan somewhere? Does the project just move slow/carefully, or are y'all drowning in more work than you can handle?
Marc Garcia
@datapythonista
@jwilson8767 we're all volunteers, surely doing more than we can, and help is always welcome. The project is big and stable enough to move slowly, and you can find the roadmap here: https://pandas.pydata.org/pandas-docs/stable/development/roadmap.html
Joshua Wilson
@jwilson8767
Thanks!
Irv Lustig
@Dr-Irv
@andymcarter Use xlsxwriter to do the conditional formatting. More info here: https://xlsxwriter.readthedocs.io/working_with_conditional_formats.html
Thomas Havlik
@thavlik
Hey folks! When using either df.drop(df.index[i], inplace=True) or df = pd.concat([df[:i], df[i+1:]]), some cells become NaN
df.isna()['Timestamp'].sum() == 0 until I try to remove rows from df. At some point, df.at[i+1, 'Timestamp'] becomes NaN. The loop for this is range(0, len(df)-1)
i.e. df.at[i+1, :] is always accessible
start, repeat = (0, True)
initial_rows = len(df)
num_dupes = 0
while repeat:
    repeat = False
    for i in range(0, len(df)-1):
        c = df.iloc[i]
        n = df.iloc[i+1]
        ct = int(c['Timestamp'])
        nt = int(n['Timestamp'])
        diff = nt - ct
        if diff < dupe_threshold:
            # remove the latter sample, logically ORing
            # this sample with it before removal
            if bool(n['Label']):
                df.at[i, 'Label'] = True
            # this doesn't work
            #df.drop(df.index[i+1], inplace=True)

            # neither did this
            #indexes_to_drop = set([i+1])
            #indexes_to_keep = set(range(df.shape[0])) - indexes_to_drop
            #df = df.take(list(indexes_to_keep))

            # and this didn't either
            df = pd.concat([df[:i], df[i+1:]])

            start, repeat = (i, True)
            num_dupes += 1
            total_num_dupes += 1
            break
nt = int(n['Timestamp']) throws "cannot convert nan to int"
Thomas Havlik
@thavlik
I think when I am setting the value with df.at[i, 'Label'] = True, I am losing those rows' data
Thomas Havlik
@thavlik
This is nonsense. Why is it that df.loc/iloc modify the entire row even if you only want to modify a single column from the row?
df.loc[row, col] = val is a setter for the entire row, values indexed by {col:val}
How can I patch a single column's value for a single row in-place?
Thomas Havlik
@thavlik
assert df.isna().sum()['Timestamp'] == 0
df.at[i, 'Label'] = True
assert df.isna().sum()['Timestamp'] == 0 # AssertionError
Why is this happening? df.at[i, 'Label'] = True should not be purging values for other columns
Joshua Wilson
@jwilson8767
@thavlik Can you check the dtype of column Label before and after using df.at?