When I upload a CSV file to Google Drive, it is automatically converted to Google Sheets. How can I save it as a CSV file in Drive? Or can I read a Google Sheet into a pandas DataFrame in Google Colab?
Trying to replicate some Excel conditional formatting in pandas. I have used the .background_gradient method, but I would like to replicate the Excel three-colour formatting where you have two colour scales (one +ve and one -ve) and you define the centre value (e.g. 0).
Anyone achieved anything similar? Essentially a 3 colour diverging map where you specify the value for the centre colour.
Josiah Baker
hey all, been working on pandas-dev/pandas#27977 as a first time contributor. first off, super excited to get involved. i have two pull requests open and was curious if there is a formal process for requesting a review? thanks!
William Ayd
No formal request process - someone will review as they come across
I’ll try to take a look later today if no one else gets to it in the meantime
Josiah Baker
@WillAyd awesome, good to know! and thanks
Joshua Wilson
Hi, I've been using Pandas (and GeoPandas) for several years and just now realized how many open issues (3000+) and PRs (~150) there are! Is there a roadmap / master plan somewhere? Does the project just move slow/carefully, or are y'all drowning in more work than you can handle?
Marc Garcia
@jwilson8767 we're all volunteers, surely doing more than we can, and help always welcome. The project is big and stable enough to move slowly, and you have the roadmap here: https://pandas.pydata.org/pandas-docs/stable/development/roadmap.html
Joshua Wilson
Irv Lustig
@andymcarter Use xlsxwriter to do the conditional formatting. More info here: https://xlsxwriter.readthedocs.io/working_with_conditional_formats.html
Thomas Havlik
Hey folks! When using either df.drop(df.index[i], inplace=True) or df = pd.concat([df[:i], df[i+1:]]), some cells become NaN
df.isna()['Timestamp'].sum() == 0 until I try and remove rows from df. At some point, df.at[i+1, 'Timestamp'] becomes nan. The loop for this is range(0, len(df)-1)
i.e. df.at[i+1, :] is always accessible
start, repeat = (0, True)
initial_rows = len(df)
num_dupes = 0
while repeat:
    repeat = False
    for i in range(0, len(df)-1):
        c = df.iloc[i]
        n = df.iloc[i+1]
        ct = int(c['Timestamp'])
        nt = int(n['Timestamp'])
        diff = nt - ct
        if diff < dupe_threshold:
            # remove the latter sample, logically ORing
            # this sample with it before removal
            if bool(n['Label']):
                df.at[i, 'Label'] = True
            # this doesnt work
            #df.drop(df.index[i+1], inplace=True)

            # neither did this
            #indexes_to_drop = set([i+1])
            #indexes_to_keep = set(range(df.shape[0])) - indexes_to_drop
            #df = df.take(list(indexes_to_keep))

            # and this didnt either
            df = pd.concat([df[:i], df[i+1:]])

            start, repeat = (i, True)
            num_dupes += 1
            total_num_dupes += 1
nt = int(n['Timestamp']) throws "cannot convert nan to int"
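One way to avoid deleting rows while looping over them is to group near-duplicate rows and collapse each group in one pass. The following is a minimal sketch of that idea, not the poster's code. It assumes the rows are already in time order, that `Timestamp` is numeric and `Label` is boolean, and that a run of rows each closer than `dupe_threshold` to its predecessor counts as one group. The function name `merge_near_duplicates` is made up for illustration.

```python
import pandas as pd


def merge_near_duplicates(df, dupe_threshold):
    """Collapse runs of rows whose Timestamp gap is below dupe_threshold.

    Each run keeps its first Timestamp and the logical OR of its Labels.
    """
    gaps = df["Timestamp"].diff()
    # The first row has no predecessor (NaN gap), so force it to start a group.
    starts_group = gaps.fillna(dupe_threshold) >= dupe_threshold
    group_id = starts_group.cumsum().rename("group")
    merged = df.groupby(group_id).agg(
        Timestamp=("Timestamp", "first"),
        Label=("Label", "any"),
    )
    return merged.reset_index(drop=True)


# Made-up sample data for illustration only.
sample = pd.DataFrame(
    {
        "Timestamp": [0, 5, 100, 103, 300],
        "Label": [False, True, False, False, False],
    }
)
deduped = merge_near_duplicates(sample, dupe_threshold=10)
print(deduped)
```

Because nothing is dropped inside a loop, the index labels and positions never drift apart, which is the situation the loop above seems to run into.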
Thomas Havlik
I think when I am setting the value with df.at[i, 'Label'] = True, I am losing those rows' data
Thomas Havlik
This is nonsense. Why is it that df.loc/iloc modify the entire row even if you only want to modify a single column from the row?
df.loc[row, col] = val is a setter for the entire row, values indexed by {col:val}
How can I patch a single column's value for a single row in-place?
Thomas Havlik
assert df.isna().sum()['Timestamp'] == 0
df.at[i, 'Label'] = True
assert df.isna().sum()['Timestamp'] == 0 # AssertionError
Why is this happening? df.at[i, 'Label'] = True should not be purging values for other columns
Joshua Wilson
@thavlik Can you check the dtype of column Label before and after using df.at?
Joshua Wilson
Also, are you sure you don't mean to be using df.iat instead of df.at?
Thomas Havlik
it's bool
iAt based indexing can only have integer indexers for df.iat[i, 'Label'] = True
df.iat[i, df.columns.get_loc('Label')] = True appears to work
Joshua Wilson
Awesome, glad that was all it was.
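A likely explanation, offered as an assumption since the full DataFrame was not shown: `df.at` looks rows up by index label, and once rows have been removed the label `i` may no longer exist. In that case the assignment adds a new row, and its other columns are NaN. `df.iat` addresses rows by position, so it always lands on an existing row. A small sketch with made-up data:

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame(
    {
        "Timestamp": [100, 101, 250, 400],
        "Label": [False, True, False, False],
    }
)

# Drop the row at position 1; the remaining index labels are 0, 2, 3.
df = df.drop(df.index[1])

# Label-based write: label 1 is gone, so pandas appends a new row.
by_label = df.copy()
by_label.at[1, "Label"] = True

# Position-based write: position 1 is the second remaining row.
by_position = df.copy()
by_position.iat[1, by_position.columns.get_loc("Label")] = True

print(by_label)
print(by_position)
```

Calling `df.reset_index(drop=True)` after removing rows is another way to keep labels and positions in step, so that `df.at[i, ...]` and `df.iloc[i]` refer to the same row.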
hi I'm having a problem with pandas data frame ..
it takes away the last decimal rounding
hey yo
Hello, I would really like to contribute, but I feel so lost in this huge project.
Can anyone recommend a simple yet productive issue? I'd love to handle it.
Gabriel Corona
@jreback: Shall I rebase instead of merge ? (pandas-dev/pandas#28459)
OK, no, it's documented in the contributing doc.
@Dr-Irv thanks, I have multiple dataframes of different shapes on a single Excel tab, so keeping track of the exact position and extent of each df precluded using xlsxwriter. I did end up figuring out an implementation for the 3 colormap styler without xlsxwriter, modifying this SO post (https://stackoverflow.com/a/57445863/8731272)
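For readers wondering what such a three-colour scale might look like, here is a minimal sketch. It is not the implementation from the linked post: the function name `three_colour_css`, the helper `_blend`, and the default red/white/green colours are all assumptions chosen for illustration. It blends linearly from a centre colour towards a low colour for values below the centre and towards a high colour for values above it.

```python
import pandas as pd


def _blend(c0, c1, t):
    """Linearly interpolate between two RGB tuples, with t in [0, 1]."""
    return tuple(int(round(a + (b - a) * t)) for a, b in zip(c0, c1))


def three_colour_css(df, centre=0.0,
                     low=(248, 105, 107),   # assumed colour for the minimum
                     mid=(255, 255, 255),   # assumed colour at the centre
                     high=(99, 190, 123)):  # assumed colour for the maximum
    """Return a DataFrame of CSS strings with the same shape as df."""
    vmin = min(df.min().min(), centre)
    vmax = max(df.max().max(), centre)

    def css(value):
        if pd.isna(value):
            return ""
        if value < centre:
            rgb = _blend(mid, low, (centre - value) / (centre - vmin))
        elif value > centre:
            rgb = _blend(mid, high, (value - centre) / (vmax - centre))
        else:
            rgb = mid
        return "background-color: #{:02x}{:02x}{:02x}".format(*rgb)

    return df.apply(lambda col: col.map(css))


# Made-up numbers for illustration only.
demo = pd.DataFrame({"a": [-2.0, 0.0, 4.0], "b": [1.0, -1.0, 2.0]})
styles = three_colour_css(demo, centre=0.0)
print(styles)
```

Since the function returns a same-shaped DataFrame of CSS strings, it should be usable as `demo.style.apply(three_colour_css, axis=None, centre=0.0)`, which needs jinja2 installed for the Styler.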

I'm trying to create a Python environment as described here https://dev.pandas.io/docs/development/contributing.html#creating-a-development-environment but the command "python -m pip install -e . --no-build-isolation" fails due to an unknown version of pandas (pandas 0+unknown). How can I fix it?


Obtaining file:///C:/Users/GuestUser/Documents/GitHub/pandas
Preparing wheel metadata ... done
Requirement already satisfied: pytz>=2017.2 in c:\users\guestuser\miniconda3\envs\pandas-dev\lib\site-packages (from pandas==0+unknown) (2019.2)
Requirement already satisfied: numpy>=1.13.3 in c:\users\guestuser\miniconda3\envs\pandas-dev\lib\site-packages (from pandas==0+unknown) (1.16.5)
Requirement already satisfied: python-dateutil>=2.6.1 in c:\users\guestuser\miniconda3\envs\pandas-dev\lib\site-packages (from pandas==0+unknown) (2.8.0)
Requirement already satisfied: six>=1.5 in c:\users\guestuser\miniconda3\envs\pandas-dev\lib\site-packages (from python-dateutil>=2.6.1->pandas==0+unknown) (1.12.0)
ERROR: xarray 0.13.0 has requirement pandas>=0.19.2, but you'll have pandas 0+unknown which is incompatible.
ERROR: statsmodels 0.10.1 has requirement pandas>=0.19, but you'll have pandas 0+unknown which is incompatible.
ERROR: seaborn 0.9.0 has requirement pandas>=0.15.2, but you'll have pandas 0+unknown which is incompatible.
ERROR: fastparquet 0.3.2 has requirement pandas>=0.19, but you'll have pandas 0+unknown which is incompatible.
Installing collected packages: pandas
Found existing installation: pandas 0+unknown
Can't uninstall 'pandas'. No files were found to uninstall.
Running setup.py develop for pandas

Successfully installed pandas-0.25.1


Tom Augspurger

It looks like it may have succeeded?

But if you’re worried, you might repeatedly conda uninstall -y --force pandas and pip uninstall -y pandas and maybe remove any pandas directories under your site-packages folder, including a possible pandas.dist_info or pandas.egg_info.

it looks like it has succeeded, but it also emitted 4 errors. can I ignore them?
Tom Augspurger
Not sure. Do pandas.__version__ and pandas.__file__ look correct? If so, then yes you can ignore.
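A minimal sketch of that check: for an editable install, `pandas.__file__` would be expected to point inside the git checkout rather than into site-packages. The helper `installed_from` is hypothetical and only compares paths.

```python
from pathlib import Path

import pandas


def installed_from(module_file, checkout_dir):
    """Hypothetical helper: True if module_file lives under checkout_dir."""
    module_path = Path(module_file).resolve()
    checkout = Path(checkout_dir).resolve()
    return checkout in module_path.parents


print(pandas.__version__)  # version string reported by the imported package
print(pandas.__file__)     # location the package was actually imported from
```

For example, `installed_from(pandas.__file__, "path/to/your/pandas/checkout")` would return True only when the imported package comes from that directory; the path here is a placeholder.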

--- print(pandas.__version__)
--- print(pandas.__file__)

i think it is not good

Ochirgarid Chinzorig
Hello guys, I'm trying to set up a pandas-dev environment following the instructions at https://dev.pandas.io/docs/development/contributing.html
But I keep getting dependency errors; so far I've installed sphinx and pyzmq manually
I'm on
  • Ubuntu 18.04.03
  • Python 3.6.8
  • pip 19.1.1
  • virtualenv 16.0.0
Ochirgarid Chinzorig
Any help with the environment? Or should I use Docker or another environment?
Thanks in advance
William Ayd
Anyone else noticing an issue with 32 bit Linux environment?
Haven’t looked deeply but seen on a couple PRs
Maybe an issue with pytest release
Josiah Baker
hey @Ochirgarid can you provide an example of what errors you are getting? have you created an environment using conda as per the instructions?
you shouldn’t have to install any of the dependencies manually as that is managed by conda env create -f environment.yml