William Ayd
@WillAyd
You are on linux right?
Joris Van den Bossche
@jorisvandenbossche
yes
for me it consistently segfaults each time
William Ayd
@WillAyd
Gotcha
Well that’s a good thing at least
Joris Van den Bossche
@jorisvandenbossche
is it more consistent if you eg include a garbage collection?
William Ayd
@WillAyd
Not sure
Joris Van den Bossche
@jorisvandenbossche
But so it also doesn't consistently segfault when you close the process? (eg quitting the ipython terminal, or stopping your pytest)
William Ayd
@WillAyd
No
Joris Van den Bossche
@jorisvandenbossche
Hmm, then I suppose adding a garbage collection won't help
William Ayd
@WillAyd
Maybe
I’ll be at a computer with a linux VM later today so can give it another look then
That would be very helpful if I could at least reproduce it consistently
Quite the interesting bug
Gaurav Sharma
@greatsharma
Can we bucket categorical data in pandas?
harrys1000rr
@harrys1000rr
Hi guys
Compl Yue
@complyue
RanadeepPolavarapu
@RanadeepPolavarapu

Hi, does anyone know how to groupBy a DateTimeIndex column by a frequency with only the time? I want to be able to ignore the date and only group the time into 10Min intervals. I did this:

agg_10m = df2.groupby(pd.Grouper(freq='10Min')).aggregate(np.sum)

But it returns the date included in the grouping like so:

2016-10-17 12:10:00-04:00,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0
2016-10-17 12:20:00-04:00,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0
2016-10-17 12:30:00-04:00,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2016-10-18 11:10:00-04:00,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2016-10-19 14:40:00-04:00,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Irv Lustig
@Dr-Irv
@RanadeepPolavarapu create a new column representing the minute. Something like this:
>>> df=pd.DataFrame({"data": [1,2,4,8,16], "dates": pd.to_datetime(["2016-10-17 12:10:00-04:00", "2016-10-17 12:20:00-04:00", "2016-10-17 12:30:00-04:00", "2016-10-18 11:10:00-04:00", "2016-10-19 14:40:00-04:00"])})
>>> df.assign(minutes=lambda df:df.dates.dt.minute).groupby('minutes')['data'].sum()
minutes
10     9
20     2
30     4
40    16
Name: data, dtype: int64
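Note that grouping on `dt.minute` alone puts 11:10 and 12:10 in the same group. If the hour should still count and only the date should be ignored, one possible variant is to floor each timestamp to 10 minutes and group on the time of day. This is only a sketch: the sample values and the names `bucket` and `by_time` are made up for illustration and are not from the discussion above.

```python
import datetime

import pandas as pd

# Hypothetical sample data: the same time-of-day bucket appears on several dates.
df = pd.DataFrame({
    "data": [1, 2, 4, 8, 16],
    "dates": pd.to_datetime([
        "2016-10-17 12:10:00",
        "2016-10-17 12:24:00",
        "2016-10-17 13:10:00",
        "2016-10-18 12:13:00",
        "2016-10-19 12:20:00",
    ]),
})

# Floor each timestamp to its 10-minute bucket, then keep only the time of day
# so that the date no longer takes part in the grouping.
bucket = df["dates"].dt.floor("10min").dt.time

# Sum the values per time-of-day bucket across all dates.
by_time = df.groupby(bucket)["data"].sum()
print(by_time)
```

The result is indexed by `datetime.time` objects rather than timestamps, so it cannot be resampled further without converting back.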
nubonics
@nubonics
hi, I was hoping someone could help me with pandas (if you would like the information presented in a different format, please ask and I will gladly oblige). Here is my data, which is accurate: https://imgur.com/a/NMcdhbh
and here is the data in the format I would like, except that it is inaccurate: https://imgur.com/swjUpfo. Using .fillna(method='ffill') is what makes the data inaccurate.
So my question is: how would I go about translating the data from image A to image B?
The problem with the first set of data is that there are NaNs for data that is not actually missing; it is just split across multiple rows (sometimes). If it isn't in another row, then the data doesn't exist, in which case I will drop the row.
Irv Lustig
@Dr-Irv
@nubonics This gitter channel is for pandas development issues. Your question is better placed on Stack Overflow
Gaurav Sharma
@greatsharma
Can we bucket categorical data in pandas?
Please tell
matrixbot
@matrixbot
ruski you mean a groupby?
lucasmarinsnave
@lucasmarinsnave
Hi
isnani
@isnani
Hi, can anyone help me understand what percentage sign means in the following code? @Appender(_shared_docs['reindex_axis'] % _shared_doc_kwargs)
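For readers with the same question: in an expression of this shape, `%` is Python's string-formatting operator. When the right-hand side is a dict, each `%(name)s` placeholder in the template string is replaced by the matching dict value, and the formatted string is then passed to the decorator. The sketch below uses made-up template text and a stand-in decorator named `append_doc`; it is not pandas' actual `Appender` or its real docstring templates.

```python
# Hypothetical stand-ins for the shared docstring template and its substitutions.
shared_docs = {
    "reindex_axis": "\nConform %(klass)s to a new index along the %(axes)s axis.",
}
shared_doc_kwargs = {"klass": "DataFrame", "axes": "index"}


def append_doc(addendum):
    """Illustrative decorator that appends text to a function's docstring."""
    def decorator(func):
        func.__doc__ = (func.__doc__ or "") + addendum
        return func
    return decorator


# The % expression is evaluated first, producing a plain string;
# that string is what the decorator receives.
@append_doc(shared_docs["reindex_axis"] % shared_doc_kwargs)
def reindex_axis():
    """Summary line."""


print(reindex_axis.__doc__)
```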
Joris Van den Bossche
@jorisvandenbossche
sameerCoder
@sameerCoder

DropDown option to show subset of dataframe Tkinter
I have a pandas dataframe that I want to display in a GUI. It has a DATE_TIME column with data at one-hour intervals. I want to add a dropdown so that if the user selects the first hour, only the rows (with all columns) for that hour are shown, and if the user selects the second hour, the rows for the second hour are shown. Can anyone please help me display the data based on the dropdown option? I would really appreciate it. Thanks in advance.

SAMPLE DATA:

Name:   Longitude(degree)   Latitude(degree)    DATE_TIME               Mean Sea Value (m)  DRY or WET
SD      87.0308             21.4441             00:00 IST 05-08-2019    -0.0467             DRY
Sea1    87.0544             21.4152             00:00 IST 05-08-2019    -1.0653             DRY
4K      86.9927             21.4197             00:00 IST 05-08-2019    -0.1331             DRY
4KP1    86.9960             21.4166             00:00 IST 05-08-2019    -0.0863             DRY
Name:   Longitude(degree)   Latitude(degree)    DATE_TIME               Mean Sea Value (m)  DRY or WET
SD      87.0308             21.4441             01:00 IST 05-08-2019    -0.0329             DRY
Sea1    87.0544             21.4152             01:00 IST 05-08-2019    -0.4067             DRY
4K      86.9927             21.4197             01:00 IST 05-08-2019    -0.0897             DRY
4KP1    86.9960             21.4166             01:00 IST 05-08-2019    -0.0676             DRY

I have managed to read the df in Tkinter, but I don't know how to make a dropdown that groups by the DATE_TIME column in Tkinter.
I would appreciate any help.
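The pandas side of this question could be sketched roughly as follows: collect the distinct DATE_TIME values to use as dropdown choices, and filter the frame with a boolean mask when one is selected. The GUI wiring is left out, and the helper name `rows_for` and the cut-down frame are assumptions for illustration; the list in `options` could be handed to a widget such as `tkinter.ttk.Combobox`.

```python
import pandas as pd

# A cut-down version of the sample data, with only the columns needed here.
df = pd.DataFrame({
    "Name": ["SD", "Sea1", "SD", "Sea1"],
    "DATE_TIME": [
        "00:00 IST 05-08-2019",
        "00:00 IST 05-08-2019",
        "01:00 IST 05-08-2019",
        "01:00 IST 05-08-2019",
    ],
    "Mean Sea Value (m)": [-0.0467, -1.0653, -0.0329, -0.4067],
})

# Distinct timestamps, in order of first appearance, to offer in the dropdown.
options = df["DATE_TIME"].drop_duplicates().tolist()


def rows_for(frame, selected):
    """Return all rows (and all columns) whose DATE_TIME equals the selection."""
    return frame[frame["DATE_TIME"] == selected]


# What a dropdown callback might do when the user picks the second hour.
subset = rows_for(df, options[1])
print(subset)
```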

Christian Roy
@roychri

Hi.

This gitter channel is for pandas development issues.

Does it mean developing using pandas or the development of pandas itself?

Tom Augspurger
@TomAugspurger
Development of pandas itself.
Christian Roy
@roychri
oh! :-( Alright, thanks anyway :) bye
Alan Auckland
@alanauckland86
Hi new to programming and Pandas.
I have a dataframe imported from csv. All values are represented as objects. I can do df = dataframe.query('colum.str.contains("test")')
But I can't do
dataframe.query('colum.str.contains("5.1111")')
I set fillna 'missing' in read_excel()
I tried every which way I can find. I guess the . is messing it up.
Alan Auckland
@alanauckland86
Sorry, I made a mistake: I am importing from Excel, not csv.
Everything is an object, which means it's a string of any length. Essentially I want to get all rows where the column1 value contains test, the column2 value contains new test, and the column3 value does not contain test. I think .query is how to do this but I can't get it to work the way I expect. I read the documentation, which is confusing. Somet
I tried dataframe = dataframe
Alan Auckland
@alanauckland86
[~dataframe.Colum.str.contains("5.11")], where ~ means "not", but this errors. I tried converting the column to int but this errors too. It says
invalid literal for int() with base 10: 'Appliance', which means nothing to me either.
Sorry, I am trying to explain
Andrew Tolmie
@DancingQuanta
Hi @alanauckland86 can you show the traceback please?
You said everything is a string in your dataframe. This means you cannot convert to int if a string contains something that does not look like an int
Andrew Tolmie
@DancingQuanta
You used the argument "5.11". You cannot convert it to an int because it has a decimal point. This kind of number is called a float.
Alan Auckland
@alanauckland86
Ah, I worked it out, yes, you're right. An int is not a float. I also needed is_numeric, which belongs to pandas, not the dataframe.
Andrew Tolmie
@DancingQuanta
Good!
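For anyone landing on the same problem, here is a rough sketch of the filtering described above using boolean masks instead of `.query`. The column names and values are invented for illustration. Passing `regex=False` makes `.` match a literal dot, and `na=False` treats missing values as non-matches. The `pd.to_numeric` call at the end is one way to convert strings to numbers; whether it is the function meant in the conversation is a guess.

```python
import pandas as pd

# Hypothetical data: every value is a string, as described in the conversation.
df = pd.DataFrame({
    "column1": ["test a", "5.1111", "Appliance", "test b"],
    "column2": ["new test", "x", "new test", "new test"],
    "column3": ["ok", "ok", "ok", "has test"],
})

# column1 contains "test" AND column2 contains "new test"
# AND column3 does NOT contain "test" (~ negates a boolean mask).
mask = (
    df["column1"].str.contains("test", regex=False, na=False)
    & df["column2"].str.contains("new test", regex=False, na=False)
    & ~df["column3"].str.contains("test", regex=False, na=False)
)
matches = df[mask]

# Literal substring search for a decimal number stored as text.
decimal_rows = df[df["column1"].str.contains("5.11", regex=False, na=False)]

# Convert to numbers where possible; non-numeric strings such as
# "Appliance" become NaN instead of raising an error.
numeric = pd.to_numeric(df["column1"], errors="coerce")

print(matches)
print(decimal_rows)
print(numeric)
```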