NaNs stay NaNs; how you handle them depends on your context. There are test functions to filter them (is.nan?). Do you have them as strings here, or why the regex? Are there new NaNs after selection, e.g. from invalid (out-of-range) selectors?
1 reply
Hi guys
I need help setting up the development environment for documentation contribution.
Since it's my first time, I might need a little guidance :/
Lucas Servi
@Dr-Irv: Hi Irv!
I've run into a bit of a problem in trying to fix an issue with the documentation
I have tried to set up the conda environment for documentation building
however, the make.py file has some unresolved issues, such as 'pandas._libs.interval' has not been imported
2 replies
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe' failed with exit status 1
I have no idea what's causing it.
Is it because I haven't set up the Anaconda environment?
It also throws this one:
note: This error originates from a subprocess, and is likely not a problem with pip.
@Dr-Irv: @MarcoGorelli how do I resolve the error while trying to build the documentation?
3 replies
it throws an error:
ModuleNotFoundError: No module named 'pandas._libs.interval'
My data is a string like "[[[123,234][234,456]],[[234,567],[678,789]]]".
Any idea how to convert this to an actual nested list, [[[123,234][234,456]],[[234,567],[678,789]]],
in a pandas DataFrame?
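One possible approach, sketched under assumptions: if the strings are valid Python literals, `ast.literal_eval` can parse them into nested lists. The sample below is hypothetical and assumes the missing comma between the first two inner lists in the question is a typo; with the string exactly as posted, parsing would fail. The column names `raw` and `parsed` are made up for illustration.

```python
import ast

import pandas as pd

# Hypothetical sample: the data from the question, with a comma added
# between the first two inner lists so that it is a valid literal.
df = pd.DataFrame({"raw": ["[[[123,234],[234,456]],[[234,567],[678,789]]]"]})

# ast.literal_eval safely parses a Python-literal string into real objects.
df["parsed"] = df["raw"].apply(ast.literal_eval)

first = df.loc[0, "parsed"]
print(type(first), first[0][1])
```

If the strings are strict JSON, `json.loads` would work the same way in place of `ast.literal_eval`.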
Hi All - I'm trying to determine if I can set the chunk cache size when reading from an HDF5 file and can't find anything in the documentation. Is this currently possible, or would it be a feature request?
Ginger Bread
I'm new to contributing to large codebases. #46388 is the pull request I opened, and two checks failed on it (pre-commit and typing validation). Should I close the pull request and open a new one with the changes, or is it fine even if they failed? I'm asking because I've seen a few pull requests that got merged without passing typing validation. Please help.
2 replies
Running test suite against Oracle Database
I want to run the pandas test suite (tests/io/test_sql.py) against an Oracle database using SQLAlchemy.
Can anyone please help with running the test suite against Oracle?
Thomas Grainger
but it's in the docs and re-exported using an import *
Thomas Grainger
oh there's no py.typed anyway
Leonidas Tsaprounis

Hi pandas wizards! I've had a really peculiar problem with the pandas DatetimeIndex.
The problem appeared in the sktime CI pipelines: the code was working well on Linux and Mac but was failing on Windows.
sktime has a pandas<=1.1.5 pin, so maybe this is not a problem in recent versions, but it's worth a check.

The core problem was that on Linux and Mac, after some indexing/filtering operations on a MultiIndex DataFrame, the resulting series had a DatetimeIndex with the right frequency, but on Windows the frequency was None. The fix was to add this:

if hasattr(item.index, "freq") and item.index.freq is None:
    item.index.freq = pd.infer_freq(item.index)

Here you can find the CI logs from a dummy PR I did to debug this:

and here is the code where I applied the fix:

The quick-fix works, so no rush, but I'm really curious to figure out what the real cause is.
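
For anyone wanting to try the workaround in isolation, here is a self-contained sketch. It does not reproduce the Windows-specific behaviour; it only shows the guard acting on a hypothetical series whose DatetimeIndex was built without a frequency. The sample dates and values are made up.

```python
import pandas as pd

# Hypothetical series: a DatetimeIndex built from strings carries no freq.
index = pd.DatetimeIndex(["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"])
item = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)
freq_before = item.index.freq

# Same guard as the quick fix: only infer when the frequency is missing.
if hasattr(item.index, "freq") and item.index.freq is None:
    item.index.freq = pd.infer_freq(item.index)

freq_after = item.index.freqstr
print(freq_before, freq_after)
```

Note that `pd.infer_freq` needs at least three timestamps and returns None for irregular data, in which case the assignment leaves the frequency unset.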

Mathieu Leduc-Hamel
:wave: I have an issue with mypy and pandas. I was using pandas-stubs as typing stubs for pandas, but it's not up to date, and I just realized there are already pyi files in the pandas sources. Has anyone here found a way of running mypy on a project using pandas with those stubs? Are they packaged with pandas?
1 reply
How do I learn to read the source code of pandas? Is there any structure or navigation file for the dependencies between the files in pandas? I would be very happy if anyone could enlighten me on this. Greets, elizabeta
Hey to all! I am working on issue #46588 and preparing a pull request. Is there a pandas dev way of removing all possible pandas NaN keys from a dict? Or is there some list of possible NaN values available throughout the codebase? I was hoping to avoid a construct like

nans = [np.nan, None, pd.NaT, pd.NA]  # all possible nan values that I could think of
for nan in nans:
    if nan in dictionary:
        del dictionary[nan]

Thanks in advance
PS: Having some issues using Gitter. Code sample in the image.
Irv Lustig
@mctessi you can use isna(v) to test if a value v is a NaN key
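A minimal sketch of that suggestion, assuming the dictionary keys are scalars (for tuple or array-like keys `pd.isna` returns an array rather than a single bool, so this would need adjusting). The sample dictionary is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical dictionary with several kinds of missing-value keys.
dictionary = {np.nan: 1, None: 2, pd.NaT: 3, pd.NA: 4, "a": 5}

# Keep only the entries whose key is not recognised as missing by pandas.
cleaned = {key: value for key, value in dictionary.items() if not pd.isna(key)}
print(cleaned)
```

Building a new dict avoids having to enumerate the missing-value sentinels by hand.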
Hi folks, looking for guidance to use the test suite for the extension arrays
Hi, I am passing a DataFrame to a stored procedure as a TVP to persist the data in a SQL Server database. The problem I am facing is that if the DataFrame contains None, the TVP is not able to match the table type parameter.
If I replace None with '' in the DataFrame, I am able to execute the procedure, but I end up saving '' in place of NULL in the database.
Is there a way to convert None to NULL in the DataFrame?
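One hedged idea: the Python DB-API represents SQL NULL as `None`, so it may help to build plain Python row tuples with every missing value normalised to `None` before handing them to the driver. This sketch covers only that conversion; how the rows are then bound to the TVP depends on the driver and is not shown. The column names and data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing None and NaN as missing values.
df = pd.DataFrame({"id": [1, 2], "name": ["a", None], "score": [1.5, np.nan]})

# Convert each row to a plain tuple, mapping any missing value to None.
rows = [
    tuple(None if pd.isna(value) else value for value in row)
    for row in df.itertuples(index=False, name=None)
]
print(rows)
```

Doing the replacement on plain tuples sidesteps the way float columns turn `None` back into NaN inside the DataFrame.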
Kozo Nishida

Hi all,
I want to run a specific test from the pandas source code in debug mode with PyCharm.
But it fails with the following error:

C:\Users\hoge\miniforge3\envs\pandas-dev\python.exe "C:\Program Files\JetBrains\PyCharm Community Edition 2021.3\plugins\python-ce\helpers\pycharm\_jb_pytest_runner.py" --path C:/Users/hoge/PycharmProjects/pandas/pandas/tests/io/parser/test_parse_dates.py
Testing started at 22:52 ...
Launching pytest with arguments C:/Users/hoge/PycharmProjects/pandas/pandas/tests/io/parser/test_parse_dates.py --no-header --no-summary -q in C:\Users\hoge\PycharmProjects\pandas\pandas\tests\io\parser

ImportError while loading conftest 'C:\Users\hoge\PycharmProjects\pandas\pandas\conftest.py'.
..\..\..\__init__.py:22: in <module>
    from pandas.compat import is_numpy_dev as _is_numpy_dev
..\..\..\compat\__init__.py:15: in <module>
    from pandas.compat.numpy import (
..\..\..\compat\numpy\__init__.py:4: in <module>
    from pandas.util.version import Version
..\..\..\util\__init__.py:1: in <module>
    from pandas.util._decorators import (  # noqa:F401
..\..\..\util\_decorators.py:14: in <module>
    from pandas._libs.properties import cache_readonly  # noqa:F401
..\..\..\_libs\__init__.py:13: in <module>
    from pandas._libs.interval import Interval
E   ModuleNotFoundError: No module named 'pandas._libs.interval'

Process finished with exit code 4

Empty suite

Let me know if you have any ideas for this ModuleNotFoundError.

3 replies
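The traceback is consistent with the compiled extension modules under pandas/_libs not having been built for this source checkout. That is an assumption, since the replies are not visible here. A small diagnostic sketch, using a hypothetical helper name `compiled_module_available`, can confirm whether the interpreter PyCharm uses is able to import the module:

```python
import importlib


def compiled_module_available(name: str) -> bool:
    """Return True if `name` can be imported, False on any ImportError."""
    try:
        importlib.import_module(name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True


# Run this with the same interpreter PyCharm uses for the project.
print(compiled_module_available("pandas._libs.interval"))
```

If it prints False from the source checkout, building the extensions in place as described in the pandas contributing guide is the likely next step.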
Hello! From my benchmarks, dtype "string[pyarrow]" is 4x less memory-efficient than "string[python]". I was expecting quite the opposite,
unless "memory usage" from df.info() is not reliable for this.
Never mind, I was not using deep=True.
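A short illustration of why `deep=True` matters, using a hypothetical object-dtype column so that it runs without pyarrow installed. Without `deep=True`, pandas counts only the pointers to the Python string objects, not the strings themselves.

```python
import pandas as pd

# Hypothetical frame of Python strings stored as object dtype.
df = pd.DataFrame({"text": ["alpha", "beta", "gamma"] * 1000}, dtype=object)

# Shallow: only the 8-byte object pointers are counted for the column.
shallow = df.memory_usage(deep=False).sum()

# Deep: the size of every Python string object is included as well.
deep = df.memory_usage(deep=True).sum()

print(shallow, deep)
```

The same flag is available as `df.info(memory_usage="deep")`.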
Felix Sargent
Hiya, new to the Pandas community, but was wondering if a module around voting tabulation would be useful for the community. I work a lot in election methods and I've noticed that Pandas is used a lot but everyone rolls their own methods. Would it be something that could belong on Core Pandas or would it be better as its own library?
2 replies

Hi all, I would like to ask if it is possible to pass in the group name to the function when doing groupby/apply?
For example, if I do this, I can get the group name:

for group, frame in data.groupby(["some group"]):
    # do something with frame

If I do this instead, I only get the frame as argument, and I don’t have the group name to even pass as extra argument.

results = data.groupby(["some group"]).apply(transform_func)
1 reply
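One way to get the group name into the function, sketched with hypothetical data and a hypothetical two-argument `transform_func`, is to iterate over the groups and pass the key explicitly. Inside `apply`, the frame passed to the function generally also exposes the key as `frame.name`, though that is worth verifying on the pandas version in use.

```python
import pandas as pd

# Hypothetical data with a grouping column named as in the question.
data = pd.DataFrame({"some group": ["a", "a", "b"], "value": [1, 2, 3]})


def transform_func(group_name, frame):
    # Hypothetical transform: label each group's total with its name.
    return f"{group_name}:{frame['value'].sum()}"


# Iterate over (key, frame) pairs and pass the key explicitly.
results = {
    group: transform_func(group, frame)
    for group, frame in data.groupby("some group")
}
print(results)
```

Grouping by a single column name rather than a one-element list keeps the key a scalar instead of a tuple on newer pandas versions.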
Munsif Raza
Glad to be the part of this powerful community.
Ian Alexander Joiner
Looks like I’m getting the “oldest-supported-numpy>=0.10” error.
May I ask how this is supposed to be fixed? Is this due to some issue in main I’m unaware of?
Hi All, I am trying to run the pandas test suite against a database, mostly the tests under (/tests/io/sql.py), and the connection is only allowed through SQLAlchemy. I am using SQLAlchemy's create_engine() to get an Engine object. My question is about the method
def pandasSQL_builder(con, schema: str | None = None):
which checks whether the given con parameter is either a string or a SQLAlchemy Connectable and otherwise throws a UserWarning.
  1. The check is something like this:
    ```
    if sqlalchemy is not None and isinstance(con, sqlalchemy.engine.Connectable):
        return SQLDatabase(con, schema=schema)
    ```
    This gives me an AttributeError: module 'sqlalchemy.engine' has no attribute 'Connectable'. Did you mean: 'Connection'?
  2. If I pass my connection string as the con parameter, this method calls SQLAlchemy's create_engine(), which returns an Engine object. How can this be compared with engine.Connection or engine.Connectable?
    Link: https://github.com/pandas-dev/pandas/blob/a853022ea154dd38dd759300ee50b456f3a9ddf6/pandas/io/sql.py#L731
    I would like your help on how to use SQLAlchemy to connect pandas to a database. Let me know if I am doing anything wrong.