Tom Kooij
@tomkooij
I pushed new manylinux builds to PyPI. No SIGILL on my test machine. They are built using our old wheel-building method, against an external (outdated) blosc. They will be replaced with versions that use an up-to-date blosc soon.
Tom Kooij
@tomkooij
Wheels should be fixed now.
Eric Roberts
@EricR86
they are, thanks!
Eric Roberts
@EricR86
Hello! If I open_file on Python 2 and also on Python 3, I find that the empty HDF5 files are slightly different
only a few bytes near the beginning
and a significantly different number of bytes near the footer
are these files compatible with each other? Will an h5 file created with 2 work with 3, or vice versa?
Tom Kooij
@tomkooij
@EricR86 : Yes, the files should be compatible. I don't know where the difference comes from and I haven't tried reproducing it. If there is a problem, let me know.
Eric Roberts
@EricR86
@tomkooij: When we do an h5dump on both HDF5 files, we find that every time there is a "DATATYPE H5T_STRING", the encoding changes between Python 2 and 3 from CSET H5T_CSET_ASCII to CSET H5T_CSET_UTF8
also notably, when not specifying the title on open_file (with mode="w"), Python 2 declares a DATASPACE SCALAR for it with an empty string as its dataset, while Python 3 declares a DATASPACE NULL with no data defined for it
Tom Kooij
@tomkooij
Thanks. That is at least inconsistent. Can you open an issue on GitHub that describes this? I'll be on holiday for the next few weeks, so I'll probably forget about it if it isn't documented somewhere.
henry senyondo
@henrykironde
Need some help. I'm trying to use PyTables to insert a .CSV file with headers into an HDF5 file and extract the same back to .CSV. Here is the sample: A,B,C,D 1,2,3, 5,6,7,8 ,2,2,2
Tom Kooij
@tomkooij
Perhaps pandas can help? There are some examples in: https://github.com/tomkooij/scipy2017
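For reference, a minimal sketch of such a round trip with pandas (file names and the HDF5 key are placeholders, not from the original question):

```python
import io
import os
import tempfile

import pandas as pd

# The sample CSV from the question, with headers and missing values
csv_text = "A,B,C,D\n1,2,3,\n5,6,7,8\n,2,2,2\n"
df = pd.read_csv(io.StringIO(csv_text))

# Write to HDF5 (pandas uses PyTables under the hood) and read it back
tmpdir = tempfile.mkdtemp()
h5_path = os.path.join(tmpdir, 'sample.h5')
df.to_hdf(h5_path, key='data', mode='w')
back = pd.read_hdf(h5_path, 'data')

# Round-trip back out to CSV; missing cells come back as empty fields
back.to_csv(os.path.join(tmpdir, 'roundtrip.csv'), index=False)
```

Note that columns with missing values are read back as floats with NaN, so the CSV output will not be byte-identical to the input.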
Eric Roberts
@EricR86
@tomkooij hello
I've found a reproducible segfault and I think I've found a possible solution
conda-forge/pytables-feedstock#29
I have not tried to reproduce this outside of conda, but I think it might be possible
Eric Roberts
@EricR86
@FrancescAlted I believe this is also related to how the c-blosc library is built for PyTables since, as far as I can tell, it is the only source of built-in zlib symbols.
Francesc Alted
@FrancescAlted
Hey @EricR86. Yes, c-blosc vendors zlib internally, but as far as I remember, we did not have problems with this before. Is this problem happening with older versions of PyTables, or is the latest version the only one exposing the problem?
Eric Roberts
@EricR86
It's the latest version
The issue, as described, seems pretty tricky to resolve with a built-in zlib
Essentially anyone who imports PyTables first, which also dynamically loads one of its shared libraries, is at serious risk of causing a zlib version mismatch and segfault
The conda recipe itself, as it stands, is pinned to a later zlib version than the built-in one in c-blosc
In the case I described on the pytables-feedstock, it comes from the fact that if a shared object from PyTables is dynamically loaded first, its symbols, including the built-in zlib symbols, take precedence over the ones from the environment's zlib
This happens in the case shown above since it goes utilsextension.so -- dynamically loads --> hdf5.so -- dynamically loads --> libz.so
Eric Roberts
@EricR86
When symbol resolution happens for libz.so, it's at the bottom of the list since it's so far down in the dependency chain. Any symbols from libz.so that are also found in utilsextension.so invariably resolve there, no matter what
Why not dynamically load the environment's zlib if it's available, instead of building it in internally?
Eric Roberts
@EricR86
I just verified that this still happens on the latest conda release with Python 3
'conda create -n tablestest python=3 pytables' is the environment
$ python -c "from tables import utilsextension;import zlib;zlib.decompress(zlib.compress(b'foo'))"
that's the one liner that will cause a segfault
Eric Roberts
@EricR86
@FrancescAlted I'm fairly certain this is reproducible whenever the environment's zlib version mismatches the version (1.2.8) from c-blosc (which is the case for the latest conda release)
Aquil Abdullah
@aabdullah-bos

I need to delete some data from a table; however, the "Hints for SQL users" section states:

one should consider whether dumping filtered data from one table into another isn’t a much more convenient approach.

I would love to use this approach, except I can't find clear documentation on copying filtered data from one table to another.
I can get the data with the following code:
import tables

# Open the existing file in append mode
h5f = tables.open_file('mydata.h5', 'a')
tbl = h5f.get_node('/exp1/observations')
# read_where loads all matching rows into memory at once
data = tbl.read_where('text_ts < "2019-04-01"')
Aquil Abdullah
@aabdullah-bos
I know that I can create a new table using the description (schema) from tbl and then supply data as a parameter, but I was wondering if there is something more memory-efficient, where I could use, say, an iterator or the coordinates to create the table with the filtered data?
I'd like to avoid creating each row using the columns.
Denis Lisov
@tanriol
Do you mean something like tbl.append_where(...)?
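For reference, a minimal sketch of that approach (the schema and column name are invented for illustration; Table.append_where copies matching rows straight into the destination table without materializing an intermediate array):

```python
import os
import tempfile

import tables as tb

# Hypothetical schema, just for illustration
class Obs(tb.IsDescription):
    value = tb.Int32Col()

path = os.path.join(tempfile.mkdtemp(), 'demo.h5')
with tb.open_file(path, 'w') as h5f:
    src = h5f.create_table('/', 'src', Obs)
    dst = h5f.create_table('/', 'dst', Obs)

    # Populate the source table with some rows
    row = src.row
    for v in range(10):
        row['value'] = v
        row.append()
    src.flush()

    # Append only the rows matching the condition to dst
    n_appended = src.append_where(dst, 'value < 5')
```

The destination table must have a description compatible with the source; append_where returns the number of rows appended.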
Aquil Abdullah
@aabdullah-bos
@tanriol That might be it. Let me give it a spin.
Aquil Abdullah
@aabdullah-bos
Thanks @tanriol! That was the magic I needed. It's much faster than deleting rows!
nicain
@nicain
It looks like the most recent release 3 hours ago put the tarball on PyPI, but the wheels are not up yet. FYI, this async deployment breaks our CI builds, and this is the second time it has happened. Is there a reason for the delay in deploying wheels?
Anthony Scopatz
@scopatz
Perhaps you can open an issue for this.
Miroslav Šedivý
@eumiro
Preparing GitHub Actions configuration for PyTables: PyTables/PyTables#834 and https://github.com/eumiro/PyTables/actions if someone wants to follow and give me tips.
Corentin Dancette
@cdancette

Hi! I'm using pytables and I really like it. I have a question about a certain usage

I'm trying to create a table where one field is an array of size (N, 2048), where N is variable (from 1 to 100).
I'm not sure how to go about this: should I create a nested table with a field of size 2048, or is this a use case for VLArray?

for now I was padding like this

class Data(tb.IsDescription):
    array = tb.Float32Col(shape=(100, 2048))

and setting the padded data to zero.
What's the easiest way to do this with pytables ?

Corentin Dancette
@cdancette
I ended up storing this data in a new EArray, and storing positions in the main table
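A sketch of that layout (array, table, and column names are invented for illustration): the variable-length (N, 2048) blocks are appended to a single EArray that is extendable along its first axis, and an index table records each item's start/stop offsets into it.

```python
import os
import tempfile

import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), 'feats.h5')
with tb.open_file(path, 'w') as h5f:
    # Extendable along the first axis; each row is a (2048,) float32 vector
    feats = h5f.create_earray(h5f.root, 'feats', tb.Float32Atom(),
                              shape=(0, 2048))
    # Index table keeps (start, stop) offsets into the EArray
    idx = h5f.create_table(h5f.root, 'index',
                           {'start': tb.Int64Col(), 'stop': tb.Int64Col()})

    for n in (3, 1, 7):          # variable N per item
        start = feats.nrows
        feats.append(np.zeros((n, 2048), dtype='float32'))
        row = idx.row
        row['start'], row['stop'] = start, feats.nrows
        row.append()
    idx.flush()

    # Read item 2 back as a contiguous (7, 2048) block
    rec = idx[2]
    item = feats[rec['start']:rec['stop']]
```

This avoids padding to the maximum N while keeping each item readable with a single slice.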
Andreas Motl
@amotl
Hi there.
Following up on PyTables/PyTables#834, I am about to finish my work on PyTables/PyTables#872 (Expand CI and wheel-building to all platforms using GHA).
In order to test the upload of wheels to PyPI, I've asked for permissions to test.pypi.org at https://github.com/PyTables/PyTables/issues/823#issuecomment-770637412 and wanted to take the chance to also ping @tomkooij about it here.