NaNs would still have a 'ones' bitmask?
rowmask only gets written if you pass in a list, not when you pass in a DataFrame.
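To make the rowmask idea concrete, here is a minimal sketch of a per-column presence bitmask built from a list of dict rows. It is not arctic's actual code: the helper names `build_rowmask` and `read_rowmask` and the sample ticks are made up for illustration, and it only assumes NumPy's `packbits`/`unpackbits`.

```python
import numpy as np


def build_rowmask(rows, column):
    """Pack one bit per row: 1 if the row carries a value for `column`."""
    present = np.array([column in row for row in rows], dtype=np.uint8)
    # packbits pads the final byte with zero bits
    return np.packbits(present)


def read_rowmask(packed, n_rows):
    """Unpack the bitmask and drop the padding bits."""
    return np.unpackbits(packed)[:n_rows].astype(bool)


# Hypothetical ticks: the second row has no "ask", the third has no "bid"
ticks = [{"bid": 1.0, "ask": 1.1}, {"bid": 1.2}, {"ask": 1.3}]

bid_mask = read_rowmask(build_rowmask(ticks, "bid"), len(ticks))
ask_mask = read_rowmask(build_rowmask(ticks, "ask"), len(ticks))
print(bid_mask.tolist())  # [True, True, False]
print(ask_mask.tolist())  # [True, False, True]
```

With a list of dicts a missing key is visible per row, which is what a mask like this records. A DataFrame has a cell for every row and column, so a missing value can only show up as NaN in the data itself.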
numpy.int32 are written as numpy.int64 into arctic. And using my custom Scala driver, if I store them as int32, then reading them using your Python driver I get
def _set_or_promote_dtype(self, column_dtypes, c, dtype):
    existing_dtype = column_dtypes.get(c)
    if existing_dtype is None or existing_dtype != dtype:
        # Promote ints to floats - as we can't easily represent NaNs
        if np.issubdtype(dtype, int):
            dtype = np.dtype('f8')
        column_dtypes[c] = np.promote_types(column_dtypes.get(c, dtype), dtype)
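For anyone who wants to poke at this promotion logic outside the library, here is a rough standalone sketch of the method quoted above. The function name `set_or_promote_dtype` and the column names are mine, not arctic's. One deliberate difference: it checks against `np.integer` instead of the builtin `int`, on the assumption that recent NumPy versions treat `int` as `np.int64` only, which would let `int32` slip through unpromoted.

```python
import numpy as np


def set_or_promote_dtype(column_dtypes, column, dtype):
    """Record `dtype` for `column`, widening if it conflicts with the stored one."""
    dtype = np.dtype(dtype)
    existing = column_dtypes.get(column)
    if existing is None or existing != dtype:
        # Promote any integer type to float64, since ints cannot hold NaN
        if np.issubdtype(dtype, np.integer):
            dtype = np.dtype("f8")
        column_dtypes[column] = np.promote_types(
            column_dtypes.get(column, dtype), dtype
        )
    return column_dtypes[column]


dtypes = {}
set_or_promote_dtype(dtypes, "volume", np.int32)
print(dtypes["volume"])  # float64

set_or_promote_dtype(dtypes, "price", np.float32)
print(dtypes["price"])  # float32

# A wider float arriving later widens the stored dtype
set_or_promote_dtype(dtypes, "price", np.float64)
print(dtypes["price"])  # float64
```

So under this logic an integer column never keeps an integer dtype: it is stored as float64 regardless of whether it arrived as int32 or int64.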
Hi, I'm a PhD student considering using arctic for some time series storage and analysis. However, I'm not gonna store financial data. It'll essentially be health time series w/ metadata, possibly multivariate.
Looked online and didn't seem to find anyone ever exploring this.
Was wondering if:
I have been using arctic (and love the simple API) but changed to flat files, which IMHO are better suited than MongoDB for time series. You might want to have a look at Apache Parquet or HDF5.
Also Dask is amazing for big data.
Flat files don't support time traveling and aren't easy to provide over cloud. I've been experimenting with TileDB and it's been higher performance, provides arctic-like features such as time-travel and scales well due to its serverless architecture.