These are chat archives for data-8/datascience
tables (though I'm still learning those myself!). More just wondering about it as a data ingest step -- I often see students struggle with importing data from databases even when they are already well equipped to manipulate the data within a given framework once the data are imported. Teaching all that is of course beyond the scope of what I can get into a connector, but just wondering if it's worth giving some glimpse of a data read/parse command that isn't csv. More a pedagogy issue than a technical one I suppose, and I'm still on the fence.
dplyr.... I have a table in which one column is a grouping factor, so for each group I want to apply a summary function. Here's my R version: https://gist.github.com/cboettig/7ce0f311daa428b023f9
tables group() will apply the summary function to every column, whereas group_by lets you apply it only to some columns
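To make the "every column" vs. "only some columns" difference concrete, here's a rough sketch in plain Python. It is not the datascience or dplyr API: the `summarize` helper and the toy `rows` data are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one grouping column and two value columns.
rows = [
    {"group": "a", "x": 1.0, "y": 10.0},
    {"group": "a", "x": 3.0, "y": 30.0},
    {"group": "b", "x": 5.0, "y": 50.0},
]

def summarize(rows, key, columns, func):
    """Apply func to each listed column, separately within each group."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for col in columns:
            grouped[row[key]][col].append(row[col])
    return {
        g: {col: func(vals) for col, vals in cols.items()}
        for g, cols in grouped.items()
    }

# Summary applied to every value column.
every_column = summarize(rows, "group", ["x", "y"], mean)

# Summary applied to just one chosen column.
only_x = summarize(rows, "group", ["x"], mean)
```

With a table whose group() summarizes every column, one likely workaround is to select just the grouping column and the columns of interest first, then group. I haven't checked that against the library here.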
nan values in python?
nan. Looks like there are two options: for max in particular there is nanmax, which ignores nans. In general you could use np.ma.masked_array(my_array, np.isnan(my_array)) to get a view of my_array that doesn't include the nans, and then do whatever computation you wanted on that view.
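A small sketch of both options, in case it helps. The contents of `my_array` are made up for the example.

```python
import numpy as np

# Made-up data with one missing value.
my_array = np.array([1.0, np.nan, 3.0, 2.0])

# A plain max propagates the nan.
plain_max = np.max(my_array)

# Option 1: nanmax ignores nan entries.
nan_safe_max = np.nanmax(my_array)

# Option 2: mask the nan entries, then compute on the masked view.
masked = np.ma.masked_array(my_array, np.isnan(my_array))
masked_max = masked.max()
masked_mean = masked.mean()
```

The masked version is the more general one, since any reduction (mean, sum, min, ...) then skips the masked entries.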
True/False values in the new column...
np.ma.masked_array on a datascience
x = values.select(["assessid", "ssb"]).where("assessid", "AFSC-BKINGCRABPI-1960-2008-JENSEN")
collapsed(x["ssb"]) returns
nan values, so I'd have expected it to return nan. And in R, when dropping nans, it returns true.
list objects? I'm still a bit foggy on the difference between a list and an array. Is an array a numpy object? For doubles only?
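On the list vs. array question, a quick sketch of the difference as I understand it: a list is Python's built-in container and can mix types, while an array here is a numpy object (numpy.ndarray) that holds a single dtype, which can be ints, floats, strings and so on, not just doubles. The variable names are just for illustration.

```python
import numpy as np

# A built-in list can hold mixed types.
py_list = [1, 2.5, "three"]

# Multiplying a list repeats it.
doubled_list = [1, 2, 3] * 2

# A numpy array has one dtype for all of its elements.
arr = np.array([1, 2, 3])

# Multiplying an array works elementwise.
doubled_arr = arr * 2

# Arrays are not limited to doubles.
float_arr = np.array([1.0, 2.5])
str_arr = np.array(["a", "b"])
```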
pip install datascience.... not quite sure how to check my module version info
import datascience as ds
ds.__version__
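If a package doesn't expose __version__, another way that should work on Python 3.8+ is to ask the installed package metadata. The `installed_version` helper below is just a made-up wrapper for this example.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

# Prints the version if datascience is installed, otherwise None.
print(installed_version("datascience"))
```

From the shell, pip show datascience reports the same information.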
folium sphinx numpy scipy matplotlib pandas IPython
conda install package_name
conda install numpy scipy matplotlib pandas jupyter
pip install folium