These are chat archives for data-8/datascience

30th
Dec 2015
henryem
@henryem
Dec 30 2015 05:36
Tables will still come before function declarations and for loops in the main class, I think
Carl Boettiger
@cboettig
Dec 30 2015 05:39
@henryem good, that makes sense, I'll start with manipulating real data in Tables then; hopefully will reinforce rather than confuse what they learn in the main class... Finding it hard already to write an intro lesson without [] subsetting...
for instance, which plot for a timeseries is less confusing / most consistent with the core class:
## Tables method: (needs to select column first otherwise it still plots all columns!?)
co2.select(["decimal.date", "average"]).plot("decimal.date")

## matplotlib style, also requires [] subsetting
plt.plot(co2["decimal.date"], co2["average"])
henryem
@henryem
Dec 30 2015 05:45
The first one
Though we'll still do everything you can do with [] indexing, except we'll use method calls instead
It's purely a syntactic simplification
Carl Boettiger
@cboettig
Dec 30 2015 05:46
cool. It would be nice if Tables.plot were as intelligent as pandas.plot for these line objects though
henryem
@henryem
Dec 30 2015 05:46
I don't know if the get-a-column method exists yet
Yeah that's not my department :-/
What's the problem in this case?
Carl Boettiger
@cboettig
Dec 30 2015 05:49
no worries. the data.frame we read in has extra columns, and the Tables.plot method tries to plot all columns as additional lines in different colors if we don't select them out.
Pandas syntax is more concise: it would just be co2.plot("decimal.date", "average"), without the repetition needed in the datascience call
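For reference, a minimal sketch of the pandas call described above. It assumes a small made-up DataFrame standing in for the co2 data: the values and the extra "trend" column are hypothetical, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd

# Hypothetical stand-in for the co2 data; values are made up for illustration.
co2 = pd.DataFrame({
    "decimal.date": [2015.0, 2015.25, 2015.5, 2015.75],
    "average": [399.9, 403.2, 401.3, 398.3],
    "trend": [400.0, 400.5, 401.0, 401.5],  # extra column we do not want plotted
})

# Naming both x and y draws a single line; the extra column is left out
# without a separate column-selection step.
ax = co2.plot(x="decimal.date", y="average")
n_lines = len(ax.get_lines())
```

With only x given, pandas would draw one line per remaining numeric column, which is closer to the behaviour described for the datascience plot method.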
Carl Boettiger
@cboettig
Dec 30 2015 05:57
but that's all pretty minor. I think you've got me on the right path by focusing on the datascience method calls and trying to avoid [] indexing... it does get tricky very fast though; I keep wanting to introduce pandas functions here & there where datascience doesn't have an easy way (that I know of) to do what I need. (and I'm just learning Python as I go myself; coming from R mostly)
henryem
@henryem
Dec 30 2015 06:04
I think there's nothing wrong with using Pandas or matplotlib stuff here and there if it's substantially easier. For example, there's currently no way to label a plot without using matplotlib functions, so we did that in labs. I think some degree of magical thinking about library functions is inevitable anyway.
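A minimal sketch of the labeling workaround mentioned above, assuming the plot is drawn on the current matplotlib axes. A plain plt.plot call with made-up values stands in for the datascience plotting call here, and the label and title strings are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical values standing in for the co2 columns.
dates = [2015.0, 2015.25, 2015.5, 2015.75]
average = [399.9, 403.2, 401.3, 398.3]

plt.figure()              # start from a fresh figure
plt.plot(dates, average)  # stand-in for whatever call draws the plot

# Label the current axes with ordinary matplotlib functions.
plt.xlabel("Decimal date")
plt.ylabel("Average CO2 (ppm)")
plt.title("Monthly average CO2")

ax = plt.gca()  # the axes that the calls above modified
```

The same three calls should work after any plotting call that draws on the current axes, which is why they can be presented as a recipe without explaining matplotlib itself.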