Chris Holdgraf
@choldgraf
Try that
it tells pandas how to decode the bytes in the file, depending on the encoding you supply
when you read from a URL, there must be some intelligent stuff happening under the hood that infers this
I feel like these are the kind of problems that make people hate coding :P
Carl Boettiger
@cboettig
ha, thanks! yeah, that works.
I suspect the html version is actually somehow getting converted to utf-8
Chris Holdgraf
@choldgraf
yeah that could be
but in general if you get an error along these lines, it's often an encoding problem
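A minimal sketch of why the encoding flag matters, using only the standard library rather than pandas. The sample text and variable names are made up for illustration; the point is that the same bytes decode cleanly under one codec and fail under another.

```python
# Hypothetical sample: one CSV row containing a non-ASCII character.
text = "species,location\nDodo,Île Maurice\n"

# Encode as Latin-1: 'Î' becomes the single byte 0xCE.
raw = text.encode("latin-1")

# Decoding with the matching codec recovers the original text.
decoded = raw.decode("latin-1")

# Decoding the same bytes as UTF-8 fails: 0xCE announces a two-byte
# sequence, but the next byte ('l') is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as err:
    utf8_error = err
```

pandas.read_csv takes the same codec names through its encoding argument, which is presumably what the fix above relies on.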
Carl Boettiger
@cboettig
wonder if I can do some operation on the file itself (e.g. outside of python) to fix the encoding of that file? Never quite had a good grasp of where encodings are set; I always thought they were more a property of assumptions made by the parser about the file than a filetype property....
Chris Holdgraf
@choldgraf
well, in python I believe you can change the string encodings manually
though it's something that has generally confused me over the years
Carl Boettiger
@cboettig
Right. I guess there must be a character somewhere in that csv file that is unique to latin-1, but I don't spot it
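One way to spot the offending characters, sketched with the standard library only: read the file as raw bytes and report every line that does not decode as UTF-8. The function name and the sample data are hypothetical; for a real file, raw would come from open(path, "rb").read().

```python
def find_non_utf8_lines(raw):
    """Return (line_number, line_bytes) for each line that is not valid UTF-8."""
    bad = []
    for number, line in enumerate(raw.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


# Made-up data standing in for the CSV, encoded as Latin-1.
sample = "Kingdom,Name\nAnimalia,Dodo\nPlantae,Café marron\n".encode("latin-1")
bad_lines = find_non_utf8_lines(sample)
```

Here bad_lines holds only the third line, whose 'é' was written as the single Latin-1 byte 0xE9.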
Carl Boettiger
@cboettig
okay, well I can have vim rewrite the encoding... interesting to see the git diff to see what changes: dsten/ecology-connector@aed261b Makes the non-UTF8 characters obvious...
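The same rewrite can be done without an editor. This is a rough sketch, assuming the source file really is Latin-1; the function name, file names, and sample contents are invented for the demonstration.

```python
import os
import tempfile


def reencode_file(src, dst, source_encoding="latin-1"):
    """Rewrite src as UTF-8 at dst, assuming src is in source_encoding."""
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=source_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding="utf-8", newline="") as outfile:
        outfile.write(text)


# Demonstration on a throwaway file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "latin1.csv")
dst = os.path.join(workdir, "utf8.csv")
with open(src, "wb") as f:
    f.write("Kingdom,Name\nPlantae,Café marron\n".encode("latin-1"))

reencode_file(src, dst)
with open(dst, "rb") as f:
    converted = f.read()
```

Only the bytes for non-ASCII characters change, which matches what the git diff makes visible.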
Chris Holdgraf
@choldgraf
interesting
looks like this could be used to detect the character encodings: https://pypi.python.org/pypi/chardet
e.g.: df['Kingdom'].str.encode('utf-8')
so you could loop through your columns, and if it's a column of strings change the encoding to utf-8
then write back to disk
Carl Boettiger
@cboettig
Cool. Though for the students I think it's best if I can give them utf-8 encoded data whenever possible; like you say, it's kind of the dark underbelly, I'm sure no one actually enjoys battling string encodings...
Chris Holdgraf
@choldgraf
oh definitely
I meant just for you/me
Carl Boettiger
@cboettig
right
Chris Holdgraf
@choldgraf
e.g. that seems to work properly
and if I write that dataframe to file, I can now read it back in like before
but w/o the extra 'encoding' flag
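df = extinct.to_df()
# Re-encode each string column as UTF-8; print the columns that get skipped
for col, vals in df.items():
    try:
        df.loc[:, col] = vals.str.encode('utf-8')
    except AttributeError:  # non-string column, no .str accessor
        print(col)
df.to_csv('./test.csv')
ds.Table.read_table('./test.csv')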
there's the code to do it
Carl Boettiger
@cboettig
@choldgraf @SamLau95 Hmm, somehow I can no longer do an apply with a function that takes multiple arguments:
letter = ['a', 'b', 'c', 'z']
count = [9, 3, 3, 1]
points = [1, 2, 2, 10]

t = Table([letter, count, points], ['letter', 'count', 'points'])

# Works
t.apply(lambda x: x * x, 'count')

## throws error
t.apply(lambda x, y: x * y, ['count', 'points'])
Error is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-8fe518b79c55> in <module>()
      7 t.apply(lambda x: x * x, 'count')
      8 
----> 9 t.apply(lambda x, y: x * y, ['count', 'points'])
     10 
     11 #array([ 9,  6,  6, 10])

/usr/local/lib/python3.4/dist-packages/datascience/tables.py in apply(self, fn, column_label)
    447         """Returns an array where fn is applied to each element
    448         of a specified column."""
--> 449         return np.array([fn(v) for v in self[column_label]])
    450 
    451     ############

/usr/local/lib/python3.4/dist-packages/datascience/tables.py in __getitem__(self, label)
    352 
    353     def __getitem__(self, label):
--> 354         return self.values(label)
    355 
    356     def __setitem__(self, label, values):

/usr/local/lib/python3.4/dist-packages/datascience/tables.py in values(self, label)
    438             An instance of ``numpy.array``.
    439         """
--> 440         return self._columns[label]
    441 
    442     def column_index(self, column_label):

TypeError: unhashable type: 'list'
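The traceback shows the installed apply doing a single dictionary lookup with whatever is passed as column_label, and a list cannot be used as a dictionary key. The sketch below reproduces that failure and shows one way to accept a list of labels. MiniTable is a hypothetical stand-in built on plain lists, not the datascience implementation.

```python
class MiniTable:
    """Hypothetical stand-in for a table: a dict mapping label -> list of values."""

    def __init__(self, columns, labels):
        self._columns = dict(zip(labels, columns))

    def apply_single(self, fn, column_label):
        # Same shape as the line in the traceback: one dict lookup.
        # A list label fails here because lists are unhashable.
        return [fn(v) for v in self._columns[column_label]]

    def apply(self, fn, column_label):
        # Accept either one label or a list of labels.
        if isinstance(column_label, str):
            column_label = [column_label]
        rows = zip(*(self._columns[label] for label in column_label))
        return [fn(*row) for row in rows]


t = MiniTable(
    [['a', 'b', 'c', 'z'], [9, 3, 3, 1], [1, 2, 2, 10]],
    ['letter', 'count', 'points'],
)
squares = t.apply(lambda x: x * x, 'count')
products = t.apply(lambda x, y: x * y, ['count', 'points'])

try:
    t.apply_single(lambda x, y: x * y, ['count', 'points'])
    single_error = None
except TypeError as err:
    single_error = err
```

With the list-aware version, products comes out as [9, 6, 6, 10], the values noted in the comment at the end of the notebook cell.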
Sam Lau
@SamLau95
looks like you might have an old version of the package @cboettig
do you have 0.3.dev21?
Chris Holdgraf
@choldgraf
I can confirm that @cboettig's code works on the latest version for me
Chris Holdgraf
@choldgraf
Also - quick quibble, but I often forget to activate my python 3 environment before importing this package, and the error message you get when you do that is a bit cryptic (it just throws an import error)

think we could just put:

import sys
if sys.version_info < (3, 0):
    raise ValueError('This package requires python >= 3.0')

at the root __init__.py?

Sam Lau
@SamLau95
ah, that’s a good idea
Carl Boettiger
@cboettig
@SamLau95 yup, running on ds8.berkeley.edu and it says '0.3.dev21'
Chris Holdgraf
@choldgraf
@SamLau95 want me to make a PR?
Carl Boettiger
@cboettig
It's very strange. I have in my git history successful runs of the notebook with the very same code that works with multiple columns, so I'm not sure why it is failing for me now.
Chris Holdgraf
@choldgraf
hmm, that is weird - and it works for me
problem is I can't really debug it w/o the code breaking for me
maybe just to be sure you can pull the latest changes from the git repo
Sam Lau
@SamLau95
@choldgraf that’d be great
Chris Holdgraf
@choldgraf
@SamLau95 hrmmm, actually it might be more complicated
because it looks like some of the changes actually result in a SyntaxError in 2.7
so it won't get to the point that it's running any code
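A possible way around this, sketched under the assumption that the root __init__.py itself can be kept free of Python-3-only syntax: a SyntaxError is raised when a file is compiled, so a guard only helps if it lives in a file that Python 2 can parse and runs before any module with newer syntax is imported. The function name is hypothetical, and ImportError is used here in place of the ValueError proposed above.

```python
import sys


def check_python_version(version_info=None, minimum=(3, 0)):
    """Raise ImportError with a readable message if the interpreter is too old."""
    if version_info is None:
        version_info = sys.version_info
    if tuple(version_info[:2]) < minimum:
        # %-formatting keeps this file parseable by Python 2 as well.
        raise ImportError(
            "This package requires Python >= %d.%d, but found %d.%d"
            % (minimum[0], minimum[1], version_info[0], version_info[1])
        )


# Would sit at the top of the package's __init__.py, before any other import.
check_python_version()
```

The version_info parameter exists only so the check can be exercised without switching interpreters.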