These are chat archives for FreeCodeCamp/DataScience

7th
Sep 2017
Vignesh Ramesh
@VRamazing
Sep 07 2017 13:15
@evaristoc I am interested
Amélie
@ameliejyc
Sep 07 2017 14:10
Hello @/all we'd really like to use the FCC API but understand it's deprecated. Does anyone here know of any plans/timeline for the API to be usable once more?
Upkar Lidder
@lidderupk
Sep 07 2017 19:14
hi. is this a good place to ask a pandas read_csv question?
Matthew Barlowe
@mcbarlowe
Sep 07 2017 19:37
@lidderupk I might be able to help you out, but there is also a pandas room on Gitter
Upkar Lidder
@lidderupk
Sep 07 2017 19:52

Thanks @mcbarlowe. I am trying to read a csv file with the following format

1”, “data”, “more
data
in multiple lines”, "5”, “6”

read_csv craps out. Is there a way to allow new lines inside the quotes? Or do I just have to clean up the file before feeding it to pandas? Not even sure how I would clean this up.

CamperBot
@camperbot
Sep 07 2017 19:52
:cookie: 126 | @mcbarlowe |http://www.freecodecamp.com/mcbarlowe
lidderupk sends brownie points to @mcbarlowe :sparkles: :thumbsup: :sparkles:
Matthew Barlowe
@mcbarlowe
Sep 07 2017 20:17
yeah, read_csv doesn't like that. Instead, use it with the flag quotechar = ' "" ' so that everything between the quotes is treated as one entry
@lidderupk
Upkar Lidder
@lidderupk
Sep 07 2017 20:20
@mcbarlowe I get
TypeError: "quotechar" must be a 1-character string
and if I just try quotechar = ‘“‘, I get the original ParserError: Error tokenizing data.
Matthew Barlowe
@mcbarlowe
Sep 07 2017 20:29
yeah, sorry, I put in an extra quotation mark there. I know that works for a regular csv reader, so I thought it would be the same for pandas
Matthew Barlowe
@mcbarlowe
Sep 07 2017 20:47
quotechar = '"' is what it should actually be
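A minimal sketch of that setting with Python's standard csv module (the "regular csv reader" mentioned above), assuming the file uses straight double quotes and a space after each comma. The sample data and the `read_rows` name are made up for illustration. pandas' read_csv takes `quotechar` and `skipinitialspace` parameters with the same names. If the file really contains typographic quotes like the ones pasted in the chat, they would not be recognised as quote characters and would need replacing first.

```python
import csv
import io

# Hypothetical sample: a header row plus one record whose third field
# spans three lines inside straight double quotes.
SAMPLE = (
    '"id", "label", "notes", "x", "y"\n'
    '"1", "data", "more\ndata\nin multiple lines", "5", "6"\n'
)


def read_rows(text):
    """Parse CSV text, keeping newlines that appear inside quoted fields."""
    reader = csv.reader(
        io.StringIO(text, newline=""),  # newline="" leaves line endings untouched
        quotechar='"',                  # a single character, as the TypeError demands
        skipinitialspace=True,          # ignore the space that follows each comma
    )
    return list(reader)


rows = read_rows(SAMPLE)
print(len(rows))    # number of logical records, not physical lines
print(rows[1][2])   # the multi-line field, newlines included
```

Without `skipinitialspace=True`, the space after each comma means the field no longer starts with a quote, so the embedded newlines would end the record early.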
Matthew Barlowe
@mcbarlowe
Sep 07 2017 20:58
@lidderupk what does the whole ParserError tell you?
Upkar Lidder
@lidderupk
Sep 07 2017 21:18
Error tokenizing data. C error: Expected 30 fields in line 3, saw 31
I also asked in the pandas channel as you suggested. Thank you.
Matthew Barlowe
@mcbarlowe
Sep 07 2017 21:21
yeah, for some reason it thinks that your line has too many items in the row for the columns you have. Without seeing the actual source it would be hard to see what the problem is. If you set the flag error_bad_lines = False it will skip the bad lines, if you can live with the lack of data
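Before skipping lines, it can help to see which rows have the wrong number of fields. The following is a rough sketch using the standard csv module, not anything from the original file: the `find_bad_rows` name and the sample text are hypothetical, and it assumes the first row is the header. Note also that in more recent pandas releases `error_bad_lines` has been replaced by `on_bad_lines="skip"`.

```python
import csv
import io


def find_bad_rows(text):
    """Return (line_number, field_count) for rows wider or narrower than the header."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        quotechar='"',
        skipinitialspace=True,
    )
    header = next(reader, None)
    if header is None:          # empty input: nothing to report
        return []
    expected = len(header)
    bad = []
    for row in reader:
        if len(row) != expected:
            # line_num is the last physical line consumed for this row
            bad.append((reader.line_num, len(row)))
    return bad


# Hypothetical data: line 3 has one field too many; the quoted
# multi-line record on lines 4-5 is fine.
SAMPLE = 'a,b,c\n1,2,3\n4,5,6,7\n"8","multi\nline",9\n'
bad_rows = find_bad_rows(SAMPLE)
print(bad_rows)
```

A stray unquoted comma or an unbalanced quote on the reported line is the usual thing to look for.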