These are chat archives for FreeCodeCamp/DataScience

10th
Feb 2016
Carl Parrish
@carl-parrish
Feb 10 2016 00:32 UTC
One of the reasons I'm thinking of moving. I'll be visiting for the first time in September.
evaristoc
@evaristoc
Feb 10 2016 08:36 UTC
Let me know when you come
evaristoc
@evaristoc
Feb 10 2016 10:35 UTC

Hi everyone:
For those who might try to open the Torrent dataset with python:
My version of the dataset came as a line-by-line file; the best way to read it in my case was with either the readlines or the readline method.
You could then use the json library to parse it, but first you need to strip the non-JSON characters at the beginning and the end of each line you want to read (e.g. the newline character).

NOTE: it is huge! So far I needed 9.0 GiB of memory just to open it. I have enough memory for one read, but if you don't, you would have to read it in chunks. Perhaps this can help: http://stackoverflow.com/questions/519633/lazy-method-for-reading-big-file-in-python
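
A minimal sketch of the lazy, line-by-line approach described above, so the whole file never has to sit in memory at once. It assumes each line holds one JSON value and that the only non-JSON debris is surrounding whitespace, an optional trailing comma, or a lone `[` / `]` line; the real dataset may need different cleanup. The function name `iter_json_lines` and the file name in the usage note are hypothetical.

```python
import json
from typing import Any, Iterator


def iter_json_lines(path: str) -> Iterator[Any]:
    """Lazily yield one parsed JSON value per non-empty line of the file."""
    with open(path, "r", encoding="utf-8") as handle:
        # Iterating over the file object reads one line at a time,
        # so memory use stays close to the size of a single line.
        for raw_line in handle:
            # Assumption: debris is limited to whitespace (including the
            # newline) and an optional trailing comma.
            line = raw_line.strip().rstrip(",")
            # Skip blank lines and lines that only open or close an array.
            if not line or line in ("[", "]"):
                continue
            yield json.loads(line)
```

Something like `for record in iter_json_lines("torrents.json"): ...` would then process one record at a time instead of loading everything with readlines.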

Matt Gilbert
@Alquh
Feb 10 2016 23:03 UTC
anyone have experience with web crawling/data mining? i am trying to get data from a website and store it in a db on my website
Rex Schrader
@SaintPeter
Feb 10 2016 23:26 UTC
@Alquh There are a number of good tutorials on the web. Search for "Web Scraping" for the basics. I've done a little bit, so I may be able to help.