These are chat archives for FreeCodeCamp/DataScience

Feb 2017
Tom Lee
Feb 01 2017 00:29
Hi all, can anyone recommend a Ruby/JS library for scraping PDFs? :smile:
G Singh
Feb 01 2017 01:55
hello world
Feb 01 2017 01:55

welcome to FreeCodeCamp @gsingh1313!

Alice Jiang
Feb 01 2017 04:06
@erictleung yes! I was telling my mom about them and said basically that same thing... I wonder how much good content I've missed because I've accidentally developed the ability to scroll through dozens of their posts without paying enough attention to realize I'm scrolling through their posts :/
Eric Leung
Feb 01 2017 07:25

@becausealice2 someone should make like a collaborative filter or something on their posts to find the good ones :laughing:

@user512 sorry, I don't know of any...

@gsingh1313 welcome!

Hèlen Grives
Feb 01 2017 11:52
@gsingh1313 hi welcome!
About my project: it is coming along very slowly. I decided to write a Python module for some of the tasks, just to see how I wrestle with that. I do need to process hundreds of files, so I have set up the project structure as well as I can. There's still a small issue. If I OCR the files, correcting them immediately will be much easier. The output won't be 100% accurate. However, I don't know whether that step leaves the data raw enough, or whether I should process the files first and correct them later. I haven't made up my mind about that yet. Anyway, I keep in mind the article about how raw data itself is never actually processed or worked with; what you work with is always a derivative.
Feb 01 2017 13:46
@user512 Sorry, no idea. I haven't tried working with PDFs yet.
Feb 01 2017 20:41