These are chat archives for FreeCodeCamp/DataScience

6th
May 2016
Victor
@Evaderei
May 06 2016 00:25
@erictleung b'{"delete":{"status":{"id":722520315532853248,"id_str":"722520315532853248","user_id":299893741,"user_id_str":"299893741"},"timestamp_ms":"1462419713259"}}'
Victor
@Evaderei
May 06 2016 00:50
@joeybuczek Right that would make sense but don't byte literals look like b'0101010' something like this?
Joey Buczek
@joeybuczek
May 06 2016 00:51
@Evaderei true.. which is why I'm not sure if it helps ... but, perhaps the feed is generated from something stored as a binary literal? (grasping at straws... and I also don't know enough about it)
Victor
@Evaderei
May 06 2016 00:52
grasping straws is better than pulling hair :P
Joey Buczek
@joeybuczek
May 06 2016 00:52
just strip it out?
or check for it, then strip it out.. etc
Victor
@Evaderei
May 06 2016 00:54
err what do you mean by strip it out? @joeybuczek
So I found something that said it's a byte object?
so it said to decode("utf-8")
I did that for a line, but got an error, saying that the lines are <str> type
and that str object has no attribute "decode"
I'm a long way from being near DJPatel level :P
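The error above is what Python 3 gives when `.decode()` is called on a `str`: only `bytes` objects have that method. A minimal sketch of the difference, where `to_text` is a hypothetical helper name and the sample payload is shortened from the message earlier in the chat:

```python
def to_text(line):
    """Return text whether `line` arrives as bytes or as str."""
    if isinstance(line, bytes):
        # bytes -> str: this is the only case where .decode() exists
        return line.decode("utf-8")
    return line  # already a str, nothing to decode


raw = b'{"delete":{"timestamp_ms":"1462419713259"}}'
text = to_text(raw)

print(type(raw).__name__, type(text).__name__)
print(hasattr(text, "decode"))  # str has no .decode attribute
```

Calling `to_text` on a value that is already a `str` simply returns it, so the same helper can be applied to every line regardless of how the file was opened.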
Joey Buczek
@joeybuczek
May 06 2016 00:57
I'm afraid I won't have the answer either :(
Victor
@Evaderei
May 06 2016 00:59
No worries, it's the thought that counts
Thanks for helping I mean haha
Joey Buczek
@joeybuczek
May 06 2016 01:00
perhaps a search for byte converter or similar for your prog language? shrug
Victor
@Evaderei
May 06 2016 01:02
@joeybuczek It's funny because the json object I want is just inside b' '
I was thinking of using regex to grab the object
and spent a couple hours trying to do it
it seems so easy to just grab the json object
but I guess there's something I just haven't understood yet
Victor
@Evaderei
May 06 2016 01:12
Good news, so I learned more abt python strings
so I did line[2:-1] to remove the b' ' we were talking about earlier. I ended up with the string unchanged and nulls interspersed between every other character
weird O_O
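Slicing can work when each line is a `str` that merely looks like a bytes literal, i.e. the text starts with `b'` and ends with `'`. A rough sketch under that assumption; `strip_bytes_wrapper` is a made-up name, and the sample line is rebuilt from the message at the top of the chat. It does not explain the nulls, which more often hint at an encoding mismatch when the file was read (for example UTF-16 text), though that is only a guess here.

```python
def strip_bytes_wrapper(line):
    """Drop a leading b' and trailing ' if the text has both."""
    line = line.strip()  # remove the trailing newline left by file reads
    if line.startswith("b'") and line.endswith("'"):
        return line[2:-1]
    return line


payload = '{"delete":{"status":{"id":722520315532853248},"timestamp_ms":"1462419713259"}}'
line = "b'" + payload + "'\n"  # assumed shape of one line in the file

json_text = strip_bytes_wrapper(line)
print(json_text)
```

With this approach any escape sequences inside the wrapper (such as `\xe2`) stay as literal backslash text, since slicing does not interpret them.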
Victor
@Evaderei
May 06 2016 01:18
Python documentation isn't pretty like JScript is.
I miss jscript.
Daniel
@profoundhub
May 06 2016 03:48
@Evaderei yea... what ever happened to JScript...
evaristoc
@evaristoc
May 06 2016 09:04

@Evaderei you could either send your code to me or, better, ask in the python room. It is strange that the message is embedded in a binary and is still a string. Anyway: you can easily parse the binary into a string, I think it is str(msg), I haven't checked. Just look up how to get binary into a string.

Also let me know what library you are using to open the message. If you are not using a known python library specifically for twitter you might find yourself dealing with more low-level transformations.

Also, if you want, let me know what you are up to. I have been working with python's twitter libraries recently. We could start a project together if you are interested?

@tufonas your question is not clear. Same for you: you could either send your code to me or, better, ask in the python room. I have the impression that you need something similar to eval for your case BUT please keep in mind that eval shouldn't be used in production, otherwise you run the risk of facilitating certain types of malware attacks.
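For text that looks like a Python literal, the standard library's `ast.literal_eval` is a narrower alternative to `eval`: it accepts literals (strings, bytes, numbers, tuples, lists, dicts, sets, booleans, `None`) and refuses anything else. A small sketch with made-up sample inputs, not a guess at @tufonas's actual code:

```python
import ast

# Text that looks like a bytes literal, escape sequences included
line = r"b'caf\xc3\xa9'"
raw = ast.literal_eval(line)  # -> a real bytes object
text = raw.decode("utf-8")
print(text)

# An expression that eval() would run is rejected by literal_eval
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Because `literal_eval` never calls functions or looks up names, untrusted input cannot trigger arbitrary code the way it can with `eval`.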
evaristoc
@evaristoc
May 06 2016 09:12
@Evaderei ... and what you seem to have there is JSON, I think you should check the python json library. JSON is NOT a format that is directly understood by python, as it conflicts with a similar data type, dictionaries. The json library will parse the JSON document into a dictionary.
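A minimal sketch of what the `json` library does with a message like the one at the top of the chat (the payload is copied from there; the variable names are illustrative):

```python
import json

json_text = (
    '{"delete":{"status":{"id":722520315532853248,'
    '"id_str":"722520315532853248","user_id":299893741,'
    '"user_id_str":"299893741"},"timestamp_ms":"1462419713259"}}'
)

record = json.loads(json_text)  # JSON object -> Python dict
status = record["delete"]["status"]

print(type(record).__name__)
print(status["user_id"], record["delete"]["timestamp_ms"])
```

Nested JSON objects become nested dictionaries, JSON numbers become `int` or `float`, and quoted values such as `timestamp_ms` stay strings.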
Victor
@Evaderei
May 06 2016 09:42
@evaristoc It's a binary? I thought it was a byte object, I used line.str("utf-8") to get rid of the b' '. I then tried to json.loads() the resultant string. But then I got stuck as the tweets had stuff like \u20x7 (just made this up) in them (I got some error I forgot about)
Is there a python library specially for twitter?
evaristoc
@evaristoc
May 06 2016 09:43
Sorry, byte, you are right
Victor
@Evaderei
May 06 2016 09:43
I'm doing an assignment that analyzes sentiment in tweets
no worries
evaristoc
@evaristoc
May 06 2016 09:44
Try by not transforming to string and using json directly on the document?
What book reference are you using for that assignment? A bit outdated, but I recommend "Mining the Social Web" by Russell
@Evaderei ^
Should I read the whole book? Or just go through parts I think might help?
evaristoc
@evaristoc
May 06 2016 09:54

@Evaderei Depends... He goes through Twitter as the main topic of the book, but then there is code that is not commented in some advanced parts of the book if it was already commented previously, so you can get lost with cross references...

You have to combine that with the Twitter API info too. I think Russell was the author of the python Twitter API, I don't remember; but check for updates: his last edition is 2014.

@Evaderei Forgot to mention... when it comes to explaining how to make all the analyses and concepts, the book is the best introduction I have found...
You are going to go through Text Mining so be ready... You can ask here if you are in doubt.
Brahma Reddy Chilakala
@bradd123
May 06 2016 11:12
Hello DataScience group!
evaristoc
@evaristoc
May 06 2016 11:37
Welcome!
Sam Aiken
@SamAI-Software
May 06 2016 12:29
@evaristoc greetings! I didn't get what you meant by that
"Please contact @erictleung to confirm a copy of the file is available."
But I don't see any changes there (last commit 4 days ago)
evaristoc
@evaristoc
May 06 2016 12:35
@krisgesling how is it going with your viz? You suggested to add additional comparisons? Can we see your progress?

@SamAI-Software @erictleung and I are working on separate forks before adding the transformed file. You can see some progress with the code used to create that file at:
https://github.com/erictleung/2016-new-coder-survey/tree/clean-and-combine-data/clean-data

The file is still being worked on locally, but it could be useful to start checking a few discrepancies. If you would like to verify some of them, your help will be welcome.

Kris Gesling
@krisgesling
May 06 2016 13:33
Hey @evaristoc latest version is at http://codepen.io/krisgesling/full/GZwYKV
Country fill based on total number of respondents. Haven't added other data in yet, going to make it a responsive projection tomorrow if I get time. Can easily add more data to mouse over pop up if people think it will be useful
Kris Gesling
@krisgesling
May 06 2016 13:38
I'll take a look at the cleaned data sets too, looks like we might have more people helping out with the article publicity :smile:
Jason Boxman
@jboxman
May 06 2016 13:49
@krisgesling Cool
Brahma Reddy Chilakala
@bradd123
May 06 2016 15:16
Did anyone complete the Udacity Data Analyst Nanodegree?
Daniel
@profoundhub
May 06 2016 16:08
@krisgesling that map by @evaristoc looks great, i have to learn how to do something like that. :)
Jason Boxman
@jboxman
May 06 2016 21:00
There are some books and videos on Safari
But I've found some good resources on the Web too
@bradd123 I haven't - I'd be curious if it's any good?
I have Safari access through work, so I'm riding it for all it's worth