These are chat archives for FreeCodeCamp/DataScience

25th
Jul 2016
Evita
@webel
Jul 25 2016 14:56

Hey guys! Hoping someone can help me with a pandas issue!
So I have a dataframe with a column that contains lists, which look like this:
0   [Paid Search]
1   [Paid Search, unavailable]
2   [Paid Search, unavailable, Paid Search]
3   [Paid Search, unavailable, Paid Search, unavai...
4   [Paid Search, unavailable, Paid Search, unavai...
5   [Paid Search, unavailable, Paid Search, unavai...
6   [Paid Search, unavailable, Paid Search, unavai...
7   [Paid Search, unavailable, Paid Search, unavai...
8   [Paid Search, unavailable, Paid Search, unavai...
9   [Paid Search, unavailable, Paid Search, unavai...
10  [Paid Search, unavailable, Paid Search, Direct]
11  [Paid Search, unavailable, Paid Search, Direct...
12  [Paid Search, unavailable, Paid Search, Direct...
13  [Paid Search, unavailable, Paid Search, Direct...
14  [Paid Search, unavailable, Paid Search, Direct...
15  [Paid Search, unavailable, Paid Search, Direct...
16  [Paid Search, unavailable, Paid Search, Direct...
17  [Paid Search, unavailable, Paid Search, Direct...
18  [Paid Search, unavailable, Paid Search, Direct...
19  [Paid Search, unavailable, Paid Search, Direct...
etc.

And I'd like to find out all the unique values that are present in the lists in that column... any thoughts?

oops, it didn't print very nicely here :( sorry about that, but each number is a new row anyway
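
A minimal sketch of one way to collect the unique values from a column of lists. The DataFrame name `df`, the column name `channels`, and the sample rows are all hypothetical; they only mirror the shape of the rows quoted above.

```python
import pandas as pd

# Hypothetical sample data: each row holds a list of channel names.
df = pd.DataFrame({
    "channels": [
        ["Paid Search"],
        ["Paid Search", "unavailable"],
        ["Paid Search", "unavailable", "Paid Search"],
        ["Paid Search", "unavailable", "Paid Search", "Direct"],
    ]
})

# Walk the column once and add every list's items to a single set;
# the set discards duplicates automatically.
unique_channels = set()
for channel_list in df["channels"]:
    unique_channels.update(channel_list)

print(sorted(unique_channels))
```

This still loops over the rows in Python, which is usually fine for a one-off lookup on a column of this kind.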
Evita
@webel
Jul 25 2016 15:14
no worries! I got that down, the data I posted is already cleaned in a previous function so in that function I just tried adding each channel to a set. :)
Second problem: getting the average length of the lists across all rows? It feels unnecessary to iterate through the whole thing?
evaristoc
@evaristoc
Jul 25 2016 15:23
:+1: for the first problem
Second problem: I am afraid you have to iterate, especially if the lengths differ between rows. Suggestion? The first thing that comes to my mind is the apply method in pandas, though it will iterate inefficiently. I think you could get a better result by treating it as a numpy problem and vectorising, but I am not sure...
@webel ^^
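
For the average list length, a minimal sketch along the lines of the `apply` suggestion above. The names `df` and `channels` and the sample rows are hypothetical. `Series.str.len()` is shown as an alternative because it also accepts list elements; since the column holds Python objects, neither call should be assumed to be truly vectorised.

```python
import pandas as pd

# Hypothetical sample data: each row holds a list of channel names.
df = pd.DataFrame({
    "channels": [
        ["Paid Search"],
        ["Paid Search", "unavailable"],
        ["Paid Search", "unavailable", "Paid Search"],
        ["Paid Search", "unavailable", "Paid Search", "Direct"],
    ]
})

# Option 1: call Python's len() on each row's list.
lengths_apply = df["channels"].apply(len)

# Option 2: the .str accessor's len() also works on list elements.
lengths_str = df["channels"].str.len()

# Average number of items per list across all rows.
average_length = lengths_apply.mean()
print(average_length)
```

Either option gives one length per row, so the mean is a single call on the resulting Series and no explicit `reduce` is needed.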
Evita
@webel
Jul 25 2016 15:25
I'm thinking some combo of apply and reduce but having some syntax issues, thanks @evaristoc for the guidance!
CamperBot
@camperbot
Jul 25 2016 15:25
webel sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:cookie: 301 | @evaristoc |http://www.freecodecamp.com/evaristoc