These are chat archives for FreeCodeCamp/DataScience

14th
Nov 2017
glen weyombo
@weyoG
Nov 14 2017 10:20 UTC
hi guys
i am new to data science but have some coding experience
i have a project at hand and i need someone to guide me on where and how to implement it.
i have been following some tutorials on ML and have written a basic model that predicts house prices. my question is how do i use it in real life? for example, i want to expose an endpoint, be it SOAP or REST, where i can pass in the variables and use the model to predict the house price
Josh Goldberg
@GoldbergData
Nov 14 2017 11:44 UTC
Hi, how do I access the pop-up that was available when I first entered this room?
Matthew Barlowe
@mcbarlowe
Nov 14 2017 12:17 UTC
@weyoG what language did you write your model in?
But usually you save the model to a file and then call that file in another script and pass it the variables that way
Matthew Barlowe
@mcbarlowe
Nov 14 2017 13:06 UTC
finally figured it out @evaristoc this link was what finally made the pieces click https://en.wikipedia.org/wiki/Disjoint-set_data_structure
glen weyombo
@weyoG
Nov 14 2017 14:12 UTC
@mcbarlowe I am using python.
Matthew Barlowe
@mcbarlowe
Nov 14 2017 14:16 UTC
Then pickle is one common way to save and load your model for future use, at least if you're using scikit-learn
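A minimal sketch of the save-and-load workflow described above, using only the standard library. `MeanPriceModel`, its training data, and the file name are hypothetical stand-ins that do not come from the chat; a fitted scikit-learn estimator can be passed to `pickle.dump` in the same way.

```python
import pickle
import tempfile
from pathlib import Path


class MeanPriceModel:
    """Hypothetical stand-in for a trained model (not from the chat).

    It exposes fit/predict like a scikit-learn estimator, so the pickle
    steps below look the same as they would for a real fitted model.
    """

    def fit(self, X, y):
        # "Training" here only computes an average price per square metre.
        self.price_per_m2_ = sum(y) / sum(row[0] for row in X)
        return self

    def predict(self, X):
        return [row[0] * self.price_per_m2_ for row in X]


# Training script: fit the model, then save it to a file.
model = MeanPriceModel().fit([[50], [100], [150]], [100_000, 200_000, 300_000])
model_path = Path(tempfile.gettempdir()) / "house_price_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Prediction script: load the file and pass it new variables.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

prediction = loaded_model.predict([[80]])
print(prediction)
```

When the file is loaded from a separate script, the model's class has to be importable there, and pickle files should only be loaded from sources you trust.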
glen weyombo
@weyoG
Nov 14 2017 14:31 UTC
@mcbarlowe any material, tutorial or example i can follow?
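One possible way to expose a saved model over HTTP, sketched with Python's standard-library `http.server` rather than a web framework. The `/predict` path, the `area_m2` and `bedrooms` fields, and the `predict_price` formula are assumptions for illustration only; in a real service `predict_price` would call the loaded model's `predict` method.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_price(features):
    """Hypothetical stand-in for a call to a loaded model's predict()."""
    return 2000.0 * features["area_m2"] + 10000.0 * features["bedrooms"]


def handle_payload(body):
    """Turn a JSON request body into a (status_code, response_dict) pair."""
    try:
        features = json.loads(body)
        price = predict_price(features)
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON or missing/invalid fields become a 400 response.
        return 400, {"error": f"bad request: {exc!r}"}
    return 200, {"predicted_price": price}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()


# The request logic can be exercised without starting the server.
status, response = handle_payload(b'{"area_m2": 80, "bedrooms": 2}')
print(status, response)
```

Calling `serve()` starts the endpoint; a JSON body can then be sent with a POST request to `http://127.0.0.1:8000/predict`. Keeping the prediction logic in `handle_payload` means it can be tested separately from the HTTP layer.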
evaristoc
@evaristoc
Nov 14 2017 17:50 UTC
@mcbarlowe ohhhh! You see? I saw the algo being formulated for the problem but I didn't recognize it as such!!! Thanks!!!!
@GoldbergData in the settings of the room, probably a button ("Room Settings") on your top-right that looks like a tuner. Possibly the Settings sub-menu.
Rhistina Revilla
@Rhistina
Nov 14 2017 18:40 UTC

Question for pandas/python. is there a way to drop rows with all 0s in the columns EXCEPT for the index? i found this link below for all columns

https://stackoverflow.com/questions/22649693/drop-rows-with-all-zeros-in-pandas-data-frame

evaristoc
@evaristoc
Nov 14 2017 19:07 UTC

@Rhistina how did it go with the model in the end? (Wasn't it you who asked a question last time about a regression model?)

About your question above, I am not sure I understand it.

"drop rows with all 0s in the columns"

Is this about dropping the column or the row? Or do you mean a row full of 0s that must be dropped? Or a row that is all 0s except for the index?

If the latter, I don't think a pandas operation will look at the index. If it does, you might be applying an operation from another library, likely numpy, to a pandas object.

Rhistina Revilla
@Rhistina
Nov 14 2017 19:09 UTC
regression model was not me :)
evaristoc
@evaristoc
Nov 14 2017 19:09 UTC
Ok!
Rhistina Revilla
@Rhistina
Nov 14 2017 19:09 UTC
but i think i have my answer. i hadn't looked at the top answer in the stack overflow link and had used all() instead of any()
evaristoc
@evaristoc
Nov 14 2017 19:10 UTC
Hmmmm.... not sure. But if it works for you, then it is ok. Check your results though!
Not sure, not saying it is not ok.
Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:11 UTC
@Rhistina why don’t you just subset the dataframe like this df[df.values.sum(axis=1) != 0]
evaristoc
@evaristoc
Nov 14 2017 19:12 UTC
In that case, any() is correct: it keeps every row with at least one value different from 0. So it is a good alternative if that is what you want.
Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:14 UTC
But if you have negative values that might not work
Rhistina Revilla
@Rhistina
Nov 14 2017 19:14 UTC
yeah i need negative numbers
evaristoc
@evaristoc
Nov 14 2017 19:14 UTC
@mcbarlowe IMO, depends but I am not sure if necessary for this case?
Rhistina Revilla
@Rhistina
Nov 14 2017 19:15 UTC
>>> df
   col1  col2  col3  col4
0     1     2     3     4
1     0     0     0     0
2     1    -1     0     0
>>> df[(df != 0).any(axis=1)]
   col1  col2  col3  col4
0     1     2     3     4
2     1    -1     0     0
>>> df[df.values.sum(axis=1) != 0]
   col1  col2  col3  col4
0     1     2     3     4
evaristoc
@evaristoc
Nov 14 2017 19:15 UTC
And the sum you are suggesting might be, by accident, equal to 0.
Rhistina Revilla
@Rhistina
Nov 14 2017 19:15 UTC
i want row 0 and 2 to come back
so gonna use the (df != 0).any(axis=1) for this
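A reproducible version of the comparison above, assuming the frame is built from the values shown in the session. It illustrates why the row-sum filter drops row 2: its values 1 and -1 cancel out, while the `any` mask only asks whether some value is non-zero.

```python
import pandas as pd

# Same values as the df printed in the chat.
df = pd.DataFrame(
    {"col1": [1, 0, 1], "col2": [2, 0, -1], "col3": [3, 0, 0], "col4": [4, 0, 0]}
)

# Keep a row if at least one of its values is non-zero.
kept_by_any = df[(df != 0).any(axis=1)]

# Row sums miss rows whose values cancel out, such as [1, -1, 0, 0].
kept_by_sum = df[df.values.sum(axis=1) != 0]

print(kept_by_any.index.tolist())
print(kept_by_sum.index.tolist())
```

Neither mask inspects the index, so the index values never affect which rows are kept.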
evaristoc
@evaristoc
Nov 14 2017 19:16 UTC
I see! Well, you got it right then!
Rhistina Revilla
@Rhistina
Nov 14 2017 19:16 UTC
thanks @evaristoc and @mcbarlowe :)
Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:17 UTC
Yeah upon further study that looks like the best way
Oh well, sorry I couldn’t help more
Rhistina Revilla
@Rhistina
Nov 14 2017 19:17 UTC
you helped. to be honest talking it out and just trying things out is always helpful
evaristoc
@evaristoc
Nov 14 2017 19:19 UTC
@Rhistina no problem! Thanks for asking! (it is really nice to share the challenges that other people face! :) )
Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:27 UTC
I've been working too much in R recently I need to get back to python lol I'm rusty
Rhistina Revilla
@Rhistina
Nov 14 2017 19:28 UTC
i'm all over the place with R. i have an easier time reading python
at pydata there's actually an R for pythonistas talk that
i'd really love to go to
Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:32 UTC
I love plotting in R, although they've ported ggplot to python I haven't messed with it much
evaristoc
@evaristoc
Nov 14 2017 19:32 UTC
I have used rpy2 several times too. And well, in jupyter you can easily combine both.
Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:32 UTC
but I basically just know how to plot and manipulate dataframes in R, I couldn't write a loop or function without looking it up lol
evaristoc
@evaristoc
Nov 14 2017 19:35 UTC

Sometimes I explore data in Kaggle and people tend to select one or the other. By working in Jupyter I can copy some scripts from both and test them in the same place.

But yeah... I am also very rusty now with R. @erictleung is very proficient in R. I have seen him writing scripts in R off the top of his head, without looking at references.

pyrl or something like that? It is a different syntax I am not used to.

Personally, I usually feel frustrated when programmers embed sub-languages within an existing one (exception: jQuery). Even if it helps, it is like having to learn everything again.

Rhistina Revilla
@Rhistina
Nov 14 2017 19:39 UTC
hehe. sub-languages in an existing one. i've seen some really old legacy R scripts that call python, java, or ruby
it's terrifying
trying to clean up our code base
evaristoc
@evaristoc
Nov 14 2017 19:40 UTC
Well that is extreme...
Although I would still have some nice words for multi-language setups. But yeah, it can be a nightmare for debugging and cleaning for sure.
But then there is plyr. It is a nice R package, really nice. But it is written in a way that causes a certain culture shock at first.
If you are used to R the old way, you see plyr and then you say: "Do I have to learn everything all over again?"
And in fact you do: apparently plyr is becoming more the rule in R scripting than the exception.
evaristoc
@evaristoc
Nov 14 2017 19:45 UTC

But yeah, @Rhistina : I must admit that although it is nice for the coder to be able to write in different languages, it is not nice when it comes to maintenance and all that stuff.

That is partly the reason why many people have been enjoying the JS stack, I guess.

Rhistina Revilla
@Rhistina
Nov 14 2017 19:48 UTC
i think when this place first started they had traders writing R scripts
and when developers came and went
they started retrofitting the existing R scripts
with their own flavor
and then left
evaristoc
@evaristoc
Nov 14 2017 19:52 UTC
00 !!
And now they left you all that stuff...
I DON'T want to be a coder...
:)

Going to see a film: "The Square", a Swedish film, with an Italian friend. Just a day after Italy lost, for the first time in decades, their place in the Soccer World Cup to... Sweden!!!

For them not going to the World Cup is, as several news outlets put it, the Apocalypse, End of the World, bla bla bla. My friend is saying he is planning to boycott the film... Still going though.

See you around!

Matthew Barlowe
@mcbarlowe
Nov 14 2017 19:57 UTC
for me dplyr is what makes R functional
im a huge tidyverse proponent
Eric Leung
@erictleung
Nov 14 2017 20:14 UTC
@mcbarlowe yeah, I agree. The tidyverse is quite powerful. What I worry about is that with all the rapid development on those packages, older code will become more difficult to run. However, from what I have seen so far, they have done a good job of notifying the user of deprecated functions and even suggesting newer functions to use. I used mutate_each() in some code and it said that it is deprecated and that you should use mutate_all() or something similar.
@mcbarlowe with a reproducible example
> mtcars %>% mutate_each(function(x) x - 3)
`mutate_each()` is deprecated.
Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead.
To map `funs` over all variables, use `mutate_all()`
Error: is_fun_list(funs) is not TRUE
> mtcars %>% mutate_all(function(x) x - 3)
    mpg cyl  disp  hp  drat     wt  qsec vs am gear carb
1  18.0   3 157.0 107  0.90 -0.380 13.46 -3 -2    1    1
2  18.0   3 157.0 107  0.90 -0.125 14.02 -3 -2    1    1
3  19.8   1 105.0  90  0.85 -0.680 15.61 -2 -2    1   -2
4  18.4   3 255.0 107  0.08  0.215 16.44 -2 -3    0   -2
5  15.7   5 357.0 172  0.15  0.440 14.02 -3 -3    0   -1
...
Matthew Barlowe
@mcbarlowe
Nov 14 2017 20:23 UTC
@erictleung yeah even Hadley Wickham has come out and said he doesn't understand everything in dplyr etc.
Josh Goldberg
@GoldbergData
Nov 14 2017 20:26 UTC
I love dplyr
Alice Jiang
@becausealice2
Nov 14 2017 21:19 UTC
His royal highness has spoken... Stop neglecting design needs.
I'm trying really hard to not lose my mind that Alberto Cairo actually responded to my tweet. It's been a day, but I'm still not over it.
Matthew Barlowe
@mcbarlowe
Nov 14 2017 21:24 UTC
LOL nice chris albon has responded to me a couple times
Alice Jiang
@becausealice2
Nov 14 2017 22:20 UTC
FYI, Firefox Quantum is :ok_hand: :ok_hand: :ok_hand:
Matthew Barlowe
@mcbarlowe
Nov 14 2017 22:24 UTC
I've switched to opera and I've been pleased, I might try out firefox
Josh Goldberg
@GoldbergData
Nov 14 2017 23:53 UTC
I just actually looked at what plotly was in R. Looks like a D3 wrapper. Maybe I don’t need to jump directly into D3. lol