These are chat archives for FreeCodeCamp/DataScience

11th
Oct 2018
Eric Leung
@erictleung
Oct 11 2018 00:29

Some solid advice on entry-level data science jobs:

  • Data science is wide, so no company really knows what they want
  • Figure out what kind of "data scientist" you want to be (e.g. machine learning vs data visualization)
  • Know what value you can bring to a business
  • Look up alternative job titles like "Product Analyst"

https://twitter.com/jGage718/status/1049693399832514561

@sourav006 are you specifically using the data.table package?
Eric Leung
@erictleung
Oct 11 2018 00:37
@sourav006 otherwise, you could use tidyverse packages and do something like this:
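library(dplyr)  # makes the %>% pipe available
iris %>%
    dplyr::mutate_all(stringr::str_length) %>%  # string length of every cell
    dplyr::summarise_all(max)                   # max length per column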
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1            3           3            3           3      10
That'll find the max string length from every column in the data frame.
@sourav006 The above is just for demonstration purposes. It isn't quite right for numeric columns, because it converts the numeric values into strings and then calculates the length of each string. So if your data has numeric columns, you may have to make some changes.
@rbhatia46 if Google or your favorite search engine doesn't work, you always have GitHub to help you out https://github.com/search?q=mord.LogisticIT&type=Code
@rbhatia46 I got the mord.LogisticIT from the documentation https://pythonhosted.org/mord/#. Feel free to change the GitHub search based on whatever function you need to search.
sourav006
@sourav006
Oct 11 2018 02:55
@erictleung Thanks for the answer.
@erictleung Can we use vectorisation to make the code faster and get the max string length from every column without writing a loop? Is it possible? If yes, can you please provide the code?
Eric Leung
@erictleung
Oct 11 2018 03:57
@sourav006 the code I've provided has no explicit loops. And although I haven't tested its performance, it should be optimized well enough to do the job.
@sourav006 you just need to replace the variable iris with your data.
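For comparison, here is a minimal base R sketch of the same idea with no explicit loop. It assumes every column can reasonably be converted with as.character(), so numeric columns are still measured by their printed form, and max_lengths is just an illustrative name:

```r
# Max string length per column, using vapply instead of an explicit loop.
# Each column is converted to character first, so factors and numbers
# are measured by the length of their text representation.
max_lengths <- vapply(
  iris,
  function(col) max(nchar(as.character(col))),
  integer(1)  # each column yields exactly one integer
)

print(max_lengths)
```

To run it on other data, swap iris for your own data frame.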
sourav006
@sourav006
Oct 11 2018 04:53
@erictleung Ok brother, thanks for the help, let me try the code...
sourav006
@sourav006
Oct 11 2018 05:37
@erictleung I have two pieces of R code. How can I compare them to see which one is faster and better optimized?
Eric Leung
@erictleung
Oct 11 2018 05:52
@sourav006 here are several ways to benchmark the execution time of functions/commands https://stackoverflow.com/questions/6262203/measuring-function-execution-time-in-r
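As a rough sketch of one such approach, base R's system.time() can time two versions of the same task. The two functions and the enlarged data frame below are hypothetical stand-ins for the code being compared, and timings will vary by machine and run:

```r
# Hypothetical version 1: explicit for loop over the columns
max_len_loop <- function(df) {
  out <- integer(ncol(df))
  names(out) <- names(df)
  for (i in seq_along(df)) {
    out[i] <- max(nchar(as.character(df[[i]])))
  }
  out
}

# Hypothetical version 2: vapply over the columns
max_len_vapply <- function(df) {
  vapply(df, function(col) max(nchar(as.character(col))), integer(1))
}

# Repeat the rows of iris so each call does a measurable amount of work
big <- iris[rep(seq_len(nrow(iris)), 200), ]

# Check that both versions agree before comparing their speed
stopifnot(identical(max_len_loop(big), max_len_vapply(big)))

# Time 20 repetitions of each version; "elapsed" is wall-clock seconds
time_loop <- system.time(for (k in 1:20) max_len_loop(big))
time_vapply <- system.time(for (k in 1:20) max_len_vapply(big))

print(time_loop["elapsed"])
print(time_vapply["elapsed"])
```

A single system.time() call can be noisy, so repeating the work several times, as above, gives a steadier comparison.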
sourav006
@sourav006
Oct 11 2018 06:03
@erictleung Thanks 🙂
Alice Jiang
@becausealice2
Oct 11 2018 19:29
Already linked once, but OpenAI is looking for next year's Fellows and Interns and they've now opened up their search for the Scholars as well
mstellaluna
@mstellaluna
Oct 11 2018 20:58
@becausealice2 👋