These are chat archives for FreeCodeCamp/DataScience

26th
Feb 2017
Hèlen Grives
@mesmoiron
Feb 26 2017 16:34
@becausealice2 Who are they? The guys and gals from DS pick? I do find their way of introduction a bit odd. '..writers and editors to propose up-to-date content...' They ask the general - a lot; but often I find people with a good story more interesting, without the pretension. From what I see I can't judge whether I fit in with my point of view (being allowed to be different ;-) )
Hèlen Grives
@mesmoiron
Feb 26 2017 17:00
@deusesx Hi, welcome; about FreeCodeCamp I don't know. I mainly came here for the chat channel. Depending on your level of knowledge and interests you can come up with your own research question and work with open data. Data science is not data visualization, although visualization is a way to present your findings. The field is rather large. You can focus on any of the skills. See it this way: if you had been working with biological data or genome data, then you could not show that to an employer in a portfolio. What the employer would rather want to know is: what was the research topic? What kind of problems did you encounter? How did you contribute, and what kind of strategy did you follow to solve them? What do you like and what do you do best, considering the process pipeline? So your portfolio could be a careful project description with some visualizations. Don't forget about the 'science' in DS; it is a distinct portion of the job. For my own project I keep an official science notebook and log all my doings, missteps, searches, choices etc. I work with primary data, the messiest and dirtiest form, as it is the only way to get the secondary data that I can then use for the analysis and testing of my research questions.
Albert Jonathan
@albert2309
Feb 26 2017 17:02
@mesmoiron Good point. I consider data science a methodology to reach a solution.
And speaking of biological data, I found slides from the 13th IDBigData Meetup (a meetup dedicated to Big Data in Indonesia) regarding Big Data for Bioinformatics and Healthcare. The topics are quite interesting as they cover data science in both academia and the healthcare industry.
Since the slides were written by Indonesians, you will find many grammatical errors.
Alice Jiang
@becausealice2
Feb 26 2017 17:08
@mesmoiron as far as I can tell, just a DS publication on Medium. They found me.
Hèlen Grives
@mesmoiron
Feb 26 2017 17:08
@albert2309 Oh nice; I am a bit into the health part of the research and that gives constraints as to what I can do with it. Sharing or reproducing; also things like access are crucial. One can have a great idea, but then you must see how to get to the data first. I'll look at your slides.
@becausealice2 Oh okay. From the conversation here I know that some of you are much more knowledgeable about the math and algo parts of DS. My contribution would be more about slicing the process of the DS pipeline depending on not so clean data and the interesting questions it gives rise to.
Albert Jonathan
@albert2309
Feb 26 2017 17:12
@mesmoiron The slides can be intimidating and confusing for those who don't know what protein and DNA are. I felt intrigued, but after reading the first two slides I feared I might not be able to do it.
@mesmoiron I think access is the crucial yet hardest part, as storing DNA data is quite difficult.
Alice Jiang
@becausealice2
Feb 26 2017 17:14
@mesmoiron The article of mine that they added to their publication is barely DS at all. If you want to contribute, I'd go ahead and ask.
Hèlen Grives
@mesmoiron
Feb 26 2017 17:30
@albert2309 Very nice slides; I'll download them. What I know from some of the data scientists from Johns Hopkins is that you don't always need to know the biology. What you need to understand is how to translate the researcher's question into good code and solutions. It helps to have a bio background, but in the early days biologists could not program and the old-school ones had to ask programmers to help them. We are just transitioning into the all-round skilled type.