Eric Leung
@erictleung

Just a reminder:

"Premature optimization is the root of all evil."
-- Donald Knuth

Alice Jiang
@becausealice2
@janus-reith I agree with @erictleung: if all you're doing is pulling apart the data, then deep learning is going to be a heavier solution than you'll need.
Janus Reith
@janus-reith
The CPU would just be an example. There would be lots of different, varying specifications. I was hoping to be able to make use of deep learning to avoid having to match each pattern separately.
Seemed like a classic use case to me.
But thanks for the hint, I'll investigate how far regex matching can get me here.
Still, I feel like I might have some misconception regarding the way some ML patterns work, as I'm quite baffled that my use case is neither a typical one nor (relatively) easy to achieve.
Janus Reith
@janus-reith
Might look into something ready to use like Amazon Comprehend if there is nothing open to use and no similar example to base my efforts on.
Eric Leung
@erictleung

@janus-reith my rule of thumb on when to use deep learning is for tasks that are easy for humans to do but difficult to tell someone to do.

For example, if you think of all the typical deep learning applications, all of them are difficult to just tell someone about, like driving a car or producing art/images. But I could tell someone how to look for computer memory with some simple rules like, "If you see a number in front of 'GB' that is next to some letters like 'DDR', then use that number for memory size."

And again, the way technical specifications for computers are written is fairly standardized. The only thing you might have to worry about is a comma being used instead of a decimal point in those numbers (1,6 versus 1.6).

Here's an example of just using regex to extract product details http://ceur-ws.org/Vol-1267/LD4IE2014_Petrovski.pdf There is no code, per se, but it gives you an idea of how it is possible.
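
A minimal sketch of the "number in front of 'GB' next to 'DDR'" rule described above, including the comma-versus-decimal-point case. The pattern and the function names are assumptions made for illustration; they are not taken from the linked paper.

```python
import re

# Assumed rule: a number, then "GB", then a memory-type token such as
# DDR4 or LPDDR3. Real listings may need more variants than this.
MEMORY_RULE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*GB\s+(?:DDR\d?|LPDDR\d?)", re.IGNORECASE
)


def to_number(text: str) -> float:
    """Treat a comma as a decimal point, so '1,6' and '1.6' both give 1.6."""
    return float(text.replace(",", "."))


def memory_size_gb(spec: str):
    """Return the memory size in GB, or None if the rule does not match."""
    match = MEMORY_RULE.search(spec)
    return to_number(match.group(1)) if match else None
```

Because the rule requires the memory-type token after "GB", a storage figure such as "512 GB SSD" is skipped rather than mistaken for memory.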
Janus Reith
@janus-reith
@erictleung Yeah, that's similar to how I understood it. For a human it is easy to recognize these patterns (if I had a lot of lines with "Core i7" and "Core i5", it would be clear that "Core i3" is the type "i3").
However, the logic to match all these patterns would be more difficult to explain to a human, while it would be easy for a human to understand the pattern, even without knowing the language.
I get that regex matching could be easier, as the number of different specifications will still be limited, so I could make a list of rules that would catch a high percentage of the fields I need.
Still, I don't really get how this is not a designated task for ML.
Eric Leung
@erictleung

@janus-reith machine learning is generally divided into two categories: supervised and unsupervised. Supervised is you have labels on data and you want to correctly assign that label to that data. In unsupervised, you're just searching for patterns. Your problem is maybe closest to supervised learning.

You've mentioned that input would be something like

"Intel Core i7-7500U 2,70 GHz, 16 GB DDR4"

and the example output would be

{ "cpuClockRate": "2,70", "memorySize": 16 }

This is more clearly just normal text processing because you are simply extracting information from a set of text. In other words, the answer you're looking for is within your input data.
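
To make that concrete, here is a rough sketch that turns the example input into the example output with one regex per field. The field names come from the example above; the patterns and the `extract_specs` helper are assumptions for illustration only.

```python
import re

# One assumed pattern per field; extend this table as new fields come up.
RULES = {
    "cpuClockRate": re.compile(r"(\d+(?:[.,]\d+)?)\s*GHz", re.IGNORECASE),
    "memorySize": re.compile(r"(\d+)\s*GB\b", re.IGNORECASE),
}


def extract_specs(line: str) -> dict:
    """Apply every rule to the line and collect the fields that matched."""
    found = {}
    for field, pattern in RULES.items():
        match = pattern.search(line)
        if match:
            found[field] = match.group(1)
    # The example output keeps the clock rate as text but the memory size
    # as a whole number, so only the memory size is converted.
    if "memorySize" in found:
        found["memorySize"] = int(found["memorySize"])
    return found


print(extract_specs("Intel Core i7-7500U 2,70 GHz, 16 GB DDR4"))
```

Every value in the result is copied straight out of the input text, which is the point: nothing has to be learned or predicted.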

An appropriately used machine learning task is spam filtering in emails. There are words within the emails that hint to you that it is spam, but the task is to categorize the data rather than simply extract weird words from an email. ML is also necessary because the number and types of words you may see in emails is unconstrained (i.e., you don't know all the words someone might use in an email).

Here are some tips from Amazon on when to use ML:

  • You cannot code the rules e.g., spam or not spam. If you start seeing yourself writing a lot of rules to solve your problem, ML might help.
  • You cannot scale to manually review by a human.

Again, for information extraction on your products, you should be able to code most if not all of the rules. And the number of samples will depend on your situation.
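
For contrast, here is a toy sketch of the spam case, where the answer is a category and not a piece of the input. It uses a naive Bayes word-count classifier, which is my choice for illustration and not something the Amazon tips prescribe; the training messages are made up.

```python
import math
from collections import Counter

# Made-up training messages, purely for illustration.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability under naive Bayes."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


WORD_COUNTS, LABEL_COUNTS = train(TRAINING)
print(classify("free prize", WORD_COUNTS, LABEL_COUNTS))
```

No rule here says which words mean spam; the counts are learned from labeled examples, so the classifier can adapt as the examples change.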

Zijing Zhang
@zzj0402_gitlab
import * as use from "@tensorflow-models/universal-sentence-encoder";
       ^

SyntaxError: Unexpected token *
    at Module._compile (internal/modules/cjs/loader.js:721:23)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:787:10)
    at Module.load (internal/modules/cjs/loader.js:653:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
    at Function.Module._load (internal/modules/cjs/loader.js:585:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:829:12)
    at startup (internal/bootstrap/node.js:283:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:622:3)
Node.js hates the * here, how do I fix this?
Eric Leung
@erictleung
@zzj0402_gitlab what version of Node do you have? Some searching around suggests you need a specific version of Node, namely at least v12.2.0 https://stackoverflow.com/a/56350495/
Janus Reith
@janus-reith
@erictleung I ended up using machine learning to determine the necessary regexes - now try to stop me :D
Janus Reith
@janus-reith
Thanks for your hints
Eric Leung
@erictleung
@janus-reith nice! :smiley:
Philip Durbin
@pdurbin
If anyone here happens to be in Boston there's a "machine learning and big data" track at this free conference on Thursday and Friday (Aug 15-16) at Boston University: https://devconfus2019.sched.com/overview/type/Machine+Learning+%26+Big+Data
ali Fazeli
@AFZL95
you can check my blog posts about my experience as a data analyst at Huawei Technologies in my portfolio: https://faze.li/
Eric Leung
@erictleung
Occasionally people stop by and ask about mathematics for machine learning. Just ran into this book that may prove useful: https://mml-book.github.io/. It's a short-ish book of about 400 pages and it is free to download. It will also be published eventually if you want a physical copy. I ran into it while reading through this thread on books to read for deep learning, in case you want to dig a bit deeper.
Eric Leung
@erictleung
Oh this is pretty cool. I'm a bit late to the party, but it looks like Kaggle (known for data science prediction competitions) has a YouTube playlist of their reading group https://www.youtube.com/playlist?list=PLqFaTIg4myu8t5ycqvp7I07jTjol3RCl9
Here are the details if you wanna join in for the continuing discussions: https://twitter.com/rctatman/status/1131621843188604928
Eric Leung
@erictleung
For a gem, see the definition for "data science" 🧐🙃
Alice Jiang
@becausealice2
That's hysterical! Thanks for sharing @erictleung
Vishu
@bommojuvishu
Hi guys, I am trying to deploy OpenCV on Heroku using Python and Flask. I am getting the following error: ImportError: libSM.so.6: cannot open shared object file: No such file or directory
Is there any way to deploy OpenCV on Heroku?
Alice Jiang
@becausealice2
Give this a try @bommojuvishu
Koderkid1936
@Koderkid1936
chance = 0
while chance <= 3:
    guess = int(input("Guess: "))
    if guess == 9:
        print("try again")
        chance+1
    else:
        print("try again")
    chance += 1
print(chance)  # why does this print 4 instead of 3 when you enter the integer 9?
Eric Leung
@erictleung
@Koderkid1936 your while loop condition, chance <= 3, still holds when chance equals 3. So it will go through the loop once more and run the line chance += 1 again, making chance equal to 4, which is what gets printed.
@Koderkid1936 this link might help you visualize what your code is doing. It will create diagrams of what your code is doing as you go through it line by line.
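
If it helps, here is a small sketch that replays the loop with a fixed list of guesses instead of `input()`, so you can watch `chance` change. The `count_chances` function and its `inclusive` flag are made up for this illustration. Note that `chance+1` on its own computes a value and throws it away, so it is left out here.

```python
def count_chances(guesses, limit=3, inclusive=True):
    """Replay the loop with preset guesses and return the final chance value."""
    remaining = iter(guesses)
    chance = 0
    # inclusive=True mirrors "chance <= 3"; inclusive=False uses "chance < 3".
    while (chance <= limit) if inclusive else (chance < limit):
        guess = next(remaining)  # stands in for int(input("Guess: "))
        print(f"chance={chance}, guess={guess}")
        chance += 1  # runs on every pass, whatever the guess was
    return chance


print(count_chances([9, 9, 9, 9]))               # loop body runs for 0, 1, 2, 3
print(count_chances([9, 9, 9], inclusive=False)) # loop body runs for 0, 1, 2
```

With `<=` the body runs four times (for 0, 1, 2 and 3), so the final value is 4; with `<` it runs three times and ends at 3. Entering 9 never ends the loop early because there is no `break`.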
Koderkid1936
@Koderkid1936
@erictleung thanks a lot, much appreciated. I think I'm starting to understand the flow of the program: it executes the chance+=1 statement before the while loop... I think, but I will do more research, thanks :thumbsup:
Eric Leung
@erictleung
Random comment. So I'm helping out with a data science boot camp in town and last time I checked, in one of their modules, they are using one of the freeCodeCamp new coder surveys as one of their datasets! So crazy that something I've had a hand in cleaning up has come full circle for me to see again :laughing:
Philip Durbin
@pdurbin
Can you please link me to that dataset? I'll take it under consideration for dataverse-sample-data. :)
Eric Leung
@erictleung
BuntyBru
@BuntyBru
Hi guys 

Can anyone recommend good data science courses for complete beginners (someone who has never had any exposure to tech and is a business graduate)?
I did some Google searching for this
But wanted to know some personal reviews

Thanks
Philip Durbin
@pdurbin
@erictleung oh, right, we talked about this FCC dataset at freeCodeCamp/2017-new-coder-survey#7 :) Do you want to hear more about dataverse-sample-data? :)
@BuntyBru here's a general list of awesome programming and data science materials as well https://github.com/P1xt/p1xt-guides
Eric Leung
@erictleung
@pdurbin I was a bit curious about it :) You mentioned it so casually!
sa-js
@sa-js
Hey all. I am a camper and recently wrote an article on Medium. Give it a read, guys: https://medium.com/@saeeddev/learn-how-to-store-data-for-your-data-science-problems-even-if-you-dont-want-to-27e02dd9f781?source=friends_link&sk=8139c568a7fe885d9b560e7cde7035b2. Also, let me know how I can get it featured in the FCC News section
BuntyBru
@BuntyBru
@erictleung
thanks Eric
Philip Durbin
@pdurbin
@erictleung here is where I recently announced dataverse-sample-data: https://groups.google.com/d/msg/dataverse-community/u-Yv0U3v4Bo/NgBfDrDlAgAJ . I'm wondering if you (or anyone else here) are interested in contributing any datasets. :)
Alice Jiang
@becausealice2
@erictleung you're famous. Remember us
Philip Durbin
@pdurbin
heh
Alice Jiang
@becausealice2
if anyone is looking for free data, I've just discovered the Food and Agriculture Organization of the United Nations' FAOSTAT and OHMYGOD so much information
(in case it needs to be stated explicitly, it's all agricultural-related data)
ruitoantunes
@ruitoantunes
Hi
ruitoantunes
@ruitoantunes
i'm starting programming in php and would like to have some support on the php code
Alice Jiang
@becausealice2
@ruitoantunes I don't know that many people here would be able to help with PHP, it's not a very common language in Data Science. Perhaps if you share a code snippet someone could have a look?
ruitoantunes
@ruitoantunes
Ok thanks.