Philip Durbin
@pdurbin
Code Ocean uses it and we (Dataverse) will be integrating with Code Ocean, so I'm looking forward to playing more with Jupyter Lab then. For now I'm just using individual Jupyter Notebooks on Binder and Whole Tale.
Eric Leung
@erictleung
@pdurbin very cool. I haven't heard of Code Ocean or Whole Tale before. Seems like science is slowly converging on computational, interactive scientific articles like the one described in eLife's blog https://elifesciences.org/labs/ad58f08d/introducing-elife-s-first-computationally-reproducible-article
@FazeelUsmani this book recently came out teaching not only Python, but R as well https://www.anotherbookondatascience.com/ That may be useful to you in your learning efforts.
Philip Durbin
@pdurbin
I closed out our 5th annual Dataverse conference with a live demo of spinning up a Jupyter Notebook (R kernel) on Whole Tale based on a dataset in Dataverse. Screenshots and a transcript of my talk are on my blog, if you're interested: https://scholar.harvard.edu/pdurbin/blog/2019/jupyter-notebooks-and-crazy-ideas-for-dataverse
Eric Leung
@erictleung
@pdurbin ooo will definitely check out :+1: Thanks!
Philip Durbin
@pdurbin
Sure. If you spot any typos or have any feedback, I'm all ears.
Zijing Zhang
@zzj0402_gitlab
Trying to code up a service that runs sentence encoding from tensorflow.js. The dot product is never computed. Any suggestion?
compare(text1: string, text2: string): number {
  let similarityScore = -1;
  UniversalSentenceEncoder.load().then(model => {
    const sentences = [text1, text2];
    model.embed(sentences).then(embeddings => {
      embeddings.array().then(embeddingsArray => {
        similarityScore = Math.round(
          math.dot(embeddingsArray[0], embeddingsArray[1]) * 100
        );
      });
    });
  });
  return similarityScore;
}
it("should compare two text strings", () => {
  const service: SemanticSimilarityService = TestBed.get(
    SemanticSimilarityService
  );
  expect(service).toBeTruthy();
  expect(service.compare("This is a test!", "This is a test!")).toEqual(
    100,
    "Similarity Score is not 100 for exact same strings."
  );
});
HeadlessChrome 76.0.3803 (Linux 0.0.0) SemanticSimilarityService should compare two text strings FAILED
        Error: Expected -1 to equal 100.
            at <Jasmine>
            at UserContext.<anonymous> (src/app/semantic-similarity.service.spec.ts:20:67)
            at ZoneDelegate.invoke (node_modules/zone.js/dist/zone-evergreen.js:359:1)
            at ProxyZoneSpec.push../node_modules/zone.js/dist/zone-testing.js.ProxyZoneSpec.onInvoke (node_modules/zone.js/dist/zone-testing.js:308:1)
HeadlessChrome 76.0.3803 (Linux 0.0.0): Executed 6 of 11 (1 FAILED) (skipped 5) (0 secs / 0.575 secs)
Eric Leung
@erictleung
@zzj0402_gitlab if you can share a minimal example to debug, that may help us help you. It is difficult to reproduce your error with what you've shared.
Eric Leung
@erictleung
Very interesting document, "NPR's Hypothesis-Driven Design for Editorial Projects". Although aimed at journalists, it is probably useful for (data) scientists as well, especially the part about communicating insights from data.
Fazeel Usmani
@FazeelUsmani
Thanks a lot @pdurbin @erictleung
Zijing Zhang
@zzj0402_gitlab
Problem solved. Async function and test were needed.
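For anyone hitting the same thing: the `compare` method above returns `similarityScore` before any of the `.then` callbacks have run, which is why the test saw -1. Below is a minimal sketch of what an async version might look like. It is not the exact fix that was applied. `SentenceEncoder`, `fakeEncoder`, and the local `dot` helper are hypothetical stand-ins so the snippet runs without loading the real model.

```typescript
// Hypothetical stand-in for the sentence encoder: anything with an async
// embed() that resolves to one vector per input sentence.
interface SentenceEncoder {
  embed(sentences: string[]): Promise<number[][]>;
}

// Plain dot product, so the sketch does not depend on a math library.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Returns a Promise, so the caller has to await the score instead of
// reading a variable before the callbacks have run.
async function compare(
  encoder: SentenceEncoder,
  text1: string,
  text2: string
): Promise<number> {
  const [first, second] = await encoder.embed([text1, text2]);
  return Math.round(dot(first, second) * 100);
}

// Toy encoder for illustration only: every sentence maps to the same unit vector.
const fakeEncoder: SentenceEncoder = {
  embed: async (sentences) => sentences.map(() => [0.6, 0.8]),
};

compare(fakeEncoder, "This is a test!", "This is a test!").then((score) =>
  console.log(score)
);
```

The spec needs the same treatment: make the `it` callback `async` and `await` the result of `compare` before calling `expect`, otherwise the assertion runs against an unresolved Promise.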
Alice Jiang
@becausealice2
Ahhh the ol' JavaScript async "gotcha"
phao5814
@phao5814
Hey guys, would appreciate if someone with a lot of experience messing with neural networks would be able to help me out here! https://stackoverflow.com/questions/56921254/why-is-my-cnn-model-being-trained-in-less-time-on-my-local-cpu-than-hosted-optio
1rjun
@1rjun
Is anyone here interested in working on a drinking water problem?
If so, please ping me personally.
Hey guys, I have written a blog post about Libra Cryptocurrency.
Check it out!
phao5814
@phao5814
hey guys - would appreciate it if someone with experience in CNNs could help clarify this for me :) https://stackoverflow.com/questions/57038055/what-are-activations-activation-gradients-weights-and-weight-gradients
Juliana Marie
@julianamariemorales
Hi guys, what is your suggested minimum amount of data to have in a cluster?
Philip Durbin
@pdurbin
I went to an "Intro to Data Science: Predict the Box Office" workshop last night and the slides and Jupyter Notebook they used are linked from https://www.thinkful.com/workshops/city/box-office/ . Pretty interesting stuff. Pandas, scikit-learn, etc.
Alice Jiang
@becausealice2
@julianamariemorales I feel like that would be subjective, I don't think anyone can give you a solid answer...
@pdurbin Oooohhhhh.... I've had my eye on their meetups for quite some time but haven't actually gone yet. Would you recommend?
Philip Durbin
@pdurbin
The speaker had a somewhat heavy accent but seemed to really know what he was talking about. Sure, I'd recommend it.
Alice Jiang
@becausealice2
fabulous
Philip Durbin
@pdurbin
It was on the 6th floor. Decent view.
Alice Jiang
@becausealice2
I imagine so. I love Back Bay. Anytime I go exploring I almost always end up wandering around there
Philip Durbin
@pdurbin
Yeah, my wife and I lived in the Fenway for four years and we'd wander over there a lot.
Alice Jiang
@becausealice2
I'm looking through that notebook now and.... what movie was eight and a half hours?!
Alice Jiang
@becausealice2
This one is listed in the dataset as being more than eight and a half hours long lol
oh wait it's a TV series.. Why is it in a box office dataset?
Philip Durbin
@pdurbin
dunno
Seems like a fun dataset though. I like movies. :)
Alice Jiang
@becausealice2
It needs to be looked at and cleaned a bit lol. I've really only looked at durations and I've found a few that are very wrong
Philip Durbin
@pdurbin
He had many, many slides on data cleansing.
Alice Jiang
@becausealice2
I haven't looked through the slides yet
but there's a bunch of TV series and The Wolf of Wall Street is listed at 4 hours long lol
it's just a list of 5k random titles I think. There are tons of titles an hour and under and I recognize most of them as TV series
Alice Jiang
@becausealice2
it appears that the data cleaning process eliminates most, if not all non-movies so don't mind me complaining :joy:
Philip Durbin
@pdurbin
heh
@julianamariemorales hi there! What do you mean by "cluster"? Like a computer cluster or an unsupervised learning cluster?
Eric Leung
@erictleung
@phao5814 I took a try at explaining an answer to your question https://stackoverflow.com/a/57085350/6873133
vsvd
@vsvd
Hi, are there any open source tools for generating page-view test data for an ML algorithm? For now I am trying to write a Selenium test that views different products at random on a website, so that I can generate VIEW action logs for the machine learning algorithm. I expect that doing it this way will take a lot of time to generate bulk data (more than one million logs in the DB). Is there a data generation tool, ideally open source, for this scenario?
Alice Jiang
@becausealice2
I just read a Kotaku article on gaming sustainability that included some information on data center efficiency. You guys ever considered how eco-friendly your cloud computing is(n't) before?
@vsvd you're looking for an open source tool to generate bogus data based on existing data?
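If the goal is only to fill the DB with VIEW rows rather than to exercise the site itself, one option is to skip the browser and generate the rows directly. A rough sketch, assuming a made-up log shape: the `ViewLog` fields, the ID ranges, and the start date are all hypothetical and would need to match your real schema. Uniformly random views also won't contain any real user behaviour for a model to learn from.

```typescript
// Hypothetical shape of one "view" log entry; the field names are assumptions.
interface ViewLog {
  userId: number;
  productId: number;
  action: "VIEW";
  timestamp: string;
}

// Small seeded generator (a linear congruential generator) so runs are repeatable.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Builds `count` synthetic view events spread uniformly over users and products.
function generateViewLogs(
  count: number,
  userCount: number,
  productCount: number,
  seed = 42
): ViewLog[] {
  const random = makeRandom(seed);
  const start = Date.UTC(2019, 0, 1); // arbitrary start time for the sketch
  const logs: ViewLog[] = [];
  for (let i = 0; i < count; i++) {
    logs.push({
      userId: 1 + Math.floor(random() * userCount),
      productId: 1 + Math.floor(random() * productCount),
      action: "VIEW",
      // One event per second after the start time.
      timestamp: new Date(start + i * 1000).toISOString(),
    });
  }
  return logs;
}

const sample = generateViewLogs(5, 100, 20);
console.log(sample.length, sample[0].action);
```

For a million rows you would write the output in batches rather than hold it all in one array, but the idea is the same.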
Vishesh Mangla
@XtremeGood
anyone here?