Sergey Dolganov
@sclinede
Welcome, everybody.
Here I want to discuss a solution to a common problem of modern web development:
How can we measure the maintenance quality of any open-source package/gem/library, so that we keep our projects stable by building on a proven and solid open-source foundation?
My current point of view is embodied in the project http://ossert.evilmartians.io.
Currently it only works with Ruby projects, but it was built with other platforms in mind.
I'll happily introduce features the community needs and add support for NPM if needed.
Let's start our dialog!
P.S. If you understand Russian, you can also watch my RailsClub'16 talk as an intro here
Vsevolod Rodionov
@Jabher
Sergey, I think I have a point. I would propose a crazy thing: let's use machine learning. All we need to do is understand the expected output for a package, some, emm... score. Like 0 for some crappy package, 0.8 for Babel, 0.9 for React, 0.5 for Angular and so on. After that we can discover which metrics contribute and how to calculate the score. All we need is to gather the stats.
I'd rather say we need something like a survey of 2,000-3,000 people, asking them to score a random 20-30 famous and not-so-famous packages; after that we can do something good
It's all about stats first of all. Not about coding.
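As a rough sketch of what that "expected output" could look like in practice, here is a minimal example that averages each respondent's 0-1 scores into per-package training targets. The package names, scores and data layout are invented for illustration; this is not part of Ossert.

```python
# Turn hypothetical survey answers into per-package target scores
# that a model could later be trained against. All data is made up.
from collections import defaultdict
from statistics import mean

survey_answers = [
    {"babel": 0.8, "react": 0.9, "angular": 0.5},    # respondent 1
    {"babel": 0.7, "react": 0.95, "left-pad": 0.1},  # respondent 2
]

scores = defaultdict(list)
for answer in survey_answers:
    for package, score in answer.items():
        scores[package].append(score)

# The average score per package becomes the training target ("expected output").
targets = {package: mean(values) for package, values in scores.items()}
print(targets)
```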
Sergey Dolganov
@sclinede

Hi.
The main problem with machine learning here is proving that it provides valid results.
Of course, when it's obvious whether or not our ML gives us good results, it makes sense.
But.
If we feed the ML badly prepared data, or data that doesn't correlate with our decisions, we'll get unexpected results, and I don't think that will give us much benefit.
I started with decision trees and some metrics and found that they didn't work well.
So the next move was to discover more raw metrics and build valuable higher-order metrics on top of them.
And here I found that we have many corner cases that won't be covered by a survey of 3K people and 30 packages, believe me.

I'm sure that ML will help us, but not now.
First we need to understand how to prepare the cleanest stats possible before feeding them to ML.

That's my point, but if you come up with a working solution - that's great!

I hadn't thought about interviews, but I think it would be simpler if we had a prototype and a feedback form where respondents could say what is missing for their decision. And... we already have a prototype, we just need to add the feedback form and start the whole process :)
WDYT?

Vsevolod Rodionov
@Jabher
We don't need decision trees or anything like that; we know we have a bunch of params, and I suggest that each of them applies its weight monotonically. It's not linear (e.g. I suspect GH stars contribute as log(x)), but it is always monotonic, so: linear regression on the data plus monotonic transformations of the data. It's like taking the sigmoid, square root and log of GH stars (for example) and seeing which correlates best.
Moreover, using deep learning is a bad idea; we need to engineer features on our own, not leave this job to a NN.
And yes, we need to discover tons of features, but collecting the stats is, like, an independent task. If you can spread the word about the poll, I'll work on picking packages for the survey and so on. It's not about every package on npm, it's about some subset that will be used as the initial training group for the algorithm.
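A hedged sketch of this idea: apply monotonic transformations (log, square root, sigmoid) to one raw metric such as GitHub stars and check which transformed version correlates best with the survey-derived score. The star counts and labels below are synthetic; in practice they would come from the poll.

```python
import numpy as np

stars = np.array([10, 120, 900, 4_000, 25_000, 80_000], dtype=float)
score = np.array([0.1, 0.3, 0.5, 0.6, 0.8, 0.9])  # hypothetical "goodness" labels

transforms = {
    "identity": stars,
    "log": np.log1p(stars),
    "sqrt": np.sqrt(stars),
    "sigmoid": 1 / (1 + np.exp(-(stars - stars.mean()) / stars.std())),
}

for name, feature in transforms.items():
    r = np.corrcoef(feature, score)[0, 1]  # Pearson correlation
    print(f"{name:>8}: r = {r:.3f}")

# The best-correlating transform would then feed an ordinary linear regression
# (e.g. np.polyfit(feature, score, 1)) as one of several engineered features.
```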
Tim
@tonkonogov
emm... score. Like 0 for some crappy package, 0.8 for Babel, 0.9 for React, 0.5 for Angular and so on. After that we can discover which metrics contribute and how to calculate the score.
I guess it's barely possible to assign a single reliable score to a project
Tim
@tonkonogov
more than that, the state of a project is always changing; 3 years ago Angular could hypothetically have been 0.9, right now it's much lower
what is the trigger to change the score? how do we change it? ask those 3-4k respondents again?
Tim
@tonkonogov
another thing is the subsections - popularity, maturity, maintenance (a division I find pretty useful); I can't imagine how ML could help with such a conceptual division without cherry-picking metrics
Vladimir Starkov
@iamstarkov
what about https://npms.io/ ?
it has some kind of scores already
Sergey Dolganov
@sclinede
As I understand it, npms.io is a search tool, while Ossert is a research tool.
In npms.io you see scores, but you don't understand why a project has that score, and actually you don't mind, since the point is to find an NPM package (because NPM's own search is not good at all).
Ossert, on the other hand, tries to give as complete an overview of project activity as possible (a comparison tool will also appear in the near future). It is more about insights for maintainers and users of a project: to see weaknesses and strengths, to see where to help and what to do next. As a side effect, it will help you decide whether a project fits your context (i.e. an R&D project and a highly available business project have different constraints).
Tim
@tonkonogov
as I understand it, the total score is based on those three subscores
sum of components / 3, apparently
Vsevolod Rodionov
@Jabher
but we're trying to make some score anyway, guys. It's like saying "we personally think that this project is 90% better than the average package because it has 70 forks, 100 closed issues and an .eslintrc"
Vsevolod Rodionov
@Jabher
all I'm saying is that we should stop trying to personally figure out what a "good package" is in our opinion and ask people. Like, 1,000 people. Ideally, all the good developers we know in person. Just to see which packages they count as good ones and which as bad ones - ask them to answer about packages they know, 20, 30, 50 packages, and say whether each is better than average or worse than average. All we need to know is, like, "Dan Abramov thinks that React is better than Babel, and Babel is better than Angular 2". After that we can structure the responses and find that, in general, the developers we trust and like (or "but dislike") think that Babel is nearly as good as React, Angular 2 is a bit worse, and Webpack is much worse than all of them (e.g.), and after that we will be able to find the metrics that correlate. We may find something unexpected, or we will better understand the underlying reasons - since we have the answers and we can try to find a model that explains the correlation between forks, commits, GH issues, downloads, dependents and the resulting value - a very subjective "goodness score".
Can we do this? Can we reach developers in the numbers we need to make this statistically significant?
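A minimal sketch of aggregating such pairwise judgements ("React is better than Babel") into a ranking, assuming each answer is recorded as a (winner, loser) pair. The comparisons below are hypothetical, not real survey data.

```python
from collections import Counter, defaultdict

comparisons = [
    ("react", "babel"),
    ("babel", "angular2"),
    ("react", "angular2"),
    ("babel", "webpack"),
]

wins = Counter()
games = defaultdict(int)
for winner, loser in comparisons:
    wins[winner] += 1
    games[winner] += 1
    games[loser] += 1

# Win rate is the crudest possible aggregate; a Bradley-Terry or Elo-style model
# would handle sparse and inconsistent judgements more gracefully.
ranking = sorted(games, key=lambda p: wins[p] / games[p], reverse=True)
print(ranking)
```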
Sergey Dolganov
@sclinede

As I understand it, you'll face the problem of subjectivity in your solution, since the decisions would be made by humans. That means that even if Dan Abramov and a large part of our reference group think that some project is bad, it could be caused simply by their feelings about its community, or hype, or something else, and not by the quality of the work done on it.

I agree with the idea of asking people. That is why I started this project: I want as many people as possible to think about what good open-source is, and to create a system that can measure it.
I think it would be much better to somehow prepare a questionnaire to see what people use to make their decisions beyond their subjectivity and intuition - any marks or trends that they check to answer why a project is so good/bad.
WDYT?

Tim
@tonkonogov
Probably the download top list is a kind of answer to this question
Tim
@tonkonogov
Also I'd like to rephrase this - "we personally think that this project is 90% better than the average package because it has 70 forks, 100 closed issues and an .eslintrc"
we personally think that a project is 90% better than the average project because it has more forks and closed issues than average, and that's true for the most downloaded projects as well
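A small sketch of what "better than average" could mean numerically: the percentile rank of a project's fork count within the sample. The numbers are made up for illustration.

```python
forks_per_project = [3, 7, 12, 40, 70, 150, 800, 2500]

def percentile_rank(value, population):
    """Share of the population with a value less than or equal to `value`."""
    return sum(1 for v in population if v <= value) / len(population)

print(percentile_rank(70, forks_per_project))  # 0.625 -> better than ~62% of projects
```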
Vsevolod Rodionov
@Jabher
agreed on that, it was just an example
Tim
@tonkonogov
the real feedback from prominent people is good indeed
Vsevolod Rodionov
@Jabher
but we should take into account the whole project's dynamics, since an old project with tons of PRs and a fancy 6-month-old project with the same amount of PRs are different
Tim
@tonkonogov
that's why there is a pulse section
Vsevolod Rodionov
@Jabher
agreed. I'm just pointing out that we should not rely only on raw number inputs, but transform them
but, speaking personally, I'd be confused about this case: should we rely on a slowly growing old project, or a modern one with powerful support from the whole community?
Tim
@tonkonogov
it relies on what is downloaded most in the short run
and for me it's the weakest point in the whole system
Vsevolod Rodionov
@Jabher
are you so sure about this? this metric can easily be faked by bots.
aha, I'd rather agree on that
Tim
@tonkonogov
but it's still the simplest way to get reference data
Vsevolod Rodionov
@Jabher
BTW, I'm working on some simple insights.
Does anyone have any idea why this is happening?
[attached screenshot: Screen Shot 2016-12-13 at 5.00.27 PM.png]
Tim
@tonkonogov
I can't imagine how much time it could take to ask even 1k famous developers
Vsevolod Rodionov
@Jabher
it's the dependent count for the most popular packages (around the top 10k, limited to packages with 2k+ others depending on them), and there are some weird gaps at the end
Tim
@tonkonogov
what is the source?
Vsevolod Rodionov
@Jabher
I just dumped the whole of NPM, built a dependency graph (across all versions of packages, so it's not fully correct right now, but it was the faster way), then dumped it
so, like, there are 75k packages on npm that rely upon mocha, and so on; this should be a smooth curve, but there are anomalies, and I cannot understand the reasons
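A hedged sketch of how such a dependent-count curve could be built from an npm registry dump. Here `registry_dump` is assumed to map a package name to the list of packages it depends on; the real dump format is richer, and the entries below are invented.

```python
from collections import Counter

registry_dump = {
    "my-app": ["mocha", "lodash"],
    "my-lib": ["mocha"],
    "another-lib": ["lodash", "express"],
}

dependents = Counter()
for package, dependencies in registry_dump.items():
    for dependency in set(dependencies):  # count each dependent once per package
        dependents[dependency] += 1

# Sorting the counts in descending order gives the curve from the screenshot:
# ideally it decays smoothly, so plateaus or gaps hint at duplicated versions
# or errors in the dump rather than real registry structure.
curve = sorted(dependents.values(), reverse=True)
print(dependents.most_common(3), curve)
```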
Tim
@tonkonogov
I can't offer anything better than to pick some projects from the gaps
Vsevolod Rodionov
@Jabher
probably I'll find something there, you're right.
Sergey Dolganov
@sclinede
I'm happy to announce that we released the first StackOverflow metrics!
You can now find the number of questions and the percentage of them that are resolved.
Thanks to @tonkonogov
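A hedged sketch (not necessarily how Ossert computes these metrics) of pulling question counts for a tag from the public Stack Exchange API and estimating what share of them is answered. The tag "rake" and the single-page sampling are only an example.

```python
import requests

def so_questions_answered_share(tag, pages=1):
    """Count questions for a tag and the share marked as answered."""
    answered = total = 0
    for page in range(1, pages + 1):
        resp = requests.get(
            "https://api.stackexchange.com/2.3/questions",
            params={"tagged": tag, "site": "stackoverflow",
                    "pagesize": 100, "page": page},
            timeout=10,
        )
        resp.raise_for_status()
        items = resp.json().get("items", [])
        total += len(items)
        answered += sum(1 for q in items if q.get("is_answered"))
    return total, (answered / total if total else 0.0)

print(so_questions_answered_share("rake"))
```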
Sergey Dolganov
@sclinede
Fresh news! Search now works much better. Thanks to @likeath and @psdcoder
See http://ossert.evilmartians.io/search/rak