Chris Watson
@watzon
Yeah I suppose it could've been RFC'd, but I just wanted something more accurate and different from what currently exists (since I haven't actually seen any language detectors made using a classifier; they all rely on algorithms that are pretty finicky). The cadmium_language_detector project is cool, and we can probably do some API merging, but personally I'd like to go with whichever is more accurate for the actual algorithm.
I'm planning on writing some tests tomorrow, but in my testing so far cadmium_lang is pretty dang accurate, even with small text samples
Rémy Marronnier
@rmarronnier

but personally I'd like to go with whichever is more accurate for the actual algorithm

I'm ok with that criterion.

Can't wait to see your test results. That makes me think, I should write that cadmiumcr/evaluation proposal.

Chris Watson
@watzon
Evaluation? Do tell
Rémy Marronnier
@rmarronnier

Using the way back machine (:-p) :

What kind of Evaluation?
It will be a collection of Crystal scripts that:
1 - Download a dataset
2 - Run a Cadmium::module against it
3 - Compare the results with the good values

It will be useful to evaluate some algos (Language identification, POS tagging, sentiment analysis) and let our users check for themselves the accuracy of our tools
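
The compare step of that proposal can be sketched roughly as below. This is only an illustration in Python rather than Crystal: `evaluate`, `toy_detector` and `SAMPLES` are hypothetical names, the dataset is a tiny in-memory list instead of a downloaded one, and the stand-in detector has nothing to do with any actual Cadmium module.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def evaluate(
    predict: Callable[[str], str],
    samples: Iterable[Tuple[str, str]],
) -> Tuple[float, Counter]:
    """Run `predict` on every (text, expected_label) pair.

    Returns the accuracy and a Counter of (expected, predicted) mistakes.
    """
    total = 0
    correct = 0
    errors: Counter = Counter()
    for text, expected in samples:
        total += 1
        predicted = predict(text)
        if predicted == expected:
            correct += 1
        else:
            errors[(expected, predicted)] += 1
    accuracy = correct / total if total else 0.0
    return accuracy, errors


# Hypothetical stand-in for the module under test (assumption, not Cadmium code).
def toy_detector(text: str) -> str:
    return "fr" if "bonjour" in text.lower() else "en"


# Hypothetical labelled data; a real script would download a dataset first.
SAMPLES = [
    ("Bonjour tout le monde", "fr"),
    ("Hello everyone", "en"),
    ("Salut les amis", "fr"),
]

accuracy, errors = evaluate(toy_detector, SAMPLES)
print(f"accuracy: {accuracy:.2f}")
for (expected, predicted), count in errors.items():
    print(f"expected {expected}, got {predicted}: {count}")
```

Keeping the mistakes as (expected, predicted) pairs would let users see which labels get confused with each other, not just the overall score.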
Chris Watson
@watzon
Ahh yeah
Forgot about that haha. Feel free to submit a proposal :)
Chris Watson
@watzon
Btw not sure if you noticed, but I bought an actual domain https://cadmiumcr.com/
Rémy Marronnier
@rmarronnier
.com ? Things are getting serious :-D
Chris Watson
@watzon
Oh yes lol
The other domain was a shitty free one that ended up breaking after a week
Rémy Marronnier
@rmarronnier
free ? I didn't know that was even possible !
Chris Watson
@watzon
Yeah but the free domains are very unreliable
Rémy Marronnier
@rmarronnier
I was thinking about the Cadmium roadmap. Do you have any thoughts on the subject ?
Chris Watson
@watzon
We need to make one haha, now that things are pretty well separated we also need to work on getting the main repo fleshed out. I don't know if I want to make it a reference repository or actually include everything all at once
Rémy Marronnier
@rmarronnier
Oh you mean putting all modules in the repo instead of shard-linking to them ?
Chris Watson
@watzon
Exactly. Just a "one time import" type of thing.
But idk if I like that idea
Rémy Marronnier
@rmarronnier
We need to have one big repo to generate api docs for the website, and I don't see how we can do that with separate shards
Chris Watson
@watzon
True, but at the same time the documentation generator isn't capable of generating docs for dependencies yet
I'm actually thinking of setting up a workflow to generate all the docs
Rémy Marronnier
@rmarronnier
that's what I feared
that'd be awesome
Chris Watson
@watzon
All we need is a docker container that clones everything, cds into each folder, runs crystal docs, clones the website, and moves all of the generated docs into a subfolder
Then pushes the updated site
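
A rough sketch of that workflow as a list of commands, written in Python for illustration. It only builds the plan; nothing runs unless `run` is called. The organisation URL, the shard names, the site URL and the `api/` subfolder are all assumptions, and it relies on `crystal docs` writing into `./docs` by default.

```python
import subprocess
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (working directory, argv)

ORG_URL = "https://github.com/cadmiumcr"  # assumption: where the shards live


def docs_build_plan(shards: List[str], site_url: str, workdir: str = "build") -> List[Step]:
    """Return the ordered commands for building and publishing the docs."""
    site = f"{workdir}/site"
    steps: List[Step] = []
    # Clone each shard, then generate its docs inside its own checkout.
    for shard in shards:
        checkout = f"{workdir}/{shard}"
        steps.append((".", ["git", "clone", "--depth", "1", f"{ORG_URL}/{shard}.git", checkout]))
        steps.append((checkout, ["crystal", "docs"]))
    # Clone the website and give every shard its own subfolder, so the
    # generated index files do not overwrite each other.
    steps.append((".", ["git", "clone", "--depth", "1", site_url, site]))
    steps.append((".", ["mkdir", "-p", f"{site}/api"]))
    for shard in shards:
        target = f"{site}/api/{shard}"
        steps.append((".", ["rm", "-rf", target]))  # drop stale docs first
        steps.append((".", ["cp", "-r", f"{workdir}/{shard}/docs", target]))
    # Commit and push the updated site.
    steps.append((site, ["git", "add", "api"]))
    steps.append((site, ["git", "commit", "-m", "Update API docs"]))
    steps.append((site, ["git", "push"]))
    return steps


def run(steps: List[Step]) -> None:
    """Execute the plan, stopping at the first failing command."""
    for cwd, argv in steps:
        subprocess.run(argv, cwd=cwd, check=True)


# Hypothetical shard names and site URL, for illustration only.
plan = docs_build_plan(["tokenizer", "stemmer"], "https://example.com/website.git")
for cwd, argv in plan:
    print(f"[{cwd}] {' '.join(argv)}")
```

One folder per shard matches the "individual folders for now" idea. Note that `git commit` exits with an error when nothing changed, so a real workflow would need to handle that case.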
Rémy Marronnier
@rmarronnier
hmm maybe move all the folders into a unique cadmium one, else we'll have conflicting index files
Chris Watson
@watzon
Yeah I'm thinking for now they'll all be in individual folders
Eventually I want to be able to group them all together though
Rémy Marronnier
@rmarronnier
ok I see
Chris Watson
@watzon
The documentation generator has a long way to go
Rémy Marronnier
@rmarronnier
Can't we change the color to green ?
:-)
Chris Watson
@watzon
For? The website or the docs?
Rémy Marronnier
@rmarronnier
the docs
or we're stuck with purple
Chris Watson
@watzon
Not currently. There's no customization available. There is a way to export the docs as json though, which can then be used by another generator
There just aren't any other generators yet
Rémy Marronnier
@rmarronnier
Ok
Chris Watson
@watzon
I would really like one though
Really we just need a command line utility that allows you to define a template and transforms that json into docs based on the template
Rémy Marronnier
@rmarronnier
Can't it just output common Markdown? We could plug it into a Jekyll theme.
Chris Watson
@watzon
If only lol
We'd need to make a converter
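
Such a converter could look something like the sketch below, again in Python for illustration. The JSON layout used here (`project`, `types`, `methods`, `summary`, `signature`) is a made-up shape and not the schema the Crystal doc generator exports; the templates are plain `string.Template` strings that a user could swap out.

```python
import json
from string import Template

# User-editable templates; the $placeholders are part of this sketch only.
TYPE_TEMPLATE = Template("## $name\n\n$summary\n")
METHOD_TEMPLATE = Template("- `$signature`: $summary")


def render_markdown(doc_json: str) -> str:
    """Turn a (hypothetical) docs JSON export into a Markdown page."""
    data = json.loads(doc_json)
    parts = [f"# {data['project']}", ""]
    for type_info in data["types"]:
        parts.append(
            TYPE_TEMPLATE.substitute(name=type_info["name"], summary=type_info["summary"])
        )
        for method in type_info.get("methods", []):
            parts.append(
                METHOD_TEMPLATE.substitute(
                    signature=method["signature"], summary=method["summary"]
                )
            )
        parts.append("")
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical input, only to show the shape this sketch expects.
SAMPLE = json.dumps(
    {
        "project": "cadmium",
        "types": [
            {
                "name": "Tokenizer",
                "summary": "Splits text into tokens.",
                "methods": [
                    {
                        "signature": "tokenize(text : String)",
                        "summary": "Returns the tokens found in text.",
                    }
                ],
            }
        ],
    }
)

print(render_markdown(SAMPLE))
```

Markdown output like this could then be dropped into a Jekyll site, which would also take care of theming and colors.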
Rémy Marronnier
@rmarronnier
We'll see about that later :-) I really want to finish the POS tagger before October and clean up / comment my code, do some refactoring / grunt work so we have something solid
Chris Watson
@watzon
Sounds great. I'll probably work on the docs angle here in a bit
Lots of meta stuff to do. Docs, benchmarks, finish up the website
Rémy Marronnier
@rmarronnier
Yeah so much work but so much fun :-)
Yvоnnе Мillеr
@frojnd_twitter
Hello