and we also don't have to wonder "should I create a module or not for this new Cadmium shard?" Fewer questions, more action
Chris Watson
@watzon
What do you mean?
Rémy Marronnier
@rmarronnier
Cadmium will grow, and neither we nor other contributors and users will have to guess whether the stemmer shard is a module or just classes
Chris Watson
@watzon
Ahh yeah
Ideally anything that has children should be a module
Rémy Marronnier
@rmarronnier
It's a detail, but each detail counts and our human brains are limited
even if there's only one class (I'm looking at you, TfIdf)
Chris Watson
@watzon
In the case of TfIdf, Sentiment, etc., which only have one class, we should probably just keep each one as a top-level class. Unless you have a better idea.
Rémy Marronnier
@rmarronnier
Oh, I see: if we make Sentiment a module, we'll have Cadmium::Sentiment::Sentiment. Stutter all over again. You're right. No children, no module.
Chris Watson
@watzon
Yeah exactly
Rémy Marronnier
@rmarronnier
New PR done :-)
Chris Watson
@watzon
Got it
Chris Watson
@watzon
I've signed Cadmium up for the GitHub Actions beta
As soon as it gets approved I want to move the CI stuff from Travis to there
Rémy Marronnier
@rmarronnier
Great, I've never done CI stuff. It's gonna be fun :-)
Chris Watson
@watzon
Should be pretty similar to the Travis stuff that's already in there
Rémy Marronnier
@rmarronnier
Ok, I'll look at that.
Chris Watson
@watzon
We could actually integrate this into Cadmium as well. I think it's got some things that the Transliterator doesn't have
Rémy Marronnier
@rmarronnier
Looks useful. What do you want to do? Have a Cadmium::sterile shard? Insert some methods into the Transliterator?
Rémy Marronnier
@rmarronnier
Also, I've looked into the Stanford vs. Cadmium GloVe file formats:
I have neither the skills nor the motivation to write a Crystal converter for the binary co-occurrence matrix files output by Stanford.
However, I could write an adapter to read the .txt files.
For this to be useful, Cadmium::Glove should be able to start training from word-vectors.json and corpus.json, without the co-occurrence matrix. Do you have any plans for this?
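To illustrate the .txt adapter idea: Stanford's GloVe text output stores one word per line, followed by its vector components separated by whitespace. Below is a minimal sketch of reading that format, written in Python only for illustration. The function name `parse_glove_text` is hypothetical, and nothing here assumes the layout of Cadmium's word-vectors.json.

```python
from typing import Dict, Iterable, List, Optional


def parse_glove_text(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe-style text lines of the form `word v1 v2 ... vn`.

    Hypothetical helper for illustration; not part of Cadmium or GloVe.
    """
    vectors: Dict[str, List[float]] = {}
    dim: Optional[int] = None
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        values = [float(x) for x in parts[1:]]
        if dim is None:
            dim = len(values)  # first vector fixes the expected dimension
        elif len(values) != dim:
            raise ValueError(f"inconsistent vector size for {word!r}")
        vectors[word] = values
    return vectors


# Tiny made-up sample; a real file would be passed as an open file handle.
sample = ["the 0.1 -0.2 0.3", "cat 0.4 0.5 -0.6"]
vectors = parse_glove_text(sample)
```

The resulting word-to-vector mapping could then be serialized into whatever structure the training code expects.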
Chris Watson
@watzon
I don't, so don't worry about it for now haha
As far as Sterile, I was thinking it could be merged with the Transliterator
Rémy Marronnier
@rmarronnier
Yeah, go for it! :D
The more I read about POS tagging, the more I think we need to port Keras to Crystal lol. I'm going crazy :-p
Chris Watson
@watzon
It would be a really nice thing to have
Rémy Marronnier
@rmarronnier
Yeah, it's weird that Python has all the nice toys. I was expecting the Go or Rust community to build the state-of-the-art ML tools... but no.
Chris Watson
@watzon
Everyone is comfortable with Python and its ease of use
Which sucks
But gives us an opportunity in Crystal :wink:
Rémy Marronnier
@rmarronnier
Exactly! I can't wait for 1.0 to come out, just so Crystal gets taken seriously by companies
Chris Watson
@watzon
Me too
Rémy Marronnier
@rmarronnier
Oh, GitHub Actions is on!
I've set it up for the lemmatizer (without ameba though) and will test it with my next PR :-)
Chris Watson
@watzon
Awesome :)
Yay GitHub Actions! I'm going to try setting up the CI with one of the libraries
Nice that the Lemmatizer is passing. Let's make sure that all of the libraries have ameba installed as a development dependency and then use the same configuration file as cadmiumcr/cadmium