These are chat archives for cltk/cltk

2nd
Mar 2016
Deepanshu Vijay
@deepanshu1995
Mar 02 2016 00:23
Hello, I am a computer science undergraduate student and new to this organization. I would like to work on the project "Extend CLTK core to new language" as a GSoC project under CLTK. Is there a particular mentor for this project?
Nelson Liu
@nelson-liu
Mar 02 2016 00:32
This message was deleted
Oops, accidentally misclicked and deleted the message. I’ll attempt to rewrite it:
@kylepjohnson I feel like it’s better practice to squash all PR commits into one commit and rebase onto master before merging the PR into master. That way, each commit corresponds to one issue fix / enhancement and there aren’t commits like Merge branch 'master' into prosody-fix in the history that don’t really mean much.
Kyle P. Johnson
@kylepjohnson
Mar 02 2016 04:56
@deepanshu1995 please write to me at kyle@kyle-p-johnson.com about what language you want to add to the CLTK and which corpora/data sets you want to start with. Thanks!
@nelson-liu I have not done much with rebasing before. Would you please share here an example workflow for us? I'll confess that I like the simplicity of our current practice, though you're right that some of these commits aren't always useful.
@nelson-liu just saw your PR. I'll accept, though will probably move the info into the wiki. I'll get back to you about the rebase flow. thx!
Nelson Liu
@nelson-liu
Mar 02 2016 05:02
ah true, thanks. The wiki would be a better place to put it.
so the benefit of rebasing is twofold: you don’t have commits like “merge x into master” everywhere in the history, and you can use it to squash the commits in a PR into one commit, so in the history one commit refers to one issue solved / enhancement (as PRs typically solve an issue or introduce an enhancement)
so to rebase, you would git checkout my_feature_branch
and git rebase master
solve all the merge conflicts, and then it will replay your changes on top of the current master, so you don’t have commits like merge x into master
now to squash, you would run git rebase -i HEAD~x
where x is a number signifying the last x commits you want to squash
so if I made 3 commits in a PR, I’d run git rebase -i HEAD~3 to squash the last 3 commits in my branch together. After running that, git opens a text editor listing the commit hashes / commit messages of the last x commits. There, you leave the first commit as pick and change pick to s or squash on each commit you want to fold into it
then you write a new commit message, and force push (git push -f). The result is that all of your 3 commits in that PR are now one commit, which is a lot cleaner in terms of history and workflow.
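The rebase-and-squash steps above can be sketched as a shell session. This is only a minimal sketch in a throwaway repository: the branch name my_feature_branch comes from the messages above, while the file names, commit messages, and the GIT_SEQUENCE_EDITOR / sed trick (used here so the interactive step runs without opening an editor) are illustrative assumptions, not an established CLTK workflow. In normal use you would edit the todo list by hand as described.

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the base branch is called master
git config user.email "you@example.com"
git config user.name "Example"

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# A feature branch with three small commits, standing in for a PR.
git checkout -q -b my_feature_branch
for i in 1 2 3; do
    echo "change $i" >> feature.txt
    git add feature.txt
    git commit -q -m "Feature work $i"
done

# Meanwhile master moves on.
git checkout -q master
echo "more" >> file.txt
git commit -q -am "Unrelated change on master"

# Step 1: replay the feature commits on top of the current master.
git checkout -q my_feature_branch
git rebase -q master

# Step 2: squash the last 3 commits. The sed call keeps the first "pick"
# and turns the remaining ones into "squash", which is the same edit you
# would make by hand in the editor that git rebase -i opens.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -i HEAD~3

# Step 3: write a new commit message for the squashed commit.
git commit -q --amend -m "Add feature (squashed from 3 commits)"

# Step 4 (not run here, the sandbox has no remote): update the PR branch.
# git push -f origin my_feature_branch

git log --oneline
```

Because the history of the PR branch is rewritten, the final push has to be forced; on a branch that other people also push to, git push --force-with-lease is a more cautious alternative to -f.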
Kyle P. Johnson
@kylepjohnson
Mar 02 2016 06:13
I like it, Nelson. Appreciate the lesson. Will be in touch about this at my earliest opportunity. KJ
Abhishek Singh
@AbhishekKumarSingh
Mar 02 2016 09:00
Hi, I am a GSoC 2016 aspirant. While going through the ideas page of Classical Language Toolkit I came across the idea 'Develop Machine Translation Interface to Moses' and found it interesting and promising. I am a first-year master's student and my areas include machine learning, text processing and NLP. I would like to contribute to this project as part of GSoC. Who is the mentor for this project?
Kyle P. Johnson
@kylepjohnson
Mar 02 2016 16:27
Dear all applicants, our team has been (happily) overwhelmed by the many emails we've received. @lukehollis and I are responding to each one individually. It may take a few more days till we can get back to you, so please don't take the silence as disinterest.
@AbhishekKumarSingh That's a great project. Please write an email to me explaining what parallel corpora you'd make or use in your project. Also any MT experience you have. We can have a conversation from there.
shivch0te
@shivch0te
Mar 02 2016 16:30
@kylepjohnson I sent you an email. I have knowledge of Java, JavaScript, MySQL, AJAX and HTML. I chose Classical Sanskrit because I have knowledge of it. Tell me how I can help; I want to be part of this.
Abhishek Singh
@AbhishekKumarSingh
Mar 02 2016 18:00
@kylepjohnson, I have sent you the email.
Kindly let me know your feedback.
Nishant Suman
@Nishant23
Mar 02 2016 19:32
@kylepjohnson Kindly let me know your feedback on the email I've sent you.
Ankush Khandelwal
@ankush1995
Mar 02 2016 20:44
@kylepjohnson I have sent you an email regarding what you asked; please let me know what you think about it.
Soumya
@soumyag213
Mar 02 2016 22:42

Greetings, I am a GSoC 2016 aspirant. I am a third-year undergraduate student from Netaji Subhas Institute of Technology, New Delhi, India. I would love to work on a project in CLTK (in my opinion, preserving languages works on more than just linguistic, cultural or philosophical levels).

I am comfortable working with Python, Java and C++. I've had extensive training in Sanskrit and Hindi and have a smattering of Kannada.
I would love to know how to proceed from here.
Thank you