APOORV SACHAN
@apoos-maximus
I understood it, please ignore!
Hearot
@hearot
Guys, I'm gonna launch my own library, which is a client for thelatinlibrary.com. You can already install it using pip install thelatinlibrary. It may be useful in the future :P
Soham Ghosh
@isohamnemesis
Hey CLTK society. I am a noob here! I am a B.Tech undergraduate student at National Institute of Technology Karnataka (NITK). I am an open source enthusiast and have research experience in language processing from IISc Bangalore. Bengali is one of the most ancient and patronized languages of India and boasts works that achieved milestones on the world stage, like the Nobel Prize for literature and Academy Awards at The Oscars for the film "Pather Panchali" (originally made in Bengali). I am a bit amazed to see that it lacks contributions in this sphere, but equally enthusiastic about contributing to it. Can anyone help me get a head start? I am willing to join the organisation and start contributing, with the long-term motive of helping the organisation and also contributing for GSoC '19.
anshul96go
@anshul96go
Hi All
I am a 4th year UG student in the Economics Department at IIT Kanpur. I have always been interested in the study of ancient languages and would like to contribute. I am not familiar with NLP, but I have done courses and projects on ML. I am interested in learning NLP too.
Can someone please suggest some NLP sources, if required, and show me how to start contributing with my current knowledge?
Thanks :)
Kyle P. Johnson
@kylepjohnson

@hearot Thank you for letting us know about your project! Lately a CLTK contrib has been working on readers, too, since this has been a weak spot for our project. See two recent PRs here, though we have not written docs yet:

I am curious: What are the goals for thelatinlibrary project? Who do you imagine your users will be and how will they use it?

Thanks again @hearot please stay in touch. There are so few digital classicists that we should know each other :)

@isohamnemesis Bengali has a great pre-modern history, but if you're interested in doing a GSoC project, you need to prove to us that there exists enough data to write algorithms. Please see our latest blog post on cltk.org -- it has everything you need to know
Kyle P. Johnson
@kylepjohnson
@anshul96go I'll be honest, a GSoC proposal might be difficult for you, this year. The bar for our project is unusually high, since a student needs, at least, (a) a little NLP knowledge and (b) some understanding of an ancient/classical language. I feel that ML is of secondary importance, in particular as a way to assist doing NLP better.
Soham Ghosh
@isohamnemesis
I can assure you about the dataset, because I have contacted a professor, Prasanta Kumar Ghosh of IISc Bangalore (Asia's largest science and engineering research institute), who has contributed significantly to the NLP community and has large corpora of the Bengali language!
Apoorv Patne
@apoorvpatne10
Hey everyone, I'm new to cltk. How can I familiarize myself with the source code and start contributing?
Apoorv Patne
@apoorvpatne10
@kylepjohnson one question: how much understanding of a classical or ancient language is needed?
Vamsi Krishna Pendyala
@code-krishna
Hi, this is Vamsi, a final year computer science and engineering student from National Institute of Technology, Agartala. I am a student with some experience in Natural Language Processing. After going through cltk, I feel like contributing something different. I wish to propose a textual entailment feature for the toolkit on any available language scripts; personally, I am good with Hindi. Will it be an effective contribution? Kindly let me know, @kylepjohnson.
Shradhit Subudhi
@shradhit
Hello, I am Shradhit Subudhi. I'm a computer engineer and a graduate student majoring in Data Science - Information Technology at Rutgers University, NJ. I've worked on NLP projects and have completed various courses on Coursera in Applied NLP / ML. I'm confident in my coding abilities and fluent in Python.
Shradhit Subudhi
@shradhit
I want to contribute to Old/Middle English.
^ Old English - Dictionary links
Kyle P. Johnson
@kylepjohnson
@shradhit Focus on your application and answers to the 6 questions in the blog post. Everything is in the wiki
@code-krishna We are not interested in textual entailment but building more fundamental tools
Shradhit Subudhi
@shradhit
Thanks for your kind reply, @kylepjohnson. Once I'm done answering, where do I post the answers for you to read?
Kyle P. Johnson
@kylepjohnson
you can dm me here with questions, but only after you at least have answers to the 6 questions.
otherwise save the entire essay for the usual gsoc process, which won't begin for a while
Shradhit Subudhi
@shradhit
@kylepjohnson Okay! Once I have answers to all the questions, I'll direct message you!
Thanks for your prompt replies.
Hamza Ali
@ryzbaka
@kylepjohnson Hi, I'm new to the room. Which wiki are you referring to?
Anjali Bansal
@Anjibansal
Hello everyone!! I am Anjali Bansal, a second year undergraduate at Indira Gandhi Delhi Technical University For Women, India.
Srinivas
@srinivasmachiraju
Hello everyone, I have a couple of doubts regarding this organization. 1. How is it different from NLTK? 2. Why isn't English defined in this tool? Thank you
Srinivas
@srinivasmachiraju
Is this tool only for ancient or non-English languages?
Sushant Mehta
@SMe12435
@kylepjohnson
Hello,
I am Sushant Mehta, a 3rd year Computer Science undergrad at Manipal University Jaipur, India. I have significant experience in NLP. I am looking forward to contributing to the CLTK community and also participating in GSoC 2019. Please guide me further.
Indranil Biswas
@glitch401
Hi, this is Indranil, a Computer Science and Engineering undergraduate in my third year.
I am attracted to your project idea. I would like to start contributing to your project repo for GSoC and beyond. Can you please guide me to get started solving issues (good for beginners)? :)
Kyle P. Johnson
@kylepjohnson
for those interested in gsoc, the project page has been updated: https://github.com/cltk/cltk/wiki/Project-ideas
Please read it carefully. Nearly all answers are in there. We do not have org resources available to repeat answers here
Indranil Biswas
@glitch401
sure thing sir ! :)
Indranil Biswas
@glitch401
@kylepjohnson I just went through the project ideas and set up CLTK. I can be a contributor to the Sanskrit part of the extension of the project, but there is one section of the wiki that is unclear to me.
Do we have to prepare / gather / annotate datasets for the extension?
"for the data, how much preparation will be required. Concerning this last point, please remember that GSoC is about code, not data cleaning. We are not able to accept an otherwise brilliant application that also requires 6 weeks of data annotation or cleanup. If you believe you are able to do your data prep during application period or Community Bonding period, please explain that."
Soham Ghosh
@isohamnemesis
"For GSoC 2019, we are not encouraging applicants to make small code contributions, but instead to use this time to learn about the CLTK and make excellent proposals." Does this mean that we need to solve beginner issues to qualify for the selection process in GSoC 2019?
saurabhbazzad
@saurabhbazzad
Hello everyone. I would love to contribute to cltk. Can someone help me please?
Indranil Biswas
@glitch401
Hi @saurabhbazzad! Check out the page https://github.com/cltk/cltk/wiki/Project-ideas
MC
@michiboo
@kylepjohnson Hi Kyle, I have sent you my answers to the 6 questions by email.
Deepak Divya Tejaswi
@deeox

Hey everyone! I'm Deepak Divya Tejaswi, studying a CSE + Economics dual major at BITS Pilani Goa. I stumbled upon this project while going through GSoC projects and was fascinated by its idea and impact. I am looking forward to contributing positively to the community.

@kylepjohnson I am interested in contributing towards adding the language Sanskrit. I hope I have answered some of the questions in the following doc. I read the latest blog article and came up with some brief answers to questions that can be viewed here

I have also included some working links to Sanskrit text datasets in the first answer.

Kyle P. Johnson
@kylepjohnson
Hi @deeox can you DM me this link, please?
@/all GSoC announces orgs this week, so it won't be until after then that someone from the CLTK team will be able to review draft documents. Thank you
SANMITRA
@sanmitraD
Is CLTK participating in GSoC 2019?
Sushant Mehta
@SMe12435
No
CLTK wasn't selected. I'd like to contribute to CLTK during my summer.
Shubhangi Dutta
@celinarose
Hello everyone, I'm Shubhangi Dutta from IIIT Hyderabad.
I'd like to contribute to cltk. I'm trying to write a tokeniser for Middle English but I'm facing some issues with non-standardised spellings and normalisation. Does anyone have any pointers?
Kyle P. Johnson
@kylepjohnson
Hi guys, we're not part of GSoC this summer. Contributions still welcome
@celinarose have you used our default tokenizer? http://docs.cltk.org/en/latest/middle_english.html#stopword-filtering
It is not specific to ME, but our other users have not had issues. That said, we want to know about any shortcomings