Ghost
@ghost~5bd5e42dd73408ce4fad0b93
@kylepjohnson thanks
sainimohit23
@sainimohit23
@kylepjohnson Does a language corpus have to be in its original script?
For example, I did some research on the Maithili language. It is primarily spoken in eastern parts of India and in Nepal. The original script for Maithili is 'Tirhuta', but since the 20th century writers have preferred the 'Devanagari' script. 'Tirhuta' is not digitized yet, and all of the digitized Maithili texts are now in 'Devanagari' script.
Kyle P. Johnson
@kylepjohnson
Does a language corpus have to be in its original script?
@sainimohit23 No, it does not. For example, last summer's cuneiform/Akkadian project was done entirely in the Latin alphabet. This is because the scholars who digitize these texts choose the Latin alphabet (and we must use what they have created).
APOORV SACHAN
@apoos-maximus
Hi, I am interested in contributing to the CLTK project.
Could someone direct me to the right resources?
Nishchith Shetty
@inishchith
Welcome to CLTK, @apoos-maximus! You can have a look at the beginners' exercises and also follow the quickstart. :smile:
APOORV SACHAN
@apoos-maximus
@inishchith thanks !
APOORV SACHAN
@apoos-maximus
Would CLTK work with a Python 3.7 installation?
sainimohit23
@sainimohit23
@apoos-maximus yes, it will
APOORV SACHAN
@apoos-maximus
@sainimohit23 :)
In the documentation, 'corpus' refers to a knowledge database about a particular language, right?
And 'corpora' refers to a collection of such databases?
Am I right in my interpretation of the words 'corpus' and 'corpora'?
APOORV SACHAN
@apoos-maximus
I understand it now, please ignore!
Hearot
@hearot
Guys, I'm gonna launch my own library, which is a client for thelatinlibrary.com. You can already install it using pip install thelatinlibrary. It may be useful in the future :P
Soham Ghosh
@isohamnemesis
Hey CLTK society. I am a noob here! I am a B.Tech undergraduate student at the National Institute of Technology Karnataka (NITK). I am an open source enthusiast and have research experience in language processing from IISc Bangalore. Bengali is one of the most ancient and patronized languages of India and boasts works that achieved milestones on the world stage, like the Nobel Prize for literature and Academy Awards at The Oscars for the film "Pather Panchali" (originally made in Bengali). I am a bit amazed to see that it lacks contributions in this sphere, but I am equally enthusiastic about contributing to it. Can anyone help me get a head start? I am willing to join the organisation and start contributing, with the long-term aim of helping the organisation and also contributing for GSoC '19.
anshul96go
@anshul96go
Hi all,
I am a 4th year UG student in the Economics Department at IIT Kanpur. I have always been interested in the study of ancient languages and would like to contribute. I am not familiar with NLP, but I have done courses and projects on ML. I am interested in learning NLP too.
Could someone please suggest some NLP resources, if required, and show me how to start contributing with my current knowledge?
Thanks :)
Kyle P. Johnson
@kylepjohnson

@hearot Thank you for letting us know about your project! Lately a CLTK contributor has been working on readers, too, since this has been a weak spot for our project. See two recent PRs here, though we have not written docs yet:

I am curious: What are the goals for thelatinlibrary project? Who do you imagine your users will be and how will they use it?

Thanks again, @hearot. Please stay in touch. There are so few digital classicists that we should know each other :)

@isohamnemesis Bengali has a great pre-modern history, but if you're interested in doing a GSoC project, you need to prove to us that enough data exists to write algorithms. Please see our latest blog post on cltk.org -- it has everything you need to know.
Kyle P. Johnson
@kylepjohnson
@anshul96go I'll be honest: a GSoC proposal might be difficult for you this year. The bar for our project is unusually high, since a student needs, at least, (a) a little NLP knowledge and (b) some understanding of an ancient/classical language. I feel that ML is of secondary importance, useful mainly as a way to do NLP better.
Soham Ghosh
@isohamnemesis
I can assure you about the dataset: I have contacted Professor Prasanta Kumar Ghosh of IISc Bangalore (Asia's largest science and engineering research institute), who has made significant contributions to the NLP community and has large corpora of Bengali language data!
Apoorv Patne
@apoorvpatne10
Hey everyone, I'm new to CLTK. How can I familiarize myself with the source code and start contributing?
Apoorv Patne
@apoorvpatne10
@kylepjohnson One question: how much understanding of a classical or ancient language is needed?
Vamsi Krishna Pendyala
@code-krishna
Hi, this is Vamsi, a final year computer science and engineering student from the National Institute of Technology, Agartala. I have some experience in Natural Language Processing. After going through CLTK, I feel like contributing something different. I wish to propose a textual entailment feature for the toolkit, for any available language; personally I am good with Hindi. Would it be an effective contribution? Kindly let me know, @kylepjohnson.
Shradhit Subudhi
@shradhit
Hello, I am Shradhit Subudhi. I'm a computer engineer and a graduate student majoring in Data Science - Information Technology at Rutgers University, NJ. I've worked on NLP projects and have completed various Coursera courses in Applied NLP / ML. I'm confident in my coding abilities and am fluent in Python.
Shradhit Subudhi
@shradhit
I want to contribute to Old/Middle English.
^ Old English - Dictionary links
Kyle P. Johnson
@kylepjohnson
@shradhit Focus on your application and answers to the 6 questions in the blog post. Everything is in the wiki.
@code-krishna We are not interested in textual entailment, but rather in building more fundamental tools.
Shradhit Subudhi
@shradhit
Thanks for your kind reply, @kylepjohnson. Once I'm done answering, where do I post the answers for you to read?
Kyle P. Johnson
@kylepjohnson
You can DM me here with questions, but only after you have answers to at least the 6 questions.
Otherwise, save the entire essay for the usual GSoC process, which won't begin for a while.
Shradhit Subudhi
@shradhit
@kylepjohnson Okay! Once I have answers to all the questions, I'll direct message you!
Thanks for your prompt replies.
Hamza Ali
@ryzbaka
@kylepjohnson Hi, I'm new to the room. Which wiki are you referring to?
Anjali Bansal
@Anjibansal
Hello everyone!! I am Anjali Bansal, a second year undergraduate at Indira Gandhi Delhi Technical University for Women, India.
Srinivas
@srinivasmachiraju
Hello everyone, I have a couple of doubts regarding this organization. 1. How is it different from NLTK? 2. Why isn't English included in this tool? Thank you.
Srinivas
@srinivasmachiraju
Is this tool only for ancient or non-English languages?
Sushant Mehta
@SMe12435
@kylepjohnson
Hello,
I am Sushant Mehta, a 3rd year Computer Science undergrad at Manipal University Jaipur, India. I have significant experience in NLP. I am looking forward to contributing to the CLTK community and also participating in GSoC 2019. Please guide me further.
Indranil Biswas
@glitch401
Hi, this is Indranil, a third-year Computer Science and Engineering undergraduate.
I am attracted to your project idea. I would like to start contributing to your project repo for GSoC and beyond. Can you please guide me on getting started with solving issues (good for beginners)? :)
Kyle P. Johnson
@kylepjohnson
For those interested in GSoC, the project page has been updated: https://github.com/cltk/cltk/wiki/Project-ideas
Please read it carefully. Nearly all answers are in there. We do not have org resources available to repeat answers here.
Indranil Biswas
@glitch401
Sure thing, sir! :)
Indranil Biswas
@glitch401
@kylepjohnson I just went through the project ideas and set up CLTK. I can be a contributor to the Sanskrit part of the project's extension, but there is one section of the wiki that is unclear to me.