Tanuj Garg
@tanuj208
Hi, I'm new and I would like to contribute, can someone get me started?
Tanuj Garg
@tanuj208
I have done the beginner's exercises
Piyush Yadav
@Erikishiru
read the contributors guide
find some issues and get working
Ashish Ratnawat
@ashish-ratn
Hi @kylepjohnson, I am a computer science undergraduate student interested in deep learning and its intersection with NLP.
Ashish Ratnawat
@ashish-ratn
I want to participate in GSoC this summer. I have done the basic tutorials and exercises. I have read the blog post http://cltk.org/blog/2018/12/30/under-resourced-languages-cltk.html and I have a few queries:
1. Is it necessary to add more datasets, or can we work with the old datasets (like Greek and Latin) and add new algorithms or make the older ones more effective?
2. Can I work on a language like Greek, Latin, or older versions of Sanskrit/Hindi, applying NLP algorithms to do things like POS tagging, translation, and so on?
3. If I have answers to the 6 questions that the blog post asks, where do I send them to you?
AadilMehdi J Sanchawala
@aadilmehdis
Hey, I am new around here. Can someone tell me how I can start contributing?
AadilMehdi J Sanchawala
@aadilmehdis
I've gone through the beginner's guide and the documentation
jerryfrancis-97
@jerryfrancis-97
Hi, I'm new here. What should I do first to contribute?
Kyle P. Johnson
@kylepjohnson
@/all read our blog post (and the project page has been updated, too)
This year, we do NOT want small contributions. We want you to focus on making very good project proposals, instead.
Kyle P. Johnson
@kylepjohnson

@ashish-ratn

we can work with the old datasets (like Greek and Latin)

Of course, reuse is fine, but likely we don't have nearly enough.

Think about the tasks you want to accomplish. And remember, you must know a language somewhat well if you want to work on it.

@Erikishiru We prefer that you not give the other students directions.
Ghost
@ghost~5bd5e42dd73408ce4fad0b93
When can we start submitting our proposals for gsoc? @kylepjohnson
Kyle P. Johnson
@kylepjohnson
@SunilKu12355774_twitter No, not yet, but focus on your proposal. If you have answers to the 6 questions in our blog post, DM me on this
Ghost
@ghost~5bd5e42dd73408ce4fad0b93
@kylepjohnson thanks
sainimohit23
@sainimohit23
@kylepjohnson Does a language corpus have to be in its original script?
For example, I did some research on the Maithili language. It is primarily spoken in the eastern parts of India and Nepal. The original script for Maithili is Tirhuta, but since the 20th century writers have preferred the Devanagari script. Tirhuta is not digitized yet, and all of the digitized Maithili texts are now in Devanagari.
Kyle P. Johnson
@kylepjohnson
Does a language corpus have to be in its original script?
@sainimohit23 No, it does not. For example, last summer's cuneiform/Akkadian project was done entirely in the Latin alphabet. This is because the scholars who digitize these texts choose the Latin alphabet (and we must use what they have created).
APOORV SACHAN
@apoos-maximus
Hi, I am interested in contributing to the CLTK project.
Could someone direct me to the right resources?
Nishchith Shetty
@inishchith
Welcome to CLTK, @apoos-maximus! You can have a look at the beginners' exercises and also follow the quickstart. :smile:
APOORV SACHAN
@apoos-maximus
@inishchith thanks !
APOORV SACHAN
@apoos-maximus
Would CLTK work with a Python 3.7 installation?
sainimohit23
@sainimohit23
@apoos-maximus yes, it will
APOORV SACHAN
@apoos-maximus
@sainimohit23 :)
In the documentation, "corpus" refers to a knowledge database about a particular language, right?
And "corpora" refers to a collection of such databases?
Am I right in my interpretation of the words "corpus" and "corpora"?
APOORV SACHAN
@apoos-maximus
I understood it, please ignore!
Hearot
@hearot
Guys, I'm gonna launch my own library, which is a client for thelatinlibrary.com. You can already install it using pip install thelatinlibrary. It may be useful in the future :P
Soham Ghosh
@isohamnemesis
Hey CLTK community. I am a noob here! I am a B.Tech undergraduate student at National Institute of Technology Karnataka (NITK). I am an open source enthusiast and have research experience in language processing from IISc Bangalore. Bengali is one of the most ancient and patronized languages of India and boasts works that achieved milestones on the world stage, like the Nobel Prize for literature and Academy Awards at The Oscars for the film "Pather Panchali" (originally made in Bengali). I am a bit amazed to see that it lacks contributions in this sphere, but equally enthusiastic about contributing to it. Can anyone help me get a head start? I am willing to join the organisation and start contributing, with the long-term motive of helping the organisation and also contributing for GSoC '19.
anshul96go
@anshul96go
Hi All
I am a 4th-year UG student in the Economics Department at IIT Kanpur. I have always been interested in the study of ancient languages and would like to contribute. I am not familiar with NLP, but I have done courses and projects on ML. I am interested in learning NLP too.
Could someone please suggest some NLP resources, if required, and show me how to start contributing with my current knowledge?
Thanks :)
Kyle P. Johnson
@kylepjohnson

@hearot Thank you for letting us know about your project! Lately a CLTK contrib has been working on readers, too, since this has been a weak spot for our project. See two recent PRs here, though we have not written docs yet:

I am curious: What are the goals for thelatinlibrary project? Who do you imagine your users will be and how will they use it?

Thanks again @hearot please stay in touch. There are so few digital classicists that we should know each other :)

@isohamnemesis Bengali has a great pre-modern history, but if you're interested in doing a GSoC project, you need to prove to us that there exists enough data to write algorithms. Please see our latest blog post on cltk.org -- it has everything you need to know
Kyle P. Johnson
@kylepjohnson
@anshul96go I'll be honest, a GSoC proposal might be difficult for you, this year. The bar for our project is unusually high, since a student needs, at least, (a) a little NLP knowledge and (b) some understanding of an ancient/classical language. I feel that ML is of secondary importance, in particular as a way to assist doing NLP better.
Soham Ghosh
@isohamnemesis
I can assure you about the dataset, because I have contacted a professor, Prasanta Kumar Ghosh of IISc Bangalore (Asia's largest science and engineering research institute), who has made significant contributions to the NLP community and has large Bengali-language corpora!
Apoorv Patne
@apoorvpatne10
Hey everyone, I'm new to CLTK. How can I familiarize myself with the source code and start contributing?
Apoorv Patne
@apoorvpatne10
@kylepjohnson One question: how much understanding of a classical or ancient language is needed?
Vamsi Krishna Pendyala
@code-krishna
Hi, this is Vamsi, a final-year computer science and engineering student from National Institute of Technology, Agartala. I am a student with some experience in Natural Language Processing. After going through CLTK, I feel like contributing something different. I wish to propose a textual entailment feature for the toolkit, for any available language; personally, I am good with Hindi. Would it be an effective contribution? Kindly let me know, @kylepjohnson.
Shradhit Subudhi
@shradhit
Hello, I am Shradhit Subudhi. I'm a computer engineer and a graduate student majoring in Data Science - Information Technology at Rutgers University, NJ. I've worked on NLP projects and have completed various courses on Coursera in applied NLP/ML. I'm confident in my coding abilities and fluent in Python.
Shradhit Subudhi
@shradhit
I want to contribute to Old/Middle English.
^ Old English - Dictionary links
Kyle P. Johnson
@kylepjohnson
@shradhit Focus on your application and answers to the 6 questions in the blog post. Everything is in the wiki
@code-krishna We are not interested in textual entailment but in building more fundamental tools