These are chat archives for cltk/cltk_api

9th
Mar 2018
aboelhamd
@aboelhamd
Mar 09 2018 09:34
Hello, are there any Arabic mentors in CLTK this year?
Thanks in advance.
Kyle P. Johnson
@kylepjohnson
Mar 09 2018 14:26
@kevinstadler This is a great question. Thanks for digging into the issue. I'll venture a short answer here to get things started …
1) I wrote v2 of the API, which has two basic functions: serving texts and doing NLP processing. However, @lukehollis leads the web projects and can answer exactly how the project will leverage it in the near future.
2) Luke can speak to the workflow of JSON and TEI-formatted texts.
3) About writing a better reader, we have had lots of thoughts … but not many decisions. @diyclassics has done some work within the core Python project to create a reader. However, there is nothing we'd quite call official yet.
4) So you are correct in seeing this as a weak spot throughout the CLTK; however, I believe we have some OK ad hoc solutions. This topic falls as much into the frontend as the backend, so I think @lukehollis should also give his full opinion on whether this is a priority for GSoC '18.
@aboelhamd Yes, we are very happy to say that we do have an Arabic mentor this year.
aboelhamd
@aboelhamd
Mar 09 2018 15:34
Thank you, Kyle. So what should I do next? Should I contact this mentor, or something else?
Kyle P. Johnson
@kylepjohnson
Mar 09 2018 16:40
@aboelhamd Good question. Please do the following two steps first, then I will introduce you to the Arabic mentor:
Kyle P. Johnson
@kylepjohnson
Mar 09 2018 16:47
1) Do the Beginners' exercises, with Classical Arabic: https://github.com/cltk/cltk/wiki/Beginners'-exercises
2) Write a draft proposal according to our GSoC proposal template: https://github.com/cltk/cltk/wiki/GSoC-proposal-template . For Arabic, you'll want to focus on what NLP processing you will be able to add for Classical Arabic (things like word tokenization, POS tagging, etc.). If training data is required, it is critical that you explain what free data you will use.
@aboelhamd I will actually move this conversation to the channel for the Python project.