    Oguzhan Unlu
    @oguzhanunlu
    Oh, thank you @lukehollis ! Just sent the email.
    Aakarsh Singh
    @aakarshsingh
    @lukehollis How about we write a custom XML reader that cleans up the XML and gives us just the data we need?
    Luke Hollis
    @lukehollis
    @aakarshsingh, I think that sounds really interesting-- seems like you and @gree-gorey have been thinking along the same lines! Check out his xslt script here: https://github.com/gree-gorey/latin_text_perseus/blob/master/xslt/simplify.xsl
    I think that this would be really useful if we could generalize it enough to accommodate the variations in the different xml formats.
    (among the perseus latin and greek corpora)
    Luke Hollis
    @lukehollis
    we can write different scripts in xslt/python/etc. to deal with the other formats when necessary
    Grigory Ignatyev
    @gree-gorey
    Yes, indeed.
    I wrote up my observations and thoughts in the issue cltk/cltk_api#11
    Aakarsh Singh
    @aakarshsingh
    Yes, I went through the script. It does something similar to what I was thinking about. I think we just need to be open to using different logic to parse the different XMLs and get them into the working format. Once we have done enough of that, we will know how to generalize it.
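A minimal sketch of the "custom XML reader" idea in Python, not the actual script linked above. The element names (`div1`, `l`, `note`) and the sample fragment are assumptions for illustration; real Perseus files vary, which is exactly why per-format logic may be needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment; real corpus files differ in tag names and nesting.
SAMPLE = """<text><body>
<div1 type="book" n="1">
  <l n="1">Arma virumque <note>editorial note</note>cano</l>
  <l n="2">Italiam fato profugus</l>
</div1>
</body></text>"""


def line_text(elem, skip=("note",)):
    """Collect the text of an element, leaving out any child tags listed in skip."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag not in skip:
            parts.append(line_text(child, skip))
        # Text that follows a child still belongs to the parent line.
        parts.append(child.tail or "")
    return "".join(parts)


def simplify(xml_string):
    """Reduce the XML to a nested dict: book number -> line number -> text."""
    root = ET.fromstring(xml_string)
    books = {}
    for div in root.iter("div1"):
        lines = {}
        for line in div.iter("l"):
            # Collapse runs of whitespace left behind by the markup.
            lines[line.get("n")] = " ".join(line_text(line).split())
        books[div.get("n")] = lines
    return books


print(simplify(SAMPLE))
```

A second format would get its own `simplify_*` function with different tag names, and the shared pieces could be factored out once the patterns are clear.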
    Kyle P. Johnson
    @kylepjohnson
    FYI I'm making a backup of the API & Meteor server, so had to reboot. It'll be back up soon. Sorry @lukehollis I should have warned you
    Luke Hollis
    @lukehollis
    hey no worries, sounds good
    Kyle P. Johnson
    @kylepjohnson
    Luke, the server is back up, but I can't find your directions for restarting Meteor. I'm sorry, but whenever you can, I suggest you do it
    Luke Hollis
    @lukehollis
    Yep, need to write those up
    sudo start cltkfrontend
    Kyle P. Johnson
    @kylepjohnson
    i'll try that now
    one sec
    Luke Hollis
    @lukehollis
    oh sorry, just did it to make sure it worked!
    i can stop it so you can replicate
    kk stopped
    Kyle P. Johnson
    @kylepjohnson
    very cool, we're back up! :boom:
    Luke Hollis
    @lukehollis
    sweet xD
    hoping to dedicate some more time to restructuring the reading templates tonight after answering some emails--will add more to the readme about building/deploying
    Luke Hollis
    @lukehollis
    added a bit more about the build and deploy process in the readme
    Kyle P. Johnson
    @kylepjohnson
    Thanks Luke, I'll try it myself next time. When the time is right, we should make this autostart, too
    Luke Hollis
    @lukehollis
    sounds great
    Kyle P. Johnson
    @kylepjohnson
    @lukehollis I was working with @SameerIITKGP on data types for plays. I asked him to give the following structure a shot:
    {  
       "text":{  
          "Actor 1":{  
             "1":"This is the first line"
          }
       },
       "Actor 2":{  
          "2":"This is the first line",
          "3":"The third line."
       }
    }
    damn, this is wrong, hold on
    This:
    {  
       "text":{  
          "Actor 1":{  
             "1":"This is the first line"
          },
          "Actor 2":{  
             "2":"This is the first line",
             "3":"The third line."
          }
       }
    }
    If (IF!) this looks good, let's encode like this going forward, calling it something like "actor-line"
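A rough Python sketch of how the proposed "actor-line" structure could be built from an ordered list of spoken lines. The function name `to_actor_line` and the `(speaker, line_number, text)` input shape are assumptions for illustration, not anything in the CLTK codebase.

```python
import json


def to_actor_line(lines):
    """Group (speaker, line_number, text) tuples under each speaker.

    Line numbers are stored as string keys, matching the JSON above.
    """
    text = {}
    for speaker, number, content in lines:
        text.setdefault(speaker, {})[str(number)] = content
    return {"text": text}


# The same toy data as in the corrected structure above.
play = [
    ("Actor 1", 1, "This is the first line"),
    ("Actor 2", 2, "This is the first line"),
    ("Actor 2", 3, "The third line."),
]

print(json.dumps(to_actor_line(play), indent=3))
```

One consequence of this shape: if a speaker returns later in the play, their lines are merged under one key, so the order of speeches can only be recovered from the line numbers.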
    Luke Hollis
    @lukehollis
    Okay interesting..
    Looks good, but what we have now is structured so that the key is always an integer
    We can design for this use case though
    Kyle P. Johnson
    @kylepjohnson
    Yes, I know this isn't perfect ... but I don't know what would be better. There's something else, I'm sure.
    The central problem is: How can we
    ... How can we convey
    (Sorry, iPhone app keeps sending my message too soon!)
    How can we mark the beginning of a new speaker? In my NLP work I couldn't care less who the speaker is; for reading, however, this is obviously necessary!
    Luke, when you're ready to work on this, I expect you'll have better ideas
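One possible way to recover speaker changes from the "actor-line" structure, sketched in Python under the assumption that line numbers are unique across the whole play. The function name `reading_order` and the `new_speaker` flag are hypothetical, not an agreed design.

```python
def reading_order(actor_line):
    """Flatten an actor-line dict into lines sorted by number, flagging speaker changes."""
    rows = []
    for speaker, lines in actor_line["text"].items():
        for number, content in lines.items():
            # Keys are strings in the JSON, so convert before sorting.
            rows.append((int(number), speaker, content))
    rows.sort()

    ordered = []
    previous = None
    for number, speaker, content in rows:
        ordered.append({
            "n": number,
            "speaker": speaker,
            "new_speaker": speaker != previous,  # True where a new speech starts
            "text": content,
        })
        previous = speaker
    return ordered


sample = {
    "text": {
        "Actor 1": {"1": "This is the first line", "4": "The fourth line."},
        "Actor 2": {"2": "This is the first line", "3": "The third line."},
    }
}

for row in reading_order(sample):
    marker = row["speaker"] + ": " if row["new_speaker"] else ""
    print(row["n"], marker + row["text"])
```

A reading view could print the speaker name only where `new_speaker` is true, while NLP code could ignore the flag entirely.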
    Luke Hollis
    @lukehollis
    Oh man, that's a good question! Hrm... well, it seems like if it works for now, I'm all for it.
    We can work toward a more complex solution after we get to a simple one, I think
    One more thought on this: we're currently sorting on all the keys for book-chapter, chapter-section, book-line, etc.
    So when we have something like
    {
       'text' : {
          '1' : {
             '1' : "This is the first line",
             '2' : "This is the second line",
             ... 
          }
       }
    }
    Our current queries look like this:
    Luke Hollis
    @lukehollis
    Texts.find({work:"the_work"}, {sort:{ n_1 : 1, n_2 : 1}})
    Which means generally sort by the first nested keys and then sort by the second nested keys
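To make the sort concrete, here is a plain-Python imitation of what that query does, not Meteor code. The field names `work`, `n_1` and `n_2` come from the query above; the flat one-document-per-line shape and the `sort_like_query` helper are assumptions for illustration.

```python
# Hypothetical flattened documents, one per line of text.
docs = [
    {"work": "the_work", "n_1": 2, "n_2": 1, "text": "Book 2, line 1"},
    {"work": "the_work", "n_1": 1, "n_2": 10, "text": "Book 1, line 10"},
    {"work": "the_work", "n_1": 1, "n_2": 2, "text": "Book 1, line 2"},
    {"work": "another_work", "n_1": 1, "n_2": 1, "text": "Different work"},
]


def sort_like_query(documents, work):
    """Filter by work, then order by n_1 and break ties with n_2."""
    selected = [d for d in documents if d["work"] == work]
    return sorted(selected, key=lambda d: (d["n_1"], d["n_2"]))


for doc in sort_like_query(docs, "the_work"):
    print(doc["n_1"], doc["n_2"], doc["text"])

# Integer keys matter here: compared as strings, "10" sorts before "2".
print(sorted(["1", "2", "10"]))
print(sorted([1, 2, 10]))
```

This is also why speaker names as keys are awkward for the current queries: they have no natural reading order to sort on.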
    The simplest possible solution (maybe not optimal?) that I could imagine if we wanted to keep sorting on the keys would just be to do something like this: