    Alex Leone
    @alexleone
    These are effective groups!
    Traun Leyden
    @tleyden
    yeah so far I'm super impressed
    Thought Object
    @thoughtobj
    Is this using the go-tesseract wrapper for tesseract?
    Traun Leyden
    @tleyden
    It's disabled by default due to some bugs
    So it just calls tesseract via exec
    Omnipresent
    @Omnipresent
    Ok. I guess that's the same, since it would require the image to be saved to disk anyway.
    I'm new to docker. The readme says there will be 3 docker images running on the same server. Could they be run on different servers? A dedicated server for rabbitmq, one for api requests, and one for ocr workers? To be more scalable.
    Traun Leyden
    @tleyden
    yes definitely!
    I've set it up so that rabbitmq is running on https://www.cloudamqp.com/, which is nice because they manage it and give you a nice web UI for administering rabbitmq
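A minimal sketch of how a worker or API process could be pointed at a RabbitMQ broker on a different server. It is not open-ocr's actual configuration code. `RABBITMQ_HOST` is the variable name that comes up later in this log; the `guest` credentials, the `localhost` fallback, and the helper names `envOr` and `amqpURL` are assumptions for illustration. Port 5672 is the standard AMQP port.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"os"
)

// envOr returns the value of the environment variable key,
// or fallback when the variable is unset or empty.
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// amqpURL builds an AMQP connection URL for a broker on the given host.
// Hypothetical helper: the real project may assemble its URL differently.
func amqpURL(host, user, pass string) string {
	u := url.URL{
		Scheme: "amqp",
		User:   url.UserPassword(user, pass),
		Host:   net.JoinHostPort(host, "5672"), // standard AMQP port
		Path:   "/",
	}
	return u.String()
}

func main() {
	// Assumed defaults; a hosted broker would supply its own host and credentials.
	host := envOr("RABBITMQ_HOST", "localhost")
	fmt.Println(amqpURL(host, "guest", "guest"))
}
```

Under these assumptions, each component only needs the broker's address, so the broker, the HTTP API, and the OCR workers can live on separate machines.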
    Omnipresent
    @Omnipresent
    Awesome. This is super great. I've been building an opencv and tesseract preprocessing pipeline and was looking to scale it. Now I can just add mine as a preprocessing step
    Btw curious... what did you use for the diagram on the readme?
    Omnipresent
    @Omnipresent
    Is the httpd service load balancing between the workers?
    Daemian Mack
    @daemianmack
    hey all. is there a demo running somewhere of this project (or of tesseract) that i can try out?
    Traun Leyden
    @tleyden
    @daemianmack hey! nope, no public api, but it should be easy to deploy it on your own cloud.
    Daemian Mack
    @daemianmack
    @tleyden i was hoping to avoid setup if the sort of text i'm looking to OCR turns out to be impractical. maybe you could opine -- does the text in this image look like it might be possible to OCR with tesseract, given i might need to use character whitelisting and some kind of positioning bounding/transform? http://i.imgur.com/SMbdWzK.jpg
    Traun Leyden
    @tleyden
    @daemianmack it's really hard to know without trying, but my gut tells me that tesseract will struggle with that
    Thought Object
    @thoughtobj
    Is tesseract run from the command line, or does it use the provided C API? I want to know whether everything is done in memory or through disk I/O
    Traun Leyden
    @tleyden
    @thoughtobj initially it was using a go binding to the c api
    However I ran into limitations and switched to a command-line approach (fork / exec of a subprocess)
    Thought Object
    @thoughtobj
    @tleyden do you remember what limitations you ran into and whether they were from the actual c-api or from the go binding? The command-line approach would work fine; however, it requires writing the file to disk, which adds I/O. Doing everything in memory would be better, no?
    Traun Leyden
    @tleyden
    @thoughtobj yeah there were limitations to the go bindings and I filed an issue (that I can dig up), which may have been fixed by now. I believe I made the commandline exec() approach the default but kept the gobinding approach as optional.
    But yeah, the gobinding approach is cleaner and more efficient and was my original approach
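A minimal sketch of the command-line (fork / exec) approach discussed above, showing why it involves disk I/O: the image bytes go to a temporary file, the `tesseract` binary is run as a subprocess, and the text is read back from the output file. This is not open-ocr's actual code. The helper names `tesseractArgs` and `ocrViaExec` are hypothetical, and it assumes a `tesseract` binary on the `PATH` that accepts the classic `tesseract <image> <outputbase>` form and writes `<outputbase>.txt`.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// tesseractArgs builds the argument list for the classic CLI form
// `tesseract <image> <outputbase>`, which writes <outputbase>.txt.
func tesseractArgs(imagePath, outBase string) []string {
	return []string{imagePath, outBase}
}

// ocrViaExec is a hypothetical helper: it writes the image to disk,
// runs tesseract as a subprocess, and returns the recognized text.
func ocrViaExec(image []byte) (string, error) {
	dir, err := os.MkdirTemp("", "ocr-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir) // clean up input and output files

	inPath := filepath.Join(dir, "input")
	if err := os.WriteFile(inPath, image, 0o600); err != nil {
		return "", err
	}

	outBase := filepath.Join(dir, "output")
	cmd := exec.Command("tesseract", tesseractArgs(inPath, outBase)...)
	if out, err := cmd.CombinedOutput(); err != nil {
		return "", fmt.Errorf("tesseract failed: %w: %s", err, out)
	}

	text, err := os.ReadFile(outBase + ".txt")
	if err != nil {
		return "", err
	}
	return string(text), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: ocr-sketch <image-file>")
		return
	}
	image, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	text, err := ocrViaExec(image)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(text)
}
```

The trade-off matches the conversation: the subprocess route costs two file round trips per image but avoids depending on the C bindings, while a binding-based version could keep everything in memory.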
    simkimsia
    @simkimsia
    I was googling around for OCR as a service and your github came up
    How actively is the github repo maintained?
    simkimsia
    @simkimsia
    I have created an issue for this tleyden/open-ocr#52
    @tleyden Sorry I had to ping you directly. I was hoping you had an answer to this
    Traun Leyden
    @tleyden
    @simkimsia it's been maintained in the sense that it's been low maintenance, and I have been helping people that get stuck. Haven't added much in the way of new features, and I still need to get back to documenting and cleaning up the stroke width transform stuff.
    simkimsia
    @simkimsia
    @tleyden Thanks for clarifying.
    @tleyden I have somehow resolved my issue with docker-compose up by turning on my VPN. Not sure why.
    Traun Leyden
    @tleyden
    Just saw that, thanks for posting the follow up!
    That is strange, I'd ask the person maintaining your network to see if that is on purpose. Never seen that before.
    simkimsia
    @simkimsia
    I am running docker on my MacBook Pro
    so I am not sure what my RABBITMQ_HOST ip address is
    Traun Leyden
    @tleyden
    You are using docker compose right?
    simkimsia
    @simkimsia
    i used docker-compose up
    Traun Leyden
    @tleyden
    Actually would you mind opening a new ticket? I will tag as a question.. I think this will be useful to lots of people
    Things are much simpler with docker compose
    simkimsia
    @simkimsia
    @tleyden This is the issue. You can tag as question. tleyden/open-ocr#53
    simkimsia
    @simkimsia
    Any ideas?