Thiago Alessio Pereira
@thiagoalessio
@MabutasGroup not necessarily downgrade, just not using an alpha release might help.
4.0.0 and 4.1.0 are released already https://github.com/tesseract-ocr/tesseract/releases
@jona950 no idea ... I'd suggest you file an issue specifying all the versions / environment you are using so I can attempt to reproduce it. (the issue template will guide you on how to get all the necessary info)
nschleder
@nschleder
Is there a guide on how to install without composer, or is it required?
Thiago Alessio Pereira
@thiagoalessio
@nschleder it is not required, just clone / copy the source and require it from wherever you want to consume it ... composer just makes autoloading more convenient, but the library is made in pure PHP, nothing preventing you from using it in any way you see fit
nschleder
@nschleder
@thiagoalessio Thanks. I only ask because I'm getting "Class 'TesseractOCR' not found" error when requiring the TesseractOCR.php file and using the class
nschleder
@nschleder
@thiagoalessio I include the file, write use thiagoalessio\TesseractOCR\TesseractOCR; $ocr = new TesseractOCR(); and get "thiagoalessio\TesseractOCR\Command not found" (TesseractOCR.php, line 14). Am I missing something?
nschleder
@nschleder
@thiagoalessio Never mind. Probably namespace stuff I'm not familiar with. I'll figure it out
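A note on the "Command not found" error above: requiring only TesseractOCR.php does not load the other classes it depends on, which is the part Composer's autoloader normally handles. The following is a minimal sketch of a hand-written autoloader, not the library's official setup. It assumes the library's classes live one per file under a src/ directory of the cloned repository, and the clone location (TESSERACT_SRC_DIR) and the helper name tesseractClassToPath are hypothetical.

```php
<?php
// Hypothetical location of the cloned library's src/ directory (an assumption).
const TESSERACT_SRC_DIR = __DIR__ . '/lib/tesseract-ocr-for-php/src';

/**
 * Maps a class in the thiagoalessio\TesseractOCR namespace to a file path
 * under $baseDir. Returns null for classes outside that namespace.
 */
function tesseractClassToPath(string $class, string $baseDir): ?string
{
    $prefix = 'thiagoalessio\\TesseractOCR\\';
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    $relative = substr($class, strlen($prefix));
    return rtrim($baseDir, '/\\') . '/' . str_replace('\\', '/', $relative) . '.php';
}

// Load library classes on demand instead of requiring each file by hand.
spl_autoload_register(function (string $class): void {
    $path = tesseractClassToPath($class, TESSERACT_SRC_DIR);
    if ($path !== null && is_file($path)) {
        require $path;
    }
});
```

With this registered before the first use statement is exercised, new TesseractOCR() should be able to pull in Command and the other classes as they are referenced, provided the directory assumption holds for the version you cloned.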
Ninsew
@Ninsew
I am having the same issue as @nschleder. I have tried both including the TesseractOCR.php file, and use thiagoalessio\TesseractOCR. I have also tried running it using exec('tesseract') etc., but nothing happens (it works from Terminal). Is there anything I'm missing?
syngetprog
@syngetprog
Hi. I'm using TesseractOCR for php and I'd like to get the 'rotate' number that I can get when I run "tesseract 'file.png' - -psm 0" from the command line (..it detects if the image needs to be rotated). I need to use that number within my php script. Does anyone know if I can do this using the script?
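One possible approach to the question above, sketched under assumptions: once the orientation report is available as text in PHP (however it was captured), the rotate number can be pulled out with a regular expression. The function name parseRotate is hypothetical, and the sample report only illustrates the general "Rotate: <number>" shape of the output; the values in it are made up.

```php
<?php
/**
 * Extracts the "Rotate" value from the text of an orientation report.
 * Returns null when the text has no "Rotate: <number>" line.
 */
function parseRotate(string $osdReport): ?int
{
    if (preg_match('/^\s*Rotate:\s*(\d+)\s*$/m', $osdReport, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Illustrative report text (values are invented for the example).
$sample = "Page number: 0\n"
    . "Orientation in degrees: 270\n"
    . "Rotate: 90\n"
    . "Orientation confidence: 4.86\n";

echo parseRotate($sample), PHP_EOL;
```

Returning null rather than 0 for a missing line keeps "no rotation needed" distinguishable from "no report found".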
Thiago Alessio Pereira
@thiagoalessio
/all ANNOUNCEMENT: Important security fix addressed in version 2.9.3, please make sure y'all update it to stay secure.
skynettsoftware
@skynettsoftware
Hi - thanks for the great library. I've been looking through the docs and code but can't seem to find anything about outputting to multiple streams (like PDF and TXT) or re-using the same instance so it doesn't need to process the image more than once?
JPMulder
@JPMulder
Hi, I'm new to Tesseract. I got it all set up and working. I'd like to know if I can convert an image with monospaced text and columns into a CSV file for importing into my db. I can produce a text file, but it looks a bit unusable for a db import.
zlanich
@zlanich
Hey all! I'm very new to Tesseract, and I intend to use it to convert reports with rows of tabular data into usable text that I can sift through with regex, etc to produce records to insert into a database (I know, probably common). I'm having a hard time figuring out where to start in the learning process. Does anyone have any suggestions for what I should research? I'm very unfamiliar with OCR terminology, so I'm even having a hard time Googling how to do things and reading through docs, because even with nearly 10yrs of heavy software experience, the docs are kind of a foreign language to me atm. Any help would be very appreciated!
zlanich
@zlanich
Here's an example of me tossing a document into it and watching not-so-usable gobbledegook come out the other end, lol: http://admin.ocr.test/ocr
Angelos
@AngelosNaoumis
Hello, can I host my PHP code on cPanel, have Tesseract installed on my Windows machine, and still be able to run it? I tried the documented procedure, but I get the error: The command "tesseract" was not found, even though I can run tesseract commands fine locally in cmd.
Iip Muhamad Ikbal
@iipmuhamadikbal
Hi
can it be ocr by image url?
tesseract
yshyp
@yshyp
Fatal error: Uncaught Error: Class 'thiagoalessio\TesseractOCR\TesseractOCR' not found in H:\env\htdocs\tesrr\test.php:4 Stack trace: #0 {main} thrown in H:\env\htdocs\tesrr\test.php on line 4
getting error
I am new to Tesseract
please help
Mateus Cirino
@mateus-cirino
@yshyp
image.png
Robert Eichholtz
@eichie

Hi, I have a question, and I wonder why this happens. I got this error message:

Error! The command did not produce any output.

Generated command:

"tesseract" "/my/path/pdf-5efa072583fb18.61443940.png" "/tmp/ocrLtBUjw" -l deu -psm 0

Returned message:

Tesseract Open Source OCR Engine v3.04.01 with Leptonica

Does anyone have a hint for me? Thank you :-)

Thiago Alessio Pereira
@thiagoalessio
@eichie have you tried to run that same command directly on the command line to see if it shows more helpful error messages?
tesseract /my/path/pdf-5efa072583fb18.61443940.png /tmp/ocrLtBUjw -l deu -psm 0
Robert Eichholtz
@eichie
yes, the only output is Tesseract Open Source OCR Engine v3.04.01 with Leptonica
but I think the first one should be successful, right? And the second is a real error message

tesseract -v

tesseract 3.04.01
leptonica-1.74.1
libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.1) : libpng 1.6.28 : libtiff 4.0.8 : zlib 1.2.8 : libwebp 0.5.2 : libopenjp2 2.1.2
Thiago Alessio Pereira
@thiagoalessio
hm ... have you tried with different psm values, such as 3 ? maybe tesseract is not understanding how to read that particular image with psm 0.
oh, and of course ... cat /tmp/ocrLtBUjw.txt after that ... tesseract now supports output directly to stdout, but i'm not sure that is the case for 3.04
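When a command behaves differently from PHP than from the terminal, it can help to run it from PHP and look at the exit code, stdout and stderr separately. The sketch below is a generic diagnostic helper, not part of the library; the name runAndCapture is hypothetical. It uses PHP's built-in proc_open and is meant for short diagnostic output only.

```php
<?php
/**
 * Runs a shell command and returns its exit code, stdout and stderr separately.
 */
function runAndCapture(string $command): array
{
    $descriptors = [
        1 => ['pipe', 'w'], // stdout
        2 => ['pipe', 'w'], // stderr
    ];
    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException("Could not start: $command");
    }
    // Reading the pipes one after the other is fine for short output;
    // a command that writes a lot to stderr could block here.
    $stdout = stream_get_contents($pipes[1]);
    $stderr = stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    $exitCode = proc_close($process);

    return ['exitCode' => $exitCode, 'stdout' => $stdout, 'stderr' => $stderr];
}

// Example: check whether the PHP process can find and run tesseract at all.
var_dump(runAndCapture('tesseract --version'));
```

Running this through the web server rather than the CLI shows what the web server's user and PATH can actually see, which may differ from an interactive shell.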
TutanRamon
@TutanRamon
Hi all. I am trying to use Tesseract within Laravel. I have a simple setup now, but I get this exception and I am not sure how to fix it:
Method thiagoalessio\TesseractOCR\Command::__toString() must not throw an exception, caught ErrorException: Undefined offset: 1
Thiago Alessio Pereira
@thiagoalessio
@TutanRamon hm, that really looks like an issue ... would you mind filling in this template with more details about your PHP version, etc?
https://github.com/thiagoalessio/tesseract-ocr-for-php/issues/new?template=Bug_report.md
itchytendai
@itchytendai
Hi all, I'm using symfony 5.1, Tesseract works fine when the path to image is hard coded like this : $ocr->image('/public/encheres/img_20200612_143224-5efd845f41afc.jpeg'). But when I try to replace the string with a variable like this : $fileName = '/public/encheres/img_20200612_143224-5efd845f41afc.jpeg'; and $ocr->image($fileName); I don't get any response nor error. Would you be so kind and tell me what I did wrong? Thanks*
Thiago Alessio Pereira
@thiagoalessio
@itchytendai can you share the whole code snippet you are using?
itchytendai
@itchytendai

Yes of course, here is the non-working snippet:

/**
 * @Route("/show2/{id}", name="encheres_show2", methods={"GET","POST"})
 */
public function show2(EncheresRepository $encheresRepository, Encheres $enchere): Response
{
    $fileName = '/public/encheres/img_20200612_143224-5efd845f41afc.jpeg';
    $ocr = new TesseractOCR();
    $ocr->image($fileName);
    $coco = $ocr->run();

    return $this->render('encheres/show2.html.twig', [
        'ocr' => $coco,
    ]);
}

But with $ocr->image('/public/encheres/img_20200612_143224-5efd845f41afc.jpeg'); it's working properly. Thank you very much.

AlisaFeld
@AlisaFeld

Hello! Thank you very much for this wrapper. It works really well when I want a .txt file as output. But when I try to get a PDF file I get the following exception: "Warning: Invalid resolution 0 dpi. Using 70 instead. Estimating resolution as 311"

I am using it with Tesseract 4.1.1 on Clear Linux, PHP Version 7.4 (Laravel).

I know why this exception is thrown, but with this wrapper, I get no resulting document.

When I try the same picture in the terminal on my Mac (same Tesseract Version) I also get the exception but also a usable document.

The code:

public function tessPDF()
{
    $ocr = new TesseractOCR();
    $ocr->image($this->path.$this->filedata);
    $ocr->lang('deu');
    $ocr->tessdataDir($this->helppath."/trainingdata");
    $ocr->pdf();
    $ocr->setOutputFile($this->path.'text.pdf');
    $ocr->run();
}

I would really appreciate any help!

AlisaFeld
@AlisaFeld

Same problem with an hocr file.

Error! The command did not produce any output.

Generated command:

"tesseract" "/var/www/storage/app/public/images/VgA2BalDuIH283VAH4CznfzuQbBPauDhM8yaE1JB.jpeg" "/tmp/ocrPyAgbm" -l deu --oem 1 -c "tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyz" --tessdata-dir "/var/www/storage/app/trainingdata" --user-words "/var/www/storage/appfachwoerter.txt" hocr

Returned message:

read_params_file: Can't open hocr
Tesseract Open Source OCR Engine v4.0.0 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 1265
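A possible lead for the "read_params_file: Can't open hocr" line, offered as a guess rather than a confirmed diagnosis: hocr and pdf are config files that Tesseract typically looks up in a configs subdirectory of the tessdata directory, so a custom --tessdata-dir that only contains traineddata files may be missing them. The sketch below checks for that; the function name missingTessConfigs is hypothetical, and the configs/ layout is an assumption about the installation.

```php
<?php
/**
 * Returns the names of output configs (e.g. "hocr", "pdf") that are missing
 * from the "configs" subdirectory of the given tessdata directory.
 */
function missingTessConfigs(string $tessdataDir, array $configs = ['hocr', 'pdf']): array
{
    $missing = [];
    foreach ($configs as $name) {
        $path = rtrim($tessdataDir, '/\\') . '/configs/' . $name;
        if (!is_file($path)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Directory taken from the generated command above.
print_r(missingTessConfigs('/var/www/storage/app/trainingdata'));
```

If the list is not empty, copying the configs directory from the system tessdata installation into the custom directory might be worth trying.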

mallugunjate
@mallugunjate
Hi...
All
Good evening
This is Mallappa from India
Hi all, I need some help: how can we use TesseractOCR without installing it via Composer?
Is there any way?
I ask because I have an application which is not using Composer
Ashwini Parihar
@pariharashwini
Hi @thiagoalessio: I am new to Tesseract. For images it is working fine on my end. I just want to know, will it also work for scanned PDF files?
Thiago Alessio Pereira
@thiagoalessio

@pariharashwini i guess pdf is not readable by tesseract 3.x (maybe 4.x already supports it)

what you can do is to convert it to TIFF first and then send it to tesseract
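The conversion step could be sketched as follows. This is one option among several, not something the library provides: it assumes the pdftoppm tool from Poppler is installed on the server, and the function name buildPdfToTiffCommand and the file paths are hypothetical. The function only builds the command string; running it is left to exec() or similar.

```php
<?php
/**
 * Builds a pdftoppm command that renders each page of a PDF to a TIFF image.
 * pdftoppm writes one image per page, named from $outputPrefix.
 */
function buildPdfToTiffCommand(string $pdfPath, string $outputPrefix, int $dpi = 300): string
{
    return sprintf(
        'pdftoppm -r %d -tiff %s %s',
        $dpi,
        escapeshellarg($pdfPath),
        escapeshellarg($outputPrefix)
    );
}

// Hypothetical paths, for illustration only.
echo buildPdfToTiffCommand('/tmp/scan.pdf', '/tmp/scan-page'), PHP_EOL;
```

Each resulting page image can then be passed to $ocr->image(...) in turn.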

Ashwini Parihar
@pariharashwini
Thanks @thiagoalessio for your quick reply but my requirement is to read 5-6 scanned pdf of 100-200 pages simultaneously. So TIFF option is not workable for me.
pkumar13
@pkumar13
Hi @thiagoalessio: I am new to Tesseract. I would like to develop a project for mcr red. Can you help me?