Arky
@arky
Thanks. Not sure a Makefile would work for multiple OSes without modification.
Arky
@arky
Can someone tell me what the solution is for the 'No module named click' issue on GNU/Linux?
./excalibur-ubuntu-latest-x64
Traceback (most recent call last):
  File "arthur.py", line 5, in <module>
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/PyInstaller/loader/pyimod03_importers.py", line 623, in exec_module
  File "excalibur/cli.py", line 5, in <module>
ModuleNotFoundError: No module named 'click'
[12776] Failed to execute script arthur
Vinayak Mehta
@vinayak-mehta
Oh you'll need to do pip install click for that
Arky
@arky
Hey, I have made a usability suggestion to better handle Excalibur's missing requirements on MS Windows. Looking forward to your feedback: camelot-dev/excalibur#111
Arky
@arky
Please download and test the latest Excalibur 0.4.3: https://github.com/camelot-dev/excalibur/releases/tag/v0.4.3
Arky
@arky
@vinayak-mehta I have been trying to test Excalibur with Python 3 (32-bit) on Windows 7. I got stuck with 'backports' module not found errors. Still trying to figure it out.
Vinayak Mehta
@vinayak-mehta
Can you post the full traceback here?
Or in a gist / pastebin
nftopham
@nftopham
Hello, I am getting a huge number of debug messages when running Camelot. The extraction works fine, but passing suppress_warnings=True does not do anything.
They are all log/debug messages from pdfminer.
nftopham
@nftopham
I have disabled them manually via logging.getLogger("pdfminer").setLevel(logging.WARNING) but this is not really desirable
Vinayak Mehta
@vinayak-mehta
@nftopham I understand, thanks for reporting it here. I'll start work on fixing logging and the CLI's terminal output in general soon.
nftopham
@nftopham
Thanks, it's great otherwise
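The manual workaround nftopham mentions can be written as a small helper. This is only a sketch of that workaround, not a fix inside Camelot itself; the function name `quiet_pdfminer` is made up for illustration, and it assumes pdfminer logs under a logger named "pdfminer" (as the message above implies).

```python
import logging


def quiet_pdfminer(level=logging.WARNING):
    """Raise the threshold of the 'pdfminer' logger (hypothetical helper).

    Child loggers such as 'pdfminer.psparser' inherit this level as long as
    they have no level of their own, so their DEBUG/INFO records are dropped.
    """
    logger = logging.getLogger("pdfminer")
    logger.setLevel(level)
    return logger


# Call this once before running camelot.read_pdf(...)
quiet_pdfminer()
```

Setting the level on the parent logger leaves the root logger untouched, so the application's own logging configuration is not affected.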
nftopham
@nftopham
Hi @vinayak-mehta I wanted to share with you a problem I had with Camelot and the solution
so I was getting a NotImplementedError because the PDF version I was reading had an unsupported encryption protocol, as stated on the camelot docs
so I searched for some solutions and ended up re-writing the file using ghostscript and downgrading the version. this actually completely removed the encryption which is quite funny. so much for password protected PDFs!
here is my solution, a bit messy right now but you get the gist. would be great if this could be included in future releases as there are only going to be more PDFs written > version 1.4 and PyPDF2 seems to be not interested in a fix
import os
import camelot

try:
    tables = camelot.read_pdf(**camelot_params)
except NotImplementedError:
    # Rewrite the PDF with Ghostscript at compatibility level 1.4, then read the rewritten copy
    output = os.system('gswin64c -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dSAFER -dNOPAUSE -dBATCH -o temp.pdf C:/Users/User/Desktop/input.pdf')
    url = os.path.join(os.getcwd(), "temp.pdf")
    camelot_params = get_camelot_params(meta, url)
    tables = camelot.read_pdf(**camelot_params)
Vinayak Mehta
@vinayak-mehta
@nftopham Did you also try qpdf like mentioned in the docs? https://camelot-py.readthedocs.io/en/master/user/quickstart.html#reading-encrypted-pdfs
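For reference, the qpdf route from the linked docs could be scripted roughly as below. This is a sketch under assumptions: it assumes the `qpdf` executable is on PATH, and the helper names `build_qpdf_decrypt_cmd` and `decrypt_pdf` are invented here for illustration.

```python
import subprocess


def build_qpdf_decrypt_cmd(src, dst, password=""):
    """Build the argument list for `qpdf --password=... --decrypt src dst`."""
    return ["qpdf", f"--password={password}", "--decrypt", src, dst]


def decrypt_pdf(src, dst, password=""):
    """Write a decrypted copy of src to dst using qpdf (assumed to be on PATH).

    check=True raises CalledProcessError on any non-zero exit status;
    note that qpdf can also exit non-zero when it only emits warnings.
    """
    subprocess.run(build_qpdf_decrypt_cmd(src, dst, password), check=True)
    return dst


# Example (not executed here):
# decrypt_pdf("input.pdf", "output.pdf", password="secret")
# tables = camelot.read_pdf("output.pdf")
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.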
Arky
@arky
I think it would be great to have an OCR feature in Camelot/Excalibur; it would be really helpful for creating open datasets from PDF documents such as https://data.opendevelopmentmekong.net/dataset/facility-quarantine-is-necessary-for-returners-from-other-state-and-region-mandalay
Vinayak Mehta
@vinayak-mehta
Yes, I'm trying to find time to experiment with https://github.com/JaidedAI/EasyOCR as it says that it works with different languages and even on snapshots.
Arky
@arky
Great
SolarDesalination
@SolarDesalination
Hi, I've installed excalibur, but when I write excalibur initdb in the python terminal, it says invalid syntax
squareofseo
@squareofseo
Hello, I have a question.
RuntimeError: Please make sure that Ghostscript is installed
I want to solve this error.. ㅠ^ㅠ

OSError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/camelot/ext/ghostscript/_gsprint.py in <module>()
259 try:
--> 260 libgs = cdll.LoadLibrary("libgs.so")
261 except OSError:

8 frames
OSError: libgs.so: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/camelot/ext/ghostscript/_gsprint.py in <module>()
265 libgs = ctypes.util.find_library("gs")
266 if not libgs:
--> 267 raise RuntimeError("Please make sure that Ghostscript is installed")
268 libgs = cdll.LoadLibrary(libgs)
269

RuntimeError: Please make sure that Ghostscript is installed

This error..
Vinayak Mehta
@vinayak-mehta
Looks like ghostscript isn't available on your PATH, how did you install it? These are the install instructions: https://camelot-py.readthedocs.io/en/master/user/install-deps.html
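A quick way to check whether the lookup from the traceback above would succeed is to repeat it by hand. This is only a diagnostic sketch: it mirrors the `ctypes.util.find_library("gs")` call shown in the traceback on Linux, the function name `find_ghostscript_library` is made up here, and the library name on other platforms may differ.

```python
import ctypes.util


def find_ghostscript_library():
    """Return the name/path of the Ghostscript shared library, or None if not found.

    Uses the same lookup as the traceback above: ctypes.util.find_library("gs").
    """
    return ctypes.util.find_library("gs")


lib = find_ghostscript_library()
if lib is None:
    print("Ghostscript shared library not found; install it using the instructions linked above.")
else:
    print(f"Found Ghostscript shared library: {lib}")
```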
Shivam-Fullstack
@Shivam-Fullstack
Hi @vinayak-mehta, I'm using camelot-py==0.7.3 and reading a table sometimes hangs indefinitely, so I have to re-run the scheduler, after which the same PDF gets read. Is this a known issue? Please suggest how I can fix it.
Shivam-Fullstack
@Shivam-Fullstack
Anyone, please answer my question.
AndrewDaher
@AndrewDaher
hey guys, having some issues with excalibur
trying to run on windows
tried the executable but doesn't work properly, had a bunch of issues, so trying the manual way, but can't get past this step
$ excalibur initdb
image.png
Arky
@arky
@AndrewDaher I think it might work if you use it within a virtualenv: 'python -m venv ./venv'
Vinayak Mehta
@vinayak-mehta
@Shivam-Fullstack Sorry for the late reply. This is not a known issue. Does it only happen on that particular PDF? There might be a problem with the file itself.
@AndrewDaher python -m excalibur initdb should also work.
Arky
@arky
@vinayak-mehta Let's catch up, just put something up on your calendar.
Vinayak Mehta
@vinayak-mehta
:+1:
Vinayak Mehta
@vinayak-mehta
Would love to get everyone's thoughts on this: camelot-dev/camelot#233
Arky
@arky
@vinayak-mehta How do you update the docs theme 'alabaster' to the latest version? "-e git+https://github.com/bitprophet/alabaster/@3b68afcfe55a80508254b22904294100a160e6a7#egg=alabaster"
avidalonc
@avidalonc
Hello everybody, I want to export all tables from my PDF into an Excel or CSV file, but it only exports the first page, even though I use this code to read all the tables: tables = camelot.read_pdf(file, pages="all") Could you help me with this pls?
Vinayak Mehta
@vinayak-mehta
Hi @avidalonc you can export all tables into multiple csvs by following the docs here: https://camelot-py.readthedocs.io/en/master/user/quickstart.html
To get a single csv, you'll have to combine all table dataframes into one using Python code, and then export that single dataframe
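Combining the table dataframes as suggested could look like the sketch below. It assumes each extracted table exposes its data as a pandas DataFrame through a `.df` attribute and that all tables have the same number of columns; the helper name `combine_tables` and the output filename are made up for illustration.

```python
import pandas as pd


def combine_tables(tables):
    """Stack the DataFrame of every table into one DataFrame (hypothetical helper).

    `tables` is any iterable of objects with a `.df` attribute holding a
    pandas DataFrame. Rows are renumbered from 0 in the combined frame.
    Raises ValueError if `tables` is empty, because pd.concat needs at least one frame.
    """
    frames = [table.df for table in tables]
    return pd.concat(frames, ignore_index=True)


# Example (not executed here):
# tables = camelot.read_pdf(file, pages="all")
# combine_tables(tables).to_csv("all_tables.csv", index=False, header=False)
```

If the tables do not share the same columns, the stacked result will contain NaN cells where a table has no value, so it is worth checking the column counts first.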
Gaurav
@gggauravgandhi
The cell coords are always calculated at 72 dpi, regardless of what dpi I pass to read_pdf. Is this a correct assumption?
Pankaj S Y
@pankajsy9_twitter
Can "camelot" run on Python 2?
Pankaj S Y
@pankajsy9_twitter
Is it possible to detect tables ?