phdkiran
@phdkiran
is there an option to complement strip_text like replace_text?
I am trying to insert a space for a new line during the table detection
Mariano Rodriguez
@marianorodriguez
Hello! Can someone explain to me the difference between pip install camelot-py and pip install camelot-py[cv]? Which of those should I install to use Camelot inside a Python script?
Vinayak Mehta
@vinayak-mehta
@phdkiran You can do that in the pandas dataframe itself
@marianorodriguez Please use pip install camelot-py[cv]
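A minimal sketch of the pandas approach suggested above for replacing newlines with a space instead of stripping them. The small DataFrame is a made-up stand-in for `tables[0].df`; it assumes the cells are strings, which is how Camelot returns them.

```python
import pandas as pd

# Hypothetical stand-in for tables[0].df from camelot.read_pdf(...).
df = pd.DataFrame({0: ["Name", "Alice\nSmith"], 1: ["City", "New\nYork"]})

# Replace every embedded newline with a single space, in all cells.
cleaned = df.replace(r"\n", " ", regex=True)

print(cleaned)
```

With a real extraction the same call would presumably be applied to `tables[0].df` after `read_pdf` returns, leaving `strip_text` unset so the newlines are still present to replace.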
Arky
@arky
@vinayak-mehta I'm trying to adopt Excalibur for my workshops in SE Asia. Is it possible to build executables for easy install on Windows, Mac, and Linux, both 32-bit and 64-bit, perhaps using PyInstaller or something similar? This would save me a lot of time and also drive adoption.
Vinayak Mehta
@vinayak-mehta
Hi @arky did you try the existing Win and Linux (should work on macOS) executables? https://github.com/camelot-dev/excalibur/releases/tag/v0.4.0
Ghostscript still needs to be installed separately for these
Arky
@arky
@vinayak-mehta Sweet, I wasn't aware they were already available. I am going to give them a shot.
Vinayak Mehta
@vinayak-mehta
Please let me know if you face any problems. And if you have any ideas for the following issues, please comment. Solving these issues will make installation easy. I haven't been getting time from the day job to work on them.
Arky
@arky
Will do. For starters, I think it is much simpler to do user education: providing clear instructions on how to install the tools and where to get them would solve most issues. Perhaps a wiki page or dedicated website/page, i.e. get-camelot.github.io, with a big blue button to get the exe for the target OS along with links to dependencies, would drive adoption.
slhappyls
@slhappyls
Chinese user is saying hi
Arky
@arky
@vinayak-mehta Any update on camelot-dev/excalibur#99
@vinayak-mehta These are quite important for wider adoption of the tools in civil societies out here
Arky
@arky
@vinayak-mehta Are these the correct commands to generate the current Excalibur executables for GNU/Linux and macOS?
pyi-makespec --paths=excalibur/executors/celery_executor.py arthur.py
pyinstaller --onefile --add-data "excalibur/www/templates:excalibur/www/templates" --add-data "excalibur/www/static:excalibur/www/static" --add-data "excalibur/config_templates:excalibur/config_templates" arthur.py
Vinayak Mehta
@vinayak-mehta
Arky
@arky
Thanks. Not sure the Makefile works for multiple OSes without modification.
Arky
@arky
Can someone tell me what the solution is for the 'No module named click' issue on GNU/Linux?
./excalibur-ubuntu-latest-x64
Traceback (most recent call last):
  File "arthur.py", line 5, in <module>
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/PyInstaller/loader/pyimod03_importers.py", line 623, in exec_module
  File "excalibur/cli.py", line 5, in <module>
ModuleNotFoundError: No module named 'click'
[12776] Failed to execute script arthur
Vinayak Mehta
@vinayak-mehta
Oh you'll need to do pip install click for that
Arky
@arky
Hey, I have made a usability suggestion to better handle Excalibur's missing requirements on MS Windows. Looking forward to your feedback: camelot-dev/excalibur#111
Arky
@arky
Please download and test the latest Excalibur 0.4.3: https://github.com/camelot-dev/excalibur/releases/tag/v0.4.3
Arky
@arky
@vinayak-mehta I have been trying to test Excalibur with Python 3 (32bit) Windows 7. I got stuck with 'backports' module not found errors. Still trying to figure it out.
Vinayak Mehta
@vinayak-mehta
Can you post the full traceback here?
Or in a gist / pastebin
nftopham
@nftopham
Hello, I am getting a huge amount of debug messages when running Camelot. The extraction works fine and passing suppres_warnings=True does not do anything.
they are all logs/debug messages from pdfminer
nftopham
@nftopham
I have disabled them manually via logging.getLogger("pdfminer").setLevel(logging.WARNING) but this is not really desirable
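The workaround above can be wrapped in a small helper. This is only a sketch of the same standard-library `logging` call; the function name `quiet_pdfminer` is made up for illustration and is not part of Camelot or pdfminer.

```python
import logging


def quiet_pdfminer(level=logging.WARNING):
    """Raise the threshold of the 'pdfminer' logger (hypothetical helper)."""
    logger = logging.getLogger("pdfminer")
    logger.setLevel(level)
    return logger


# Call once, before camelot.read_pdf(...).
quiet_pdfminer()

# Child loggers such as 'pdfminer.pdfinterp' inherit the level
# as long as they have no level of their own set.
child = logging.getLogger("pdfminer.pdfinterp")
print(child.isEnabledFor(logging.DEBUG))
```

Setting the level on the parent `pdfminer` logger is enough because Python loggers fall back to the nearest ancestor with an explicit level.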
Vinayak Mehta
@vinayak-mehta
@nftopham I understand, thanks for reporting it here. I'll start work on fixing logging and the CLI's terminal output in general soon.
nftopham
@nftopham
Thanks, it's great otherwise
nftopham
@nftopham
Hi @vinayak-mehta I wanted to share with you a problem I had with Camelot and the solution
so I was getting a NotImplementedError because the PDF version I was reading had an unsupported encryption protocol, as stated on the camelot docs
so I searched for some solutions and ended up re-writing the file using ghostscript and downgrading the version. this actually completely removed the encryption which is quite funny. so much for password protected PDFs!
here is my solution, a bit messy right now but you get the gist. would be great if this could be included in future releases as there are only going to be more PDFs written > version 1.4 and PyPDF2 seems to be not interested in a fix
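import os
import camelot

try:
    tables = camelot.read_pdf(**camelot_params)
except NotImplementedError:
    # Unsupported encryption: rewrite the file with Ghostscript as PDF 1.4, then retry.
    os.system('gswin64c -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dSAFER -dNOPAUSE -dBATCH -o temp.pdf C:/Users/User/Desktop/input.pdf')
    url = os.path.join(os.getcwd(), "temp.pdf")
    # get_camelot_params() and meta come from the surrounding script (not shown).
    camelot_params = get_camelot_params(meta, url)
    tables = camelot.read_pdf(**camelot_params)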
Vinayak Mehta
@vinayak-mehta
@nftopham Did you also try qpdf like mentioned in the docs? https://camelot-py.readthedocs.io/en/master/user/quickstart.html#reading-encrypted-pdfs
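For comparison, a rough sketch of the qpdf route driven from Python. It assumes `qpdf` is installed and on PATH and uses its `--password=` and `--decrypt` options; the helper names and file names are placeholders, not anything provided by Camelot.

```python
import shutil
import subprocess


def build_qpdf_decrypt_command(src, dst, password=""):
    """Build the qpdf argument list that writes a decrypted copy of src to dst."""
    return ["qpdf", f"--password={password}", "--decrypt", src, dst]


def decrypt_pdf(src, dst, password=""):
    """Run qpdf to decrypt src into dst (hypothetical helper)."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf was not found on PATH")
    # check=True raises CalledProcessError if qpdf exits with an error.
    subprocess.run(build_qpdf_decrypt_command(src, dst, password), check=True)
    return dst


# Placeholder file names; print the command without running it.
print(build_qpdf_decrypt_command("input.pdf", "decrypted.pdf", "secret"))
```

The decrypted copy could then be passed to `camelot.read_pdf` in place of the original file.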
Arky
@arky
I think it would be great to have an OCR feature in Camelot/Excalibur. It would be really helpful for creating open datasets from PDF documents such as https://data.opendevelopmentmekong.net/dataset/facility-quarantine-is-necessary-for-returners-from-other-state-and-region-mandalay
Vinayak Mehta
@vinayak-mehta
Yes, I'm trying to find time to experiment with https://github.com/JaidedAI/EasyOCR as it says that it works with different languages and even on snapshots.
Arky
@arky
Great
SolarDesalination
@SolarDesalination
Hi, I've installed Excalibur, but when I type excalibur initdb in the Python terminal, it says invalid syntax
squareofseo
@squareofseo
Hello, I have a question..
RuntimeError: Please make sure that Ghostscript is installed
I want to solve this error..ㅠ^ㅠ

OSError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/camelot/ext/ghostscript/_gsprint.py in <module>()
259 try:
--> 260 libgs = cdll.LoadLibrary("libgs.so")
261 except OSError:

8 frames
OSError: libgs.so: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/camelot/ext/ghostscript/_gsprint.py in <module>()
265 libgs = ctypes.util.find_library("gs")
266 if not libgs:
--> 267 raise RuntimeError("Please make sure that Ghostscript is installed")
268 libgs = cdll.LoadLibrary(libgs)
269

RuntimeError: Please make sure that Ghostscript is installed

this error..
Vinayak Mehta
@vinayak-mehta
Looks like ghostscript isn't available on your PATH, how did you install it? These are the install instructions: https://camelot-py.readthedocs.io/en/master/user/install-deps.html
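A quick way to check what Python can see, sketched with the same `ctypes.util.find_library("gs")` lookup that appears in the traceback above. The function name and the list of executable names tried are assumptions made for illustration.

```python
import ctypes.util
import shutil


def find_ghostscript():
    """Report where (if anywhere) Ghostscript is visible to this Python process."""
    # Shared library lookup, as in camelot/ext/ghostscript/_gsprint.py above.
    library = ctypes.util.find_library("gs")
    # Common executable names: 'gs' on Linux/macOS, 'gswin64c'/'gswin32c' on Windows.
    candidates = ("gs", "gswin64c", "gswin32c")
    executable = next((p for p in map(shutil.which, candidates) if p), None)
    return {"library": library, "executable": executable}


print(find_ghostscript())
```

If both values come back as `None`, Ghostscript is most likely not installed or not discoverable from this environment, which matches the RuntimeError shown.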
Shivam-Fullstack
@Shivam-Fullstack
Hi @vinayak-mehta, I'm using camelot-py==0.7.3 and I'm facing an infinite wait while reading tables, so I have to re-run the scheduler, and then the same PDF gets read. Is this a known issue? Please suggest how I can fix it.
Shivam-Fullstack
@Shivam-Fullstack
Anyone, please answer my question
AndrewDaher
@AndrewDaher
hey guys, having some issues with excalibur
trying to run on windows