Vinayak Mehta
@vinayak-mehta
Hello world!
lsternlicht
@lsternlicht
Anyone know an HTML table parsing library as good as camelot?
Vinayak Mehta
@vinayak-mehta
@lsternlicht HTML table parsing is way more deterministic than PDF table parsing. pandas.read_html works most of the time for me.
Oleg Gavrilov
@OlegGavrilov
Hello guys! Can anyone help me out with this? I need to strip the non-breaking space character from my output, but -strip '\u00a0' doesn't work.
Any other options I can try?
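One possible workaround, offered as a sketch and not taken from the chat: a shell most likely passes '\u00a0' to the command line as six literal characters, not as the non-breaking space itself, so one option is to clean the cell text in Python after extraction. The `rows` data and the `clean_cell` helper below are hypothetical stand-ins for extracted table output.

```python
NBSP = "\u00a0"  # inside Python source, this escape is the real non-breaking space


def clean_cell(text, replacement=" "):
    """Replace non-breaking spaces, then trim surrounding whitespace."""
    return text.replace(NBSP, replacement).strip()


# Hypothetical rows standing in for an extracted table.
rows = [["Item\u00a0A", "\u00a01,000"], ["Item\u00a0B", "2,500\u00a0"]]
cleaned = [[clean_cell(cell) for cell in row] for row in rows]
print(cleaned)
```

The same helper could be applied to every cell of a DataFrame if the output is loaded into pandas.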
Deepak Dhaka
@dhaka22
Hi Vinayak, I am working on table extraction and camelot is giving me the content of one column in a single row. How do I handle that? Also, it is not working with borderless tables.
essentialols
@essentialols
Hi Vinayak, I'm trying to use camelot but I receive different kinds of error messages. The last error I received was OSError: [Errno 22] Invalid argument
Dimiter Naydenov
@dimitern
@vinayak-mehta Hey, do you think you'll have time to fix the TravisCI setup for Camelot after yesterday's renaming of the repo to atlanhq/camelot ?
Vinayak Mehta
@vortex_ape_twitter
I'm fixing it today.
Vinayak Mehta
@vortex_ape_twitter
I've fixed the failing tests. Travis now runs on https://github.com/camelot-dev/camelot. We can continue development on there.
Dimiter Naydenov
@dimitern
Awesome! I've some PRs to propose :)
Vinayak Mehta
@vinayak-mehta
Camelot v0.7.3 released. This is a bugfix release.
Abhi0495
@Abhi0495
Hi Vinayak, I am having an issue reading tables. An exception is appearing: "OSError: exception: access violation writing 0x16F3B7B0". Could you please suggest how to resolve this?
Attila Skalina
@Synzzz
Hi, by any chance did anyone create a java wrapper for camelot?
Attila Skalina
@Synzzz
Also, what's the situation with Ghostscript having a paid commercial license while Camelot itself has an MIT license?
Éléonore
@Eleonore9
Hello!
I'm failing to extract a PDF table using Excalibur and would love to have some sample data, like a simple PDF that should work for sure.
Éléonore
@Eleonore9
@Eleonore9 I've selected a table and I'm stuck on a 'Refresh' page like camelot-dev/excalibur#69
Attila Skalina
@Synzzz
did you refresh? how much time did you wait?
Pravar Agrawal
@pravarag
@vinayak-mehta is there any way I can point my virtual environment to my local camelot in order to test local changes?
Vinayak Mehta
@vinayak-mehta
@pravarag You can create a new virtual env altogether and then install Camelot in editable mode.
@pravarag These are some of the easy open issues that you could pick up:
Pravar Agrawal
@pravarag
@vinayak-mehta sure. Thanks :)
Pravar Agrawal
@pravarag
@vinayak-mehta is pip install camelot-py[dev] same for editable mode?
nightwarrior-xxx
@nightwarrior-xxx
@vinayak-mehta Can you explain again how camelot calculates the accuracy? Correct me if I am wrong: first the coordinates of the PDF tables are calculated, then the coordinates of each cell are calculated, and after combining the cells we get the whole table again and calculate the coordinates from that.
Pravar Agrawal
@pravarag
@vinayak-mehta I was able to run camelot-py with changes to stream.py in reference to the following issue: camelot-dev/camelot#88. Now, while trying to handle the exception for no text being present in (xmin, ymin, xmax, ymax), I'm wondering where text_bbox should be defined. Otherwise I'm greeted with a "variable referenced before assignment" error for (xmin, ymin, xmax, ymax). Any suggestions?
Vinayak Mehta
@vinayak-mehta

@vinayak-mehta is pip install camelot-py[dev] same for editable mode?

pip install -e . for editable mode

@nightwarrior-xxx
  1. Calculate table coordinates (which include cell coordinates)
  2. Get list of text boxes from PDF
  3. Assign text boxes one by one, checking each one's overlap with a table cell. The more the overlap, the better the accuracy.
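The three steps above can be sketched roughly as follows. This is a minimal illustration and not Camelot's actual implementation: the function names, the (x0, y0, x1, y1) box layout, and the rule of scoring accuracy as the mean best overlap are all assumptions.

```python
def overlap_fraction(text_box, cell):
    """Fraction of the text box's area that lies inside the cell.

    Boxes are (x0, y0, x1, y1) tuples with x0 < x1 and y0 < y1.
    """
    tx0, ty0, tx1, ty1 = text_box
    cx0, cy0, cx1, cy1 = cell
    width = max(0.0, min(tx1, cx1) - max(tx0, cx0))
    height = max(0.0, min(ty1, cy1) - max(ty0, cy0))
    area = (tx1 - tx0) * (ty1 - ty0)
    return (width * height) / area if area > 0 else 0.0


def assign_text_boxes(text_boxes, cells):
    """Assign each text box to the cell it overlaps most.

    Returns (assignments, accuracy). The accuracy is the mean best
    overlap as a percentage, which is an assumed scoring rule.
    """
    if not text_boxes or not cells:
        return [], 0.0
    assignments, scores = [], []
    for box in text_boxes:
        fractions = [overlap_fraction(box, cell) for cell in cells]
        best = max(fractions)
        # None marks a text box that touches no cell at all.
        assignments.append(fractions.index(best) if best > 0 else None)
        scores.append(best)
    return assignments, 100.0 * sum(scores) / len(scores)


# Two side-by-side cells; the second text box straddles their shared edge.
cells = [(0, 0, 10, 10), (10, 0, 20, 10)]
text_boxes = [(1, 1, 5, 5), (9, 2, 13, 6)]
assignments, accuracy = assign_text_boxes(text_boxes, cells)
print(assignments, accuracy)
```

A text box that sits fully inside one cell contributes a full score, while one that straddles a cell border lowers the overall figure.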
Vinayak Mehta
@vinayak-mehta
@pravarag Not sure about your question. Can you point me to the line where you're trying to do this? A simple try..except should do the trick.
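A "referenced before assignment" error after a try..except usually means the except branch fell through to code that uses names only assigned inside the try. A minimal sketch of one way to avoid it, assuming the failure is min()/max() being called on an empty sequence; the function name and the None fallback are hypothetical and not part of Camelot:

```python
def text_bbox_or_none(boxes):
    """Return the bounding box around all boxes, or None if there are none.

    `boxes` is a list of (x0, y0, x1, y1) tuples.
    """
    try:
        xmin = min(b[0] for b in boxes)
        ymin = min(b[1] for b in boxes)
        xmax = max(b[2] for b in boxes)
        ymax = max(b[3] for b in boxes)
    except ValueError:
        # min() and max() raise ValueError on an empty sequence. Returning
        # here means the line below never runs with xmin..ymax unassigned.
        return None
    return (xmin, ymin, xmax, ymax)


print(text_bbox_or_none([(1, 2, 3, 4), (0, 5, 2, 9)]))
print(text_bbox_or_none([]))
```

Building the tuple after the except block without returning or assigning defaults first is what triggers the error, so the caller has to handle the None case instead.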
Pravar Agrawal
@pravarag
@vinayak-mehta I'm trying to put a try around this line: https://github.com/camelot-dev/camelot/blob/master/camelot/parsers/stream.py#L98 . Once this has been handled, I've put "text_bbox" after the except for now, and that is where I'm getting the above-mentioned error.
Pravar Agrawal
@pravarag
@vinayak-mehta I've submitted a PR for this. Kindly review it and let me know if any changes are needed.
Vinayak Mehta
@vinayak-mehta
I'll check it out today! :)
Pravar Agrawal
@pravarag
Sure
Pravar Agrawal
@pravarag
@vinayak-mehta could you please review my pull request so that I can make further changes if required.
Vinayak Mehta
@vinayak-mehta
Yep I'll check it out today
Vinayak Mehta
@vinayak-mehta
@pravarag This weekend for sure, sorry for the lateness.
Pravar Agrawal
@pravarag
no problem @vinayak-mehta even I'm enjoying festival season :D
Pravar Agrawal
@pravarag
@vinayak-mehta did you check the PR?
abhishekasodaria
@abhishekasodaria
Hello, I am trying to install camelot, but it shows that the cv version has no matching distribution.