    stefanCCS
    @stefanCCS
    I call ocrd-cis-ocropy-resegment -I $inFolder -O $outFolder -P level-of-operation region -P method baseline.
    Then resegment does not add this missing line as an extra line; instead it gets added to both(!) neighbouring lines:
    image.png
    Is this what you expect resegment to do?
    Is there any other parameter which
    • creates the extra line
    • or ignores this text?
    5 replies
    Robert Sachunsky
    @bertsky
    Dear /@all, @hnesk's fabulous OCR-D Browser now features a Dockerized webservice which serves @b2m's Broadwayd recipe for running the Gtk UI in the browser via broadwayd.
    It also comes with a simplistic landing page that recursively indexes all METS files in the mounted volume, and when clicking the respective link, runs browse-ocrd for that workspace and redirects the browser to the broadwayd site.
    This can be useful for non-Linux users or on headless servers.
    See hnesk/browse-ocrd#45 for screenshots and https://github.com/hnesk/browse-ocrd#docker-service for a description.
    Lena Hinrichsen
    @lena-hinrichsen
    @/all After a break of a few weeks, our Ground Truth Call takes place today. Join us at 1 p.m. in https://meet.gwdg.de/b/eli-ufa-unu and discuss today's topic (noise in GT) or ask your questions about GT and transcribing.
    2 replies
    Lena Hinrichsen
    @lena-hinrichsen
    @/all Today, 2–3pm is our open TechCall. Feel free to join us in https://meet.gwdg.de/b/eli-ufa-unu. As always, we start with the bi-weekly update. Feel free to add your topics to the list!
    Lena Hinrichsen
    @lena-hinrichsen
    @/all Due to health reasons, the OCR-D-GT-Call today unfortunately has to be postponed until next week. It will therefore take place on 30 June and 7 July at 1 pm.
    Jim Salmons
    @Jim_Salmons_twitter

    Hi all! 👋🤓 Been a while since I posted, due primarily to my being consumed with rehab and recovery from my severe spinal cord injury. But I am getting back to work on my DH research on developing a ground truth storage format for magazines, focused on the ~33,000 computer magazines at the Internet Archive.

    I just started a short course on NLP with spaCy this week as part of the JSTOR TAP Institute. During the first class session I met a fellow, Daniel Hutchinson, who is a historian and DH professor at Belmont Abbey College in the USA (http://dhl.bac.edu/danielhutchinson). He is working on digitizing, transcribing, translating, and analyzing German-language WWII-era newspapers published in Axis POW camps in the U.S. and Europe. I immediately thought of this group and referred him here. So please keep an eye out for him to join us. I'm sure he will be happy to meet helpful kindred spirits.

    Happy Healthy Vibes from Colorado! 😎

    JBBalling
    @JBBalling
    Hi all, can someone recommend a repository for neural line segmentation which can be wrapped in OCR-D? Or is there already an existing OCR-D processor that runs on the GPU? Thank you in advance!
    Lena Hinrichsen
    @lena-hinrichsen

    @/all As you know, we would have liked to make up for last week's OCR-D GT Call today. Unfortunately, we are again unable to do so due to health reasons.

    However, we would be happy to see you tomorrow at the OCR-D Forum (10 am CET in https://play.workadventu.re/@/hab/hab/ocr-d-forum) !

    stefanCCS
    @stefanCCS
    Hi
    (maybe a bit off-topic):
    Does anybody know if there is an XSD or RNG file for the famous DFG METS and/or MODS definitions:
    https://dfg-viewer.de/fileadmin/groups/dfgviewer/METS-Anwendungsprofil_2.3.1.pdf
    https://dfg-viewer.de/fileadmin/groups/dfgviewer/MODS-Anwendungsprofil_2.3.1.pdf
    ?
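    If such a schema does turn up, validation itself is straightforward with xmllint; a minimal sketch, where the schema and document filenames are placeholders:

    ```shell
    # validate against an XSD:
    xmllint --noout --schema mets.xsd document.xml
    # validate against a RELAX NG schema:
    xmllint --noout --relaxng mods.rng document.xml
    ```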
    20 replies
    Lena Hinrichsen
    @lena-hinrichsen
    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Feel free to join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in the topics:
    • Bi-weekly
    • Processor API Implementation (@tdoan2010)
    Rohan Chauhan
    @rohanchauhan_gitlab

    Hi, all!
    I come here after an almost year-long hiatus, during which I have been developing ground truth datasets for my research languages. I have some useful training data for Bengali and Hindi historical print, and I am presently working on Urdu print. I am hoping to use this very useful resource for batch-processing my documents.

    Earlier today, I was running some pre-processing steps on an 800 page book. Everything was going as expected, however there was a power outage for a short while (I'm in India), and my computer shut down. When I turned it back on, I realised that everything started from page one, even though I had already processed 650+ out of the 800+ pages. I wanted to ignore the first 650+ pages that were already processed, and start from the pages that weren't processed. However, I couldn't find a way to do it.

    Am I missing an existing method to restrict the processors to a specific range of pages in the METS file?
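    For reference, OCR-D processors share a -g/--page-id option that takes a comma-separated list of page IDs, and recent core versions also understand .. ranges. A hedged sketch, assuming the common PHYS_NNNN page-ID scheme (check yours with ocrd workspace list-page):

    ```shell
    # run only on the unprocessed pages, via a range:
    ocrd-cis-ocropy-binarize -I OCR-D-IMG -O OCR-D-BIN -g PHYS_0651..PHYS_0800

    # or build an explicit comma-separated list in the shell:
    pages=$(seq -f 'PHYS_%04g' 651 660 | paste -sd, -)
    echo "$pages"   # PHYS_0651,PHYS_0652,...,PHYS_0660
    ```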

    3 replies
    Robert Sachunsky
    @bertsky
    @stweil, have you published your OCR models trained on the ONB dataset anywhere? I can only see the Tesseract models on your server, but not the ones for Kraken or Calamari...
    Also, IIRC you mentioned you did the same procedure on your Reichsanzeiger dataset. Can you share anything yet?
    Stefan Weil
    @stweil
    The Kraken models are now available here: https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/kraken/.
    10 replies
    Stefan Weil
    @stweil
    As far as I remember, the Calamari trainings were only a comparison of performance on different platforms / with different settings, which was aborted before producing usable models. So if you want models, I'd have to run it again.
    1 reply
    Robert Sachunsky
    @bertsky
    Ok, I'll go for Tesseract and Kraken then. Thanks a lot!
    Robert Sachunsky
    @bertsky
    Does anyone recall the status of the OCR-D GT convention for marking lines/words as illegible or impaired? IIRC we agreed the newly specified NoiseRegion had a different use-case than Labels on a TextLine/Word. There was a proposal to synchronize with Transkribus' Tag or Property elements for that. But I don't remember anything specific, and cannot find any notes or specs...
    Does anyone already use some way of marking broken lines not to be used for OCR training?
    8 replies
    stefanCCS
    @stefanCCS
    Hi,
    I am trying to do a new setup for ocrd_all (native variant) on Ubuntu 20.04.
    I have installed Python 3.7 beforehand (using "deadsnakes").
    I have created a new venv with Python 3.7.
    sudo make deps-ubuntu has run fine.
    Now I have got an error while running make all (in this Python 3.7 venv) as follows:
    Successfully installed PyWavelets-1.3.0 absl-py-1.1.0 aiohttp-3.8.1 aiosignal-1.2.0 async-timeout-4.0.2 asynctest-0.13.0 cachetools-5.2.0 commonmark-0.9.1 coremltools-5.2.0 frozenlist-1.3.0 fsspec-2022.5.0 google-auth-2.9.0 google-auth-oauthlib-0.4.6 grpcio-1.47.0 imageio-2.19.3 kraken-4.1.2 markdown-3.3.7 mpmath-1.2.1 multidict-6.0.2 networkx-2.6.3 oauthlib-3.2.0 ocrd-kraken-0.1.2 packaging-21.3 protobuf-3.19.4 pyDeprecate-0.3.2 pyarrow-8.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pygments-2.12.0 pyparsing-3.0.9 python-bidi-0.4.2 pytorch-lightning-1.6.4 regex-2022.7.9 requests-oauthlib-1.3.1 rich-12.4.4 rsa-4.8 scikit-image-0.19.2 scipy-1.7.3 six-1.16.0 sympy-1.10.1 tensorboard-2.9.1 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tifffile-2021.11.2 torch-1.11.0 torchmetrics-0.9.2 torchvision-0.12.0 tqdm-4.64.0 yarl-1.7.2
    python3 -m venv /home/gputest/ocrd-3.7/sub-venv/headless-tf1
    Error: Command '['/home/gputest/ocrd-3.7/sub-venv/headless-tf1/bin/python3', '-Im', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1.
    make: *** [Makefile:180: /home/gputest/ocrd-3.7/sub-venv/headless-tf1/bin/activate] Error 1
    Any idea?
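    For what it's worth, a common cause of exactly this ensurepip failure with deadsnakes Pythons is that venv support is packaged separately. A hedged sketch of the fix; the package names are an assumption about the deadsnakes PPA:

    ```shell
    # deadsnakes splits venv/distutils support into separate packages:
    sudo apt-get install -y python3.7-venv python3.7-distutils
    # then retry the sub-venv creation from a clean state:
    python3.7 -m venv --clear /home/gputest/ocrd-3.7/sub-venv/headless-tf1
    ```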
    54 replies
    Uwe Hartwig
    @M3ssman
    @/all, we published our evaluation tool, developed within the scope of the OCR-D project, at https://github.com/ulb-sachsen-anhalt/digital-eval.
    Please note that this is only meant to be used for automated evaluation of larger data sets, not for inspecting single pages (but it might give you an idea of which data should be examined in more detail)
    4 replies
    Moarc
    @Moarc
    yo
    Sorry if this is the wrong place to ask - my hope is that since the developers of cor-asv-ann hang out here, it might be. I'm trying to get cor-asv-ann running, I hit a weird behavior and have no idea how to troubleshoot.
    Moarc
    @Moarc
    The first step goes well, keraslm-rate produces a language model and that gets converted with transfer-dta-lm. But after that, cor-asv-ann-train loads the data (?), compiles a model - and just stops. It prints "Epoch 1", but no progress bar yet, and after a while a TensorFlow error is printed saying "An input could not be retrieved. It could be because a worker has died". I tried increasing the width and window length to what ASV have used, I tried both GT<TAB>GT files and pickled confmat, nothing changed.
    3 replies
    Robert Sachunsky
    @bertsky
    Are there any plans on making a new ocrd_all release yet? Since the last version, the download URL for the eynollah default model https://qurator-data.de/eynollah/models_eynollah.tar.gz has changed (see update in core), and there have already been fixes in other modules.
    3 replies
    sjscotti
    @sjscotti
    Has anyone tried to run the "light" version of eynollah as a command in OCR-D? I've run it as a standalone, but I am unsure of the options that are available under OCR-D, and what it assumes as defaults (e.g., does it run "light_version" as default, or "original"?)
    1 reply
    Konstantin Baierer
    @kba

    @/all We have released a new version 2.36.0 of OCR-D/core, which, besides typo fixes and minor bug fixes, also contains these changes:

    We have also released a new version v2022-07-18 of ocrd_all. This release includes the updated OCR-D/core, improvements to ocrd_calamari and ocrd_segment and upgrades tesseract to the latest release v5.2.0.

    ocrd_all releases are available from GitHub and DockerHub. OCR-D/core releases are available from GitHub, PyPI and as part of ocrd_all.

    To update a native installation of ocrd_all, remove the existing venv and run make all again to reinstall.

    To update Docker deployment of ocrd_all, run docker pull ocrd/all and recreate your containers.
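    The update steps above, as one copy-pasteable sketch; the ocrd_all checkout path and the image variant tag are assumptions:

    ```shell
    # native installation (the && guards keep rm from running outside the checkout):
    cd ocrd_all && rm -rf venv && make all

    # Docker deployment:
    docker pull ocrd/all:maximum   # or the medium/minimum variant you use
    ```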

    Many thanks to @bertsky, @joschrew, @stweil, @hnesk and all other contributors!

    3 replies
    Lena Hinrichsen
    @lena-hinrichsen

    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Feel free to join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in the topics:

    • Place for Documentation about decisions and adding lacking documentation (@lena-hinrichsen)
    • Finishing Kwalitee Dashboard backend implementation for projects tab (@mweidling)
    • Benchmarking spike (@mweidling)
    • Bugfixing (@kba)
    • Resource list (@kba)
    • Convert ocrd process syntax to Nextflow script: https://github.com/MehmedGIT/OtoN_Converter (@MehmedGIT)
    • Finalizing Processor API (@tdoan2010)
    • Impediment: school holiday & covid

    Please note: Unfortunately, due to health reasons, we are unable to hold a GT Call this week either.

    stefanCCS
    @stefanCCS
    Hi,
    I am searching for a faster "image-to-workspace" add processor.
    So far I am using a bash loop which calls ocrd workspace add (as described here: https://ocr-d.de/en/user_guide).
    I have found ocrd-import, but there you cannot distinguish between
    • the path where the source images are located
    • and the path where to put the workspace
      (they are the same!)
      Or maybe I am wrong?
      Or maybe there is another mass-import processor?
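    For reference, the bash-loop approach mentioned above can be sketched like this; the directory names and the PHYS_ page-ID scheme are assumptions:

    ```shell
    # separate source dir (images/) from workspace dir (ws/):
    ocrd workspace -d ws init
    for f in images/*.tif; do
      base=$(basename "$f" .tif)
      ocrd workspace -d ws add -G OCR-D-IMG -i "OCR-D-IMG_$base" \
           -g "PHYS_$base" -m image/tiff "$f"
    done
    ```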
    25 replies
    stefanCCS
    @stefanCCS

    I use ocrd-segment-extract-lines like this:

    ocrd-segment-extract-lines -I <myinput> -O <myoutput> -P output-types '["text", "json"]' -P min-line-length 1 -P min-line-width 5 -P min-line-height 5

    I get warnings that no text is available (which is true in my case), like this:

    2022-07-28 14:21:51.985 WARNING processor.ExtractLines - Line 'TR-6_line0004' contains no text content

    As a result, I do NOT get any line PNGs (which are exactly my intention here) - why?
    (I have the feeling that this might have worked in the past ...)

    2 replies
    vahidrezanezhad
    @vahidrezanezhad
    @/all Hi, in order to update our sbb_binarizer model, we would like to ask you to provide us with documents for which the current model's results were not satisfying. Thank you in advance.
    Lena Hinrichsen
    @lena-hinrichsen

    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Feel free to join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in the topics:

    • Bi-weekly
      • Update tech stack (@kba)
      • Bugfixing w.r.t. ocrd zip (@kba)
      • OPERANDI Alpha (Forum on Friday) (@MehmedGIT)
      • OtoN (ocrd process to nextflow converter) (@MehmedGIT)
    • Perspective correction pre-processor (@jbarth-ubhd) https://github.com/jbarth-ubhd/fix-perspective

    We are pleased to announce that our GT Call can take place this week after we had to cancel it a few times: Thursday at 1pm CET.

    On Friday at 10 am CET we will continue with our monthly OCR Forum. This month we celebrate the alpha release of OPERANDI. :tada:

    Konstantin Baierer
    @kba

    @/all We have released new versions of our specifications, of core and ocrd_all.

    In the specs, we changed the return values of the /workspace Web API endpoints to return either a JSON description or OCRD-ZIP, and removed the obsolete logging spec that is better described in the cli spec.

    The newest OCR-D/core, besides some fixes to the WorkspaceBagger and ocrd workspace merge, reorganizes the resource manager so that resources can be picked up dynamically from the --dump-json output of processors, effectively decentralizing the resource list because processors can now describe the resources relevant for them in their ocrd-tool.json. Currently, only ocrd_tesserocr makes use of this capability, the other processors will follow suit shortly.

    The latest ocrd_all release contains the newer releases of core and ocrd_tesserocr, as well as the first release of the refactored recognition and new segmentation of ocrd_kraken, adding proper support for kraken in OCR-D, which might be especially useful for users of eScriptorium.

    As always: python packages are available from PyPI and GitHub; ocrd_all native installations should be updated by deleting the venv and running make all again; ocrd_all Docker images will be available later today and can be updated with docker pull ocrd/all.

    Many thanks to @bertsky, @lena-hinrichsen, @joschrew, @mittagessen and all other contributors in the OCR-D community!

    3 replies
    Konstantin Baierer
    @kba

    @stweil what is the difference between frak2021 and frak2021_09 in https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/? And which of the models have language data and which are without?

    And many thanks for the very helpful explanations at https://ocr-bw.bib.uni-mannheim.de/anwendung/druckwerke/.

    4 replies
    Lena Hinrichsen
    @lena-hinrichsen
    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in our Bi-weekly and add more topics to our agenda:
    • Bi-weekly
      • New releases of core, spec, ocrd_all
      • Dynamic Resource Management finally A Thing
      • Next: Bugfixing ocr-d/core#363, ocr-d/core#690, ocr-d/core#825, ocr-d/core#802 (@kba)
      • Revisiting PR in OCR-D/core and close where appropriate (@kba)
      • internal coordination project workshop next week (BBAW, HAB, SUB, GWDG, SBB)
      • Documentation: Nextflow done; next: Web API
      • Workflow part in Web API: write automatic tests for workflow part (@MehmedGIT)
      • Refactor Web API (@joschrew)
    Robert Sachunsky
    @bertsky

    @vahidrezanezhad, I have a question about eynollah (mainline):

    For high-res material, the OCR-D processor is quite slow (2.5 min/page) even on a fast (A40) GPU. I was wondering how to sacrifice some quality over speed. Konstantin said it should be possible to run Eynollah without the column classifier (perhaps passing a prior manually if already known in advance). But I cannot see how this would need to be wrapped for OCR-D: I see run always enters run_enhancement which always calls resize_and_enhance_image_with_column_classifier (which looks like a neural model) and later run_graphics_and_columns including find_num_col (which looks like a heuristic model).

    Also, I wonder how to utilise the available computing resources more. I can see only about 120% CPU (which is only a tiny fraction of the available 64 cores) and 0-15% GPU (but 0% most of the time). Could it be that high-res images incur too much cost for the CPU-side calculations and should therefore be downsampled beforehand? See qurator-spk/eynollah#85 in this regard.
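    A possible pre-processing step along those lines, sketched with ImageMagick; the 300 DPI target and the directory names are assumptions to be tuned per material:

    ```shell
    mkdir -p downsampled
    for f in OCR-D-IMG/*.tif; do
      # resample high-res scans before feeding them to eynollah:
      convert "$f" -units PixelsPerInch -resample 300 "downsampled/$(basename "$f")"
    done
    ```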

    stefanCCS
    @stefanCCS
    Question concerning resmgr.
    I do not understand how to use resmgr, if I want to use an own Tesseract model on ocrd-tesserocr-recognize.
    • Must I put it manually in the path where the tesseract CLI expects it (which I can find out using tesseract --list-langs)?
    • Or can I use resmgr with the download command somehow (e.g. ocrd resmgr download ocrd-tesserocr-recognize <localfilename>)?
      ==> in the end I simply want to call: ocrd-tesserocr-recognize -I $inputFolder -O $outputFolder -P model $myTesseractModel
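    For reference, a hedged sketch of the resmgr route; the model file path is a placeholder, and whether download accepts a local path may depend on the core version (check ocrd resmgr --help):

    ```shell
    # install a local .traineddata via the resource manager:
    ocrd resmgr download ocrd-tesserocr-recognize /path/to/mymodel.traineddata
    # verify what is installed and where:
    ocrd resmgr list-installed -e ocrd-tesserocr-recognize
    # then reference the model by its basename (without extension):
    ocrd-tesserocr-recognize -I "$inputFolder" -O "$outputFolder" -P model mymodel
    ```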
    35 replies
    helkejaa
    @helkejaa
    Question on Kraken's segmentation training for regions (ping @mittagessen): I'm training a model for a book/books with two vertical paragraphs per page. If I train the regions (for the two paragraphs) to be too close, Kraken's recognizer merges them into one region. If, on the other hand, I train these regions to be further away from each other, Kraken's recognizer chooses the region so that some letters may not be included. It seems that the more GT files I provide for training, the better the results are, but there is still the need to check the results. Is there a logic to the segmentation that I may have missed? I have learned that having the region border further away from the text (in the GT files) successfully trains the Kraken model not to exclude bits of text, but then there is the problem of merging regions.
    2 replies
    Lena Hinrichsen
    @lena-hinrichsen

    On 20/21 October 2022, another (German-speaking) Kitodo practice meeting will take place as a face-to-face event in Braunschweig.

    Please submit contributions (presentations, demonstrations, workshops) on topics related to Kitodo.Production and Kitodo.Presentation.

    All kinds of application-related topics are welcome and permitted.
    Contributions on these topics are of particular interest:

    • Practical examples and use cases
    • Current projects or plans
    • Planning for future projects
    • Migrations
    • Connections to various presentation systems
    • Newspaper digitisation
    • OCR
    • Interfaces such as IIIF

    Please send your proposals (with a short summary of no more than one page of text) to contact@kitodo.org by 9 September 2022. Notification of acceptance and publication of the programme will be made at the end of September.

    Lena Hinrichsen
    @lena-hinrichsen

    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in our Bi-weekly and add more topics to our agenda:

    • Bi-weekly

      • most of the issues of the previous sprint :/ (@kba)
      • Draft for QA Spec (@mweidling & @kba)
      • OCR-D Coord. internal Workshop -> Business Canvas, Benchmarking, Product Lvls
      • Preparing processor maintenance plan
      • Web API in spec: OCR-D/spec#222 in review (@tdoan2010)
      • Rest API Wrapper for processors (@tdoan2010)
      • Web API Refactoring (@joschrew)
    • Update plan for Ubuntu, Python, libraries (@kba)

    • OCR-Evaluation and Statistics with digital-eval (@M3ssman)
    Lena Hinrichsen
    @lena-hinrichsen

    @/all Join us today, 1–2 pm (Berlin Time) in our OCR-D-GT-Call. Link to our Big Blue Button: https://meet.gwdg.de/b/eli-ufa-unu

    Additionally, our OCR-D Forum will take place on Friday, 10 am. After a standup from our projects, we are happy to hear @M3ssman presenting his tool digital-eval he developed to evaluate outcomes from mass digitalization workflows. Link to the event: https://play.workadventu.re/@/hab/hab/ocr-d-forum

    Konstantin Baierer
    @kba
    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in our Bi-weekly and feel free to add more topics to our agenda:
    Robert Sachunsky
    @bertsky

    @/all there are some new XSLs for various PAGE-XML related tasks available in workflow-configuration:

    page-add-nsprefix-pc # adds namespace prefix 'pc:'
    page-remove-alternativeimages # remove selected AlternativeImage entries
    page-remove-metadataitem # remove all MetadataItem entries
    page-remove-dead-regionrefs # remove non-existing regionRefs
    page-remove-empty-readingorder # remove empty ReadingOrder or groups
    page-remove-all-regions # remove all *Region (and TextLine and Word and Glyph) entries
    page-remove-regions # remove all *Region (and TextLine and Word and Glyph) entries of some type
    page-remove-text-regions # remove all TextRegion (and TextLine and Word and Glyph) entries
    page-remove-lines # remove all TextLine (and Word and Glyph) entries
    page-remove-words # remove all Word (and Glyph) entries
    page-remove-glyphs # remove all Glyph entries
    page-ensure-textequiv-unicode # create empty TextEquiv/Unicode elements when TextEquiv is empty
    page-sort-textequiv-index # sort TextEquiv by @index
    page-fix-coords # replace negative values in coordinates by zero
    page-set-nsversion-2019 # update the PAGE namespace schema version to 2019
    page-move-alternativeimage-below-page # try to push page-level AlternativeImage back to subsegments
    page-textequiv-lines-to-regions # project text from TextLines to TextRegions (concat with LF in between)
    page-textequiv-words-to-lines # project text from Words to TextLines (concat with spaces in between)
    page-extract-text # extract (TextRegion|TextLine|Word|Glyph)/TextEquiv/Unicode consecutively
    page-extract-lines # extract TextLine/TextEquiv/Unicode consecutively
    page-extract-words # extract Word/TextEquiv/Unicode consecutively
    page-extract-glyphs # extract Glyph/TextEquiv/Unicode consecutively

    In particular, page-remove-alternativeimages.xsl (e.g. with params level=line and which=clipped) is useful for cases where, in your workflow, the usual last-deepest AlternativeImage is not what you want your processor to consume. In the absence of a general mechanism for run-time selection of derived images, here you can do the next best thing: create a new annotation from which the unwanted images have been removed, and use that as the next input.

    Moreover, page-sort-textequiv-index.xsl lets you sort all TextEquiv by their respective @index (which may be useful with editors like LAREX).

    Also, page-extract-text lets you extract the consecutive text content. Parameters are lb=\n (for line break), pg=\n\n (for paragraph break), level=highest (for hierarchy level to read from, also region|line|word|glyph) and order=reading-order (or else document). The latter is notable for its unique ability to respect recursive ReadingOrder in XSLT 1.0.

    Within a normal installation (make -C workflow-configuration install or make all in ocrd_all), all transforms are available both in a standalone CLI and as OCR-D processor. See https://bertsky.github.io/workflow-configuration/#ocrd-page-transform
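    For anyone without the make-based install, an equivalent hand-rolled invocation via plain xsltproc might look like this; whether the stylesheets read their parameters as XSLT string params is an assumption (check the stylesheet headers), and the input path is a placeholder:

    ```shell
    xsltproc --stringparam level line --stringparam which clipped \
             page-remove-alternativeimages.xsl OCR-D-SEG/page_0001.xml > fixed.xml
    ```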

    Lena Hinrichsen
    @lena-hinrichsen
    @/all Join us today, 1–2 pm (Berlin Time) in our OCR-D-GT-Call. Link to our Big Blue Button: https://meet.gwdg.de/b/eli-ufa-unu
    Matthias Boenig
    @tboenig
    @all Dear colleagues, unfortunately I have to cancel today's GT-Call at short notice. We will meet again on 29 September.
    Lena Hinrichsen
    @lena-hinrichsen
    @/all Tomorrow, 2–3pm (CET) is our next open TechCall. Join us in https://meet.gwdg.de/b/eli-ufa-unu if you are interested in our Bi-weekly and add more topics to our agenda:
    • Bi-weekly
      • composed the first GT corpora that serve as a basis for benchmarking (link) (@mweidling)
      • submitting abstracts/papers
      • Front end QUIVER
      • Bug Fixing core, then METS Server (@kba)
      • OtoN converter was refactored a bit. Docker support will be added soon, check planned extensions. (@MehmedGIT)
      • Working on Processing Server pull requests (@tdoan2010)
      • Working on webapi-processing-broker: use (servers from) processing server pull request for the webapi (@joschrew)
    • page-transform and XSL scripts and snippets in https://github.com/bertsky/workflow-configuration (@bertsky)
    • QUIVER Workflow Tab (@paulpestov)
    Lena Hinrichsen
    @lena-hinrichsen
    @/all Join us today, 1–2 pm (Berlin Time) in our OCR-D-GT-Call. Link to our Big Blue Button: https://meet.gwdg.de/b/eli-ufa-unu
    mweidling
    @mweidling
    @/all As discussed in the Tech Call yesterday, the first draft of the QA Specs are now available at OCR-D/spec#225. Feel free to share your thoughts and feedback in the PR!
    stefanCCS
    @stefanCCS
    Hi,
    is there a font-type detection processor which works similarly to ocrd-typegroups-classifier but on TextRegion level (or a level given via CLI)?
    The idea is to get, per TextRegion (resp. on the level given on the CLI), one (or more) detected font types.
    3 replies
    Robert Sachunsky
    @bertsky
    Does anyone here know of a good German OCR model for typewriter?