1 change: 1 addition & 0 deletions .dockerignore
@@ -1,3 +1,4 @@
test/
dist/
build/
repo/assets
3 changes: 2 additions & 1 deletion Dockerfile
@@ -1,6 +1,5 @@
ARG DOCKER_BASE_IMAGE
FROM $DOCKER_BASE_IMAGE
# install ocrd-tesserocr (until here commands for installing tesseract-ocr)
ARG VCS_REF
ARG BUILD_DATE
LABEL \
@@ -46,6 +45,8 @@ COPY . .
COPY ocrd-tool.json .
# prepackage ocrd-tool.json as ocrd-all-tool.json
RUN ocrd ocrd-tool ocrd-tool.json dump-tools > $(dirname $(ocrd bashlib filename))/ocrd-all-tool.json
+# prepackage ocrd-all-module-dir.json
+RUN ocrd ocrd-tool ocrd-tool.json dump-module-dirs > $(dirname $(ocrd bashlib filename))/ocrd-all-module-dir.json
# install everything and reduce image size
RUN make deps-ubuntu \
&& make -j4 install GIT_SUBMODULE=: \
7 changes: 4 additions & 3 deletions Makefile
@@ -2,6 +2,7 @@ export

SHELL = /bin/bash
PYTHON = python3
+DOCKER = docker
PIP = pip3
GIT_SUBMODULE = git submodule
LOG_LEVEL = INFO
@@ -28,7 +29,7 @@ PYTEST_ARGS =

# Docker container tag
DOCKER_TAG = 'ocrd/tesserocr'
-DOCKER_BASE_IMAGE = docker.io/ocrd/core:v3.3.0
+DOCKER_BASE_IMAGE = docker.io/ocrd/core:latest

help:
@echo ""
@@ -102,7 +103,7 @@ deps-test:

# Build docker image
docker: repo/tesseract repo/tesserocr
-docker build \
+$(DOCKER) build \
--build-arg DOCKER_BASE_IMAGE=$(DOCKER_BASE_IMAGE) \
--build-arg VCS_REF=$$(git rev-parse --short HEAD) \
--build-arg BUILD_DATE=$$(date -u +"%Y-%m-%dT%H:%M:%SZ") \
@@ -122,7 +123,7 @@ $(TESSERACT_PREFIX)/bin/tesseract: build/tesseract/Makefile
$(TESSERACT_PREFIX)/bin/lstmtraining: build/tesseract/Makefile
$(MAKE) -C build/tesseract training-install

-TESSERACT_CONFIG ?= --disable-openmp --disable-shared CXXFLAGS="-g -O2 -fPIC -fno-math-errno -Wall -Wextra -Wpedantic"
+TESSERACT_CONFIG ?= --disable-openmp --disable-shared CXXFLAGS="-g -O2 -fPIC -fno-math-errno -Wall -Wextra -Wpedantic -UNDEBUG"

@stweil (Contributor), Apr 30, 2025:

@bertsky, it's a bit strange if you add an unrelated commit to a pull request which was already reviewed by yourself.

I am not sure that your change in line 126 is a good idea (debug code adds instructions that increase processing time), and you have not given a proper explanation of why you made it.

Contributor:

I'm afraid your change will do exactly what you wanted to avoid.

Collaborator:

See this for why we must at least avoid NDEBUG; keeping -g is otherwise a good idea, to get symbols when there is a crash.

All of this is a far cry from satisfactory (libtesseract must use exceptions in the end!), just less insane than a segfault.

Contributor:

So you prefer getting a SIGSEGV because of your change instead of getting an abort() like in the current configuration?

Collaborator:

In other places like this we need !defined(NDEBUG) for certain safety checks.

Collaborator:

> So you prefer getting a SIGSEGV because of your change instead of getting an abort() like in the current configuration?

No, a segfault is obviously even worse, but I'm in a bind: the place that actually triggers my latest segfault/abort needs !defined(NDEBUG), so the snippet I showed first behaves differently but is (currently) not as relevant.

Collaborator:

See sirfz/tesserocr#365 for how I would like this to be solved.

build/tesseract/Makefile: repo/tesseract/Makefile.in
mkdir -p $(@D)
cd $(@D) && $(CURDIR)/repo/tesseract/configure \