Skip to content

Install inotifytools for k8s sidecars log tailing#1364

Merged
kba merged 1 commit into
masterfrom
inotify-image
Jun 4, 2026
Merged

Install inotifytools for k8s sidecars log tailing#1364
kba merged 1 commit into
masterfrom
inotify-image

Conversation

@kba

@kba kba commented Jun 4, 2026

Copy link
Copy Markdown
Member

Does not add too many dependencies and allows a mechanism like

inotifywait -m /tmp/ocrd-network -e create |
while read dir event file; do
    tail -F "$dir/$file" &
done

to get the logging of the workers into the STDOUT stream for log analysis.

It's still not a great solution but the least invasive change.

In the mid-term we should find a way to have the ocrd_network logging make fewer assumptions, perhaps carry the info from the filename (job UUID) as a logging metadata field or even just part of the log message.

@MehmedGIT

Copy link
Copy Markdown
Contributor

Looks good.

In the mid-term we should find a way to have the ocrd_network logging make fewer assumptions, perhaps carry the info from the filename (job UUID) as a logging metadata field or even just part of the log message.

Agreed.

@kba kba merged commit 04be227 into master Jun 4, 2026
17 of 18 checks passed
@kba

kba commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

FYI, this is how I use that in Kubernetes in a sidecar container that only collects log messages:

      - name: log-forwarder
        image: ocrd-core-image 
        command:
          - /bin/sh
          - -c
          - |

            # Tail already existing log files
            find /logs -type f -name '*.log' | while read -r f; do
              echo "[processor=$(OCRD_PROCESSOR_NAME)]: Tailing existing log: $f"
              tail -n +1 -F "$f" &
            done

            # Tail newly created log files
            inotifywait -m -r -e create /logs --format '%w%f' |
            while read -r f; do
              case "$f" in
                *.log)
                  echo "Tailing new log: $f" >&2
                  job="${f##*/}"
                  job="${job%.log}"

                  tail -n +1 -F "$f" |
                  while IFS= read -r line; do
                  printf '[processor=%s job=%s] %s\n' "$(OCRD_PROCESSOR_NAME)" "$job" "$line"
                  done &
                  ;;
              esac
            done

Which gives me a complete log across all the log files, e.g.

info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,783.783 INFO ocrd_network.processing_worker - Starting consuming from queue: ocrd-page2alto-transform 
info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,782.782 INFO ocrd_network.processing_worker - Configuring consuming from queue: ocrd-page2alto-transform 
info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,780.780 INFO ocrd_network.processing_worker - Successfully connected RMQConsumer 
info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,772.772 INFO ocrd_network.processing_worker - Connecting client to RabbitMQ server: ocrd-rabbitmq:5672/ 
info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,772.772 INFO ocrd_network.processing_worker - Initialized processing worker: ocrd-page2alto-transform 
info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,714.714 INFO ocrd_network.processing_worker - Verified RabbitMQ URL: amqp://network_test:network_test@ocrd-rabbitmq:5672 
info [processor=ocrd-page2alto-transform job=worker.1.ocrd-page2alto-transform] 2026-06-05 12:13:42,714.714 INFO ocrd_network.processing_worker - Verified MongoDB URL: mongodb://root:root123@ocrd-mongodb:27017 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:11.323 DEBUG ocrd_network.processing_worker - Acking message with tag: 2 
info [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:11.323 INFO ocrd_network.processing_worker - Successfully processed RabbitMQ message 
info [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:11.323 INFO ocrd_network.processing_worker - Response from callback_url "<Response [200]>" 
info [ocrd-ps-649578f6fb-wvlgq] 2026-06-05 10:48:11,085.085 INFO ocrd_network.server_cache.processing_requests - The new request counter of /nfs/DEV/queue/running/PPN729722937/mets.xml: 0 
info [ocrd-ps-649578f6fb-wvlgq] 2026-06-05 10:48:11,084.084 INFO ocrd_network.server_cache.locked_pages - Unlocking all pages for: ALTO 
info [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:11.079 INFO ocrd_network.processing_worker - Posting result message to callback_url "http://ocrd-ps-service/result_callback" 
info [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:11.079 INFO ocrd_network.processing_worker - Publishing result to internal callback url (Processing Server): http://ocrd-ps-service/result_callback 
info [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:11.079 INFO ocrd_network.processing_worker - Result message: {'job_id': '371e4fd9-77d6-43a4-934b-97d8e09bd8ff', 'state': 'SUCCESS', 'workspace_id': '', 'path_to_mets': '/nfs/DEV/queue/running/PPN729722937/mets.xml'} 
info [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.996 INFO ocrd.process.profile - Executing processor 'ocrd-page2alto-transform' took 8.442048s (wall) 6.093049s (CPU)( [--input-file-grp='TESS' --output-file-grp='ALTO' --parameter='{"check_border": false, "check_words": true, "skip_empty_lines": false, "trailing_dash_to_hyp": false, "dummy_word": true, "dummy_textline": true, "textequiv_index": 0, "region_order": "document", "textline_order": "document", "textequiv_fallback_strategy": "first", "alto_version": "v4.2", "timestamp_src": "LastChange"}' --page-id=''] 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.996 DEBUG ocrd.processor.base - stopped executor <pebble.pool.process.ProcessPool object at 0x7ff5eff90370> after 281 tasks 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.977 DEBUG ocrd.processor.base - succeeded 281, missed 0 of 281 pages due to {} 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.976 DEBUG ocrd.processor.Page2AltoProcessor - [PHYS_0280] page worker completed for page PHYS_0280 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.977 DEBUG ocrd.processor.base - page worker completed for page PHYS_0281 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.977 DEBUG ocrd.processor.base - waiting for output of task <ProcessFuture at 0x7ff60e7ade40 state=finished returned NoneType> (page PHYS_0281) 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.976 DEBUG ocrd.processor.base - page worker completed for page PHYS_0280 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.975 DEBUG ocrd.processor.Page2AltoProcessor - [PHYS_0281] page worker completed for page PHYS_0281 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.944 DEBUG ocrd.processor.Page2AltoProcessor - [PHYS_0279] page worker completed for page PHYS_0279 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.969 DEBUG ocrd.processor.base - waiting for output of task <ProcessFuture at 0x7ff60e7e7ee0 state=running> (page PHYS_0280) 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.969 DEBUG ocrd.processor.base - page worker completed for page PHYS_0279 
debug [processor=ocrd-page2alto-transform job=371e4fd9-77d6-43a4-934b-97d8e09bd8ff] 10:48:10.935 DEBUG ocrd.workspace.add_file - [PHYS_0281] outputfile file_grp=ALTO local_filename=ALTO/FILE_0281_ALTO.xml content=True 

Still not ideal but it does allow me to debug individual jobs effectively and I don't need any ln -sf /proc/1/fd/1 /tmp/ocrd.log-style hacks anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants