in_tail: fix data loss with buffered data on shutdown and gzip files #11269

jinyongchoi · 2025-12-09T02:04:24Z

Unprocessed data in the internal buffer is discarded when Fluent Bit stops, causing data loss because the DB offset is already advanced.

This patch fixes the issue by rewinding the file offset by the remaining buffer length on exit, ensuring data is re-read on restart.
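
The rewind arithmetic can be sketched as follows. This is a minimal illustration, not the plugin's code: `rewind_offset` is a hypothetical helper, and the clamp at zero is assumed from the "floor 0" behavior described in the review walkthrough.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the offset to persist on exit.
 * `offset` is the raw file position already read from disk and
 * `buf_len` is the number of bytes read but not yet processed.
 * The result never goes below zero.
 */
int64_t rewind_offset(int64_t offset, size_t buf_len)
{
    int64_t pending = (int64_t) buf_len;

    if (pending > offset) {
        return 0;
    }
    return offset - pending;
}
```

With the values from the normal-file debug log in this description (old=185883589, buf_len=99), the helper returns 185883490, so the 99 buffered bytes are read again on restart.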

For gzip files, a separate issue caused data duplication after restart because skip_bytes was incorrectly decremented at runtime. A new field, 'exclude_bytes', is introduced as a runtime-only counter so that skip_bytes is preserved for correct DB persistence.

Additionally, this patch prevents resurrecting deleted file entries in the DB by resetting db_id to 0 upon deletion and checking it before updating the offset.

The SQLite schema is updated to include 'anchor_offset' and 'skip_bytes' columns. On upgrade from older versions, these columns are automatically added via ALTER TABLE if they do not exist.

Closes #11265

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

Example configuration file for the change

[SERVICE]
    flush 2
    grace 60
    log_level debug
    log_file /tmp/testing/logs/testing.log
    parsers_file /tmp/testing/parsers.conf
    plugins_file /tmp/testing/plugins.conf
    http_server on
    http_listen 0.0.0.0
    http_port 22002

    storage.path /tmp/testing/storage
    storage.metrics on
    storage.max_chunks_up 512
    storage.sync full
    storage.checksum off
    storage.backlog.mem_limit 100M

[INPUT]
    Name tail
    Path /tmp/testing.input
    Tag testing
    Key message
    Offset_Key   log_offset

    Read_from_Head true
    Refresh_Interval 3
    Rotate_Wait 31557600

    Buffer_Chunk_Size 1MB
    Buffer_Max_Size 16MB
    Inotify_Watcher false

    storage.type filesystem
    storage.pause_on_chunks_overlimit true

    DB /tmp/testing/storage/testing.db
    DB.sync normal
    DB.locking false

    Alias input_log

[OUTPUT]
    Name file
    Match *
    File /tmp/testing.out

[SERVICE]
    flush 2
    grace 60
    log_level debug
    log_file /tmp/testing/logs/testing.log
    parsers_file /tmp/testing/parsers.conf
    plugins_file /tmp/testing/plugins.conf
    http_server on
    http_listen 0.0.0.0
    http_port 22002

    storage.path /tmp/testing/storage
    storage.metrics on
    storage.max_chunks_up 512
    storage.sync full
    storage.checksum off
    storage.backlog.mem_limit 100M

[INPUT]
    Name tail
    Path /tmp/testing.input.gz
    Tag testing
    Key message
    Offset_Key   log_offset

    Read_from_Head true
    Refresh_Interval 3
    Rotate_Wait 31557600

    Buffer_Chunk_Size 1MB
    Buffer_Max_Size 16MB
    Inotify_Watcher false

    storage.type filesystem
    storage.pause_on_chunks_overlimit true

    DB /tmp/testing/storage/testing.db
    DB.sync normal
    DB.locking false

    Alias input_log

[OUTPUT]
    Name file
    Match *
    File /tmp/testing.out

Debug log output from testing the change

normal file
[2025/12/15 20:40:56.47094045] [debug] [input:tail:input_log] inode=50643270 rewind offset for /tmp/testing.input: old=185883589 new=185883490 (buf_len=99)

compressed file
[2025/12/15 20:45:12.615579997] [debug] [input:tail:input_log] Skipping: anchor=0 offset=0 exclude=1119419529 decompressed=999999
[2025/12/15 20:45:12.617241577] [debug] [input:tail:input_log] Skipping: anchor=0 offset=999999 exclude=1118419530 decompressed=15809
...
[2025/12/15 20:45:15.408197399] [debug] [input:tail:input_log] Skipping: anchor=0 offset=10923918 exclude=1014921 decompressed=999999
[2025/12/15 20:45:15.408206153] [debug] [input:tail:input_log] Skipping: anchor=0 offset=10923918 exclude=14922 decompressed=15809
[2025/12/15 20:45:20.13095551] [debug] [input:tail:input_log] Gzip member completed: updating anchor from 0 to 10923918, resetting skip from 2147483784 to 0

Attached Valgrind output that shows no leaks or memory corruption was found

valgrind --leak-check=full ./bin/fluent-bit -v -c ./fluentbit.conf
...
==546544== 
==546544== HEAP SUMMARY:
==546544==     in use at exit: 0 bytes in 0 blocks
==546544==   total heap usage: 1,973,893 allocs, 1,973,893 frees, 2,123,730,947 bytes allocated
==546544== 
==546544== All heap blocks were freed -- no leaks are possible
==546544== 
==546544== For lists of detected and suppressed errors, rerun with: -s
==546544== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Run local packaging test showing all targets (including any new ones) build.
Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

[N/A] Documentation required for this feature

Backporting

[N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

New Features
- Persistent resume for gzip-compressed logs, database schema migration to store resume state, and a new sentinel for "no DB id".
Bug Fixes
- Improved seek/offset handling for compressed and plain files; correct behavior across truncation, rotation, removal and shutdown; reliable skipping of already-processed decompressed bytes.
Tests
- Added DB + gzip tests covering resume loss, append/inotify append, rotation, and multi-resume.

coderabbitai · 2025-12-09T02:04:43Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds gzip-aware resume state and DB-backed bookkeeping: new per-file fields (anchor_offset, skip_bytes, exclude_bytes, skipping_mode), DB schema migration and SQL updates for skip/anchor, seek/processing changes to respect gzip decompression and member boundaries, rewind DB offsets when buffered data remains, and new gzip/DB tests.

Changes

Cohort / File(s)	Summary
Core file & decompression logic `plugins/in_tail/tail_file.c`	Add gzip-aware resume logic and includes; choose seek position from DB/anchor/read_from_head; initialize anchor/exclude/skipping after seek; apply skip/exclude logic during chunk processing; increment/reset skip_bytes across gzip members; rewind DB offset on removal when buffered data exists (unless decompression active).
DB layer & SQL `plugins/in_tail/tail_db.c`, `plugins/in_tail/tail_sql.h`	Add pragma-based migration to add `skip`/`anchor` columns; replace in-file query_status with `cb_column_exists`; extend `db_file_exists` to return `skip` and `anchor`; bind/read `skip`/`anchor` in insert/update/offset/rotate/delete paths; reset `db_id` to `FLB_TAIL_DB_ID_NONE` after delete.
File struct & internal headers `plugins/in_tail/tail_file_internal.h`, `plugins/in_tail/tail.h`	Add per-file fields `anchor_offset` (int64_t), `skip_bytes` (uint64_t), `exclude_bytes` (uint64_t), `skipping_mode` (int); add macro `FLB_TAIL_DB_ID_NONE`.
FS event / stat handling `plugins/in_tail/tail_fs_inotify.c`, `plugins/in_tail/tail_fs_stat.c`	On truncation (size_delta < 0) initialize `anchor_offset`, `skip_bytes`, `exclude_bytes`, and `skipping_mode` alongside offset and `buf_len`.
File removal & counters `plugins/in_tail/tail_file.c` (remove/adjust_counters)	On removal, if buffered data exists and no decompression, rewind `file->offset` by `buf_len` (floor 0) and persist to DB; on truncation reset anchor/skip/exclude/skipping and update DB when enabled.
SQL definitions `plugins/in_tail/tail_sql.h`	Add `skip` and `anchor` columns (DEFAULT 0) to `in_tail_files`; update `SQL_INSERT_FILE` and `SQL_UPDATE_OFFSET` to include/handle `skip` and `anchor`.
Tests & helpers `tests/runtime/in_tail.c`	Add raw write helper, gzip create/append utilities, gzip-resume inspection and wait utility; add DB/gzip resume, append and rotation tests and register them in TEST_LIST.

Sequence Diagram(s)

sequenceDiagram
    participant Disk as Disk File
    participant Reader as in_tail Reader
    participant Decompress as Gzip Decompressor
    participant Buffer as In-memory Buffer
    participant DB as SQLite DB

    Disk->>Reader: read compressed/raw bytes (advance raw offset)
    Reader->>Buffer: append raw bytes
    alt decompression_context (gzip)
        Buffer->>Decompress: feed compressed bytes
        Decompress-->>Buffer: decompressed data (may span members)
        Buffer->>Buffer: apply exclude_bytes / skipping_mode (drop initial decompressed bytes)
        alt gzip member boundary reached
            Decompress->>Reader: notify member end
            Reader->>Reader: set anchor_offset (member start), reset skip_bytes
        end
        Reader->>DB: persist raw offset, skip, anchor
    else no decompression
        Buffer->>DB: persist raw offset
    end
    alt Shutdown with buffered unprocessed data and no decompression
        Reader->>DB: rewind offset (offset -= buf_len) and persist
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pay attention to:
- DB migration and SQL binding/column ordering (plugins/in_tail/tail_db.c, plugins/in_tail/tail_sql.h).
- Seek selection and post-seek initialization for compressed streams (plugins/in_tail/tail_file.c).
- Lifecycle and transitions of skip_bytes / exclude_bytes / skipping_mode across reads and gzip-member boundaries.
- Correctness of offset rewind logic in flb_tail_file_remove and DB synchronization on shutdown.
- New test helpers and gzip test correctness in tests/runtime/in_tail.c.

Possibly related PRs

in_tail: fix memory leak when using generic unicode conversion (backport #10781) #10785 — touches plugins/in_tail/tail_file.c; potential overlap in file removal/process paths.
in_tail: Implement long line truncation #11059 — modifies in_tail chunk/process logic; likely to intersect with gzip resume and offset handling.

Suggested labels

backport to v4.0.x, backport to v4.1.x

Suggested reviewers

edsiper
cosmo0920
leonardo-albertovich
koleini
fujimotos

Poem

🐇
I nibble compressed bytes at dawn,
I mark anchors where members yawn.
When partial lines hide from sight,
I hop back offsets through the night.
Hops, skips, and crumbs — I make logs right.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 41.38% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately summarizes the main objectives: fixing data loss with buffered data on shutdown and adding gzip file handling.
Linked Issues check	✅ Passed	The PR comprehensively addresses all coding requirements from issue #11265: implements offset rewinding for uncompressed files, handles gzip with schema changes and resume logic, resets db_id to prevent resurrection, and adds DB migration for new columns.
Out of Scope Changes check	✅ Passed	All changes are scoped to addressing the data loss issue: buffer handling, gzip resume logic, DB schema migration, and test coverage. No unrelated modifications were detected.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8640940 and fdd8174.

📒 Files selected for processing (6)

plugins/in_tail/tail_db.c (11 hunks)
plugins/in_tail/tail_file.c (11 hunks)
plugins/in_tail/tail_file_internal.h (1 hunks)
plugins/in_tail/tail_fs_inotify.c (1 hunks)
plugins/in_tail/tail_fs_stat.c (1 hunks)
plugins/in_tail/tail_sql.h (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

plugins/in_tail/tail_fs_inotify.c
plugins/in_tail/tail_file_internal.h
plugins/in_tail/tail_sql.h

🧰 Additional context used

🧠 Learnings (12)

📓 Common learnings

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.

Applied to files:

plugins/in_tail/tail_fs_stat.c
plugins/in_tail/tail_file.c
plugins/in_tail/tail_db.c

📚 Learning: 2025-09-22T15:59:55.794Z

Learnt from: nicknezis
Repo: fluent/fluent-bit PR: 10882
File: plugins/out_http/http.c:112-116
Timestamp: 2025-09-22T15:59:55.794Z
Learning: When users consider bug fixes out of scope for their focused PRs, it's appropriate to create separate GitHub issues to track those concerns rather than expanding the current PR scope.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:25:27.250Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:24:26.170Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:25:27.250Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:25:02.561Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:24:55.855Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:24:44.797Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-09-08T11:21:33.975Z

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-31T12:46:11.940Z

Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-11-21T06:23:29.770Z

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

plugins/in_tail/tail_file.c

🧬 Code graph analysis (2)

plugins/in_tail/tail_file.c (1)

plugins/in_tail/tail_db.c (1)

flb_tail_db_file_offset (366-399)

plugins/in_tail/tail_db.c (2)

src/flb_sqldb.c (2)

flb_sqldb_query (151-166)

flb_sqldb_close (129-149)

plugins/in_tail/tail_file.h (1)

flb_tail_target_file_name_cmp (64-116)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: PR - fuzzing test
GitHub Check: pr-compile-centos-7

🔇 Additional comments (16)

plugins/in_tail/tail_fs_stat.c (1)

139-142: LGTM! Consistent gzip state reset on truncation.

The initialization of anchor_offset, skip_bytes, exclude_bytes, and skipping_mode on file truncation is correct and consistent with the corresponding logic in tail_fs_inotify.c and adjust_counters() in tail_file.c.

plugins/in_tail/tail_file.c (8)

47-47: LGTM!

Required include for the decompression state constants used in the gzip member boundary detection logic.

1031-1096: Well-structured gzip-aware DB position handling.

The logic correctly handles three scenarios:

Gzip files with persisted anchor/skip: seeks to anchor_offset and enables skipping mode

Normal files or legacy gzip entries: uses the stored offset

No DB position with read_from_head=off: seeks to EOF

The comment at lines 1089-1094 appropriately documents the multi-member gzip limitation where skip_bytes only tracks bytes within the current member. This is a known limitation acknowledged in the PR discussion.
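
The three scenarios above can be sketched as a single decision function. This is a simplified illustration under stated assumptions: the structs and `choose_resume` are hypothetical names, and the test "anchor or skip is non-zero" used to tell a persisted gzip state from a legacy entry is an assumption, not something confirmed by the review text.

```c
#include <stdint.h>

/* Hypothetical inputs describing what is known about a file at open time. */
struct resume_input {
    int      has_db_entry;   /* 1 if the DB holds a row for this file */
    int      is_gzip;        /* 1 if a decompression context is active */
    int      read_from_head; /* value of the Read_from_Head option */
    int64_t  db_offset;      /* raw offset stored in the DB */
    int64_t  db_anchor;      /* raw offset where the current gzip member starts */
    uint64_t db_skip;        /* decompressed bytes already emitted from that member */
    int64_t  file_size;      /* current file size on disk */
};

/* Hypothetical result: where to seek and how much decompressed data to drop. */
struct resume_plan {
    int64_t  seek_to;
    uint64_t exclude_bytes;
    int      skipping_mode;
};

struct resume_plan choose_resume(const struct resume_input *in)
{
    struct resume_plan plan = { 0, 0, 0 };

    if (in->has_db_entry && in->is_gzip &&
        (in->db_anchor > 0 || in->db_skip > 0)) {
        /* Gzip with persisted state: restart the member, then drop db_skip bytes. */
        plan.seek_to = in->db_anchor;
        plan.exclude_bytes = in->db_skip;
        plan.skipping_mode = in->db_skip > 0;
    }
    else if (in->has_db_entry) {
        /* Normal file or legacy gzip entry: trust the stored raw offset. */
        plan.seek_to = in->db_offset;
    }
    else if (!in->read_from_head) {
        /* No DB position and read_from_head off: start at end of file. */
        plan.seek_to = in->file_size;
    }
    /* Otherwise (no DB position, read_from_head on) the plan stays at offset 0. */
    return plan;
}
```

Keeping the decision in one place makes it easier to check that the skip counter is only armed in the gzip branch, which is the case the limitation comment refers to.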

1130-1138: LGTM!

Correctly initializes gzip resume fields for compressed files without DB persistence. Setting stream_offset = 0 is intentional as there's no prior decompressed byte count to track.

1329-1338: LGTM!

Good initialization of the gzip resume fields. Using FLB_TAIL_DB_ID_NONE instead of a magic 0 improves code clarity and consistency with the deletion path.

1509-1548: Correct offset rewind implementation for uncompressed files.

The logic properly:

Rewinds offset by buf_len for uncompressed files to prevent data loss on restart

Logs a debug message documenting the acknowledged gzip limitation

Uses db_id > FLB_TAIL_DB_ID_NONE check to avoid resurrecting deleted DB entries (addressing the past review concern)

The gzip limitation (lines 1537-1541) is intentional and documented in the PR objectives—mapping decompressed buffer positions back to compressed offsets is infeasible with streaming decompression.

1668-1671: LGTM!

Correctly resets all gzip resume state fields on file truncation, consistent with the truncation handling in tail_fs_stat.c and tail_fs_inotify.c.

1872-1892: Correct implementation of decompressed data skipping.

The skip logic properly handles:

Full skip: when exclude_bytes >= decompressed_data_length, decrement and discard all data

Partial skip: calculate remaining bytes, use memmove for the overlapping buffer shift, clear skipping_mode

Using memmove is correct here since source and destination overlap.
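
A minimal sketch of the full-skip and partial-skip paths described above. `exclude_bytes` and `skipping_mode` mirror the field names in the PR, but `apply_exclude` itself is a hypothetical helper and the real code in flb_tail_file_chunk() may be structured differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Drop already-processed bytes from the front of a freshly decompressed
 * chunk. Returns the number of bytes that remain at the start of `buf`.
 */
size_t apply_exclude(char *buf, size_t len,
                     uint64_t *exclude_bytes, int *skipping_mode)
{
    size_t keep;

    if (!*skipping_mode) {
        return len;
    }

    if (*exclude_bytes >= len) {
        /* Full skip: the whole chunk was emitted before the restart. */
        *exclude_bytes -= len;
        return 0;
    }

    /* Partial skip: shift the unseen tail to the front (regions overlap). */
    keep = len - (size_t) *exclude_bytes;
    memmove(buf, buf + *exclude_bytes, keep);
    *exclude_bytes = 0;
    *skipping_mode = 0;
    return keep;
}
```

Using the numbers from the compressed-file debug log, a 999999-byte chunk with exclude=1014921 is dropped entirely and leaves exclude=14922; the following 15809-byte chunk then keeps its last 887 bytes and clears skipping mode.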

1933-1954: Solid gzip member boundary detection and anchor update.

The logic correctly:

Tracks decompressed bytes within the current member via skip_bytes

Detects member completion when the decompressor expects a new header and all buffers are empty

Updates anchor_offset to the current raw file position for safe resume

As noted in the PR discussion (by cosmo0920), there's a known corner case with multi-member gzip + multiline where a shutdown between the in-memory skip_bytes increment and the DB persist can cause small duplication on restart. This is technically difficult to eliminate and is appropriately documented as a limitation rather than a bug.
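
The bookkeeping at a member boundary can be sketched like this. The struct is a hypothetical subset of the per-file state, and the `member_done` flag stands in for the real check (decompressor expects a new header and all buffers are empty), which is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical subset of the gzip resume state named in the PR. */
struct gzip_resume_state {
    int64_t  offset;         /* raw (compressed) bytes consumed so far */
    int64_t  anchor_offset;  /* raw offset where the current member starts */
    uint64_t skip_bytes;     /* decompressed bytes emitted from this member */
};

/*
 * Account for `emitted` decompressed bytes. When the member is complete,
 * move the anchor to the current raw position and reset the skip counter
 * so that a later resume starts cleanly at the next member.
 */
void gzip_account(struct gzip_resume_state *st, uint64_t emitted,
                  int member_done)
{
    st->skip_bytes += emitted;

    if (member_done) {
        st->anchor_offset = st->offset;
        st->skip_bytes = 0;
    }
}
```

The duplication window mentioned above sits between the `skip_bytes` increment and the moment the new values reach the DB: a shutdown in between leaves the persisted skip count slightly behind the in-memory one.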

plugins/in_tail/tail_db.c (7)

28-34: LGTM!

Clean callback implementation for detecting query results. This addresses the previous review feedback about using pragma_table_info for reliable column existence detection.

61-107: Robust schema migration using pragma_table_info.

This addresses the previous review feedback:

Uses pragma_table_info to reliably detect column existence

Properly distinguishes between query failures (returns NULL with error log) and missing columns (triggers migration)

Using flb_plg_debug for migration messages is appropriate per the PR discussion
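
The check-then-migrate decision can be sketched without SQLite. In this illustration the list of existing columns is passed in directly instead of being read from pragma_table_info, both function names are hypothetical, and the INTEGER column type is an assumption (the table name `in_tail_files` and `DEFAULT 0` come from the change summary).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` is among the `count` existing column names, else 0. */
int column_exists(const char *const *existing, size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(existing[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Write an ALTER TABLE statement for `column` into `out` when the column
 * is missing. Returns 1 if a statement was written, 0 if no migration is
 * needed, and -1 if `out` is too small to hold the statement.
 */
int build_add_column(const char *const *existing, size_t count,
                     const char *column, char *out, size_t out_size)
{
    int n;

    if (column_exists(existing, count, column)) {
        return 0;
    }

    n = snprintf(out, out_size,
                 "ALTER TABLE in_tail_files ADD COLUMN %s INTEGER DEFAULT 0;",
                 column);
    if (n < 0 || (size_t) n >= out_size) {
        return -1;
    }
    return 1;
}
```

Separating "column already present" (0) from "could not build the statement" (-1) mirrors the distinction drawn above between a missing column and a failed query.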

182-232: LGTM! Proper type handling for skip/anchor columns.

The extended signature correctly uses:

int64_t for offset and anchor (matching sqlite3_column_int64 return type)

uint64_t for skip (matching file->skip_bytes type)

This addresses the previous review concern about potential truncation on platforms where off_t is 32-bit. The added cleanup at lines 204-205 before returning on error is also good practice.

262-263: LGTM!

Correctly binds skip_bytes and anchor_offset to the INSERT statement parameters.

323-359: LGTM! Correct restoration of gzip resume state from DB.

The logic properly:

Retrieves skip_bytes and anchor_offset from the database

Initializes skipping_mode and exclude_bytes when skip_bytes > 0, enabling the skip logic in flb_tail_file_chunk()

Uses correct types matching the db_file_exists signature

373-375: LGTM!

Binding order correctly matches the SQL_UPDATE_OFFSET statement: offset, skip, anchor for SET clause, then db_id for WHERE clause.

444-444: Critical fix: Reset db_id to prevent DB entry resurrection.

Setting db_id = FLB_TAIL_DB_ID_NONE after deletion works in conjunction with the db_id > FLB_TAIL_DB_ID_NONE check in flb_tail_file_remove() to prevent the bug where a deleted file's DB entry could be recreated if offset rewinding occurs after deletion.
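
A minimal sketch of how the sentinel and the guard interact. `DB_ID_NONE`, `tracked_file`, and both functions are hypothetical stand-ins; the value 0 is taken from the PR description, which says db_id is reset to 0 on deletion, and the DB write is simulated with a counter.

```c
#include <stdint.h>

/* Stand-in for FLB_TAIL_DB_ID_NONE; 0 is assumed from the PR description. */
#define DB_ID_NONE 0

/* Hypothetical, simplified view of a tracked file. */
struct tracked_file {
    uint64_t db_id;      /* row id in the DB, or DB_ID_NONE */
    int64_t  offset;     /* in-memory raw offset */
    int      db_writes;  /* counts simulated DB updates */
};

/* Simulate deleting the DB row: forget the row id afterwards. */
void db_delete(struct tracked_file *f)
{
    f->db_id = DB_ID_NONE;
}

/*
 * Update the in-memory offset and persist it only when the file still
 * has a DB row. Returns 1 if the DB was touched, 0 otherwise.
 */
int db_update_offset(struct tracked_file *f, int64_t new_offset)
{
    f->offset = new_offset;

    if (f->db_id > DB_ID_NONE) {
        f->db_writes++;
        return 1;
    }
    return 0;
}
```

After `db_delete`, a later rewind still adjusts the in-memory offset but no longer reaches the DB, so the deleted entry is not recreated.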

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

plugins/in_tail/tail_file.c

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ded9ae and 71208f6.

📒 Files selected for processing (1)

plugins/in_tail/tail_file.c (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.

Applied to files:

plugins/in_tail/tail_file.c

🧬 Code graph analysis (1)

plugins/in_tail/tail_file.c (1)

plugins/in_tail/tail_db.c (1)

flb_tail_db_file_offset (290-321)

🪛 Cppcheck (2.18.0)

plugins/in_tail/tail_file.c

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

[information] Too many #ifdef configurations - cppcheck only checks 12 configurations. Use --force to check all configurations. For more details, use --enable=information.

(toomanyconfigs)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (32)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)

Copilot

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

plugins/in_tail/tail_file.c

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

plugins/in_tail/tail_file.c (1)
1471-1471: Remove trailing whitespace.

There are trailing spaces after #endif on this line.

Apply this diff:
-#endif        
+#endif

🧹 Nitpick comments (1)

plugins/in_tail/tail_file.c (1)
1451-1472: Consider adding a warning for compressed files with buffered data.

While the fix correctly excludes compressed files (where rewinding isn't possible due to gzip decompression state requirements), users monitoring .gz files may be unaware that buffered data can still be lost on shutdown. Consider adding a debug or warning message similar to:
if (file->buf_len > 0) {
    if (file->decompression_context != NULL) {
        flb_plg_debug(ctx->ins, 
                      "inode=%"PRIu64" compressed file %s has %lu bytes buffered; "
                      "rewind not supported for compressed files",
                      file->inode, file->name, (unsigned long)file->buf_len);
    }
    else {
        /* existing rewind logic */
        ...
    }
}
This would help users understand the limitation without affecting correctness.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 71208f6 and f63f1f4.

📒 Files selected for processing (1)

plugins/in_tail/tail_file.c (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.

Applied to files:

plugins/in_tail/tail_file.c

🧬 Code graph analysis (1)

plugins/in_tail/tail_file.c (1)

plugins/in_tail/tail_db.c (1)

flb_tail_db_file_offset (290-321)

🪛 Cppcheck (2.18.0)

plugins/in_tail/tail_file.c

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

[information] Too many #ifdef configurations - cppcheck only checks 12 configurations. Use --force to check all configurations. For more details, use --enable=information.

(toomanyconfigs)

🔇 Additional comments (1)

plugins/in_tail/tail_file.c (1)

1451-1472: LGTM! Rewind logic correctly prevents data loss for regular files.

The implementation properly addresses the data loss issue when Fluent Bit shuts down with unprocessed buffered data:

The condition file->buf_len > 0 && file->decompression_context == NULL correctly identifies regular files with unprocessed data while excluding compressed files (where offset and buf_len have incompatible units).

The rewind arithmetic offset -= buf_len (clamped to 0) correctly repositions the file offset to the start of unprocessed data.

The DB persistence call ensures the rewound offset survives restarts.

This addresses the core issue described in #11265 where partial lines and buffered data were lost on restart.
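The rewind step described in this comment can be sketched in isolation. This is a minimal illustration, not the code from the PR: `struct tail_pos` and `rewind_unprocessed` are hypothetical names, and only the clamped subtraction and the compressed-file exclusion mirror the reviewed logic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail file fields involved in the rewind. */
struct tail_pos {
    int64_t offset;      /* raw file offset already read into the buffer */
    size_t  buf_len;     /* bytes read but not yet processed */
    int     compressed;  /* non-zero when a decompression context exists */
};

/*
 * Rewind the offset by the unprocessed buffer length, clamped at 0.
 * Compressed files are skipped because offset (raw bytes) and buf_len
 * (decompressed bytes) are in different units.
 * Returns 1 if the offset was rewound, 0 if nothing was done.
 */
static int rewind_unprocessed(struct tail_pos *p)
{
    if (p->buf_len == 0 || p->compressed) {
        return 0;
    }
    if (p->offset > (int64_t) p->buf_len) {
        p->offset -= (int64_t) p->buf_len;
    }
    else {
        p->offset = 0;
    }
    return 1;
}
```

In the real plugin the rewound offset would then be persisted to the DB so that it survives the restart; that step is left out of this sketch.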

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

plugins/in_tail/tail_file.c (2)
1451-1472: Core rewind logic looks correct; consider adding warning for compressed files.

The check for file->decompression_context == NULL properly prevents the compressed-file arithmetic bug flagged in earlier reviews. For regular files, the offset and buffer length are in the same units, so the subtraction and DB update are valid.

However, when compressed files have buffered decompressed data at shutdown, that data will be silently lost (since rewinding is skipped). Consider logging a warning in that case so users are aware:
 if (file->buf_len > 0 && file->decompression_context == NULL) {
     /*
      * If there is data in the buffer, it means it was not processed.
      * We must rewind the offset to ensure this data is re-read on restart.
      */
     off_t old_offset = file->offset;

     if (file->offset > file->buf_len) {
         file->offset -= file->buf_len;
     } else {
         file->offset = 0;
     }

     flb_plg_debug(ctx->ins, "inode=%"PRIu64" rewind offset for %s: old=%"PRId64" new=%"PRId64" (buf_len=%lu)",
                   file->inode, file->name, old_offset, file->offset, (unsigned long)file->buf_len);

 #ifdef FLB_HAVE_SQLDB
     if (ctx->db) {
         flb_tail_db_file_offset(file, ctx);
     }
 #endif
+}
+else if (file->buf_len > 0 && file->decompression_context != NULL) {
+    flb_plg_warn(ctx->ins, "inode=%"PRIu64" cannot rewind compressed file %s; "
+                 "%lu decompressed bytes in buffer may be lost on restart",
+                 file->inode, file->name, (unsigned long)file->buf_len);
 }
1471-1471: Remove trailing whitespace.

Minor formatting issue: extra spaces after #endif.
-#endif        
+#endif

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between f63f1f4 and 76f3c05.

📒 Files selected for processing (1)

plugins/in_tail/tail_file.c (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.

Applied to files:

plugins/in_tail/tail_file.c

🧬 Code graph analysis (1)

plugins/in_tail/tail_file.c (1)

plugins/in_tail/tail_db.c (1)

flb_tail_db_file_offset (290-321)

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugins/in_tail/tail_db.c (1)

258-263: Verify SQL_INSERT_FILE parameter order and address skip_bytes type mismatch.

The bindings at lines 258-263 correctly match SQL_INSERT_FILE's parameter order (name, offset, inode, created, skip, anchor). However, skip_bytes is defined as uint64_t but bound via sqlite3_bind_int64, which expects a signed sqlite3_int64. Values above INT64_MAX are unlikely for a byte offset, but the conversion is currently implicit. Consider either casting to sqlite3_int64 explicitly at the bind call, or keeping the call as is and adding a comment documenting the assumption that skip_bytes stays within the signed 64-bit range.
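One way to make the signedness assumption explicit is a small range check before the bind. The helper below is a hypothetical sketch (`skip_bytes_to_int64` does not exist in the PR); it only shows the conversion, and the caller would pass the result to sqlite3_bind_int64.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert an unsigned byte counter to the signed
 * 64-bit value that sqlite3_bind_int64() expects.
 * Returns 0 on success, -1 if the value does not fit; *out is only
 * written on success.
 */
static int skip_bytes_to_int64(uint64_t skip_bytes, int64_t *out)
{
    if (skip_bytes > (uint64_t) INT64_MAX) {
        return -1;
    }
    *out = (int64_t) skip_bytes;
    return 0;
}
```

Whether an out-of-range value should be treated as an error or simply documented as impossible is a design choice for the maintainers; the sketch takes the stricter option.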

🧹 Nitpick comments (1)

plugins/in_tail/tail_file.c (1)

1027-1141: Document multi-member gzip stream_offset limitation more prominently.

The comment at lines 1090-1095 acknowledges that stream_offset = skip_bytes is only correct for single-member gzip files. For multi-member gzip, skip_bytes resets at member boundaries, so stream_offset won't reflect total decompressed bytes from all prior members. This could affect offset_key accuracy when users concatenate multiple gzip streams.

Consider either:

Adding a more prominent warning in the code or documentation that offset_key for multi-member gzip reflects position within the current member, not total decompressed bytes across all members.

Tracking total decompressed bytes separately if multi-member gzip is a common use case.

Otherwise, the seek logic correctly handles DB-backed resume for both gzip and plain files, with proper initialization of stream_offset, exclude_bytes, and skipping_mode.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 1263901 and 8640940.

📒 Files selected for processing (6)

plugins/in_tail/tail_db.c (11 hunks)
plugins/in_tail/tail_file.c (9 hunks)
plugins/in_tail/tail_file_internal.h (1 hunks)
plugins/in_tail/tail_fs_inotify.c (1 hunks)
plugins/in_tail/tail_fs_stat.c (1 hunks)
plugins/in_tail/tail_sql.h (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

plugins/in_tail/tail_file_internal.h
plugins/in_tail/tail_fs_stat.c

🧰 Additional context used

🧠 Learnings (12)

📓 Common learnings

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.

Applied to files:

plugins/in_tail/tail_db.c
plugins/in_tail/tail_file.c
plugins/in_tail/tail_fs_inotify.c

📚 Learning: 2025-09-22T15:59:55.794Z

Learnt from: nicknezis
Repo: fluent/fluent-bit PR: 10882
File: plugins/out_http/http.c:112-116
Timestamp: 2025-09-22T15:59:55.794Z
Learning: When users consider bug fixes out of scope for their focused PRs, it's appropriate to create separate GitHub issues to track those concerns rather than expanding the current PR scope.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:25:27.250Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:24:26.170Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:25:27.250Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:25:02.561Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:24:55.855Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-29T06:24:44.797Z

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-09-08T11:21:33.975Z

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-08-31T12:46:11.940Z

Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

plugins/in_tail/tail_file.c

📚 Learning: 2025-11-21T06:23:29.770Z

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

plugins/in_tail/tail_file.c

🧬 Code graph analysis (2)

plugins/in_tail/tail_db.c (2)

src/flb_sqldb.c (2)

flb_sqldb_query (151-166)

flb_sqldb_close (129-149)

plugins/in_tail/tail_file.h (1)

flb_tail_target_file_name_cmp (64-116)

plugins/in_tail/tail_file.c (1)

plugins/in_tail/tail_db.c (1)

flb_tail_db_file_offset (366-399)

🔇 Additional comments (16)

plugins/in_tail/tail_db.c (6)

28-34: LGTM!

The column-existence callback is correctly implemented. It properly sets the flag when a row is returned from the pragma query, enabling reliable column detection for schema migration.

61-108: LGTM!

The migration logic correctly uses pragma_table_info to detect column existence, distinguishing real errors from missing columns. The pattern is consistent for both skip and anchor columns, with proper error handling and logging.

318-363: LGTM!

The function correctly restores skip_bytes and anchor_offset from the database and initializes the runtime-only exclude_bytes and skipping_mode fields based on the persisted skip state. The logic properly handles both cases (skip > 0 and skip == 0).

424-446: LGTM!

Resetting db_id to FLB_TAIL_DB_ID_NONE after deletion prevents accidental resurrection of deleted DB entries when flb_tail_db_file_offset is called later. This is an important safeguard for data integrity.
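The guard can be illustrated with a reduced model. Everything here is a sketch under assumptions: `struct tracked_file`, `DB_ID_NONE` and both functions are hypothetical names, the sentinel value 0 follows the PR description ("resetting db_id to 0 upon deletion"), and the SQL statements are only indicated by comments.

```c
#include <stdint.h>

#define DB_ID_NONE 0   /* assumed sentinel, mirroring FLB_TAIL_DB_ID_NONE */

/* Hypothetical reduced view of a tracked file. */
struct tracked_file {
    uint64_t db_id;    /* row id in the DB, DB_ID_NONE when no row exists */
    int64_t  offset;   /* current read offset */
};

/* After the row is deleted, forget its id so later updates are skipped. */
static void tracked_file_delete(struct tracked_file *f)
{
    /* ... the DELETE statement would run here ... */
    f->db_id = DB_ID_NONE;
}

/* Returns 1 if an UPDATE would be issued, 0 if it is skipped. */
static int tracked_file_update_offset(const struct tracked_file *f)
{
    if (f->db_id == DB_ID_NONE) {
        /* no row to update; writing here could resurrect a deleted entry */
        return 0;
    }
    /* ... the UPDATE ... WHERE id = db_id statement would run here ... */
    return 1;
}
```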

365-398: The parameter binding order in flb_tail_db_file_offset correctly matches the SQL_UPDATE_OFFSET statement: offset (parameter 1), skip (parameter 2), anchor (parameter 3), and id (parameter 4).

182-221: Column indices for skip and anchor are correct.

The code correctly reads skip from column index 6 and anchor from column index 7, matching the table schema defined in SQL_CREATE_FILES. When SELECT * is executed, SQLite returns columns in the order they appear in the table definition: id (0), name (1), offset (2), inode (3), created (4), rotated (5), skip (6), and anchor (7).

plugins/in_tail/tail_fs_inotify.c (1)

259-281: LGTM!

The truncation handler correctly resets all gzip resume state fields (anchor_offset, skip_bytes, exclude_bytes, skipping_mode) alongside the file offset and buffer, ensuring a clean state after truncation. This initialization is consistent with similar handling in tail_fs_stat.c and adjust_counters in tail_file.c.

plugins/in_tail/tail_sql.h (3)

30-40: LGTM!

The table schema correctly adds skip and anchor columns with INTEGER DEFAULT 0, ensuring backward compatibility with existing databases. Column indices (skip=6, anchor=7) align with the reads in db_file_exists.

45-47: LGTM!

The SQL_INSERT_FILE statement correctly includes skip and anchor in both the column list and VALUES clause. The parameter order matches the binding sequence in db_file_insert (tail_db.c lines 258-263).

52-53: LGTM!

The SQL_UPDATE_OFFSET statement correctly updates all three position-tracking fields (offset, skip, anchor) atomically. The parameter order matches the binding sequence in flb_tail_db_file_offset (tail_db.c lines 372-375).

plugins/in_tail/tail_file.c (6)

47-47: LGTM!

Including flb_compression.h is appropriate for the gzip decompression functionality added in this PR.

1328-1338: LGTM!

All gzip resume fields (anchor_offset, skip_bytes, exclude_bytes, skipping_mode) are correctly initialized to zero/false when a new file is appended, ensuring clean initial state.

1509-1548: LGTM!

The offset rewind logic correctly handles buffered data on shutdown:

For non-compressed files: rewinds the offset by buf_len (with bounds checking) so unprocessed data is re-read on restart.

For compressed files: logs a warning explaining that accurate rewinding is infeasible with streaming decompression.

DB updates are properly guarded by db_id > FLB_TAIL_DB_ID_NONE to prevent resurrecting deleted entries.

This addresses the data loss issue described in the PR objectives.

1657-1679: LGTM!

The truncation handler in adjust_counters correctly resets all gzip resume state fields (anchor_offset, skip_bytes, exclude_bytes, skipping_mode) when a file is truncated, consistent with the inotify and stat-based truncation handlers.

1871-1891: LGTM!

The skip logic during decompression correctly handles resuming from a mid-stream position:

When exclude_bytes >= decompressed_data_length, all newly decompressed data is skipped and exclude_bytes is decremented.

When exclude_bytes < decompressed_data_length, the remaining bytes are shifted to the buffer start using memmove, and skipping_mode is cleared.

This enables accurate gzip resume from the persisted skip_bytes position.
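The two cases above can be sketched as a standalone function. This is an illustration rather than the PR's code: `struct skip_state` and `skip_decompressed` are hypothetical names, and clearing skipping_mode when exclude_bytes reaches exactly 0 in the first case is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical runtime-only skip state for a resumed gzip stream. */
struct skip_state {
    uint64_t exclude_bytes;  /* decompressed bytes still to discard */
    int      skipping_mode;  /* non-zero while discarding */
};

/*
 * Drop bytes that were already delivered before the restart from a
 * freshly decompressed chunk. Returns the number of bytes that remain,
 * moved to the start of buf.
 */
static size_t skip_decompressed(struct skip_state *s, char *buf, size_t len)
{
    size_t keep;

    if (!s->skipping_mode) {
        return len;
    }
    if (s->exclude_bytes >= len) {
        /* the whole chunk was delivered before the restart */
        s->exclude_bytes -= len;
        if (s->exclude_bytes == 0) {
            s->skipping_mode = 0;   /* assumption: stop skipping at zero */
        }
        return 0;
    }
    /* only the head of the chunk is old; shift the tail to the front */
    keep = len - (size_t) s->exclude_bytes;
    memmove(buf, buf + s->exclude_bytes, keep);
    s->exclude_bytes = 0;
    s->skipping_mode = 0;
    return keep;
}
```

Because the counter being decremented is the runtime-only exclude_bytes, the persisted skip_bytes is left untouched while skipping, which is the separation the PR description relies on.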

1932-1953: LGTM!

The gzip member boundary handling correctly tracks position using the anchor/skip pattern:

skip_bytes is incremented by processed_bytes to track position within the current member.

When a member completes (decompressor transitions to EXPECTING_HEADER state and all buffers are empty), anchor_offset advances to the current raw file position and skip_bytes resets to 0.

This enables resume at member boundaries for multi-member gzip files, addressing the data loss issue for compressed inputs mentioned in the PR objectives.
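A reduced model of the anchor/skip bookkeeping, as a sketch only: `struct gzip_pos` and both functions are hypothetical names, and detecting the member boundary (decompressor state and empty buffers) is left to the caller.

```c
#include <stdint.h>

/* Hypothetical persisted position for a gzip file. */
struct gzip_pos {
    int64_t  anchor_offset;  /* raw offset where the current member starts */
    uint64_t skip_bytes;     /* decompressed bytes consumed in this member */
};

/* Record that processed_bytes of decompressed data were consumed. */
static void gzip_pos_consume(struct gzip_pos *p, uint64_t processed_bytes)
{
    p->skip_bytes += processed_bytes;
}

/* Called when a member has completed and all buffers are empty. */
static void gzip_pos_member_done(struct gzip_pos *p, int64_t raw_offset)
{
    p->anchor_offset = raw_offset;
    p->skip_bytes = 0;
}
```

On restart, a reader following this scheme would seek to anchor_offset, start decompressing there, and discard the first skip_bytes bytes of output before delivering data again.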

jinyongchoi · 2025-12-18T05:42:18Z

Thanks for the detailed analysis. I fully agree with your opinion. Although there is a risk of duplication in case of abrupt shutdown, it is technically difficult to solve completely, and duplication is definitely better than data loss.

Also, regarding your question about the logs, I have changed the log level of the database migration messages to debug. This ensures they are not too noisy during normal operation while still being available for troubleshooting if needed.

Finally, should I add a note about the limitation (potential duplication on crash) to the documentation? I think adding a warning/note to the 'Database file' section would be helpful for users.

Let me know what you think!
Thanks!

cosmo0920

I found one small nitpick: the PR does not follow our coding style.
Please follow our style for defining variables.

plugins/in_tail/tail_file.c

cosmo0920 · 2025-12-19T06:27:37Z

Finally, should I add a note about the limitation (potential duplication on crash) to the documentation? I think adding a warning/note to the 'Database file' section would be helpful for users.

Let me know what you think! Thanks!

I think we need a Note annotation in the official documentation describing the possibility of database corruption, and that should go in a corresponding documentation PR. These may be corner cases, but they are technically hard to solve cleanly.

Previously, when tailing gzip files, there was no mechanism to persistently store the uncompressed position ('skip_bytes'). This meant that upon restart, the plugin could not correctly locate the reading position, identifying it as a rotation or new file case, potentially leading to data loss. To fix this, 'skip_bytes' is now stored in the database to persist the uncompressed offset. Additionally, 'exclude_bytes' is introduced to track runtime skipping without interfering with the persistent value. The SQLite schema has been updated to include 'anchor_offset' and 'skip_bytes' columns to support these features. Signed-off-by: jinyong.choi <[email protected]>

jinyongchoi · 2025-12-19T07:09:25Z

Finally, should I add a note about the limitation (potential duplication on crash) to the documentation? I think adding a warning/note to the 'Database file' section would be helpful for users.
Let me know what you think! Thanks!

I suppose we need to add a Note annotation to the official documentation that describes the possibility of database corruption, and that should go in a corresponding documentation PR. These may be corner cases, but they are technically hard to solve cleanly.

Got it! I'll create a separate PR for the documentation.
Thanks!

This commit adds a 'Data Reliability and Recovery' hint to the Tail input plugin documentation. It clarifies the behavior of the database offset mechanism during unexpected shutdowns (e.g., system crash, power loss). Specifically, it explains that while Fluent Bit guarantees at-least-once delivery, there is a possibility of slight offset lag and minimal data duplication upon recovery. This ensures users understand that no data is lost even in these scenarios.

Related to: fluent/fluent-bit#11269

Signed-off-by: jinyong.choi <[email protected]>

This commit adds a 'Data Reliability and Recovery' hint to the Tail input plugin documentation. It clarifies the behavior of the database offset mechanism during unexpected shutdowns (e.g., system crash, power loss). Specifically, it explains that while Fluent Bit guarantees at-least-once delivery, there is a possibility of slight offset lag and minimal data duplication upon recovery. This ensures users understand that no data is lost even in these scenarios.

refs: fluent/fluent-bit#11269

Signed-off-by: jinyong.choi <[email protected]>
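The offset arithmetic behind the at-least-once behavior discussed in this thread can be sketched in a few lines. This is an illustrative model only: `safe_offset` and `duplicated_on_recovery` are hypothetical helpers, not functions from the plugin, and the sketch assumes the persisted offset never runs ahead of what has actually been delivered.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset that is safe to persist on a clean shutdown: the position read
 * from the file minus the bytes still sitting unprocessed in the buffer.
 */
static int64_t safe_offset(int64_t read_offset, int64_t buf_len)
{
    assert(buf_len >= 0 && buf_len <= read_offset);
    return read_offset - buf_len;
}

/*
 * After a crash the persisted offset may lag behind what was delivered.
 * Reading resumes at the persisted offset, so the lag is re-read and shows
 * up as duplicated bytes; nothing before the persisted offset is skipped.
 */
static int64_t duplicated_on_recovery(int64_t persisted_offset,
                                      int64_t delivered_offset)
{
    assert(persisted_offset <= delivered_offset);
    return delivered_offset - persisted_offset;
}
```

For example, with 1000 bytes read and 200 bytes still buffered, the offset to persist is 800. If a crash happens when 950 bytes were delivered but only 800 were persisted, the restart re-reads 150 bytes, which is duplication rather than loss.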

Copilot AI review requested due to automatic review settings December 9, 2025 02:04

jinyongchoi requested review from cosmo0920 and edsiper as code owners December 9, 2025 02:04

github-actions bot added the docs-required label Dec 9, 2025

jinyongchoi temporarily deployed to pr December 9, 2025 02:04 — with GitHub Actions Inactive

Copilot started reviewing on behalf of jinyongchoi December 9, 2025 02:04

chatgpt-codex-connector bot reviewed Dec 9, 2025

coderabbitai bot reviewed Dec 9, 2025

Copilot AI reviewed Dec 9, 2025

jinyongchoi temporarily deployed to pr December 9, 2025 02:26 — with GitHub Actions Inactive

jinyongchoi force-pushed the fix/11265-in-tail-data-loss branch from 71208f6 to f63f1f4 Compare December 9, 2025 03:10

jinyongchoi temporarily deployed to pr December 9, 2025 03:10 — with GitHub Actions Inactive

jinyongchoi force-pushed the fix/11265-in-tail-data-loss branch from f63f1f4 to 76f3c05 Compare December 9, 2025 03:14

jinyongchoi temporarily deployed to pr December 9, 2025 03:14 — with GitHub Actions Inactive

coderabbitai bot reviewed Dec 9, 2025

jinyongchoi temporarily deployed to pr December 9, 2025 03:35 — with GitHub Actions Inactive

jinyongchoi temporarily deployed to pr December 9, 2025 03:36 — with GitHub Actions Inactive

jinyongchoi temporarily deployed to pr December 18, 2025 05:31 — with GitHub Actions Inactive

coderabbitai bot reviewed Dec 18, 2025

jinyongchoi temporarily deployed to pr December 18, 2025 05:50 — with GitHub Actions Inactive

cosmo0920 requested changes Dec 19, 2025

jinyongchoi force-pushed the fix/11265-in-tail-data-loss branch from 8640940 to fdd8174 Compare December 19, 2025 06:45

jinyongchoi temporarily deployed to pr December 19, 2025 06:45 — with GitHub Actions Inactive

cosmo0920 approved these changes Dec 19, 2025

cosmo0920 added this to the Fluent Bit v5.0 milestone Dec 19, 2025

jinyongchoi temporarily deployed to pr December 19, 2025 07:04 — with GitHub Actions Inactive

jinyongchoi temporarily deployed to pr December 19, 2025 07:05 — with GitHub Actions Inactive

jinyongchoi mentioned this pull request Dec 19, 2025

in_tail: Add data reliability note fluent/fluent-bit-docs#2310

Open
