Deadlock during reload #10274

@TimeToogo

Description

We encountered a case where Fluent Bit deadlocked during a reload (SIGHUP triggered via the API).

Backtrace

(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0x7fd3d5a00740 <tzset_lock>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait_private (futex=futex@entry=0x7fd3d5a00740 <tzset_lock>) at lowlevellock.c:35
#2  0x00007fd3d5909c1c in __tz_convert (timer=1745935171, use_localtime=1, tp=0x7ffc9a3aefd0) at tzset.c:572
#3  0x000000000054108d in flb_log_construct ()
#4  0x000000000054140c in flb_log_print ()
#5  0x00000000005787b2 in flb_reload ()
#6  0x00000000004b152f in flb_main ()
#7  0x00007fd3d583fee0 in __libc_start_call_main (main=main@entry=0x4af110 <main>, argc=argc@entry=3, argv=argv@entry=0x7ffc9a3b03d8) at ../sysdeps/nptl/libc_start_call_main.h:58
#8  0x00007fd3d583ff90 in __libc_start_main_impl (main=0x4af110 <main>, argc=3, argv=0x7ffc9a3b03d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc9a3b03c8) at ../csu/libc-start.c:389
#9  0x00000000004af155 in _start ()

Environment

  • Amazon Linux 2023 6.1.132-147.221.amzn2023.x86_64
  • Glibc 2.34
  • Fluent Bit v4.0.1

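In the backtrace, the main thread is blocked on glibc's tzset_lock inside __tz_convert, reached from flb_log_construct. After post-mortem debugging of one instance, a possible culprit seems to be the use of localtime (AS-Unsafe) in a signal handler: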

cur = localtime(&now);
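
For illustration, here is a minimal sketch of one common way to avoid this class of deadlock: the handler only records that the signal arrived, and the timestamped log line is produced later from normal (non-signal) context. This is not Fluent Bit's code or a proposed patch; the names reload_requested, on_sighup, install_reload_handler and poll_reload are hypothetical. Note that simply switching to localtime_r inside the handler would probably not help, since it is not on the POSIX list of async-signal-safe functions either.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical flag: written by the handler, consumed by the main loop. */
static volatile sig_atomic_t reload_requested = 0;

/* The handler performs only an async-signal-safe store.
 * No logging, no localtime(), no locks. */
static void on_sighup(int signo)
{
    (void) signo;
    reload_requested = 1;
}

/* Install the SIGHUP handler. Returns 0 on success, -1 on error. */
static int install_reload_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;

    return sigaction(SIGHUP, &sa, NULL);
}

/* Called from the main loop, outside signal context.
 * Returns 1 if a pending reload request was consumed, 0 otherwise. */
static int poll_reload(void)
{
    struct tm tm_buf;
    char stamp[32];
    time_t now;

    if (!reload_requested) {
        return 0;
    }
    reload_requested = 0;

    /* Taking the libc timezone lock is fine here: we are not
     * interrupting code that may already hold it. */
    now = time(NULL);
    if (localtime_r(&now, &tm_buf) != NULL &&
        strftime(stamp, sizeof(stamp), "%Y/%m/%d %H:%M:%S", &tm_buf) > 0) {
        fprintf(stderr, "[%s] reload requested\n", stamp);
    }

    /* The actual reload work would go here. */
    return 1;
}
```

A caller would run install_reload_handler() once at startup and call poll_reload() on each iteration of its event loop. An event-loop based program might prefer a self-pipe or signalfd so the loop wakes up immediately, but the principle is the same: nothing AS-Unsafe runs inside the handler.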
