Bug Report

Describe the bug

When running under high load we encounter seg faults:
#0 0xffff98bcd810 in ???() at ???:0
#1 0xffff98bcffa3 in ???() at ???:0
#2 0xffff98bd0d4f in ???() at ???:0
#3 0x5f884f in msgpack_sbuffer_write() at lib/msgpack-c/include/msgpack/sbuffer.h:81
#4 0x5b8367 in msgpack_pack_map() at lib/msgpack-c/include/msgpack/pack_template.h:753
#5 0x5bc78b in flb_mp_map_header_init() at src/flb_mp.c:326
#6 0x5f8a83 in flb_log_event_encoder_dynamic_field_scope_enter() at src/flb_log_event_encoder_dynamic_field.c:70
#7 0x5f8b7b in flb_log_event_encoder_dynamic_field_begin_map() at src/flb_log_event_encoder_dynamic_field.c:117
#8 0x5ee66f in flb_log_event_encoder_begin_record() at src/flb_log_event_encoder.c:250
#9 0xafdcc3 in apply_modifying_rules() at plugins/filter_modify/modify.c:1414
#10 0xafe127 in cb_modify_filter() at plugins/filter_modify/modify.c:1526
#11 0x4dcc53 in flb_filter_do() at src/flb_filter.c:161
#12 0x4d25e7 in input_chunk_append_raw() at src/flb_input_chunk.c:1608
#13 0x4d2e4f in flb_input_chunk_append_raw() at src/flb_input_chunk.c:1929
#14 0x5d2923 in input_log_append() at src/flb_input_log.c:71
#15 0x5d29ab in flb_input_log_append() at src/flb_input_log.c:90
#16 0x744ccb in ml_stream_buffer_flush() at plugins/in_tail/tail_file.c:412
#17 0x745ceb in ml_flush_callback() at plugins/in_tail/tail_file.c:919
#18 0x574883 in flb_ml_flush_stream_group() at src/multiline/flb_ml.c:1516
#19 0x571e13 in flb_ml_flush_parser_instance() at src/multiline/flb_ml.c:117
#20 0x5fd42f in flb_ml_stream_id_destroy_all() at src/multiline/flb_ml_stream.c:316
#21 0x746cb7 in flb_tail_file_remove() at plugins/in_tail/tail_file.c:1256
#22 0x74898b in check_purge_deleted_file() at plugins/in_tail/tail_file.c:1936
#23 0x748d07 in flb_tail_file_purge() at plugins/in_tail/tail_file.c:1992
#24 0x4cbb13 in flb_input_collector_fd() at src/flb_input.c:1982
#25 0x50f3af in flb_engine_handle_event() at src/flb_engine.c:577
#26 0x50f3af in flb_engine_start() at src/flb_engine.c:960
#27 0x4ad693 in flb_lib_worker() at src/flb_lib.c:835
#28 0xffff98bc0933 in ???() at ???:0
#29 0xffff98b64e5b in ???() at ???:0
#30 0xffffffffffffffff in ???() at ???:0
#0 0xffffb400e810 in ???() at ???:0
#1 0xffffb4010fa3 in ???() at ???:0
#2 0xffffb4011d4f in ???() at ???:0
#3 0x1219a63 in msgpack_unpacker_init() at lib/msgpack-c/src/unpack.c:372
#4 0xafd9e3 in apply_modifying_rules() at plugins/filter_modify/modify.c:1372
#5 0xafe127 in cb_modify_filter() at plugins/filter_modify/modify.c:1526
#6 0x4dcc53 in flb_filter_do() at src/flb_filter.c:161
#7 0x4d25e7 in input_chunk_append_raw() at src/flb_input_chunk.c:1608
#8 0x4d2e4f in flb_input_chunk_append_raw() at src/flb_input_chunk.c:1929
#9 0x5d2923 in input_log_append() at src/flb_input_log.c:71
#10 0x5d29ab in flb_input_log_append() at src/flb_input_log.c:90
#11 0x744ccb in ml_stream_buffer_flush() at plugins/in_tail/tail_file.c:412
#12 0x745ceb in ml_flush_callback() at plugins/in_tail/tail_file.c:919
#13 0x574883 in flb_ml_flush_stream_group() at src/multiline/flb_ml.c:1516
#14 0x571e13 in flb_ml_flush_parser_instance() at src/multiline/flb_ml.c:117
#15 0x5fd42f in flb_ml_stream_id_destroy_all() at src/multiline/flb_ml_stream.c:316
#16 0x746cb7 in flb_tail_file_remove() at plugins/in_tail/tail_file.c:1256
#17 0x74898b in check_purge_deleted_file() at plugins/in_tail/tail_file.c:1936
#18 0x748d07 in flb_tail_file_purge() at plugins/in_tail/tail_file.c:1992
#19 0x4cbb13 in flb_input_collector_fd() at src/flb_input.c:1982
#20 0x50f3af in flb_engine_handle_event() at src/flb_engine.c:577
#21 0x50f3af in flb_engine_start() at src/flb_engine.c:960
#22 0x4ad693 in flb_lib_worker() at src/flb_lib.c:835
#23 0xffffb4001933 in ???() at ???:0
#24 0xffffb3fa5e5b in ???() at ???:0
#25 0xffffffffffffffff in ???() at ???:0
Aborted (core dumped)
Environment name and version (e.g. Kubernetes? What version?): EC2
Server type and version: AWS Linux c7g.large
Operating System and version: Amazon Linux
Filters and plugins: tail input, java_multiline, java_capture, modify, http output
Additional context
Stress testing fluent-bit to understand its performance limitations. We may need to throttle fluent-bit, but it is unclear whether that would actually resolve the seg fault.
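If throttling is attempted, Fluent Bit ships a throttle filter that caps the record rate over a sliding window. A minimal sketch (the rate and window values are illustrative, not tuned for this workload) would sit before the modify filter:

```ini
[FILTER]
    name      throttle
    match     *
    rate      25000     # records per interval; illustrative value near the stable rate
    window    5         # sliding window size used to average the rate
    interval  1s
```

Whether this prevents the crash or merely delays it would still need to be verified, since the corruption appears inside the encoder rather than in the input path.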
The stack traces indicate memory corruption around:
fluent-bit/lib/msgpack-c/src/unpack.c, line 372 (at commit 81f62b9)
fluent-bit/lib/msgpack-c/include/msgpack/sbuffer.h, line 81 (at commit 81f62b9)
Valgrind logs:
valgrindx-1.log
valgrindx.log
Any advice on how to troubleshoot further would be well received.
To Reproduce
It seems to occur under high stress. We can reproduce it within 1-2 minutes on a c7g.large EC2 instance tailing a log that is produced at 50k lines/s; at 25k lines/s performance appears stable. The log is a Java log that rotates once it hits 500 MB and is deleted once there are more than 3 logs. At 50k lines/s we expect the log producer is running faster than fluent-bit is able to consume.
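The rotate-and-purge pattern described above can be sketched as a small load generator. This is a hypothetical stand-in for our actual test harness; the file name, line format, and thresholds are illustrative (scaled down from the real 500 MB / 3-file setup so it runs quickly):

```python
import os
import time

def write_logs(path, total_lines, rotate_bytes=500 * 1024 * 1024, max_files=3):
    """Append Java-style log lines to `path`, rotating at a size threshold
    and deleting the oldest rotation once more than `max_files` exist."""
    written = 0
    index = 0
    f = open(path, "a")
    while written < total_lines:
        f.write(time.strftime("%Y-%m-%d %H:%M:%S")
                + " INFO  [main] com.example.App - synthetic log line\n")
        written += 1
        if f.tell() >= rotate_bytes:
            f.close()
            index += 1
            os.rename(path, "%s.%d" % (path, index))
            # Purge: keep only the newest `max_files` rotated logs,
            # mirroring the deletion behaviour that triggers
            # flb_tail_file_purge() in the traces above.
            if index > max_files:
                oldest = "%s.%d" % (path, index - max_files)
                if os.path.exists(oldest):
                    os.remove(oldest)
            f = open(path, "a")
    f.close()
```

Running this against a tailed path while the producer outpaces fluent-bit exercises the same rotation/purge code path shown in both backtraces.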
Expected behavior
Performance should degrade without a seg fault crash.
Screenshots
Your Environment
config.zip
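The actual configuration is attached in config.zip; as a rough illustration of the pipeline listed under "Filters and plugins" (paths, rule, and endpoint below are assumptions, not taken from the attachment, and the report's java_multiline/java_capture names suggest custom parsers for which the built-in `java` multiline parser stands in here):

```ini
[SERVICE]
    flush             1

[INPUT]
    name              tail
    path              /var/log/app/server.log   # assumed path
    multiline.parser  java                      # built-in Java multiline parser
    refresh_interval  5

[FILTER]
    name    modify
    match   *
    add     source stress-test                  # placeholder modify rule

[OUTPUT]
    name    http
    match   *
    host    127.0.0.1                           # assumed endpoint
    port    9880
```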