Skip to content

server timestamps in logs received from server#434

Open
korowa wants to merge 1 commit intomymarilyn:masterfrom
korowa:server-log-timestamp
Open

server timestamps in logs received from server#434
korowa wants to merge 1 commit intomymarilyn:masterfrom
korowa:server-log-timestamp

Conversation

@korowa
Copy link

@korowa korowa commented May 13, 2024

Checklist:

  • Add tests that demonstrate the correct behavior of the change. Tests should fail without the change.
  • Add or update relevant docs, in the docs folder and in code.
  • Ensure PR doesn't contain untouched code reformatting: spaces, etc.
  • Run flake8 and fix issues.
  • Run pytest no tests failed. See https://clickhouse-driver.readthedocs.io/en/latest/development.html.

Context:

For certain types of queries (e.g. backup/restore operations) ClickHouse server might trigger sending logs to the client with significant delay. In such cases, if client has send_logs_level setting enabled, it might be helpful (or less confusing) to also have server event timestamp on client side.

This PR adds server timestamp to log messages by default, and also adds an ability to disable log attributes concatenation to a single string message (which may be helpful in case of structured logging, to avoid parsing the string back to original attributes).

@korowa korowa force-pushed the server-log-timestamp branch from f94628d to 3f539cd Compare May 13, 2024 09:28
@korowa korowa marked this pull request as draft May 13, 2024 11:44
@coveralls
Copy link

coveralls commented May 14, 2024

Coverage Status

coverage: 96.26% (+0.009%) from 96.251%
when pulling 2a17ea5 on korowa:server-log-timestamp
into 59ea586 on mymarilyn:master.

@korowa korowa force-pushed the server-log-timestamp branch 3 times, most recently from 71b0109 to af09da7 Compare May 16, 2024 07:20
@korowa korowa changed the title preserve server log timestamp in clickhouse logs server timestamps in logs received from server May 16, 2024
@korowa korowa marked this pull request as ready for review May 16, 2024 07:20
@korowa korowa requested a review from xzkostyan May 20, 2024 06:37
@korowa
Copy link
Author

korowa commented May 24, 2024

@xzkostyan could you please take a look when you have time?

@xzkostyan
Copy link
Member

Replacing client datetime with server datetime breaks user experience.

Example: I'm digging into some application issue and grep-ing logs by date/datetime. If server has significant time drift from client there will be no logs in grep output in application logs.

I'd suggest to add another timestamp into log for server timestamp. Result format will be:

        logger.info(
            '[ %s ] [ %s ] [ %s ] {%s} <%s> %s: %s',
            row['server_timestamp'],
            row['host_name'],
            thread_id,
            row['query_id'],
            priority,
            row['source'],
            row['text']
            row['text']
        )

Result log will be

2018-12-14 10:24:53,875 INFO     clickhouse_driver.log: [2018-12-14 10:22:53,875] [ klebedev-ThinkPad-T460 ] [ 25 ] {b328ad33-60e8-4012-b4cc-97f44a7b28f2} <Debug> MemoryTracker: Peak memory usage (for query): 40.23 KiB.

What do you think?

@korowa korowa marked this pull request as draft June 24, 2024 15:28
@korowa korowa force-pushed the server-log-timestamp branch 2 times, most recently from a332f0f to e83c2da Compare June 27, 2024 08:36
Copy link
Contributor

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@korowa I guess it is not a draft anymore?

row['thread_id'] = row.get('thread_id') or row['thread_number']

# put log block row into LogRecord extra
extra = {"server"+k: v for k, v in row.items()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this will be better?

Suggested change
extra = {"server"+k: v for k, v in row.items()}
extra = {"clickhouse"+k: v for k, v in row.items()}

Also I don't see how does _ added here, can you clarify?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't -- all test failres has been caused by this.

Regarding attribute prefix -- "server" simply is shorter and doesn't conflict with any of LogRecord attributes, so it seemed to be sufficient, but I don't have any string preferences here.

@korowa korowa force-pushed the server-log-timestamp branch 2 times, most recently from be19911 to bb16df5 Compare July 1, 2024 07:43
@korowa korowa force-pushed the server-log-timestamp branch from bb16df5 to 2a17ea5 Compare July 1, 2024 07:53
@korowa korowa marked this pull request as ready for review July 1, 2024 07:53
@korowa
Copy link
Author

korowa commented Jul 1, 2024

@xzkostyan I've added server timestamp to the log message as a default behaviour.

In addition, now there is an ability to disable this concatenation, and set log format, using any of ClickHouse log row attributes, which are passed as extra to the log call.

return True


def configure_logger(raw_log_record=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW defaults can be error prone, and in this particular case it could be avoided, the function is not very common. So, I would rather remove the default here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants