Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(log-collector-script): journal log with precise timestamp #2151

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

guessi
Copy link
Member

@guessi guessi commented Feb 12, 2025

Issue #, if available:

Description of changes:

  • Make it possible to log timestamp more precisely.
    • Before: Feb 12 06:25:29
      • No timezone info (it might have different timezone setup at node bootstrap).
      • Round to second only, no that precise.
    • After: 2025-02-12T06:25:29.149489+0000
      • With timezone info included.
      • Explicitly display timestamp, down to nanosecond.
  • Unify journalctl commands inside log-collector script.
    • With ${JOURNALCTL_CMD} defined, try to unify the command with --output flag set.
    • Unify the cmd/flag order, timeout -> journalctl -> --unit --> --since (if needed)
  • Improve readability
    • Before: --since $(date -d "-10 days" '+%Y-%m-%d %H:%M')
      • Not that easy to understand for newbie.
    • After: --since "-10d"
      • Easier to understand and easier to read.

Testing Done

readonly JOURNALCTL_CMD='/usr/bin/journalctl --output=short-iso-precise'

# Collect log start from 10 days ago, could be '-10d (10 days ago)', '-10h (10 hours ago)', '-10m (10 minutes ago)'
readonly JOURNALCTL_SINCE='-10d'
Copy link
Member Author

@guessi guessi Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to gather log since 30 days ago, it could easily be done by setting the value as -30d.

readonly PROGRAM_SOURCE="https://github.com/awslabs/amazon-eks-ami/blob/main/log-collector-script/"
readonly PROGRAM_NAME="$(basename "$0" .sh)"
readonly PROGRAM_DIR="/opt/log-collector"
readonly LOG_DIR="/var/log"
readonly COLLECT_DIR="/tmp/eks-log-collector"
readonly CURRENT_TIME=$(date --utc +%Y-%m-%d_%H%M-%Z)
readonly DAYS_10=$(date -d "-10 days" '+%Y-%m-%d %H:%M')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before change, it is confused to have hard-coded DAYS_10 here. It's just a variable, would better to have no explicitly days defined in variable name.


timeout 75 journalctl --unit=nodeadm-run --since "${DAYS_10}" > "${COLLECT_DIR}"/nodeadm/nodeadm-run.log
timeout 75 ${JOURNALCTL_CMD} --unit=nodeadm-config > "${COLLECT_DIR}"/nodeadm/nodeadm-config.log
timeout 75 ${JOURNALCTL_CMD} --unit=nodeadm-run > "${COLLECT_DIR}"/nodeadm/nodeadm-run.log
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add --since for the nodadm-{config,run} services.

@guessi guessi force-pushed the fine-tune-log-collector branch from e909a99 to d275801 Compare February 12, 2025 10:36
@@ -780,7 +788,7 @@ get_containerd_info() {
# force containerd to dump goroutines
timeout 75 killall -sUSR1 containerd
timeout 75 containerd config dump > "${COLLECT_DIR}"/containerd/containerd-config.txt 2>&1 || echo -e "\tTimed out, ignoring \"containerd info output \" "
timeout 75 journalctl -u containerd > "${COLLECT_DIR}"/containerd/containerd-log.txt 2>&1 || echo -e "\tTimed out, ignoring \"containerd info output \" "
timeout 75 ${JOURNALCTL_CMD} --unit=containerd --since "${JOURNALCTL_SINCE}" > "${COLLECT_DIR}"/containerd/containerd-log.txt 2>&1 || echo -e "\tTimed out, ignoring \"containerd info output \" "
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add --since flag to aligned with log gathering command for docker.


# Unify the out output with precise ISO timestamp (e.g. 2025-02-12T06:25:29.149489+0000)
# It might be needed to have nanosecond timestamp for advanced troubleshooting.
readonly JOURNALCTL_CMD='/usr/bin/journalctl --output=short-iso-precise'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EKS node might have customized timezone set, precisely log the timestamp would be very helpful for troubleshooting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant