The LXA TAC sometimes does not reboot correctly due to kernel hang #127

@hnez

We have seen this issue when trying to reboot after a rauc install, but it is not clear whether the rauc install plays a role or is just a coincidence.

Here is the log output from an attempted reboot, which was kindly recorded by @Bastian-Krause:

root@lxatac-00001:~# dmesg -n 7
root@lxatac-00001:~#
root@lxatac-00001:~#
root@lxatac-00001:~# [593755.976008] EXT4-fs (mmcblk1p2): mounted filesystem 7bd8e28e-fa40-41ea-b1ee-c2e5193ff824 r/w with ordered data mode. Quota mode: disabled.
[593757.532580] EXT4-fs (mmcblk1p2): unmounting filesystem 7bd8e28e-fa40-41ea-b1ee-c2e5193ff824.
[593764.707277] block nbd0: NBD_DISCONNECT
[593764.710349] block nbd0: Disconnected due to user request.
[593764.716062] block nbd0: shutting down sockets
[593764.722040] block nbd0: NBD_DISCONNECT
[593764.725182] block nbd0: Send disconnect failed -32
[593778.142757] watchdog: watchdog0: watchdog did not stop!
[593779.907858] EXT4-fs (mmcblk1p4): re-mounted 8ac00a5a-3255-4613-9119-dc58209566e8 ro. Quota �
[593779.957491] EXT4-fs (mmcblk1p4): re-mounted 8ac00a5a-3255-4613-9119-dc58209566e8 ro. Quota �
[593779.975127] EXT4-fs (mmcblk1p4): re-mounted 8ac00a5a-3255-4613-9119-dc58209566e8 ro. Quota �
[593779.993143] EXT4-fs (mmcblk1p4): re-mounted 8ac00a5a-3255-4613-9119-dc58209566e8 ro. Quota �
[593780.051262] EXT4-fs (mmcblk1p4): unmounting filesystem 8ac00a5a-3255-4613-9119
[593780.135084] EXT4-fs (mmcblk1p3): re-mounted 7bfa8c6d-c4f2-4091-a699-93d542d10ac2 ro. Quota �
[593780.398389] watchdog: watchdog0: nowayout prevents watchdog�
[593780.404425] systemd-shutdown[1]: Failed to disable hardware watchdog, ignoring: Device or�
[593780.413160] watchdog: watchdog0: nowayout prevents watchdog�
[593780.419264] watchdog: watchdog0: watchdo�
[593780.466408] ksz-switch spi0.0 uplink
[593784.562626] ksz-switch spi0.0 uplink: Link is Up - 1Gbps/Full - flo�
[594014.802931] INFO: task kworker/0:1:30966 blocked for more tha
[594014.810866]       Not tainted 6.7.�
[594014.814523] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
[594014.821492] task:kworker/0:1     state:D stack:0     pid:30966 tgid:30966 ppid:2      fl�
[594014.829993] Workqueue: ipv6_addrconf addrc�
[594014.834399]  __schedule from sch�
[594014.838060]  schedule from schedule_preempt_dis�
[594014.842977]  schedule_preempt_disabled from __mutex_lock.constpro�
[594014.849495]  __mutex_lock.constprop.0 from addrconf_verif�
[594014.855328]  addrconf_verify_work from process_one_wo
[594014.860626]  process_one_work from worker_thre�
[594014.865401]  worker_thread from kthre
594014.869290]  kthread from ret_from�
[594014.873065] Exception stack(0xe18f5fb0�
[594014.877034] 5fa0:                                     00000000 00000000 00�
[594014.884311] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00�
[594014.891611] 5fe0: 00000000 00000000 00000000 00000000 00�
[594014.897337] INFO: task kworker/1:1:30972 blocked for more tha
[594014.903480]       Not tainted 6.7.�[594014.907149] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables�
[594014.914114] task:kworker/1:1     state:D stack:0     pid:30972 tgid:30972 ppid:2      fl�
[594014.922609] Workqueue: events �[594014.925895]  __schedule from sch�
[594014.929373]  schedule from schedule_preempt_dis�
[594014.934248]  schedule_preempt_disabled from __mutex_lock.constpro�
[594014.940812]  __mutex_lock.constprop.0 from linkwatch_
[594014.946200]  linkwatch_event from process_one_w�
[594014.951142]  process_one_work from worker_thre�
[594014.955832]  worker_thread from kthre
[594014.959809]  kthread from ret_from�
[594014.963760] Exception stack(0xe1861fb0�
[594014.967946] 1fa0:                                     00000000 00000000 00�
[594014.975336] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00�
[594014.982712] 1fe0: 00000000 00000000 00000000 00000000 00��
NOTICE:  CPU: STM32MP157C?? Rev.Z
NOTICE:  Model: Linux Automation Test Automation Controller (TAC)
WARNING: VDD unknown
INFO:    Reset reason (0x214):
INFO:      IWDG2 Reset (rst_iwdg2)
INFO:    FCONF: Reading TB_FW firmware configuration file from: 0x2ffe2000
INFO:    FCONF: Reading firmware configuration information for: stm32mp_io
INFO:    Using EMMC
INFO:      Instance 2

It would also be interesting to know why log lines printed after the reboot process has started are a bit mangled and truncated.
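
To compare hung-task reports across several captures, a small script can pull the blocked task and its call chain out of a console log. The following is only a sketch: the function name `summarize_hung_tasks` and both regular expressions are assumptions made for illustration and are tuned to the truncated lines seen in the log above (for example, the match stops at "more tha" because the rest of that line is cut off).

```python
import re

# Start of a hung-task report, e.g.
# "[594014.802931] INFO: task kworker/0:1:30966 blocked for more tha"
# (the line is truncated in the capture, so only the common prefix is matched)
HUNG_RE = re.compile(r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more tha")

# One backtrace frame, e.g. "[594014.855328]  addrconf_verify_work from process_one_wo"
# Only the callee (left of " from ") is captured, because the caller is often truncated.
FRAME_RE = re.compile(r"\]\s+(?P<callee>[\w.]+) from ")


def summarize_hung_tasks(log_text):
    """Return a list of (task_name, pid, [callee, ...]) tuples, one per report."""
    reports = []
    current = None
    for line in log_text.splitlines():
        start = HUNG_RE.search(line)
        if start:
            current = (start.group("name"), int(start.group("pid")), [])
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current[2].append(frame.group("callee"))
    return reports


# A few lines taken from the log above, used as sample input.
SAMPLE = """\
[594014.802931] INFO: task kworker/0:1:30966 blocked for more tha
[594014.834399]  __schedule from sch
[594014.849495]  __mutex_lock.constprop.0 from addrconf_verif
[594014.855328]  addrconf_verify_work from process_one_wo
"""

for name, pid, frames in summarize_hung_tasks(SAMPLE):
    print(name, pid, " <- ".join(frames))
```

Run against the full log, this should list both workers (`kworker/0:1` and `kworker/1:1`), each waiting in `__mutex_lock.constprop.0`: one reached from `addrconf_verify_work`, the other from `linkwatch_event`.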
