|
2 | 2 |
|
3 | 3 | This page covers issues that can arise when PXE booting nodes in an HPE Cray EX system.
|
4 | 4 |
|
5 |
| -- [Configuration required for PXE booting](#configuration-required-for-pxe-booting) |
6 |
| -- [Switch configuration](#switch-configuration) |
7 |
| - - [Aruba configuration](#aruba-configuration) |
8 |
| - - [Mellanox configuration](#mellanox-configuration) |
9 |
| -- [Next steps](#next-steps) |
10 |
| - - [Node iPXE retries and NIC order](#node-ipxe-retries-and-nic-order) |
11 |
| - - [Restart BSS](#restart-bss) |
12 |
| - - [Restart Kea](#restart-kea) |
13 |
| - - [Missing BSS data](#missing-bss-data) |
| 5 | +- [PXE Boot Troubleshooting](#pxe-boot-troubleshooting) |
| 6 | + - [Configuration required for PXE booting](#configuration-required-for-pxe-booting) |
| 7 | + - [Switch configuration](#switch-configuration) |
| 8 | + - [Aruba configuration](#aruba-configuration) |
| 9 | + - [Mellanox configuration](#mellanox-configuration) |
| 10 | + - [Next steps](#next-steps) |
| 11 | + - [Kernel panic when unpacking initrd](#kernel-panic-when-unpacking-initrd) |
| 12 | + - [Node iPXE retries and NIC order](#node-ipxe-retries-and-nic-order) |
| 13 | + - [Restart BSS](#restart-bss) |
| 14 | + - [Restart Kea](#restart-kea) |
| 15 | + - [Missing BSS data](#missing-bss-data) |
14 | 16 |
|
15 | 17 | For PXE booting to work, the management network switches must be configured correctly.
|
16 | 18 |
|
@@ -232,7 +234,38 @@ To successfully PXE boot nodes, the following is required:
|
232 | 234 |
|
233 | 235 | ## Next steps
|
234 | 236 |
|
235 |
| -If the configuration looks good and PXE boot is still not working, then there are some other things to try. |
| 237 | +If the configuration looks good and PXE boot is still not working, there are some other things to try. |
| 238 | + |
| 239 | +### Kernel panic when unpacking initrd |
| 240 | + |
| 241 | +In rare cases, a PXE booting node may hit a kernel panic while unpacking the `initrd`. The error message will be similar to the following, but may vary with the kernel and hardware in use. |
| 242 | + |
| 243 | +```text |
| 244 | +... |
| 245 | +http://rgw-vip.nmn/boot-images/9e2032aa-2e1d-4cd5-acb5-cb1ad7bac001/kernel... ok |
| 246 | +http://rgw-vip.nmn/boot-images/9e2032aa-2e1d-4cd5-acb5-cb1ad7bac001/initrd... ok |
| 247 | +DxeTpm2MeasureBootHandler: PeCoffLoaderGetImageInfo failed! Status = Unsupported |
| 248 | +Image path: |
| 249 | +"". |
| 250 | +[ 0.554122][ T222] Initramfs unpacking failed: invalid magic at start of compressed archive |
| 251 | +[ 1.202522][ T1] i8042: Can't read CTR while initializing i8042 |
| 252 | +[ 1.588403][ T1] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) |
| 253 | +[ 1.597815][ T1] CPU: 19 PID: 1 Comm: swapper/0 Not tainted 6.4.0-150600.23.17-default #1 SLE15-SP6 |
| 254 | +[ 1.611314][ T1] Hardware name: HPE ProLiant DL325 Gen10 Plus/ProLiant DL325 Gen10 Plus, BIOS A43 02/06/2023 |
| 255 | +... |
| 256 | +``` |
| 257 | + |
| 258 | +This is typically a transient issue and can be resolved by power resetting the node through its BMC with `ipmitool`. |
| 259 | + |
| 260 | +> |
| 261 | +> `read -s` is used to prevent the password from being written to the screen or the shell history. The `-E` option makes `ipmitool` read the password from the `IPMI_PASSWORD` environment variable. |
| 262 | +> |
| 263 | +> ```bash |
| 264 | +> USERNAME=root |
| 265 | +> read -r -s -p "NCN BMC ${USERNAME} password: " IPMI_PASSWORD |
| 266 | +> export IPMI_PASSWORD |
| 267 | +> ipmitool -I lanplus -U "${USERNAME}" -E -H <bmc-hostname> power reset |
| 268 | +> ``` |
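
Before resetting the node, it may help to confirm that a captured console log really shows this failure. The following is a minimal sketch rather than part of the documented procedure: the function name `has_initrd_unpack_panic` and the log file location are hypothetical, and the two matched strings are taken from the example output above, which may differ on other kernels.

```bash
#!/usr/bin/env bash

# Hypothetical helper: returns 0 only when the given console log contains
# both the initramfs unpacking failure and a kernel panic message.
# The patterns are copied from the example output and may need adjusting.
has_initrd_unpack_panic() {
    local log_file="$1"
    grep -q 'Initramfs unpacking failed' "${log_file}" &&
        grep -q 'Kernel panic - not syncing' "${log_file}"
}
```

For example, `has_initrd_unpack_panic /path/to/console.log && echo "initrd unpack panic found"` prints the message only when both strings are present. The path is a placeholder for wherever the console output was saved.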
236 | 269 |
|
237 | 270 | ### Node iPXE retries and NIC order
|
238 | 271 |
|
|