Increase DRBD Net/ping-timeout #45

benjamreis · 2023-10-20T09:45:08Z

This would avoid wrongly assuming that a node is dead.

The driver is needed to transition to the ext driver. Users who upgrade from XCP-ng <= 8.0 need a working driver so that they can move the VMs out of the ext4 SR and delete the SR. Not keeping that driver would force such users to upgrade to 8.1 first, convert their SR, then upgrade to a higher version. However, like in XCP-ng 8.1, the driver will refuse any new ext4 SR creation.

Some important points:
- linstor.KV must use an identifier name that starts with a letter (so it uses a "sr-" prefix).
- Encrypted VDIs are supported with the key_hash attribute (not tested, experimental).
- When a new LINSTOR volume is created on a host (via snapshot or create), the remaining diskless devices are not necessarily created on other hosts. So if a resource definition exists without a local device path, we request it from LINSTOR. We wait 5s for symlink creation when a new volume is created. The 5s value is purely arbitrary, but it guarantees that we do not try to access the volume if the symlink has not yet been created by the udev rule.
- The provisioning can be changed using the device config 'provisioning' param.
- We can only increase volume size (see LINBIT/linstor-server#66); it would be great if we could shrink volumes to limit the space used by the snapshots.
- Inflate/Deflate can only be executed on the master host; a linstor-manager plugin is present to do this from slaves. The same plugin is used to open LINSTOR ports and start the controller.
- Use a `total_allocated_volume_size` method to have a good idea of the reserved memory. Why? Because `physical_free_size` is computed using the LVM used size. With thick provisioning this is fine, but when thin provisioning is chosen LVM returns only the allocated size using the used block count. This method solves the problem: it takes the fixed virtual volume size of each node to compute the required size to store the volume data.
- Call vhd-util on remote hosts using the linstor-manager when necessary, i.e. when vhd-util is called to get VHD info, the DRBD device can be in use (and unusable by external processes), so we must use the local LVM device that contains the DRBD data, or a remote disk if the DRBD device is diskless.
- If a DRBD device is in use when vhdutil.getVHDInfo is called, we must have no errors. So a LinstorVhdUtil wrapper is now used to bypass the DRBD layer when VDIs are loaded.
- Refresh PhyLink when unpause is called on DRBD devices: we must always recreate the symlink to ensure we have the right info. Why? Because if the volume UUID is changed in LINSTOR, the symlink is not directly updated. When live leaf coalesce is executed we have these steps:
  "A" -> "OLD_A"
  "B" -> "A"
  Without the symlink update, the previous "A" path is reused instead of the "B" path. Note: "A", "B" and "OLD_A" are UUIDs.
- Since linstor python modules are not present on every XCP-ng host, module imports are protected by try... except... blocks.
- Provide a linstor-monitor daemon to check master changes.
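Two of the points above (the guarded imports and the wait for the udev-created symlink) can be illustrated with a minimal Python 3 sketch. This is not the driver's code: `LINSTOR_AVAILABLE` and `wait_for_path` are hypothetical names, and only the 5-second default is taken from the commit message.

```python
import os
import time

# Guarded import: the linstor Python modules may be absent on some hosts,
# so a missing module must not break loading of this file.
try:
    import linstor  # noqa: F401
    LINSTOR_AVAILABLE = True
except ImportError:
    LINSTOR_AVAILABLE = False


def wait_for_path(path, timeout=5.0, interval=0.1):
    """Poll until `path` exists (symlinks included) or `timeout` seconds elapse.

    Returns True if the path appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # lexists() is True for a symlink even when its target is missing.
        if os.path.lexists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Polling returns as soon as the symlink shows up, so the 5 seconds act as an upper bound rather than a fixed delay.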

`umount` should not be called when `legacy_mode` is enabled, otherwise a mounted dir used during SR creation is unmounted at the end of the `create` call (and also when a PBD is unplugged) in `detach` block. Signed-off-by: Ronan Abhamon <[email protected]>

A sm-config boolean param `subdir` is available to configure where to store the VHDs:
- In a subdir with the SR UUID, the new behavior
- In the root directory of the MooseFS SR

By default, new SRs are created with `subdir` = True. Existing SRs are not modified and continue to use the folder that was given at SR creation, directly, without looking for a subdirectory.

Signed-off-by: Ronan Abhamon <[email protected]>
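A minimal sketch of how the `subdir` flag could select the VHD directory. The helper names `parse_subdir` and `vhd_directory` are hypothetical, as is the assumption that a missing `subdir` key identifies an SR created before the parameter existed and that the value is stored as a string.

```python
import os


def parse_subdir(sm_config):
    """Read the `subdir` flag from an sm-config dict.

    Assumption: SRs created before the parameter existed have no `subdir`
    key, so a missing key selects the legacy layout (False).
    """
    value = sm_config.get("subdir")
    if value is None:
        return False
    return str(value).strip().lower() in ("true", "1", "yes")


def vhd_directory(mount_root, sr_uuid, subdir):
    """Return the directory that holds the VHD files.

    subdir=True  -> <mount_root>/<sr_uuid>  (new behavior)
    subdir=False -> <mount_root>            (legacy layout)
    """
    if subdir:
        return os.path.join(mount_root, sr_uuid)
    return mount_root
```

For example, `vhd_directory("/mnt/moosefs", sr_uuid, parse_subdir({}))` keeps the legacy layout for an SR whose sm-config has no `subdir` entry.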

Ensure all shared drivers are imported in `_is_open` definition to register them in the driver list. Otherwise this function always fails with a SRUnknownType exception. Also, we must add two fake mandatory parameters to make MooseFS happy: `masterhost` and `rootpath`. Same for CephFS with: `serverpath`. (NFS driver is directly patched to ensure there is no usage of the `serverpath` param because its value is equal to None.) `location` param is required to use ZFS, to be more precise, in the parent class: `FileSR`. Signed-off-by: Ronan Abhamon <[email protected]>

SR_CACHING offers the capacity to use IntelliCache, but this feature is only available using NFS SR. For more details, the implementation of `_setup_cache` in blktap2.py uses only an instance of NFSFileVDI for the shared target. Signed-off-by: Ronan Abhamon <[email protected]>

This file is meant to remain unchanged and regularly updated along with the SM component. Users can create a custom configuration file in /etc/multipath/conf.d/ instead. Signed-off-by: Samuel Verschelde <[email protected]> (cherry picked from commit b44d3f5)

Meant to be installed as /etc/multipath/conf.d/custom.conf for users to have an easy entry point for editing, as well as information on what will happen to this file through future system updates and upgrades. Signed-off-by: Samuel Verschelde <[email protected]> (cherry picked from commit 18b79a5)

Details:
- vdi_attach and vdi_detach are now exclusive
- lock volumes on slaves (when a vdi_xxx command is used) and avoid release if a timeout is reached
- load all VDIs only when necessary, i.e. only if at least one journal entry exists or if sr_scan/sr_attach is executed
- use a __slots__ attr in LinstorVolumeManager to increase performance
- use a cache directly in LinstorVolumeManager to reduce the network request count with LINSTOR
- try to always use the same LINSTOR KV object to limit network usage
- use a cache to avoid a new JSON parsing when all VDIs are loaded in LinstorSR
- limit the request count when LINSTOR storage pool info is fetched, using a fetch interval
- avoid a race condition in cleanup: check whether a volume is locked on a slave before modifying it
- ...

Signed-off-by: Ronan Abhamon <[email protected]>
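The "fetch interval" idea from the list above can be sketched as a small time-based cache. This is an illustration only: `IntervalCache` and its parameters are hypothetical names, not the LinstorVolumeManager implementation.

```python
import time


class IntervalCache:
    """Cache the result of `fetch` and refresh it at most once per `interval` seconds."""

    # __slots__ avoids a per-instance __dict__, the same trick the list mentions.
    __slots__ = ("_fetch", "_interval", "_clock", "_value", "_last")

    def __init__(self, fetch, interval, clock=time.monotonic):
        self._fetch = fetch
        self._interval = interval
        self._clock = clock      # injectable so the behavior can be tested
        self._value = None
        self._last = None        # time of the last successful fetch

    def get(self):
        now = self._clock()
        # Fetch on first use, or when the cached value is older than the interval.
        if self._last is None or now - self._last >= self._interval:
            self._value = self._fetch()
            self._last = now
        return self._value
```

Callers that ask for storage pool info several times within the interval then trigger a single request.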

Now, we can:
- Start a controller on any node
- Share the LINSTOR volume list using a specific volume "xcp-persistent-database"
- Use the HA with "xcp-persistent-ha-statefile" and "xcp-persistent-redo-log" volumes
- Create the nodes automatically during SR creation

Signed-off-by: Ronan Abhamon <[email protected]>

…mes when master satellite is down

Steps to reproduce:
- Ensure the LINSTOR satellite is not running on the master host; otherwise stop it
- Then restart the controller on the right host where the LINSTOR database is mounted
- Run the sr_attach command => all volumes will be forgotten

To avoid this, it's possible to restart the satellite on the master before the sr_attach command. Also it's funny to see that you can start and stop the satellite just before the sr_attach, and the volumes will not be removed.

Explanations: In theory this bug is impossible because during the sr_attach execution an exception is thrown (so sr_scan should not be executed), BUT there is a piece of code that is executed in SRCommand.py when sr_attach is called:

```python
try:
    return sr.attach(sr_uuid)
finally:
    if is_master:
        sr.after_master_attach(sr_uuid)
```

The exception is not immediately forwarded because the finally block must be executed first. And what is the implementation of after_master_attach?

```python
def after_master_attach(self, uuid):
    """Perform actions required after attaching on the pool master
    Return:
      None
    """
    self.scan(uuid)
```

Oh! Of course, a scan is always executed after an attach... What's the purpose of a scan if we can't correctly execute an attach command first? I don't know, but it's probably error-prone, as in this context. When scan is called, we suppose the SR is attached and all VDIs are loaded, but that's not the case because an exception has been thrown. To solve this problem we forbid the execution of the scan if the attach failed.

Signed-off-by: Ronan Abhamon <[email protected]>
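The failure mode can be reproduced in isolation. The sketch below uses a stand-in `FakeSR` class (hypothetical, not the real SR object) to show that the `finally` block runs the scan even when `attach` raises, and one possible way to skip the scan in that case. The actual fix in the commit may be implemented differently.

```python
class FakeSR:
    """Stand-in for an SR object; records whether scan() was called."""

    def __init__(self, attach_ok):
        self.attach_ok = attach_ok
        self.scanned = False

    def attach(self, sr_uuid):
        if not self.attach_ok:
            raise RuntimeError("attach failed")
        return "attached"

    def after_master_attach(self, sr_uuid):
        self.scan(sr_uuid)

    def scan(self, sr_uuid):
        self.scanned = True


def attach_scan_always(sr, sr_uuid, is_master=True):
    # Pattern quoted above: the finally block runs even if attach() raises,
    # so the scan is executed on an SR that is not attached.
    try:
        return sr.attach(sr_uuid)
    finally:
        if is_master:
            sr.after_master_attach(sr_uuid)


def attach_scan_on_success(sr, sr_uuid, is_master=True):
    # Possible alternative: the scan is reached only if attach() returned.
    result = sr.attach(sr_uuid)
    if is_master:
        sr.after_master_attach(sr_uuid)
    return result
```

With `FakeSR(attach_ok=False)`, the first function still sets `scanned` before the exception propagates, while the second leaves it untouched.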

MarkSymsCtx and others added 30 commits October 13, 2023 14:43

backport of ccd121c: CA-354692: check for device parameter in create/…

b7d3ea7

…probe calls Signed-off-by: Mark Syms <[email protected]> Signed-off-by: Ronan Abhamon <[email protected]>

Update xs-sm.service's description for XCP-ng

3a0c67d

This was a patch added to the sm RPM git repo before we had this forked git repo for sm in the xcp-ng github organisation.

Add TrueNAS multipath config

a8168e1

This was a patch added to the sm RPM git repo before we had this forked git repo for sm in the xcp-ng github organisation.

feat(drivers): add CephFS, GlusterFS and XFS drivers

6120e7f

feat(drivers): add ZFS driver to avoid losing VDI metadata (xcp-ng/xc…

dbfbe5f

…p#401)

feat(tests): add unit tests concerning ZFS (close xcp-ng/xcp#425)

d64ac06

- Check if "create" doesn't succeed without zfs packages
- Check if "scan" fails if the path is not mounted (not a ZFS mountpoint)

If no NFS ACLs provided, assume everyone:

68e67e6

Some QNAP devices do not provide an ACL when fetching NFS mounts. In this case the assumed ACL should be: "*". This commit fixes the crash when attempting to access the nonexistent ACL.

Relevant issues:
- xapi-project#511
- xcp-ng/xcp#113
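A minimal sketch of the defaulting behavior, assuming each export is reported as a whitespace-separated line of the form `<path> [<acl>]`. The function name `parse_export_line` is hypothetical and is not the driver's actual parser.

```python
def parse_export_line(line):
    """Split one export line into (path, acl).

    If the server reports no ACL field, fall back to "*" (everyone)
    instead of indexing a field that does not exist.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty export line")
    path = fields[0]
    acl = fields[1] if len(fields) > 1 else "*"
    return path, acl
```

A line such as `/share/nfs` then yields `("/share/nfs", "*")` rather than raising an IndexError.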

Added SM Driver for MooseFS

769a4b9

Co-authored-by: Piotr Robert Konopelko <[email protected]> Signed-off-by: Aleksander Wieliczko <[email protected]> Signed-off-by: Ronan Abhamon <[email protected]>

Remove SR_PROBE from ZFS capabilities (#37)

4de671d

The probe method is not implemented so we shouldn't advertise it. Signed-off-by: BenjiReis <[email protected]>

Fix vdi-ref when static vdis are used

0aec61e

When static VDIs are used there are no snapshots and we don't want to call methods from XAPI. Signed-off-by: Guillaume <[email protected]>

Install /etc/multipath/conf.d/custom.conf

398471d

Update Makefile so that the file is installed along with sm. Signed-off-by: Samuel Verschelde <[email protected]>

Fix timeout_call: alarm must be reset in case of success

71e0c52

Otherwise the SIGALRM signal can be emitted after the execution of the given user function. Signed-off-by: Ronan Abhamon <[email protected]>

timeout_call returns the result of user function now

f2d089b

Signed-off-by: Ronan Abhamon <[email protected]>
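The two timeout_call changes above (reset the alarm on success, return the user function's result) can be sketched as follows. This is a simplified illustration that assumes a Unix host and a call from the main thread. The signature and the `TimeoutException` class are written for this example and may not match the sm code.

```python
import signal


class TimeoutException(Exception):
    """Raised when the wrapped call does not finish in time."""


def timeout_call(timeout_seconds, function, *args, **kwargs):
    """Run `function` and raise TimeoutException after `timeout_seconds` (an int)."""

    def handler(signum, frame):
        raise TimeoutException()

    previous = signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout_seconds)
    try:
        # Return the result of the user function to the caller.
        return function(*args, **kwargs)
    finally:
        # Cancel the pending alarm in every case; without this, SIGALRM
        # could fire after the user function has already returned.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, previous)
```

Putting `signal.alarm(0)` in the `finally` block covers both the success path and the case where the user function raises its own exception.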

fix(LinstorSR): repair volumes only if an exclusive command is executed

5492351

Signed-off-by: Ronan Abhamon <[email protected]>

feat(LinstorSR): robustify scan to avoid losing VDIs if function is c…

df21142

…alled outside module Signed-off-by: Ronan Abhamon <[email protected]>

feat(LinstorSR): display a correctly readable size for the user

7f6b21f

Signed-off-by: Ronan Abhamon <[email protected]>

feat(linstor-monitord): scan all LINSTOR SRs every 12 minutes to upda…

c6ecf4e

…te allocated size stats Signed-off-by: Ronan Abhamon <[email protected]>

fix(LinstorSR): call correctly method in _locked_load when vdi_attach…

2b91007

…_from_config is executed Signed-off-by: Ronan Abhamon <[email protected]>

feat(LinstorSR): ensure heartbeat and redo_log VDIs are not diskless

0d512e7

Signed-off-by: Ronan Abhamon <[email protected]>

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch from 44f2ee3 to bf38210 Compare December 20, 2023 14:01

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 2 times, most recently from 4222231 to 3499398 Compare January 23, 2024 13:17

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 3 times, most recently from 150d510 to f87c3eb Compare February 12, 2024 19:56

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 6 times, most recently from 7238799 to f36a7a2 Compare April 29, 2024 15:22

Nambrok force-pushed the 2.30.8-8.2-linstor-fixes-staging branch from a217ee4 to 89f927e Compare May 7, 2024 13:22

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch from 89f927e to 2b01dd1 Compare May 31, 2024 13:41

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 3 times, most recently from 76209bf to 8249dcc Compare June 13, 2024 11:25

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 2 times, most recently from f9e6a8e to e7ffbab Compare June 28, 2024 13:09

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 7 times, most recently from 51d4f89 to 3f63f6a Compare July 26, 2024 12:49

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch 2 times, most recently from 028c295 to 31d150b Compare August 6, 2024 15:17

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch from 04c2c93 to 0722952 Compare September 24, 2024 08:29

Wescoeur force-pushed the 2.30.8-8.2-linstor-fixes-staging branch from 0722952 to 119dc63 Compare October 3, 2024 15:34
