Skip to content

SR-IOV: Fix API logic and add I350 NIC support #5604

Open
uncleDecart wants to merge 5 commits intolf-edge:masterfrom
uncleDecart:sriov-shenanigans
Open

SR-IOV: Fix API logic and add I350 NIC support #5604
uncleDecart wants to merge 5 commits intolf-edge:masterfrom
uncleDecart:sriov-shenanigans

Conversation

@uncleDecart
Copy link
Member

@uncleDecart uncleDecart commented Feb 13, 2026

Description

This PR contains various fixes to SR-IOV package and how we create Virtual Functions:

  1. With commit 0d66949 we do not expect controller to send Virtual Function configuration (even empty one) to create IoBundles (that was causing SR-IOV to fail creating VFs)
  2. With commit eb8be0f we don't treat Virtual Functions as Network Interfaces: currently Virtual Functions are supposed to be PCI devices which are to be passed through to the device, having them on host implicate having VF drivers and it is a feature we yet do not support. Treating VF as Network Interface caused a race where passing through a device failed due to testing state (although VF should not be tested) moreover, we should not consider VF as a last resort interface
  3. Some of the NIC drivers (such as igb for I350) bring Physical Function interface into DOWN state, which makes using Virtual Functions impossible. Commit cf97dfe ensures that PF iface is in UP state after VFs are created. It's intentionally done inside sriov package, because we don't want nim to be robust and monitor change configurations to intended state, some customers might want to change it on the fly and we do not want to interfere with it
  4. Currently SR-IOV throws Fatal when we cannot create virtual functions, which results in a boot cycle, which is unpleasant, to say the least, commit 3e61cfd removes "fatality" of VF creation failure, because this is recoverable thing
  5. Lastly, commit a430608 does small, but important nitpicking. When working with sysfs one doesn't work with files per se. Unix philosophy is that everything is a file, however writing to sysfs "files" means doing syscalls, so when we shouldn't try to create files on sysfs if they're not present, that could dangerously backfire for us.

At last this PR improves some error messages in whole, to make debugging easier for the next person to look into it.

How to test and validate this PR

One need a SR-IOV enabled NIC card. Currently SR-IOV tested with Intel X710 and Intel I350 NICs.

Then we should create device model, with multiple Virtual Functions (VFs) (at least 2, but better 4, check with Docs on how many VFs you can create on your specific NIC). That is specified in Physical Function under Vfs.count (your ordinary network interface %IFACE% like eth8 linked to SR-IOV enabled NIC becomes Physical Function).

Then those Virtual functions should be passed through to EdgeApps (for simplicity, use Ubuntu), one function to one EdgeApp.

Then after booting on EVE ls -la /sys/class/net/%IFACE%/devices should show you N virtual functions you specified and their respective PCI addresses. And each of this virtual functions should have vfio_pci driver to check that run
readlink /sys/bus/pci/device/%PCI_ADDRESS%/driver. Screenshot below shows my setup.

image

After you confirm this, enter your Edge Apps and inside you should see interface which belongs to PCI device which is Virtual Function (use ethtool to check that out) . Then bring those interfaces up using sudo ip link set dev %VF_IFACE% up, set their IP address (I for example had PF physically connected to network which had DHCP, so I used dhclient to obtain IP address) then run iperf3 server on one EdgeApp using %VF_FACE%, i.e sudo iperf3 -s --bind-dev %VF_IFACE% and in another Edge App run iperf3 client, pointing to that interface IP address using %VF_IFACE% of the second EdgeApp. In my example my server got IP 10.208.13.173 for VF iface so on the client EdgeApp I ran the following sudo iperf3 -c 10.208.13.173 --bind-dev enp4s0 -t 30 You should see packets going through. You can also test North-South traffic running client (or server) inside the network to which your PF has access to, North-South traffic should has more bandwidth than East-West one (between two EdgeApps). For example that's what I've got on Dell XR11 with Intel I350 NIC . Results may vary.

image

Changelog notes

SR-IOV now tested on Intel I350, fixed bug with controller EVE API behaviour mismatch

PR Backports

  • 16.0-stable: To be backported
  • 14.5-stable: To be backported
  • 13.4-stable: To be backported

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR

And the last but not least:

  • I've checked the boxes above, or I've provided a good reason why I didn't
    check them.

Please, check the boxes above after submitting the PR in interactive mode.

instead log error nad move on, that helps to avoid
infinite boot cycle if something is not working.

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
files under /sys are not files, those are calls, so
we shouldn't try to create files if they are not explicitly
present, that can cause us trouble

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
currently we create Virtual Function VF only to pass it through to the VM
we are not initializing interface on the host side, thus we shoudn't try
to test VF or probe it for last resort.

Former obstructs EVE from passing through VFs due to race condition, i.e.
we start to test at some point and only some part of VFs is actually passed
since we're not considering VFs Network devices (isNet is false) we pass
them through and won't try them for last resort.

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
if controller does not send configuration for Virtual Functions
(even empty ones) we ignore it and create bundles based on information
from the system

it also includes expansion of debug messages

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
certain drivers (like igb on Intel I350 NIC) put Physical Function (PF)
interface in DOWN state, which doesn't work for us, since Virtual Functions (VF)
need PF to be in UP state. We handle this in SR-IOV package, specifically
after creating VFs.

This is done intentionally not in nim. We don't want to make nim that
robust, because some EVE users want to change configuration manually and
that might trip them.

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
@uncleDecart uncleDecart added bug Something isn't working stable Should be backported to stable release(s) labels Feb 13, 2026
@codecov
Copy link

codecov bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 28.08%. Comparing base (2281599) to head (cf97dfe).
⚠️ Report is 273 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5604      +/-   ##
==========================================
+ Coverage   19.52%   28.08%   +8.55%     
==========================================
  Files          19       19              
  Lines        3021     2314     -707     
==========================================
+ Hits          590      650      +60     
+ Misses       2310     1520     -790     
- Partials      121      144      +23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


if _, err := os.Stat(autoprobePath); err == nil {
if err := os.WriteFile(autoprobePath, []byte("0"), 0); err != nil {
// log Warn ("Warning: could not disable autoprobe on %s: %w\n", device, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented code, if this warning is too much here, you can just log it as a function debug message.....

return fmt.Errorf("netlink: failed to bring %s up: %w", device, err)
}

if err := pollLinkUp(device, 3*time.Minute, 100*time.Millisecond); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 3 min reasonable for any device? I think is better to parametrize this and maybe the other timeouts (as variables)...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rene I think polling for it is not a good idea at all but rework would take too much. Yes, 3 min is reasonable for any kernel driver to start, I think we can leave it as-is for now and revisit the whole strategy later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working stable Should be backported to stable release(s)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants