-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
netdev CI testing #6666
Open
kuba-moo
wants to merge
299
commits into
kernel-patches:bpf-next_base
Choose a base branch
from
linux-netdev:to-test
base: bpf-next_base
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
netdev CI testing #6666
+6,595
−5,724
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
4f22ee0
to
8a9a8e0
Compare
64c403f
to
8da1f58
Compare
78ebb17
to
9325308
Compare
c8c7b2f
to
a71aae6
Compare
9325308
to
7940ae1
Compare
d8feb00
to
b16a6b9
Compare
7940ae1
to
8f1ff3c
Compare
4164329
to
c5cecb3
Compare
skb is not used in fib_nl2rule() other than sock_net(skb->sk), which is already available in callers, fib_nl_newrule() and fib_nl_delrule(). Let's pass net directly to fib_nl2rule(). Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
We will move RTNL down to fib_nl_newrule() and fib_nl_delrule(). Some operations in fib_nl2rule() require RTNL: fib_default_rule_pref() and __dev_get_by_name(). Let's split the RTNL parts as fib_nl2rule_rtnl(). Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
The following patch will not set skb->sk from VRF path. Let's fetch net from fib_rule->fr_net instead of sock_net(skb->sk) in fib[46]_rule_configure(). Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
fib_nl_newrule() / fib_nl_delrule() is the doit() handler for RTM_NEWRULE / RTM_DELRULE but also called from vrf_newlink(). Currently, we hold RTNL on both paths but will not on the former. Also, we set dev_net(dev)->rtnl to skb->sk in vrf_fib_rule() because fib_nl_newrule() / fib_nl_delrule() fetch net as sock_net(skb->sk). Let's Factorise the two functions and pass net and rtnl_held flag. Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
fib_nl_newrule() is the doit() handler for RTM_NEWRULE but also called from vrf_newlink(). In the latter case, RTNL is already held and the 4th arg is true. Let's hold per-netns RTNL in fib_newrule() if rtnl_held is false. Note that we call fib_rule_get() before releasing per-netns RTNL to call notify_rule_change() without RTNL and prevent freeing the new rule. Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
We will hold RTNL just before calling fib_nl2rule_rtnl() in fib_delrule() and release it before kfree(nlrule). Let's add a new rule to make the following change cleaner. Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
fib_nl_delrule() is the doit() handler for RTM_DELRULE but also called from vrf_newlink() in case something fails in vrf_add_fib_rules(). In the latter case, RTNL is already held and the 4th arg is true. Let's hold per-netns RTNL in fib_delrule() if rtnl_held is false. Now we can place ASSERT_RTNL_NET() in call_fib_rule_notifiers(). Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Commit df542f6 ("net: stmmac: Switch to zero-copy in non-XDP RX path") makes DMA write received frame into buffer at offset of NET_SKB_PAD and sets page pool parameters to sync from offset of NET_SKB_PAD. But when Header Payload Split is enabled, the header is written at offset of NET_SKB_PAD, while the payload is written at offset of zero. Uncorrect offset parameter for the payload breaks dma coherence [1] since both CPU and DMA touch the page buffer from offset of zero which is not handled by the page pool sync parameter. And in case the DMA cannot split the received frame, for example, a large L2 frame, pp_params.max_len should grow to match the tail of entire frame. [1] https://lore.kernel.org/netdev/[email protected]/ Reported-by: Jon Hunter <[email protected]> Reported-by: Brad Griffis <[email protected]> Suggested-by: Ido Schimmel <[email protected]> Fixes: df542f6 ("net: stmmac: Switch to zero-copy in non-XDP RX path") Signed-off-by: Furong Xu <[email protected]> Tested-by: Jon Hunter <[email protected]> Signed-off-by: NipaLocal <nipa@local>
When validation on the backup slave is enabled, we need to validate the Neighbor Solicitation (NS) messages received on the backup slave. To receive these messages, the correct destination MAC address must be added to the slave. However, the target in bonding is a unicast address, which we cannot use directly. Instead, we should first convert it to a Solicited-Node Multicast Address and then derive the corresponding MAC address. Fix the incorrect MAC address setting on both slave_set_ns_maddr() and slave_set_ns_maddrs(). Since the two function names are similar. Add some description for the functions. Also only use one mac_addr variable in slave_set_ns_maddr() to save some code and logic. Fixes: 8eb3616 ("bonding: add ns target multicast address to slave device") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: NipaLocal <nipa@local>
The correct mac address for NS target 2001:db8::254 is 33:33:ff:00:02:54, not 33:33:00:00:02:54. The same with client maddress. Fixes: 86fb617 ("selftests: bonding: add ns multicast group testing") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Without the .owner field, the module can be unloaded while /dev/vmclock0 is open, leading to an oops. Fixes: 2050327 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: [email protected] Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: NipaLocal <nipa@local>
If vmclock_ptp_register() fails during probing, vmclock_remove() is called to clean up the ptp clock and misc device. It uses dev_get_drvdata() to access the vmclock state. However the driver data is not yet set at this point. Assign the driver data earlier. Fixes: 2050327 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: [email protected] Signed-off-by: Thomas Weißschuh <[email protected]> Reviewed-by: Mateusz Polchlopek <[email protected]> Acked-by: Richard Cochran <[email protected]> Reviewed-by: David Woodhouse <[email protected]> Signed-off-by: NipaLocal <nipa@local>
vmclock_remove() tries to detect the successful registration of the misc device based on the value of its minor value. However that check is incorrect if the misc device registration was not attempted in the first place. Always initialize the minor number, so the check works properly. Fixes: 2050327 ("ptp: Add support for the AMZNC10C 'vmclock' device") Cc: [email protected] Signed-off-by: Thomas Weißschuh <[email protected]> Acked-by: Richard Cochran <[email protected]> Reviewed-by: David Woodhouse <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Most resources owned by the vmclock device are managed through devres. Only the miscdev and ptp clock are managed manually. This makes the code slightly harder to understand than necessary. Switch them over to devres and remove the now unnecessary drvdata. Signed-off-by: Thomas Weißschuh <[email protected]> Acked-by: Richard Cochran <[email protected]> Reviewed-by: David Woodhouse <[email protected]> Signed-off-by: NipaLocal <nipa@local>
vmclock_probe() uses an "out:" label to return from the function on error. This indicates that some cleanup operation is necessary. However the label does not do anything as all resources are managed through devres, making the code slightly harder to read. Remove the label and just return directly. Signed-off-by: Thomas Weißschuh <[email protected]> Acked-by: Richard Cochran <[email protected]> Reviewed-by: David Woodhouse <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Reference counting is used to ensure that batadv_hardif_neigh_node and batadv_hard_iface are not freed before/during batadv_v_elp_throughput_metric_update work is finished. But there isn't a guarantee that the hard if will remain associated with a soft interface up until the work is finished. This fixes a crash triggered by reboot that looks like this: Call trace: batadv_v_mesh_free+0xd0/0x4dc [batman_adv] batadv_v_elp_throughput_metric_update+0x1c/0xa4 process_one_work+0x178/0x398 worker_thread+0x2e8/0x4d0 kthread+0xd8/0xdc ret_from_fork+0x10/0x20 (the batadv_v_mesh_free call is misleading, and does not actually happen) I was able to make the issue happen more reliably by changing hardif_neigh->bat_v.metric_work work to be delayed work. This allowed me to track down and confirm the fix. Cc: [email protected] Fixes: c833484 ("batman-adv: ELP - compute the metric based on the estimated throughput") Signed-off-by: Andy Strohman <[email protected]> [[email protected]: prevent entering batadv_v_elp_get_throughput without soft_iface] Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: NipaLocal <nipa@local>
If a temporary error happened in the evaluation of the neighbor throughput information, then the invalid throughput result should not be stored in the throughtput EWMA. Cc: [email protected] Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: NipaLocal <nipa@local>
The ELP worker needs to calculate new metric values for all neighbors "reachable" over an interface. Some of the used metric sources require locks which might need to sleep. This sleep is incompatible with the RCU list iterator used for the recorded neighbors. The initial approach to work around of this problem was to queue another work item per neighbor and then run this in a new context. Even when this solved the RCU vs might_sleep() conflict, it has a major problems: Nothing was stopping the work item in case it is not needed anymore - for example because one of the related interfaces was removed or the batman-adv module was unloaded - resulting in potential invalid memory accesses. Directly canceling the metric worker also has various problems: * cancel_work_sync for a to-be-deactivated interface is called with rtnl_lock held. But the code in the ELP metric worker also tries to use rtnl_lock() - which will never return in this case. This also means that cancel_work_sync would never return because it is waiting for the worker to finish. * iterating over the neighbor list for the to-be-deactivated interface is currently done using the RCU specific methods. Which means that it is possible to miss items when iterating over it without the associated spinlock - a behaviour which is acceptable for a periodic metric check but not for a cleanup routine (which must "stop" all still running workers) The better approch is to get rid of the per interface neighbor metric worker and handle everything in the interface worker. The original problems are solved by: * creating a list of neighbors which require new metric information inside the RCU protected context, gathering the metric according to the new list outside the RCU protected context * only use rcu_trylock inside metric gathering code to avoid a deadlock when the cancel_delayed_work_sync is called in the interface removal code (which is called with the rtnl_lock held) Cc: [email protected] Fixes: c833484 ("batman-adv: ELP - compute the metric based on the estimated throughput") Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Since commit 4436df4 ("batman-adv: Add flex array to struct batadv_tvlv_tt_data"), the introduction of batadv_tvlv_tt_data's flex array member in batadv_tt_tvlv_ogm_handler_v1() put tt_changes at invalid offset. Those TT changes are supposed to be filled from the end of batadv_tvlv_tt_data structure (including vlan_data flexible array), but only the flex array size is taken into account missing completely the size of the fixed part of the structure itself. Fix the tt_change offset computation by using struct_size() instead of flex_array_size() so both flex array member and its container structure sizes are taken into account. Cc: [email protected] Fixes: 4436df4 ("batman-adv: Add flex array to struct batadv_tvlv_tt_data") Signed-off-by: Remi Pommarel <[email protected]> Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Rework EEE handling to depend on PHY negotiation results. Enable EEE only after a valid link is established and allow proper LPI mode activation. Previously, sxgbe_eee_init() was called in sxgbe_open() to invoke phy_init_eee() most probably before the PHY could complete auto-negotiation. Non-automotive PHYs typically take ~1 sec to establish a link, so EEE was rarely enabled. Now, EEE activation is deferred until after PHY negotiation. The driver delegates EEE control to phylib via ethtool set/get calls. Timer values are taken from phydev->eee_cfg.tx_lpi_timer, and sxgbe_eee_adjust() configures EEE based on phydev->enable_tx_lpi and the PHY link status. This activation of LPI may reveal issues that were hidden when LPI was rarely active. WARNING: This patch is only compile tested. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Extended RTM_GETMULTICAST to support dumping joined IPv4 multicast addresses, in addition to the existing IPv6 functionality. This allows userspace applications to retrieve both IPv4 and IPv6 multicast addresses through similar netlink command and then monitor future changes by registering to RTNLGRP_IPV4_MCADDR and RTNLGRP_IPV6_MCADDR. Cc: Maciej Żenczykowski <[email protected]> Cc: Lorenzo Colitti <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Yuyang Huang <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This change introduces a new selftest case to verify the functionality of dumping IPv4 multicast addresses using the RTM_GETMULTICAST netlink message. The test utilizes the ynl library to interact with the netlink interface and validate that the kernel correctly reports the joined IPv4 multicast addresses. To run the test, execute the following command: $ vng -v --user root --cpus 16 -- \ make -C tools/testing/selftests TARGETS=net \ TEST_PROGS=rtnetlink.py TEST_GEN_PROGS="" run_tests Cc: Maciej Żenczykowski <[email protected]> Cc: Lorenzo Colitti <[email protected]> Signed-off-by: Yuyang Huang <[email protected]> Signed-off-by: NipaLocal <nipa@local>
The core is reset both in `fec_restart()` (called on link-up) and `fec_stop()` (going to sleep, driver remove etc.). These two functions had their separate implementations, which was at first only a register write and a `udelay()` (and the accompanying block comment). However, since then we got soft-reset (MAC disable) and Wake-on-LAN support, which meant that these implementations diverged, often causing bugs. For instance, as of now, `fec_stop()` does not check for `FEC_QUIRK_NO_HARD_RESET`, meaning the MII/RMII mode is cleared on eg. a PM power-down event; and `fec_restart()` missed the refactor renaming the "magic" constant `1` to `FEC_ECR_RESET`. To harmonize current implementations, and eliminate this source of potential future bugs, refactor implementation to a common function. Reviewed-by: Michal Swiatkowski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Csókás, Bence <[email protected]> Signed-off-by: NipaLocal <nipa@local>
tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Disable tests we don't care about, we use alltests in kunit. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Reusable PR for hooking netdev CI to BPF testing.