AF_XDP zero-copy makes driver reset #221
Comments
Hi @akhota, thank you for raising this issue and sharing the logs. We'll look into it and provide feedback.
We have updated the ena driver to 2.7.2 on our instances and retried AF_XDP zero-copy mode, but the device reset still occurred. According to the sar command, the number of packets received by the iperf instance is smaller than the number of packets sent by the AF_XDP server instance. Packets may be getting lost somewhere, and the device reset seems to occur after the packet loss.
Thanks, |
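The comparison described in the comment above (packets sent by the AF_XDP server instance versus packets received by the iperf instance) can be sketched as a small calculation. This is only an illustration: the function name `packet_loss` is hypothetical, and the counts are assumed to be totals collected over the same interval on both instances, for example derived from the packet-rate columns that `sar -n DEV` reports.

```python
def packet_loss(sent: int, received: int) -> tuple[int, float]:
    """Return (lost packets, loss ratio) for one measurement interval.

    `sent` is the count observed on the sending instance and `received`
    the count observed on the receiving instance over the same interval.
    """
    if sent < 0 or received < 0:
        raise ValueError("packet counts must be non-negative")
    # A receiver can report slightly more than the sender if the sampling
    # windows are not aligned; clamp that case to zero loss.
    lost = max(sent - received, 0)
    ratio = lost / sent if sent else 0.0
    return lost, ratio
```

A non-zero loss ratio appearing shortly before the reset message in dmesg would be consistent with the observation in the comment, but it does not by itself identify where the packets are dropped.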
@akhota thanks a lot for testing our AF XDP support implementation. After doing some additional tests, we see that the issue does reproduce on our machines as well, and we're actively working on root-causing and solving it. We will update this ticket soon with a possible solution. Sorry for the inconvenience.
Due to several bugs discovered in the feature, it is marked experimental. The feature can still be used if the driver is compiled with the TEST_AF_XDP flag set, e.g.
TEST_AF_XDP=1 make
Please follow issue amzn#221 for an experimental fix for AF XDP issues.
Signed-off-by: Shay Agroskin <[email protected]>
Hi @ShayAgros, |
Hi, once the testing phase ends I'll post the fix on this thread. We also hope that the next driver release will include a new version of AF XDP support that fixes some of the wrong design assumptions made in this version. I'm sorry for the inconvenience caused by this buggy experience.
Hi @ShayAgros, OK, we are looking forward to the next version. |
Hi, if you'd still like to test the AF XDP implementation, you can use the patch on top of the latest current version (2.7.3) (e.g. using
By default the driver compiles without native (zero-copy) AF XDP support. To enable it, please specify
Please note that AF XDP is currently in the testing phase. We tested it thoroughly with this patch, but if you still discover issues or have a question, feel free to comment on this thread or email me (address listed above).
Hi @ShayAgros, Thank you for the patch. We will apply your patch to our instances until the next version is released, and retry the AF XDP zero-copy performance test.
OK, I understand. Thanks, |
Hi, I have updated the ena driver and retried the performance test in AF XDP native zero-copy mode. The summary of the results is as follows:
(We used TRex for measurement and packet generation.) We suppose the patch |
Hi @akhota Thank you for performing the checks on the provided patch and summarizing the results. |
Hi, we are seeing the same issue (device reset), but this thread hasn't been updated for a while. Is there any new AF_XDP patch that provides better performance than copy mode?
Hi @Li-Xiaoyun, (Also answering @akhota) If needed, the patch discussed in this comment was adjusted for the 2.8.0 release and is available here. Thanks
Thanks for the info. @davidarinzon |
Hi @davidarinzon, thanks for the patch. |
@Li-Xiaoyun @akhota just wondering if either of you also measured latency (as opposed to throughput) of the
@davidarinzon have you had any success investigating the mentioned performance issue?
I didn't measure latency. |
Any update on the AF_XDP zero-copy changes? It looks like they have not been merged yet.
Hi @pstavirs, You are correct, they have not been merged. Thanks,
Hi, any updates? |
Hi @oicnysa Thanks for reaching out. |
@akiyano maybe you have some specific branch which we could test? |
@oicnysa I would be interested in that as well! It would be great to test it out with openonload.
@oicnysa and @moscovium115, |
For those who want to experiment with AF_XDP, the original patch posted in this comment was developed on top of the 2.8.0 release.
@davidarinzon Amazing, thank you
Hi, Official AF_XDP support was released with 2.13.0g. @akhota and others, please let us know if you face any issues. |
Resolving this ticket. Please re-open it if you face any new issues with AF_XDP.
We are trying to compare the performance of an AF_XDP socket with a regular (AF_INET) socket on AWS EC2 instances.
On ena driver version 2.7.1, a driver reset occurs when load is applied in AF_XDP zero-copy mode.
The dmesg command shows the following message:
I have attached a sample program for reproduction and the full output of dmesg.
You can reproduce this behavior by following the steps below:
Server instance
We are using eth1 for the test.
This program listens on port 13333/udp.
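For readers who want the regular (AF_INET) side of the comparison, a minimal sketch of a UDP listener on the same port is shown here. It is not the attached reproduction program and does not use AF_XDP; the helper names `open_udp_listener` and `count_datagrams` are hypothetical, and only the port number 13333 is taken from the description above.

```python
import socket


def open_udp_listener(port: int = 13333, host: str = "0.0.0.0") -> socket.socket:
    """Open a regular AF_INET UDP socket bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def count_datagrams(sock: socket.socket, limit: int) -> int:
    """Block until `limit` datagrams have been read, then return the count."""
    received = 0
    while received < limit:
        # 65535 is large enough for any UDP payload, so nothing is truncated.
        sock.recvfrom(65535)
        received += 1
    return received
```

Calling `count_datagrams(open_udp_listener(), limit)` on the server instance while the client sends traffic gives a kernel-stack baseline to set against the AF_XDP numbers. Binding to the address assigned to eth1 instead of `0.0.0.0` would restrict the listener to the test interface.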
Client instance
Our environment is:
Both the server and client instances have the same specifications.
Thanks,