Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

granular parallel generic kernel for 64u_byteswap #679

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

marcusmueller
Copy link
Member

This simplifies (at least to me) understanding what the generic kernel does, and it's also about 1.5 times faster on my machines.

@marcusmueller marcusmueller changed the title granular paralel generic kernel for 64u_byteswap granular parallel generic kernel for 64u_byteswap Oct 23, 2023
@marcusmueller
Copy link
Member Author

marcusmueller commented Oct 23, 2023

is a good addition to make the byteswap somewhat performant on non-x86 platforms, especially in light of #680

@jdemel
Copy link
Contributor

jdemel commented Oct 24, 2023

Does this PR touch the intend of #606 ? I know, the concern different implementations.

@marcusmueller
Copy link
Member Author

No, it's unrelated. However, #680 addressed that quite directly.

Copy link
Contributor

@jdemel jdemel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR LGTM.

However, this whole kernel has issues. The include guards are at least confusing. The _a include guard is around the _u kernels and vice versa. Tail handling creates copypasta code. Loop variables are defined outside of loops. All in all, this kernel needs even more clean up. This is beyond this PR though.

One concern: We just removed all the _a_generic kernels. The diff looks like you rename one of these kernels.

Could you rebase your PR first before we merge it?

Copy link
Contributor

@jdemel jdemel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rebase this PR onto the current main? I'd say we can go with it then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants