Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add custom attributes to fn items #3088

Open
wants to merge 66 commits into
base: main
Choose a base branch
from

Conversation

oberrich
Copy link

@oberrich oberrich commented Jan 11, 2025

This PR adds functionality to add custom attributes to function items (proposed in #2978).
I've implemented this by extending the TypeKind used by AttributeInfo to include the Function(FunctionKind) variant (see AttributeItemKind).

Unifying the callback this way is a breaking change when --with_attribute_custom or matching on the kind field is used.

@oberrich oberrich marked this pull request as draft January 11, 2025 20:28
@oberrich oberrich marked this pull request as ready for review January 11, 2025 20:44
@oberrich oberrich requested a review from jschwe January 15, 2025 12:22
@oberrich
Copy link
Author

I'd love to add some tests as well but I'm not sure how I would do that, can't get anything other than basic tests to run.
Some of the tests seem to be Linux only and depend on some bash script.
Would love some advice although I don't think we necessarily need tests for this since I've basically just copied the logic from @mkroening's implementation and this should just work™.

@mkroening
Copy link
Contributor

You can take a look at #2866, where I added some tests along with the support for custom attributes on other items (attribute-custom.h, attribute-custom-cli.h, and their respective expected output).

@oberrich oberrich marked this pull request as draft January 29, 2025 20:59
@oberrich
Copy link
Author

oberrich commented Jan 29, 2025

I will be adding further changes to this PR.

The add_attributes callback isn't sufficient to generate bindings with #[cfg(...)] or #[link(...)] attributes due to type aliasing.
It also results in a lot of duplicate as well as unnecessary attributes.

I've made the following changes already:

  • Codegen will track attributes for each item inside CodegenResult in a HashMap<ItemId, HashSet<String>>
  • Codegen will apply attributes relevant to compilation, currently only #[cfg(...)] and #[link] are handled
  • Changed callback to include information about current attributes
-    fn add_attributes(&self, _info: &AttributeInfo<'_>) -> Vec<String> { vec![] }
+    fn process_attributes(&self, _info: &AttributeInfo<'_>, _attributes: &mut HashSet<String>) {}

Example usage:

fn process_attributes(&self, info: &bindgen::callbacks::AttributeInfo<'_>, attrs: &mut HashSet<String>) {
   match info.kind {
      AttrKind::Var => {
         attrs.insert(r#"#[cfg(feature="statics")]"#.to_owned());
      }
      AttrKind::Struct | AttrKind::Union | AttrKind::Enum => {
         attrs.insert(r#"#[cfg(feature="structs")]"#.to_owned());
      }
      AttrKind::Function(_) => {
         attrs.insert(r#"#[cfg(feature="functions")]"#.to_owned());
         attrs.insert(if info.name.starts_with("NtUser") {
            r#"#[link(name = "ntuser")]"#
         } else {
            r#"#[link(name = "ntdll")]"#
         }.to_owned());
      }
   }
}

Example output with merge_cfg_attributes:

#[cfg(feature = "functions")]
pub mod unsafe_extern_C_cfg_feature_functions_link_name_ntdll {
   #[link(name = "ntdll")]
   unsafe extern "C" {
      pub fn NtCallbackReturn(
         OutputBuffer: PVOID,
         OutputLength: ULONG,
         Status: NTSTATUS,
      ) -> NTSTATUS;
   // ...
}
#[cfg(feature = "functions")]
pub use unsafe_extern_C_cfg_feature_functions_link_name_ntdll::*;
  • Added a new merge_cfg_attributes post-processing pass that combines items with a unique combination of attributes relevant to compilation into a #[cfg(...)] pub mod { items... } and re-exports them via #[cfg(...)] pub use mod::*;

I still have to restore CLI functionality, we can possibly even add replace-attribute-* options and implement them using the new callback.

Progress can be tracked at https://github.com/oberrich/rust-bindgen/tree/alias for now, I will be merging them into this branch once it's ready for a review.

@oberrich oberrich mentioned this pull request Jan 29, 2025
@oberrich
Copy link
Author

oberrich commented Feb 5, 2025

I've rebased my changes onto upstream to simplify merging and cleaned up the commit history to make the changes more digestible.

@jschwe, I noticed you've merged #3103, and I'm wondering about the usefulness of process_comment in its current form. Shouldn't we invoke all callbacks so they can be layered?

We could take a similar approach to process_attributes and introduce a process_comments function that takes &mut Vec<String>. This would also help address #3073.

The process_comments callbacks would be invoked before process_attributes, acting as a subset of the process_attributes callback. Alternatively, we could consider removing the callback entirely, since its functionality is now largely covered by process_attributes and starts_with("#[doc"), though that might negatively impact UX.

I'm open to feedback on the best approach—would love to hear any thoughts!

@jschwe
Copy link
Contributor

jschwe commented Feb 5, 2025

@jschwe, I noticed you've merged #3103, and I'm wondering about the usefulness of process_comment in its current form. Shouldn't we invoke all callbacks so they can be layered?

Just to clarify: that PR only changed the behavior from using the last registered callback (where the implementation might very well be the default None), to the last callback that actually returns something.

I find it very convenient to setup a default global implementation to process comments, e.g. converting doxygen to markdown, and then for specific header files specify another callback, which overrides the global implementation and perhaps works around some issues for that specific header file.

I also don't really see a use case for layering different comment processing steps via different callbacks. Is there something that can't be done in one callback, or would be more convenient? I think it would be much harder to reason about, since the order of the steps matter.

This would also help address #3073.

I think that issue is easily fixable without any changes. Let me check.
Edit: Opened #3114

The process_comments callbacks would be invoked before process_attributes, acting as a subset of the process_attributes callback.

For my use-case at least I would prefer getting the original comment, since parsing doxygen comments is easier, then the post-processed markdown I generate in my case.

@oberrich
Copy link
Author

@emilio Since you're merging PRs kindly pining again for #3088 (comment)
If this patch looks good apart from the remaining issue (incorrectly emitting an attribute here and there) and 1-2 test cases, most of the changes are implemented.
I'll then fix the tests and make sure the remaining issues are fixed and add some documentation so this can get merged.
Will most definitely squash commit history and rebase everything; beware this is a larger patch and contains some refactoring of codegen.

Looking forward to a preliminary review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants