Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DefaultAccessorNamingStrategy converts to lowercases the whole first group of uppercase characters #4682

Open
1 task done
etrandafir93 opened this issue Aug 30, 2024 · 20 comments
Labels
to-evaluate Issue that has been received but not yet evaluated

Comments

@etrandafir93
Copy link

Search before asking

  • I searched in the issues and found nothing similar.

Describe the bug

Hello, I've noticed that DefaultAccessorNamingStrategy will convert to lowercase more than one letter that follows the prefix of a getter. This is happening in the for-loop here: DefaultAccessorNamingStrategy::legacyManglePropertyName.

Based on the code and comments, this seems to be the intended default behavior, but it can result in unintuitive outcomes.

This was my use case:

@Data
class MyObject {
    String iPhone; // <-- lombok generates the getter: getIPhone() {...}
}

If we use the default configuration, the resulting json can be a bit surprising:

{ 
    "iphone": "test"
}

Are there use cases where we can benefit from lowercasing more than one character?

Version Information

latest master version

Reproduction

static class MyObj {
  String iPhone;
  // constructor

  public String getIPhone() {
    return iPhone;
  }
}

@Test
void lowercasingMoreThanOneChar() throws Exception {
  ObjectMapper mapper = new ObjectMapper();
  String json = mapper.writeValueAsString(new MyObj("test"));
  assertEquals("{\"iPhone\":\"test\"}", json); // <- fails, the key will be lowercase: "iphone"
}

Expected behavior

The expected outcome was to lowercase only the first character that follows the 'get' prefix.

Additional context

No response

@etrandafir93 etrandafir93 added the to-evaluate Issue that has been received but not yet evaluated label Aug 30, 2024
@yawkat
Copy link
Member

yawkat commented Aug 30, 2024

The reason is abbreviations, e.g. HTMLParser -> htmlParser. That this doesn't work well with lombok is a known issue.

@etrandafir93
Copy link
Author

hello @yawkat, thanks for the quick response!
I see, but even for this example it will still not work exactly as intended, right?

HTMLParser -> getHTMLParser() -> "htmlparser" (instead of "htmlParser")

@JooHyukKim
Copy link
Member

Hmmmm, very tricky topic here.
Do u have any suggestions wrt what can be done here? @etrandafir93

@yawkat
Copy link
Member

yawkat commented Aug 30, 2024

Maybe USE_STD_BEAN_NAMING is enough to fix your particular case.

@etrandafir93
Copy link
Author

@yawkat, @JooHyukKim,

Unfortunately USE_STD_BEAN_NAMING also fails, producing the key with with both characters as uppercase:
field: iPhone -> getter: getIPhone() -> json field: "IPhone".

Tested here:

@Test
void lowercasingMoreThanOneChar() throws Exception {
  ObjectMapper mapper = new ObjectMapper().configure(USE_STD_BEAN_NAMING, true);
  String json = mapper.writeValueAsString(new MyObj("test"));
  assertEquals("{\"iPhone\":\"test\"}", json); // <-- fails, actual: {"IPhone":"test"}
}

To summarize, switching between Lombok data/value classes (or just class with IDE-generated getters) to Java records results in different json for these cases.
There are a few workarounds to avoid this at a field level - for eg: serialize using the fields instead of getters, custom @JsonProperty for this specific field, or even setting lombok.accessors.fluent=true for a record-like accessor) - same goes for deserialization.

Maybe a new naming strategy or some config property can address this at a project level - but you'll know better if a new config property will make sense here.

@JooHyukKim
Copy link
Member

JooHyukKim commented Aug 30, 2024

Maybe a new naming strategy or some config property can address this at a project level - but you'll know better if a new config property will make sense here

New naming strategy might be possible. But I doubt changing the default will push through --this will create disaster for so many projects out there.

Based on the code and comments, this seems to be the intended default behavior, but it can result in unintuitive outcomes.

With such expected side effect in mind, intuition may not enough to drive changing default behavior.
If you need to resolve your case, maybe there is a customization point for accessor naming strategy that I am not aware of @etrandafir93 ?

@etrandafir93
Copy link
Author

@JooHyukKim - I never suggested changing the default configuration :)

Just wanted to highlight this situation and ask if it is/will be possible to achieve this via a dedicated configuration at the project level.

@cowtowncoder
Copy link
Member

As I recall, the fundamental problem in case like this is that the name should be based on field name, and not on extracting it from getter or setter: latter is ambiguous as there's no 1-to-1 mapping.
This might also require at least partially name-insensitive matching, to connect the two.

I don't think naming strategy is, unfortunately, quite enough to solve the problem.

@JooHyukKim
Copy link
Member

@JooHyukKim - I never suggested changing the default configuration :)

Right, I guess I thought that's where discussion was heading towards :))

@yihtserns
Copy link
Contributor

yihtserns commented Sep 1, 2024

Just to be clear: by "latest master version", do you mean the unreleased version 3.0 (https://github.com/FasterXML/jackson-databind/tree/master)?

EDIT: Nope, likely means 2.17.2. Also:

To summarize, switching between Lombok data/value classes (or just class with IDE-generated getters) to Java records results in different json for these cases.

...means field named iPhone will result in JSON Object field name "iphone" for Lombok data/value class with , but "iPhone" when using Records (for e.g. record MyObj(String iPhone) { ... })

(I've just understood what that statement meant, just sharing in case it helps someone else.)


NOTE: What happens internally is that POJOPropertiesCollector found:

  1. iPhone POJOPropertyBuilder for the package-private field
  2. iphone POJOPropertyBuilder for the public getter

...then removing the 1st property because it is "not visible" (default settings only considers public field as visible), leaving us with the 2nd property.

@yihtserns
Copy link
Contributor

yihtserns commented Sep 3, 2024

Just found out that JavaBean's API considers getIPhone() to belong to a property named IPhone:

image

...showing us how Records is not a 1-to-1 "drop-in" replacement for Bean classes.

@cowtowncoder
Copy link
Member

Records do not follow Bean convention anyway, due to no set-/get-prefixes. And no need to modify capitalization.

@yihtserns
Copy link
Contributor

yihtserns commented Sep 3, 2024

The expected outcome was to lowercase only the first character that follows the 'get' prefix.

Out of curiosity, I tried to hack something out but these things don't look like public APIs...:

objectMapper.setAccessorNaming(new DefaultAccessorNamingStrategy.Provider() {

    @Override
    public AccessorNamingStrategy forPOJO(MapperConfig<?> config, AnnotatedClass targetClass) {
        return new DefaultAccessorNamingStrategy(config, targetClass, _setterPrefix, _getterPrefix, _isGetterPrefix, _baseNameValidator) {

            @Override
            public String findNameForRegularGetter(AnnotatedMethod am, String methodName) {
                if (_getterPrefix != null && methodName.startsWith(_getterPrefix)) {
                    String name = methodName.substring(_getterPrefix.length());
                    char firstChar = name.charAt(0);
                    char secondChar = name.charAt(1);

                    boolean annoyingName = Character.isUpperCase(firstChar) && Character.isUpperCase(secondChar);
                    if (annoyingName) {
                        int firstLowerCaseIndex = 0;
                        char[] chars = name.toCharArray();
                        for (int i = 0; i < chars.length; i++) {
                            if (Character.isLowerCase(chars[i])) {
                                firstLowerCaseIndex = i;
                                break;
                            }
                        }

                        // To convert e.g.:
                        // - `IPhone` --> `iPhone`
                        // - `HTMLParser` --> `htmlParser`
                        int lastFrontUpperCaseIndex = firstLowerCaseIndex - 1;
                        return name.substring(0, lastFrontUpperCaseIndex).toLowerCase(Locale.ROOT)
                                + name.substring(lastFrontUpperCaseIndex);
                    }
                    return super.findNameForRegularGetter(am, methodName);
                }
                return null;
            }
        };
    }
});

Is this really the "official" way to provide an alternative AccessorNamingStrategy, or it's actually at another place, or there's none (e.g. AccessorNamingStrategy is an internal API)?

@JooHyukKim
Copy link
Member

'setAccessorNaming()' method seems quite public (behavior wise) to me. It provides 'Annotated' and value as inputs. 🤔
May I ask what attributes made u think it isnt public?

Also There might be some JavaDoc written about customizing idk

@yihtserns
Copy link
Contributor

May I ask what attributes made u think it isnt public?

While having protected access modifier normally means "subclasses are welcome to use me", seeing a field name starting with underscore always gives me "I AM AN INTERNAL API" vibes, especially since I'm not familiar with naming convention used in this project.

@cowtowncoder
Copy link
Member

cowtowncoder commented Sep 3, 2024

So, AccessorNamingStrategy is a public extension point, but only meant for application developers (sort of end user), not for modules and so on. And not by jackson-databind itself (it shouldn't need that).

So it is not an internal extension point.

@etrandafir93
Copy link
Author

etrandafir93 commented Sep 4, 2024

Hey @cowtowncoder and everyone else, thank you all for stepping in here!

To clarify, I mentioned Lombok and records just to provide context for how we encountered the issue. The question is whether it’s ok to lowercase the entire first group of uppercase characters following the get/set prefix, without offering an alternative strategy via configuration.

From what I can see, there are cases where this default behavior might make sense, particularly when dealing with single words in all uppercase:

getID() -> "id"
getHTML() => "html"

However, this will make it impossible to serialize correctly -via getters- fields starting with a single lowercase character, such as: iPhone, iPad, eGaming, eCommerce... etc.

So, I think the core question of this issue is: does it make sense to have a naming strategy that only lowercase the first letter following the get prefix? If not, and we should just use field serialization for these cases, I think we can go ahead and close this issue. :)

@cowtowncoder
Copy link
Member

I don't think that changing default behavior (either with or without USE_STD_BEAN_NAMING) is something we can do for 2.x.

I thought USE_STD_BEAN_NAMING would lower-case all leading upper-case; and non-USE_STD_BEAN_NAMING just the first one, fwtw.

But ultimately to fix the issue would, I think, require new option: anchoring property name to Field name, and matching possibly case-mismatching setter/getter to that name.
That is something we could consider for 3.0, even as default -- or possibly, as 2.19 feature, defaulting to false in 2.x, but true in 3.0.

@cowtowncoder
Copy link
Member

Oh. And looking at MapperFeature.USE_STD_BEAN_NAMING; case of "getURL()"

  • When enabled -> "URL" (and hence)
  • When disabled -> "url"

so it won't do what would work here, "Url". But even if there was such a setting, I don't think it would cover all cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
to-evaluate Issue that has been received but not yet evaluated
Projects
None yet
Development

No branches or pull requests

5 participants