Skip to content

Conversation

@f-liva
Copy link

@f-liva f-liva commented Jul 1, 2025

In Italian, unlike English, when forming the plural of a combination of words, often all the words need to be pluralized, not just the last one. For example, "libro giallo" (yellow book) becomes "libri gialli" (yellow books) so both the noun and the adjective change.

To handle this peculiarity, I created a new class called MultiWordInflector that uses and extends the standard WordInflector class to correctly transform each word in a phrase into plural or singular.

I had to modify some parts of the word inflection system. In particular, the RulesetInflector class was changed into an abstract class based on a generic class, so that a new version specific to Italian could be created without altering the original one. I also updated the class factory to allow choosing which version of RulesetInflector to use. Finally, I modified the Italian inflection class to use this new configuration and thus allow correct plural handling in multi-word phrases.

@f-liva f-liva changed the title Feature/multiword inflector Improved pluralization with multi-word inflection in Italian Jul 1, 2025
@f-liva f-liva force-pushed the feature/multiword-inflector branch 5 times, most recently from a12a49f to 99dfff8 Compare July 2, 2025 08:36
@f-liva
Copy link
Author

f-liva commented Jul 2, 2025

Hi, I have completed the corrections on the PR.

@f-liva f-liva force-pushed the feature/multiword-inflector branch from 99dfff8 to 0138449 Compare July 2, 2025 08:38
@f-liva f-liva changed the title Improved pluralization with multi-word inflection in Italian Improved Italian pluralization with multi-word inflection Jul 2, 2025
@garrettw
Copy link

This would be great for other romance languages like Spanish as well.

@f-liva
Copy link
Author

f-liva commented Jul 16, 2025

@garrettw yes, definitely, but I don't know why the PR hasn't been merged yet even though all the automated tests have passed.

@greg0ire
Copy link
Member

This is not a bugfix, it should target 2.1.x

@f-liva
Copy link
Author

f-liva commented Jul 22, 2025

Yes, I confirm that it is not a fix but an expansion, although it introduces a simplification of the rules for pluralization and singularization of Italian words.

@greg0ire greg0ire changed the base branch from 2.0.x to 2.1.x July 22, 2025 07:39
// Split the phrase into words and process each one
$words = preg_split('/([ -])/', $word, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
if ($words === false) {
return parent::inflect($word);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this fails, it feels like you should rather throw an exception.

https://3v4l.org/evCNt#vnull

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d say that as a fallback it’s right for it to handle the string as if it were a single word, so I would leave it like that without raising exceptions, because in Italian there are often compound or multiple words, so if there were any unmanaged cases exceptions would be raised, whereas this way it would still be handled.

['fascia', 'fasce'],
['fascio', 'fasci'],
['targa', 'targhe'],
// ['carta di credito', 'carte di credito'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are some of these commented out? I can see that some multi-word phrases are inflected correctly, is there an issue with these particular ones?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it commented out because it's a compound-word case that uses a preposition and we should handle it with a dedicated PR separately since Italian is very complex grammatically. So when there are words like "carta di credito" it should convert to "carte di credito" and, as you can see, the last word stays singular. If I uncommented that test right now the test would fail, but simply because the current multilingual inflector does not handle prepositions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scusa, i miei giorni di scuola italiana erano molti anni fá, quindi non mi ricordo tutte le regole grammaticali. ;)

Is there a pattern we can apply? As much as I appreciate adding an inflector for Italian, I'd like to not merge it while it doesn't support a good part of phrases, as that tends to generate a number of bug reports once the inflector is used. Unfortunately my high school days are long gone, so I forgot most of my Italian grammar rules, but if it always affects <noun> <preposition> <word> for a list of prepositions (best list I could find was "di, a, da, in, con, su, per, tra, fra", not sure if we should add "allo, dalla" and some others) then it might be possible to extract the noun, inflect that as its own word, then reassemble the compound phrase.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ciao! Non sapevo avessi studiato italiano! Grande! 💪🏼

I confirm that the current behavior is intentional: "simple" compound words (e.g., compounds without prepositions like “carta intestata” or joined compound words) are already handled by the inflector and the related works; whereas phrases of the type (e.g., "carta di credito") are deliberately left to fallback and therefore commented out in the tests because the current logic does not correctly inflect only the initial part while leaving the rest unchanged.

Italian has morphological/agreement rules and cases (articles, elisions, contractions) that make prepositional cases more complex.
Implementing robust handling requires: recognizing the preposition, isolating the noun to inflect, respecting agreement rules and any elisions/contractions (e.g., "a + il = al", "di + lo = dello") and an extensive list of prepositions/variants.
To avoid regressions and user-facing bugs I have chosen to leave these cases as a fallback until a specific and tested solution is introduced.
But for now the inflection rules have been simplified at the core of Italian grammar and everything works as it should, except for possible incorrect inflections caused by words with irregular pluralization/singularization forms, which will be added gradually because there is no list of all the irregular words of the Italian language from which we can draw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be willing to take a stab at figuring out whether the inflector is able to handle these prepositional cases? I haven't dug into the internals of the inflector, so I'm not sure if it's possible to fetch and handle the full phrase. Doesn't have to be in this PR, it's just about figuring out whether it's actually possible.

Copy link
Author

@f-liva f-liva Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it’s absolutely possible, and a solution could certainly be found. In the end a very simple rule might be to make the word before the preposition plural, while leaving the word after it singular. This alone would cover 90% of cases. Ex. "carta di credito" will become "carte di credito". As you can see, the last word "credito" it remains to singular form.

Then now that I think about it, it should work the same way in English: if I want to go from the singular "carta di credito" to the plural "carte di credito", it should become "credit card" which becomes "credit cards" and not "credits cards". Right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct. Good to know -- with that in mind, I'd be fine merging this and then taking care of prepositional cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be awesome


return implode('', $result);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class should come with tests

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented the tests for the MultiWordInflector class.

@greg0ire greg0ire changed the base branch from 2.1.x to 2.2.x August 10, 2025 19:38
@f-liva f-liva force-pushed the feature/multiword-inflector branch from 35b4de3 to 2e23870 Compare September 18, 2025 08:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants