Design document for percent formatting #1068
This document is focused for now on documenting the options.
#### Use a dedicated `:percent` function

Use a new function `:percent` dedicated to percentages.
Should we consider names besides `:percent`?
The function could apply to all dimensionless units including permille, permillion, perbillion, etc.
For example: `{$var :dimensionless unit=permillion}`
We should consider other names. I'll add that option.
I'm not wild about `unit=percent` (mille, billion, etc. etc.). It's verbose and the other uses seem rare. Really only percent and permille are backed by CLDR data. The others strike me as special uses for unit or number formatting.
exploration/percent-format.md
Outdated
#### Scaling
Implementation always scales the number. The value `0.5` formats as `50%`
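For comparison, the JDK's own percent formatter already behaves like the "always scales" option. The sketch below uses only `java.text.NumberFormat`, not MessageFormat 2 or ICU4J, and the class name `AlwaysScales` is made up for illustration.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class AlwaysScales {
    // Formats a fraction as a percentage.
    // The JDK percent instance multiplies the value by 100 before formatting.
    static String format(double fraction) {
        NumberFormat percent = NumberFormat.getPercentInstance(Locale.US);
        return percent.format(fraction);
    }

    public static void main(String[] args) {
        System.out.println(format(0.5));  // 50%
        System.out.println(format(0.25)); // 25%
    }
}
```

Under this behavior a caller who already holds the value `50` has to divide by 100 first, which is the case the scaling options in this document try to address.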
If we go with the "always scales" approach, `:unit` can still be used to not scale.
A first pass. I don't agree with the currently proposed design, but let's first get the available options better presented.
It would be nice to have more example code for the options.
Implementation automatically does (or does not) scale.
There is an option to switch to the other behavior.
Implementation automatically does (or does not) scale.
There is an option to switch to the other behavior.
Formatter automatically does (or does not) scale.
There is an option to switch to the other behavior.
The option here may be:
- An option `scaling` with boolean values `true` and `false`.
- An option `scale` with a small set of supported integer values, possibly only `1` and `100`.
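A rough sketch of how the integer-valued `scale` variant might behave, assuming the supported set is exactly `1` and `100`. The class `ScaleOption` and its method are hypothetical, and the output is an English-style number with a `%` suffix rather than locale-aware formatting.

```java
import java.math.BigDecimal;
import java.util.Set;

final class ScaleOption {
    // Hypothetical: the only multipliers a `scale` option would accept.
    private static final Set<Integer> SUPPORTED = Set.of(1, 100);

    // Multiplies the operand by `scale`, then appends a percent sign.
    static String formatPercent(BigDecimal value, int scale) {
        if (!SUPPORTED.contains(scale)) {
            throw new IllegalArgumentException("unsupported scale: " + scale);
        }
        BigDecimal scaled = value.multiply(BigDecimal.valueOf(scale)).stripTrailingZeros();
        return scaled.toPlainString() + "%";
    }

    public static void main(String[] args) {
        System.out.println(formatPercent(new BigDecimal("0.5"), 100)); // 50%
        System.out.println(formatPercent(new BigDecimal("50"), 1));    // 50%
    }
}
```

A boolean `scaling` option would collapse to the same two cases, with `true` standing for `100` and `false` for `1`.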
exploration/percent-format.md
Outdated
- Allow `unit=percent` in `:unit` that is identical to `:percent` in formatting performance,
  for compatibility with CLDR units,
  but document that this usage is not preferred.
Would this mean that `:unit unit=percent` would or would not apply scaling? And why do we need or benefit from compatibility with CLDR units here?
why do we need or benefit from compatibility with CLDR units here?
We don't have to be compatible, except that currently the definition of the `unit` option values is completely delegated to the unit identifiers found here in TR35. It would be unfortunate to say "unit identifiers except this one specific one".
Would this mean that :unit unit=percent would or would not apply scaling?
That's a good question. In the most recent WG discussion, there was a sentiment that we should make them behave identically to avoid confusion. There's an equal sentiment that they should be opposite each other (for utility). Here I'm trying to express equivalent performance without binding to a specific scaling/not-scaling choice (since that is separate).
Currently, my agreement with the proposal is dependent on `:unit unit=percent` not scaling. So, I see specifying that here one way or another as important.
So, I see specifying that here one way or another as important.
👍 It is imperative that we specify one or the other.
Currently, my agreement with the proposal is dependent on :unit unit=percent not scaling.
The proposal is to make `:percent` and `:unit unit=percent` perform identically, so both would scale by default. Is your opposition to `:unit` scaling so that message writers could get access to both behaviors without having to use a `scale` option? Articulating your reasoning will help me improve the design doc to include that as a design we considered (and perhaps sway consensus).
There is a misunderstanding of unit scaling.
For unit formatting, CLDR has both an input unit and an output unit, where the output unit typically depends on the unit preferences. For example, <3.5, meter> input with foot output formats as "11.5 feet" (in English). There is scaling involved, in the conversion of 3.5 to 11.5. If there is no specified output unit, or the output unit is explicitly the same as the input unit, then there is no scaling. Thus:
- <3.5, meter> input with meter output doesn't scale.
- If I supply <0.35, percent> as the input and the output unit were percent, it would format as 0.35%. Just like meter ==> meter doesn't scale.
However, if I supply the right input unit, then percent does scale (just like meter ==> foot). The base unit for such dimensionless units is 'part'.
With <0.35, part> as the input and the output unit of percent, the format is "35%".
Here are sample conversions that I just generated (no formatting)
0.35 part 0.35 part
0.35 part 35.0 percent
0.35 part 350.0 permille
0.35 part 3500.0 permyriad
0.35 part 350000.0 part-per-1e6
0.35 part 3.5E8 part-per-1e9
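The sample conversions above can be approximated with a simple table of multipliers. This is a sketch only: the unit names are copied from the output above, while `PartConversions`, `PER_PART`, and `fromPart` are hypothetical names and do not correspond to any CLDR or ICU API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PartConversions {
    // How many of each target unit make up one `part` (assumed factors).
    static final Map<String, Double> PER_PART = new LinkedHashMap<>();
    static {
        PER_PART.put("part", 1.0);
        PER_PART.put("percent", 100.0);
        PER_PART.put("permille", 1_000.0);
        PER_PART.put("permyriad", 10_000.0);
        PER_PART.put("part-per-1e6", 1_000_000.0);
        PER_PART.put("part-per-1e9", 1_000_000_000.0);
    }

    // Converts a value expressed in `part` to the named target unit.
    static double fromPart(double value, String targetUnit) {
        Double factor = PER_PART.get(targetUnit);
        if (factor == null) {
            throw new IllegalArgumentException("unknown unit: " + targetUnit);
        }
        return value * factor;
    }

    public static void main(String[] args) {
        // Prints one line per target unit; values are subject to double rounding.
        for (String unit : PER_PART.keySet()) {
            System.out.println("0.35 part " + fromPart(0.35, unit) + " " + unit);
        }
    }
}
```

Converting from `part` to `part` uses a factor of 1, which matches the point that a conversion between identical input and output units does not scale.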
I understood that to be the case.
`:unit` can override the unit, in which case scaling occurs. The question is what happens when there is no other unit? Using `MeasureFormat` in ICU4J can only be an approximation, since the only way to call it is with a `Measure` object. Presumably a bare number operand in MF would, behind the scenes, be packaged with the unit.
I'm not suggesting that `:unit` does not convert. Only that the default behavior of `unit=percent` is unscaled given a numeric operand. This is different from MF1's handling of `operand,number,percent` formatting and the proposed performance of `:percent`. Do you disagree?
Is your opposition to :unit scaling so that message writers could get access to both behaviors without having to use a scale option?
Sorry I wasn't clear on that. Yes, I want them to act differently, so I guess that just isn't this proposal but an alternative. The reason, though, is not just to have access to both behaviors (though it's an excellent side benefit) but because it makes semantic/intuitive sense to me.
I see them as for different purposes, where the input value to `:percent` is for (or from) some computation which results in a ~number, and the input to `:unit` would be roughly a string (or semantically equivalent in its static intent, if that makes sense).
1/10 = .1 -> format via `:percent` -> "You've completed 10% of your tasks"
vs
A user inputs into a marketing tool a discount value of "10" and selects "%" (as opposed to "$", "lbs", "items" etc.), and that uses `:unit` to render things like "10% off", "$10 off", "Get 10 lbs free...", "Buy 10 get 1 free" or similar.
This is how I, presumptuously, think most people would expect each to work. Happy to be wrong about that, though.
I'm not terribly familiar with the input -> output scaling Mark mentioned, so I'll try to digest that a bit more and see if it changes my perspective. It doesn't initially seem problematic, though.
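The two use cases above can be imitated with the JDK's `DecimalFormat`, where an unquoted `%` in the pattern multiplies by 100 and a quoted `'%'` is only a literal suffix. This is an analogy, not the behavior of `:percent` or `:unit`; the class and method names are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class TwoPercentUseCases {
    private static final DecimalFormatSymbols US = DecimalFormatSymbols.getInstance(Locale.US);

    // Computed ratio: the unquoted '%' makes DecimalFormat multiply by 100.
    static String computedRatio(double ratio) {
        return new DecimalFormat("0%", US).format(ratio);
    }

    // User-entered value: the quoted '%' is a plain suffix, so nothing is scaled.
    static String enteredValue(double value) {
        return new DecimalFormat("0'%'", US).format(value);
    }

    public static void main(String[] args) {
        System.out.println("You've completed " + computedRatio(1.0 / 10) + " of your tasks");
        System.out.println(enteredValue(10) + " off");
    }
}
```

Both calls render "10%", one from the computed value `0.1` and the other from the entered value `10`.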
@bearfriend I think those are both great use cases and will add them to the document.
Note that the "proposed solution" is a strawman. The alternatives considered are what is important. We'll see if a consensus emerges--or vote on which technical decisions to make.
There is a misunderstanding of unit scaling.
I added your example (suitably edited and expanded) at length to the document. Check for veracity.
Co-authored-by: Eemeli Aro <[email protected]>
- Only percent and permille are backed with CLDR data and symbols.
  Other scales would impose an implementation burden.
CLDR has data for other scales, too, via `portion` units.
That's not really percent/per mille type scaling though, is it?
exploration/percent-format.md
Outdated
{{Completion: {$n :unit unit=percent}.}}
```

#### Use `:math multiply` to scale
Please note my concern about implementation burden due to having to support a more general function than we actually need.
The `:math` function, currently in draft and originally included to support plural offsets from MF1, is certainly a potential "slippery slope".
Many programming languages have math-related classes or function sets with many different operators in them. The existence of a `:math` function in MF would certainly invite proposals for many of these to migrate into messages, regardless of utility. This in a specification that is strongly **UN**typed.
If we go down the `:math` route, I would suggest that we write a full design document, including considerations for what our policies would be about future expansion. We should also consider whether `math` is the right name, or whether a different design strategy would be better, such as unbundling functionality into separate functions. Is it a better imposition of burden to have separate required `:add` and `:subtract` functions than a single required function that has addition, subtraction, scaling, etc., into which we might add hard-to-achieve functionality? There is also the question of versioning the `:math` function if we add new operations to it over time, creating a portability hazard.
**Pros**
- Consistent with ICU MessageFormat
- Widely viewed as a number format option in spreadsheets and other contexts, so many people are familiar with it as a type of number format.
- Consistent with compact number formats, which _also_ scale; e.g. "3.5 M" for 3500000.
Perhaps debatable?
It's certainly proximate to numeric formats, at least in some spreadsheets. FWIW, we do group it into the number functions and it certainly takes a numeric operand. But I think a case can be made that `:number type=percent` or `:percent` are both intuitive--and the latter becomes maybe a bit more obvious given `:currency`.
The meta debate we're having is a classic in the I18N space: split or lump? Should we prefer functions that do many things with lots of options? Or should we prefer functions that do roughly one thing with minimal options (and lots and lots of functions)?
Note that the "123" button is "More Formats" in Google sheets:
Excel puts percent after date/time:
A fair number of the other Pros/Cons are debatable... But I'll tweak my suggested change.
I'm not wild about :percent as a separate function; nor wild about :scientific or :engineering or :compact or even :integer. Just the sheer volume of duplicated options gets to be very daunting.
Just the sheer volume of duplicated options gets to be very daunting.
The worst of all worlds would be lots of functions, each of which has lots of options, where some functions are general purpose and overlap with special purpose ones. With support for custom functions, that will sometimes be unavoidable. But for the default function set we should have a clear policy/design philosophy. The meta debate is, in many ways, more important than the concrete decision of what to name the percent formatting function (but percent is as good a trial horse, I think, as we're going to get). Note that the discussion about semantic skeletons is also considering the problem of function packaging.
I agree that unless there is a way to distinguish the source and target unit, it wouldn't scale. I was just pointing that out for people unfamiliar with the way that units work.
(And I'm not a fan of using `:unit unit=percent` at all to solve the problem of producing "10%" from 0.1, for a number of reasons.)
Co-authored-by: Mark Davis <[email protected]>
I just realised that this whole discussion is also related to #1015 (review), which we probably ought to address as well. In other words, as we currently don't have |
In MF, a bare number literal, such as `.local $foo = {35}`
or an implementation-specific number type (such as an `int` in Java)
might be considered to use the input unit of `part`
unless we specified that the `percent` unit value or `:percent` function overrode the `part` unit with `percent`.
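To make the consequence of that assumption concrete, here is a small sketch in which a missing input unit is treated as `part`. Everything in it is hypothetical (`BareOperand`, `toPercent`, and the use of `null` for "no unit"); it only illustrates the arithmetic, not any proposed API.

```java
final class BareOperand {
    // Returns the numeric value to display with a percent sign.
    // A null inputUnit models a bare number with no unit attached.
    static double toPercent(double value, String inputUnit) {
        String unit = (inputUnit == null) ? "part" : inputUnit; // the assumption under discussion
        switch (unit) {
            case "part":
                return value * 100.0; // part -> percent is a real conversion
            case "percent":
                return value;         // percent -> percent is the identity
            default:
                throw new IllegalArgumentException("unsupported input unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(toPercent(35, null));      // 3500.0
        System.out.println(toPercent(35, "percent")); // 35.0
    }
}
```

With the `part` default, the literal `35` would render as 3500 percent; with the override, it would render as 35 percent.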
Does `MeasureFormat` or any other unit formatter implementation ever make such an assumption, of having a numerical input value not match the formatter's output units?
AFAICT, it's not possible in `MeasureFormat` to format a number without a unit... and "part" is not a unit. `MeasureFormat`'s Javadoc says that it doesn't do conversions. But then I shouldn't have been looking at that class, but rather `NumberFormatter`. That class implements the interface that @macchiati describes, including conversion. And it can be fed a number whose unit is "nothing":
LocalizedNumberFormatter nf = NumberFormatter.withLocale(Locale.getDefault())
    .unit(MeasureUnit.PERCENT);
System.out.println(nf.format(5.0));
This doesn't scale though. It produces 5%.
I don't know about ICU, but CLDR does appear to treat `part` as a unit.
Not exactly. There are various measures (such as `parts-per-million`, which is a flavor of `concentr-part` [concentration]) that have "parts", but not a measurement `part` all by itself. ICU represents the various measures in `MeasureUnit` and it doesn't have a standalone `PART` member.
Which is beside the point. The percent format in `NumberFormatter` is unscaled when working on a raw number.
Co-authored-by: Eemeli Aro <[email protected]>