Supporting unit()s comprehensively in ggplot2 aesthetics #5609
Comments
Heavy ggplot2 user here: I would absolutely love functionality like this
Playing around more, here are some triangular points that always rest on the line even when resized (not possible without a custom geom currently AFAIK):

```r
library(ggplot2)  # as_pt() comes from the draft PR (#5610), not released ggplot2

set.seed(1234)
data.frame(var1 = c("a","b"), var2 = rnorm(100, c(1,2))) |>
  ggplot(aes(y = var1, x = var2)) +
  stat_summary(fun.data = mean_se, geom = "linerange") +
  # shift the triangle by 4.5 pt so that its tip rests on the line
  stat_summary(
    aes(y = stage(var1, after_stat = y + as_pt(4.5))),
    fun = mean, geom = "point", shape = 25, size = as_pt(5)
  )
```

This suggests that perhaps
I agree that this would be very nice to have in principle.
With regard to this, I think mostly the upstream plumbing is the issue, but wrapping
Thanks for taking a look @teunbrand! I agree an important question is what should live where. Re: non-positional aesthetics, I agree that may be largely {grid}, so perhaps this issue should be split into two issues: one for positional aesthetics and the other for non-positional aesthetics, and to focus on positional aesthetics first.

At this stage I would be curious: what is the sense among ggplot2 maintainers that this feature would be useful, and is it worth me investing additional time working on solutions? Basically: is there a chance of this getting in?

If there is, I'd be happy to continue prototyping, and then come back with a proposal (or maybe a few possible proposals) outlining changes to {ggplot2} and {grid} (and maybe {vctrs}) that would be necessary. I can already foresee different levels of complexity for different solutions, so I would probably propose at least a "minimal" solution and one or two more complex versions with varying tradeoffs. But I'd like to know the time wouldn't be wasted before putting the work in ;). Thanks!
A separate issue for discussing improvements to {grid} units sounds useful. There are plenty of complications to consider. For example, something like
I'd say that there is definitely merit to the idea of providing access to some of grid's unit system for positional aesthetics. However, no amount of hacking or heuristic assumptions on the unit system will get into ggplot2; that is simply too brittle.

I think a good way forward is to nail down what the minimal amount of features would be that can solve this. Is a single absolute unit enough? Then maybe we can avoid a lot of issues by not using grid's unit system at all.

As for unit support in non-positional aesthetics, that will not make it into ggplot2 before they make it into
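One possible reading of the "single absolute unit" idea, as a rough sketch only: if offsets were always given in one fixed unit (millimetres here), a position could be resolved with plain arithmetic once the panel's physical size is known, with no {grid} units involved. The helper `offset_position()` and its arguments are hypothetical names chosen for illustration; they are not part of ggplot2 or of the draft PR.

```r
# Hypothetical helper (not a ggplot2 function): combine a data position that
# has already been rescaled to [0, 1] along the panel with an offset given in
# a single absolute unit (millimetres), using the panel's physical width.
offset_position <- function(x_rescaled, offset_mm, panel_width_mm) {
  x_rescaled + offset_mm / panel_width_mm
}

# A point at the panel centre, nudged 2 mm to the left in a 100 mm wide panel.
nudged <- offset_position(0.5, offset_mm = -2, panel_width_mm = 100)
```

The trade-off in this sketch is that the panel size has to be known at the point where the arithmetic happens, which in practice means at draw time.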
Makes sense :)
Agreed re: nailing down minimal features. Though I'm not sure what you mean by "is a single absolute unit enough"? I think the minimal version, conceptually, might be that the user needs some way to transform positional variables ( That is more or less what the prototype I came up with allows, but in a bit of a convoluted way, by essentially hiding the transformation inside a So maybe the problem is that we need some way for the user to be able to specify "unit transformations", where a unit transformation is a function (or expression or something) that takes in positional variables from the layer data ( One version of the interface could be an "after_coord" stage, like Although, the hidden-column solution might not work with direct specification of position values outside of Finally, if desired (personally I think it makes things much more accessible), a lightweight
👍 |
The problem
Currently, a big source of fiddliness in ggplot2 arises when you need to make manual adjustments to the positions of objects in the final plot. Often (but not always) this happens when doing annotation, e.g. positioning labels such that they are nicely and consistently spaced or positioned relative to data, and/or relative to the plot itself. This can be frustrating because if the plot or data dimensions change, manually-positioned elements will also move, because the only way to position them is by supplying values in data units. However, annotation usually needs to be specified in plot units (e.g., points, npcs, etc.) or, even worse, in a combination of data units and plot units.
That said, {grid} has robust support for unit conversion and combination in the form of `grid::unit()`. If ggplot was able to specify positions (and even `size`s, `linewidth`s, etc) using `grid::unit()`s, it would be a lot easier to create charts that look good even when the limits of underlying x/y scales change or when the dimensions of the plot change.

Even better would be to be able to combine data units and plot units: e.g., to be able to specify something like "put this text label exactly 5 points to the left of this data position". `grid::unit()` already has a data unit (the `"native"` unit) that could be used for this purpose. It is somewhat unused in ggplot2 because ggplot internally scales everything into `c(0,1)`, so `"native"` and `"npc"` are effectively equivalent. That means `"native"` units could be used to solve this problem...

A possible solution for positional aesthetics (draft PR: #5610)
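As background for the proposal, here is a minimal sketch of the unit arithmetic that {grid} already provides on its own (R 4.0 or later assumed). It only uses plain `grid::unit()` objects and none of the prototype's code; the variable names are illustrative.

```r
library(grid)

# A data position expressed in "native" units. ggplot2 rescales positions to
# [0, 1] internally, so 0.25 stands for "a quarter of the way along the panel".
x_data <- unit(c(0.25, 0.75), "native")

# "Exactly 5 points to the left of this data position": grid keeps this as a
# compound unit and only resolves it to a physical location at draw time.
label_x <- x_data - unit(5, "pt")

print(label_x)
```

Because the expression stays symbolic until drawing, the 5 pt gap is preserved when the plot is resized, which is the property the issue is after.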
I started playing with the above problem earlier this week, and I think I've come up with something that, with some polish, might be able to solve it without much surgery---at least for positional aesthetics. The solution would also allow extension packages to more-or-less automatically take advantage of it.
The basic idea is to allow `unit()` vectors containing at most one `"native"` unit to be assigned to positional aesthetics, and for this native unit to represent the (transformed) data. Then you can do stuff like this:

The implementation is a work-in-progress (draft PR: #5610). It hides/exposes units in a way similar to what @teunbrand implemented for `"AsIs"` objects. It works slightly differently in that when a unit column is hidden, the `"native"` unit contained within the unit expression is left behind so scales can manipulate it, then the transformed native values replace their corresponding values in the hidden unit expression when it is unhidden later.

I was able to do this without modifying most components of the grammar, except for `Coord`s --- these also need to do some hiding/unhiding of units which cannot be done in `ggplot_build`. The solution I came up with was to do the hiding/unhiding in `Coord$transform()` and move the implementation of each Coord's transform to `Coord$transform_native()`, which means that extension package `Coord`s would continue to work (but without supporting `unit`s), and they would need only to change the name of their `Coord$transform()` functions to `Coord$transform_native()` to get support for `unit`s.

I also ran into some snags with `grid::unit()` in that it (1) does not directly support being added to data frames (needs an `as.data.frame` implementation); (2) doesn't support zero-length vectors (I had to construct them manually); and (3) needed some additional `vctrs` methods to be implemented to work within ggplot2 more easily. I put these in the draft PR (#5610), but these are probably more appropriate for some combination of `vctrs` and `grid`.

Finally, I think a subclass of `unit()` specific to `ggplot2` (call it `ggunit()`) could be useful, as it would allow some simplification of syntax and improved semantics specific to the "only one native-unit component of the unit subexpression" interpretation of units. Specifically, if casting rules are written such that numerics are cast to `"native"` units when combined with `ggunits()`, this allows instances like `unit(var1, "native")` above to be replaced with `var1`, making the code much cleaner:

Some shortcut functions for commonly-used units, like `as_pt(...)` and `as_npc(...)`, simplify things further:

It's worth noting that this all works with coordinate transformations, too:
On the slightly crazier side, I also prototyped an implementation of `pmin` and `pmax` for `unit`s, which makes it easy to say things like "put the label 10 pts left/up from the point, but also make sure it's at least 10pts from the plot edge":

(I don't like the names `ggunit_pmin` and `ggunit_pmax`; and possibly it would be better to make `pmin`/`pmax` generic)

`unit` in non-positional aesthetics

Getting `unit` to work in non-positional aesthetics, like `size` and `linewidth`, is a bit more complicated. The issue is that the corresponding properties in `grid` grobs for these aesthetics (e.g. `fontsize` and `lwd`) don't take `unit`s, so it is necessary to wrap the `grid` version of the grob to get the desired functionality.

I only did this to `geom_point()` to test it. Here are some points that are always the same width in data space when resized:

While a bit of meta-programming could make this a straightforward task, it might make more sense to petition to get the underlying `grid` grobs to fully support `unit`s for these properties. Otherwise, extension package developers would not necessarily get these changes "for free", but would have to change to using ggplot's version of each grob. Tagging @pmur002 for thoughts.

Why I think this should be in ggplot2, not an extension
Fundamentally, annotation is very important to good visualization, and can be very fiddly to do well in ggplot2 currently (in fact, in a study of ggplot2 experts I conducted a while back, this was one of their big pain points; see Sec 5.2.1). Comprehensive support for `unit()`s could go a long way to making this easier, and I think can only be done within core ggplot2 --- plus, if the solution works well, all (or nearly all) extension packages would then support it too.