This is a large super-issue that I hope will drive some discussion and split into smaller issues. Perhaps this is a project/discussion for vizcon2 (I will not be in attendance).
I would really like to see StatsMakie as a first-class tool for interactively visualizing outputs of probabilistic programming languages like Turing, Gen, Soss, and Stan (explicitly, random draws from high-dimensional probability distributions, usually with a shape like (nchains, ndraws, nvariables)).
Due to Makie's interactivity, it could be a prime candidate for building an interactive dashboard for visualizing draws, not just after the analysis has completed, but in an on-line fashion.
To proceed, it would be helpful to formalize/document the grammar of graphics design for StatsMakie.
This would aid in determining at which levels different features should be implemented.
Some useful references for related GoGs are:
- Wilkinson's Grammar of Graphics
- Hadley Wickham's thesis, in particular Chapter 3 on the layered GoG, which explains the design of ggplot2 in R. See also the succinct explanation here.
- This paper on a probabilistic grammar of graphics. I haven't read it very closely, but it uses current research on the interpretability of visualizations to propose changes to a ggplot2-style GoG so that probabilities are communicated more accurately. This GoG's major changes happen at the data/aesthetic level, annotating groups with probabilistic information (see Fig 5).
I personally like ggplot2's GoG, augmented by tidybayes.
In ggplot2's GoG, the plots contain:
- a default dataset and set of mappings from variables to aesthetics,
- one or more layers, with each layer having each of the following:
  - one geometric object
  - one statistical transformation
  - one position adjustment
  - optionally, one dataset and set of aesthetic mappings,
- one scale for each aesthetic mapping used,
- a coordinate system,
- the facet specification.
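To make the list above concrete, here is a minimal sketch of that structure as plain data types. It is written in Python only to stay language-neutral; the names `Plot`, `Layer`, and `resolved` are hypothetical and are not part of ggplot2, Makie, or StatsMakie. The one design point it tries to show is that a layer falls back to the plot-level dataset and mappings unless it supplies its own.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Layer:
    geom: str                        # one geometric object, e.g. "point"
    stat: str = "identity"           # one statistical transformation
    position: str = "identity"       # one position adjustment
    data: Optional[list] = None      # optional layer-specific dataset
    mapping: Optional[dict] = None   # optional layer-specific aesthetic mappings


@dataclass
class Plot:
    data: list                                   # default dataset
    mapping: dict                                # default aesthetic mappings
    layers: list = field(default_factory=list)   # one or more layers
    scales: dict = field(default_factory=dict)   # one scale per aesthetic used
    coord: str = "cartesian"                     # coordinate system
    facet: Optional[str] = None                  # facet specification

    def resolved(self, layer):
        """Return the (data, mapping) a layer uses, falling back to plot defaults."""
        data = layer.data if layer.data is not None else self.data
        mapping = {**self.mapping, **(layer.mapping or {})}
        return data, mapping


rows = [{"x": 1, "y": 2.0}, {"x": 2, "y": 3.5}]
plot = Plot(data=rows, mapping={"x": "x", "y": "y"})
plot.layers.append(Layer(geom="point"))
plot.layers.append(Layer(geom="line", stat="smooth", mapping={"color": "chain"}))
```

Writing StatsMakie's current design down in a similar form might make it easier to see which of these slots already exist and which are missing.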
tidybayes is a package that augments ggplot2 to aid in plotting outputs of PPLs. It operates at the level of mappings, statistical transformations, and geoms.
The main thing it adds is the ability to flatten multidimensional draws to something like a dataframe that can then be understood by a GoG (i.e. "tidy data").
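As a rough illustration of that flattening step (not tidybayes' actual implementation), the sketch below turns draws shaped like `(nchains, ndraws, nvariables)` into one row per draw. The function name `tidy_draws`, the column names `chain` and `draw`, and the example variable names are assumptions made for this example.

```python
def tidy_draws(draws, names):
    """Flatten draws[chain][draw][variable] into one row per (chain, draw)."""
    rows = []
    for chain_idx, chain in enumerate(draws, start=1):
        for draw_idx, values in enumerate(chain, start=1):
            row = {"chain": chain_idx, "draw": draw_idx}
            row.update(zip(names, values))  # one column per variable
            rows.append(row)
    return rows


# Two chains, two draws each, two variables (mu, sigma).
draws = [
    [[0.1, 1.2], [0.3, 0.9]],  # chain 1
    [[0.2, 1.1], [0.0, 1.0]],  # chain 2
]
rows = tidy_draws(draws, ["mu", "sigma"])
```

Each row now carries its chain and draw index alongside the variable values, which is the kind of table a GoG can map to aesthetics and group by.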
Many of the additional stats/geoms are drop-in replacements for some in ggplot2, often changing defaults, combining geoms, or applying some principles from the probabilistic GoG.
Having some of these primitives for visualizing uncertainty in StatsMakie would be great.

With these primitives, you can build an impressive array of statistical plots.
I'm sure the devs here have given a lot of thought to the future of StatsMakie.
It would be useful to know how StatsMakie's GoG compares to the one above. (Is it missing elements that are needed? Does it have additional elements?)
For example, with `Data` and symbols, we currently have a convenient syntax to extract columns from a dataframe, but can this be extended?
E.g. what if, instead of a dataframe, we have some other structure that uses `getindex` to access values? What if the data is in an iterator, and we'd like to update plots on-line? Can a useful API for grouping, etc. be defined that could be overloaded by individual packages?
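To make the iterator question concrete, here is a minimal sketch of what an on-line update could consume: a running summary that is updated one draw at a time (using Welford's algorithm for the mean and variance) and yields a new value after every draw. It is plain Python with hypothetical names (`RunningSummary`, `stream_summaries`) and says nothing about how Makie or StatsMakie would actually be wired up; a plot would simply redraw on each yielded value.

```python
class RunningSummary:
    """Running mean and sample variance, updated one value at a time (Welford)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; undefined until there are at least two draws.
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")


def stream_summaries(draws):
    """Consume an iterator of draws, yielding (n, mean) after each one."""
    summary = RunningSummary()
    for x in draws:
        summary.update(x)
        yield summary.n, summary.mean


# Stand-in for a sampler that produces draws lazily.
updates = list(stream_summaries(iter([1.0, 2.0, 3.0, 4.0])))
```

The point of the sketch is that the plotting side never needs the full array of draws, only something it can iterate and a summary it can update incrementally.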
Lastly, what is the intended relationship between StatsPlots and StatsMakie? With the coming ability of Makie to consume Plots recipes, should recipes whenever possible be implemented without respect to either package? And should then this GoG be StatsMakie-specific or in some more generic package?
cc-ing @cscherrer, @cpfiffer, @trappmartin