table formatting, copy editing
clauswilke committed Sep 22, 2018
1 parent cd83414 commit 4af9fb8
Showing 3 changed files with 46 additions and 14 deletions.
32 changes: 32 additions & 0 deletions css/style.css
@@ -10,6 +10,38 @@ table {
font-size: .833em !important;
}

+.book .book-body .page-wrapper .page-inner section.normal table tr.header {
+border-top-width: 0px;
+/*background-color: #e8e8e8;*/
+}
+
+.book .book-body .page-wrapper .page-inner section.normal table th {
+padding-top: 3px;
+padding-bottom: 3px;
+border-top-width: 1px;
+border-bottom-width: 1px;
+border-top-color: #808080;
+border-bottom-color: #808080;
+}
+
+.book .book-body .page-wrapper .page-inner section.normal table td {
+padding-top: 3px;
+padding-bottom: 3px;
+border-top-width: 1px;
+border-bottom-width: 1px;
+border-top-color: #f0f0f0;
+border-bottom-color: #f0f0f0;
+}
+
+.book .book-body .page-wrapper .page-inner section.normal table tr:nth-child(2n) {
+background-color: #f0f0f0;
+}
+
+.book .book-body .page-wrapper .page-inner section.normal table:not(.kable_wrapper) tr:last-child td {
+border-bottom-width: 1px;
+border-bottom-color: #808080;
+}
+
caption {
display: table-caption;
font-size: .833em !important;
16 changes: 8 additions & 8 deletions figure_titles_captions.Rmd
@@ -9,13 +9,13 @@ library(tibble)
library(lubridate)
```

-# Putting data into context: Titles, captions, labels, and tables {#figure-titles-captions}
+# Titles, captions, and tables {#figure-titles-captions}

-Data visualizations are not art meant to be looked at only for its aesthetically pleasing features. Instead, they are supposed to convey information and make a point. To reliably achieve this goal, we have to place data visualizations into context and to present them with accompanying titles, captions, and other annotations. In this chapter, I will discuss how to properly title and label figures. As a somewhat related topic, I will also discuss how to present data in table form.
+A data visualization is not a piece of art meant to be looked at only for its aesthetically pleasing features. Instead, its purpose is to convey information and make a point. To reliably achieve this goal when preparing visualizations, we have to place the data into context and provide accompanying titles, captions, and other annotations. In this chapter, I will discuss how to properly title and label figures. I will also discuss how to present data in table form.

## Figure titles and captions

-One critical components of every figure is the figure title. Every figure needs the title. The job of the title is to accurately convey to the reader what the figure is about, what point it makes. However, the figure title may not necessarily appear where you were expecting to see it. Consider Figure \@ref(fig:corruption-development). This figure is inspired by a posting in @Economist-corruption. I have recreated their figure with several subtle but important modifications. First, I have updated the data to a more recent dataset (2015) and adapted the color and symbol styling so they follow the styling used throughout this book. Second, and more importantly, I have drawn the actual figure without title or statement about the data sources used. Instead, I provide these items in the caption block underneath the figure display. This is the style I am using throughout this book. I consistently show figures without integrated titles and with separate captions. (One exception are the stylized plot examples in Chapter \@ref(directory-of-visualizations), which instead have titles and no captions.)
+One critical components of every figure is the figure title. Every figure needs the title. The job of the title is to accurately convey to the reader what the figure is about, what point it makes. However, the figure title may not necessarily appear where you were expecting to see it. Consider Figure \@ref(fig:corruption-development). This figure is inspired by a posting in @Economist-corruption. I have recreated their figure with several subtle but crucial modifications. First, I have updated the data to a more recent dataset (2015) and adapted the color and symbol styling so they follow the styling used throughout this book. Second, and more importantly, I have drawn the actual figure without title or statement about the data sources used. Instead, I provide these items in the caption block underneath the figure display. This is the style I am using throughout this book. I consistently show figures without integrated titles and with separate captions. (One exception are the stylized plot examples in Chapter \@ref(directory-of-visualizations), which instead have titles and no captions.)

(ref:corruption-development) Corruption and human development: The most developed countries experience the least corruption. Data sources: Transparency International & UN Human Development Report

@@ -73,7 +73,7 @@ ggsave("figures/corruption_plot_base.png", plot_corrupt_base, width = 7, height
ggdraw() + draw_image("figures/corruption_plot_base.png")
```

-For reference, I also provide a version of the figure that has these elements incorporated into the main display (Figure \@ref(fig:corruption-development-infographic)). In a direct comparison, you may find Figure \@ref(fig:corruption-development-infographic) more attractive than Figure \@ref(fig:corruption-development), and you may wonder why I am choosing the latter style throughout this book. I do so because the two styles have different application areas, and figures with integrated titles are not appropriate for conventional book layouts. The underlying principle is that a figure can have only one title. Either the title is integrated into the actual figure display or it is provided as the first element of the caption underneath the figure. And, if a publication is laid out such that each figure has a regular caption block underneath the display item, then the title *must* be provided in that block of text. For this reason, in the context of conventional book or article publishing, we do not normally integrate titles into figures. Figures with integrated titles, subtitles, and data source statements are appropriate, however, if they are meant to be used as stand-alone infographics or to be posted on social media or on a web page without accompanying caption text.
+For reference, here I also provide a version of the figure that has these elements incorporated into the main display (Figure \@ref(fig:corruption-development-infographic)). In a direct comparison, you may find Figure \@ref(fig:corruption-development-infographic) more attractive than Figure \@ref(fig:corruption-development), and you may wonder why I am choosing the latter style throughout this book. I do so because the two styles have different application areas, and figures with integrated titles are not appropriate for conventional book layouts. The underlying principle is that a figure can have only one title. Either the title is integrated into the actual figure display or it is provided as the first element of the caption underneath the figure. And, if a publication is laid out such that each figure has a regular caption block underneath the display item, then the title *must* be provided in that block of text. For this reason, in the context of conventional book or article publishing, we do not normally integrate titles into figures. Figures with integrated titles, subtitles, and data source statements are appropriate, however, if they are meant to be used as stand-alone infographics or to be posted on social media or on a web page without accompanying caption text.

(ref:corruption-development-infographic) Figure \@ref(fig:corruption-development) reformatted to be posted on the web or to be used as an infographic. The title, subtitle, and data source statements have been incorporated into the figure.

@@ -100,13 +100,13 @@ plot_grid(plot_corrupt_title,
If your document layout uses caption blocks underneath each figure, then place the figure titles as the first element of each caption block, not on top of the figures.
```

-One of the most common mistakes I see in figure captions is the omission of a proper figure title at the beginning. Take a look back at the caption to Figure \@ref(fig:corruption-development). It begins with "Corruption and human development." It *does not* begin with "This figure shows how corruption is related to human development." The first part of the caption is always the title, not a description of the contents of the figure. A title will usually not be a complete sentence, though should sentences making a clear assertion can serve as title. For example, for that figure, a title such as "The most developed countries are the least corrupt" would have worked fine.
+One of the most common mistakes I see in figure captions is the omission of a proper figure title as the first element of the caption. Take a look back at the caption to Figure \@ref(fig:corruption-development). It begins with "Corruption and human development." It *does not* begin with "This figure shows how corruption is related to human development." The first part of the caption is always the title, not a description of the contents of the figure. A title does not have to be a complete sentence, though short sentences making a clear assertion can serve as titles. For example, for Figure \@ref(fig:corruption-development), a title such as "The most developed countries are the least corrupt" would have worked fine.

## Axis and legend titles

Just like every plot needs a title, axes and legends need titles as well. (Axis titles are often colloquially referred to as *axis labels*.) Axis and legend titles and labels explain what the displayed data values are and how they map to plot aesthetics.

-As an example of a plot where all axes and legends are appropriately labeled and titled, I will take the blue jay dataset discussed at length in Chapter \@ref(visualizing-associations) and visualize it as a bubble plot (Figure \@ref(fig:blue-jays-scatter-bubbles2)). In this plot, the axis titles clearly indicate that the *x* axis shows body mass in grams and the *y* axis shows head length in milimeters. Similarly, the legend titles show that point coloring indicates the birds' sex and point size indicates the birds' skull size in milimeters. I emphasize that for all numerical variables (body mass, head length, and skull size) the relevant titles not only state the variable shown but also the units in which the variables are measured. This is good practice and should be done whenever possible. Categorical variables (such as sex) do not require units, however.
+To present an example of a plot where all axes and legends are appropriately labeled and titled, I have taken the blue jay dataset discussed at length in Chapter \@ref(visualizing-associations) and visualized it as a bubble plot (Figure \@ref(fig:blue-jays-scatter-bubbles2)). In this plot, the axis titles clearly indicate that the *x* axis shows body mass in grams and the *y* axis shows head length in milimeters. Similarly, the legend titles show that point coloring indicates the birds' sex and point size indicates the birds' skull size in milimeters. I emphasize that for all numerical variables (body mass, head length, and skull size) the relevant titles not only state the variables shown but also the units in which the variables are measured. This is good practice and should be done whenever possible. Categorical variables (such as sex) do not require units.

(ref:blue-jays-scatter-bubbles2) Head length versus body mass for 123 blue jays. The birds' sex is indicated by color, and the birds' skull size by symbol size. Head-length measurements include the length of the bill while skull-size measurements do not. Data source: Keith Tarvin, Oberlin College

@@ -187,7 +187,7 @@ price_plot_base <- ggplot(tech_stocks, aes(x = date, y = price_indexed, color =
price_plot_base + xlab(NULL) + ylab("stock price, indexed")
```

-However, we have to be careful when omitting axis or legend titles, because it is easy to misjudge what is and isn't obvious from the context. I frequently see graphs in the popular press that push omitting axis titles to a point that I wouldn't be comfortable with. For example, some publications might produce a figure such as Figure \@ref(fig:tech-stocks-minimal-labeling-bad), assuming that the meaning of the axes is clear from the plot title and subtitle (here: "stock price over time for four major tech companies" and "the stock price for each company has been normalized to equal 100 in June 2012"). I disagree with the perspective that context clearly defines the axes. Most importantly, I think that it is generally a bad design principle to make your readers guess what you mean. And, because the caption typically doesn't include words such as "the *x*/*y* axis shows", some amount of guesswork is always required to interpret the figure. In my own experience, figures without properly labeled axes tend to leave me with a nagging feeling of uncertainty---even if I'm 95% certain I understand what is shown, I don't feel 100% certain. As the designer of a figure, why would you want to create a feeling of uncertainty in your readers?
+However, we have to be careful when omitting axis or legend titles, because it is easy to misjudge what is and isn't obvious from the context. I frequently see graphs in the popular press that push omitting axis titles to a point that would make me uncomfortable. For example, some publications might produce a figure such as Figure \@ref(fig:tech-stocks-minimal-labeling-bad), assuming that the meaning of the axes is clear from the plot title and subtitle (here: "stock price over time for four major tech companies" and "the stock price for each company has been normalized to equal 100 in June 2012"). I disagree with the perspective that context clearly defines the axes. Because the caption typically doesn't include words such as "the *x*/*y* axis shows", some amount of guesswork is always required to interpret the figure. In my own experience, figures without properly labeled axes tend to leave me with a nagging feeling of uncertainty---even if I'm 95% certain I understand what is shown, I don't feel 100% certain. As a general principle, I think it is a bad practice to make your readers guess what you mean. Why would you want to create a feeling of uncertainty in your readers?

(ref:tech-stocks-minimal-labeling-bad) Stock price over time for four major tech companies. The stock price for each company has been normalized to equal 100 in June 2012. This variant of Figure \@ref(fig:tech-stocks-minimal-labeling) has been labeled as "bad" because the *y* axis now does not have a title either, and what the values shown along the *y* axis represent is not immediately obvious from the context.

@@ -220,7 +220,7 @@ Pie charts typically don't have explicit axes (e.g., Figure \@ref(fig:bundestag-

## Tables

-Tables are an important tool for visualizing data. Yet because of their apparent simplicity, they may not always receive the attention they need. I have shown a handful of tables throughout this book, for example Tables \@ref(tab:boxoffice-gross), \@ref(tab:titanic-ages), and \@ref(tab:color-codes). Take a moment and locate these tables, look how they are formatted, and compare them to a table you or a colleague has recently made. In all likelihood, there are important differences. In my experience, absent proper training in table formatting few people will instinctively make the right formatting choices, and poorly formatted tables are even more prevalent than poorly designed figures in self-published documents. In addition, most software commonly used to create tables provides defaults that are not recommended. For example, my version of Microsoft Word provides 105 pre-defined table styles, and of these at least 70--80 violate some of the table rules I'm going to discuss here. So if you pick a Microsoft Word table layout at random, you have an 80% chance of picking one that has issues. And if you pick the default, you will end up with poorly formatted tables every time.
+Tables are an important tool for visualizing data. Yet because of their apparent simplicity, they may not always receive the attention they need. I have shown a handful of tables throughout this book, for example Tables \@ref(tab:boxoffice-gross), \@ref(tab:titanic-ages), and \@ref(tab:color-codes). Take a moment and locate these tables, look how they are formatted, and compare them to a table you or a colleague has recently made. In all likelihood, there are important differences. In my experience, absent proper training in table formatting few people will instinctively make the right formatting choices, and in self-published documents poorly formatted tables are even more prevalent than poorly designed figures. Also, most software commonly used to create tables provides defaults that are not recommended. For example, my version of Microsoft Word provides 105 pre-defined table styles, and of these at least 70--80 violate some of the table rules I'm going to discuss here. So if you pick a Microsoft Word table layout at random, you have an 80% chance of picking one that has issues. And if you pick the default, you will end up with poorly formatted tables every time.

Some key rules for table layout are the following:
