Title: | Extension to 'ggplot2' |
---|---|
Description: | The R package 'ggplot2' is a plotting system based on the grammar of graphics. 'GGally' extends 'ggplot2' by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks. |
Authors: | Barret Schloerke [aut, cre], Di Cook [aut, ths], Joseph Larmarange [aut], Francois Briatte [aut], Moritz Marbach [aut], Edwin Thoen [aut], Amos Elberg [aut], Ott Toomet [ctb], Jason Crowley [aut], Heike Hofmann [ths], Hadley Wickham [ths] |
Maintainer: | Barret Schloerke <[email protected]> |
License: | GPL (>= 2.0) |
Version: | 2.2.1.9000 |
Built: | 2024-11-11 05:16:31 UTC |
Source: | https://github.com/ggobi/ggally |
Modify a ggmatrix object by adding a ggplot2 object to all plots
This operator allows you to add ggplot2 objects to a ggmatrix object.
## S3 method for class 'gg'
e1 + e2

add_to_ggmatrix(e1, e2, location = NULL, rows = NULL, cols = NULL)
e1 |
An object of class |
e2 |
A component to add to |
location |
|
rows |
numeric vector of the rows to be used. Will be used with |
cols |
numeric vector of the cols to be used. Will be used with |
If the first object is an object of class ggmatrix, you can add the following types of objects, and it will return a modified ggplot2 object.
theme: update plot theme
scale: replace current scale
coord: override current coordinate system
The + operator completely replaces elements with elements from e2.
add_to_ggmatrix gives you more control to modify only some subplots. This function may be replaced and/or removed in the future.
ggplot2::+.gg and ggplot2::theme()
# small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
pm <- ggpairs(tips[, 2:4], ggplot2::aes(color = sex))
## change to black and white theme
pm + ggplot2::theme_bw()
## change to linedraw theme
p_(pm + ggplot2::theme_linedraw())
## change to custom theme
p_(pm + ggplot2::theme(panel.background = ggplot2::element_rect(fill = "lightblue")))
## add a list of information
extra <- list(ggplot2::theme_bw(), ggplot2::labs(caption = "My caption!"))
p_(pm + extra)
## modify scale
p_(pm + scale_fill_brewer(type = "qual"))
## only rows 1 and 2
p_(add_to_ggmatrix(pm, scale_fill_brewer(type = "qual"), rows = 1:2))
## only columns 2 and 3
p_(add_to_ggmatrix(pm, scale_fill_brewer(type = "qual"), cols = 2:3))
## only to upper triangle of plot matrix
p_(add_to_ggmatrix(
  pm,
  scale_fill_brewer(type = "qual"),
  location = "upper"
))
Add reference boxes around each cell of the glyphmap.
add_ref_boxes( data, var_fill = NULL, color = "white", size = 0.5, fill = NA, ... )
data |
A glyphmap structure. |
var_fill |
Variable name to use to set the fill color |
color |
Set the color to draw in, default is "white" |
size |
Set the line size, default is 0.5 |
fill |
fill value used if |
... |
other arguments passed onto |
Add reference lines for each cell of the glyphmap.
add_ref_lines(data, color = "white", size = 1.5, ...)
data |
A glyphmap structure. |
color |
Set the color to draw in, default is "white" |
size |
Set the line size, default is 1.5 |
... |
other arguments passed onto |
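The two helpers above are meant to be layered onto a glyph map. The following is a minimal sketch, assuming the `nasa` data set and the `glyphs()` function that ship with GGally (neither is described in this section); the date and latitude/longitude filters are arbitrary choices made only to keep the plot small.

```r
library(GGally)
library(ggplot2)

# Assumption: `nasa` is the example data set bundled with GGally.
data(nasa)
nasa_late <- nasa[
  nasa$date >= as.POSIXct("1998-01-01") &
    nasa$lat >= 20 & nasa$lat <= 40 &
    nasa$long >= -80 & nasa$long <= -60,
]

# Build the glyphmap structure: one small time series per (long, lat) cell.
temp_gly <- glyphs(nasa_late, "long", "day", "lat", "surftemp", height = 2.5)

# Draw reference lines and boxes first so the glyph paths sit on top of them.
p <- ggplot(temp_gly, aes(gx, gy, group = gid)) +
  add_ref_lines(temp_gly, color = "grey90") +
  add_ref_boxes(temp_gly, color = "grey90") +
  geom_path() +
  theme_bw() +
  labs(x = "", y = "")

GGally::print_if_interactive(p)
```

The default color for both helpers is "white", which suits a grey panel background; with `theme_bw()` a light grey is easier to see.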
About PISA
data(australia_PISA2012)
A data frame with 8247 rows and 32 variables
The Programme for International Student Assessment (PISA) is a triennial international survey which aims to evaluate education systems worldwide by testing the skills and knowledge of 15-year-old students. To date, students representing more than 70 economies have participated in the assessment.
While 65 economies took part in the 2012 study, this data set only contains information from the country of Australia.
gender : Factor w/ 2 levels "female","male": 1 1 2 2 2 1 1 1 2 1 ...
age : Factor w/ 4 levels "4","5","6","7": 2 2 2 4 3 1 2 2 2 2 ...
homework : num 5 5 9 3 2 3 4 3 5 1 ...
desk : num 1 0 1 1 1 1 1 1 1 1 ...
room : num 1 1 1 1 1 1 1 1 1 1 ...
study : num 1 1 1 1 1 1 1 1 1 1 ...
computer : num 1 1 1 1 1 1 1 1 1 1 ...
software : num 1 1 1 1 1 1 1 1 1 1 ...
internet : num 1 1 1 1 1 1 1 1 1 1 ...
literature : num 0 0 1 0 1 1 1 1 1 0 ...
poetry : num 0 0 1 0 1 1 0 1 1 1 ...
art : num 1 0 1 0 1 1 0 1 1 1 ...
textbook : num 1 1 1 1 1 0 1 1 1 1 ...
dictionary : num 1 1 1 1 1 1 1 1 1 1 ...
dishwasher : num 1 1 1 1 0 1 1 1 1 1 ...
PV1MATH : num 562 565 602 520 613 ...
PV2MATH : num 569 557 594 507 567 ...
PV3MATH : num 555 553 552 501 585 ...
PV4MATH : num 579 538 526 521 596 ...
PV5MATH : num 548 573 619 547 603 ...
PV1READ : num 582 617 650 554 605 ...
PV2READ : num 571 572 608 560 557 ...
PV3READ : num 602 560 594 517 627 ...
PV4READ : num 572 564 575 564 597 ...
PV5READ : num 585 565 620 572 598 ...
PV1SCIE : num 583 627 668 574 639 ...
PV2SCIE : num 579 600 665 612 635 ...
PV3SCIE : num 593 574 620 571 666 ...
PV4SCIE : num 567 582 592 598 700 ...
PV5SCIE : num 587 625 656 662 670 ...
SENWGT_STU : num 0.133 0.133 0.141 0.141 0.141 ...
possessions: num 10 8 12 9 11 11 10 12 12 11 ...
https://www.oecd.org/pisa/pisaproducts/database-cbapisa2012.htm
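As a rough illustration of how the columns listed above fit together, the sketch below averages the five math plausible values per student and then takes a weighted mean by gender using `SENWGT_STU`. Averaging plausible values this way is a simplification chosen for brevity, not the official PISA estimation procedure, and the helper names (`math_cols`, `math_mean`, `by_gender`) are introduced here only for the example.

```r
library(GGally)
data(australia_PISA2012)

pisa <- australia_PISA2012

# Columns PV1MATH ... PV5MATH hold the five math plausible values.
math_cols <- paste0("PV", 1:5, "MATH")
pisa$math_mean <- rowMeans(pisa[, math_cols])

# Weighted mean of the per-student average, split by gender.
by_gender <- sapply(
  split(pisa, pisa$gender),
  function(d) stats::weighted.mean(d$math_mean, d$SENWGT_STU, na.rm = TRUE)
)
by_gender
```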
This data frame contains batting statistics for a subset of players collected from http://www.baseball-databank.org/. There are a total of 21,699 records, covering 1,228 players from 1871 to 2007. Only players with more than 15 seasons of play are included.
baseball
A 21699 x 22 data frame
Variables:
id, unique player id
year, year of data
stint
team, team played for
lg, league
g, number of games
ab, number of times at bat
r, number of runs
h, hits, times reached base because of a batted, fair ball without error by the defense
X2b, hits on which the batter reached second base safely
X3b, hits on which the batter reached third base safely
hr, number of home runs
rbi, runs batted in
sb, stolen bases
cs, caught stealing
bb, base on balls (walk)
so, strike outs
ibb, intentional base on balls
hbp, hits by pitch
sh, sacrifice hits
sf, sacrifice flies
gidp, ground into double play
http://www.baseball-databank.org/
RColorBrewer Set1 colors
brew_colors(col)
col |
standard color name used to retrieve hex color value |
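A short usage sketch follows. It assumes that "red" and "blue" are among the standard color names the function accepts; the set of accepted names is not listed in this section.

```r
library(GGally)
library(ggplot2)

# Look up Set1 hex values by name (the names used here are assumptions).
red_hex <- brew_colors("red")
blue_hex <- brew_colors("blue")

# The returned values are plain color strings, so they can be used
# anywhere ggplot2 expects a color.
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = red_hex) +
  geom_smooth(method = "lm", formula = y ~ x, color = blue_hex, se = FALSE)

GGally::print_if_interactive(p)
```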
broom::augment a model and add broom::glance and broom::tidy output as attributes. X and Y variables are also added.
broomify(model, lmStars = TRUE)
model |
model to be sent to |
lmStars |
boolean that determines if stars are added to labels |
broom::augmented data frame with the broom::glance data.frame and broom::tidy data.frame as 'broom_glance' and 'broom_tidy' attributes respectively. var_x and var_y variables are also added as attributes.
data(mtcars)
model <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)
broomified_model <- broomify(model)
str(broomified_model)
Evaluate data column
eval_data_col(data, aes_col)
data |
data set to evaluate the data with |
aes_col |
Single value from an |
The result of evaluating aes_col within data, for example the values of the mapped column.
mapping <- ggplot2::aes(Petal.Length)
eval_data_col(iris, mapping$x)
This data contains physical measurements on three species of flea beetles.
data(flea)
A data frame with 74 rows and 7 variables
species Ch. concinna, Ch. heptapotamica, Ch. heikertingeri
tars1 width of the first joint of the first tarsus in microns
tars2 width of the second joint of the first tarsus in microns
head the maximal width of the head between the external edges of the eyes in 0.01 mm
aede1 the maximal width of the aedeagus in the fore-part in microns
aede2 the front angle of the aedeagus (1 unit = 7.5 degrees)
aede3 the aedeagus width from the side in microns
Lubischew, A. A. (1962), On the Use of Discriminant Functions in Taxonomy, Biometrics 18:455-477.
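A typical use of this data set is a pairs plot colored by species. The sketch below assumes the column order given above (species first, then the six measurements), so columns 2 to 4 select tars1, tars2, and head.

```r
library(GGally)
data(flea)

# Pairwise plot matrix of three measurements, split by species.
pm <- ggpairs(
  flea,
  mapping = ggplot2::aes(colour = species),
  columns = 2:4
)

GGally::print_if_interactive(pm)
```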
Function that allows you to call different functions based upon an aesthetic variable value.
fn_switch(types, mapping_val = "y")
types |
list of functions that follow the |
mapping_val |
mapping value to switch on. Defaults to the 'y' variable of the aesthetics list. |
ggnostic_continuous_fn <- fn_switch(list(
  default = ggally_points,
  .fitted = ggally_points,
  .se.fit = ggally_nostic_se_fit,
  .resid = ggally_nostic_resid,
  .hat = ggally_nostic_hat,
  .sigma = ggally_nostic_sigma,
  .cooksd = ggally_nostic_cooksd,
  .std.resid = ggally_nostic_std_resid
))

ggnostic_combo_fn <- fn_switch(list(
  default = ggally_box_no_facet,
  fitted = ggally_box_no_facet,
  .se.fit = ggally_nostic_se_fit,
  .resid = ggally_nostic_resid,
  .hat = ggally_nostic_hat,
  .sigma = ggally_nostic_sigma,
  .cooksd = ggally_nostic_cooksd,
  .std.resid = ggally_nostic_std_resid
))
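The definitions above only build the switching functions. The sketch below shows one being called, under the assumption that the function returned by fn_switch takes `data` and `mapping` like any other `ggally_*` plotting function and picks the entry whose name matches the variable mapped to `y`, falling back to `default`. The name `plot_by_y` is hypothetical.

```r
library(GGally)
data(tips)

# Use a box plot when y is the `sex` column, a scatterplot otherwise.
plot_by_y <- fn_switch(list(
  default = ggally_points,
  sex = ggally_box_no_facet
))

# y = tip has no entry of its own, so `default` is used.
p_points <- plot_by_y(tips, ggplot2::aes(x = total_bill, y = tip))

# y = sex matches the `sex` entry.
p_box <- plot_by_y(tips, ggplot2::aes(x = total_bill, y = sex))

GGally::print_if_interactive(p_points)
GGally::print_if_interactive(p_box)
```

Because the result has the same calling convention as the functions it wraps, it can be passed wherever a custom plotting function is accepted, which is how the `ggnostic_*` switches above are intended to be used.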
ggmatrix object
Retrieves the ggplot object at the desired location.
getPlot(pm, i, j)

## S3 method for class 'ggmatrix'
pm[i, j, ...]
pm |
|
i |
row from the top |
j |
column from the left |
... |
ignored |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
plotMatrix2 <- ggpairs(tips[, 3:2], upper = list(combo = "denstrip"))
p_(plotMatrix2[1, 2])
Make scatterplots compatible with both continuous and categorical variables using geom_autopoint from package ggforce.
ggally_autopoint(data, mapping, ...)

ggally_autopointDiag(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments passed to |
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_autopoint(tips, mapping = aes(x = tip, y = total_bill)))
p_(ggally_autopoint(tips, mapping = aes(x = tip, y = sex)))
p_(ggally_autopoint(tips, mapping = aes(x = smoker, y = sex)))
p_(ggally_autopoint(tips, mapping = aes(x = smoker, y = sex, color = day)))
p_(ggally_autopoint(tips, mapping = aes(x = smoker, y = sex), size = 8))
p_(ggally_autopoint(tips, mapping = aes(x = smoker, y = sex), alpha = .9))

p_(ggpairs(
  tips,
  mapping = aes(colour = sex),
  upper = list(discrete = "autopoint", combo = "autopoint", continuous = "autopoint"),
  diag = list(discrete = "autopointDiag", continuous = "autopointDiag")
))
Displays a bar plot for the diagonal of a ggpairs plot matrix.
ggally_barDiag(data, mapping, ..., rescale = FALSE)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments are sent to geom_bar |
rescale |
boolean to decide whether or not to rescale the count output. Only applies to numeric data |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_barDiag(tips, mapping = ggplot2::aes(x = day)))
p_(ggally_barDiag(tips, mapping = ggplot2::aes(x = tip), binwidth = 0.25))
Draws nothing.
ggally_blank(...)

ggally_blankDiag(...)
... |
other arguments ignored |
Makes a "blank" ggplot object that will only draw white space
Barret Schloerke
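The following sketch shows the two usual ways to get an empty panel: calling the function directly, or naming "blank" as the plot type for part of a ggpairs matrix. The column selection is an arbitrary choice of three continuous variables from `tips`.

```r
library(GGally)
data(tips)

# A standalone blank plot; any arguments passed in are ignored.
blank_plot <- ggally_blank()

# Leave the upper triangle of a pairs plot empty.
pm <- ggpairs(
  tips[, c("total_bill", "tip", "size")],
  upper = list(continuous = "blank")
)

GGally::print_if_interactive(pm)
```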
Make a box plot with a given data set. ggally_box_no_facet will be a single panel plot, while ggally_box will be a faceted plot.
ggally_box(data, mapping, ...)

ggally_box_no_facet(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments being supplied to geom_boxplot |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_box(tips, mapping = ggplot2::aes(x = total_bill, y = sex)))
p_(ggally_box(
  tips,
  mapping = ggplot2::aes(sex, total_bill, color = sex),
  outlier.colour = "red",
  outlier.shape = 13,
  outlier.size = 8
))
Plot column or row percentage using bar plots.
ggally_colbar(
  data,
  mapping,
  label_format = scales::label_percent(accuracy = 0.1),
  ...,
  remove_background = FALSE,
  remove_percentage_axis = FALSE,
  reverse_fill_levels = FALSE,
  geom_bar_args = NULL
)

ggally_rowbar(
  data,
  mapping,
  label_format = scales::label_percent(accuracy = 0.1),
  ...,
  remove_background = FALSE,
  remove_percentage_axis = FALSE,
  reverse_fill_levels = TRUE,
  geom_bar_args = NULL
)
data |
data set using |
mapping |
aesthetics being used |
label_format |
formatter function for displaying proportions, not taken into account if a label aesthetic is provided in |
... |
other arguments passed to |
remove_background |
should the |
remove_percentage_axis |
should percentage axis be removed? Removes the y-axis for |
reverse_fill_levels |
should the levels of the fill variable be reversed? |
geom_bar_args |
other arguments passed to |
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_colbar(tips, mapping = aes(x = smoker, y = sex)))
p_(ggally_rowbar(tips, mapping = aes(x = smoker, y = sex)))

# change labels' size
p_(ggally_colbar(tips, mapping = aes(x = smoker, y = sex), size = 8))

# change labels' colour and use bold
p_(ggally_colbar(tips,
  mapping = aes(x = smoker, y = sex),
  colour = "white", fontface = "bold"
))

# display number of observations instead of proportions
p_(ggally_colbar(tips, mapping = aes(x = smoker, y = sex, label = after_stat(count))))

# custom bar width
p_(ggally_colbar(tips, mapping = aes(x = smoker, y = sex), geom_bar_args = list(width = .5)))

# change format of labels
p_(ggally_colbar(tips,
  mapping = aes(x = smoker, y = sex),
  label_format = scales::label_percent(accuracy = .01, decimal.mark = ",")
))

p_(ggduo(
  data = as.data.frame(Titanic),
  mapping = aes(weight = Freq),
  columnsX = "Survived",
  columnsY = c("Sex", "Class", "Age"),
  types = list(discrete = "rowbar"),
  legend = 1
))
Estimate correlation from the given data. If a color variable is supplied, the correlation will also be calculated per group.
ggally_cor(
  data,
  mapping,
  ...,
  stars = TRUE,
  method = "pearson",
  use = "complete.obs",
  display_grid = FALSE,
  digits = 3,
  title_args = list(...),
  group_args = list(...),
  justify_labels = "right",
  align_percent = 0.5,
  title = "Corr",
  alignPercent = warning("deprecated. Use `align_percent`"),
  displayGrid = warning("deprecated. Use `display_grid`")
)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments being supplied to |
stars |
logical value which determines if the significance stars should be displayed. Given the
|
method |
|
use |
|
display_grid |
if |
digits |
number of digits to be displayed after the decimal point. See |
title_args |
arguments being supplied to the title's |
group_args |
arguments being supplied to the split-by-color group's |
justify_labels |
|
align_percent |
relative align position of the text. When |
title |
title text to be displayed |
alignPercent, displayGrid |
deprecated. Please use their snake-case counterparts. |
Barret Schloerke
ggally_statistic, ggally_cor_v1_5
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_cor(tips, mapping = ggplot2::aes(total_bill, tip)))

# display with grid
p_(ggally_cor(
  tips,
  mapping = ggplot2::aes(total_bill, tip),
  display_grid = TRUE
))

# change text attributes
p_(ggally_cor(
  tips,
  mapping = ggplot2::aes(x = total_bill, y = tip),
  size = 15,
  colour = I("red"),
  title = "Correlation"
))

# split by a variable
p_(ggally_cor(
  tips,
  mapping = ggplot2::aes(total_bill, tip, color = sex),
  size = 5
))
(Deprecated. See ggally_cor.)
ggally_cor_v1_5(
  data,
  mapping,
  alignPercent = 0.6,
  method = "pearson",
  use = "complete.obs",
  corAlignPercent = NULL,
  corMethod = NULL,
  corUse = NULL,
  displayGrid = TRUE,
  ...
)
data |
data set using |
mapping |
aesthetics being used |
alignPercent |
right align position of numbers. Default is 60 percent across the horizontal |
method |
|
use |
|
corAlignPercent |
deprecated. Use parameter |
corMethod |
deprecated. Use parameter |
corUse |
deprecated. Use parameter |
displayGrid |
if TRUE, display aligned panel gridlines |
... |
other arguments being supplied to geom_text |
Estimate correlation from the given data.
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_cor_v1_5(tips, mapping = ggplot2::aes(total_bill, tip)))

# display with no grid
p_(ggally_cor_v1_5(
  tips,
  mapping = ggplot2::aes(total_bill, tip),
  displayGrid = FALSE
))

# change text attributes
p_(ggally_cor_v1_5(
  tips,
  mapping = ggplot2::aes(x = total_bill, y = tip),
  size = 15,
  colour = I("red")
))

# split by a variable
p_(ggally_cor_v1_5(
  tips,
  mapping = ggplot2::aes(total_bill, tip, color = sex),
  size = 5
))
Plot the number of observations by using rectangles with proportional areas.
ggally_count(data, mapping, ...)

ggally_countDiag(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments passed to |
You can adjust the size of rectangles with the x.width argument.
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_count(tips, mapping = ggplot2::aes(x = smoker, y = sex)))
p_(ggally_count(tips, mapping = ggplot2::aes(x = smoker, y = sex, fill = day)))

p_(ggally_count(
  as.data.frame(Titanic),
  mapping = ggplot2::aes(x = Class, y = Survived, weight = Freq)
))
p_(ggally_count(
  as.data.frame(Titanic),
  mapping = ggplot2::aes(x = Class, y = Survived, weight = Freq),
  x.width = 0.5
))

p_(ggally_countDiag(tips, mapping = ggplot2::aes(x = smoker)))
p_(ggally_countDiag(tips, mapping = ggplot2::aes(x = smoker, fill = sex)))
Plot the number of observations by using square points with proportional areas. The squares can be filled according to chi-squared statistics computed by stat_cross(). Labels can also be added (see examples).
ggally_cross(data, mapping, ..., scale_max_size = 20, geom_text_args = NULL)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments passed to |
scale_max_size |
|
geom_text_args |
other arguments passed to |
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_cross(tips, mapping = aes(x = smoker, y = sex)))
p_(ggally_cross(tips, mapping = aes(x = day, y = time)))

# Custom max size
p_(ggally_cross(tips, mapping = aes(x = smoker, y = sex)) +
  scale_size_area(max_size = 40))

# Custom fill
p_(ggally_cross(tips, mapping = aes(x = smoker, y = sex), fill = "red"))

# Custom shape
p_(ggally_cross(tips, mapping = aes(x = smoker, y = sex), shape = 21))

# Fill squares according to standardized residuals
d <- as.data.frame(Titanic)
p_(ggally_cross(
  d,
  mapping = aes(x = Class, y = Survived, weight = Freq, fill = after_stat(std.resid))
) +
  scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE))

# Add labels
p_(ggally_cross(
  tips,
  mapping = aes(
    x = smoker, y = sex, colour = smoker,
    label = scales::percent(after_stat(prop))
  )
))

# Customize labels' appearance and same size for all squares
p_(ggally_cross(
  tips,
  mapping = aes(
    x = smoker, y = sex,
    size = NULL, # do not map size to a variable
    label = scales::percent(after_stat(prop))
  ),
  size = 40, # fix value for points size
  fill = "darkblue",
  geom_text_args = list(colour = "white", fontface = "bold", size = 6)
))
ggally_crosstable is a variation of ggally_table with a few modifications: (i) table cells are drawn; (ii) x and y axes are not expanded (and therefore are not aligned with other ggally_* plots); (iii) content and fill of cells can be easily controlled with dedicated arguments.
ggally_crosstable(
  data,
  mapping,
  cells = c("observed", "prop", "row.prop", "col.prop", "expected", "resid", "std.resid"),
  fill = c("none", "std.resid", "resid"),
  ...,
  geom_tile_args = list(colour = "grey50")
)
data |
data set using |
mapping |
aesthetics being used |
cells |
Which statistic should be displayed in table cells? |
fill |
Which statistic should be used for filling table cells? |
... |
other arguments passed to |
geom_tile_args |
other arguments passed to |
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

# differences with ggally_table()
p_(ggally_table(tips, mapping = aes(x = day, y = time)))
p_(ggally_crosstable(tips, mapping = aes(x = day, y = time)))

# display column proportions
p_(ggally_crosstable(tips, mapping = aes(x = day, y = sex), cells = "col.prop"))

# display row proportions
p_(ggally_crosstable(tips, mapping = aes(x = day, y = sex), cells = "row.prop"))

# change size of text
p_(ggally_crosstable(tips, mapping = aes(x = day, y = sex), size = 8))

# fill cells with standardized residuals
p_(ggally_crosstable(tips, mapping = aes(x = day, y = sex), fill = "std.resid"))

# change scale for fill
p_(ggally_crosstable(tips, mapping = aes(x = day, y = sex), fill = "std.resid") +
  scale_fill_steps2(breaks = c(-2, 0, 2), show.limits = TRUE))
Make a 2D density plot from the given data.
ggally_density(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
parameters sent to either stat_density2d or geom_density2d |
The aesthetic "fill" determines whether stat_density2d (filled) or geom_density2d (lines) is used.
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
p_(ggally_density(tips, mapping = ggplot2::aes(x = total_bill, y = tip)))
p_(ggally_density(
  tips,
  mapping = ggplot2::aes(total_bill, tip, fill = after_stat(level))
))
p_(ggally_density(
  tips,
  mapping = ggplot2::aes(total_bill, tip, fill = after_stat(level))
) + ggplot2::scale_fill_gradient(breaks = c(0.05, 0.1, 0.15, 0.2)))
Displays a density plot for the diagonal of a ggpairs plot matrix.
ggally_densityDiag(data, mapping, ..., rescale = FALSE)
data |
data set using |
mapping |
aesthetics being used. |
... |
other arguments sent to stat_density |
rescale |
boolean to decide whether or not to rescale the count output |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_densityDiag(tips, mapping = ggplot2::aes(x = total_bill)))
p_(ggally_densityDiag(tips, mapping = ggplot2::aes(x = total_bill, color = day)))
Displays a Tile Plot as densely as possible.
ggally_denstrip(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments being sent to stat_bin |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_denstrip(tips, mapping = ggplot2::aes(x = total_bill, y = sex)))
p_(ggally_denstrip(
  tips,
  mapping = ggplot2::aes(sex, tip),
  binwidth = 0.2
) + ggplot2::scale_fill_gradient(low = "grey80", high = "black"))
This function is used when axisLabels == "internal".
ggally_diagAxis( data, mapping, label = mapping$x, labelSize = 5, labelXPercent = 0.5, labelYPercent = 0.55, labelHJust = 0.5, labelVJust = 0.5, gridLabelSize = 4, ... )
data |
dataset being plotted |
mapping |
aesthetics being used (x is the variable the plot will be made for) |
label |
title to be displayed in the middle. Defaults to mapping$x |
labelSize |
size of variable label |
labelXPercent |
percent of horizontal range |
labelYPercent |
percent of vertical range |
labelHJust |
hjust supplied to label |
labelVJust |
vjust supplied to label |
gridLabelSize |
size of grid labels |
... |
other arguments for geom_text |
Jason Crowley and Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_diagAxis(tips, ggplot2::aes(x = tip)))
p_(ggally_diagAxis(tips, ggplot2::aes(x = sex)))
Add jittering with the box plot. ggally_dot_no_facet will be a single panel plot, while ggally_dot will be a faceted plot.
ggally_dot(data, mapping, ...)
ggally_dot_no_facet(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments being supplied to geom_jitter |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_dot(tips, mapping = ggplot2::aes(x = total_bill, y = sex)))
p_(ggally_dot(
  tips,
  mapping = ggplot2::aes(sex, total_bill, color = sex)
))
p_(ggally_dot(
  tips,
  mapping = ggplot2::aes(sex, total_bill, color = sex, shape = sex)
) + ggplot2::scale_shape(solid = FALSE))
X variables are plotted using geom_bar and are faceted by the Y variable.
ggally_facetbar(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments are sent to geom_bar |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_facetbar(tips, ggplot2::aes(x = sex, y = smoker, fill = time)))
p_(ggally_facetbar(tips, ggplot2::aes(x = smoker, y = sex, fill = time)))
Make density plots by displaying subsets of the data in different panels.
ggally_facetdensity(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments being sent to stat_density |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_facetdensity(tips, mapping = ggplot2::aes(x = total_bill, y = sex)))
p_(ggally_facetdensity(
  tips,
  mapping = ggplot2::aes(sex, total_bill, color = sex)
))
Make tile plot or density plot as compact as possible.
ggally_facetdensitystrip(data, mapping, ..., den_strip = FALSE)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments being sent to either geom_histogram or stat_density |
den_strip |
boolean to decide whether to plot a density strip (TRUE) or a facet density (FALSE) plot. |
Barret Schloerke
example(ggally_facetdensity)
example(ggally_denstrip)
Display subsetted histograms of the data in different panels.
ggally_facethist(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
parameters sent to stat_bin() |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_facethist(tips, mapping = ggplot2::aes(x = tip, y = sex)))
p_(ggally_facethist(tips, mapping = ggplot2::aes(x = tip, y = sex), binwidth = 0.1))
Draws a large NA in the middle of the plotting area. This plot is useful when all X or Y data is NA.
ggally_na(data = NULL, mapping = NULL, size = 10, color = "grey20", ...)
ggally_naDiag(...)
data |
ignored |
mapping |
ignored |
size |
size of the geom_text 'NA' |
color |
color of the geom_text 'NA' |
... |
other arguments sent to geom_text |
Barret Schloerke
ggnostic Cook's distance
A function to display stats::cooks.distance().
ggally_nostic_cooksd( data, mapping, ..., linePosition = pf(0.5, length(attr(data, "var_x")), nrow(data) - length(attr(data, "var_x"))), lineColor = brew_colors("grey"), lineType = 2 )
data, mapping, ..., lineColor, lineType |
parameters supplied to |
linePosition |
4 / n is the general cutoff point for Cook's Distance |
A line is added at linePosition to display the general cutoff point for Cook's Distance.
Reference: Michael H. Kutner, Christopher J. Nachtsheim, John Neter, and William Li. Applied linear statistical models. The McGraw-Hill / Irwin series operations and decision sciences. McGraw-Hill Irwin, 2005, p. 403
ggplot2 plot object
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

dt <- broomify(stats::lm(mpg ~ wt + qsec + am, data = mtcars))
p_(ggally_nostic_cooksd(dt, ggplot2::aes(wt, .cooksd)))
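The cutoff described above can be explored without any plotting. The following is a minimal base R sketch, not GGally's implementation: it computes Cook's distance for the same model and compares it against both the 4 / n rule of thumb and the pf() expression shown as the default for linePosition. The names fit, cooks_d, cutoff_4n and cutoff_pf are illustrative, and p = 3 assumes that the "var_x" attribute holds the three explanatory variables.

```r
# Fit the same model used in the example above
fit <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)

# One Cook's distance per observation
cooks_d <- stats::cooks.distance(fit)

n <- nrow(mtcars)
p <- 3  # assumption: number of explanatory variables (wt, qsec, am)

# Rule of thumb quoted in the linePosition description
cutoff_4n <- 4 / n

# Expression used as the default value of linePosition in the usage
cutoff_pf <- stats::pf(0.5, p, n - p)

# Observations exceeding each candidate cutoff
flag_4n <- names(cooks_d)[cooks_d > cutoff_4n]
flag_pf <- names(cooks_d)[cooks_d > cutoff_pf]
```

Comparing flag_4n and flag_pf shows how sensitive the set of flagged observations is to the chosen line position.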
ggnostic leverage points
A function to display stats::influence's hat information against a given explanatory variable.
ggally_nostic_hat( data, mapping, ..., linePosition = 2 * sum(eval_data_col(data, mapping$y))/nrow(data), lineColor = brew_colors("grey"), lineSize = 0.5, lineAlpha = 1, lineType = 2, avgLinePosition = sum(eval_data_col(data, mapping$y))/nrow(data), avgLineColor = brew_colors("grey"), avgLineSize = lineSize, avgLineAlpha = lineAlpha, avgLineType = 1 )
data, mapping, ... |
supplied directly to |
linePosition, lineColor, lineSize, lineAlpha, lineType |
parameters supplied to |
avgLinePosition, avgLineColor, avgLineSize, avgLineAlpha, avgLineType |
parameters supplied to |
As stated in stats::influence() documentation:
hat: a vector containing the diagonal of the 'hat' matrix.
The diagonal elements of the 'hat' matrix describe the influence each response value has on the fitted value for that same observation.
A suggested "cutoff" line is added to the plot at a height of 2 * p / n and an expected line at a height of p / n.
If either linePosition or avgLinePosition is NULL, the respective line will not be drawn.
ggplot2 plot object
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

dt <- broomify(stats::lm(mpg ~ wt + qsec + am, data = mtcars))
p_(ggally_nostic_hat(dt, ggplot2::aes(wt, .hat)))
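The two reference heights mentioned above (2 * p / n and p / n) follow from the fact that the hat values of a full-rank linear model sum to the number of estimated coefficients. A minimal base R sketch, with illustrative variable names and the assumption that the .hat column matches stats::hatvalues():

```r
fit <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)

# Diagonal of the hat matrix, one value per observation
h <- stats::hatvalues(fit)
n <- nrow(mtcars)

# Same arithmetic as the default avgLinePosition and linePosition
avg_line <- sum(h) / n         # expected leverage, p / n
cutoff_line <- 2 * sum(h) / n  # suggested cutoff, 2 * p / n

# p counts the intercept as well as wt, qsec and am
p <- length(stats::coef(fit))

# Observations with leverage above the suggested cutoff
high_leverage <- names(h)[h > cutoff_line]
```

Here sum(h) equals p, so the default line positions reduce to p / n and 2 * p / n.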
ggnostic background line with geom
If a non-null linePosition value is given, a line will be drawn before the given continuous_geom or combo_geom is added to the plot.
ggally_nostic_line( data, mapping, ..., linePosition = NULL, lineColor = "red", lineSize = 0.5, lineAlpha = 1, lineType = 1, continuous_geom = ggplot2::geom_point, combo_geom = ggplot2::geom_boxplot, mapColorToFill = TRUE )
data, mapping |
supplied directly to |
... |
parameters supplied to |
linePosition, lineColor, lineSize, lineAlpha, lineType |
parameters supplied to |
continuous_geom |
ggplot2 geom that is executed after the line is (possibly) added and if the x data is continuous |
combo_geom |
ggplot2 geom that is executed after the line is (possibly) added and if the x data is discrete |
mapColorToFill |
boolean to determine if combo plots should cut the color mapping to the fill mapping |
Functions with a color in their name have different default color behavior.
ggplot2 plot object
ggnostic residuals
If non-null pVal and sigma values are given, confidence interval lines will be added to the plot at the specified pVal percentiles of a N(0, sigma) distribution.
ggally_nostic_resid( data, mapping, ..., linePosition = 0, lineColor = brew_colors("grey"), lineSize = 0.5, lineAlpha = 1, lineType = 1, lineConfColor = brew_colors("grey"), lineConfSize = lineSize, lineConfAlpha = lineAlpha, lineConfType = 2, pVal = c(0.025, 0.975), sigma = attr(data, "broom_glance")$sigma, se = TRUE, method = "auto", formula = y ~ x )
data, mapping, ... |
parameters supplied to |
linePosition, lineColor, lineSize, lineAlpha, lineType |
parameters supplied to |
lineConfColor , lineConfSize , lineConfAlpha , lineConfType
|
parameters supplied to the confidence interval lines |
pVal |
percentiles of a N(0, sigma) distribution to be drawn |
sigma |
sigma value for the |
se |
boolean to determine if the confidence intervals should be displayed |
method, formula |
parameters supplied to |
ggplot2 plot object
stats::residuals
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

dt <- broomify(stats::lm(mpg ~ wt + qsec + am, data = mtcars))
p_(ggally_nostic_resid(dt, ggplot2::aes(wt, .resid)))
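The confidence lines described above sit at the pVal percentiles of a N(0, sigma) distribution. The following base R sketch reproduces that arithmetic under the assumption that the sigma stored in the "broom_glance" attribute equals the residual standard error returned by stats::sigma(); the variable names are illustrative.

```r
fit <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)

# Residual standard error of the model (assumed to match the default sigma)
sigma_hat <- stats::sigma(fit)

# Default percentiles from the usage above
p_val <- c(0.025, 0.975)

# Heights of the lower and upper confidence lines
conf_lines <- stats::qnorm(p_val, mean = 0, sd = sigma_hat)

# Residuals that fall outside the band
res <- stats::residuals(fit)
outside <- names(res)[res < conf_lines[1] | res > conf_lines[2]]
```

With the default pVal, the band is symmetric around the linePosition of 0.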
ggnostic fitted value's standard error
A function to display stats::predict's standard errors.
ggally_nostic_se_fit( data, mapping, ..., lineColor = brew_colors("grey"), linePosition = NULL )
data, mapping, ..., lineColor |
parameters supplied to |
linePosition |
base comparison for a perfect fit |
As stated in stats::predict documentation:
If the logical 'se.fit' is 'TRUE', standard errors of the predictions are calculated. If the numeric argument 'scale' is set (with optional 'df'), it is used as the residual standard deviation in the computation of the standard errors, otherwise this is extracted from the model fit.
Since se.fit is TRUE and scale is unset by default, the standard errors are extracted from the model fit.
A base line of 0 is added to give reference to a perfect fit.
ggplot2 plot object
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

dt <- broomify(stats::lm(mpg ~ wt + qsec + am, data = mtcars))
p_(ggally_nostic_se_fit(dt, ggplot2::aes(wt, .se.fit)))
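To see what "extracted from the model fit" means in practice, the sketch below obtains the standard errors with stats::predict() and checks them against the residual scale and the hat values. It is a base R illustration with made-up variable names, and it assumes an unweighted linear model like the one in the example.

```r
fit <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)

# se.fit = TRUE returns the fitted values together with their standard errors
pred <- stats::predict(fit, se.fit = TRUE)
se_fit <- pred$se.fit

# With 'scale' unset, the residual standard deviation comes from the fit itself
residual_scale <- pred$residual.scale

# For an unweighted linear model: se.fit = residual scale * sqrt(hat value)
se_manual <- residual_scale * sqrt(stats::hatvalues(fit))
```

Every standard error is strictly positive, which is why a reference line at 0 marks the unattainable "perfect fit".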
ggnostic leave one out model sigma
A function to display stats::influence()'s sigma value.
ggally_nostic_sigma( data, mapping, ..., lineColor = brew_colors("grey"), linePosition = attr(data, "broom_glance")$sigma )
data, mapping, ..., lineColor |
parameters supplied to |
linePosition |
line that is drawn in the background of the plot. Defaults to the overall model's sigma value. |
As stated in stats::influence() documentation:
sigma: a vector whose i-th element contains the estimate of the residual standard deviation obtained when the i-th case is dropped from the regression. (The approximations needed for GLMs can result in this being 'NaN'.)
A line is added to display the overall model's sigma value. This gives a baseline for comparison.
ggplot2 plot object
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

dt <- broomify(stats::lm(mpg ~ wt + qsec + am, data = mtcars))
p_(ggally_nostic_sigma(dt, ggplot2::aes(wt, .sigma)))
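The comparison between the leave-one-out sigma values and the overall model sigma can be sketched in base R as follows. Variable names are illustrative, and the sketch assumes the .sigma column corresponds to the sigma component of stats::influence().

```r
fit <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)

# Residual standard deviation re-estimated with each case dropped in turn
loo_sigma <- stats::influence(fit)$sigma

# Overall model sigma, the baseline drawn as the background line
overall_sigma <- stats::sigma(fit)

# Cases whose removal lowers sigma the most fit the model worst
drop_in_sigma <- sort(overall_sigma - loo_sigma, decreasing = TRUE)
worst_fitting <- names(drop_in_sigma)[1:3]
```

Points far below the baseline are observations that inflate the residual standard deviation when included.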
ggnostic standardized residuals
If non-null pVal and sigma values are given, confidence interval lines will be added to the plot at the specified pVal locations of a N(0, 1) distribution.
ggally_nostic_std_resid(data, mapping, ..., sigma = 1)
data, mapping, ... |
parameters supplied to |
sigma |
sigma value for the |
ggplot2 plot object
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

dt <- broomify(stats::lm(mpg ~ wt + qsec + am, data = mtcars))
p_(ggally_nostic_std_resid(dt, ggplot2::aes(wt, .std.resid)))
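Because standardized residuals are scaled to unit variance, the confidence lines use a N(0, 1) distribution. A minimal base R sketch of both pieces, assuming the .std.resid column corresponds to stats::rstandard() (variable names are illustrative):

```r
fit <- stats::lm(mpg ~ wt + qsec + am, data = mtcars)

# Standardized residuals as computed by base R
std_res <- stats::rstandard(fit)

# Equivalent manual computation: residual / (sigma * sqrt(1 - hat value))
std_manual <- stats::residuals(fit) /
  (stats::sigma(fit) * sqrt(1 - stats::hatvalues(fit)))

# Confidence lines at the 2.5% and 97.5% points of N(0, 1)
conf_lines <- stats::qnorm(c(0.025, 0.975), mean = 0, sd = 1)
```

The lines land at roughly -1.96 and 1.96 regardless of the model, which makes plots comparable across fits.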
Make a scatter plot with a given data set.
ggally_points(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments are sent to geom_point |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(mtcars)

p_(ggally_points(mtcars, mapping = ggplot2::aes(disp, hp)))
p_(ggally_points(mtcars, mapping = ggplot2::aes(disp, hp)))
p_(ggally_points(
  mtcars,
  mapping = ggplot2::aes(
    x = disp,
    y = hp,
    color = as.factor(cyl),
    size = gear
  )
))
Plots the mosaic plot by using fluctuation.
ggally_ratio( data, mapping = ggplot2::aes(!!!stats::setNames(lapply(colnames(data)[1:2], as.name), c("x", "y"))), ..., floor = 0, ceiling = NULL )
data |
data set using |
mapping |
aesthetics being used. Only x and y will be used and both are required |
... |
passed to |
floor |
don't display cells smaller than this value |
ceiling |
max value to scale frequencies. If any frequency is larger than the ceiling, the fill color is displayed darker than other rectangles |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_ratio(tips, ggplot2::aes(sex, day)))
p_(ggally_ratio(tips, ggplot2::aes(sex, day)) + ggplot2::coord_equal())

# only plot tiles greater or equal to 20 and scale to a max of 50
p_(ggally_ratio(
  tips, ggplot2::aes(sex, day),
  floor = 20, ceiling = 50
) + ggplot2::theme(aspect.ratio = 4 / 2))
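The effect of floor and ceiling on the cell frequencies can be illustrated with plain counts. This is only a sketch of the arithmetic implied by the argument descriptions, not GGally's internal code; it uses mtcars instead of tips so that it runs with base R alone, and all names are illustrative.

```r
# Cross-tabulate two discrete variables into one row per cell
counts <- as.data.frame(table(cyl = mtcars$cyl, gear = mtcars$gear))

floor_value <- 2     # hypothetical floor: hide cells with fewer than 2 cases
ceiling_value <- 10  # hypothetical ceiling: scale frequencies to a max of 10

# Cells smaller than the floor are not displayed
shown <- counts[counts$Freq >= floor_value, ]

# Relative size of each remaining cell, capped at the ceiling
shown$rel_size <- pmin(shown$Freq, ceiling_value) / ceiling_value

# Cells above the ceiling are the ones that would be drawn darker
shown$over_ceiling <- shown$Freq > ceiling_value
```

Choosing a ceiling below the largest count keeps one dominant cell from shrinking all the others.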
Add a smoothed conditional mean to a given scatter plot.
ggally_smooth( data, mapping, ..., method = "lm", formula = y ~ x, se = TRUE, shrink = TRUE )
ggally_smooth_loess(data, mapping, ...)
ggally_smooth_lm(data, mapping, ...)
data |
data set using |
mapping |
aesthetics being used |
method, se |
parameters supplied to |
formula, ... |
other arguments to add to geom_smooth |
shrink |
boolean to determine if y range is reduced to range of points or points and error ribbon |
Y limits are reduced to match original Y range with the goal of keeping the Y axis the same across plots.
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_smooth(tips, mapping = ggplot2::aes(x = total_bill, y = tip)))
p_(ggally_smooth(tips, mapping = ggplot2::aes(total_bill, tip, color = sex)))
Generalized text display
ggally_statistic( data, mapping, text_fn, title, na.rm = NA, display_grid = FALSE, justify_labels = "right", justify_text = "left", sep = ": ", family = "mono", title_args = list(), group_args = list(), align_percent = 0.5, title_hjust = 0.5, group_hjust = 0.5 )
data |
data set using |
mapping |
aesthetics being used |
text_fn |
function that takes in |
title |
title text to be displayed |
na.rm |
logical value which determines if |
display_grid |
if |
justify_labels |
|
justify_text |
|
sep |
separation value to be placed between the labels and text |
family |
font family used when displaying all text. This value will be set in |
title_args |
arguments being supplied to the title's |
group_args |
arguments being supplied to the split-by-color group's |
align_percent |
relative align position of the text. When |
title_hjust , group_hjust
|
|
Display summary statistics of a continuous variable for each value of a discrete variable.
ggally_summarise_by( data, mapping, text_fn = weighted_median_iqr, text_fn_vertical = NULL, ... )
weighted_median_iqr(x, weights = NULL)
weighted_mean_sd(x, weights = NULL)
data |
data set using |
mapping |
aesthetics being used |
text_fn |
function that takes an x and weights and returns a text string |
text_fn_vertical |
function that takes an x and weights and returns a text string, used when |
... |
other arguments passed to |
x |
a numeric vector |
weights |
an optional numeric vector of weights. If |
weighted_median_iqr computes weighted median and interquartile range.
weighted_mean_sd computes weighted mean and standard deviation.
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

if (require(Hmisc)) {
  data(tips)
  p_(ggally_summarise_by(tips, mapping = aes(x = total_bill, y = day)))
  p_(ggally_summarise_by(tips, mapping = aes(x = day, y = total_bill)))

  # colour is kept only if equal to the discrete variable
  p_(ggally_summarise_by(tips, mapping = aes(x = total_bill, y = day, color = day)))
  p_(ggally_summarise_by(tips, mapping = aes(x = total_bill, y = day, color = sex)))
  p_(ggally_summarise_by(tips, mapping = aes(x = day, y = total_bill, color = day)))

  # custom text size
  p_(ggally_summarise_by(tips, mapping = aes(x = total_bill, y = day), size = 6))

  # change statistic to display
  p_(ggally_summarise_by(tips, mapping = aes(x = total_bill, y = day), text_fn = weighted_mean_sd))

  # custom stat function
  weighted_sum <- function(x, weights = NULL) {
    if (is.null(weights)) weights <- 1
    paste0("Total : ", round(sum(x * weights, na.rm = TRUE), digits = 1))
  }
  p_(ggally_summarise_by(tips, mapping = aes(x = total_bill, y = day), text_fn = weighted_sum))
}
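A custom text_fn only needs to accept x and weights and return a single text string, so it can be written and checked on plain vectors before it is passed to ggally_summarise_by. The function below, weighted_mean_text, is a hypothetical example in base R and is not part of GGally.

```r
# Hypothetical text function: weighted mean formatted to one decimal place
weighted_mean_text <- function(x, weights = NULL) {
  # Fall back to equal weights when none are supplied
  if (is.null(weights)) weights <- rep(1, length(x))
  # Drop pairs where either the value or the weight is missing
  keep <- !is.na(x) & !is.na(weights)
  m <- stats::weighted.mean(x[keep], weights[keep])
  paste0("Mean: ", format(round(m, 1), nsmall = 1))
}

# Quick checks on plain vectors
unweighted <- weighted_mean_text(c(1, 2, 3, 4))
weighted <- weighted_mean_text(c(1, 2, 3), weights = c(1, 1, 3))
with_na <- weighted_mean_text(c(1, NA, 3))
```

Once it behaves as intended, such a function could be supplied through the text_fn argument in the same way as weighted_sum in the example above.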
Plot the number of observations as a table. Other statistics computed by stat_cross could be used (see examples).
ggally_table( data, mapping, keep.zero.cells = FALSE, ..., geom_tile_args = NULL )
ggally_tableDiag( data, mapping, keep.zero.cells = FALSE, ..., geom_tile_args = NULL )
data |
data set using |
mapping |
aesthetics being used |
keep.zero.cells |
If |
... |
other arguments passed to |
geom_tile_args |
other arguments passed to |
The colour aesthetic is taken into account only if equal to x or y.
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)

p_(ggally_table(tips, mapping = aes(x = smoker, y = sex)))
p_(ggally_table(tips, mapping = aes(x = day, y = time)))
p_(ggally_table(tips, mapping = aes(x = smoker, y = sex, colour = smoker)))

# colour is kept only if equal to x or y
p_(ggally_table(tips, mapping = aes(x = smoker, y = sex, colour = day)))

# diagonal version
p_(ggally_tableDiag(tips, mapping = aes(x = smoker)))

# custom label size and color
p_(ggally_table(tips, mapping = aes(x = smoker, y = sex), size = 16, color = "red"))

# display column proportions
p_(ggally_table(
  tips,
  mapping = aes(x = day, y = sex, label = scales::percent(after_stat(col.prop)))
))

# draw table cells
p_(ggally_table(
  tips,
  mapping = aes(x = smoker, y = sex),
  geom_tile_args = list(colour = "black", fill = "white")
))

# Use standardized residuals to fill table cells
p_(ggally_table(
  as.data.frame(Titanic),
  mapping = aes(
    x = Class, y = Survived, weight = Freq,
    fill = after_stat(std.resid),
    label = scales::percent(after_stat(col.prop), accuracy = .1)
  ),
  geom_tile_args = list(colour = "black")
) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE))
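The quantities shown in the last example (weighted counts, column proportions and standardized residuals) have close base R analogues, which can help when checking what a table plot displays. This sketch assumes that col.prop and std.resid correspond to the usual column proportions and chi-squared standardized residuals; it does not call stat_cross, and the variable names are illustrative.

```r
d <- as.data.frame(Titanic)

# Weighted contingency table: Survived in rows, Class in columns
tab <- stats::xtabs(Freq ~ Survived + Class, data = d)

# Column proportions: share of each outcome within a class
col_prop <- prop.table(tab, margin = 2)

# Standardized residuals from a chi-squared test of independence
std_res <- stats::chisq.test(tab)$stdres

round(col_prop, 3)
round(std_res, 1)
```

Cells with large absolute standardized residuals are the ones that a diverging fill scale would emphasize.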
Plot text for a plot.
ggally_text( label, mapping = ggplot2::aes(color = I("black")), xP = 0.5, yP = 0.5, xrange = c(0, 1), yrange = c(0, 1), ... )
label |
text that you want to appear |
mapping |
aesthetics that don't relate to position (such as color) |
xP |
horizontal position percentage |
yP |
vertical position percentage |
xrange |
range of the data around it. Only nice to have if plotting in a matrix |
yrange |
range of the data around it. Only nice to have if plotting in a matrix |
... |
other arguments for geom_text |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

p_(ggally_text("Example 1"))
p_(ggally_text("Example\nTwo", mapping = ggplot2::aes(size = 15), color = I("red")))
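The xP and yP arguments are described as position percentages within xrange and yrange. Assuming they are applied as a simple linear interpolation across each range (an assumption made for illustration, not taken from the GGally source), the resulting coordinates can be worked out as follows; the ranges used here are made up.

```r
# Hypothetical data ranges of the surrounding plot matrix cell
x_range <- c(10, 50)
y_range <- c(0, 5)

# Default position percentages from the usage above
xP <- 0.5
yP <- 0.5

# Assumed mapping: start of the range plus a fraction of its width
x_pos <- x_range[1] + xP * diff(x_range)
y_pos <- y_range[1] + yP * diff(y_range)
```

Under this assumption the defaults place the label at the centre of the panel, here at (30, 2.5).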
Plot trends using line plots. For continuous y variables, plot the evolution of the mean. For binary y variables, plot the evolution of the proportion.
ggally_trends(data, mapping, ..., include_zero = FALSE)
data |
data set using |
mapping |
aesthetics being used |
... |
other arguments passed to |
include_zero |
Should 0 be included on the y-axis? |
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(tips)
tips_f <- tips
tips_f$day <- factor(tips$day, c("Thur", "Fri", "Sat", "Sun"))

# Numeric variable
p_(ggally_trends(tips_f, mapping = aes(x = day, y = total_bill)))
p_(ggally_trends(tips_f, mapping = aes(x = day, y = total_bill, colour = time)))

# Binary variable
p_(ggally_trends(tips_f, mapping = aes(x = day, y = smoker)))
p_(ggally_trends(tips_f, mapping = aes(x = day, y = smoker, colour = sex)))

# Discrete variable with 3 or more categories
p_(ggally_trends(tips_f, mapping = aes(x = smoker, y = day)))
p_(ggally_trends(tips_f, mapping = aes(x = smoker, y = day, color = sex)))

# Include zero on Y axis
p_(ggally_trends(tips_f, mapping = aes(x = day, y = total_bill), include_zero = TRUE))
p_(ggally_trends(tips_f, mapping = aes(x = day, y = smoker), include_zero = TRUE))

# Change line size
p_(ggally_trends(tips_f, mapping = aes(x = day, y = smoker, colour = sex), size = 3))

# Define weights with the appropriate aesthetic
d <- as.data.frame(Titanic)
p_(ggally_trends(
  d,
  mapping = aes(x = Class, y = Survived, weight = Freq, color = Sex),
  include_zero = TRUE
))
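The two summaries that ggally_trends is described as plotting (the mean of a continuous y and the proportion of a binary y, each per level of x) can be computed directly in base R. The sketch below is an illustration with mtcars and Titanic rather than tips, and does not reproduce GGally's internals; all names are illustrative.

```r
# Continuous y: mean of mpg for each number of gears
mean_by_gear <- tapply(mtcars$mpg, mtcars$gear, mean)

# Binary y: proportion of manual cars (am == 1) for each number of gears
prop_manual_by_gear <- tapply(mtcars$am == 1, mtcars$gear, mean)

# Weighted binary y: survival proportion per class, weighted by Freq
d <- as.data.frame(Titanic)
surv_by_class <- sapply(
  split(d, d$Class),
  function(g) stats::weighted.mean(g$Survived == "Yes", g$Freq)
)
```

Each vector holds one value per level of the x variable, which is what a trend line would connect.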
ggbivariate is a variant of ggduo for plotting an outcome variable with several potential explanatory variables.
ggbivariate( data, outcome, explanatory = NULL, mapping = NULL, types = NULL, ..., rowbar_args = NULL )
data |
dataset to be used, can have both categorical and numerical variables |
outcome |
name or position of the outcome variable (one variable only) |
explanatory |
names or positions of the explanatory variables (if |
mapping |
additional aesthetic to be used, for example to indicate weights (see examples) |
types |
custom types of plots to use, see |
... |
additional arguments passed to |
rowbar_args |
additional arguments passed to |
Joseph Larmarange
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
data(tips)
p_(ggbivariate(tips, "smoker", c("day", "time", "sex", "tip")))
# Personalize plot title and legend title
p_(ggbivariate(
  tips, "smoker", c("day", "time", "sex", "tip"),
  title = "Custom title"
) + labs(fill = "Smoker ?"))
# Customize fill colour scale
p_(ggbivariate(tips, "smoker", c("day", "time", "sex", "tip")) +
  scale_fill_brewer(type = "qual"))
# Customize labels
p_(ggbivariate(
  tips, "smoker", c("day", "time", "sex", "tip"),
  rowbar_args = list(
    colour = "white",
    size = 4,
    fontface = "bold",
    label_format = scales::label_percent(accuracy = 1)
  )
))
# Choose the sub-plot from which to get the legend
p_(ggbivariate(tips, "smoker"))
p_(ggbivariate(tips, "smoker", legend = 3))
# Use mapping to indicate weights
d <- as.data.frame(Titanic)
p_(ggbivariate(d, "Survived", mapping = aes(weight = Freq)))
# outcome can be numerical
p_(ggbivariate(tips, outcome = "tip", title = "tip"))
Plot the coefficients of a model with broom and ggplot2.
For an updated and improved version, see ggcoef_model().
ggcoef( x, mapping = aes(!!as.name("estimate"), !!as.name("term")), conf.int = TRUE, conf.level = 0.95, exponentiate = FALSE, exclude_intercept = FALSE, vline = TRUE, vline_intercept = "auto", vline_color = "gray50", vline_linetype = "dotted", vline_size = 1, errorbar_color = "gray25", errorbar_height = 0, errorbar_linetype = "solid", errorbar_size = 0.5, sort = c("none", "ascending", "descending"), ... )
x |
a model object to be tidied with |
mapping |
default aesthetic mapping |
conf.int |
display confidence intervals as error bars? |
conf.level |
level of confidence intervals (passed to |
exponentiate |
if |
exclude_intercept |
should the intercept be excluded from the plot? |
vline |
print a vertical line? |
vline_intercept |
|
vline_color |
color of the vertical line |
vline_linetype |
line type of the vertical line |
vline_size |
size of the vertical line |
errorbar_color |
color of the error bars |
errorbar_height |
height of the error bars |
errorbar_linetype |
line type of the error bars |
errorbar_size |
size of the error bars |
sort |
|
... |
additional arguments sent to |
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
library(broom)
reg <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
p_(ggcoef(reg))
d <- as.data.frame(Titanic)
reg2 <- glm(Survived ~ Sex + Age + Class, family = binomial, data = d, weights = d$Freq)
ggcoef(reg2, exponentiate = TRUE)
ggcoef(
  reg2,
  exponentiate = TRUE, exclude_intercept = TRUE,
  errorbar_height = .2, color = "blue", sort = "ascending"
)
Function for making a correlation matrix plot, using ggplot2. The function is directly inspired by Tian Zheng and Yu-Sung Su's corrplot function in the 'arm' package. Please visit https://github.com/briatte/ggcorr for the latest version of ggcorr, and see the vignette at https://briatte.github.io/ggcorr/ for many examples of how to use it.
ggcorr( data, method = c("pairwise", "pearson"), cor_matrix = NULL, nbreaks = NULL, digits = 2, name = "", low = "#3B9AB2", mid = "#EEEEEE", high = "#F21A00", midpoint = 0, palette = NULL, geom = "tile", min_size = 2, max_size = 6, label = FALSE, label_alpha = FALSE, label_color = "black", label_round = 1, label_size = 4, limits = c(-1, 1), drop = is.null(limits) || identical(limits, FALSE), layout.exp = 0, legend.position = "right", legend.size = 9, ... )
data |
a data frame or matrix containing numeric (continuous) data. If any of the columns contain non-numeric data, they will be dropped with a warning. |
method |
a vector of two character strings. The first value gives the
method for computing covariances in the presence of missing values, and must
be (an abbreviation of) one of |
cor_matrix |
the named correlation matrix to use for calculations.
Defaults to the correlation matrix of |
nbreaks |
the number of breaks to apply to the correlation coefficients,
which results in a categorical color scale. See 'Note'.
Defaults to |
digits |
the number of digits to show in the breaks of the correlation
coefficients: see |
name |
a character string for the legend that shows the colors of the
correlation coefficients.
Defaults to |
low |
the lower color of the gradient for continuous scaling of the
correlation coefficients.
Defaults to |
mid |
the midpoint color of the gradient for continuous scaling of the
correlation coefficients.
Defaults to |
high |
the upper color of the gradient for continuous scaling of the
correlation coefficients.
Defaults to |
midpoint |
the midpoint value for continuous scaling of the
correlation coefficients.
Defaults to |
palette |
if |
geom |
the geom object to use. Accepts either |
min_size |
when |
max_size |
when |
label |
whether to add correlation coefficients to the plot.
Defaults to |
label_alpha |
whether to make the correlation coefficients increasingly
transparent as they come close to 0. Also accepts any numeric value between
|
label_color |
the color of the correlation coefficients.
Defaults to |
label_round |
the decimal rounding of the correlation coefficients.
Defaults to |
label_size |
the size of the correlation coefficients.
Defaults to |
limits |
bounding of color scaling for correlations, set |
drop |
if using |
layout.exp |
a multiplier to expand the horizontal axis to the left if
variable names get clipped.
Defaults to |
legend.position |
where to put the legend of the correlation
coefficients: see |
legend.size |
the size of the legend title and labels, in points: see
|
... |
other arguments supplied to |
Recommended values for the nbreaks argument are 3 to 11, as values above 11 are visually difficult to separate and are not supported by diverging ColorBrewer palettes.
Francois Briatte, with contributions from Amos B. Elberg and Barret Schloerke
cor and corrplot in the arm package.
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
# Basketball statistics provided by Nathan Yau at Flowing Data.
dt <- read.csv("http://datasets.flowingdata.com/ppg2008.csv")
# Default output.
p_(ggcorr(dt[, -1]))
# Labeled output, with coefficient transparency.
p_(ggcorr(dt[, -1],
  label = TRUE,
  label_alpha = TRUE
))
# Custom options.
p_(ggcorr(
  dt[, -1],
  name = expression(rho),
  geom = "circle",
  max_size = 10,
  min_size = 2,
  size = 3,
  hjust = 0.75,
  nbreaks = 6,
  angle = -45,
  palette = "PuOr" # colorblind safe, photocopy-able
))
# Supply your own correlation matrix
p_(ggcorr(
  data = NULL,
  cor_matrix = cor(dt[, -1], use = "pairwise")
))
Make a matrix of plots with a given data set with two different column sets
ggduo( data, mapping = NULL, columnsX = 1:ncol(data), columnsY = 1:ncol(data), title = NULL, types = list(continuous = "smooth_loess", comboVertical = "box_no_facet", comboHorizontal = "facethist", discrete = "count"), axisLabels = c("show", "none"), columnLabelsX = colnames(data[columnsX]), columnLabelsY = colnames(data[columnsY]), labeller = "label_value", switch = NULL, xlab = NULL, ylab = NULL, showStrips = NULL, legend = NULL, cardinality_threshold = 15, progress = NULL, xProportions = NULL, yProportions = NULL, legends = deprecated() )
data |
data set using. Can have both numerical and categorical data. |
mapping |
aesthetic mapping (besides |
columnsX , columnsY
|
which columns are used to make plots. Defaults to all columns. |
title , xlab , ylab
|
title, x label, and y label for the graph |
types |
see Details |
axisLabels |
either "show" to display axisLabels or "none" for no axis labels |
columnLabelsX , columnLabelsY
|
label names to be displayed. Defaults to names of columns being used. |
labeller |
labeller for facets. See |
switch |
switch parameter for facet_grid. See |
showStrips |
boolean to determine if each plot's strips should be displayed. |
legend |
May be the two objects described below or the default
|
cardinality_threshold |
maximum number of levels allowed in a character / factor column. Set this value to NULL to not check factor columns. Defaults to 15 |
progress |
|
xProportions , yProportions
|
Value to change how much area is given for each plot. Either |
legends |
types is a list that may contain the variables 'continuous', 'comboVertical', 'comboHorizontal', 'discrete', and 'na'. Each element of the list may be a function or a string. If a string is supplied, it must be a character string representing the tail end of a ggally_NAME function. The list of current valid ggally_NAME functions is visible in a dedicated vignette.
continuous: This option is used for continuous X and Y data.
comboHorizontal: This option is used for either continuous X and categorical Y data or categorical X and continuous Y data.
comboVertical: This option is used for either continuous X and categorical Y data or categorical X and continuous Y data.
discrete: This option is used for categorical X and Y data.
na: This option is used when all X data is NA, all Y data is NA, or either all X or Y data is NA.
If 'blank' is ever chosen as an option, then ggduo will produce an empty plot.
If a function is supplied as an option, it should implement the function api of function(data, mapping, ...){#make ggplot2 plot}. If a specific function needs its parameters set, wrap the function with its parameters: wrap(fn, param1 = val1, param2 = val2).
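As a minimal sketch of the function API described above, a custom panel for continuous X and Y data could look like the following. The function name my_points and its point_alpha parameter are hypothetical, introduced only for this illustration; tips is the dataset used elsewhere in this manual.

```r
library(GGally)

# Hypothetical panel function following function(data, mapping, ...).
# It receives the panel's data and aesthetic mapping and returns a ggplot2 plot.
my_points <- function(data, mapping, ..., point_alpha = 0.5) {
  ggplot2::ggplot(data = data, mapping = mapping) +
    ggplot2::geom_point(alpha = point_alpha, ...)
}

data(tips)

# wrap() fixes point_alpha so the panel function can be used in `types`
pm <- ggduo(
  tips,
  columnsX = c("total_bill", "size"),
  columnsY = "tip",
  types = list(continuous = wrap(my_points, point_alpha = 0.25))
)
```

Printing pm renders the plot matrix. Passing the bare function (continuous = my_points) would use the default point_alpha instead.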
# small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
data(baseball)
# Keep players from 1990-1995 with at least one at bat
# Add how many singles a player hit
# (must do in two steps as X1b is used in calculations)
dt <- transform(
  subset(baseball, year >= 1990 & year <= 1995 & ab > 0),
  X1b = h - X2b - X3b - hr
)
# Add
#   the player's batting average,
#   the player's slugging percentage,
#   and the player's on base percentage
# Make year a factor, as each season is discrete
dt <- transform(
  dt,
  batting_avg = h / ab,
  slug = (X1b + 2 * X2b + 3 * X3b + 4 * hr) / ab,
  on_base = (h + bb + hbp) / (ab + bb + hbp),
  year = as.factor(year)
)
pm <- ggduo(
  dt,
  c("year", "g", "ab", "lg"),
  c("batting_avg", "slug", "on_base"),
  mapping = ggplot2::aes(color = lg)
)
# Prints, but
#   there is severe over plotting in the continuous plots
#   the labels could be better
#   want to add more hitting information
p_(pm)
# address overplotting issues and add a title
pm <- ggduo(
  dt,
  c("year", "g", "ab", "lg"),
  c("batting_avg", "slug", "on_base"),
  columnLabelsX = c("year", "player game count", "player at bat count", "league"),
  columnLabelsY = c("batting avg", "slug %", "on base %"),
  title = "Baseball Hitting Stats from 1990-1995",
  mapping = ggplot2::aes(color = lg),
  types = list(
    # change the shape and add some transparency to the points
    continuous = wrap("smooth_loess", alpha = 0.50, shape = "+")
  ),
  showStrips = FALSE
)
p_(pm)
# Use "auto" to adapt width of the sub-plots
pm <- ggduo(
  dt,
  c("year", "g", "ab", "lg"),
  c("batting_avg", "slug", "on_base"),
  mapping = ggplot2::aes(color = lg),
  xProportions = "auto"
)
p_(pm)
# Custom widths & heights of the sub-plots
pm <- ggduo(
  dt,
  c("year", "g", "ab", "lg"),
  c("batting_avg", "slug", "on_base"),
  mapping = ggplot2::aes(color = lg),
  xProportions = c(6, 4, 3, 2),
  yProportions = c(1, 2, 1)
)
p_(pm)
# Example derived from:
## R Data Analysis Examples | Canonical Correlation Analysis. UCLA: Institute for Digital
## Research and Education.
## from http://www.stats.idre.ucla.edu/r/dae/canonical-correlation-analysis
## (accessed May 22, 2017).
# "Example 1. A researcher has collected data on three psychological variables, four
# academic variables (standardized test scores) and gender for 600 college freshman.
# She is interested in how the set of psychological variables relates to the academic
# variables and gender. In particular, the researcher is interested in how many
# dimensions (canonical variables) are necessary to understand the association between
# the two sets of variables."
data(psychademic)
summary(psychademic)
(psych_variables <- attr(psychademic, "psychology"))
(academic_variables <- attr(psychademic, "academic"))
## Within correlation
p_(ggpairs(psychademic, columns = psych_variables))
p_(ggpairs(psychademic, columns = academic_variables))
## Between correlation
loess_with_cor <- function(data, mapping, ..., method = "pearson") {
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)
  cor <- cor(x, y, method = method)
  ggally_smooth_loess(data, mapping, ...) +
    ggplot2::geom_label(
      data = data.frame(
        x = min(x, na.rm = TRUE),
        y = max(y, na.rm = TRUE),
        lab = round(cor, digits = 3)
      ),
      mapping = ggplot2::aes(x = x, y = y, label = lab),
      hjust = 0, vjust = 1,
      size = 5, fontface = "bold",
      inherit.aes = FALSE # do not inherit anything from the ...
    )
}
pm <- ggduo(
  psychademic,
  rev(psych_variables), academic_variables,
  types = list(continuous = loess_with_cor),
  showStrips = FALSE
)
suppressWarnings(p_(pm)) # ignore warnings from loess
# add color according to sex
pm <- ggduo(
  psychademic,
  mapping = ggplot2::aes(color = sex),
  rev(psych_variables), academic_variables,
  types = list(continuous = loess_with_cor),
  showStrips = FALSE,
  legend = c(5, 2)
)
suppressWarnings(p_(pm))
# add color according to motivation
pm <- ggduo(
  psychademic,
  mapping = ggplot2::aes(color = motivation),
  rev(psych_variables), academic_variables,
  types = list(continuous = loess_with_cor),
  showStrips = FALSE,
  legend = c(5, 2)
) +
  ggplot2::theme(legend.position = "bottom")
suppressWarnings(p_(pm))
# dt,
# c("year", "g", "ab", "lg", "lg"),
# c("batting_avg", "slug", "on_base", "hit_type"),
# columnLabelsX = c("year", "player game count", "player at bat count", "league", ""),
# columnLabelsY = c("batting avg", "slug %", "on base %", "hit type"),
# title = "Baseball Hitting Stats from 1990-1995 (player strike in 1994)",
# mapping = aes(color = year),
# types = list(
#   continuous = wrap("smooth_loess", alpha = 0.50, shape = "+"),
#   comboHorizontal = wrap(display_hit_type_combo, binwidth = 15),
#   discrete = wrap(display_hit_type_discrete, color = "black", size = 0.15)
# ),
# showStrips = FALSE
## make the 5th column blank, except for the legend
# australia_PISA2012,
# c("gender", "age", "homework", "possessions"),
# c("PV1MATH", "PV2MATH", "PV3MATH", "PV4MATH", "PV5MATH"),
# types = list(
#   continuous = "points",
#   combo = "box",
#   discrete = "ratio"
# )
# australia_PISA2012,
# c("gender", "age", "homework", "possessions"),
# c("PV1MATH", "PV2MATH", "PV3MATH", "PV4MATH", "PV5MATH"),
# mapping = ggplot2::aes(color = gender),
# types = list(
#   continuous = wrap("smooth", alpha = 0.25, method = "loess"),
#   combo = "box",
#   discrete = "ratio"
# )
Single ggplot2 plot matrix with facet_grid
ggfacet( data, mapping = NULL, columnsX = 1:ncol(data), columnsY = 1:ncol(data), fn = ggally_points, ..., columnLabelsX = names(data[columnsX]), columnLabelsY = names(data[columnsY]), xlab = NULL, ylab = NULL, title = NULL, scales = "free" )
data |
data.frame that contains all columns to be displayed. This data will be melted before being passed into the function |
mapping |
aesthetic mapping (besides |
columnsX |
columns to be displayed in the plot matrix |
columnsY |
rows to be displayed in the plot matrix |
fn |
function to be executed. Similar to |
... |
extra arguments passed directly to |
columnLabelsX , columnLabelsY
|
column and row labels to display in the plot matrix |
xlab , ylab , title
|
plot matrix labels |
scales |
parameter supplied to |
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
if (requireNamespace("chemometrics", quietly = TRUE)) {
  data(NIR, package = "chemometrics")
  NIR_sub <- data.frame(NIR$yGlcEtOH, NIR$xNIR[, 1:3])
  str(NIR_sub)
  x_cols <- c("X1115.0", "X1120.0", "X1125.0")
  y_cols <- c("Glucose", "Ethanol")
  # using ggduo directly
  p <- ggduo(NIR_sub, x_cols, y_cols, types = list(continuous = "points"))
  p_(p)
  # using ggfacet
  p <- ggfacet(NIR_sub, x_cols, y_cols)
  p_(p)
  # add a smoother
  p <- ggfacet(NIR_sub, x_cols, y_cols, fn = "smooth_loess")
  p_(p)
  # same output
  p <- ggfacet(NIR_sub, x_cols, y_cols, fn = ggally_smooth_loess)
  p_(p)
  # Change scales to be the same for every row and every column
  p <- ggfacet(NIR_sub, x_cols, y_cols, scales = "fixed")
  p_(p)
}
Plot only legend of plot function
gglegend(fn)
fn |
this value is passed directly to an empty |
a function that when called with arguments will produce the legend of the plotting function supplied.
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
# display regular plot
p_(ggally_points(iris, ggplot2::aes(Sepal.Length, Sepal.Width, color = Species)))
# Make a function that will only print the legend
points_legend <- gglegend(ggally_points)
p_(points_legend(iris, ggplot2::aes(Sepal.Length, Sepal.Width, color = Species)))
# produce the same legend plot, but supply a string that 'wrap' understands
same_points_legend <- gglegend("points")
identical(
  attr(attr(points_legend, "fn"), "original_fn"),
  attr(attr(same_points_legend, "fn"), "original_fn")
)
# Complicated examples
custom_legend <- wrap(gglegend("points"), size = 6)
p_(custom_legend(iris, ggplot2::aes(Sepal.Length, Sepal.Width, color = Species)))
# Use within ggpairs
pm <- ggpairs(
  iris, 1:2,
  mapping = ggplot2::aes(color = Species),
  upper = list(continuous = gglegend("points"))
)
p_(pm)
# Place a legend in a specific location
pm <- ggpairs(iris, 1:2, mapping = ggplot2::aes(color = Species))
# Make the legend
pm[1, 2] <- points_legend(iris, ggplot2::aes(Sepal.Width, Sepal.Length, color = Species))
p_(pm)
Make a generic matrix of ggplot2 plots.
ggmatrix( plots, nrow, ncol, xAxisLabels = NULL, yAxisLabels = NULL, title = NULL, xlab = NULL, ylab = NULL, byrow = TRUE, showStrips = NULL, showAxisPlotLabels = TRUE, showXAxisPlotLabels = TRUE, showYAxisPlotLabels = TRUE, labeller = NULL, switch = NULL, xProportions = NULL, yProportions = NULL, progress = NULL, data = NULL, gg = NULL, legend = NULL )
plots |
list of plots to be put into matrix |
nrow , ncol
|
number of rows and columns |
xAxisLabels , yAxisLabels
|
strip titles for the x and y axis respectively. Set to |
title , xlab , ylab
|
title, x label, and y label for the graph. Set to |
byrow |
boolean that determines whether the plots should be ordered by row or by column |
showStrips |
boolean to determine if each plot's strips should be displayed. |
showAxisPlotLabels , showXAxisPlotLabels , showYAxisPlotLabels
|
booleans that determine if the plots axis labels are printed on the X (bottom) or Y (left) part of the plot matrix. If |
labeller |
labeller for facets. See |
switch |
switch parameter for facet_grid. See |
xProportions , yProportions
|
Value to change how much area is given for each plot. Either |
progress |
|
data |
data set using. This is the data to be used in place of 'ggally_data' if the plot is a string to be evaluated at print time |
gg |
ggplot2 theme objects to be applied to every plot |
legend |
May be the two objects described below or the default
|
Now that the print.ggmatrix method uses a large gtable object, rather than printing each plot independently, memory usage may be of concern. From small tests, memory usage hovers around object.size(data) * 0.3 * length(plots). So, for an 80 Mb random noise dataset with 100 plots, about 2.4 Gb of memory was needed to print. For the 3.46 Mb diamonds dataset with 100 plots, about 100 Mb of memory was needed to print. The benefits of using the ggplot2 format greatly outweigh the price of about a 20% increase in memory usage from the prior ad-hoc print method.
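The rule of thumb above can be turned into a quick back-of-the-envelope calculation. The helper below is a hypothetical convenience function written for this note (it is not part of GGally), and it assumes 1 Mb = 10^6 bytes to keep the numbers round.

```r
# Hypothetical helper applying the rule of thumb
#   object.size(data) * 0.3 * length(plots)
estimate_print_memory <- function(data_bytes, n_plots, ratio = 0.3) {
  data_bytes * ratio * n_plots
}

mb <- 1e6 # assumption: 1 Mb = 10^6 bytes
gb <- 1e9 # assumption: 1 Gb = 10^9 bytes

# 80 Mb dataset with 100 plots, expressed in Gb
estimate_print_memory(80 * mb, 100) / gb

# 3.46 Mb dataset with 100 plots, expressed in Mb
estimate_print_memory(3.46 * mb, 100) / mb

# For a real data frame, start from object.size()
estimate_print_memory(as.numeric(object.size(mtcars)), n_plots = 10)
```

The first two calls reproduce the figures quoted in the text (roughly 2.4 Gb and roughly 100 Mb). Treat the result as an order-of-magnitude estimate only, since the 0.3 ratio comes from small informal tests.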
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive
plotList <- list()
for (i in 1:6) {
  plotList[[i]] <- ggally_text(paste("Plot #", i, sep = ""))
}
pm <- ggmatrix(
  plotList,
  2, 3,
  c("A", "B", "C"),
  c("D", "E"),
  byrow = TRUE
)
p_(pm)
pm <- ggmatrix(
  plotList,
  2, 3,
  xAxisLabels = c("A", "B", "C"),
  yAxisLabels = NULL,
  byrow = FALSE,
  showXAxisPlotLabels = FALSE
)
p_(pm)
# Small function to display plots only if it's interactive p_ <- GGally::print_if_interactive plotList <- list() for (i in 1:6) { plotList[[i]] <- ggally_text(paste("Plot #", i, sep = "")) } pm <- ggmatrix( plotList, 2, 3, c("A", "B", "C"), c("D", "E"), byrow = TRUE ) p_(pm) pm <- ggmatrix( plotList, 2, 3, xAxisLabels = c("A", "B", "C"), yAxisLabels = NULL, byrow = FALSE, showXAxisPlotLabels = FALSE ) p_(pm)
ggmatrix
gtable object
Specialized method to print the ggmatrix
object.
ggmatrix_gtable( pm, ..., progress = NULL, progress_format = formals(ggmatrix_progress)$format )
pm |
|
... |
ignored |
progress , progress_format
|
Please use the 'progress' parameter in your |
Barret Schloerke
data(tips)
pm <- ggpairs(tips, c(1, 3, 2), mapping = ggplot2::aes(color = sex))
ggmatrix_gtable(pm)
ggmatrix
plot locations
ggmatrix_location(pm, location = NULL, rows = NULL, cols = NULL)
pm |
|
location |
|
rows |
numeric vector of the rows to be used. Will be used with |
cols |
numeric vector of the cols to be used. Will be used with |
Convert many types of location values to a consistent data.frame
of row
and col
values.
Data frame with columns c("row", "col")
containing locations for the plot matrix
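To make the shape of the returned value concrete, here is a minimal base R sketch that builds a data frame with `row` and `col` columns for the upper triangle of a 3 x 3 plot matrix. It is an illustration only: it does not call GGally, it is not how `ggmatrix_location` is implemented, and the row ordering of the real function's output may differ.

```r
# Illustrative only: convert a logical matrix into (row, col) locations.
n <- 3
upper <- matrix(FALSE, nrow = n, ncol = n)
upper[upper.tri(upper)] <- TRUE # mark the cells above the diagonal

# which(..., arr.ind = TRUE) returns a matrix with columns "row" and "col"
locs <- as.data.frame(which(upper, arr.ind = TRUE))
locs
```

Every selected cell satisfies `row < col`, which is what distinguishes the upper triangle from the diagonal and the lower triangle.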
pm <- ggpairs(tips, 1:3)

# All locations
ggmatrix_location(pm, location = "all")
ggmatrix_location(pm, location = TRUE)

# No locations
ggmatrix_location(pm, location = "none")

# "upper" triangle locations
ggmatrix_location(pm, location = "upper")

# "lower" triangle locations
ggmatrix_location(pm, location = "lower")

# "diag" locations
ggmatrix_location(pm, location = "diag")

# specific rows
ggmatrix_location(pm, rows = 2)

# specific columns
ggmatrix_location(pm, cols = 2)

# row and column combinations
ggmatrix_location(pm, rows = c(1, 2), cols = c(1, 3))

# matrix locations
mat <- matrix(TRUE, ncol = 3, nrow = 3)
mat[1, 1] <- FALSE
locs <- ggmatrix_location(pm, location = mat)
## does not contain the 1, 1 cell
locs

# Use the output of a prior ggmatrix_location
ggmatrix_location(pm, location = locs)
ggmatrix
default progress bar
ggmatrix_progress( format = " plot: [:plot_i, :plot_j] [:bar]:percent est::eta ", clear = TRUE, show_after = 0, ... )
format , clear , show_after , ...
|
parameters supplied directly to |
function that accepts a plot matrix as the first argument and ...
for future expansion. Internally, the plot matrix is used to determine the total number of plots for the progress bar.
p_ <- GGally::print_if_interactive

pm <- ggpairs(iris, 1:2, progress = ggmatrix_progress())
p_(pm)

# does not clear after finishing
pm <- ggpairs(iris, 1:2, progress = ggmatrix_progress(clear = FALSE))
p_(pm)
Function for plotting network objects using ggplot2, now replaced by the
ggnet2
function, which provides additional control over
plotting parameters. Please visit https://github.com/briatte/ggnet for
the latest version of ggnet2, and https://briatte.github.io/ggnet/ for a
vignette that contains many examples and explanations.
ggnet( net, mode = "fruchtermanreingold", layout.par = NULL, layout.exp = 0, size = 9, alpha = 1, weight = "none", weight.legend = NA, weight.method = weight, weight.min = NA, weight.max = NA, weight.cut = FALSE, group = NULL, group.legend = NA, node.group = group, node.color = NULL, node.alpha = alpha, segment.alpha = alpha, segment.color = "grey50", segment.label = NULL, segment.size = 0.25, arrow.size = 0, arrow.gap = 0, arrow.type = "closed", label = FALSE, label.nodes = label, label.size = size/2, label.trim = FALSE, legend.size = 9, legend.position = "right", names = c("", ""), quantize.weights = FALSE, subset.threshold = 0, top8.nodes = FALSE, trim.labels = FALSE, ... )
net |
an object of class |
mode |
a placement method from those provided in the
|
layout.par |
options to be passed to the placement method, as listed in
gplot.layout.
Defaults to |
layout.exp |
a multiplier to expand the horizontal axis if node labels
get clipped: see expand_range for details.
Defaults to |
size |
size of the network nodes. If the nodes are weighted, their area is proportionally scaled up to the size set by |
alpha |
a level of transparency for nodes, vertices and arrows.
Defaults to |
weight |
the weighting method for the nodes, which might be a vertex
attribute or a vector of size values. Also accepts |
weight.legend |
the name to assign to the legend created by
|
weight.method |
see |
weight.min |
whether to subset the network to nodes with a minimum size,
based on the values of |
weight.max |
whether to subset the network to nodes with a maximum size,
based on the values of |
weight.cut |
whether to cut the size of the nodes into a certain number
of quantiles. Accepts |
group |
the groups of the nodes, either as a vector of values or as a
vertex attribute. If set to |
group.legend |
the name to assign to the legend created by
|
node.group |
see |
node.color |
a vector of character strings to color the nodes with,
holding as many colors as there are levels in |
node.alpha |
transparency of the nodes. Inherits from |
segment.alpha |
the level of transparency of the edges.
Defaults to |
segment.color |
the color of the edges, as a color value, a vector of
color values, or as an edge attribute containing color values.
Defaults to |
segment.label |
the labels to plot at the middle of the edges, as a
single value, a vector of values, or as an edge attribute.
Defaults to |
segment.size |
the size of the edges, in points, as a single numeric
value, a vector of values, or as an edge attribute.
Defaults to |
arrow.size |
the size of the arrows for directed network edges, in
points. See |
arrow.gap |
a setting aimed at improving the display of edge arrows by
plotting slightly shorter edges. Accepts any value between |
arrow.type |
the type of the arrows for directed network edges. See
|
label |
whether to label the nodes. If set to |
label.nodes |
see |
label.size |
the size of the node labels, in points, as a numeric value,
a vector of numeric values, or as a vertex attribute containing numeric
values.
Defaults to |
label.trim |
whether to apply some trimming to the node labels. Accepts
any function that can process a character vector, or a strictly positive
numeric value, in which case the labels are trimmed to a fixed-length
substring of that length: see |
legend.size |
the size of the legend symbols and text, in points.
Defaults to |
legend.position |
the location of the plot legend(s). Accepts all
|
names |
deprecated: see |
quantize.weights |
deprecated: see |
subset.threshold |
deprecated: see |
top8.nodes |
deprecated: this functionality was experimental and has
been removed entirely from |
trim.labels |
deprecated: see |
... |
other arguments passed to the |
The degree centrality measures that can be produced through the
weight
argument will take the directedness of the network into account,
but will be unweighted. To compute weighted network measures, see the
tnet
package by Tore Opsahl (help("tnet", package = "tnet")
).
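The difference between unweighted, direction-aware degree and a weighted measure can be seen with a small adjacency matrix. This is a base R sketch for illustration; the matrix `m` and its weights are made up, and ggnet may compute its own measures differently internally.

```r
# Made-up weighted adjacency matrix: m[i, j] is the weight of the edge i -> j.
m <- matrix(
  c(
    0, 2, 3,
    0, 0, 5,
    1, 0, 0
  ),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("a", "b", "c"), c("a", "b", "c"))
)

# Unweighted degree that respects direction: count edges, ignore weights
outdegree <- rowSums(m != 0)
indegree <- colSums(m != 0)

# A weighted alternative (sum of outgoing edge weights), for comparison
weighted_out <- rowSums(m)

rbind(outdegree, indegree, weighted_out)
```

Node `a` has two outgoing edges but a weighted out-strength of 5, which is the kind of distinction the unweighted measures do not capture.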
Moritz Marbach and Francois Briatte, with help from Heike Hofmann, Pedro Jordano and Ming-Yu Liu
ggnet2
in this package,
gplot
in the sna
package, and
plot.network
in the network
package
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

library(network)

# random adjacency matrix
x <- 10
ndyads <- x * (x - 1)
density <- x / ndyads
m <- matrix(0, nrow = x, ncol = x)
dimnames(m) <- list(letters[1:x], letters[1:x])
m[row(m) != col(m)] <- runif(ndyads) < density
m

# random undirected network
n <- network::network(m, directed = FALSE)
n

ggnet(n, label = TRUE, alpha = 1, color = "white", segment.color = "black")

# random groups
g <- sample(letters[1:3], 10, replace = TRUE)
g

# color palette
p <- c("a" = "steelblue", "b" = "forestgreen", "c" = "tomato")

p_(ggnet(n, node.group = g, node.color = p, label = TRUE, color = "white"))

# edge arrows on a directed network
p_(ggnet(network(m, directed = TRUE), arrow.gap = 0.05, arrow.size = 10))
Function for plotting network objects using ggplot2, with additional control
over graphical parameters that are not supported by the ggnet
function. Please visit https://github.com/briatte/ggnet for the latest
version of ggnet2, and https://briatte.github.io/ggnet/ for a vignette
that contains many examples and explanations.
ggnet2( net, mode = "fruchtermanreingold", layout.par = NULL, layout.exp = 0, alpha = 1, color = "grey75", shape = 19, size = 9, max_size = 9, na.rm = NA, palette = NULL, alpha.palette = NULL, alpha.legend = NA, color.palette = palette, color.legend = NA, shape.palette = NULL, shape.legend = NA, size.palette = NULL, size.legend = NA, size.zero = FALSE, size.cut = FALSE, size.min = NA, size.max = NA, label = FALSE, label.alpha = 1, label.color = "black", label.size = max_size/2, label.trim = FALSE, node.alpha = alpha, node.color = color, node.label = label, node.shape = shape, node.size = size, edge.alpha = 1, edge.color = "grey50", edge.lty = "solid", edge.size = 0.25, edge.label = NULL, edge.label.alpha = 1, edge.label.color = label.color, edge.label.fill = "white", edge.label.size = max_size/2, arrow.size = 0, arrow.gap = 0, arrow.type = "closed", legend.size = 9, legend.position = "right", ... )
net |
an object of class |
mode |
a placement method from those provided in the
|
layout.par |
options to be passed to the placement method, as listed in
gplot.layout.
Defaults to |
layout.exp |
a multiplier to expand the horizontal axis if node labels
get clipped: see expand_range for details.
Defaults to |
alpha |
the level of transparency of the edges and nodes, which might be
a single value, a vertex attribute, or a vector of values.
Also accepts |
color |
the color of the nodes, which might be a single value, a vertex
attribute, or a vector of values.
Also accepts |
shape |
the shape of the nodes, which might be a single value, a vertex
attribute, or a vector of values.
Also accepts |
size |
the size of the nodes, in points, which might be a single value,
a vertex attribute, or a vector of values. Also accepts |
max_size |
the maximum size of the node when |
na.rm |
whether to subset the network to nodes that are not
missing a given vertex attribute. If set to any vertex attribute of
|
palette |
the palette to color the nodes, when |
alpha.palette |
the palette to control the transparency levels of the
nodes set by |
alpha.legend |
the name to assign to the legend created by
|
color.palette |
see |
color.legend |
the name to assign to the legend created by
|
shape.palette |
the palette to control the shapes of the nodes set by
|
shape.legend |
the name to assign to the legend created by
|
size.palette |
the palette to control the sizes of the nodes set by
|
size.legend |
the name to assign to the legend created by
|
size.zero |
whether to accept zero-sized nodes based on the value(s) of
|
size.cut |
whether to cut the size of the nodes into a certain number of
quantiles. Accepts |
size.min |
whether to subset the network to nodes with a minimum size,
based on the values of |
size.max |
whether to subset the network to nodes with a maximum size,
based on the values of |
label |
whether to label the nodes. If set to |
label.alpha |
the level of transparency of the node labels, as a
numeric value, a vector of numeric values, or as a vertex attribute
containing numeric values.
Defaults to |
label.color |
the color of the node labels, as a color value, a vector
of color values, or as a vertex attribute containing color values.
Defaults to |
label.size |
the size of the node labels, in points, as a numeric value,
a vector of numeric values, or as a vertex attribute containing numeric
values.
Defaults to |
label.trim |
whether to apply some trimming to the node labels. Accepts
any function that can process a character vector, or a strictly positive
numeric value, in which case the labels are trimmed to a fixed-length
substring of that length: see |
node.alpha |
see |
node.color |
see |
node.label |
see |
node.shape |
see |
node.size |
see |
edge.alpha |
the level of transparency of the edges.
Defaults to the value of |
edge.color |
the color of the edges, as a color value, a vector of color
values, or as an edge attribute containing color values.
Defaults to |
edge.lty |
the linetype of the edges, as a linetype value, a vector of
linetype values, or as an edge attribute containing linetype values.
Defaults to |
edge.size |
the size of the edges, in points, as a numeric value, a
vector of numeric values, or as an edge attribute containing numeric values.
All edge sizes must be strictly positive.
Defaults to |
edge.label |
the labels to plot at the middle of the edges, as a single
value, a vector of values, or as an edge attribute.
Defaults to |
edge.label.alpha |
the level of transparency of the edge labels, as a
numeric value, a vector of numeric values, or as an edge attribute
containing numeric values.
Defaults to |
edge.label.color |
the color of the edge labels, as a color value, a
vector of color values, or as an edge attribute containing color values.
Defaults to |
edge.label.fill |
the background color of the edge labels.
Defaults to |
edge.label.size |
the size of the edge labels, in points, as a numeric
value, a vector of numeric values, or as an edge attribute containing numeric
values. All edge label sizes must be strictly positive.
Defaults to |
arrow.size |
the size of the arrows for directed network edges, in
points. See |
arrow.gap |
a setting aimed at improving the display of edge arrows by
plotting slightly shorter edges. Accepts any value between |
arrow.type |
the type of the arrows for directed network edges. See
|
legend.size |
the size of the legend symbols and text, in points.
Defaults to |
legend.position |
the location of the plot legend(s). Accepts all
|
... |
other arguments passed to the |
The degree centrality measures that can be produced through the
size
argument will take the directedness of the network into account,
but will be unweighted. To compute weighted network measures, see the
tnet
package by Tore Opsahl (help("tnet", package = "tnet")
).
The nodes of bipartite networks can be mapped to their mode by passing the
"mode"
argument to any of alpha
, color
, shape
and
size
, in which case the nodes of the primary mode will be mapped as
"actor"
, and the nodes of the secondary mode will be mapped as
"event"
.
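The bipartite mapping described above can be sketched as follows. The incidence matrix `bip` and the palette are made-up illustration data, and the code assumes the GGally, network, sna and scales packages are installed; it is skipped otherwise.

```r
# Only run when the required packages are available
pkgs <- c("GGally", "network", "sna", "scales")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  # Made-up incidence matrix: rows are actors, columns are events
  bip <- data.frame(
    event1 = c(1, 1, 1, 0),
    event2 = c(0, 0, 1, 0),
    event3 = c(1, 1, 0, 1),
    row.names = letters[1:4]
  )
  bip_net <- network::network(bip, matrix.type = "bipartite")

  # Passing "mode" to color maps primary-mode nodes to "actor"
  # and secondary-mode nodes to "event"
  p_bip <- GGally::ggnet2(
    bip_net,
    color = "mode",
    palette = c("actor" = "grey", "event" = "gold"),
    label = TRUE
  )
}
```

The palette names must match the `"actor"` and `"event"` labels; the same `"mode"` value can be passed to `alpha`, `shape` or `size` instead.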
Moritz Marbach and Francois Briatte, with help from Heike Hofmann, Pedro Jordano and Ming-Yu Liu
ggnet
in this package,
gplot
in the sna
package, and
plot.network
in the network
package
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

library(network)

# random adjacency matrix
x <- 10
ndyads <- x * (x - 1)
density <- x / ndyads
m <- matrix(0, nrow = x, ncol = x)
dimnames(m) <- list(letters[1:x], letters[1:x])
m[row(m) != col(m)] <- runif(ndyads) < density
m

# random undirected network
n <- network::network(m, directed = FALSE)
n

p_(ggnet2(n, label = TRUE))
p_(ggnet2(n, label = TRUE, shape = 15))
p_(ggnet2(n, label = TRUE, shape = 15, color = "black", label.color = "white"))

# add vertex attribute
x <- network.vertex.names(n)
x <- ifelse(x %in% c("a", "e", "i"), "vowel", "consonant")
n %v% "phono" <- x

p_(ggnet2(n, color = "phono"))
p_(ggnet2(n, color = "phono", palette = c("vowel" = "gold", "consonant" = "grey")))
p_(ggnet2(n, shape = "phono", color = "phono"))

if (require(RColorBrewer)) {
  # random groups
  n %v% "group" <- sample(LETTERS[1:3], 10, replace = TRUE)

  p_(ggnet2(n, color = "group", palette = "Set2"))
}

# random weights
n %e% "weight" <- sample(1:3, network.edgecount(n), replace = TRUE)
p_(ggnet2(n, edge.size = "weight", edge.label = "weight"))

# edge arrows on a directed network
p_(ggnet2(network(m, directed = TRUE), arrow.gap = 0.05, arrow.size = 10))

# Padgett's Florentine wedding data
data(flo, package = "network")
flo

p_(ggnet2(flo, label = TRUE))
p_(ggnet2(flo, label = TRUE, label.trim = 4, vjust = -1, size = 3, color = 1))
p_(ggnet2(flo, label = TRUE, size = 12, color = "white"))
Plots a network with ggplot2 suitable for overlay on a ggmap plot or ggplot2
ggnetworkmap( gg, net, size = 3, alpha = 0.75, weight, node.group, node.color = NULL, node.alpha = NULL, ring.group, segment.alpha = NULL, segment.color = "grey", great.circles = FALSE, segment.size = 0.25, arrow.size = 0, label.nodes = FALSE, label.size = size/2, ... )
gg |
an object of class |
net |
an object of class |
size |
size of the network nodes. Defaults to 3. If the nodes are weighted, their area is proportionally scaled up to the size set by |
alpha |
a level of transparency for nodes, vertices and arrows. Defaults to 0.75. |
weight |
if present, the unquoted name of a vertex attribute in |
node.group |
|
node.color |
If |
node.alpha |
transparency of the nodes. Inherits from |
ring.group |
if not |
segment.alpha |
transparency of the vertex links. Inherits from |
segment.color |
color of the vertex links. Defaults to |
great.circles |
whether to draw edges as great circles using the |
segment.size |
size of the vertex links, as a vector of values or as a single value. Defaults to 0.25. |
arrow.size |
size of the vertex arrows for directed network plotting, in centimeters. Defaults to 0. |
label.nodes |
label nodes with their vertex names attribute. If set to |
label.size |
size of the labels. Defaults to |
... |
other arguments supplied to geom_text for the node labels. Arguments pertaining to the title or other items can be achieved through ggplot2 methods. |
This is a descendant of the original ggnet
function. ggnet
added the innovation of plotting the network geographically.
However, ggnet
needed to be the first object in the ggplot chain. ggnetworkmap
does not. If passed a ggplot
object as its first argument,
such as output from ggmap
, ggnetworkmap
will plot on top of that chart, looking for vertex attributes lon
and lat
as coordinates.
Otherwise, ggnetworkmap
will generate coordinates using the Fruchterman-Reingold algorithm.
This is a function for plotting graphs generated by network
or igraph
in a more flexible and elegant manner than permitted by ggnet. The function does not need to be the first plot in the ggplot chain, so the graph can be plotted on top of a map or other chart. Segments can be straight lines, or plotted as great circles. Note that the great circles feature can produce odd results with arrows and with vertices beyond the plot edges; this is a ggplot2 limitation and cannot yet be fixed. Nodes can have two color schemes, which are then plotted as the center and ring around the node. The color schemes are selected by adding scale_fill_ or scale_color_ just like any other ggplot2 plot. If there are no rings, scale_color sets the color of the nodes. If there are rings, scale_color sets the color of the rings, and scale_fill sets the color of the centers. Note that additional arguments in the ... are passed to geom_text for plotting labels.
Amos Elberg. Original by Moritz Marbach, Francois Briatte
# small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

invisible(lapply(c("ggplot2", "maps", "network", "sna"), base::library, character.only = TRUE))

## Example showing great circles on a simple map of the USA
## http://flowingdata.com/2011/05/11/how-to-map-connections-with-great-circles/
airports <- read.csv("http://datasets.flowingdata.com/tuts/maparcs/airports.csv", header = TRUE)
rownames(airports) <- airports$iata

# select some random flights
set.seed(123)
flights <- data.frame(
  origin = sample(airports[200:400, ]$iata, 200, replace = TRUE),
  destination = sample(airports[200:400, ]$iata, 200, replace = TRUE)
)

# convert to network
flights <- network(flights, directed = TRUE)

# add geographic coordinates
flights %v% "lat" <- airports[network.vertex.names(flights), "lat"]
flights %v% "lon" <- airports[network.vertex.names(flights), "long"]

# drop isolated airports
delete.vertices(flights, which(degree(flights) < 2))

# compute degree centrality
flights %v% "degree" <- degree(flights, gmode = "digraph")

# add random groups
flights %v% "mygroup" <- sample(letters[1:4], network.size(flights), replace = TRUE)

# create a map of the USA
usa <- ggplot(map_data("usa"), aes(x = long, y = lat)) +
  geom_polygon(aes(group = group),
    color = "grey65",
    fill = "#f9f9f9", linewidth = 0.2
  )

# overlay network data to map
p <- ggnetworkmap(
  usa, flights,
  size = 4, great.circles = TRUE,
  node.group = mygroup, segment.color = "steelblue",
  ring.group = degree, weight = degree
)
p_(p)

## Exploring a community of spambots found on Twitter
## Data by Amos Elberg: see ?twitter_spambots for details
data(twitter_spambots)

# create a world map
world <- fortify(map("world", plot = FALSE, fill = TRUE))
world <- ggplot(world, aes(x = long, y = lat)) +
  geom_polygon(aes(group = group),
    color = "grey65",
    fill = "#f9f9f9", linewidth = 0.2
  )

# view global structure
p <- ggnetworkmap(world, twitter_spambots)
p_(p)

# domestic distribution
p <- ggnetworkmap(net = twitter_spambots)
p_(p)

# topology
p <- ggnetworkmap(net = twitter_spambots, arrow.size = 0.5)
p_(p)

# compute indegree and outdegree centrality
twitter_spambots %v% "indegree" <- degree(twitter_spambots, cmode = "indegree")
twitter_spambots %v% "outdegree" <- degree(twitter_spambots, cmode = "outdegree")

p <- ggnetworkmap(
  net = twitter_spambots,
  arrow.size = 0.5,
  node.group = indegree,
  ring.group = outdegree, size = 4
) +
  scale_fill_continuous("Indegree", high = "red", low = "yellow") +
  labs(color = "Outdegree")
p_(p)

# show some vertex attributes associated with each account
p <- ggnetworkmap(
  net = twitter_spambots,
  arrow.size = 0.5,
  node.group = followers,
  ring.group = friends,
  size = 4,
  weight = indegree,
  label.nodes = TRUE, vjust = -1.5
) +
  scale_fill_continuous("Followers", high = "red", low = "yellow") +
  labs(color = "Friends") +
  scale_color_continuous(low = "lightgreen", high = "darkgreen")
p_(p)
Plot matrix of statistical model diagnostics
ggnostic( model, ..., columnsX = attr(data, "var_x"), columnsY = c(".resid", ".sigma", ".hat", ".cooksd"), columnLabelsX = attr(data, "var_x_label"), columnLabelsY = gsub("\\.", " ", gsub("^\\.", "", columnsY)), xlab = "explanatory variables", ylab = "diagnostics", title = paste(deparse(model$call, width.cutoff = 500L), collapse = "\n"), continuous = list(default = ggally_points, .fitted = ggally_points, .se.fit = ggally_nostic_se_fit, .resid = ggally_nostic_resid, .hat = ggally_nostic_hat, .sigma = ggally_nostic_sigma, .cooksd = ggally_nostic_cooksd, .std.resid = ggally_nostic_std_resid), combo = list(default = ggally_box_no_facet, .fitted = ggally_box_no_facet, .se.fit = ggally_nostic_se_fit, .resid = ggally_nostic_resid, .hat = ggally_nostic_hat, .sigma = ggally_nostic_sigma, .cooksd = ggally_nostic_cooksd, .std.resid = ggally_nostic_std_resid), discrete = list(default = ggally_ratio, .fitted = ggally_ratio, .se.fit = ggally_ratio, .resid = ggally_ratio, .hat = ggally_ratio, .sigma = ggally_ratio, .cooksd = ggally_ratio, .std.resid = ggally_ratio), progress = NULL, data = broomify(model) )
model |
statistical model object such as output from |
... |
arguments passed directly to |
columnsX |
columns to be displayed in the plot matrix. Defaults to the predictor columns of the |
columnsY |
rows to be displayed in the plot matrix. Defaults to residuals, leave one out sigma value, diagonal of the hat matrix, and Cook's Distance. The possible values are the response variables in the model and the added columns provided by |
columnLabelsX , columnLabelsY
|
column and row labels to display in the plot matrix |
xlab , ylab , title
|
plot matrix labels passed directly to |
continuous , combo , discrete
|
list of functions for each y variable. See details for more information. |
progress |
|
data |
data defaults to a 'broomify'ed model object. This object will contain information about the X variables, Y variables, and multiple broom outputs. See |
columnsY
broom::augment() collects data from the supplied model and returns a data.frame with the following columns (taken directly from broom documentation). These columns are the only allowed values in the columnsY parameter to ggnostic.
.resid: Residuals
.hat: Diagonal of the hat matrix
.sigma: Estimate of residual standard deviation when the corresponding observation is dropped from the model
.cooksd: Cook's distance, stats::cooks.distance()
.fitted: Fitted values of the model
.se.fit: Standard errors of the fitted values
.std.resid: Standardized residuals
The response variable in the model may also be added, such as "mpg" in the model lm(mpg ~ ., data = mtcars).
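As a rough illustration of what these diagnostic columns contain, the sketch below rebuilds them for an `lm` fit using only base `stats` functions. This is an approximation for orientation, not how `broomify()` or `broom::augment()` is implemented; the name `diag_cols` is hypothetical.

```r
# Fit the model mentioned in the text above.
mod <- stats::lm(mpg ~ ., data = mtcars)

# influence() supplies the leverage (hat) values and leave-one-out sigma;
# predict(se.fit = TRUE) supplies fitted values and their standard errors.
infl <- stats::influence(mod)
pred <- stats::predict(mod, se.fit = TRUE)

# Hypothetical data frame mirroring the column names listed above.
diag_cols <- data.frame(
  .fitted    = unname(pred$fit),
  .se.fit    = unname(pred$se.fit),
  .resid     = unname(stats::residuals(mod)),
  .hat       = unname(infl$hat),
  .sigma     = unname(infl$sigma),
  .cooksd    = unname(stats::cooks.distance(mod)),
  .std.resid = unname(stats::rstandard(mod))
)

head(diag_cols, 3)
```

Any of these column names (plus the response, here "mpg") is then a candidate value for `columnsY`.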
continuous, combo, discrete types
Similar to ggduo and ggpairs, functions may be supplied to display the different column types. However, since the Y rows are fixed, each row has its own corresponding function in each of the plot types: continuous, combo, and discrete. Each plot type list can have keys that correspond to the broom::augment() output: ".fitted", ".resid", ".std.resid", ".sigma", ".se.fit", ".hat", ".cooksd". An extra key, "default", is used to plot the response variables of the model if they are included. Having a function for each diagnostic allows for very fine control over the diagnostics plot matrix. The functions for each type list are wrapped into a switch function that calls the function corresponding to the y variable being plotted. These switch functions are then passed directly to the types parameter in ggduo.
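The dispatch described above can be pictured with a small base R sketch. It is only a conceptual model: the helper `make_switch()`, the stand-in plotting functions, and the explicit `y_name` argument are all hypothetical (the package presumably derives the y variable from the mapping rather than from an extra argument).

```r
# Hypothetical stand-ins for plotting functions; each returns a label
# instead of a plot so the dispatch is easy to inspect.
plot_default <- function(data, mapping, ...) "default plot"
plot_resid   <- function(data, mapping, ...) "residual plot"
plot_hat     <- function(data, mapping, ...) "leverage plot"

# Build one function that picks a plotting function by y-variable name and
# falls back to the "default" entry when the name has no entry of its own.
make_switch <- function(fns) {
  function(data, mapping, ..., y_name) {
    fn <- fns[[y_name]]
    if (is.null(fn)) {
      fn <- fns[["default"]]
    }
    fn(data, mapping, ...)
  }
}

continuous_fns <- list(
  default = plot_default,
  .resid  = plot_resid,
  .hat    = plot_hat
)
continuous_switch <- make_switch(continuous_fns)

continuous_switch(mtcars, NULL, y_name = ".resid") # uses plot_resid
continuous_switch(mtcars, NULL, y_name = "mpg")    # falls back to plot_default
```

Supplying only some keys, as in `continuous = list(.resid = ...)`, therefore changes a single row of the matrix while the other rows keep their own functions.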
    # small function to display plots only if it's interactive
    p_ <- GGally::print_if_interactive

    data(mtcars)

    # use mtcars dataset and alter the 'am' column to display actual name values
    mtc <- mtcars
    mtc$am <- c("0" = "automatic", "1" = "manual")[as.character(mtc$am)]

    # step the complete model down to a smaller model
    mod <- stats::step(stats::lm(mpg ~ ., data = mtc), trace = FALSE)

    # display using defaults
    pm <- ggnostic(mod)
    p_(pm)

    # color by am value
    pm <- ggnostic(mod, mapping = ggplot2::aes(color = am))
    p_(pm)

    # turn resid smooth error ribbon off
    pm <- ggnostic(mod, continuous = list(.resid = wrap("nostic_resid", se = FALSE)))
    p_(pm)

    ## plot residuals vs fitted in a ggpairs plot matrix
    dt <- broomify(mod)
    pm <- ggpairs(
      dt, c(".fitted", ".resid"),
      columnLabels = c("fitted", "residuals"),
      lower = list(continuous = ggally_nostic_resid)
    )
    p_(pm)
Make a matrix of plots with a given data set
    ggpairs(
      data,
      mapping = NULL,
      columns = 1:ncol(data),
      title = NULL,
      upper = list(continuous = "cor", combo = "box_no_facet", discrete = "count", na = "na"),
      lower = list(continuous = "points", combo = "facethist", discrete = "facetbar", na = "na"),
      diag = list(continuous = "densityDiag", discrete = "barDiag", na = "naDiag"),
      params = deprecated(),
      ...,
      xlab = NULL,
      ylab = NULL,
      axisLabels = c("show", "internal", "none"),
      columnLabels = colnames(data[columns]),
      labeller = "label_value",
      switch = NULL,
      showStrips = NULL,
      legend = NULL,
      cardinality_threshold = 15,
      progress = NULL,
      proportions = NULL,
      legends = deprecated()
    )
data |
data set to use. Can have both numerical and categorical data. |
mapping |
aesthetic mapping (besides |
columns |
which columns are used to make plots. Defaults to all columns. |
title , xlab , ylab
|
title, x label, and y label for the graph |
upper |
see Details |
lower |
see Details |
diag |
see Details |
params |
|
... |
|
axisLabels |
either "show" to display axisLabels, "internal" for labels in the diagonal plots, or "none" for no axis labels |
columnLabels |
label names to be displayed. Defaults to names of columns being used. |
labeller |
labeller for facets. See |
switch |
switch parameter for facet_grid. See |
showStrips |
boolean to determine if each plot's strips should be displayed. |
legend |
May be the two objects described below or the default
|
cardinality_threshold |
maximum number of levels allowed in a character / factor column. Set this value to NULL to not check factor columns. Defaults to 15 |
progress |
|
proportions |
Value to change how much area is given for each plot. Either |
legends |
upper and lower are lists that may contain the variables 'continuous', 'combo', 'discrete', and 'na'. Each element of the list may be a function or a string. If a string is supplied, it must be a character string representing the tail end of a ggally_NAME function. The list of current valid ggally_NAME functions is visible in a dedicated vignette.
continuous: This option is used for continuous X and Y data.
combo: This option is used for either continuous X and categorical Y data or categorical X and continuous Y data.
discrete: This option is used for categorical X and Y data.
na: This option is used when all X data is NA or all Y data is NA.
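The four keys above can be read as a classification of each pair of columns. The following base R sketch shows one way to express that rule; `pair_type()` is a hypothetical helper, and treating every non-numeric column as categorical is a simplifying assumption that may differ from the package's own type detection.

```r
# Classify a pair of columns into the keys used by `upper` and `lower`.
# Assumption: numeric columns are "continuous", everything else is categorical.
pair_type <- function(x, y) {
  if (all(is.na(x)) || all(is.na(y))) {
    return("na")
  }
  x_num <- is.numeric(x)
  y_num <- is.numeric(y)
  if (x_num && y_num) {
    "continuous"
  } else if (x_num || y_num) {
    "combo"
  } else {
    "discrete"
  }
}

pair_type(iris$Sepal.Length, iris$Petal.Width) # two numeric columns
pair_type(iris$Species, iris$Sepal.Length)     # one factor, one numeric
pair_type(iris$Species, iris$Species)          # two factors
pair_type(rep(NA, 3), 1:3)                     # one column is entirely NA
```

Under this reading, `upper = list(combo = "box_no_facet")` only affects panels whose pair is classified as "combo".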
diag is a list that may only contain the variables 'continuous', 'discrete', and 'na'. Each element of the diag list is a string implementing the following options:
continuous: exactly one of ('densityDiag', 'barDiag', 'blankDiag'). This option is used for continuous X data.
discrete: exactly one of ('barDiag', 'blankDiag'). This option is used for categorical X and Y data.
na: exactly one of ('naDiag', 'blankDiag'). This option is used when all X data is NA.
If 'blank' is ever chosen as an option, then ggpairs will produce an empty plot.
If a function is supplied as an option to upper, lower, or diag, it should implement the function api of function(data, mapping, ...){#make ggplot2 plot}. If a specific function needs its parameters set, wrap(fn, param1 = val1, param2 = val2) the function with its parameters.
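Conceptually, wrapping a function fixes some of its arguments ahead of time while keeping the function(data, mapping, ...) shape. The base R sketch below illustrates that idea only; `wrap_sketch()` and `describe_plot()` are hypothetical names, and this is not the implementation of GGally's `wrap()`. The stand-in function returns a string so the example has no plotting dependency; a real one would build and return a ggplot2 plot.

```r
# Hypothetical function following the function(data, mapping, ...) API.
describe_plot <- function(data, mapping, ..., alpha = 1, size = 1) {
  sprintf("%d rows, alpha = %.1f, size = %.1f", nrow(data), alpha, size)
}

# Illustrative stand-in for wrap(): remember some arguments now and supply
# them every time the returned function is called later.
wrap_sketch <- function(fn, ...) {
  fixed <- list(...)
  function(data, mapping, ...) {
    do.call(fn, c(list(data, mapping), list(...), fixed))
  }
}

faded <- wrap_sketch(describe_plot, alpha = 0.3)
faded(iris, NULL)
```

The returned function still accepts only `data`, `mapping`, and `...`, so it can be dropped into `upper`, `lower`, or `diag` in the same place as the unwrapped function.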
ggmatrix object that if called, will print
Barret Schloerke, Jason Crowley, Di Cook, Heike Hofmann, Hadley Wickham
John W Emerson, Walton A Green, Barret Schloerke, Jason Crowley, Dianne Cook, Heike Hofmann, Hadley Wickham. The Generalized Pairs Plot. Journal of Computational and Graphical Statistics, vol. 22, no. 1, pp. 79-91, 2012.
wrap v1_ggmatrix_theme
    # small function to display plots only if it's interactive
    p_ <- GGally::print_if_interactive

    ## Quick example, with and without colour
    data(flea)
    ggpairs(flea, columns = 2:4)
    pm <- ggpairs(flea, columns = 2:4, ggplot2::aes(colour = species))
    p_(pm)
    # Note: colour should be categorical, else you will need to reset
    # the upper triangle to use points instead of trying to compute corr

    data(tips)
    pm <- ggpairs(tips[, 1:3])
    p_(pm)
    pm <- ggpairs(tips, 1:3, columnLabels = c("Total Bill", "Tip", "Sex"))
    p_(pm)
    pm <- ggpairs(tips, upper = "blank")
    p_(pm)

    ## Plot Types
    # Change default plot behavior
    pm <- ggpairs(
      tips[, c(1, 3, 4, 2)],
      upper = list(continuous = "density", combo = "box_no_facet"),
      lower = list(continuous = "points", combo = "dot_no_facet")
    )
    p_(pm)
    # Supply Raw Functions (may be user defined functions!)
    pm <- ggpairs(
      tips[, c(1, 3, 4, 2)],
      upper = list(continuous = ggally_density, combo = ggally_box_no_facet),
      lower = list(continuous = ggally_points, combo = ggally_dot_no_facet)
    )
    p_(pm)

    # Use sample of the diamonds data
    data(diamonds, package = "ggplot2")
    diamonds.samp <- diamonds[sample(1:dim(diamonds)[1], 1000), ]

    # Different aesthetics for different plot sections and plot types
    pm <- ggpairs(
      diamonds.samp[, 1:5],
      mapping = ggplot2::aes(color = cut),
      upper = list(continuous = wrap("density", alpha = 0.5), combo = "box_no_facet"),
      lower = list(continuous = wrap("points", alpha = 0.3), combo = wrap("dot_no_facet", alpha = 0.4)),
      title = "Diamonds"
    )
    p_(pm)

    ## Axis Label Variations
    # Only Variable Labels on the diagonal (no axis labels)
    pm <- ggpairs(tips[, 1:3], axisLabels = "internal")
    p_(pm)
    # Only Variable Labels on the outside (no axis labels)
    pm <- ggpairs(tips[, 1:3], axisLabels = "none")
    p_(pm)

    ## Facet Label Variations
    # Default:
    df_x <- rnorm(100)
    df_y <- df_x + rnorm(100, 0, 0.1)
    df <- data.frame(x = df_x, y = df_y, c = sqrt(df_x^2 + df_y^2))
    pm <- ggpairs(
      df,
      columnLabels = c("alpha[foo]", "alpha[bar]", "sqrt(alpha[foo]^2 + alpha[bar]^2)")
    )
    p_(pm)
    # Parsed labels:
    pm <- ggpairs(
      df,
      columnLabels = c("alpha[foo]", "alpha[bar]", "sqrt(alpha[foo]^2 + alpha[bar]^2)"),
      labeller = "label_parsed"
    )
    p_(pm)

    ## Plot Insertion Example
    custom_car <- ggpairs(mtcars[, c("mpg", "wt", "cyl")], upper = "blank", title = "Custom Example")
    # ggplot example taken from example(geom_text)
    plot <- ggplot2::ggplot(mtcars, ggplot2::aes(x = wt, y = mpg, label = rownames(mtcars)))
    plot <- plot +
      ggplot2::geom_text(ggplot2::aes(colour = factor(cyl)), size = 3) +
      ggplot2::scale_colour_discrete(l = 40)
    custom_car[1, 2] <- plot
    personal_plot <- ggally_text(
      "ggpairs allows you\nto put in your\nown plot.\nLike that one.\n <---"
    )
    custom_car[1, 3] <- personal_plot
    p_(custom_car)

    ## Remove binwidth warning from ggplot2
    # displays warning about picking a better binwidth
    pm <- ggpairs(tips, 2:3)
    p_(pm)
    # no warning displayed
    pm <- ggpairs(tips, 2:3, lower = list(combo = wrap("facethist", binwidth = 0.5)))
    p_(pm)
    # no warning displayed with user supplied function
    pm <- ggpairs(tips, 2:3, lower = list(combo = wrap(ggally_facethist, binwidth = 0.5)))
    p_(pm)

    ## Remove panel grid lines from correlation plots
    pm <- ggpairs(
      flea,
      columns = 2:4,
      upper = list(continuous = wrap(ggally_cor, displayGrid = FALSE))
    )
    p_(pm)

    ## Custom width/height of subplots
    pm <- ggpairs(tips, columns = c(2, 3, 5))
    p_(pm)
    pm <- ggpairs(tips, columns = c(2, 3, 5), proportions = "auto")
    p_(pm)
    pm <- ggpairs(tips, columns = c(2, 3, 5), proportions = c(1, 3, 2))
    p_(pm)
A function for plotting static parallel coordinate plots, utilizing the ggplot2 graphics package.
    ggparcoord(
      data,
      columns = 1:ncol(data),
      groupColumn = NULL,
      scale = "std",
      scaleSummary = "mean",
      centerObsID = 1,
      missing = "exclude",
      order = columns,
      showPoints = FALSE,
      splineFactor = FALSE,
      alphaLines = 1,
      boxplot = FALSE,
      shadeBox = NULL,
      mapping = NULL,
      title = ""
    )
data |
the dataset to plot |
columns |
a vector of variables (either names or indices) to be axes in the plot |
groupColumn |
a single variable to group (color) by |
scale |
method used to scale the variables (see Details) |
scaleSummary |
if scale=="center", summary statistic to univariately center each variable by |
centerObsID |
if scale=="centerObs", row number of case plot should univariately be centered on |
missing |
method used to handle missing values (see Details) |
order |
method used to order the axes (see Details) |
showPoints |
logical operator indicating whether points should be plotted or not |
splineFactor |
logical or numeric operator indicating whether spline interpolation should be used. Numeric values will be multiplied by the number of columns, |
alphaLines |
value of alpha scalar for the lines of the parcoord plot or a column name of the data |
boxplot |
logical operator indicating whether or not boxplots should underlay the distribution of each variable |
shadeBox |
color of underlying box which extends from the min to the max for each variable (no box is plotted if |
mapping |
aes string to pass to ggplot object |
title |
character string denoting the title of the plot |
scale is a character string that denotes how to scale the variables in the parallel coordinate plot. Options:
std: univariately, subtract mean and divide by standard deviation
robust: univariately, subtract median and divide by median absolute deviation
uniminmax: univariately, scale so the minimum of the variable is zero, and the maximum is one
globalminmax: no scaling is done; the range of the graphs is defined by the global minimum and the global maximum
center: use uniminmax to standardize vertical height, then center each variable at a value specified by the scaleSummary param
centerObs: use uniminmax to standardize vertical height, then center each variable at the value of the observation specified by the centerObsID param
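The univariate scaling options above can be written out as short base R functions. This is a sketch of the arithmetic as described, not the package's code: the function names are hypothetical, `stats::mad()` is used with its default scaling constant as an assumption, and "centering" is interpreted here as subtracting the chosen value after uniminmax scaling.

```r
# std: subtract the mean, divide by the standard deviation
scale_std <- function(x) (x - mean(x)) / stats::sd(x)

# robust: subtract the median, divide by the median absolute deviation
scale_robust <- function(x) (x - stats::median(x)) / stats::mad(x)

# uniminmax: map the minimum to 0 and the maximum to 1
scale_uniminmax <- function(x) (x - min(x)) / (max(x) - min(x))

# center: uniminmax first, then shift so the summary statistic sits at 0
scale_center <- function(x, summary_fn = mean) {
  u <- scale_uniminmax(x)
  u - summary_fn(u)
}

# centerObs: uniminmax first, then shift so observation `obs_id` sits at 0
scale_center_obs <- function(x, obs_id = 1) {
  u <- scale_uniminmax(x)
  u - u[obs_id]
}

x <- c(2, 4, 6, 8, 20)
range(scale_uniminmax(x))
round(scale_std(x), 3)
round(scale_robust(x), 3)
scale_center(x, summary_fn = stats::median)
scale_center_obs(x, obs_id = 1)
```

With an outlier such as the value 20 here, the robust scaling keeps the bulk of the values closer to zero than `std` does, which is the usual reason to prefer it for skewed variables.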
missing is a character string that denotes how to handle missing values. Options:
exclude: remove all cases with missing values
mean: set missing values to the mean of the variable
median: set missing values to the median of the variable
min10: set missing values to 10% below the minimum of the variable
random: set missing values to value of randomly chosen observation on that variable
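A base R sketch of these missing-value strategies follows. The helpers `impute()` and `exclude_rows()` are hypothetical, and `min10` is interpreted here as the minimum minus 10% of its absolute value, which is an assumption; the package may compute "10% below the minimum" differently.

```r
# Impute missing values in one numeric vector with the chosen strategy.
impute <- function(x, method = c("mean", "median", "min10", "random")) {
  method <- match.arg(method)
  miss <- is.na(x)
  if (!any(miss)) {
    return(x)
  }
  obs <- x[!miss]
  x[miss] <- switch(method,
    mean   = mean(obs),
    median = stats::median(obs),
    # Assumption: "10% below the minimum" = min - 0.1 * |min|
    min10  = min(obs) - 0.1 * abs(min(obs)),
    # Draw by index so a single observed value is handled correctly
    random = obs[sample.int(length(obs), sum(miss), replace = TRUE)]
  )
  x
}

# exclude: drop every row that has a missing value in any column
exclude_rows <- function(df) df[stats::complete.cases(df), , drop = FALSE]

x <- c(10, NA, 20, 30, NA)
impute(x, "mean")
impute(x, "median")
impute(x, "min10")

set.seed(1)
impute(x, "random")

exclude_rows(data.frame(a = x, b = 1:5))
```

Imputing below the minimum makes the imputed cases visually distinct at the bottom of each axis, whereas mean or median imputation blends them into the middle of the plot.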
order is either a vector of indices or a character string that denotes how to order the axes (variables) of the parallel coordinate plot. Options:
(default): order by the vector denoted by columns
(given vector): order by the vector specified
anyClass: order variables by their separation between any one class and the rest (as opposed to their overall variation between classes). This is accomplished by calculating the F-statistic for each class vs. the rest, for each axis variable. The axis variables are then ordered (decreasing) by their maximum of k F-statistics, where k is the number of classes.
allClass: order variables by their overall F statistic (decreasing) from an ANOVA with groupColumn as the explanatory variable (note: it is required to specify a groupColumn with this ordering method). Basically, this method orders the variables by their variation between classes (most to least).
skewness: order variables by their sample skewness (most skewed to least skewed)
Outlying: order by the scagnostic measure, Outlying, as calculated by the package scagnostics. Other scagnostic measures available to order by are Skewed, Clumpy, Sparse, Striated, Convex, Skinny, Stringy, and Monotonic. Note: To use these methods of ordering, you must have the scagnostics package loaded.
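The two F-statistic orderings can be sketched in base R as below. This follows the description above rather than the package source; `f_stat()`, `order_all_class()`, and `order_any_class()` are hypothetical helper names.

```r
# One-way ANOVA F statistic of a numeric variable against a grouping variable.
f_stat <- function(x, group) {
  stats::anova(stats::lm(x ~ group))[1, "F value"]
}

# allClass: order by the overall F statistic, largest first.
order_all_class <- function(data, columns, group_col) {
  f <- vapply(columns, function(v) {
    f_stat(data[[v]], data[[group_col]])
  }, numeric(1))
  columns[order(f, decreasing = TRUE)]
}

# anyClass: for each variable, compute one F statistic per class
# (that class vs. the rest) and order by the maximum, largest first.
order_any_class <- function(data, columns, group_col) {
  classes <- unique(as.character(data[[group_col]]))
  max_f <- vapply(columns, function(v) {
    max(vapply(classes, function(cl) {
      f_stat(data[[v]], factor(data[[group_col]] == cl))
    }, numeric(1)))
  }, numeric(1))
  columns[order(max_f, decreasing = TRUE)]
}

cols <- names(iris)[1:4]
order_all_class(iris, cols, "Species")
order_any_class(iris, cols, "Species")
```

The difference matters when one class is well separated on a variable that otherwise varies little between the remaining classes: `anyClass` ranks such a variable highly, while `allClass` may not.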
ggplot object that if called, will print
Jason Crowley, Barret Schloerke, Dianne Cook, Heike Hofmann, Hadley Wickham
    # small function to display plots only if it's interactive
    p_ <- GGally::print_if_interactive

    # use sample of the diamonds data for illustrative purposes
    data(diamonds, package = "ggplot2")
    diamonds.samp <- diamonds[sample(1:dim(diamonds)[1], 100), ]

    # basic parallel coordinate plot, using default settings
    p <- ggparcoord(data = diamonds.samp, columns = c(1, 5:10))
    p_(p)

    # this time, color by diamond cut
    p <- ggparcoord(data = diamonds.samp, columns = c(1, 5:10), groupColumn = 2)
    p_(p)

    # underlay univariate boxplots, add title, use uniminmax scaling
    p <- ggparcoord(
      data = diamonds.samp,
      columns = c(1, 5:10),
      groupColumn = 2,
      scale = "uniminmax",
      boxplot = TRUE,
      title = "Parallel Coord. Plot of Diamonds Data"
    )
    p_(p)

    # utilize ggplot2 aes to switch to thicker lines
    p <- ggparcoord(
      data = diamonds.samp,
      columns = c(1, 5:10),
      groupColumn = 2,
      title = "Parallel Coord. Plot of Diamonds Data",
      mapping = ggplot2::aes(linewidth = 1)
    ) +
      ggplot2::scale_linewidth_identity()
    p_(p)

    # basic parallel coord plot of the msleep data, using 'random' imputation and
    # coloring by diet (can also use variable names in the columns and groupColumn
    # arguments)
    data(msleep, package = "ggplot2")
    p <- ggparcoord(
      data = msleep,
      columns = 6:11,
      groupColumn = "vore",
      missing = "random",
      scale = "uniminmax"
    )
    p_(p)

    # center each variable by its median, using the default missing value handler,
    # 'exclude'
    p <- ggparcoord(
      data = msleep,
      columns = 6:11,
      groupColumn = "vore",
      scale = "center",
      scaleSummary = "median"
    )
    p_(p)

    # with the iris data, order the axes by overall class (Species) separation using
    # the anyClass option
    p <- ggparcoord(data = iris, columns = 1:4, groupColumn = 5, order = "anyClass")
    p_(p)

    # add points to the plot, add a title, and use an alpha scalar to make the lines
    # transparent
    p <- ggparcoord(
      data = iris,
      columns = 1:4,
      groupColumn = 5,
      order = "anyClass",
      showPoints = TRUE,
      title = "Parallel Coordinate Plot for the Iris Data",
      alphaLines = 0.3
    )
    p_(p)

    # set line transparency according to a column
    iris2 <- iris
    iris2$alphaLevel <- c("setosa" = 0.2, "versicolor" = 0.3, "virginica" = 0)[iris2$Species]
    p <- ggparcoord(
      data = iris2,
      columns = 1:4,
      groupColumn = 5,
      order = "anyClass",
      showPoints = TRUE,
      title = "Parallel Coordinate Plot for the Iris Data",
      alphaLines = "alphaLevel"
    )
    p_(p)

    ## Use splines on values, rather than lines (all produce the same result)
    columns <- c(1, 5:10)
    p <- ggparcoord(diamonds.samp, columns, groupColumn = 2, splineFactor = TRUE)
    p_(p)
    p <- ggparcoord(diamonds.samp, columns, groupColumn = 2, splineFactor = 3)
    p_(p)
This function makes a scatterplot matrix for quantitative variables with density plots on the diagonal and correlation printed in the upper triangle.
ggscatmat( data, columns = 1:ncol(data), color = NULL, alpha = 1, corMethod = "pearson" )
data |
a data matrix. Should contain numerical (continuous) data. |
columns |
an option to choose the column to be used in the raw dataset. Defaults to |
color |
an option to group the dataset by the factor variable and color them by different colors. Defaults to |
alpha |
an option to set the transparency in scatterplots for large data. Defaults to |
corMethod |
method argument supplied to |
Mengjia Ni, Di Cook
    # small function to display plots only if it's interactive
    p_ <- GGally::print_if_interactive

    data(flea)

    p_(ggscatmat(flea, columns = 2:4))
    p_(ggscatmat(flea, columns = 2:4, color = "species"))
This function produces Kaplan-Meier plots using ggplot2.
As a first argument it needs a survfit object, created by the survival package. Default settings differ for single stratum and multiple strata objects.
    ggsurv(
      s,
      CI = "def",
      plot.cens = TRUE,
      surv.col = "gg.def",
      cens.col = "gg.def",
      lty.est = 1,
      lty.ci = 2,
      size.est = 0.5,
      size.ci = size.est,
      cens.size = 2,
      cens.shape = 3,
      back.white = FALSE,
      xlab = "Time",
      ylab = "Survival",
      main = "",
      order.legend = TRUE
    )
s |
an object of class |
CI |
should a confidence interval be plotted? Defaults to |
plot.cens |
mark the censored observations? |
surv.col |
colour of the survival estimate. Defaults to black for one stratum, and to the default ggplot2 colours for multiple strata. Length of vector with colour names should be either 1 or equal to the number of strata. |
cens.col |
colour of the points that mark censored observations. |
lty.est |
linetype of the survival curve(s). Vector length should be either 1 or equal to the number of strata. |
lty.ci |
linetype of the bounds that mark the 95% CI. |
size.est |
line width of the survival curve |
size.ci |
line width of the 95% CI |
cens.size |
point size of the censoring points |
cens.shape |
shape of the points that mark censored observations. |
back.white |
if TRUE the background will not be the default grey of |
xlab |
the label of the x-axis. |
ylab |
the label of the y-axis. |
main |
the plot label. |
order.legend |
boolean to determine if the legend display should be ordered by final survival time |
An object of class ggplot
Edwin Thoen
    # Small function to display plots only if it's interactive
    p_ <- GGally::print_if_interactive

    if (require(survival) && require(scales)) {
      lung <- survival::lung
      sf.lung <- survival::survfit(Surv(time, status) ~ 1, data = lung)
      p_(ggsurv(sf.lung))

      # Multiple strata examples
      sf.sex <- survival::survfit(Surv(time, status) ~ sex, data = lung)
      pl.sex <- ggsurv(sf.sex)
      p_(pl.sex)

      # Adjusting the legend of the ggsurv fit
      p_(pl.sex +
        ggplot2::guides(linetype = "none") +
        ggplot2::scale_colour_discrete(
          name = "Sex",
          breaks = c(1, 2),
          labels = c("Male", "Female")
        ))

      # Multiple factors
      lung2 <- dplyr::mutate(lung, older = as.factor(age > 60))
      sf.sex2 <- survival::survfit(Surv(time, status) ~ sex + older, data = lung2)
      pl.sex2 <- ggsurv(sf.sex2)
      p_(pl.sex2)

      # Change legend title
      p_(pl.sex2 + ggplot2::labs(color = "New Title", linetype = "New Title"))

      # We can still adjust the plot after fitting
      kidney <- survival::kidney
      sf.kid <- survival::survfit(Surv(time, status) ~ disease, data = kidney)
      pl.kid <- ggsurv(sf.kid, plot.cens = FALSE)
      p_(pl.kid)

      # Zoom in to first 80 days
      p_(pl.kid + ggplot2::coord_cartesian(xlim = c(0, 80), ylim = c(0.45, 1)))

      # Add the diseases names to the plot and remove legend
      p_(pl.kid +
        ggplot2::annotate(
          "text",
          label = c("PKD", "Other", "GN", "AN"),
          x = c(90, 125, 5, 60),
          y = c(0.8, 0.65, 0.55, 0.30),
          size = 5,
          colour = scales::hue_pal(
            h = c(0, 360) + 15,
            c = 100,
            l = 65,
            h.start = 0,
            direction = 1
          )(4)
        ) +
        ggplot2::guides(color = "none", linetype = "none"))
    }
ggtable is a variant of ggduo for quick cross-tabulated tables of discrete variables.
    ggtable(
      data,
      columnsX = 1:ncol(data),
      columnsY = 1:ncol(data),
      cells = c("observed", "prop", "row.prop", "col.prop", "expected", "resid", "std.resid"),
      fill = c("none", "std.resid", "resid"),
      mapping = NULL,
      ...
    )
data |
dataset to be used, can have both categorical and numerical variables |
columnsX , columnsY
|
names or positions of which columns are used to make plots. Defaults to all columns. |
cells |
Which statistic should be displayed in table cells? |
fill |
Which statistic should be used for filling table cells? |
mapping |
additional aesthetic to be used, for example to indicate weights (see examples) |
... |
additional arguments passed to |
Joseph Larmarange
    # small function to display plots only if it's interactive
    p_ <- GGally::print_if_interactive

    data(tips)
    p_(ggtable(tips, "smoker", c("day", "time", "sex")))

    # displaying row proportions
    p_(ggtable(tips, "smoker", c("day", "time", "sex"), cells = "row.prop"))

    # filling cells with standardized residuals
    p_(ggtable(tips, "smoker", c("day", "time", "sex"), fill = "std.resid", legend = 1))

    # if continuous variables are provided, just displaying some summary statistics
    p_(ggtable(tips, c("smoker", "total_bill"), c("day", "time", "sex", "tip")))

    # specifying weights
    d <- as.data.frame(Titanic)
    p_(ggtable(
      d,
      "Survived",
      c("Class", "Sex", "Age"),
      mapping = ggplot2::aes(weight = Freq),
      cells = "row.prop",
      fill = "std.resid"
    ))
# small function to display plots only if it's interactive p_ <- GGally::print_if_interactive data(tips) p_(ggtable(tips, "smoker", c("day", "time", "sex"))) # displaying row proportions p_(ggtable(tips, "smoker", c("day", "time", "sex"), cells = "row.prop")) # filling cells with standardized residuals p_(ggtable(tips, "smoker", c("day", "time", "sex"), fill = "std.resid", legend = 1)) # if continuous variables are provided, just displaying some summary statistics p_(ggtable(tips, c("smoker", "total_bill"), c("day", "time", "sex", "tip"))) # specifying weights d <- as.data.frame(Titanic) p_(ggtable( d, "Survived", c("Class", "Sex", "Age"), mapping = aes(weight = Freq), cells = "row.prop", fill = "std.resid" ))
GGally implementation of ts.plot. Wraps around the ggduo function and removes the column strips.
ggts(..., columnLabelsX = NULL, xlab = "time")
... |
supplied directly to |
columnLabelsX |
remove top strips for the X axis by default |
xlab |
defaults to "time" |
ggmatrix object
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

p_(ggts(pigs, "time", c("gilts", "profit", "s_per_herdsz", "production", "herdsz")))
Glyph plot class
glyphplot(data, width, height, polar, x_major, y_major)

is.glyphplot(x)

## S3 method for class 'glyphplot'
x[...]

## S3 method for class 'glyphplot'
print(x, ...)
data |
A data frame containing variables named in |
height , width
|
The height and width of each glyph. Defaults to 95% of
the |
polar |
A logical of length 1, specifying whether the glyphs should
be drawn in polar coordinates. Defaults to |
x_major , y_major
|
The name of the variable (as a string) for the major x and y axes. Together, the |
x |
glyphplot to be printed |
... |
ignored |
Di Cook, Heike Hofmann, Hadley Wickham
glyphplot data
Create the data needed to generate a glyph plot.
glyphs( data, x_major, x_minor, y_major, y_minor, polar = FALSE, height = ggplot2::rel(0.95), width = ggplot2::rel(0.95), y_scale = identity, x_scale = identity )
data |
A data frame containing variables named in |
x_major , x_minor , y_major , y_minor
|
The name of the variable (as a string) for the major and minor x and y axes. Together, each unique |
polar |
A logical of length 1, specifying whether the glyphs should
be drawn in polar coordinates. Defaults to |
height , width
|
The height and width of each glyph. Defaults to 95% of
the |
y_scale , x_scale
|
The scaling function to be applied to each set of
minor values within a grid cell. Defaults to |
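The relationship between the major and minor variables can be illustrated with a small base R sketch for the non-polar case. It assumes each glyph point sits at the major position plus the minor value rescaled to [-1, 1] and multiplied by half the glyph size. The helper to_unit_range and the toy data are hypothetical, and the per-cell y_scale / x_scale step is left out.

```r
# Rescale a numeric vector to [-1, 1] (hypothetical helper for this sketch)
to_unit_range <- function(x) {
  rng <- range(x, na.rm = TRUE)
  2 * (x - rng[1]) / (rng[2] - rng[1]) - 1
}

# Toy data: two grid cells (major positions), three minor observations each
toy <- data.frame(
  long = rep(c(-80, -77.5), each = 3),  # x_major
  lat  = rep(30, 6),                    # y_major
  day  = rep(1:3, times = 2),           # x_minor
  temp = c(10, 12, 11, 20, 22, 24)      # y_minor
)

width  <- 2
height <- 2

# Each glyph point = cell centre + rescaled minor value * half the glyph size
toy$gx  <- toy$long + to_unit_range(toy$day)  * width  / 2
toy$gy  <- toy$lat  + to_unit_range(toy$temp) * height / 2
# One group id per unique (x_major, y_major) combination
toy$gid <- interaction(toy$long, toy$lat, drop = TRUE)

toy
```

Grouping by gid when drawing paths keeps each small time series inside its own grid cell.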
Di Cook, Heike Hofmann, Hadley Wickham
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(nasa)
nasaLate <- nasa[
  nasa$date >= as.POSIXct("1998-01-01") &
    nasa$lat >= 20 &
    nasa$lat <= 40 &
    nasa$long >= -80 &
    nasa$long <= -60,
]
temp.gly <- glyphs(nasaLate, "long", "day", "lat", "surftemp", height = 2.5)
p_(ggplot2::ggplot(temp.gly, ggplot2::aes(gx, gy, group = gid)) +
  add_ref_lines(temp.gly, color = "grey90") +
  add_ref_boxes(temp.gly, color = "grey90") +
  ggplot2::geom_path() +
  ggplot2::theme_bw() +
  ggplot2::labs(x = "", y = ""))
Grab the legend and print it as a plot
grab_legend(p)

## S3 method for class 'legend_guide_box'
print(x, ..., plotNew = FALSE)
p |
ggplot2 plot object |
x |
legend object that has been grabbed from a ggplot2 object |
... |
ignored |
plotNew |
boolean to determine if the |
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

library(ggplot2)
histPlot <- ggplot(iris, aes(Sepal.Length, fill = Species)) +
  geom_histogram(binwidth = 1 / 4)

(right <- histPlot)
(bottom <- histPlot + theme(legend.position = "bottom"))
(top <- histPlot + theme(legend.position = "top"))
(left <- histPlot + theme(legend.position = "left"))

p_(grab_legend(right))
p_(grab_legend(bottom))
p_(grab_legend(top))
p_(grab_legend(left))
This data extract is taken from Hadley Wickham's productplots package. The original description follows, with minor edits.
data(happy)
A data frame with 51020 rows and 10 variables
The data is a small sample of variables related to happiness from the General Social Survey (GSS). The GSS is a yearly cross-sectional survey of Americans, run from 1972. We combine data for 25 years to yield 51,020 observations, and of the over 5,000 variables, we select nine related to happiness:
age. age in years: 18–89.
degree. highest education: lt high school, high school, junior college, bachelor, graduate.
finrela. relative financial status: far above, above average, average, below average, far below.
happy. happiness: very happy, pretty happy, not too happy.
health. health: excellent, good, fair, poor.
marital. marital status: married, never married, divorced, widowed, separated.
sex. sex: female, male.
wtsall. probability weight. 0.43–6.43.
Smith, Tom W., Peter V. Marsden, Michael Hout, Jibum Kim. General Social Surveys, 1972-2006. [machine-readable data file]. Principal Investigator, Tom W. Smith; Co-Principal Investigators, Peter V. Marsden and Michael Hout, NORC ed. Chicago: National Opinion Research Center, producer, 2005; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut, distributor. 1 data file (57,061 logical records) and 1 codebook (3,422 pp).
Check if plot is horizontal
is_horizontal(data, mapping, val = "y")

is_character_column(data, mapping, val = "y")
data |
data used in ggplot2 plot |
mapping |
ggplot2 |
val |
key to retrieve from |
Boolean indicating whether the data mapped to val is character-like
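The idea behind the check can be sketched in base R: a plot is treated as horizontal when the variable mapped to y is character-like. The sketch below is an assumption about the logic, not the package's code. It takes the mapping as a plain named character vector rather than a ggplot2 aes() object, and is_character_like is a hypothetical name.

```r
# Hypothetical helper: is the column mapped to `val` a character or factor?
is_character_like <- function(data, mapped_columns, val = "y") {
  column <- data[[mapped_columns[[val]]]]
  is.character(column) || is.factor(column)
}

mapped <- c(x = "Sepal.Length", y = "Species")
is_character_like(iris, mapped)       # y is a factor
is_character_like(iris, mapped, "x")  # x is numeric
```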
is_horizontal(iris, ggplot2::aes(Sepal.Length, Species)) # TRUE
is_horizontal(iris, ggplot2::aes(Sepal.Length, Species), "x") # FALSE
is_horizontal(iris, ggplot2::aes(Sepal.Length, Sepal.Width)) # FALSE
ggscatmat function
Function for making the melted dataset used to plot the lower-triangle scatterplots.
lowertriangle(data, columns = 1:ncol(data), color = NULL)
data |
a data matrix. Should contain numerical (continuous) data. |
columns |
an option to choose the column to be used in the raw dataset. Defaults to |
color |
an option to choose a factor variable to be grouped with. Defaults to |
Mengjia Ni, Di Cook
data(flea)
head(lowertriangle(flea, columns = 2:4))
head(lowertriangle(flea))
head(lowertriangle(flea, color = "species"))
Replace the fill with the color and make color NULL.
mapping_color_to_fill(current)
current |
the current aesthetics |
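A minimal sketch of the described transformation, using a plain named list of expressions as a stand-in for a ggplot2 aes() mapping. The function name color_to_fill_sketch is hypothetical, and the real function may treat edge cases (such as an existing fill) differently.

```r
# Hypothetical stand-in for aes(): a named list of unevaluated expressions
mapping <- list(x = quote(total_bill), y = quote(tip), colour = quote(sex))

color_to_fill_sketch <- function(current) {
  if ("colour" %in% names(current)) {
    current$fill <- current$colour  # fill takes over the colour mapping
    current$colour <- NULL          # colour is removed
  }
  current
}

moved <- color_to_fill_sketch(mapping)
names(moved)
```

This kind of swap is useful for geoms such as bars or densities, where the interior is controlled by fill rather than colour.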
Swap x and y mapping
mapping_swap_x_y(mapping)
mapping |
output of |
Aes mapping with the x and y values switched
mapping <- ggplot2::aes(Petal.Length, Sepal.Width)
mapping
mapping_swap_x_y(mapping)
Retrieve either the response variable names, the beta variable names, or the beta variable labels. If the model is an object of class 'lm', by default, the beta variable labels will include anova significance stars.
model_response_variables(model, data = broom::augment(model))

model_beta_variables(model, data = broom::augment(model))

model_beta_label(model, data = broom::augment(model), lmStars = TRUE)
model |
model in question |
data |
equivalent to |
lmStars |
boolean that determines if stars are added to labels |
character vector of names
This data was provided by NASA for the competition.
data(nasa)
A data frame with 41472 rows and 17 variables
The data shows 6 years of monthly measurements of a 24x24 spatial grid from Central America:
time integer specifying temporal order of measurements
x, y, lat, long spatial location of measurements.
cloudhigh, cloudlow, cloudmid, ozone, pressure, surftemp, temperature are the various satellite measurements.
date, day, month, year specifying the time of measurements.
id unique ID for each spatial position.
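The stated row count follows from the grid size and the number of monthly time points, as a quick check shows (variable names are illustrative):

```r
grid_cells <- 24 * 24    # spatial positions in the 24x24 grid
time_points <- 6 * 12    # six years of monthly measurements
grid_cells * time_points # 41472, the number of rows in the data frame
```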
Murrell, P. (2010) The 2006 Data Expo of the American Statistical Association. Computational Statistics, 25:551-554.
This data contains information about United Kingdom pig production from the book 'Data' by Andrews and Herzberg. The original data can be found on StatLib: http://lib.stat.cmu.edu/datasets/Andrews/T62.1
data(pigs)
A data frame with 48 rows and 8 variables
The time variable has been added from a combination of year and quarter
time year + (quarter - 1) / 4
year year of production
quarter quarter of the year of production
gilts number of sows giving birth for the first time
profit ratio of price to an index of feed price
s_per_herdsz ratio of the number of breeding pigs slaughtered to the total breeding herd size
production number of pigs slaughtered that were reared for meat
herdsz breeding herd size
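The time variable places each quarter at the start of its quarter-year. A small worked example, using a hypothetical year chosen only for illustration:

```r
# Four quarters of one (hypothetical) year
quarters <- data.frame(year = 1967, quarter = 1:4)

# time = year + (quarter - 1) / 4
quarters$time <- quarters$year + (quarters$quarter - 1) / 4
quarters$time
```

The first quarter maps to the year itself, and each later quarter adds 0.25.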
Andrews, David F., and Agnes M. Herzberg. Data: a collection of problems from many fields for the student and research worker. Springer Science & Business Media, 2012.
Small function to print a plot if the R session is interactive or in a CI build
print_if_interactive(p)
p |
plot to be displayed |
ggmatrix object
Print method taken from ggplot2:::print.ggplot and altered for a ggmatrix object.
## S3 method for class 'ggmatrix'
print(x, newpage = is.null(vp), vp = NULL, ...)
x |
plot to display |
newpage |
draw new (empty) page first? |
vp |
viewport to draw plot in |
... |
arguments passed onto |
Barret Schloerke
data(tips)
pMat <- ggpairs(tips, c(1, 3, 2), mapping = ggplot2::aes(color = sex))
pMat # calls print(pMat), which calls print.ggmatrix(pMat)
This data contains 600 observations on eight variables
data(psychademic)
A data frame with 600 rows and 8 variables
locus_of_control - psychological
self_concept - psychological
motivation - psychological. Converted to four character groups
read - academic
write - academic
math - academic
science - academic
female - academic. Dropped from original source
sex - academic. Added as a character version of female column
R Data Analysis Examples | Canonical Correlation Analysis. UCLA: Institute for Digital Research and Education. from http://www.stats.idre.ucla.edu/r/dae/canonical-correlation-analysis (accessed May 22, 2017).
ggmatrix object
Function to place your own plot in the layout.
putPlot(pm, value, i, j)

## S3 replacement method for class 'ggmatrix'
pm[i, j, ...] <- value
pm |
ggally object to be altered |
value |
ggplot object to be placed |
i |
row from the top |
j |
column from the left |
... |
ignored |
Barret Schloerke
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

custom_car <- ggpairs(mtcars[, c("mpg", "wt", "cyl")], upper = "blank", title = "Custom Example")

# ggplot example taken from example(geom_text)
plot <- ggplot2::ggplot(mtcars, ggplot2::aes(x = wt, y = mpg, label = rownames(mtcars)))
plot <- plot +
  ggplot2::geom_text(ggplot2::aes(colour = factor(cyl)), size = 3) +
  ggplot2::scale_colour_discrete(l = 40)
custom_car[1, 2] <- plot

personal_plot <- ggally_text(
  "ggpairs allows you\nto put in your\nown plot.\nLike that one.\n <---"
)
custom_car[1, 3] <- personal_plot
# custom_car

# remove plots after creating a plot matrix
custom_car[2, 1] <- NULL
custom_car[3, 1] <- "blank" # the same as storing null
custom_car[3, 2] <- NULL
p_(custom_car)
Remove colour mapping unless found in select mapping keys
remove_color_unless_equal(mapping, to = c("x", "y"))
mapping |
output of |
to |
set of mapping keys to check |
Aes mapping with colour mapping kept only if found in selected mapping keys.
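The rule can be sketched in base R with a plain named list of expressions standing in for an aes() mapping. This is an illustration under the assumption that "found in" means the colour expression is identical to the expression of one of the selected keys; keep_colour_if_mapped is a hypothetical name.

```r
# Hypothetical stand-in for aes(): a named list of unevaluated expressions
keep_colour_if_mapped <- function(mapping, to = c("x", "y")) {
  colour <- mapping$colour
  if (is.null(colour)) {
    return(mapping)
  }
  # Does any of the selected keys map to the same variable as colour?
  matches <- vapply(mapping[to], identical, logical(1), colour)
  if (!any(matches)) {
    mapping$colour <- NULL
  }
  mapping
}

same  <- list(x = quote(sex), y = quote(age), colour = quote(sex))
other <- list(x = quote(sex), y = quote(age), colour = quote(region))

names(keep_colour_if_mapped(same))   # colour kept: it equals x
names(keep_colour_if_mapped(other))  # colour dropped: region is not x or y
```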
# colour is also mapped to x
mapping <- aes(x = sex, y = age, colour = sex)
remove_color_unless_equal(mapping)

# colour is mapped to neither x nor y
mapping <- aes(x = sex, y = age, colour = region)
remove_color_unless_equal(mapping)
Rescaling functions
range01(x)

max1(x)

mean0(x)

min0(x)

rescale01(x, xlim = NULL)

rescale11(x, xlim = NULL)
x |
numeric vector |
xlim |
value used in |
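The page gives no details on each function, so the following base R sketch shows what the names suggest: rescaling to [0, 1], dividing by the maximum, centring on the mean, shifting the minimum to zero, and rescaling to [-1, 1]. These are hypothetical re-implementations (note the _sketch suffix), and treating xlim as a fixed pair of limits is an assumption.

```r
x <- c(2, 4, 6, 10)

range01_sketch <- function(x) (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE))
max1_sketch    <- function(x) x / max(x, na.rm = TRUE)    # maximum becomes 1
mean0_sketch   <- function(x) x - mean(x, na.rm = TRUE)   # mean becomes 0
min0_sketch    <- function(x) x - min(x, na.rm = TRUE)    # minimum becomes 0

# Rescale to [0, 1], optionally against fixed limits instead of the data range
rescale01_sketch <- function(x, xlim = NULL) {
  rng <- if (is.null(xlim)) range(x, na.rm = TRUE) else xlim
  (x - rng[1]) / (rng[2] - rng[1])
}

# Rescale to [-1, 1]
rescale11_sketch <- function(x, xlim = NULL) 2 * rescale01_sketch(x, xlim) - 1

range01_sketch(x)
rescale01_sketch(x, xlim = c(0, 20))
rescale11_sketch(x)
```

Supplying xlim keeps several vectors on a common scale, which matters when they are drawn side by side.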
Find order of variables based on a specified scagnostic measure by maximizing the index values of that measure along the path.
scag_order(scag, vars, measure)
scag |
|
vars |
character vector of the variables to be ordered |
measure |
scagnostics measure to order according to |
character vector of variable ordered according to the given scagnostic measure
Barret Schloerke
Function for making scatterplots in the lower triangle and diagonal density plots.
scatmat(data, columns = 1:ncol(data), color = NULL, alpha = 1)
data |
a data matrix. Should contain numerical (continuous) data. |
columns |
an option to choose the column to be used in the raw dataset. Defaults to |
color |
an option to group the dataset by the factor variable and color them by different colors. Defaults to |
alpha |
an option to set the transparency in scatterplots for large data. Defaults to |
Mengjia Ni, Di Cook
# small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

data(flea)
p_(scatmat(flea, columns = 2:4))
p_(scatmat(flea, columns = 2:4, color = "species"))
Order axis variables by separation between one class and the rest (most separation to least).
singleClassOrder(classVar, axisVars, specClass = NULL)
classVar |
class variable (vector from original dataset) |
axisVars |
variables to be plotted as axes (data frame) |
specClass |
character string matching to level of |
character vector of names of axisVars ordered such that the first variable has the most separation between one of the classes and the rest, and the last variable has the least (as measured by F-statistics from an ANOVA)
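The ordering by ANOVA F-statistics can be sketched in base R for a single chosen class. This illustrates the idea rather than the package's implementation: the iris data, the choice of "setosa", and the helper f_statistic are assumptions made for the example.

```r
# F-statistic from a one-way ANOVA of one variable against a grouping
f_statistic <- function(values, groups) {
  anova(lm(values ~ groups))[["F value"]][1]
}

class_var <- iris$Species
axis_vars <- iris[, 1:4]

# One class versus the rest (here "setosa", chosen for illustration)
one_vs_rest <- factor(class_var == "setosa")

# One F-statistic per axis variable, then order from most to least separation
f_values <- vapply(axis_vars, f_statistic, numeric(1), groups = one_vs_rest)
ordered_vars <- names(sort(f_values, decreasing = TRUE))
ordered_vars
```

A larger F-statistic means the class means differ more relative to the within-group spread, so that variable separates the class more clearly.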
Jason Crowley
Calculate the sample skewness of a vector while ignoring missing values.
skewness(x)
x |
numeric vector |
sample skewness of x
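One common definition of sample skewness is the third central moment divided by the second central moment raised to the power 3/2. The sketch below uses that definition and drops missing values first. Whether the package uses exactly this estimator is an assumption, and skewness_sketch is a hypothetical name.

```r
skewness_sketch <- function(x) {
  x <- x[!is.na(x)]           # ignore missing values
  n <- length(x)
  centered <- x - mean(x)
  m2 <- sum(centered^2) / n   # second central moment
  m3 <- sum(centered^3) / n   # third central moment
  m3 / m2^(3 / 2)
}

skewness_sketch(c(1, 2, 3, 4, 5))    # symmetric data: zero skewness
skewness_sketch(c(1, 2, 3, NA, 20))  # long right tail: positive, NA ignored
```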
Jason Crowley
ggmatrix structure
View the condensed version of the ggmatrix object. The attribute "class" is ALWAYS altered to "_class" to avoid recursion.
## S3 method for class 'ggmatrix'
str(object, ..., raw = FALSE)
object |
|
... |
passed on to the default |
raw |
boolean to determine if the plots should be converted to text or kept as original objects |
One waiter recorded information about each tip he received over a period of a few months working in one restaurant. He collected several variables:
tips
A data frame with 244 rows and 7 variables
tip in dollars,
bill in dollars,
sex of the bill payer,
whether there were smokers in the party,
day of the week,
time of day,
size of the party.
In all he recorded 244 tips. The data was reported in a collection of case studies for business statistics (Bryant & Smith 1995).
Bryant, P. G. and Smith, M. (1995) Practical Data Analysis: Case Studies in Business Statistics. Homewood, IL: Richard D. Irwin Publishing.
A network of spambots found on Twitter as part of a data mining project.
data(twitter_spambots)
An object of class network with 120 edges and 94 vertices.
Each node of the network is identified by the Twitter screen name of the account and further carries five vertex attributes:
location user's location, as provided by the user
lat latitude, based on the user's location
lon longitude, based on the user's location
followers number of Twitter accounts that follow this account
friends number of Twitter accounts followed by the account
Amos Elberg
ggscatmat function
Function for making the dataset used to plot the upper-triangle plots.
uppertriangle( data, columns = 1:ncol(data), color = NULL, corMethod = "pearson" )
data |
a data matrix. Should contain numerical (continuous) data. |
columns |
an option to choose the column to be used in the raw dataset. Defaults to |
color |
an option to choose a factor variable to be grouped with. Defaults to |
corMethod |
method argument supplied to |
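Since the upper triangle of a scatterplot matrix typically holds pairwise correlations, the core computation can be sketched in base R: compute the correlation matrix with the chosen method and keep each pair once. The output column names (xlab, ylab, r) and the use of iris are assumptions for illustration, not the function's documented return format.

```r
numeric_data <- iris[, 1:4]
cor_matrix <- cor(numeric_data, method = "pearson")

# Keep each pair of variables once: the entries above the diagonal
upper <- which(upper.tri(cor_matrix), arr.ind = TRUE)
pair_cor <- data.frame(
  xlab = colnames(cor_matrix)[upper[, "col"]],
  ylab = rownames(cor_matrix)[upper[, "row"]],
  r = cor_matrix[upper]
)
pair_cor
```

Swapping method = "pearson" for "spearman" or "kendall" changes the correlation measure without changing the structure of the result.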
Mengjia Ni, Di Cook
data(flea)
head(uppertriangle(flea, columns = 2:4))
head(uppertriangle(flea))
head(uppertriangle(flea, color = "species"))
ggmatrix object by adding a ggplot2 object to all
Modify a ggmatrix object by adding a ggplot2 object to all
v1_ggmatrix_theme()
# Small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

p_(ggpairs(iris, 1:2) + v1_ggmatrix_theme())
# move the column names to the left and bottom
p_(ggpairs(iris, 1:2, switch = "both") + v1_ggmatrix_theme())
This function will open directly to the requested vignette. If no name is provided, the index of all GGally vignettes will be opened.
vig_ggally(name)
name |
Vignette name to open. If no name is provided, the vignette index will be opened |
This method allows vignettes to be hosted remotely, reducing GGally's package size and installation time.
# View `ggnostic` vignette
vig_ggally("ggnostic")

# View all vignettes by GGally
vig_ggally()
Wraps a function with the supplied parameters to force different default behavior. This is useful for functions that are supplied to ggpairs. It allows you to change the behavior of one function, rather than creating multiple functions with different parameter settings.
wrap_fn_with_param_arg(
  funcVal,
  params = NULL,
  funcArgName = deparse(substitute(funcVal))
)

wrapp(funcVal, params = NULL, funcArgName = deparse(substitute(funcVal)))

wrap(funcVal, ..., funcArgName = deparse(substitute(funcVal)))

wrap_fn_with_params(funcVal, ..., funcArgName = deparse(substitute(funcVal)))
funcVal |
function that the |
params |
named vector or list of parameters to be applied to the |
funcArgName |
name of function to be displayed |
... |
named parameters to be supplied to |
wrap is identical to wrap_fn_with_params. These functions take the new parameters as arguments.
wrapp is identical to wrap_fn_with_param_arg. These functions take the new parameters as a single list.
The params and fn attributes are there for debugging purposes. If either attribute is altered, the function must be re-wrapped to have the changes take effect.
a function(data, mapping, ...){} that will wrap the original function with the parameters applied as arguments
# small function to display plots only if it's interactive
p_ <- GGally::print_if_interactive

# example function that prints 'val'
fn <- function(data, mapping, val = 2) {
  print(val)
}
fn(data = NULL, mapping = NULL) # 2

# wrap function to change default value 'val' to 5 instead of 2
wrapped_fn1 <- wrap(fn, val = 5)
wrapped_fn1(data = NULL, mapping = NULL) # 5
# you may still supply regular values
wrapped_fn1(data = NULL, mapping = NULL, val = 3) # 3

# wrap function to change 'val' to 5 using the arg list
wrapped_fn2 <- wrap_fn_with_param_arg(fn, params = list(val = 5))
wrapped_fn2(data = NULL, mapping = NULL) # 5

# change parameter settings in ggpairs for a particular function
## Goal output:
regularPlot <- ggally_points(
  iris,
  ggplot2::aes(Sepal.Length, Sepal.Width),
  size = 5, color = "red"
)
p_(regularPlot)

# Wrap ggally_points to have parameter values size = 5 and color = 'red'
w_ggally_points <- wrap(ggally_points, size = 5, color = "red")
wrappedPlot <- w_ggally_points(
  iris,
  ggplot2::aes(Sepal.Length, Sepal.Width)
)
p_(wrappedPlot)

# Double check the aes parameters are the same for the geom_point layer
identical(regularPlot$layers[[1]]$aes_params, wrappedPlot$layers[[1]]$aes_params)

# Use a wrapped function in ggpairs
pm <- ggpairs(iris, 1:3, lower = list(continuous = wrap(ggally_points, size = 5, color = "red")))
p_(pm)
pm <- ggpairs(iris, 1:3, lower = list(continuous = w_ggally_points))
p_(pm)