---
title: "State Space Grid Plotting Tutorial"
author: Miriam Brinberg
output:
  rmdformats::robobook:
  html_document: default
  word_document: default
editor_options:
  chunk_output_type: console
---

# Overview

This tutorial provides R code for plotting state space grids (Hollenstein, 2013). Typically, state space grids are used to plot longitudinal dyadic (or bivariate) data. In these plots, each dyad member is represented on one axis. The information provided by each dyad member can be a continuous (e.g., 0 - 100 report of emotion), ordinal (e.g., highly negative to neutral to highly positive affect), or categorical (e.g., conversation turn type) variable. Each dyad member's value on the variable at each time point is plotted in the x-y state space grid, and points consecutive in time are connected. The goal of plotting state space grids is to visually understand how dyads' behaviors change (or not) over time.

In this example, we plot the turn transitions that occurred during a subset (*N* = 59) of support conversations between strangers in which one dyad member disclosed a current problem. Each turn in the conversations between stranger dyads was coded (e.g., as an acknowledgement, question, hedged disclosure, etc.; see Bodie et al., 2021 in the *Journal of Language and Social Psychology* for more details about the creation of the turn typology). We were interested in how the transitions differed between dyads and how the use of these transitions changed throughout a conversation (more detail about related analyses can be found in Solomon et al., 2021, *Journal of Communication*).

Note that the accompanying "StateSpaceGridPlots_Tutorial_2022July26.rmd" file contains all of the code presented in this tutorial and can be opened in RStudio (a somewhat friendlier user interface to R).

# Outline

In this tutorial, we'll cover...

* Reading in the data and loading needed packages.
* Data preparation.
* Plotting an exemplar dyad.
* Running a loop that will plot all dyads and save the plots in a PDF.

# Read in the data and load needed libraries.

**Let's read the data into R.**

The data set we are working with is called "StrangerConversations_N59" and is stored as a .csv file (comma-separated values file, which can be created by saving an Excel file as a csv document) on my computer's desktop.

```{r}
# Set working directory (i.e., where your data file is stored)
# This can be done by going to the top bar of RStudio and selecting
# "Session" --> "Set Working Directory" --> "Choose Directory" -->
# finding the location of your file
setwd("~/Desktop")

# Read in the data
data <- read.csv(file = "StrangerConversations_N59.csv", header = TRUE, sep = ",")

# View the first 10 rows of the data
head(data, 10)
```

In the data, we can see that each row contains information for one turn and that there are multiple rows (i.e., multiple turns) for each dyad. Specifically, there is a column for:

* Dyad ID (`id`)
* Time variable - in this case, turn in the conversation (`turn`)
* Dyad member ID - in this case, role in the conversation (`role`; discloser = 1, listener = 2)
* Turn type - in this case, based upon the typology derived in Bodie et al., 2021 (`turn_type`)

**Load the R packages we need.**

Packages in R are collections of functions (and their documentation/explanations) that enable us to conduct particular tasks, such as plotting or fitting a statistical model.

```{r, message = FALSE, warning = FALSE}
# install.packages("devtools") # Install package if you have never used it before
library(devtools) # For session_info(), which reports R and package versions

# install.packages("dplyr") # Install package if you have never used it before
library(dplyr) # For data management

# install.packages("ggplot2") # Install package if you have never used it before
library(ggplot2) # For plotting

# install.packages("zoo") # Install package if you have never used it before
library(zoo) # For management of time series data
```

# Data preparation.
Before plotting the state space grids, we need to create a few additional variables that will make the plotting process easier.

First, we create two new variables that represent the listeners' turn types and disclosers' turn types in numeric form, which will be helpful later for plotting the turn types on the x- and y-axes of the state space grids. The value assigned to each turn type determines the ordering of the turns in the state space grid. The lowest values will appear in the bottom/left of the state space grid and the highest values will appear in the top/right of the state space grid. We chose to order the turns on each axis based upon their hypothesized frequency of use for each conversational role. For instance, we would expect disclosers to spend most of the conversation describing their problem using elaboration and hedged disclosure turns and less of the conversation providing advice to the listener; thus, we placed these turn types on opposite ends of the y-axis.

```{r}
# Create new variable that represents listeners' turn types in numeric form

# Create new variable "listener_value" that has no values assigned to it
data$listener_value <- NA

# Assign the "listener_value" variable values for each turn type (turn_type)
# for listeners (role = 2)
data$listener_value[data$role == 2 & data$turn_type == "Acknowledgement"] <- 0.01
data$listener_value[data$role == 2 & data$turn_type == "Question"] <- 1.01
data$listener_value[data$role == 2 & data$turn_type == "Reflection"] <- 2.01
data$listener_value[data$role == 2 & data$turn_type == "HedgedDisclosure"] <- 3.01
data$listener_value[data$role == 2 & data$turn_type == "Elaboration"] <- 4.01
data$listener_value[data$role == 2 & data$turn_type == "Advice"] <- 5.01

# Create new variable that represents disclosers' turn types in numeric form

# Create new variable "discloser_value" that has no values assigned to it
data$discloser_value <- NA

# Assign the "discloser_value" variable values for each turn type (turn_type)
# for disclosers (role = 1)
data$discloser_value[data$role == 1 & data$turn_type == "HedgedDisclosure"] <- 0.01
data$discloser_value[data$role == 1 & data$turn_type == "Elaboration"] <- 1.01
data$discloser_value[data$role == 1 & data$turn_type == "Reflection"] <- 2.01
data$discloser_value[data$role == 1 & data$turn_type == "Question"] <- 3.01
data$discloser_value[data$role == 1 & data$turn_type == "Acknowledgement"] <- 4.01
data$discloser_value[data$role == 1 & data$turn_type == "Advice"] <- 5.01

# View the first 10 rows of the data
head(data, 10)
```

Second, we fill in the NA values for the "listener_value" and "discloser_value" variables such that the value in one row carries forward to the NA that follows. We do this so R has values to plot for each row of the data. An NA at the very start of a conversation stays NA because there is no earlier value to carry forward (`na.rm = FALSE` keeps those rows rather than dropping them).

```{r}
data <-
  # Select data
  data %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create new variables that carry forward values using the "na.locf" function
  mutate(listener_value = na.locf(listener_value, na.rm = FALSE),
         discloser_value = na.locf(discloser_value, na.rm = FALSE)) %>%
  # Save the data as a data.frame
  as.data.frame()

# View the first 10 rows of the data
head(data, 10)
```

Third, we create lagged variables for "listener_value" and "discloser_value" to represent turn transitions in the same row (i.e., at the same time point). Stated differently, we want each row to contain information about the current turn (e.g., "listener_value") and the prior turn (e.g., "discloser_value_lag"). We do this so R can plot the values from two consecutive time points using information from each row of the data.

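
To make the carry-forward and lag steps concrete, here is a minimal sketch on a small made-up data frame. The object `toy` and its values are hypothetical and are not taken from the tutorial data; the column names simply mirror the ones used above.

```{r}
library(dplyr) # for lag()
library(zoo)   # for na.locf()

# Made-up values for one hypothetical dyad (not the tutorial data):
# turns alternate between discloser and listener, so each role's
# value is NA on the other role's turns
toy <- data.frame(
  turn            = 1:4,
  listener_value  = c(NA, 0.01, NA, 1.01),
  discloser_value = c(1.01, NA, 0.01, NA)
)

# Carry each role's most recent value forward into the NA that follows;
# the leading NA in listener_value stays NA
toy$listener_value  <- na.locf(toy$listener_value,  na.rm = FALSE)
toy$discloser_value <- na.locf(toy$discloser_value, na.rm = FALSE)

# Lag both columns so each row also holds the previous row's values
toy$listener_value_lag  <- dplyr::lag(toy$listener_value)
toy$discloser_value_lag <- dplyr::lag(toy$discloser_value)

toy
```

Each row of `toy` now holds a start point (the two lagged values) and an end point (the two current values), which is the shape a line segment needs. Applied to the full data set and grouped by dyad, the lag step looks as follows.
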
```{r}
data <-
  # Select data
  data %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create lagged values for the "listener_value" and "discloser_value" variables
  # (dplyr::lag is named explicitly so that stats::lag is not used by accident)
  mutate(listener_value_lag = dplyr::lag(listener_value),
         discloser_value_lag = dplyr::lag(discloser_value))

# View the first 10 rows of the data
head(data, 10)
```

Finally, we shift the "listener_value", "discloser_value", "listener_value_lag", and "discloser_value_lag" variables slightly so they are not all plotted on top of each other. For instance, all listener acknowledgements currently have a value of 0.01. Adding a small (and increasing) value to the acknowledgement value ensures that each occurrence of a listener acknowledgement is plotted at a slightly different point on the state space grid. Because each cell is one unit wide, this shift keeps points inside their cell only for conversations shorter than about 100 turns; longer conversations would need a smaller increment.

```{r}
data <-
  # Select data
  data %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create new variables
  # First create "increment" variable that counts
  # from 0.01 to the end of the conversation by 0.01
  # For instance, if there are 20 turns or rows in the conversation,
  # the "increment" variable will go 0.01, 0.02, ..., 0.19, 0.20
  mutate(increment = 1:n() / 100,
         # Create variables that add the increment variable to "listener_value",
         # "discloser_value", "listener_value_lag", and "discloser_value_lag"
         listener_value_shift = listener_value + increment,
         discloser_value_shift = discloser_value + increment,
         listener_value_lag_shift = listener_value_lag + increment,
         discloser_value_lag_shift = discloser_value_lag + increment) %>%
  # Save the data as a data.frame
  as.data.frame()
```

The data are ready to go!

```{r}
# View the first 10 rows of the data
head(data, 10)
```

# Plot an exemplar dyad.

Before plotting all of the dyads' state space grids, let's plot one example dyad to see what the plot looks like. We are creating a 6 x 6 (36 cell) state space grid.
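
As a rough sketch of what "6 x 6 (36 cell)" means in coordinates, the corners of every cell can be listed in a small data frame. The object `toy_cells` is a hypothetical name introduced only for this illustration; the tutorial's own plotting code draws each cell with a separate `geom_rect()` call instead.

```{r}
# Hypothetical helper data frame (not part of the original tutorial):
# one row per cell of a 6 x 6 grid, holding that cell's corner coordinates
toy_cells <- expand.grid(xmin = 0:5, ymin = 0:5)
toy_cells$xmax <- toy_cells$xmin + 1
toy_cells$ymax <- toy_cells$ymin + 1

# Number of cells in the grid
nrow(toy_cells) # 36

# The cells span 0 to 6 on both axes
range(c(toy_cells$xmin, toy_cells$xmax))
range(c(toy_cells$ymin, toy_cells$ymax))
```

Each row corresponds to one `xmin`/`xmax`/`ymin`/`ymax` combination used for the grid cells in the plotting code. A data frame like this could in principle be handed to a single `geom_rect()` layer (with its own `data` and `aes()` and `inherit.aes = FALSE`), but writing the cells out one by one, as the tutorial does, makes it easy to set each cell's color by hand.
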

To create a state space grid, we assign each member of the dyad to an axis, with the behavior of interest represented on each axis. Here, we assign listeners’ turn types to the x-axis and disclosers’ turn types to the y-axis. Since our behaviors of interest are categorical (in contrast to behaviors that may be measured on a continuous or ordinal scale, such as affect), the order of the categories on each axis is a choice; as described in the data preparation section, we ordered the turns by their hypothesized frequency of use for each conversational role. The intersection of the axes creates cells within the state space grid, each of which represents a particular state within the dynamic system.

```{r}
plot_example <-
  # Choose the data, set one dyad member and their values for the x-axis,
  # and the other dyad member and their values for the y-axis
  ggplot(data = data[data$id == 105, ],
         aes(x = listener_value_shift, y = discloser_value_shift)) +

  # Create title for plot
  ggtitle("Dyad = 105") +

  # Create each cell of the state space grid by setting its width (xmin and xmax),
  # height (ymin and ymax), and color (fill using hex code: https://www.color-hex.com/)
  geom_rect(xmin = 0, xmax = 1, ymin = 0, ymax = 1, fill = "#F8766D") +
  geom_rect(xmin = 0, xmax = 1, ymin = 1, ymax = 2, fill = "#C69037") +
  geom_rect(xmin = 0, xmax = 1, ymin = 2, ymax = 3, fill = "#7C9853") +
  geom_rect(xmin = 0, xmax = 1, ymin = 3, ymax = 4, fill = "#7C9C86") +
  geom_rect(xmin = 0, xmax = 1, ymin = 4, ymax = 5, fill = "#AD89B6") +
  geom_rect(xmin = 0, xmax = 1, ymin = 5, ymax = 6, fill = "#EA74B4") +

  geom_rect(xmin = 1, xmax = 2, ymin = 0, ymax = 1, fill = "#C69037") +
  geom_rect(xmin = 1, xmax = 2, ymin = 1, ymax = 2, fill = "#93AA00") +
  geom_rect(xmin = 1, xmax = 2, ymin = 2, ymax = 3, fill = "#4AB21C") +
  geom_rect(xmin = 1, xmax = 2, ymin = 3, ymax = 4, fill = "#4AB650") +
  geom_rect(xmin = 1, xmax = 2, ymin = 4, ymax = 5, fill = "#7AA380") +
  geom_rect(xmin = 1, xmax = 2, ymin = 5, ymax = 6, fill = "#B78E7E") +

  geom_rect(xmin = 2, xmax = 3, ymin = 0, ymax = 1, fill = "#7C9853") +
  geom_rect(xmin = 2, xmax = 3, ymin = 1, ymax = 2, fill = "#4AB21C") +
  geom_rect(xmin = 2, xmax = 3, ymin = 2, ymax = 3, fill = "#00BA38") +
  geom_rect(xmin = 2, xmax = 3, ymin = 3, ymax = 4, fill = "#00BE6C") +
  geom_rect(xmin = 2, xmax = 3, ymin = 4, ymax = 5, fill = "#31AB9C") +
  geom_rect(xmin = 2, xmax = 3, ymin = 5, ymax = 6, fill = "#6E969A") +

  geom_rect(xmin = 3, xmax = 4, ymin = 0, ymax = 1, fill = "#7C9C86") +
  geom_rect(xmin = 3, xmax = 4, ymin = 1, ymax = 2, fill = "#4AB650") +
  geom_rect(xmin = 3, xmax = 4, ymin = 2, ymax = 3, fill = "#00BE6C") +
  geom_rect(xmin = 3, xmax = 4, ymin = 3, ymax = 4, fill = "#00C19F") +
  geom_rect(xmin = 3, xmax = 4, ymin = 4, ymax = 5, fill = "#31AFCF") +
  geom_rect(xmin = 3, xmax = 4, ymin = 5, ymax = 6, fill = "#6E9ACD") +

  geom_rect(xmin = 4, xmax = 5, ymin = 0, ymax = 1, fill = "#AD89B6") +
  geom_rect(xmin = 4, xmax = 5, ymin = 1, ymax = 2, fill = "#7AA380") +
  geom_rect(xmin = 4, xmax = 5, ymin = 2, ymax = 3, fill = "#31AB9C") +
  geom_rect(xmin = 4, xmax = 5, ymin = 3, ymax = 4, fill = "#31AFCF") +
  geom_rect(xmin = 4, xmax = 5, ymin = 4, ymax = 5, fill = "#619CFF") +
  geom_rect(xmin = 4, xmax = 5, ymin = 5, ymax = 6, fill = "#9E87FD") +

  geom_rect(xmin = 5, xmax = 6, ymin = 0, ymax = 1, fill = "#EA74B4") +
  geom_rect(xmin = 5, xmax = 6, ymin = 1, ymax = 2, fill = "#B78E7E") +
  geom_rect(xmin = 5, xmax = 6, ymin = 2, ymax = 3, fill = "#6E969A") +
  geom_rect(xmin = 5, xmax = 6, ymin = 3, ymax = 4, fill = "#6E9ACD") +
  geom_rect(xmin = 5, xmax = 6, ymin = 4, ymax = 5, fill = "#9E87FD") +
  geom_rect(xmin = 5, xmax = 6, ymin = 5, ymax = 6, fill = "#DB72FB") +

  # Create point that represents the x-y intersection point of the dyads' variables
  # Time (turn) controls the color of the point
  # Alpha controls the opacity of the point
  geom_point(aes(color = turn), alpha = 1) +

  # Create line that connects the consecutive points in time,
  # with the line changing color as time goes on
  # x: starting value of segment on the x-axis (prior listener turn type)
  # y: starting value of segment on the y-axis (prior discloser turn type)
  # xend: ending value of segment on the x-axis (current listener turn type)
  # yend: ending value of segment on the y-axis (current discloser turn type)
  # colour: controls the color of the segment, in this case, by time (turn) in the conversation
  geom_segment(aes(x = listener_value_lag_shift, y = discloser_value_lag_shift,
                   xend = listener_value_shift, yend = discloser_value_shift,
                   colour = turn)) +

  # Color points and line segments from white to black by time
  scale_color_gradient(name = 'Turn Order', low = 'white', high = 'black') +

  # Label for x-axis
  xlab('Listener Turn Type') +

  # Label for y-axis
  ylab('Discloser Turn Type') +

  # Additional plot aesthetics
  theme(
    # Adjust text size, color, and angle on x-axis and y-axis
    axis.text.x = element_text(size = 14, color = 'black', angle = 45),
    axis.text.y = element_text(size = 14, color = 'black'),
    # Adjust size of axis titles
    axis.title = element_text(size = 18),
    # Adjust color and lines surrounding plot
    panel.grid.major = element_line(colour = "white"),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    axis.ticks = element_blank()) +

  # Vertical lines to divide grid
  geom_vline(xintercept = c(0, 1, 2, 3, 4, 5, 6)) +

  # Horizontal lines to divide grid
  geom_hline(yintercept = c(0, 1, 2, 3, 4, 5, 6)) +

  # X-axis ticks and labels
  scale_x_continuous(breaks = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
                     labels = c("Acknow", "Question", "Reflection",
                                "Hedged Disc", "Elab", "Advice")) +

  # Y-axis ticks and labels
  scale_y_continuous(breaks = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
                     labels = c("Hedged Disc", "Elab", "Reflection",
                                "Question", "Acknow", "Advice"))
```

Print state space grid plot example.

```{r, message = FALSE}
print(plot_example)
```

The state space grid illustrates how the dyad moves through the conversational landscape over time. Listener turn types are indexed along the x-axis, and discloser turn types are indexed along the y-axis. Points placed across the state space grid indicate all specific two-turn sequences, listener – discloser and discloser – listener, that the dyad used in the conversation.
Horizontal lines showing the listener’s moves and vertical lines showing the discloser’s moves get darker over the course of the conversation. In this case, the dyad began the conversation (light gray lines) in the bottom left of the state space grid, indicating the use of listener acknowledgement and reflection turn types and discloser elaboration and hedged disclosure turn types. As the conversation continued (the darker gray and black lines), the discloser still primarily used elaboration turns, but the listener used a greater variety of turn types.

Note: warning messages about "rows containing missing values" may appear. These warnings indicate that there are missing values in the variables used to draw the points (geom_point) or segments (geom_segment) on the plot. We expect there to be missing values because (1) the first turn in the conversation does not follow from a previous turn, resulting in a missing transition, and (2) some of our turns were "uncodable" and thus were not assigned a turn type code. So, we are able to ignore this warning message.

A note on accessibility: To make your plots accessible, you may consider adopting a colorblind-friendly palette. David Nichols' website (https://davidmathlogic.com/colorblind/) provides a great explainer on this issue, as well as a color picking tool.

# Run a Loop that will Plot all Dyads and Save the Plots in a PDF.

Now that we know how to create a plot for a single dyad, we will create a loop that plots each dyad in the data set and saves these plots to a PDF.

First, identify the location where you would like to save the PDF, save this location as the object "dir", and set it as the working directory so that the PDF is written there.

```{r}
# This can be done by going to the top bar of RStudio and selecting
# "Session" --> "Set Working Directory" --> "Choose Directory" -->
# finding the location of where you want your file

# Store the path first: setwd() returns the previous working directory,
# not the new one, so assigning its result would save the wrong location
dir <- "~/Desktop"
setwd(dir)
```

Second, create a vector of all the IDs in the data set.
```{r}
# Create vector
idlist <- unique(data$id)

# View contents of list
idlist
```

Note: the first number in brackets ([1]) is not part of the vector; it is just a counter that indicates the first (or 20th, 39th, or 58th) element of the vector. The numbers following the brackets are the IDs contained in the data set.

Finally, create and run the loop for the plots.

```{r, eval = FALSE}
# Open the pdf file
pdf('Stranger Conversation State Space Grids.pdf', width = 10, height = 7)

# Loop through the dyads, creating one plot per dyad
for (x in seq_along(idlist)) {

  # Select dyad ID from the list of IDs
  subject_id <- idlist[x]

  # Subset data for selected dyad ID
  data_sub <- subset(data, id == subject_id)

  # Create object with dyad ID
  name <- as.character(data_sub$id[1])

  # Choose the data, set one dyad member and their values for the x-axis,
  # and the other dyad member and their values for the y-axis
  plot <- ggplot(data = data_sub,
                 aes(x = listener_value_shift, y = discloser_value_shift)) +

    # Create title for plot by combining "Dyad = " with the name object created above
    ggtitle(paste("Dyad =", name)) +

    # Create each cell of the state space grid by setting its width (xmin and xmax),
    # height (ymin and ymax), and color (fill using hex code: https://www.color-hex.com/)
    geom_rect(xmin = 0, xmax = 1, ymin = 0, ymax = 1, fill = "#F8766D") +
    geom_rect(xmin = 0, xmax = 1, ymin = 1, ymax = 2, fill = "#C69037") +
    geom_rect(xmin = 0, xmax = 1, ymin = 2, ymax = 3, fill = "#7C9853") +
    geom_rect(xmin = 0, xmax = 1, ymin = 3, ymax = 4, fill = "#7C9C86") +
    geom_rect(xmin = 0, xmax = 1, ymin = 4, ymax = 5, fill = "#AD89B6") +
    geom_rect(xmin = 0, xmax = 1, ymin = 5, ymax = 6, fill = "#EA74B4") +

    geom_rect(xmin = 1, xmax = 2, ymin = 0, ymax = 1, fill = "#C69037") +
    geom_rect(xmin = 1, xmax = 2, ymin = 1, ymax = 2, fill = "#93AA00") +
    geom_rect(xmin = 1, xmax = 2, ymin = 2, ymax = 3, fill = "#4AB21C") +
    geom_rect(xmin = 1, xmax = 2, ymin = 3, ymax = 4, fill = "#4AB650") +
    geom_rect(xmin = 1, xmax = 2, ymin = 4, ymax = 5, fill = "#7AA380") +
    geom_rect(xmin = 1, xmax = 2, ymin = 5, ymax = 6, fill = "#B78E7E") +

    geom_rect(xmin = 2, xmax = 3, ymin = 0, ymax = 1, fill = "#7C9853") +
    geom_rect(xmin = 2, xmax = 3, ymin = 1, ymax = 2, fill = "#4AB21C") +
    geom_rect(xmin = 2, xmax = 3, ymin = 2, ymax = 3, fill = "#00BA38") +
    geom_rect(xmin = 2, xmax = 3, ymin = 3, ymax = 4, fill = "#00BE6C") +
    geom_rect(xmin = 2, xmax = 3, ymin = 4, ymax = 5, fill = "#31AB9C") +
    geom_rect(xmin = 2, xmax = 3, ymin = 5, ymax = 6, fill = "#6E969A") +

    geom_rect(xmin = 3, xmax = 4, ymin = 0, ymax = 1, fill = "#7C9C86") +
    geom_rect(xmin = 3, xmax = 4, ymin = 1, ymax = 2, fill = "#4AB650") +
    geom_rect(xmin = 3, xmax = 4, ymin = 2, ymax = 3, fill = "#00BE6C") +
    geom_rect(xmin = 3, xmax = 4, ymin = 3, ymax = 4, fill = "#00C19F") +
    geom_rect(xmin = 3, xmax = 4, ymin = 4, ymax = 5, fill = "#31AFCF") +
    geom_rect(xmin = 3, xmax = 4, ymin = 5, ymax = 6, fill = "#6E9ACD") +

    geom_rect(xmin = 4, xmax = 5, ymin = 0, ymax = 1, fill = "#AD89B6") +
    geom_rect(xmin = 4, xmax = 5, ymin = 1, ymax = 2, fill = "#7AA380") +
    geom_rect(xmin = 4, xmax = 5, ymin = 2, ymax = 3, fill = "#31AB9C") +
    geom_rect(xmin = 4, xmax = 5, ymin = 3, ymax = 4, fill = "#31AFCF") +
    geom_rect(xmin = 4, xmax = 5, ymin = 4, ymax = 5, fill = "#619CFF") +
    geom_rect(xmin = 4, xmax = 5, ymin = 5, ymax = 6, fill = "#9E87FD") +

    geom_rect(xmin = 5, xmax = 6, ymin = 0, ymax = 1, fill = "#EA74B4") +
    geom_rect(xmin = 5, xmax = 6, ymin = 1, ymax = 2, fill = "#B78E7E") +
    geom_rect(xmin = 5, xmax = 6, ymin = 2, ymax = 3, fill = "#6E969A") +
    geom_rect(xmin = 5, xmax = 6, ymin = 3, ymax = 4, fill = "#6E9ACD") +
    geom_rect(xmin = 5, xmax = 6, ymin = 4, ymax = 5, fill = "#9E87FD") +
    geom_rect(xmin = 5, xmax = 6, ymin = 5, ymax = 6, fill = "#DB72FB") +

    # Create point that represents the x-y point of the dyads' variables
    # Time (turn) controls the color of the point
    # Alpha controls the opacity of the point
    geom_point(aes(color = turn), alpha = 1) +

    # Create line that connects the consecutive points in time,
    # with the line changing color as time goes on
    # x: starting value of segment on the x-axis (prior listener turn type)
    # y: starting value of segment on the y-axis (prior discloser turn type)
    # xend: ending value of segment on the x-axis (current listener turn type)
    # yend: ending value of segment on the y-axis (current discloser turn type)
    # colour: controls the color of the segment, in this case, by time (turn) in the conversation
    geom_segment(aes(x = listener_value_lag_shift, y = discloser_value_lag_shift,
                     xend = listener_value_shift, yend = discloser_value_shift,
                     colour = turn)) +

    # Color points and line segments from white to black by time
    scale_color_gradient(name = 'Turn Order', low = 'white', high = 'black') +

    # Label for x-axis
    xlab('Listener Turn Type') +

    # Label for y-axis
    ylab('Discloser Turn Type') +

    # Additional plot aesthetics
    theme(
      # Adjust text size, color, and angle on x-axis and y-axis
      axis.text.x = element_text(size = 14, color = 'black', angle = 45),
      axis.text.y = element_text(size = 14, color = 'black'),
      # Adjust size of axis titles
      axis.title = element_text(size = 18),
      # Adjust color and lines surrounding plot
      panel.grid.major = element_line(colour = "white"),
      panel.grid.minor = element_blank(),
      panel.background = element_blank(),
      axis.ticks = element_blank()) +

    # Vertical lines to divide grid
    geom_vline(xintercept = c(0, 1, 2, 3, 4, 5, 6)) +

    # Horizontal lines to divide grid
    geom_hline(yintercept = c(0, 1, 2, 3, 4, 5, 6)) +

    # X-axis ticks and labels
    scale_x_continuous(breaks = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
                       labels = c("Acknow", "Question", "Reflection",
                                  "Hedged Disc", "Elab", "Advice")) +

    # Y-axis ticks and labels
    scale_y_continuous(breaks = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
                       labels = c("Hedged Disc", "Elab", "Reflection",
                                  "Question", "Acknow", "Advice"))

  # Print plot (adds one page to the open pdf file)
  print(plot)
}

# Close the pdf file
dev.off()
```

Hooray for plotting!

-----

### Additional Information

We created this tutorial with a system environment and versions of R and packages that might be different from yours. If R reports errors when you attempt to run this tutorial, running the code chunk below and comparing your output with that of the tutorial posted on the LHAMA website may be helpful.

```{r}
session_info(pkgs = c("attached"))
```
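
If the devtools package is not available, base R's `sessionInfo()` reports similar information. The sketch below is an addition to the tutorial, and `info` is a name chosen only for this illustration.

```{r}
# Base R alternative (an addition, not part of the original tutorial):
# report the R version and the attached packages without devtools
info <- sessionInfo()

# Version of R that is running
info$R.version$version.string

# Names of the attached (non-base) packages; NULL if none are attached
names(info$otherPkgs)
```

Printing `info` itself shows the full summary, including the platform and the versions of the attached packages.
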