State Space Grid Metric Tutorial
Overview
This tutorial provides R code for constructing a state space grid and deriving several metrics from it.
State space grids are typically used to depict and quantify longitudinal dyadic (or bivariate) data in a 2-dimensional space (Hollenstein, 2013). In this example, we plot the speaking turn behaviors enacted by each dyad member during a support conversation in which one dyad member shared a current problem they were facing. Each turn in the conversations was coded into one of six turn types (e.g., acknowledgement, question, advice).
Here, we demonstrate how to calculate three metrics illustrated in the accompanying paper (“Using State Space Grids to Quantify and Examine Dynamics of Dyadic Conversation” by Brinberg, Solomon, Bodie, Jones, & Ram in Communication Methods and Measures):
- Entropy as a measure of behavioral flexibility
- Time using problem description behaviors as a measure of an attractor
- Time of exit from the problem description attractor as a measure of a phase shift
We also demonstrate how to examine the association between the state space grid-derived metrics and a conversational outcome (in this case, distress) using multiple regression.
Note that the accompanying “SSGMetric_Tutorial.rmd” file contains all of the code presented in this tutorial and can be opened in RStudio (a more user-friendly interface to R). Working from this file means you do not have to copy and paste code, and it makes it easy to rename the variables in the current code to match your own data.
Thank you to Jon Helm for writing the original state space grid plotting and entropy code.
Outline
In this tutorial, we’ll cover…
- Reading in the data and loading needed packages.
- General data preparation.
- Entropy.
- Problem description attractor.
- Attractor exit time.
- SSG metrics and between-dyad differences.
Read in the data and load needed libraries.
Let’s read the data into R.
The exemplar data set we are working with is called “StrangerConversations_N59” and is stored as a .csv file (comma-separated values file, which can be created by saving an Excel file as a csv document) on my computer’s desktop.
# Set working directory (i.e., where your data file is stored)
# This can be done by going to the top bar of RStudio and selecting "Session" --> "Set Working Directory" --> "Choose Directory" --> finding the location of your file
setwd("~/Desktop")
# Read in the repeated measures data
data <- read.csv(file = "StrangerConversations_N59.csv", header = TRUE, sep = ",")
# View the first 10 rows of the repeated measures data
head(data, 10)
## id turn role turn_type
## 1 105 1 1 Question
## 2 105 2 2 Acknowledgement
## 3 105 3 1 Elaboration
## 4 105 4 2 Acknowledgement
## 5 105 5 1 Elaboration
## 6 105 6 2 Acknowledgement
## 7 105 7 1 Elaboration
## 8 105 8 2 Elaboration
## 9 105 9 1 Elaboration
## 10 105 10 2 Reflection
# Read in the outcomes data
outcomes <- read.csv(file = "StrangerConversations_N59_Outcomes.csv", header = TRUE, sep = ",")
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id distress
## 1 3 3
## 2 11 1
## 3 12 2
## 4 14 2
## 5 31 1
## 6 38 2
## 7 45 1
## 8 54 3
## 9 55 2
## 10 58 3
In the data, we can see each row contains information for one utterance and there are multiple rows (i.e., multiple utterances) for each dyad. Specifically, there is a column for:
- Dyad ID (id)
- Time variable - in this case, turn in the conversation (turn)
- Dyad member ID - in this case, role in the conversation (role; discloser = 1, listener = 2)
- Turn type - in this case, a typology of six different speaking turn behaviors: acknowledgement, advice, elaboration, hedged disclosure, question, and reflection (turn_type)
In the outcome data (“outcomes”), we can see there is one row for each dyad and there are columns for:
- Dyad ID (id)
- Outcome variable - in this case, post-conversation report of distress by the support receiver (distress)
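Before building the grid, it can help to confirm that the long-format data match this description. The following is a minimal sketch on a small made-up data frame; `toy` and `expected_types` are illustrative names that do not appear in the tutorial data, and the label spellings are assumed to match those used in the code later on. The same checks could be run on `data`.

```r
# Made-up long-format data: one row per speaking turn (illustrative only)
toy <- data.frame(
  id        = c(1, 1, 1, 2, 2),
  turn      = c(1, 2, 3, 1, 2),
  role      = c(1, 2, 1, 1, 2),
  turn_type = c("Elaboration", "Question", "Elaboration",
                "HedgedDisclosure", "Acknowledgement"),
  stringsAsFactors = FALSE
)

# Assumed spellings of the six turn types
expected_types <- c("Acknowledgement", "Advice", "Elaboration",
                    "HedgedDisclosure", "Question", "Reflection")

# Turn types outside the expected set (an empty result means none were found)
unexpected <- setdiff(unique(toy$turn_type), expected_types)

# Are roles coded only as 1 (discloser) or 2 (listener)?
roles_ok <- all(toy$role %in% c(1, 2))

# Does each dyad's turn counter run 1, 2, ..., n without gaps?
turns_ok <- all(tapply(toy$turn, toy$id, function(t) {
  identical(as.numeric(t), as.numeric(seq_along(t)))
}))
```

If `unexpected` is not empty (for example, because of a misspelled label or a missing code), those turns will not be assigned to a cell in the steps that follow.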
Load the R packages we need.
Packages in R are a collection of functions (and their documentation/explanations) that enable us to conduct particular tasks, such as plotting or fitting a statistical model.
# install.packages("devtools") # Install package if you have never used it before
library(devtools) # For reporting session information
# install.packages("dplyr") # Install package if you have never used it before
library(dplyr) # For data management
# install.packages("ggplot2") # Install package if you have never used it before
library(ggplot2) # For plotting
# install.packages("psych") # Install package if you have never used it before
library(psych) # For descriptive statistics
# install.packages("tidyr") # Install package if you have never used it before
library(tidyr) # For data management
# install.packages("vctrs") # Install package if you have never used it before
library(vctrs) # For data management
General data preparation.
Before calculating our state space grid metrics, we need to create a new variable that will assign each speaking turn transition pair to a cell (or state) in the state space grid.
To do so, we first create two new variables that represent the listeners’ turn types and disclosers’ turn types separately.
# Add "Discloser" to turn_type variable, then set all Listener (role = 2) turns to NA
data$discloser_turntype <- paste("Discloser", data$turn_type, sep = " ")
data$discloser_turntype[data$role == 2] <- NA
# Add "Listener" to turn_type variable, then set all Discloser (role = 1) turns to NA
data$listener_turntype <- paste("Listener", data$turn_type, sep = " ")
data$listener_turntype[data$role == 1] <- NA
# Reset missing values
data$listener_turntype[data$listener_turntype == "Listener NA"] <- NA
data$discloser_turntype[data$discloser_turntype == "Discloser NA"] <- NA
# View the first 10 rows of the data
head(data, 10)
## id turn role turn_type discloser_turntype listener_turntype
## 1 105 1 1 Question Discloser Question <NA>
## 2 105 2 2 Acknowledgement <NA> Listener Acknowledgement
## 3 105 3 1 Elaboration Discloser Elaboration <NA>
## 4 105 4 2 Acknowledgement <NA> Listener Acknowledgement
## 5 105 5 1 Elaboration Discloser Elaboration <NA>
## 6 105 6 2 Acknowledgement <NA> Listener Acknowledgement
## 7 105 7 1 Elaboration Discloser Elaboration <NA>
## 8 105 8 2 Elaboration <NA> Listener Elaboration
## 9 105 9 1 Elaboration Discloser Elaboration <NA>
## 10 105 10 2 Reflection <NA> Listener Reflection
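The paste-then-blank steps above can also be written as one conditional assignment per column. This is a sketch of an equivalent approach on made-up data (the `toy` data frame is illustrative), not the code used to produce the output in this tutorial.

```r
# Made-up turns, including one turn with a missing turn type
toy <- data.frame(
  role      = c(1, 2, 1, 2),
  turn_type = c("Question", "Acknowledgement", NA, "Reflection"),
  stringsAsFactors = FALSE
)

# Label discloser turns (role = 1); everything else, including missing turn types, is NA
toy$discloser_turntype <- ifelse(toy$role == 1 & !is.na(toy$turn_type),
                                 paste("Discloser", toy$turn_type), NA)

# Label listener turns (role = 2) in the same way
toy$listener_turntype <- ifelse(toy$role == 2 & !is.na(toy$turn_type),
                                paste("Listener", toy$turn_type), NA)
```

Because the missing-value check is part of the condition, no "Discloser NA" or "Listener NA" strings are created, so the separate reset step is not needed in this version.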
We then carry forward each listener and discloser turn type by one row. Specifically, within each dyad, when a row (i.e., a turn) is missing a value in the listener turn type column or the discloser turn type column, we fill it with the value from the prior turn. Each row then pairs the current speaker’s turn with their partner’s preceding turn.
data <- # Select data
data %>%
# Select grouping variable, in this case, dyad ID (id)
dplyr::group_by(id) %>%
# Fill in the turn type for "listener_turntype" and "discloser_turntype"
# such that the turn type is carried forward one row
# in order to capture turn transitions
dplyr::mutate(listener_turntype = vec_fill_missing(listener_turntype, max_fill = 1),
discloser_turntype = vec_fill_missing(discloser_turntype, max_fill = 1)) %>%
# Save the data as a data.frame
as.data.frame()
# View the first 10 rows of the data
head(data, 10)
## id turn role turn_type discloser_turntype listener_turntype
## 1 105 1 1 Question Discloser Question <NA>
## 2 105 2 2 Acknowledgement Discloser Question Listener Acknowledgement
## 3 105 3 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 4 105 4 2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 5 105 5 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 6 105 6 2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 7 105 7 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 8 105 8 2 Elaboration Discloser Elaboration Listener Elaboration
## 9 105 9 1 Elaboration Discloser Elaboration Listener Elaboration
## 10 105 10 2 Reflection Discloser Elaboration Listener Reflection
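To see what the `max_fill = 1` argument does, here is a small sketch on a made-up vector (`x` is illustrative). It assumes the vctrs package is installed.

```r
library(vctrs)

# Made-up vector with one single gap and one gap of two missing values
x <- c("A", NA, NA, "B", NA)

# Carry values down, filling at most one missing value in a row
filled <- vec_fill_missing(x, direction = "down", max_fill = 1)
filled
# "A" "A" NA  "B" "B"
```

Only the first missing value after each observed value is filled. In the conversation data, this means a turn type is carried forward by one turn only, so a longer run of missing values is not filled in with a stale turn type.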
Next, we create a new variable (“ssg_cell”) in which we assign each combination of the disclosers’ and listeners’ turns to a cell in the state space grid. Since we have six turn types for both disclosers and listeners, we will have 36 states in which the conversation can exist.
data$ssg_cell <- NA
data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
data$discloser_turntype == "Discloser Elaboration"] <- "LAck_DElab"
data$ssg_cell[data$listener_turntype == "Listener Question" &
data$discloser_turntype == "Discloser Elaboration"] <- "LQues_DElab"
data$ssg_cell[data$listener_turntype == "Listener Reflection" &
data$discloser_turntype == "Discloser Elaboration"] <- "LRefl_DElab"
data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LAck_DHedg"
data$ssg_cell[data$listener_turntype == "Listener Question" &
data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LQues_DHedg"
data$ssg_cell[data$listener_turntype == "Listener Reflection" &
data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LRefl_DHedg"
data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
data$discloser_turntype == "Discloser Elaboration"] <- "LElab_DElab"
data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LElab_DHedg"
data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
data$discloser_turntype == "Discloser Reflection"] <- "LElab_DRefl"
data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
data$discloser_turntype == "Discloser Question"] <- "LElab_DQues"
data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
data$discloser_turntype == "Discloser Acknowledgement"] <- "LElab_DAck"
data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
data$discloser_turntype == "Discloser Elaboration"] <- "LHedg_DElab"
data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LHedg_DHedg"
data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
data$discloser_turntype == "Discloser Reflection"] <- "LHedg_DRefl"
data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
data$discloser_turntype == "Discloser Question"] <- "LHedg_DQues"
data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
data$discloser_turntype == "Discloser Acknowledgement"] <- "LHedg_DAck"
data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
data$discloser_turntype == "Discloser Reflection"] <- "LAck_DRefl"
data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
data$discloser_turntype == "Discloser Question"] <- "LAck_DQues"
data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
data$discloser_turntype == "Discloser Acknowledgement"] <- "LAck_DAck"
data$ssg_cell[data$listener_turntype == "Listener Question" &
data$discloser_turntype == "Discloser Reflection"] <- "LQues_DRefl"
data$ssg_cell[data$listener_turntype == "Listener Question" &
data$discloser_turntype == "Discloser Question"] <- "LQues_DQues"
data$ssg_cell[data$listener_turntype == "Listener Question" &
data$discloser_turntype == "Discloser Acknowledgement"] <- "LQues_DAck"
data$ssg_cell[data$listener_turntype == "Listener Reflection" &
data$discloser_turntype == "Discloser Reflection"] <- "LRefl_DRefl"
data$ssg_cell[data$listener_turntype == "Listener Reflection" &
data$discloser_turntype == "Discloser Question"] <- "LRefl_DQues"
data$ssg_cell[data$listener_turntype == "Listener Reflection" &
data$discloser_turntype == "Discloser Acknowledgement"] <- "LRefl_DAck"
data$ssg_cell[data$listener_turntype == "Listener Advice" &
data$discloser_turntype == "Discloser Elaboration"] <- "LAdv_DElab"
data$ssg_cell[data$listener_turntype == "Listener Advice" &
data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LAdv_DHedg"
data$ssg_cell[data$listener_turntype == "Listener Advice" &
data$discloser_turntype == "Discloser Reflection"] <- "LAdv_DRefl"
data$ssg_cell[data$listener_turntype == "Listener Advice" &
data$discloser_turntype == "Discloser Question"] <- "LAdv_DQues"
data$ssg_cell[data$listener_turntype == "Listener Advice" &
data$discloser_turntype == "Discloser Acknowledgement"] <- "LAdv_DAck"
data$ssg_cell[data$listener_turntype == "Listener Advice" &
data$discloser_turntype == "Discloser Advice"] <- "LAdv_DAdv"
data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
data$discloser_turntype == "Discloser Advice"] <- "LAck_DAdv"
data$ssg_cell[data$listener_turntype == "Listener Question" &
data$discloser_turntype == "Discloser Advice"] <- "LQues_DAdv"
data$ssg_cell[data$listener_turntype == "Listener Reflection" &
data$discloser_turntype == "Discloser Advice"] <- "LRefl_DAdv"
data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
data$discloser_turntype == "Discloser Advice"] <- "LHedg_DAdv"
data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
data$discloser_turntype == "Discloser Advice"] <- "LElab_DAdv"
# View the first 10 rows of the data
head(data, 10)
## id turn role turn_type discloser_turntype listener_turntype
## 1 105 1 1 Question Discloser Question <NA>
## 2 105 2 2 Acknowledgement Discloser Question Listener Acknowledgement
## 3 105 3 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 4 105 4 2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 5 105 5 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 6 105 6 2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 7 105 7 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 8 105 8 2 Elaboration Discloser Elaboration Listener Elaboration
## 9 105 9 1 Elaboration Discloser Elaboration Listener Elaboration
## 10 105 10 2 Reflection Discloser Elaboration Listener Reflection
## ssg_cell
## 1 <NA>
## 2 LAck_DQues
## 3 LAck_DElab
## 4 LAck_DElab
## 5 LAck_DElab
## 6 LAck_DElab
## 7 LAck_DElab
## 8 LElab_DElab
## 9 LElab_DElab
## 10 LRefl_DElab
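The 36 assignment statements above could also be generated from a small lookup table. The sketch below is one possible way to do this on made-up data; the `abbrev` lookup vector and the `toy` data frame are illustrative names, and the full turn type spellings are assumed to match those in the data.

```r
# Hypothetical lookup from full turn type names to the abbreviations used in the cell labels
abbrev <- c(Acknowledgement = "Ack", Advice = "Adv", Elaboration = "Elab",
            HedgedDisclosure = "Hedg", Question = "Ques", Reflection = "Refl")

# Made-up rows containing the two carried-forward columns
toy <- data.frame(
  listener_turntype  = c(NA, "Listener Acknowledgement", "Listener Elaboration"),
  discloser_turntype = c("Discloser Question", "Discloser Question", "Discloser Elaboration"),
  stringsAsFactors = FALSE
)

# Strip the role prefix, then look up the abbreviation (NA stays NA)
l_abbrev <- unname(abbrev[sub("^Listener ", "", toy$listener_turntype)])
d_abbrev <- unname(abbrev[sub("^Discloser ", "", toy$discloser_turntype)])

# Build the cell label; rows missing either member's turn type stay NA
toy$ssg_cell <- ifelse(is.na(l_abbrev) | is.na(d_abbrev), NA,
                       paste0("L", l_abbrev, "_D", d_abbrev))
toy$ssg_cell
# NA "LAck_DQues" "LElab_DElab"
```

A lookup like this is shorter and less prone to copy-and-paste slips, while the explicit assignments make every cell label visible in the script.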
Finally, we count the number of turns per conversation, which will serve as a control variable in a later analysis.
#create variable with total number of turns per dyad
total_turns <- # Select data
data %>%
# Select grouping variable, in this case, dyad ID (id)
dplyr::group_by(id) %>%
# Create a new variable ("total_turns") that represents the total number of turns
# within the conversation - i.e., the maximum turn number
dplyr::summarise(total_turns = max(turn)) %>%
# Save the data as a data.frame
as.data.frame()
# View the first 10 rows of the total_turns data
head(total_turns, 10)
## id total_turns
## 1 3 65
## 2 11 60
## 3 12 82
## 4 14 90
## 5 31 84
## 6 38 90
## 7 45 115
## 8 54 79
## 9 55 64
## 10 58 69
Merge “total_turns” into “outcomes.”
outcomes <- merge(outcomes, total_turns, by = "id")
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id distress total_turns
## 1 3 3 65
## 2 11 1 60
## 3 12 2 82
## 4 14 2 90
## 5 31 1 84
## 6 38 2 90
## 7 45 1 115
## 8 54 3 79
## 9 55 2 64
## 10 58 3 69
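By default, `merge()` keeps only the ids that appear in both data frames, so a dyad missing from either one is silently dropped. The sketch below uses made-up data frames (`outcomes_toy` and `turns_toy` are illustrative names) to show how to check for this.

```r
# Made-up outcome data for three dyads and turn counts for only two of them
outcomes_toy <- data.frame(id = c(1, 2, 3), distress = c(3, 1, 2))
turns_toy    <- data.frame(id = c(1, 2), total_turns = c(65, 60))

# Default merge: only ids present in both data frames are kept
inner <- merge(outcomes_toy, turns_toy, by = "id")

# all.x = TRUE keeps every row of the first data frame and fills gaps with NA
left <- merge(outcomes_toy, turns_toy, by = "id", all.x = TRUE)

# Which ids were lost in the default merge?
dropped_ids <- setdiff(outcomes_toy$id, inner$id)
dropped_ids
# 3
```

Comparing `nrow(outcomes)` before and after each merge in this tutorial is a quick way to confirm that no dyads were lost.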
Entropy.
In this section, we calculate entropy for each dyad.
Create a function to calculate entropy. Thank you to [blinded for review] for creating this function!
# Create the function called Entropy, which calculates entropy for a single conversation
# This function takes in a vector, x, of observed categorical values (here, the state space grid cells visited) and the base of the logarithm (default: natural logarithm)
Entropy = function(x = x, base = exp(1)){
# First, create a vector that counts all of the responses of x
count_x = table(x)
# Divide that table by its sum to convert the counts of the responses to probabilities of the responses
prob_x = count_x / sum(count_x)
# Take the logarithm of all of the probabilities with the user-specified base
log_x = log(prob_x, base)
# If any of the values within log_x equal -Inf (i.e. negative infinity) then replace them with 0
if(any(log_x == -Inf)){
log_x_corrected = log_x
log_x_corrected[which(log_x == -Inf)] = 0
}
# If all of the values do not equal -Inf, then do not replace any of them
if(all(log_x != -Inf)){
log_x_corrected = log_x
}
# Multiply the probabilities by the corresponding logs of the probabilities
prod_x = prob_x * log_x_corrected
# sum the products of the probabilities and log probabilities, and multiply by negative one
entropy = -1*sum(prod_x)
# return entropy
return(entropy)
}
# This second function takes the originally defined Entropy function and applies it to each individual (here, each dyad)
# iEntropy takes in a vector of observations, x, an id vector, and a user-specified base of the logarithm
iEntropy = function(x = x, id = id, base = exp(1)){
# The aggregate function splits x by id, and then applies the entropy function to each of the subsets of x
out = aggregate(x, by = list(id), FUN = Entropy, base = base)
# Rename the columns of 'out' to 'id' and 'iEntropy'
names(out) = c('id', 'iEntropy')
# Return the calculated iEntropy values
return(out)
}
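The function above computes Shannon entropy, H = -sum(p * log(p)), where p is the proportion of turns spent in each cell. To get a feel for the scale, the sketch below evaluates the same formula on made-up sequences; `shannon_entropy` is a compact illustrative helper, not the function used in this tutorial.

```r
# Illustrative helper: Shannon entropy of a categorical vector
shannon_entropy <- function(x, base = exp(1)) {
  p <- prop.table(table(x))   # proportion of observations in each category
  p <- p[p > 0]               # drop empty categories, since 0 * log(0) is treated as 0
  -sum(p * log(p, base))
}

# A conversation that stays in one cell has no variability: entropy is 0
h_one_cell <- shannon_entropy(rep("LAck_DElab", 10))

# A conversation spread evenly over four cells: entropy is log(4), about 1.39
h_four_cells <- shannon_entropy(c("LAck_DElab", "LQues_DElab", "LRefl_DElab", "LElab_DElab"))

# Upper bound for a 36-cell grid (all cells visited equally often): log(36), about 3.58
h_max <- log(36)
```

With the natural logarithm, entropy values for a 36-cell grid therefore range from 0 to about 3.58, with higher values indicating that turns are spread more evenly across more cells.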
Calculate entropy.
# Remove missing data for the sake of calculating entropy
data_noNA <- data[!is.na(data$ssg_cell), ]
# Change "ssg_cell" to factor so the function recognizes the variable as categorical (rather than as a character)
data_noNA$ssg_cell <- as.factor(data_noNA$ssg_cell)
# Calculate entropy and save to new data frame ("entropy_data")
entropy_data <- # Select variable
aggregate(data_noNA$ssg_cell,
# Apply variable to each group, in this case, dyad id ("id")
by = list(data_noNA$id),
# Apply the Entropy function we created
FUN = Entropy, base = exp(1))
# Rename columns in new data set
names(entropy_data) = c('id', 'iEntropy')
# View the first 10 rows of the entropy data
head(entropy_data, 10)
## id iEntropy
## 1 3 2.432622
## 2 11 1.761813
## 3 12 2.114136
## 4 14 2.367273
## 5 31 1.661760
## 6 38 1.680077
## 7 45 2.064100
## 8 54 2.064036
## 9 55 2.368448
## 10 58 1.942869
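Entropy descriptives.
# Descriptive statistics for entropy (describe() is from the psych package)
describe(entropy_data$iEntropy)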
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 59 1.96 0.35 2.04 1.99 0.32 1.1 2.45 1.36 -0.66 -0.51 0.05
# Plot iEntropy distribution
# Select data and variable of interest (iEntropy)
ggplot(entropy_data, aes(x = iEntropy)) +
# Create histogram and set width of bars in histogram
geom_histogram(binwidth = 0.10) +
# Label x-axis
xlab("Entropy - Conversation Behavioral Flexibility") +
# Label y-axis
ylab("Frequency") +
# Update aesthetics
theme_classic()
Merge “entropy_data” into “outcomes.”
outcomes <- merge(outcomes, entropy_data, by = "id")
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id distress total_turns iEntropy
## 1 3 3 65 2.432622
## 2 11 1 60 1.761813
## 3 12 2 82 2.114136
## 4 14 2 90 2.367273
## 5 31 1 84 1.661760
## 6 38 2 90 1.680077
## 7 45 1 115 2.064100
## 8 54 3 79 2.064036
## 9 55 2 64 2.368448
## 10 58 3 69 1.942869
Problem description attractor.
In this next section, we calculate the relative proportion of time spent in the problem description attractor.
To begin, we first have to define the attractor. Specifically, we define the problem description attractor as the six cells at the intersection of discloser elaboration or hedged disclosure with listener acknowledgement, question, or reflection.
We create a new variable that labels each turn as either IN the attractor or OUT of the attractor.
# Create a variable that labels all turns as "OUT" of the attractor
data$attractor <- "OUT"
# Update the "attractor" variable so that turns that occur in one of the six problem description attractor cells are now labeled "IN"
data$attractor[data$ssg_cell == "LAck_DElab"] <- "IN"
data$attractor[data$ssg_cell == "LAck_DHedg"] <- "IN"
data$attractor[data$ssg_cell == "LQues_DElab"] <- "IN"
data$attractor[data$ssg_cell == "LQues_DHedg"] <- "IN"
data$attractor[data$ssg_cell == "LRefl_DElab"] <- "IN"
data$attractor[data$ssg_cell == "LRefl_DHedg"] <- "IN"
# View the first 10 rows of the data
head(data, 10)
## id turn role turn_type discloser_turntype listener_turntype
## 1 105 1 1 Question Discloser Question <NA>
## 2 105 2 2 Acknowledgement Discloser Question Listener Acknowledgement
## 3 105 3 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 4 105 4 2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 5 105 5 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 6 105 6 2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 7 105 7 1 Elaboration Discloser Elaboration Listener Acknowledgement
## 8 105 8 2 Elaboration Discloser Elaboration Listener Elaboration
## 9 105 9 1 Elaboration Discloser Elaboration Listener Elaboration
## 10 105 10 2 Reflection Discloser Elaboration Listener Reflection
## ssg_cell attractor
## 1 <NA> OUT
## 2 LAck_DQues OUT
## 3 LAck_DElab IN
## 4 LAck_DElab IN
## 5 LAck_DElab IN
## 6 LAck_DElab IN
## 7 LAck_DElab IN
## 8 LElab_DElab OUT
## 9 LElab_DElab OUT
## 10 LRefl_DElab IN
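The attractor label can also be assigned from a single vector of cell names, which makes the definition of the attractor easy to see and change. This is a sketch on a made-up vector (`attractor_cells` and `ssg_cell_toy` are illustrative names). Unlike the code above, it leaves turns with a missing cell as NA rather than labeling them "OUT"; that is a design choice in this sketch, not the tutorial's approach.

```r
# The six cells that define the problem description attractor
attractor_cells <- c("LAck_DElab", "LAck_DHedg",
                     "LQues_DElab", "LQues_DHedg",
                     "LRefl_DElab", "LRefl_DHedg")

# Made-up sequence of cells, starting with a missing first turn
ssg_cell_toy <- c(NA, "LAck_DQues", "LAck_DElab", "LElab_DElab", "LRefl_DHedg")

# Label each turn: NA if the cell is missing, otherwise IN or OUT
attractor_toy <- ifelse(is.na(ssg_cell_toy), NA,
                        ifelse(ssg_cell_toy %in% attractor_cells, "IN", "OUT"))
attractor_toy
# NA "OUT" "IN" "OUT" "IN"
```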
Calculate the proportion of time in the attractor for each dyad.
attractor_time <- # Select data
data %>%
# Select grouping variable, in this case, dyad ID (id)
dplyr::group_by(id) %>%
# Count the occurrence of each category ("IN", "OUT") in the "attractor" variable
dplyr::count(attractor) %>%
# Remove any missing data
tidyr::drop_na() %>%
# Calculate the proportion of time in each category
dplyr::mutate(attractor_prop = prop.table(n)) %>%
# Save the data as a data.frame
as.data.frame()
# View the first 10 rows of the attractor_time data
head(attractor_time, 10)
## id attractor n attractor_prop
## 1 3 IN 31 0.4769231
## 2 3 OUT 34 0.5230769
## 3 11 IN 34 0.5666667
## 4 11 OUT 26 0.4333333
## 5 12 IN 29 0.3536585
## 6 12 OUT 53 0.6463415
## 7 14 IN 44 0.4888889
## 8 14 OUT 46 0.5111111
## 9 31 IN 59 0.7023810
## 10 31 OUT 25 0.2976190
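Because the proportion of turns in the attractor is the mean of a TRUE/FALSE indicator, it can also be computed directly for each dyad. The sketch below does this in base R on made-up data (`toy` and `prop_df` are illustrative names).

```r
# Made-up attractor labels for three dyads
toy <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2, 3, 3),
  attractor = c("IN", "IN", "OUT", "IN", "OUT", "IN", "OUT", "OUT"),
  stringsAsFactors = FALSE
)

# Mean of the IN indicator within each dyad = proportion of turns in the attractor
prop_in <- tapply(toy$attractor == "IN", toy$id, mean)

# Collect the result in a data frame with one row per dyad
prop_df <- data.frame(id = as.numeric(names(prop_in)),
                      attractor_prop = as.vector(prop_in))
prop_df
#   id attractor_prop
# 1  1           0.75
# 2  2           0.50
# 3  3           0.00
```

One difference from the count-based approach is that a dyad that never enters the attractor (dyad 3 here) receives a proportion of 0 instead of having no "IN" row at all.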
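Proportion of time in attractor descriptives. Note that the output below pools each dyad’s “IN” and “OUT” rows (n = 118, i.e., two rows for each of the 59 dyads), so the mean of 0.5 holds by construction; subset to attractor == "IN" to describe time in the attractor only.
# Descriptive statistics for the proportion variable (describe() is from the psych package)
describe(attractor_time$attractor_prop)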
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 118 0.5 0.2 0.5 0.5 0.23 0.11 0.89 0.78 0 -0.79 0.02
# Plot attractor_prop distribution
# Select data and variable of interest (attractor_prop)
ggplot(attractor_time[attractor_time$attractor == "IN", ], aes(x = attractor_prop)) +
# Create histogram and set width of bars in histogram
geom_histogram(binwidth = 0.10) +
# Label x-axis
xlab("Proportion of Turns in the Problem Description Attractor for Each Dyad") +
# Label y-axis
ylab("Frequency") +
# Update aesthetics
theme_classic()
Merge proportion of time in the attractor (attractor_prop == “IN”) into “outcomes.”
outcomes <- merge(outcomes,
attractor_time[attractor_time$attractor == "IN", c("id", "attractor_prop")],
by = "id", all.x = TRUE)
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id distress total_turns iEntropy attractor_prop
## 1 3 3 65 2.432622 0.4769231
## 2 11 1 60 1.761813 0.5666667
## 3 12 2 82 2.114136 0.3536585
## 4 14 2 90 2.367273 0.4888889
## 5 31 1 84 1.661760 0.7023810
## 6 38 2 90 1.680077 0.6666667
## 7 45 1 115 2.064100 0.5913043
## 8 54 3 79 2.064036 0.7088608
## 9 55 2 64 2.368448 0.1718750
## 10 58 3 69 1.942869 0.7101449
Attractor exit time.
Finally, we will calculate the time (i.e., turn number) when the dyad first leaves the problem description attractor.
Find the first row (after the first turn) in which each dyad is out of the attractor.
attractor_exit <- # Select data
data %>%
# Select grouping variable, in this case, dyad ID (id)
dplyr::group_by(id) %>%
# Remove the first row of each group since the first turn will
# never be in a cell (since we need two data points to locate a turn
# transition in a cell)
dplyr::slice(2:n()) %>%
# Keep rows in which the dyad is out of the attractor
# (and is not due to missing data)
dplyr::filter(attractor == "OUT" & !is.na(ssg_cell)) %>%
# Keep the first row in which the dyad is out of the attractor
dplyr::filter(row_number()==1) %>%
# Save the data as a data.frame
as.data.frame()
# Rename turn variable to "attractor_exit"
colnames(attractor_exit)[2] <- "attractor_exit"
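The logic of the exit time calculation can be checked on a small made-up example. The sketch below uses base R (`toy` and `first_exit` are illustrative names) and also shows what happens to a dyad that never leaves the attractor, which the filtering steps above would drop.

```r
# Made-up data: dyad 1 exits at turn 3, dyad 2 never exits
toy <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2, 2),
  turn      = c(1, 2, 3, 4, 1, 2, 3),
  attractor = c(NA, "IN", "OUT", "IN", NA, "IN", "IN"),
  stringsAsFactors = FALSE
)

# For each dyad, find the earliest turn labeled OUT (NA if there is none)
first_exit <- sapply(split(toy, toy$id), function(d) {
  out_turns <- d$turn[!is.na(d$attractor) & d$attractor == "OUT"]
  if (length(out_turns) == 0) NA_real_ else min(out_turns)
})
first_exit
#  1  2
#  3 NA
```

In the tutorial data all 59 dyads leave the attractor at some point, so none are lost. With other data, a dyad that never exits would have no row in "attractor_exit" and would be removed from "outcomes" by the default merge.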
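Time of attractor exit descriptives.
# Descriptive statistics for exit time (describe() is from the psych package)
describe(attractor_exit$attractor_exit)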
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 59 11.69 10.87 8 10.1 8.9 2 47 45 1.3 1.02 1.41
# Plot attractor_exit distribution
# Select data and variable of interest (attractor_exit)
ggplot(attractor_exit, aes(x = attractor_exit)) +
# Create histogram and set width of bars in histogram
geom_histogram(binwidth = 5) +
# Label x-axis
xlab("Time (in turns) of Problem Description Attractor Exit") +
# Label y-axis
ylab("Frequency") +
# Update aesthetics
theme_classic()
Merge “attractor_exit” variable into “outcomes.”
outcomes <- merge(outcomes,
attractor_exit[, c("id", "attractor_exit")],
by = "id")
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id distress total_turns iEntropy attractor_prop attractor_exit
## 1 3 3 65 2.432622 0.4769231 18
## 2 11 1 60 1.761813 0.5666667 22
## 3 12 2 82 2.114136 0.3536585 7
## 4 14 2 90 2.367273 0.4888889 3
## 5 31 1 84 1.661760 0.7023810 32
## 6 38 2 90 1.680077 0.6666667 2
## 7 45 1 115 2.064100 0.5913043 2
## 8 54 3 79 2.064036 0.7088608 2
## 9 55 2 64 2.368448 0.1718750 4
## 10 58 3 69 1.942869 0.7101449 8
SSG metrics and between-dyad differences.
In this final step, we will examine how our state space grid metrics are associated with a between-dyad difference (distress). We will examine these associations using multiple regression, and we will control for the total number of turns in the conversation.
Entropy.
entropy_regression <- lm(# Post-conversation distress is predicted by entropy and total turns
distress ~ iEntropy + total_turns,
data = outcomes)
summary(entropy_regression)
##
## Call:
## lm(formula = distress ~ iEntropy + total_turns, data = outcomes)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.5502 -1.0537 -0.1442 0.8119 2.8918
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.029425 1.093803 2.770 0.0076 **
## iEntropy -0.616536 0.464506 -1.327 0.1898
## total_turns 0.003784 0.008188 0.462 0.6458
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.217 on 56 degrees of freedom
## Multiple R-squared: 0.03189, Adjusted R-squared: -0.002682
## F-statistic: 0.9224 on 2 and 56 DF, p-value: 0.4035
# Plot result
# Select data and set x-axis (iEntropy - i.e., the predictor)
# and y-axis (distress - i.e., the outcome)
ggplot(outcomes, aes(x = iEntropy, y = distress)) +
# Plot each dyad's iEntropy and distress
geom_point() +
# Plot smoothed regression line
stat_smooth(method='lm', formula = y ~ x, color = "blue", linewidth = 1) +
# Label x-axis
xlab("Conversation Behavior Flexibility") +
# Label y-axis
ylab("Disclosers' Post-conversation Distress") +
# Update aesthetics
theme_classic()
Entropy of conversation behaviors during a support conversation is not significantly associated with the discloser’s post-conversation distress (b = -0.62, p = .19), controlling for the total number of turns within the conversation.
Problem description attractor.
attractor_regression <- lm(# Post-conversation distress is predicted by time in
# attractor and total turns
distress ~ attractor_prop + total_turns,
data = outcomes)
summary(attractor_regression)
##
## Call:
## lm(formula = distress ~ attractor_prop + total_turns, data = outcomes)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3928 -1.1224 -0.1605 0.8068 3.0325
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.523603 0.966507 1.576 0.121
## attractor_prop 0.631399 0.930769 0.678 0.500
## total_turns 0.002882 0.008247 0.349 0.728
##
## Residual standard error: 1.231 on 56 degrees of freedom
## Multiple R-squared: 0.009576, Adjusted R-squared: -0.0258
## F-statistic: 0.2707 on 2 and 56 DF, p-value: 0.7638
# Plot result
# Select data and set x-axis (attractor_prop - i.e., the predictor)
# and y-axis (distress - i.e., the outcome)
ggplot(outcomes, aes(x = attractor_prop, y = distress)) +
# Plot each dyad's attractor_prop and distress
geom_point() +
# Plot smoothed regression line
stat_smooth(method='lm', formula = y ~ x, color = "blue", linewidth = 1) +
# Label x-axis
xlab("Proportion of Turns in the Problem Description Attractor") +
# Label y-axis
ylab("Disclosers' Post-conversation Distress") +
# Update aesthetics
theme_classic()
The proportion of time spent in the problem description attractor during a support conversation is not significantly associated with the discloser’s post-conversation distress (b = 0.63, p = .50), controlling for the total number of turns within the conversation.
Attractor exit time.
exit_regression <- lm(# Post-conversation distress is predicted by time of first
# attractor exit and total turns
distress ~ attractor_exit + total_turns,
data = outcomes)
summary(exit_regression)
##
## Call:
## lm(formula = distress ~ attractor_exit + total_turns, data = outcomes)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3816 -1.1258 -0.1257 0.8572 2.9207
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.781912 0.835332 2.133 0.0373 *
## attractor_exit 0.006599 0.015508 0.426 0.6721
## total_turns 0.003346 0.008561 0.391 0.6974
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.234 on 56 degrees of freedom
## Multiple R-squared: 0.004656, Adjusted R-squared: -0.03089
## F-statistic: 0.131 on 2 and 56 DF, p-value: 0.8775
# Plot result
# Select data and set x-axis (attractor_exit - i.e., the predictor)
# and y-axis (distress - i.e., the outcome)
ggplot(outcomes, aes(x = attractor_exit, y = distress)) +
# Plot each dyad's attractor_exit and distress
geom_jitter() +
# Plot smoothed regression line
stat_smooth(method='lm', formula = y ~ x, color = "blue", linewidth = 1) +
# Label x-axis
xlab("Time (in turns) of Problem Description Attractor Exit") +
# Label y-axis
ylab("Disclosers' Post-conversation Distress") +
# Update aesthetics
theme_classic()
The timing of first exit from the problem description attractor during a support conversation is not significantly associated with the discloser’s post-conversation distress (b = 0.007, p = .67), controlling for the total number of turns within the conversation.
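When reporting models like the three above, it can be convenient to pull the coefficient table and confidence intervals out of the fitted object rather than reading them off the printed summary. The sketch below shows one way to do this. It uses simulated data (`sim` is an illustrative name, and the simulated values are random numbers, not the study data), so the estimates it produces are meaningless; only the steps carry over.

```r
# Simulated stand-in for the "outcomes" data (random values, for illustration only)
set.seed(1)
sim <- data.frame(
  distress    = sample(1:5, 59, replace = TRUE),
  iEntropy    = runif(59, min = 1.1, max = 2.45),
  total_turns = sample(60:115, 59, replace = TRUE)
)

# Same model form as the entropy regression above
fit <- lm(distress ~ iEntropy + total_turns, data = sim)

# Coefficient table as a matrix: estimate, standard error, t value, p value
coef_table <- coef(summary(fit))

# 95% confidence intervals for each coefficient
ci <- confint(fit, level = 0.95)

# Combine estimates and intervals into one table for reporting
report <- cbind(coef_table[, c("Estimate", "Pr(>|t|)")], ci)
```

The same three calls can be applied to `entropy_regression`, `attractor_regression`, and `exit_regression`.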
Additional Information
We created this tutorial with a system environment and versions of R and packages that might be different from yours. If R reports errors when you attempt to run this tutorial, running the code chunk below and comparing your output may be helpful.
# Print the R version, operating system, and attached package versions
devtools::session_info(pkgs = "attached")
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.3.2 (2023-10-31)
## os macOS Ventura 13.6.3
## system x86_64, darwin20
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2024-09-16
## pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## devtools * 2.4.5 2022-10-11 [1] CRAN (R 4.3.0)
## dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.3.0)
## ggplot2 * 3.5.0 2024-02-23 [1] CRAN (R 4.3.2)
## psych * 2.4.1 2024-01-18 [1] CRAN (R 4.3.0)
## tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.3.2)
## usethis * 2.2.3 2024-02-19 [1] CRAN (R 4.3.2)
## vctrs * 0.6.5 2023-12-01 [1] CRAN (R 4.3.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────