State Space Grid Metric Tutorial

Overview

This tutorial provides R code for setting up a state space grid and deriving several metrics from it.

State space grids are typically used to depict and quantify longitudinal dyadic (or bivariate) data in a 2-dimensional space (Hollenstein, 2013). In this example, we plot the speaking turn behaviors enacted by each dyad member during a support conversation in which one dyad member shared a current problem they were facing. Each turn in the conversations was coded into one of six turn types (e.g., acknowledgement, question, advice).

Here, we demonstrate how to calculate three metrics illustrated in the accompanying paper (“Using State Space Grids to Quantify and Examine Dynamics of Dyadic Conversation” by Brinberg, Solomon, Bodie, Jones, & Ram in Communication Methods and Measures):

  • Entropy as a measure of behavioral flexibility
  • Time using problem description behaviors as a measure of an attractor
  • Time of exit from the problem description attractor as a measure of a phase shift

We also demonstrate how to examine the association between the state space grid-derived metrics and a conversational outcome (in this case, distress) using multiple regression.

Note that the accompanying “SSGMetric_Tutorial.rmd” file contains all of the code presented in this tutorial and can be opened in RStudio (a somewhat more friendly user interface to R). Working from this file saves you from copying and pasting code and makes it easy to rename the variables to match your own data.

Thank you to Jon Helm for writing the original state space grid plotting and entropy code.

Outline

In this tutorial, we’ll cover…

  • Reading in the data and loading needed packages.
  • General data preparation.
  • Entropy.
  • Problem description attractor.
  • Attractor exit time.
  • SSG metrics and between-dyad differences.

Read in the data and load needed libraries.

Let’s read the data into R.

The exemplar data set we are working with is called “StrangerConversations_N59” and is stored as a .csv file (comma-separated values file, which can be created by saving an Excel file as a csv document) on my computer’s desktop.

# Set working directory (i.e., where your data file is stored)
# This can be done by going to the top bar of RStudio and selecting "Session" --> "Set Working Directory" --> "Choose Directory" --> finding the location of your file
setwd("~/Desktop")

# Read in the repeated measures data
data <- read.csv(file = "StrangerConversations_N59.csv", header = TRUE, sep = ",")

# View the first 10 rows of the repeated measures data
head(data, 10)
##     id turn role       turn_type
## 1  105    1    1        Question
## 2  105    2    2 Acknowledgement
## 3  105    3    1     Elaboration
## 4  105    4    2 Acknowledgement
## 5  105    5    1     Elaboration
## 6  105    6    2 Acknowledgement
## 7  105    7    1     Elaboration
## 8  105    8    2     Elaboration
## 9  105    9    1     Elaboration
## 10 105   10    2      Reflection
# Read in the outcomes data
outcomes <- read.csv(file = "StrangerConversations_N59_Outcomes.csv", header = TRUE, sep = ",")

# View the first 10 rows of the outcomes data
head(outcomes, 10)
##    id distress
## 1   3        3
## 2  11        1
## 3  12        2
## 4  14        2
## 5  31        1
## 6  38        2
## 7  45        1
## 8  54        3
## 9  55        2
## 10 58        3

In the data, we can see each row contains information for one utterance and there are multiple rows (i.e., multiple utterances) for each dyad. Specifically, there is a column for:

  • Dyad ID (id)
  • Time variable - in this case, turn in the conversation (turn)
  • Dyad member ID - in this case, role in the conversation (role; discloser = 1, listener = 2)
  • Turn type - in this case, a typology of six different speaking turn behaviors: acknowledgement, advice, elaboration, hedged disclosure, question, and reflection (turn_type)
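
Because the grid-cell assignment later in this tutorial matches these labels as exact strings, it can help to confirm that the turn_type column contains only the expected values before going further. The sketch below is illustrative only: it uses a small made-up data frame (toy_turns, a hypothetical name) rather than the tutorial data, and the expected spellings are assumed from the cell-assignment code in the data preparation section. Adjust them if your coding scheme uses different labels.

```r
# Expected turn type labels (spellings assumed from the cell-assignment code)
expected_types <- c("Acknowledgement", "Advice", "Elaboration",
                    "HedgedDisclosure", "Question", "Reflection")

# Hypothetical toy data with one label spelled differently
toy_turns <- data.frame(
  id = c(105, 105, 105),
  turn = 1:3,
  role = c(1, 2, 1),
  turn_type = c("Question", "Acknowledgement", "Hedged Disclosure"),
  stringsAsFactors = FALSE
)

# Labels present in the data that the later code would not recognize
unexpected_types <- setdiff(unique(toy_turns$turn_type), expected_types)
unexpected_types
```

Any label returned here would be left without a grid cell (NA) by the exact-match assignments, so it is worth resolving before computing metrics.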

In the outcome data (“outcomes”), we can see there is one row for each dyad and there are columns for:

  • Dyad ID (id)
  • Outcome variable - in this case, post-conversation report of distress by the support receiver (distress)

Load the R packages we need.

An R package is a collection of functions (and their documentation/explanations) that enables us to conduct particular tasks, such as plotting or fitting a statistical model.

# install.packages("devtools") # Install package if you have never used it before
library(devtools) # For package development tools (e.g., installing packages from GitHub)

# install.packages("dplyr") # Install package if you have never used it before
library(dplyr) # For data management

# install.packages("ggplot2") # Install package if you have never used it before
library(ggplot2) # For plotting

# install.packages("psych") # Install package if you have never used it before
library(psych) # For descriptive statistics

# install.packages("tidyr") # Install package if you have never used it before
library(tidyr) # For data management

# install.packages("vctrs") # Install package if you have never used it before
library(vctrs) # For data management

General data preparation.

Before calculating our state space grid metrics, we need to create a new variable that will assign each speaking turn transition pair to a cell (or state) in the state space grid.

To do so, we first create two new variables that represent the listeners’ turn types and disclosers’ turn types separately.

# Add "Discloser" to turn_type variable, then set all Listener (role = 2) turns to NA
data$discloser_turntype <- paste("Discloser", data$turn_type, sep = " ")
data$discloser_turntype[data$role == 2] <- NA

# Add "Listener" to turn_type variable, then set all Discloser (role = 1) turns to NA
data$listener_turntype <- paste("Listener", data$turn_type, sep = " ")
data$listener_turntype[data$role == 1] <- NA

# Reset missing values
data$listener_turntype[data$listener_turntype == "Listener NA"] <- NA
data$discloser_turntype[data$discloser_turntype == "Discloser NA"] <- NA

# View the first 10 rows of the data
head(data, 10)
##     id turn role       turn_type    discloser_turntype        listener_turntype
## 1  105    1    1        Question    Discloser Question                     <NA>
## 2  105    2    2 Acknowledgement                  <NA> Listener Acknowledgement
## 3  105    3    1     Elaboration Discloser Elaboration                     <NA>
## 4  105    4    2 Acknowledgement                  <NA> Listener Acknowledgement
## 5  105    5    1     Elaboration Discloser Elaboration                     <NA>
## 6  105    6    2 Acknowledgement                  <NA> Listener Acknowledgement
## 7  105    7    1     Elaboration Discloser Elaboration                     <NA>
## 8  105    8    2     Elaboration                  <NA>     Listener Elaboration
## 9  105    9    1     Elaboration Discloser Elaboration                     <NA>
## 10 105   10    2      Reflection                  <NA>      Listener Reflection

We then carry forward each listener and discloser turn type. Specifically, within the listener column and the discloser column, a missing value on the row immediately after a turn (i.e., the following turn) is filled with the turn type from the prior row, so that each row holds the most recent turn type of both dyad members.

data <- # Select data
        data %>%
        # Select grouping variable, in this case, dyad ID (id)
        dplyr::group_by(id) %>% 
        # Fill in the turn type for "listener_turntype" and "discloser_turntype"
        # such that the turn type is carried forward one row
        # in order to capture turn transitions
        dplyr::mutate(listener_turntype = vec_fill_missing(listener_turntype, max_fill = 1),
                      discloser_turntype = vec_fill_missing(discloser_turntype, max_fill = 1)) %>%
        # Save the data as a data.frame
        as.data.frame()

# View the first 10 rows of the data
head(data, 10)
##     id turn role       turn_type    discloser_turntype        listener_turntype
## 1  105    1    1        Question    Discloser Question                     <NA>
## 2  105    2    2 Acknowledgement    Discloser Question Listener Acknowledgement
## 3  105    3    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 4  105    4    2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 5  105    5    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 6  105    6    2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 7  105    7    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 8  105    8    2     Elaboration Discloser Elaboration     Listener Elaboration
## 9  105    9    1     Elaboration Discloser Elaboration     Listener Elaboration
## 10 105   10    2      Reflection Discloser Elaboration      Listener Reflection
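
To make the one-step carry-forward concrete, here is a minimal base R sketch of the same idea on a toy vector. The helper name fill_one_step is hypothetical and is not part of vctrs; it is only meant to illustrate what filling with max_fill = 1 does, under the assumption that the rows are already ordered by turn.

```r
# Hypothetical helper: fill a missing value only from the row directly above it
fill_one_step <- function(x) {
  # Value on the previous row (taken from the original, unfilled vector)
  prev <- c(NA, head(x, -1))
  # Keep observed values; replace a missing value with the previous row's value
  ifelse(is.na(x), prev, x)
}

toy_listener <- c("Listener Question", NA, NA, "Listener Advice", NA)
filled <- fill_one_step(toy_listener)
filled
```

Because the previous-row values come from the original vector, a run of two missing values has only its first element filled. In the tutorial data, speakers alternate, so each missing value is preceded by an observed turn and a single step is enough.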

Next, we create a new variable (“ssg_cell”) in which we assign each combination of the disclosers’ and listeners’ turns to a cell in the state space grid. Since we have six turn types for both disclosers and listeners, we will have 36 states in which the conversation can exist.

data$ssg_cell <- NA


data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
              data$discloser_turntype == "Discloser Elaboration"] <- "LAck_DElab"

data$ssg_cell[data$listener_turntype == "Listener Question" &
              data$discloser_turntype == "Discloser Elaboration"] <- "LQues_DElab"

data$ssg_cell[data$listener_turntype == "Listener Reflection" &
              data$discloser_turntype == "Discloser Elaboration"] <- "LRefl_DElab"

data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
              data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LAck_DHedg"

data$ssg_cell[data$listener_turntype == "Listener Question" &
              data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LQues_DHedg"

data$ssg_cell[data$listener_turntype == "Listener Reflection" &
              data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LRefl_DHedg"

data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
              data$discloser_turntype == "Discloser Elaboration"] <- "LElab_DElab"

data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
              data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LElab_DHedg"

data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
              data$discloser_turntype == "Discloser Reflection"] <- "LElab_DRefl"

data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
              data$discloser_turntype == "Discloser Question"] <- "LElab_DQues"

data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
              data$discloser_turntype == "Discloser Acknowledgement"] <- "LElab_DAck"

data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
              data$discloser_turntype == "Discloser Elaboration"] <- "LHedg_DElab"

data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
              data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LHedg_DHedg"

data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
              data$discloser_turntype == "Discloser Reflection"] <- "LHedg_DRefl"

data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
              data$discloser_turntype == "Discloser Question"] <- "LHedg_DQues"

data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
              data$discloser_turntype == "Discloser Acknowledgement"] <- "LHedg_DAck"

data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
              data$discloser_turntype == "Discloser Reflection"] <- "LAck_DRefl"

data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
              data$discloser_turntype == "Discloser Question"] <- "LAck_DQues"

data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
              data$discloser_turntype == "Discloser Acknowledgement"] <- "LAck_DAck"

data$ssg_cell[data$listener_turntype == "Listener Question" &
              data$discloser_turntype == "Discloser Reflection"] <- "LQues_DRefl"

data$ssg_cell[data$listener_turntype == "Listener Question" &
              data$discloser_turntype == "Discloser Question"] <- "LQues_DQues"

data$ssg_cell[data$listener_turntype == "Listener Question" &
              data$discloser_turntype == "Discloser Acknowledgement"] <- "LQues_DAck"

data$ssg_cell[data$listener_turntype == "Listener Reflection" &
              data$discloser_turntype == "Discloser Reflection"] <- "LRefl_DRefl"

data$ssg_cell[data$listener_turntype == "Listener Reflection" &
              data$discloser_turntype == "Discloser Question"] <- "LRefl_DQues"

data$ssg_cell[data$listener_turntype == "Listener Reflection" &
              data$discloser_turntype == "Discloser Acknowledgement"] <- "LRefl_DAck"

data$ssg_cell[data$listener_turntype == "Listener Advice" &
              data$discloser_turntype == "Discloser Elaboration"] <- "LAdv_DElab"

data$ssg_cell[data$listener_turntype == "Listener Advice" &
              data$discloser_turntype == "Discloser HedgedDisclosure"] <- "LAdv_DHedg"

data$ssg_cell[data$listener_turntype == "Listener Advice" &
              data$discloser_turntype == "Discloser Reflection"] <- "LAdv_DRefl"

data$ssg_cell[data$listener_turntype == "Listener Advice" &
              data$discloser_turntype == "Discloser Question"] <- "LAdv_DQues"

data$ssg_cell[data$listener_turntype == "Listener Advice" &
              data$discloser_turntype == "Discloser Acknowledgement"] <- "LAdv_DAck"

data$ssg_cell[data$listener_turntype == "Listener Advice" &
              data$discloser_turntype == "Discloser Advice"] <- "LAdv_DAdv"

data$ssg_cell[data$listener_turntype == "Listener Acknowledgement" &
              data$discloser_turntype == "Discloser Advice"] <- "LAck_DAdv"

data$ssg_cell[data$listener_turntype == "Listener Question" &
              data$discloser_turntype == "Discloser Advice"] <- "LQues_DAdv"

data$ssg_cell[data$listener_turntype == "Listener Reflection" &
              data$discloser_turntype == "Discloser Advice"] <- "LRefl_DAdv"

data$ssg_cell[data$listener_turntype == "Listener HedgedDisclosure" &
              data$discloser_turntype == "Discloser Advice"] <- "LHedg_DAdv"

data$ssg_cell[data$listener_turntype == "Listener Elaboration" &
              data$discloser_turntype == "Discloser Advice"] <- "LElab_DAdv"

# View the first 10 rows of the data
head(data, 10)
##     id turn role       turn_type    discloser_turntype        listener_turntype
## 1  105    1    1        Question    Discloser Question                     <NA>
## 2  105    2    2 Acknowledgement    Discloser Question Listener Acknowledgement
## 3  105    3    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 4  105    4    2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 5  105    5    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 6  105    6    2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 7  105    7    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 8  105    8    2     Elaboration Discloser Elaboration     Listener Elaboration
## 9  105    9    1     Elaboration Discloser Elaboration     Listener Elaboration
## 10 105   10    2      Reflection Discloser Elaboration      Listener Reflection
##       ssg_cell
## 1         <NA>
## 2   LAck_DQues
## 3   LAck_DElab
## 4   LAck_DElab
## 5   LAck_DElab
## 6   LAck_DElab
## 7   LAck_DElab
## 8  LElab_DElab
## 9  LElab_DElab
## 10 LRefl_DElab
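
The 36 assignments above follow a single pattern, so the same labels could also be built from a lookup vector. The sketch below is one possible shortcut, not the approach used in the accompanying paper. The names abbrev and toy_pairs are hypothetical, and the toy rows mimic the first few rows of the tutorial data.

```r
# Abbreviations used in the cell labels (taken from the labels assigned above)
abbrev <- c(Acknowledgement = "Ack", Advice = "Adv", Elaboration = "Elab",
            HedgedDisclosure = "Hedg", Question = "Ques", Reflection = "Refl")

# Hypothetical toy data resembling the carried-forward columns
toy_pairs <- data.frame(
  listener_turntype  = c(NA, "Listener Acknowledgement", "Listener Question"),
  discloser_turntype = c("Discloser Question", "Discloser Question",
                         "Discloser Elaboration"),
  stringsAsFactors = FALSE
)

# Strip the role prefix to recover the bare turn type
l_type <- sub("^Listener ", "", toy_pairs$listener_turntype)
d_type <- sub("^Discloser ", "", toy_pairs$discloser_turntype)

# Build the label; rows missing either turn type stay NA
toy_pairs$ssg_cell <- ifelse(is.na(l_type) | is.na(d_type),
                             NA_character_,
                             paste0("L", abbrev[l_type], "_D", abbrev[d_type]))
toy_pairs$ssg_cell
```

A turn type that is not in the lookup vector would produce a label containing "NA", so this shortcut assumes the labels have been checked beforehand.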

Finally, we count the number of turns per conversation, which will serve as a control variable in a later analysis.

#create variable with total number of turns per dyad
total_turns <- # Select data
               data %>%
               # Select grouping variable, in this case, dyad ID (id)
               dplyr::group_by(id) %>%
               # Create a new variable ("total_turns") that represents the total number of turns
               # within the conversation - i.e., the maximum turn number
               dplyr::summarise(total_turns = max(turn)) %>%
               # Save the data as a data.frame
               as.data.frame()

# View the first 10 rows of the total_turns data
head(total_turns, 10)
##    id total_turns
## 1   3          65
## 2  11          60
## 3  12          82
## 4  14          90
## 5  31          84
## 6  38          90
## 7  45         115
## 8  54          79
## 9  55          64
## 10 58          69

Merge “total_turns” into “outcomes.”

outcomes <- merge(outcomes, total_turns, by = "id")

# View the first 10 rows of the outcomes data
head(outcomes, 10)
##    id distress total_turns
## 1   3        3          65
## 2  11        1          60
## 3  12        2          82
## 4  14        2          90
## 5  31        1          84
## 6  38        2          90
## 7  45        1         115
## 8  54        3          79
## 9  55        2          64
## 10 58        3          69
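
Note that merge() keeps only the ids found in both data frames by default, so a dyad missing from one of them would silently disappear from “outcomes.” The toy sketch below (hypothetical data frames toy_outcomes and toy_metric) shows one way to check for this after each merge in this tutorial.

```r
# Hypothetical toy data: dyad 12 has an outcome but no metric
toy_outcomes <- data.frame(id = c(3, 11, 12), distress = c(3, 1, 2))
toy_metric   <- data.frame(id = c(3, 11), total_turns = c(65, 60))

# Default merge: only ids present in both data frames are kept
inner <- merge(toy_outcomes, toy_metric, by = "id")

# all.x = TRUE keeps every row of the first data frame, filling gaps with NA
left <- merge(toy_outcomes, toy_metric, by = "id", all.x = TRUE)

# Ids that the default merge would drop
dropped_ids <- setdiff(toy_outcomes$id, inner$id)
dropped_ids
```

Comparing nrow() before and after a merge is a quick way to run the same check on the real data.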

Entropy.

In this section, we calculate entropy for each dyad.

Create function to calculate entropy. Thank you to [blinded for review] for creating this function!

# Create the function called Entropy, which calculates entropy for a single conversation

# This function takes a vector, x, of observed categories (here, state space grid cells)
# and a user-specified base of the logarithm (default: e, i.e., the natural log).
Entropy = function(x, base = exp(1)){
  
# First, create a vector that counts all of the responses of x
      count_x = table(x)
      
# Divide that table by its sum to convert the counts of the responses to probabilities of the responses
      prob_x  = count_x / sum(count_x)

# Take the logarithm of all of the probabilities with the user-specified base    
      log_x = log(prob_x, base)
      
# If any of the values within log_x equal -Inf (i.e. negative infinity) then replace them with 0      
      if(any(log_x == -Inf)){
          log_x_corrected =  log_x
          log_x_corrected[which(log_x == -Inf)] = 0
      }
      

# If all of the values do not equal -Inf, then do not replace any of them      
      if(all(log_x != -Inf)){
          log_x_corrected =  log_x
      }     

# Multiply the probabilities by the corresponding logs of the probabilities
      prod_x = prob_x * log_x_corrected
      
# sum the products of the probabilities and log probabilities, and multiply by negative one      
      entropy = -1*sum(prod_x)

# return entropy            
      return(entropy)
}


# This second function takes the previously defined Entropy function and applies it to each individual
# iEntropy takes a vector of observations, x, an id vector, and a user-specified base of the logarithm
iEntropy = function(x, id, base = exp(1)){
  
# The aggregate function splits x by id, and then applies the entropy function to each of the subsets of x  
      out = aggregate(x, by = list(id), FUN = Entropy, base = base)
      
# Rename the columns of 'out' to 'id' and 'iEntropy'
      names(out) = c('id', 'iEntropy')
      
# Return the calculated iEntropy values
      return(out)
}
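
To build intuition for the values this function returns, the sketch below computes Shannon entropy, the negative sum of p * log(p) over the visited cells, for two toy conversations. The compact function shannon_entropy is a hypothetical stand-in written for this illustration; it is intended to give the same result as Entropy above but is not the original code.

```r
# Hypothetical compact version of the entropy calculation
shannon_entropy <- function(x, base = exp(1)) {
  # Proportion of observations in each category
  p <- prop.table(table(x))
  # Drop unvisited categories (their contribution to the sum is zero)
  p <- p[p > 0]
  -sum(p * log(p, base = base))
}

# A conversation that never leaves one cell has zero entropy
one_cell <- shannon_entropy(rep("LAck_DElab", 4))

# A conversation split evenly across two cells has entropy log(2)
two_cells <- shannon_entropy(c("LAck_DElab", "LAck_DElab",
                               "LQues_DElab", "LQues_DElab"))

# Upper bound with 36 cells visited equally often (natural log)
max_entropy_36 <- log(36)

c(one_cell = one_cell, two_cells = two_cells, max_entropy_36 = max_entropy_36)
```

With the natural log, entropy on a 36-cell grid therefore ranges from 0 to log(36), roughly 3.58, which gives a scale for reading the dyad-level values.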

Calculate entropy.

# Remove missing data for the sake of calculating entropy
data_noNA <- data[!is.na(data$ssg_cell), ]

# Change "ssg_cell" to factor so the function recognizes the variable as categorical (rather than as a character)
data_noNA$ssg_cell <- as.factor(data_noNA$ssg_cell)

# Calculate entropy and save to new data frame ("entropy_data")
entropy_data <-           # Select variable
                aggregate(data_noNA$ssg_cell, 
                          # Apply variable to each group, in this case, dyad id ("id")
                          by = list(data_noNA$id), 
                          # Apply the Entropy function we created
                          FUN = Entropy, base = exp(1))

# Rename columns in new data set
names(entropy_data) = c('id', 'iEntropy')

# View the first 10 rows of the entropy data
head(entropy_data, 10)
##    id iEntropy
## 1   3 2.432622
## 2  11 1.761813
## 3  12 2.114136
## 4  14 2.367273
## 5  31 1.661760
## 6  38 1.680077
## 7  45 2.064100
## 8  54 2.064036
## 9  55 2.368448
## 10 58 1.942869

Entropy descriptives.

# Examine descriptive statistics of iEntropy
describe(entropy_data$iEntropy)
##    vars  n mean   sd median trimmed  mad min  max range  skew kurtosis   se
## X1    1 59 1.96 0.35   2.04    1.99 0.32 1.1 2.45  1.36 -0.66    -0.51 0.05
# Plot iEntropy distribution
      # Select data and variable of interest (iEntropy)
ggplot(entropy_data, aes(x = iEntropy)) + 
      # Create histogram and set width of bars in histogram
      geom_histogram(binwidth = 0.10) + 
      # Label x-axis
      xlab("Entropy - Conversation Behavioral Flexibility") +
      # Label y-axis
      ylab("Frequency") +
      # Update aesthetics
      theme_classic()

Merge “entropy_data” into “outcomes.”

outcomes <- merge(outcomes, entropy_data, by = "id")

# View the first 10 rows of the outcomes data
head(outcomes, 10)
##    id distress total_turns iEntropy
## 1   3        3          65 2.432622
## 2  11        1          60 1.761813
## 3  12        2          82 2.114136
## 4  14        2          90 2.367273
## 5  31        1          84 1.661760
## 6  38        2          90 1.680077
## 7  45        1         115 2.064100
## 8  54        3          79 2.064036
## 9  55        2          64 2.368448
## 10 58        3          69 1.942869

Problem description attractor.

In this next section, we calculate the relative proportion of time spent in the problem description attractor.

To begin, we first have to define the attractor. Specifically, we define the problem description attractor as the six cells formed by crossing two discloser turn types (elaboration and hedged disclosure) with three listener turn types (acknowledgement, question, and reflection).

We create a new variable that labels each turn as either IN the attractor or OUT of the attractor.

# Create a variable that labels all turns as "OUT" of the attractor
data$attractor <- "OUT"

# Update the "attractor" variable so that turns that occur in one of the six problem description attractor cells are now labeled "IN"
data$attractor[data$ssg_cell == "LAck_DElab"] <- "IN"
data$attractor[data$ssg_cell == "LAck_DHedg"] <- "IN"
data$attractor[data$ssg_cell == "LQues_DElab"] <- "IN"
data$attractor[data$ssg_cell == "LQues_DHedg"] <- "IN"
data$attractor[data$ssg_cell == "LRefl_DElab"] <- "IN"
data$attractor[data$ssg_cell == "LRefl_DHedg"] <- "IN"

# View the first 10 rows of the data
head(data, 10)
##     id turn role       turn_type    discloser_turntype        listener_turntype
## 1  105    1    1        Question    Discloser Question                     <NA>
## 2  105    2    2 Acknowledgement    Discloser Question Listener Acknowledgement
## 3  105    3    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 4  105    4    2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 5  105    5    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 6  105    6    2 Acknowledgement Discloser Elaboration Listener Acknowledgement
## 7  105    7    1     Elaboration Discloser Elaboration Listener Acknowledgement
## 8  105    8    2     Elaboration Discloser Elaboration     Listener Elaboration
## 9  105    9    1     Elaboration Discloser Elaboration     Listener Elaboration
## 10 105   10    2      Reflection Discloser Elaboration      Listener Reflection
##       ssg_cell attractor
## 1         <NA>       OUT
## 2   LAck_DQues       OUT
## 3   LAck_DElab        IN
## 4   LAck_DElab        IN
## 5   LAck_DElab        IN
## 6   LAck_DElab        IN
## 7   LAck_DElab        IN
## 8  LElab_DElab       OUT
## 9  LElab_DElab       OUT
## 10 LRefl_DElab        IN
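
Listing the attractor cells once in a vector can make the definition easier to audit and change. The sketch below is an assumed alternative to the line-by-line assignments, shown on a toy vector of cells; the names attractor_cells, toy_cells, and toy_attractor are hypothetical.

```r
# The six problem description attractor cells
# (discloser elaboration or hedged disclosure crossed with
#  listener acknowledgement, question, or reflection)
attractor_cells <- c("LAck_DElab",  "LAck_DHedg",
                     "LQues_DElab", "LQues_DHedg",
                     "LRefl_DElab", "LRefl_DHedg")

# Hypothetical toy sequence of cells, including a missing first turn
toy_cells <- c(NA, "LAck_DQues", "LAck_DElab", "LElab_DElab", "LRefl_DHedg")

# %in% returns FALSE for NA, so a missing cell is labeled "OUT"
toy_attractor <- ifelse(toy_cells %in% attractor_cells, "IN", "OUT")
toy_attractor
```

As in the code above, a turn with a missing cell ends up labeled "OUT", which is worth keeping in mind when counting turns outside the attractor.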

Calculate the proportion of time in the attractor for each dyad.

attractor_time <- # Select data
                  data %>%
                  # Select grouping variable, in this case, dyad ID (id)
                  dplyr::group_by(id) %>%
                  # Count the occurrence of each category ("IN", "OUT") in the "attractor" variable
                  dplyr::count(attractor) %>%
                  # Remove any missing data
                  tidyr::drop_na() %>%
                  # Calculate the proportion of time in each category
                  dplyr::mutate(attractor_prop = prop.table(n)) %>%
                  # Save the data as a data.frame
                  as.data.frame()

# View the first 10 rows of the attractor_time data
head(attractor_time, 10)
##    id attractor  n attractor_prop
## 1   3        IN 31      0.4769231
## 2   3       OUT 34      0.5230769
## 3  11        IN 34      0.5666667
## 4  11       OUT 26      0.4333333
## 5  12        IN 29      0.3536585
## 6  12       OUT 53      0.6463415
## 7  14        IN 44      0.4888889
## 8  14       OUT 46      0.5111111
## 9  31        IN 59      0.7023810
## 10 31       OUT 25      0.2976190

Proportion of time in attractor descriptives.

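# Examine descriptive statistics of attractor_prop
# (attractor_time has two rows per dyad, "IN" and "OUT", so n = 118 and both proportions are pooled)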
describe(attractor_time$attractor_prop)
##    vars   n mean  sd median trimmed  mad  min  max range skew kurtosis   se
## X1    1 118  0.5 0.2    0.5     0.5 0.23 0.11 0.89  0.78    0    -0.79 0.02
# Plot attractor_prop distribution
      # Select data and variable of interest (attractor_prop)
ggplot(attractor_time, aes(x = attractor_prop)) + 
      # Create histogram and set width of bars in histogram
      geom_histogram(binwidth = 0.10) + 
      # Label x-axis
      xlab("Proportion of Turns in the Problem Description Attractor for Each Dyad") +
      # Label y-axis
      ylab("Frequency") +
      # Update aesthetics
      theme_classic()

Merge the proportion of time in the attractor (rows where attractor == “IN”) into “outcomes.”

outcomes <- merge(outcomes,
                  attractor_time[attractor_time$attractor == "IN", c("id", "attractor_prop")], 
                  by = "id", all.x = TRUE)

# View the first 10 rows of the outcomes data
head(outcomes, 10)
##    id distress total_turns iEntropy attractor_prop
## 1   3        3          65 2.432622      0.4769231
## 2  11        1          60 1.761813      0.5666667
## 3  12        2          82 2.114136      0.3536585
## 4  14        2          90 2.367273      0.4888889
## 5  31        1          84 1.661760      0.7023810
## 6  38        2          90 1.680077      0.6666667
## 7  45        1         115 2.064100      0.5913043
## 8  54        3          79 2.064036      0.7088608
## 9  55        2          64 2.368448      0.1718750
## 10 58        3          69 1.942869      0.7101449

Attractor exit time.

Finally, we will calculate the time (i.e., turn number) when the dyad first leaves the problem description attractor.

Find the first row in which the dyad is out of the attractor (attractor == “OUT”).

attractor_exit <- # Select data
                  data %>%
                  # Select grouping variable, in this case, dyad ID (id)
                  dplyr::group_by(id) %>%
                  # Remove the first row of each group since the first turn will
                  # never be in a cell (since we need two data points to locate a turn
                  # transition in a cell)
                  dplyr::slice(2:n()) %>%
                  # Keep rows in which the dyad is out of the attractor 
                  # (and is not due to missing data)
                  dplyr::filter(attractor == "OUT" & !is.na(ssg_cell)) %>%
                  # Keep the first row in which the dyad is out of the attractor
                  dplyr::filter(row_number()==1) %>%
                  # Save the data as a data.frame
                  as.data.frame()

# Rename turn variable to "attractor_exit"
colnames(attractor_exit)[2] <- "attractor_exit"
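
The exit-time logic can be illustrated on a small example. The base R sketch below uses made-up data and a hypothetical helper, first_exit, and it assumes rows are ordered by turn within each dyad. It also makes explicit what happens to a dyad that never leaves the attractor, a case the filtering approach above would drop from the result.

```r
# Hypothetical toy data: dyad 1 exits at turn 3, dyad 2 never exits
toy_exit_data <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2, 2),
  turn      = c(1, 2, 3, 4, 1, 2, 3),
  ssg_cell  = c(NA, "LAck_DElab", "LElab_DElab", "LAck_DElab",
                NA, "LAck_DElab", "LQues_DElab"),
  attractor = c("OUT", "IN", "OUT", "IN", "OUT", "IN", "IN"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first turn that is OUT and has a valid cell
first_exit <- function(turn, attractor, ssg_cell) {
  out_turns <- turn[attractor == "OUT" & !is.na(ssg_cell)]
  # Return NA when the dyad never leaves the attractor
  if (length(out_turns) == 0) NA_real_ else min(out_turns)
}

# Apply the helper to each dyad
exit_by_dyad <- sapply(split(toy_exit_data, toy_exit_data$id),
                       function(d) first_exit(d$turn, d$attractor, d$ssg_cell))
exit_by_dyad
```

Under this definition, a conversation whose first valid cell is already outside the attractor has an exit time of 2.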

Time of attractor exit descriptives.

# Examine descriptive statistics of attractor_exit
describe(attractor_exit$attractor_exit)
##    vars  n  mean    sd median trimmed mad min max range skew kurtosis   se
## X1    1 59 11.69 10.87      8    10.1 8.9   2  47    45  1.3     1.02 1.41
# Plot attractor_exit distribution
      # Select data and variable of interest (attractor_exit)
ggplot(attractor_exit, aes(x = attractor_exit)) + 
      # Create histogram and set width of bars in histogram
      geom_histogram(binwidth = 5) + 
      # Label x-axis
      xlab("Time (in turns) of Problem Description Attractor Exit") +
      # Label y-axis
      ylab("Frequency") +
      # Update aesthetics
      theme_classic()

Merge “attractor_exit” variable into “outcomes.”

outcomes <- merge(outcomes,
                  attractor_exit[, c("id", "attractor_exit")], 
                  by = "id")

# View the first 10 rows of the outcomes data
head(outcomes, 10)
##    id distress total_turns iEntropy attractor_prop attractor_exit
## 1   3        3          65 2.432622      0.4769231             18
## 2  11        1          60 1.761813      0.5666667             22
## 3  12        2          82 2.114136      0.3536585              7
## 4  14        2          90 2.367273      0.4888889              3
## 5  31        1          84 1.661760      0.7023810             32
## 6  38        2          90 1.680077      0.6666667              2
## 7  45        1         115 2.064100      0.5913043              2
## 8  54        3          79 2.064036      0.7088608              2
## 9  55        2          64 2.368448      0.1718750              4
## 10 58        3          69 1.942869      0.7101449              8

SSG metrics and between-dyad differences.

In this final step, we examine how our state space grid metrics are associated with between-dyad differences in distress. We examine these associations using multiple regression, controlling for the total number of turns in the conversation.

Entropy.

entropy_regression <- lm(# Post-conversation distress is predicted by entropy and total turns
                         distress ~ iEntropy + total_turns, 
                         data = outcomes)
summary(entropy_regression)
## 
## Call:
## lm(formula = distress ~ iEntropy + total_turns, data = outcomes)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.5502 -1.0537 -0.1442  0.8119  2.8918 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  3.029425   1.093803   2.770   0.0076 **
## iEntropy    -0.616536   0.464506  -1.327   0.1898   
## total_turns  0.003784   0.008188   0.462   0.6458   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.217 on 56 degrees of freedom
## Multiple R-squared:  0.03189,    Adjusted R-squared:  -0.002682 
## F-statistic: 0.9224 on 2 and 56 DF,  p-value: 0.4035
# Plot result
      # Select data and set x-axis (iEntropy - i.e., the predictor)
      # and y-axis (distress - i.e., the outcome)
ggplot(outcomes, aes(x = iEntropy, y = distress)) + 
      # Plot each dyad's iEntropy and distress
      geom_point() +
      # Plot smoothed regression line
      stat_smooth(method='lm', formula = y ~ x, color = "blue", linewidth = 1) +
      # Label x-axis
      xlab("Conversation Behavior Flexibility") + 
      # Label y-axis
      ylab("Disclosers' Post-conversation Distress") +
      # Update aesthetics
      theme_classic()

Entropy of conversation behaviors during a support conversation is not significantly associated with disclosers’ post-conversation distress (b = -0.62, p = .19), controlling for the total number of turns within the conversation.
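
When reporting results like these, it can be convenient to pull the estimates and confidence intervals into a single table instead of reading them off the printed summary. The sketch below shows one way to do so. It uses simulated data (toy_outcomes, a hypothetical data frame with the same column names as “outcomes”), so its numbers carry no substantive meaning and are not the tutorial's results.

```r
# Simulated toy data, for illustration only
set.seed(1)
toy_outcomes <- data.frame(
  distress    = sample(1:5, 30, replace = TRUE),
  iEntropy    = runif(30, min = 1, max = 2.5),
  total_turns = sample(60:120, 30, replace = TRUE)
)

# Same model form as above, fit to the toy data
toy_fit <- lm(distress ~ iEntropy + total_turns, data = toy_outcomes)

# Combine the coefficient table with 95% confidence intervals
coef_table <- cbind(coef(summary(toy_fit)),
                    confint(toy_fit, level = 0.95))
round(coef_table, 3)
```

The same two calls, coef(summary()) and confint(), can be applied to the fitted regression objects in this section.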

Problem description attractor.

attractor_regression <- lm(# Post-conversation distress is predicted by time in 
                           # attractor and total turns
                           distress ~ attractor_prop + total_turns, 
                           data = outcomes)
summary(attractor_regression)
## 
## Call:
## lm(formula = distress ~ attractor_prop + total_turns, data = outcomes)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3928 -1.1224 -0.1605  0.8068  3.0325 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)    1.523603   0.966507   1.576    0.121
## attractor_prop 0.631399   0.930769   0.678    0.500
## total_turns    0.002882   0.008247   0.349    0.728
## 
## Residual standard error: 1.231 on 56 degrees of freedom
## Multiple R-squared:  0.009576,   Adjusted R-squared:  -0.0258 
## F-statistic: 0.2707 on 2 and 56 DF,  p-value: 0.7638
# Plot result
      # Select data and set x-axis (attractor_prop - i.e., the predictor)
      # and y-axis (distress - i.e., the outcome)
ggplot(outcomes, aes(x = attractor_prop, y = distress)) + 
      # Plot each dyad's attractor_prop and distress
      geom_point() +
      # Plot smoothed regression line
      stat_smooth(method = "lm", formula = y ~ x, color = "blue", linewidth = 1) +
      # Label x-axis
      xlab("Proportion of Turns in the Problem Description Attractor") + 
      # Label y-axis
      ylab("Disclosers' Post-conversation Distress") +
      # Update aesthetics
      theme_classic()

The proportion of turns spent in the problem description attractor during a support conversation is not significantly associated with disclosers’ post-conversation distress (b = 0.63, SE = 0.93, p = .50), controlling for the total number of turns within the conversation.

Attractor exit time.

exit_regression <- lm(# Post-conversation distress is predicted by time of first
                      # attractor exit and total turns
                      distress ~ attractor_exit + total_turns, 
                      data = outcomes)
summary(exit_regression)
## 
## Call:
## lm(formula = distress ~ attractor_exit + total_turns, data = outcomes)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3816 -1.1258 -0.1257  0.8572  2.9207 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)  
## (Intercept)    1.781912   0.835332   2.133   0.0373 *
## attractor_exit 0.006599   0.015508   0.426   0.6721  
## total_turns    0.003346   0.008561   0.391   0.6974  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.234 on 56 degrees of freedom
## Multiple R-squared:  0.004656,   Adjusted R-squared:  -0.03089 
## F-statistic: 0.131 on 2 and 56 DF,  p-value: 0.8775
# Plot result
      # Select data and set x-axis (attractor_exit - i.e., the predictor)
      # and y-axis (distress - i.e., the outcome)
ggplot(outcomes, aes(x = attractor_exit, y = distress)) + 
      # Plot each dyad's attractor_exit and distress
      geom_jitter() +
      # Plot smoothed regression line
      stat_smooth(method = "lm", formula = y ~ x, color = "blue", linewidth = 1) +
      # Label x-axis
      xlab("Time (in turns) of Problem Description Attractor Exit") + 
      # Label y-axis
      ylab("Disclosers' Post-conversation Distress") +
      # Update aesthetics
      theme_classic()

The timing of the first exit from the problem description attractor during a support conversation is not significantly associated with disclosers’ post-conversation distress (b = 0.007, SE = 0.016, p = .67), controlling for the total number of turns within the conversation.
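Since the three regressions share the same outcome and covariate, it can be convenient to collect their focal coefficients in a single table. The following is a minimal sketch of that idea, not part of the original analysis. It again relies on simulated data (`toy_outcomes` is hypothetical; only the column names follow the tutorial), and the names `predictors` and `metric_table` are introduced here for illustration.

```r
# Simulated stand-in for `outcomes`; values are made up for illustration only.
set.seed(2024)
n_dyads <- 59
toy_outcomes <- data.frame(
  distress       = rnorm(n_dyads, mean = 2, sd = 1.2),
  iEntropy       = runif(n_dyads, min = 1.5, max = 3),
  attractor_prop = runif(n_dyads, min = 0, max = 1),
  attractor_exit = sample(1:60, n_dyads, replace = TRUE),
  total_turns    = sample(40:150, n_dyads, replace = TRUE)
)

# The three state space grid metrics used as focal predictors
predictors <- c("iEntropy", "attractor_prop", "attractor_exit")

# Fit distress ~ <metric> + total_turns for each metric and keep the focal row
metric_table <- do.call(rbind, lapply(predictors, function(p) {
  fit <- lm(reformulate(c(p, "total_turns"), response = "distress"),
            data = toy_outcomes)
  focal <- coef(summary(fit))[p, ]
  data.frame(predictor = p,
             estimate  = focal[["Estimate"]],
             std_error = focal[["Std. Error"]],
             p_value   = focal[["Pr(>|t|)"]],
             r_squared = summary(fit)$r.squared)
}))

metric_table
```

Each row corresponds to a separate model, so the estimates are not adjusted for the other two metrics.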

Additional Information

We created this tutorial with a system environment and versions of R and packages that might be different from yours. If R reports errors when you attempt to run this tutorial, running the code chunk below and comparing your output may be helpful.

session_info(pkgs = c("attached"))
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.2 (2023-10-31)
##  os       macOS Ventura 13.6.3
##  system   x86_64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       America/New_York
##  date     2024-09-16
##  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package  * version date (UTC) lib source
##  devtools * 2.4.5   2022-10-11 [1] CRAN (R 4.3.0)
##  dplyr    * 1.1.4   2023-11-17 [1] CRAN (R 4.3.0)
##  ggplot2  * 3.5.0   2024-02-23 [1] CRAN (R 4.3.2)
##  psych    * 2.4.1   2024-01-18 [1] CRAN (R 4.3.0)
##  tidyr    * 1.3.1   2024-01-24 [1] CRAN (R 4.3.2)
##  usethis  * 2.2.3   2024-02-19 [1] CRAN (R 4.3.2)
##  vctrs    * 0.6.5   2023-12-01 [1] CRAN (R 4.3.0)
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────
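If comparing the full session output by eye is tedious, a short script can flag the packages whose installed version differs from the one listed above. This is only a sketch: the version strings are copied from the session output in this tutorial, and the helper `installed_version()` and the table `version_check` are hypothetical names introduced here.

```r
# Package versions copied from the session info shown above
tutorial_versions <- c(devtools = "2.4.5", dplyr = "1.1.4", ggplot2 = "3.5.0",
                       psych = "2.4.1", tidyr = "1.3.1")

# Return the installed version as a string, or NA if the package is missing
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

version_check <- data.frame(
  package   = names(tutorial_versions),
  tutorial  = unname(tutorial_versions),
  installed = vapply(names(tutorial_versions), installed_version, character(1)),
  row.names = NULL
)

# Flag packages that are missing or whose version differs from the tutorial's
version_check$differs <- is.na(version_check$installed) |
  version_check$installed != version_check$tutorial

version_check
```

A version difference does not necessarily cause an error; the flags only indicate where to look first.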