Conversational Motifs Tutorial
Overview
This tutorial provides R code for conducting the conversational motif analyses presented in the paper “Using Sequence Analysis to Identify Conversational Motifs in Supportive Interactions” (Solomon et al., in press).
The primary analytic technique forwarded and used in the paper is sequence analysis (MacIndoe & Abbott, 2004), a data-driven analytic technique that is used to (a) identify patterns in categorical time-series data and (b) examine differences across pattern groups and/or if and how the timing and prevalence of those patterns is related to an outcome of interest. Additional details about the method and how it is used to analyze and understand conversation data are elaborated in the paper.
In the empirical example presented here, we describe and examine patterns embedded in dyads’ conversations - conversational motifs. We use turn-by-turn data from a subset of the data analyzed in the paper (specifically, 53 conversations between friends during which one friend, the discloser, talked about a current problem with the other friend, the listener). Each speaking turn in these conversations was coded as one of six types: acknowledgement, advice, elaboration, hedged disclosure, question, or reflection (see Bodie et al., 2021 in the Journal of Language and Social Psychology for more details about the creation of the turn typology). We are specifically interested in (a) identifying the specific types of turn-to-turn exchanges that these dyads use in their conversations - i.e., the five-turn sequences that we refer to as conversational motifs - and (b) whether the prevalence and timing of those sequences are associated with post-conversation reports of the discloser’s emotional improvement.
Please note that while the steps of sequence analysis demonstrated in this tutorial are the same as those described in the paper, the steps that connect the conversational motifs to the outcome of interest differ. Specifically, the paper uses multigroup SEMs to examine whether the timing and prevalence of conversational motifs are associated with the outcomes of interest. Here, we instead use regression models, which keeps the focus of the tutorial on sequence analysis.
Note that the accompanying “ConversationalMotifs_Tutorial_2022August20.rmd” file contains all of the code presented in this tutorial and can be opened in RStudio (a somewhat friendlier user interface to R).
Outline
This tutorial covers…
- Reading in the data and loading needed packages.
- Data descriptives.
- Creating five-turn windows.
- Creating sequences.
- Establishing a cost matrix and obtaining a dissimilarity matrix.
- Determining the number of clusters - conversational motifs.
- Examining associations of timing and prevalence of conversational motifs with outcomes.
- Conclusion.
Read in the data and load analysis packages.
Let’s read the data into R.
The exemplar data are stored in two .csv files. One file, “friends_subset.csv”, contains the repeated measures (i.e., turn-by-turn) supportive conversation data for all 53 dyads. The second file, “friends_outcomes_subset.csv”, contains the time-invariant outcome data for all 53 dyads (specifically, disclosers’ self-reported emotional improvement).
# Set working directory (i.e., where the data files are stored)
# This can also be done by going to the top bar of RStudio and selecting
# "Session" --> "Set Working Directory" --> "Choose Directory" -->
# finding the location of the folder that contains the data files
setwd("~/Desktop") # Note: You can skip this line if you have
#the data files and this .rmd file stored in the same directory
# Read in the repeated measures data
<- read.csv(file = "friends_subset.csv", head = TRUE, sep = ",")
data
# View the first 10 rows of the repeated measures data
head(data, 10)
## id turn role turn_type
## 1 1 1 Listener Listener_Acknowledgement
## 2 1 2 Discloser Discloser_Acknowledgement
## 3 1 3 Listener Listener_Acknowledgement
## 4 1 4 Discloser Discloser_Advice
## 5 1 5 Listener Listener_Elaboration
## 6 1 6 Discloser Discloser_Elaboration
## 7 1 7 Listener Listener_Elaboration
## 8 1 8 Discloser Discloser_Elaboration
## 9 1 9 Listener Listener_Acknowledgement
## 10 1 10 Discloser Discloser_Elaboration
# Read in the outcomes data
<- read.csv(file = "friends_outcomes_subset.csv", head = TRUE, sep = ",")
outcomes
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id emo_improve
## 1 80 5.666667
## 2 79 7.000000
## 3 78 5.333333
## 4 77 5.333333
## 5 74 3.000000
## 6 73 1.666667
## 7 72 6.666667
## 8 70 6.000000
## 9 69 4.333333
## 10 68 5.000000
In the repeated measures data (“data”), we can see that each row contains information for one turn and there are multiple rows (i.e., turns) for each dyad. Specifically, there is a column for:
- Dyad ID (id)
- Time variable - in this case, turn in the conversation (turn)
- Dyad member’s role in the conversation (role; “Discloser” or “Listener”)
- Turn type - in this case, based upon a typology derived in Bodie et al. (2021; turn_type)
In the outcome data (“outcomes”), we can see there is one row for each dyad and there are columns for:
- Dyad ID (id)
- Outcome variable - in this case, the discloser’s post-conversation report of emotional improvement (emo_improve; an average of the three items: “I feel better after having talked with my friend,” “My friend made me feel better about myself,” and “I feel more optimistic after having talked with my friend”)
Load the R packages we need.
Packages in R are a collection of functions (and their documentation/explanations) that enable us to conduct particular tasks, such as plotting or fitting a statistical model.
# install.packages("cluster") # Install package if you have never used it before
library(cluster) # For hierarchical cluster analysis
# install.packages("devtools") # Install package if you have never used it before
library(devtools) # For session information (used at the end of the tutorial)
# install.packages("dplyr") # Install package if you have never used it before
library(dplyr) # For data management
# install.packages("ggplot2") # Install package if you have never used it before
library(ggplot2) # For plotting
# install.packages("psych") # Install package if you have never used it before
library(psych) # For descriptive statistics
# install.packages("reshape") # Install package if you have never used it before
library(reshape) # For data management
# install.packages("stringr") # Install package if you have never used it before
library(stringr) # For changing character strings within variables
# install.packages("tidyr") # Install package if you have never used it before
library(tidyr) # For data management
# install.packages("TraMineR") # Install package if you have never used it before
library(TraMineR) # For sequence analysis
# install.packages("TraMineRextras") # Install package if you have never used it before
library(TraMineRextras) # For sequence analysis
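If you are setting up a new R installation, the optional snippet below is a convenience alternative to the line-by-line calls above: it installs any of the tutorial’s packages that are missing and then loads them all.

# Optional: install any missing packages, then load all of them
pkgs <- c("cluster", "devtools", "dplyr", "ggplot2", "psych",
          "reshape", "stringr", "tidyr", "TraMineR", "TraMineRextras")
missing_pkgs <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
invisible(lapply(pkgs, library, character.only = TRUE))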
Data Descriptives.
The goal of this step is to describe our sample, specifically,
- how many dyads are in the data set,
- how many conversation turns there are for each dyad, and
- the frequency of each turn type across all dyads.
- Number of dyads.
# Number of dyads in the repeated measures data
# Length (i.e., number) of unique ID values
length(unique(data$id))
## [1] 53
# Number of dyads in the outcome data
# Length (i.e., number) of unique ID values
length(unique(outcomes$id))
## [1] 53
There are 53 dyads in both data sets.
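Because later steps merge these two files by dyad ID, a quick optional check that the same IDs appear in both data sets can save headaches later.

# Optional check: the repeated measures and outcome data
# should contain the same set of dyad IDs
setequal(unique(data$id), unique(outcomes$id))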
- Number of conversation turns for each dyad.
num_occ <- # Select data
  data %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Count the number of turns in each conversation
  summarise(count = n()) %>%
  # Save the data as a data.frame
  as.data.frame()
# Calculate descriptives on the number of turns per conversation
describe(num_occ$count)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 53 74.7 20.63 68 72.88 19.27 49 132 83 0.67 -0.56 2.83
The dyads in this subset of the data had supportive conversations that had, on average, approximately 75 turns (M = 74.70, SD = 20.63), with the conversations ranging in length from 49 to 132 turns.
Plot a histogram of the number of turns per conversation.
# Select data (num_occ) and value on the x-axis (number of turns per conversation: "count")
ggplot(data = num_occ, aes(x = count)) +
# Create a histogram with binwidth = 5 and white bars outlined in black
geom_histogram(binwidth = 5, fill = "white", color = "black") +
# Label x-axis
labs(x = "Number of Turns per Conversation") +
# Change background aesthetics of plot
theme_classic()
- The total number of turns for each turn type.
# Create table that calculates the number of turns for each turn type
turntype_table <- table(data$turn_type)
# Display the table
turntype_table
##
## Discloser_Acknowledgement Discloser_Advice
## 136 48
## Discloser_Elaboration Discloser_HedgedDisclosure
## 1449 244
## Discloser_Question Discloser_Reflection
## 88 27
## Listener_Acknowledgement Listener_Advice
## 721 78
## Listener_Elaboration Listener_HedgedDisclosure
## 381 97
## Listener_Question Listener_Reflection
## 310 379
We can see that disclosers overall used Elaboration turns the most (1,449 turns), while listeners overall used Acknowledgement turns the most (721 turns).
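If the raw counts are hard to compare across roles, they can also be expressed as proportions of all turns; for example:

# Optional: express the turn-type counts as proportions of all turns
round(prop.table(turntype_table), 3)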
Create Five-turn Windows.
The goal of this step is to prepare the data for sequence analysis by creating five-turn windows, which requires several sub-steps:
- manipulating the data so each conversation begins with a discloser’s turn,
- updating the turn variable so that all conversations begin at turn 1,
- adding empty rows where there is missing information for a turn, and
- creating a data set that contains all five-turn windows for each dyad.
Note: You are not required to begin these five-turn windows with a set role in the dyads. We decided to focus on five-turn windows that begin with a discloser in this analysis, but creating windows that begin with a listener is also possible. In the case of indistinguishable dyads (e.g., two arguers), the first two sub-steps are not necessary.
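To make the windowing concrete before we build it on the real data, here is a toy sketch with made-up turn labels. Overlapping five-turn windows start at every odd-numbered turn; note that windows running past the end of the sequence pick up NAs, which is why partial windows are dropped later.

# Toy example (not the tutorial data): five-turn windows starting at
# every odd-numbered turn of a 10-turn conversation
toy_turns <- paste0("turn", 1:10)
starts <- seq(from = 1, to = length(toy_turns), by = 2)
lapply(starts, function(s) toy_turns[s:(s + 4)])
# The windows starting at turns 7 and 9 run past turn 10 and contain NAs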
Remove rows that precede the first discloser turn.
data1 <- # Select data
  data %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Remove any rows that precede the first "Discloser" row
  slice(which.max(role == "Discloser") : n()) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(data1, 10)
## id turn role turn_type
## 1 1 2 Discloser Discloser_Acknowledgement
## 2 1 3 Listener Listener_Acknowledgement
## 3 1 4 Discloser Discloser_Advice
## 4 1 5 Listener Listener_Elaboration
## 5 1 6 Discloser Discloser_Elaboration
## 6 1 7 Listener Listener_Elaboration
## 7 1 8 Discloser Discloser_Elaboration
## 8 1 9 Listener Listener_Acknowledgement
## 9 1 10 Discloser Discloser_Elaboration
## 10 1 11 Listener Listener_Acknowledgement
Create a new turn number variable that is the same as the original turn variable if the conversation originally started with a discloser turn. If the conversation originally started with a listener turn, then the new turn variable is one less than the original turn variable.
data1 <- # Select data
  data1 %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create a new variable called "turn_minus" that is the value of the "turn" variable minus 1
  mutate(turn_minus = turn - 1) %>%
  # Create a new variable called "newturn"
  # If the first value of "turn" within a dyad is 2, then label "newturn" as "Yes"
  # If the first value of "turn" within a dyad is not 2, then label "newturn" as "No"
  mutate(newturn = if_else(first(turn) == 2, "Yes", "No")) %>%
  # If "newturn" is equal to "Yes", then replace the "newturn" values with the
  # values in "turn_minus"
  # If "newturn" is not equal to "Yes", then keep the values in "turn"
  mutate(newturn = ifelse(newturn == "Yes", turn_minus, turn)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(data1, 10)
## id turn role turn_type turn_minus newturn
## 1 1 2 Discloser Discloser_Acknowledgement 1 1
## 2 1 3 Listener Listener_Acknowledgement 2 2
## 3 1 4 Discloser Discloser_Advice 3 3
## 4 1 5 Listener Listener_Elaboration 4 4
## 5 1 6 Discloser Discloser_Elaboration 5 5
## 6 1 7 Listener Listener_Elaboration 6 6
## 7 1 8 Discloser Discloser_Elaboration 7 7
## 8 1 9 Listener Listener_Acknowledgement 8 8
## 9 1 10 Discloser Discloser_Elaboration 9 9
## 10 1 11 Listener Listener_Acknowledgement 10 10
Some datasets may contain missing data (e.g., because of uncodable turns that were removed from analyses). To ensure that the original turn numbers and ordering are still maintained even with missing data, we add empty rows so there are consecutive time points for all IDs.
data2 <- # Select data
  data1 %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Complete the sequence of values in "newturn" that range
  # from the lowest value in "newturn" to
  # the highest value in "newturn"
  complete(newturn = seq(min(newturn), max(newturn), 1L)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(data2, 10)
## id newturn turn role turn_type turn_minus
## 1 1 1 2 Discloser Discloser_Acknowledgement 1
## 2 1 2 3 Listener Listener_Acknowledgement 2
## 3 1 3 4 Discloser Discloser_Advice 3
## 4 1 4 5 Listener Listener_Elaboration 4
## 5 1 5 6 Discloser Discloser_Elaboration 5
## 6 1 6 7 Listener Listener_Elaboration 6
## 7 1 7 8 Discloser Discloser_Elaboration 7
## 8 1 8 9 Listener Listener_Acknowledgement 8
## 9 1 9 10 Discloser Discloser_Elaboration 9
## 10 1 10 11 Listener Listener_Acknowledgement 10
Finally, we need to create a data set that contains all the five-turn windows for each dyad that begin with a discloser turn. We create this data set in the loop below.
# Change the structure of the "id", "newturn", and "turn_type" variables
data2$id <- as.character(data2$id)
data2$newturn <- as.numeric(data2$newturn)
data2$turn_type <- as.character(data2$turn_type)

# Create a vector of all IDs in the data set that the loop will work through
data2_idlist <- unique(data2$id)

# Set the value of the window length
# (value is 1 less than the desired value in order to select the number of values that follow)
window <- 4

# Set the minimum length of a sequence
# (i.e., do not include partial sequences that may result because of windows toward end of sequence)
min_length <- 5

# Create an empty data set called "window_data"
window_data <- NULL

# Start loop
# For each i (i.e., dyad) in the vector
for(i in 1:length(data2_idlist)){

  # Select i-th subject from the vector
  subject_id <- data2_idlist[i]

  # Subset i-th subject's data
  dat <- subset(data2, id == subject_id)

  # Create a vector that contains all of the turns in the i-th subject's data
  turn_list <- dat$newturn

  # Only keep odd numbered turns from vector (as Discloser turns are odd numbered)
  # Skip this step by inserting a # (pound sign) before turn_list,
  # if you are working with indistinguishable dyads
  turn_list <- turn_list[turn_list %% 2 != 0]

  # Create loop that selects a sub-sequence of turns starting with each turn in "turn_list"
  for(x in turn_list){

    # Select the x to x+window values in the "turn_type" variable and save to object "turntype"
    turntype <- dat[x:(x + window), "turn_type"]

    # Save the subject's ID variable to object "subject"
    subject <- unique(dat$id)

    # Combine the subject value and the turntype values into vector called "new_row"
    new_row <- c(subject, turntype)

    # Add "new_row" to data set "window_data"
    window_data <- rbind(window_data, new_row)

    # Change the column names of "window_data"
    colnames(window_data)[1:6] <- c("id", "turn1", "turn2", "turn3", "turn4", "turn5")
  }
}

# Save "window_data" as data frame
window_data <- as.data.frame(window_data)

# Remove row names from data set
rownames(window_data) <- NULL

# If there is missing data in a row, then delete
window_data <- na.omit(window_data)
# View the first 10 rows of the data
head(window_data, 10)
## id turn1 turn2
## 1 1 Discloser_Acknowledgement Listener_Acknowledgement
## 2 1 Discloser_Advice Listener_Elaboration
## 3 1 Discloser_Elaboration Listener_Elaboration
## 4 1 Discloser_Elaboration Listener_Acknowledgement
## 5 1 Discloser_Elaboration Listener_Acknowledgement
## 6 1 Discloser_Elaboration Listener_Acknowledgement
## 7 1 Discloser_HedgedDisclosure Listener_Acknowledgement
## 16 1 Discloser_Elaboration Listener_Acknowledgement
## 17 1 Discloser_Elaboration Listener_Acknowledgement
## 18 1 Discloser_Elaboration Listener_Acknowledgement
## turn3 turn4
## 1 Discloser_Advice Listener_Elaboration
## 2 Discloser_Elaboration Listener_Elaboration
## 3 Discloser_Elaboration Listener_Acknowledgement
## 4 Discloser_Elaboration Listener_Acknowledgement
## 5 Discloser_Elaboration Listener_Acknowledgement
## 6 Discloser_HedgedDisclosure Listener_Acknowledgement
## 7 Discloser_Elaboration Listener_Acknowledgement
## 16 Discloser_Elaboration Listener_Acknowledgement
## 17 Discloser_Elaboration Listener_Acknowledgement
## 18 Discloser_HedgedDisclosure Listener_Acknowledgement
## turn5
## 1 Discloser_Elaboration
## 2 Discloser_Elaboration
## 3 Discloser_Elaboration
## 4 Discloser_Elaboration
## 5 Discloser_HedgedDisclosure
## 6 Discloser_Elaboration
## 7 Discloser_Elaboration
## 16 Discloser_Elaboration
## 17 Discloser_HedgedDisclosure
## 18 Discloser_HedgedDisclosure
In the five-turn window data (“window_data”), the first column is the dyad ID variable (id) and columns two through six contain the turn type information for the five turns within the window.
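Because we built the windows to start on discloser turns, a quick optional check confirms that every window begins with a discloser turn.

# Optional check: all windows should begin with a Discloser turn
all(startsWith(window_data$turn1, "Discloser"))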
Define Sequences.
Now that we know a little bit more about our repeated measures data and re-formatted our data into five-turn sequences, we can move on to the data preparation required for sequence analysis.
The goal of this step is to:
- create an “alphabet” that represents each of our categories, and
- create and plot the categorical sequence data.
- Create alphabet.
We create an alphabet that represents each possible category within the categorical variable of interest (in this case, “turn_type”). The actual naming of these values is not important, but we are going to name them in a way that facilitates interpretation.
# This object contains the categories that appear in the data set.
<- c("Listener_Question", "Listener_Acknowledgement",
turn_alphabet "Listener_Elaboration", "Listener_HedgedDisclosure",
"Listener_Reflection", "Listener_Advice",
"Discloser_Elaboration", "Discloser_HedgedDisclosure",
"Discloser_Question", "Discloser_Acknowledgement",
"Discloser_Advice", "Discloser_Reflection")
# This object allows for more helpful labels
<- c("Listener_Question", "Listener_Acknowledgement",
turn_labels "Listener_Elaboration", "Listener_HedgedDisclosure",
"Listener_Reflection", "Listener_Advice",
"Discloser_Elaboration", "Discloser_HedgedDisclosure",
"Discloser_Question", "Discloser_Acknowledgement",
"Discloser_Advice", "Discloser_Reflection")
- Create sequences.
Now that the data are in the correct (i.e., wide) format and we have created an alphabet for our sequence analysis, we next need to create a sequence object that can be understood by the R package for sequence analysis (TraMineR).
Before creating the sequences, we first assign colors to each of the categories, which will help us when viewing plots of the sequences. This step is not required since there is a default color palette, but this gives us control over what color is assigned to which category. We do this by assigning each category (i.e., turn type) a hex code (https://www.color-hex.com/). The categories should be written as they appear in the alphabet created above.
A note on accessibility: To make your plots accessible, you may consider adopting a colorblind-friendly palette. David Nichols’ website (https://davidmathlogic.com/colorblind/) provides a great explainer on this issue, as well as a color picking tool.
<- "#619CFF" # Blue
Listener_Acknowledgement <- "#FFE700" # Yellow
Listener_Advice <- "#F8766D" # Red
Listener_Elaboration <- "#FFA500" # Orange
Listener_HedgedDisclosure <- "#00BA38" # Green
Listener_Question <- "#DB72FB" # Purple
Listener_Reflection
<- "#619CFF" # Blue
Discloser_Acknowledgement <- "#FFE700" # Yellow
Discloser_Advice <- "#F8766D" # Red
Discloser_Elaboration <- "#FFA500" # Orange
Discloser_HedgedDisclosure <- "#00BA38" # Green
Discloser_Question <- "#DB72FB" # Purple Discloser_Reflection
Next, we create an object (“turn_seq”) that contains all of the sequences in the format needed for the sequence analysis package.
turn_seq <- TraMineR::seqdef(window_data, # Select data
                             var = 2:6, # Columns containing repeated measures data
                             alphabet = turn_alphabet, # Alphabet
                             labels = turn_labels, # Labels
                             xtstep = 5, # Steps between tick marks
                             cpal = c(Listener_Question, Listener_Acknowledgement,
                                      Listener_Elaboration, Listener_HedgedDisclosure,
                                      Listener_Reflection, Listener_Advice,
                                      Discloser_Elaboration, Discloser_HedgedDisclosure,
                                      Discloser_Question, Discloser_Acknowledgement,
                                      Discloser_Advice, Discloser_Reflection)) # Color palette
## [>] 12 distinct states appear in the data:
## 1 = Discloser_Acknowledgement
## 2 = Discloser_Advice
## 3 = Discloser_Elaboration
## 4 = Discloser_HedgedDisclosure
## 5 = Discloser_Question
## 6 = Discloser_Reflection
## 7 = Listener_Acknowledgement
## 8 = Listener_Advice
## 9 = Listener_Elaboration
## 10 = Listener_HedgedDisclosure
## 11 = Listener_Question
## 12 = Listener_Reflection
## [>] state coding:
## [alphabet] [label] [long label]
## 1 Listener_Question Listener_Question Listener_Question
## 2 Listener_Acknowledgement Listener_Acknowledgement Listener_Acknowledgement
## 3 Listener_Elaboration Listener_Elaboration Listener_Elaboration
## 4 Listener_HedgedDisclosure Listener_HedgedDisclosure Listener_HedgedDisclosure
## 5 Listener_Reflection Listener_Reflection Listener_Reflection
## 6 Listener_Advice Listener_Advice Listener_Advice
## 7 Discloser_Elaboration Discloser_Elaboration Discloser_Elaboration
## 8 Discloser_HedgedDisclosure Discloser_HedgedDisclosure Discloser_HedgedDisclosure
## 9 Discloser_Question Discloser_Question Discloser_Question
## 10 Discloser_Acknowledgement Discloser_Acknowledgement Discloser_Acknowledgement
## 11 Discloser_Advice Discloser_Advice Discloser_Advice
## 12 Discloser_Reflection Discloser_Reflection Discloser_Reflection
## [>] 1721 sequences in the data set
## [>] min/max sequence length: 5/5
A lot of text will appear after the sequence object is created. This text tells you about the number of sequences (which should be equal to the number of five-turn sequences in the sample), the states (i.e., categories) that appear in the sequence object, and the alphabet and labels of the categories.
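Before plotting, you can also inspect a few sequences directly. For example, TraMineR’s print method for sequence objects supports the compact SPS (state-permanence-sequence) format, in which, e.g., (Discloser_Elaboration,2) means that state lasts for two consecutive turns.

# Optional: view the first five sequences in compact SPS format
print(turn_seq[1:5, ], format = "SPS")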
Finally, we can plot the sequences.
seqIplot(turn_seq, # Sequence object
with.legend = "right", # Display legend on right side of plot
cex.legend = 0.8, # Change size of legend
main = "Turn Type Use during a Conversational Motif", # Plot title
legend.prop = .4) # Proportion of space for legend
In this plot, each row represents a single five-turn conversational motif, with the different turn types within that motif shown as different colors. We can see that the conversations varied in content, although elaboration turns (in red) appear frequently across all conversations.
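Whole-sequence plots like this can become dense when there are many sequences. An optional complementary view is a state distribution plot, which shows the proportion of each turn type at each of the five positions in the window.

# Optional alternative: state distribution plot
seqdplot(turn_seq, # Sequence object
         with.legend = "right", # Display legend on right side of plot
         cex.legend = 0.8, # Change size of legend
         main = "Turn Type Distribution by Position", # Plot title
         border = NA) # No bar borders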
Establish a Cost Matrix and Obtain Dissimilarity Matrix.
The goal of this step is to establish a cost matrix and to obtain a dissimilarity matrix.
Sequence analysis aims to identify groups of sequences that are similar by clustering together sequences based on their distances. The distance between any pair of sequences is calculated as the minimum “cost” of transforming one sequence into another and is calculated using an optimal matching algorithm. In the transformation, there are specific costs associated with inserting, deleting, and substituting elements of the sequence, as well as costs for substituting missing values. We set these costs when conducting the sequence analysis.
There are a number of ways to set substitution costs. Typically, substitution costs are established as the distance between categories. However, we do not have an ordinal scale for the categories (i.e., there is no logical order or distance between our turn types; e.g., what turn type is “closest” to acknowledgement?). In this case, we use a constant cost matrix (i.e., the distance between any pair of turn types is the same). If we had a theoretical rationale for sorting turn types that are more or less similar to each other, we could use Manhattan (city-block) or Euclidean distance.
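For illustration, TraMineR can also derive substitution costs from the data; the sketch below (an alternative we do not use here) bases costs on observed transition rates, so that turn types that frequently follow one another are treated as more similar.

# Alternative (not used in this tutorial): substitution costs derived
# from observed transition rates between turn types
costmatrix_trate <- seqsubm(turn_seq, method = "TRATE")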
We need to create a substitution cost matrix before conducting the sequence analysis. In general, the substitution cost matrix is a (k+1) by (k+1) matrix, with k = number of categories plus an additional right-most column and bottom row for missingness costs (half of the highest cost, which in this case is half of 2, i.e., 1). In our case, we have 12 turn type categories (six turn types for each of the two roles), so the substitution cost matrix with missingness costs would be 13 x 13. Because our sequence object contains no missing values, however, TraMineR drops the missingness row and column and creates a 12 x 12 matrix (see the message in the output below).
Here, we establish our substitution cost matrix.
# Create substitution cost matrix and save to the object "costmatrix"
costmatrix <- seqsubm(turn_seq, # Sequence object
                      method = "CONSTANT", # Method to determine costs
                      cval = 2, # Substitution cost
                      with.missing = TRUE, # Allows for missingness state
                      miss.cost = 1, # Cost for substituting a missing state
                      time.varying = FALSE, # Does not allow the cost to vary over time
                      weighted = TRUE) # Allows weights to be used when applicable
## [!!] seqcost: 'with.missing' set as FALSE as 'seqdata' has no non-void missing values
## [>] creating 12x12 substitution-cost matrix using 2 as constant value
# Examine substitution cost matrix
costmatrix
## Listener_Question Listener_Acknowledgement
## Listener_Question 0 2
## Listener_Acknowledgement 2 0
## Listener_Elaboration 2 2
## Listener_HedgedDisclosure 2 2
## Listener_Reflection 2 2
## Listener_Advice 2 2
## Discloser_Elaboration 2 2
## Discloser_HedgedDisclosure 2 2
## Discloser_Question 2 2
## Discloser_Acknowledgement 2 2
## Discloser_Advice 2 2
## Discloser_Reflection 2 2
## Listener_Elaboration Listener_HedgedDisclosure
## Listener_Question 2 2
## Listener_Acknowledgement 2 2
## Listener_Elaboration 0 2
## Listener_HedgedDisclosure 2 0
## Listener_Reflection 2 2
## Listener_Advice 2 2
## Discloser_Elaboration 2 2
## Discloser_HedgedDisclosure 2 2
## Discloser_Question 2 2
## Discloser_Acknowledgement 2 2
## Discloser_Advice 2 2
## Discloser_Reflection 2 2
## Listener_Reflection Listener_Advice
## Listener_Question 2 2
## Listener_Acknowledgement 2 2
## Listener_Elaboration 2 2
## Listener_HedgedDisclosure 2 2
## Listener_Reflection 0 2
## Listener_Advice 2 0
## Discloser_Elaboration 2 2
## Discloser_HedgedDisclosure 2 2
## Discloser_Question 2 2
## Discloser_Acknowledgement 2 2
## Discloser_Advice 2 2
## Discloser_Reflection 2 2
## Discloser_Elaboration Discloser_HedgedDisclosure
## Listener_Question 2 2
## Listener_Acknowledgement 2 2
## Listener_Elaboration 2 2
## Listener_HedgedDisclosure 2 2
## Listener_Reflection 2 2
## Listener_Advice 2 2
## Discloser_Elaboration 0 2
## Discloser_HedgedDisclosure 2 0
## Discloser_Question 2 2
## Discloser_Acknowledgement 2 2
## Discloser_Advice 2 2
## Discloser_Reflection 2 2
## Discloser_Question Discloser_Acknowledgement
## Listener_Question 2 2
## Listener_Acknowledgement 2 2
## Listener_Elaboration 2 2
## Listener_HedgedDisclosure 2 2
## Listener_Reflection 2 2
## Listener_Advice 2 2
## Discloser_Elaboration 2 2
## Discloser_HedgedDisclosure 2 2
## Discloser_Question 0 2
## Discloser_Acknowledgement 2 0
## Discloser_Advice 2 2
## Discloser_Reflection 2 2
## Discloser_Advice Discloser_Reflection
## Listener_Question 2 2
## Listener_Acknowledgement 2 2
## Listener_Elaboration 2 2
## Listener_HedgedDisclosure 2 2
## Listener_Reflection 2 2
## Listener_Advice 2 2
## Discloser_Elaboration 2 2
## Discloser_HedgedDisclosure 2 2
## Discloser_Question 2 2
## Discloser_Acknowledgement 2 2
## Discloser_Advice 0 2
## Discloser_Reflection 2 0
Now that we have created the substitution cost matrix, we can calculate the distances between each pair of five-turn sequences.
We use an optimal matching algorithm. The output of the sequence analysis is an n x n (n = number of five-turn sequences) dissimilarity matrix, where each element indicates the minimal cost of transforming the sequence in that row into the sequence in that column. Insertion/deletion costs are typically set to 1.0, substitution costs are set to the matrix we established above, and missingness costs are typically set to half the highest cost within the matrix (and are included in the substitution cost matrix we established above).
Note: Other algorithms are available, and they can be specified in the method = “” argument below. To see other algorithms available in the TraMineR package, type ?seqdist in the console or type seqdist in the search bar at the top of the Help tab on the right.
# Obtain distance matrix
dist_om <- seqdist(turn_seq, # Sequence object
                   method = "OM", # Optimal matching algorithm
                   indel = 1.0, # Insert/deletion costs set to 1
                   sm = costmatrix, # Substitution cost matrix
                   with.missing = TRUE)
## [!!] seqdist: 'with.missing' set as FALSE as 'seqdata' has no non-void missing values
## [>] 1721 sequences with 12 distinct states
## [>] checking 'sm' (size and triangle inequality)
## [>] 549 distinct sequences
## [>] min/max sequence lengths: 5/5
## [>] computing distances using the OM metric
## [>] elapsed time: 0.191 secs
# Examine the top left corner of the dissimilarity matrix
dist_om[1:10, 1:10]
## 1 2 3 4 5 6 7 16 17 18
## 1 0 4 6 6 6 6 6 6 6 8
## 2 4 0 4 6 6 6 6 6 6 8
## 3 6 4 0 2 4 4 4 2 4 6
## 4 6 6 2 0 2 2 2 0 2 4
## 5 6 6 4 2 0 4 4 2 0 2
## 6 6 6 4 2 4 0 4 2 4 2
## 7 6 6 4 2 4 4 0 2 4 6
## 16 6 6 2 0 2 2 2 0 2 4
## 17 6 6 4 2 0 4 4 2 0 2
## 18 8 8 6 4 2 2 6 4 2 0
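A few optional sanity checks on the dissimilarity matrix: it should have one row and column per five-turn sequence, be symmetric, and have zeros on the diagonal.

# Optional sanity checks on the dissimilarity matrix
dim(dist_om) # 1721 x 1721: one row/column per five-turn sequence
isSymmetric(dist_om) # Distances are symmetric
all(diag(dist_om) == 0) # Each sequence is at distance 0 from itself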
Determine Number of Clusters - Conversational Motifs.
The goal of this step is to determine the number of clusters - i.e., a typology of conversational motifs - using a data-driven approach.
We next use the distance matrix obtained in the prior step to determine an appropriate number of clusters that represent the different conversational motifs within our supportive conversations. We use hierarchical cluster analysis with Ward’s method to determine the number of clusters that represent the data well. We then create an object that contains cluster membership for each five-turn sequence (which will be used in the final step) and plot the clusters.
Conduct hierarchical cluster analysis and save cluster analysis results to the object “clusterward”.
# Insert dissimilarity matrix ("dist_om"),
# indicate that we are using a dissimilarity matrix, and
# indicate that we want to use Ward's single linkage clustering method
<- cluster::agnes(dist_om, diss = TRUE, method = "ward")
clusterward
# Plot the results of the cluster analysis using a dendrogram
# Insert cluster analysis results object ("clusterward")
plot(clusterward, which.plot = 2)
In this example, the resulting dendrogram indicates three clusters. We reached this conclusion by examining the length of the vertical lines (longer vertical lines indicate greater differences between groups) and the number of five-turn sequences within each group (we didn’t want a group with too few sequences). After selecting a 3-cluster solution, we plot the sequences of the three clusters for visual comparison.
# Cut dendrogram (or tree) by the number of determined groups (in this case, 3)
# Insert cluster analysis results object ("clusterward")
# and the number of cut points
cl3 <- cutree(clusterward, k = 3)

# Turn cut points into a factor variable and label them
# Insert cut point object ("cl3") and create labels
# by combining the text "Type" with either 1, 2, or 3
cl3fac <- factor(cl3, labels = paste("Type", 1:3))
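Before plotting, it can also help to see how many five-turn sequences fall into each cluster and, optionally, to compute the average silhouette width as a numeric complement to the dendrogram (a check we add here for illustration).

# Number of five-turn sequences in each cluster
table(cl3fac)

# Optional: average silhouette width for the 3-cluster solution
# (values closer to 1 indicate better-separated clusters)
sil <- cluster::silhouette(cl3, dmatrix = dist_om)
summary(sil)$avg.width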
# Plot the sequences for each cluster
seqplot(turn_seq, # Sequence object
group = cl3fac, # Grouping factor level variable
type = "I", # Create whole sequence plot
sortv = "from.start", # Sort sequences based upon the category in which they begin
with.legend = "right", # Display legend on right side of plot
cex.legend = 0.8, # Change size of legend
border = NA) # No plot border
Notable interpretations of the clusters include: (1) the listeners in the “Type 1” dyads use more elaboration turns, which fits with the listener-focused elaboration conversational motif; (2) the disclosers in the “Type 2” dyads spend most of their time elaborating on their problem, which fits with the discloser problem description conversational motif; and (3) the disclosers in the “Type 3” dyads use elaboration and hedged disclosure turns, which fits with the discloser problem processing conversational motif. These plots can sometimes be difficult to distinguish, so further descriptives (e.g., the percentage of each turn type that comprises each cluster) can be helpful; a sketch of one such descriptive follows.
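As one sketch of such a descriptive (an addition of ours, not part of the original analysis), the snippet below pools the five positions of each window and tabulates the proportion of each turn type within each cluster.

# Proportion of each turn type within each cluster,
# pooling across all five positions of the window
turns_long <- unlist(window_data[, 2:6]) # Stack turn1-turn5 into one vector
clusters_long <- rep(cl3, times = 5) # Matching cluster memberships
round(prop.table(table(turns_long, clusters_long), margin = 2), 2)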
In the next (and final) step, we will examine the associations of the timing and prevalence of conversational motifs with the discloser’s emotional improvement following the conversation.
Examine Associations of Prevalence and Timing of Conversational Motifs with Outcomes.
The goal of the final step is to examine whether the prevalence and timing of conversational motifs are associated with the discloser’s post-conversation reports of emotional improvement.
This step involves (1) data management substeps that add the conversational motif cluster information to the sequence data and calculate the proportion of each conversational motif within each third of the conversation, and (2) fitting regression models that examine the association between the timing and prevalence of each conversational motif and the discloser’s emotional improvement.
Data Management.
We first add the cluster information back into the data set.
# Add grouping variable to data set
window_data$cluster <- cl3
We next create a variable that counts the window number, which preserves the temporal order in which the conversational motifs occurred for the later analyses.
window_data <- # Select data
  window_data %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create new variable "windownumber" that counts from 1 to n (the last row)
  mutate(windownumber = 1:n()) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(window_data, 10)
## id turn1 turn2
## 1 1 Discloser_Acknowledgement Listener_Acknowledgement
## 2 1 Discloser_Advice Listener_Elaboration
## 3 1 Discloser_Elaboration Listener_Elaboration
## 4 1 Discloser_Elaboration Listener_Acknowledgement
## 5 1 Discloser_Elaboration Listener_Acknowledgement
## 6 1 Discloser_Elaboration Listener_Acknowledgement
## 7 1 Discloser_HedgedDisclosure Listener_Acknowledgement
## 8 1 Discloser_Elaboration Listener_Acknowledgement
## 9 1 Discloser_Elaboration Listener_Acknowledgement
## 10 1 Discloser_Elaboration Listener_Acknowledgement
## turn3 turn4
## 1 Discloser_Advice Listener_Elaboration
## 2 Discloser_Elaboration Listener_Elaboration
## 3 Discloser_Elaboration Listener_Acknowledgement
## 4 Discloser_Elaboration Listener_Acknowledgement
## 5 Discloser_Elaboration Listener_Acknowledgement
## 6 Discloser_HedgedDisclosure Listener_Acknowledgement
## 7 Discloser_Elaboration Listener_Acknowledgement
## 8 Discloser_Elaboration Listener_Acknowledgement
## 9 Discloser_Elaboration Listener_Acknowledgement
## 10 Discloser_HedgedDisclosure Listener_Acknowledgement
## turn5 cluster windownumber
## 1 Discloser_Elaboration 1 1
## 2 Discloser_Elaboration 1 2
## 3 Discloser_Elaboration 2 3
## 4 Discloser_Elaboration 3 4
## 5 Discloser_HedgedDisclosure 3 5
## 6 Discloser_Elaboration 3 6
## 7 Discloser_Elaboration 3 7
## 8 Discloser_Elaboration 3 8
## 9 Discloser_HedgedDisclosure 3 9
## 10 Discloser_HedgedDisclosure 3 10
We then need to calculate the time each conversational motif occurred within a conversation. We do so by calculating time as the proportion of the conversation at which that motif occurred, with 0 representing the beginning of the conversation and 1 representing the end of the conversation.
We first create a new data set that calculates the time in proportion for each conversational motif.
window_data2 <- # Select data
  window_data %>%
  # Select variables of interest: dyad ID, conversational motif cluster, window number
  select(id, cluster, windownumber) %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create new variables
  # The "max_turn" variable determines how many motif windows are in the conversation
  # The "time_prop" variable is calculated by dividing the window number by the
  # total number of motif windows in the conversation
  # The "time_prop" value is then rounded to 3 decimal places
  mutate(max_turn = max(windownumber),
         time_prop = windownumber/max_turn,
         time_prop = round(time_prop, digits = 3)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(window_data2, 10)
## id cluster windownumber max_turn time_prop
## 1 1 1 1 37 0.027
## 2 1 1 2 37 0.054
## 3 1 2 3 37 0.081
## 4 1 3 4 37 0.108
## 5 1 3 5 37 0.135
## 6 1 3 6 37 0.162
## 7 1 3 7 37 0.189
## 8 1 3 8 37 0.216
## 9 1 3 9 37 0.243
## 10 1 3 10 37 0.270
We then create a data set whose rows represent 1000 time points in the conversation, since we’ve rounded the proportion of time to the third decimal place.
# Create vector of id values
window_data_id <- unique(window_data2$id)

# Determine the number of unique IDs
length(window_data_id) # 53 dyads - which matches what we determined earlier in the tutorial
## [1] 53
# Create sequence that contains values between 0 and 1 to the .001 decimal place
turns_seq <- seq(from = 0, to = 1, length = 1000)

# Create data set with two variables:
# id variable that is repeated 1000 times for each id
# time_prop variable that repeats the "turns_seq" sequence for each id (i.e., repeat it 53 times)
turns_seq_data <- data.frame(id = rep(window_data_id, each = 1000),
                             time_prop = rep(turns_seq, 53))

# Round value in "time_prop" to 3 decimal places
turns_seq_data$time_prop <- round(turns_seq_data$time_prop, digits = 3)
# View the first 10 rows of the data
head(turns_seq_data, 10)
## id time_prop
## 1 1 0.000
## 2 1 0.001
## 3 1 0.002
## 4 1 0.003
## 5 1 0.004
## 6 1 0.005
## 7 1 0.006
## 8 1 0.007
## 9 1 0.008
## 10 1 0.009
Merge the conversational motif data (“window_data2”) with the 1000-step time data (“turns_seq_data”).
window_data3 <- merge(turns_seq_data, # Select 1000-step time data
                      window_data2, # Select conversational motif data
                      by = c("id", "time_prop"), # Merge on the ID and time variables
                      all.x = TRUE) # Keep all rows of the 1000-step time data
# View the first 10 rows of the data
head(window_data3, 10)
## id time_prop cluster windownumber max_turn
## 1 1 0.000 NA NA NA
## 2 1 0.001 NA NA NA
## 3 1 0.002 NA NA NA
## 4 1 0.003 NA NA NA
## 5 1 0.004 NA NA NA
## 6 1 0.005 NA NA NA
## 7 1 0.006 NA NA NA
## 8 1 0.007 NA NA NA
## 9 1 0.008 NA NA NA
## 10 1 0.009 NA NA NA
Finally, we need to fill in the conversational cluster information in the “cluster” column.
# Fill in NA values for cluster
window_data3 <- # Select data
  window_data3 %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Fill NAs in the cluster column upward, so the NAs that precede a value
  # are filled in with that value
  fill(cluster, .direction = "up") %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(window_data3, 10)
## id time_prop cluster windownumber max_turn
## 1 1 0.000 1 NA NA
## 2 1 0.001 1 NA NA
## 3 1 0.002 1 NA NA
## 4 1 0.003 1 NA NA
## 5 1 0.004 1 NA NA
## 6 1 0.005 1 NA NA
## 7 1 0.006 1 NA NA
## 8 1 0.007 1 NA NA
## 9 1 0.008 1 NA NA
## 10 1 0.009 1 NA NA
Now that we have a more precise measure of time in our data, we next need to divide the conversations into phases. Here, we divide the conversations into three phases (first, middle, and final) and add a variable indicating the phase of the conversation to the data.
# Create labels that will be repeated for each ID
# Since we are dividing the conversations (as measured in 1000 steps) into thirds,
# we repeat each value either 334 or 333 times
convo_part <- c(rep("first", 334), rep("middle", 333), rep("final", 333))

# Add variable to data set that contains phase information for each dyad
window_data3$convo_part <- rep(convo_part, 53)
window_data3
# View the first 10 rows of the data
head(window_data3, 10)
## id time_prop cluster windownumber max_turn convo_part
## 1 1 0.000 1 NA NA first
## 2 1 0.001 1 NA NA first
## 3 1 0.002 1 NA NA first
## 4 1 0.003 1 NA NA first
## 5 1 0.004 1 NA NA first
## 6 1 0.005 1 NA NA first
## 7 1 0.006 1 NA NA first
## 8 1 0.007 1 NA NA first
## 9 1 0.008 1 NA NA first
## 10 1 0.009 1 NA NA first
As our last data preparation step, we calculate the proportion of each conversational motif type for each phase of the conversation.
window_data_prop <- # Select data
  window_data3 %>%
  # Select grouping variables, in this case, dyad ID (id), phase of the conversation,
  # and then motif type
  group_by(id, convo_part, cluster) %>%
  # Calculate the proportion of each motif type
  # Divide by 333 because that is (approximately) the total number of instances
  # in each phase
  summarise(percentage = n()/333) %>%
  # Create new variable that connects the cluster value
  # with the phase of the conversation
  mutate(cluster_time = paste0(cluster, "_", convo_part)) %>%
  # Save the data as a data.frame
  as.data.frame()
## `summarise()` has grouped output by 'id', 'convo_part'. You can override using
## the `.groups` argument.
# View the first 10 rows of the data
head(window_data_prop, 10)
## id convo_part cluster percentage cluster_time
## 1 1 final 1 0.24324324 1_final
## 2 1 final 2 0.35135135 2_final
## 3 1 final 3 0.40540541 3_final
## 4 1 first 1 0.16516517 1_first
## 5 1 first 2 0.08108108 2_first
## 6 1 first 3 0.75675676 3_first
## 7 1 middle 2 0.13513514 2_middle
## 8 1 middle 3 0.86486486 3_middle
## 9 10 final 1 0.15915916 1_final
## 10 10 final 2 0.60060060 2_final
We then reshape the data from long to wide so each column of the data represents the proportion of each motif type at each phase of the conversation.
window_data_third_prop <- reshape(# Select variables of interest
  data = window_data_prop[, c("id", "percentage", "cluster_time")],
  # Select variable that will now represent the columns
  timevar = c("cluster_time"),
  # Select the ID variable
  idvar = c("id"),
  # Select the variable that will be the values in the columns
  v.names = "percentage",
  # Reshape from long to wide
  direction = "wide",
  # Separate words in column names with _
  sep = "_")

# Replace NAs with 0s for proportion variables
window_data_third_prop[, 2:10][is.na(window_data_third_prop[, 2:10])] <- 0
# View the first 10 rows of the data
head(window_data_third_prop, 10)
## id percentage_1_final percentage_2_final percentage_3_final
## 1 1 0.24324324 0.3513514 0.4054054
## 9 10 0.15915916 0.6006006 0.2402402
## 17 11 0.00000000 0.6396396 0.3603604
## 24 12 0.06906907 0.4444444 0.4864865
## 32 13 0.00000000 0.8888889 0.1111111
## 40 14 0.42942943 0.1231231 0.4474474
## 49 15 0.00000000 0.6006006 0.3993994
## 56 16 0.06906907 0.7927928 0.1381381
## 65 17 0.20720721 0.5165165 0.2762763
## 74 18 0.09909910 0.3003003 0.6006006
## percentage_1_first percentage_2_first percentage_3_first percentage_2_middle
## 1 0.16516517 0.08108108 0.7567568 0.1351351
## 9 0.00000000 0.75975976 0.2432432 0.4384384
## 17 0.06306306 0.39939940 0.5405405 0.6426426
## 24 0.14114114 0.55855856 0.3033033 0.6066066
## 32 0.11111111 0.11111111 0.7807808 0.3333333
## 40 0.42642643 0.20720721 0.3693694 0.3423423
## 49 0.00000000 0.30330330 0.6996997 0.4954955
## 56 0.06906907 0.52252252 0.4114114 0.7297297
## 65 0.06906907 0.62462462 0.3093093 0.3093093
## 74 0.00000000 0.70270270 0.3003003 0.3513514
## percentage_3_middle percentage_1_middle
## 1 0.8648649 0.00000000
## 9 0.2402402 0.32132132
## 17 0.3573574 0.00000000
## 24 0.3933934 0.00000000
## 32 0.3333333 0.33333333
## 40 0.3513514 0.30630631
## 49 0.3063063 0.19819820
## 56 0.2042042 0.06606607
## 65 0.3453453 0.34534535
## 74 0.4474474 0.20120120
Merge the outcome variable of interest with the proportion data.
<- merge(window_data_third_prop, outcomes, by = "id")
window_data_third_prop
# View the first 10 rows of the data
head(window_data_third_prop, 10)
## id percentage_1_final percentage_2_final percentage_3_final
## 1 1 0.24324324 0.3513514 0.4054054
## 2 10 0.15915916 0.6006006 0.2402402
## 3 11 0.00000000 0.6396396 0.3603604
## 4 12 0.06906907 0.4444444 0.4864865
## 5 13 0.00000000 0.8888889 0.1111111
## 6 14 0.42942943 0.1231231 0.4474474
## 7 15 0.00000000 0.6006006 0.3993994
## 8 16 0.06906907 0.7927928 0.1381381
## 9 17 0.20720721 0.5165165 0.2762763
## 10 18 0.09909910 0.3003003 0.6006006
## percentage_1_first percentage_2_first percentage_3_first percentage_2_middle
## 1 0.16516517 0.08108108 0.7567568 0.1351351
## 2 0.00000000 0.75975976 0.2432432 0.4384384
## 3 0.06306306 0.39939940 0.5405405 0.6426426
## 4 0.14114114 0.55855856 0.3033033 0.6066066
## 5 0.11111111 0.11111111 0.7807808 0.3333333
## 6 0.42642643 0.20720721 0.3693694 0.3423423
## 7 0.00000000 0.30330330 0.6996997 0.4954955
## 8 0.06906907 0.52252252 0.4114114 0.7297297
## 9 0.06906907 0.62462462 0.3093093 0.3093093
## 10 0.00000000 0.70270270 0.3003003 0.3513514
## percentage_3_middle percentage_1_middle emo_improve
## 1 0.8648649 0.00000000 6.000000
## 2 0.2402402 0.32132132 7.000000
## 3 0.3573574 0.00000000 3.000000
## 4 0.3933934 0.00000000 4.000000
## 5 0.3333333 0.33333333 5.000000
## 6 0.3513514 0.30630631 5.333333
## 7 0.3063063 0.19819820 6.666667
## 8 0.2042042 0.06606607 7.000000
## 9 0.3453453 0.34534535 4.000000
## 10 0.4474474 0.20120120 5.000000
The data are finally ready for regression analyses!
Data Analysis.
We examine the association of the timing and prevalence of each conversational motif with the discloser’s emotional improvement. We examine each type of conversational motif separately.
Type 1: Listener-focused elaboration.
type1_reg <- lm(emo_improve ~ percentage_1_first + percentage_1_middle +
                  percentage_1_final, data = window_data_third_prop)
summary(type1_reg)
##
## Call:
## lm(formula = emo_improve ~ percentage_1_first + percentage_1_middle +
## percentage_1_final, data = window_data_third_prop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3880 -0.7447 0.0100 0.9627 2.0165
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.99002 0.28341 17.607 <2e-16 ***
## percentage_1_first 0.23645 1.44401 0.164 0.871
## percentage_1_middle -0.02926 1.17913 -0.025 0.980
## percentage_1_final 2.20076 1.37425 1.601 0.116
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.353 on 49 degrees of freedom
## Multiple R-squared: 0.06725, Adjusted R-squared: 0.01014
## F-statistic: 1.178 on 3 and 49 DF, p-value: 0.3279
The timing and prevalence of the listener-focused elaboration motif were not associated with the discloser’s emotional improvement following the supportive conversation.
Type 2: Discloser problem description.
type2_reg <- lm(emo_improve ~ percentage_2_first + percentage_2_middle +
                  percentage_2_final, data = window_data_third_prop)
summary(type2_reg)
##
## Call:
## lm(formula = emo_improve ~ percentage_2_first + percentage_2_middle +
## percentage_2_final, data = window_data_third_prop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5974 -0.7426 -0.0661 1.2742 2.1831
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.74145 0.46740 12.284 <2e-16 ***
## percentage_2_first -0.85943 1.00624 -0.854 0.397
## percentage_2_middle -0.26380 0.99979 -0.264 0.793
## percentage_2_final 0.07066 0.96776 0.073 0.942
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.38 on 49 degrees of freedom
## Multiple R-squared: 0.02945, Adjusted R-squared: -0.02997
## F-statistic: 0.4957 on 3 and 49 DF, p-value: 0.687
The timing and prevalence of the discloser problem description motif were not associated with the discloser’s emotional improvement following the supportive conversation.
Type 3: Discloser problem processing.
type3_reg <- lm(emo_improve ~ percentage_3_first + percentage_3_middle +
                  percentage_3_final, data = window_data_third_prop)
summary(type3_reg)
##
## Call:
## lm(formula = emo_improve ~ percentage_3_first + percentage_3_middle +
## percentage_3_final, data = window_data_third_prop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4991 -0.8959 0.0432 1.2095 2.1759
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.2058 0.4872 10.684 2.14e-14 ***
## percentage_3_first 0.8396 0.9865 0.851 0.399
## percentage_3_middle 0.7607 1.1117 0.684 0.497
## percentage_3_final -1.4248 1.0773 -1.323 0.192
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.37 on 49 degrees of freedom
## Multiple R-squared: 0.04418, Adjusted R-squared: -0.01434
## F-statistic: 0.7549 on 3 and 49 DF, p-value: 0.5249
The timing and prevalence of the discloser problem processing motif were not associated with the discloser’s emotional improvement following the supportive conversation.
In sum, in this example, we found that the prevalence and timing of the different conversational motifs were not associated with the discloser’s emotional improvement.
Conclusion.
In this tutorial, we demonstrated the steps to identify five-turn sequences - conversational motifs - in supportive conversations using sequence analysis. Furthermore, we examined whether the timing and prevalence of those conversational motifs were related to emotional improvement following the supportive conversation.
We are excited about the potential for sequence analysis to contribute to the study of interpersonal dynamics across a variety of relationship types and interaction episodes.
Additional Information
We created this tutorial with a system environment and versions of R and packages that may differ from yours. If R reports errors when you attempt to run this tutorial, running the code chunk below and comparing your output with the output posted in the tutorial on the LHAMA website may be helpful.
session_info(pkgs = c("attached"))
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.0 (2022-04-22)
## os macOS Big Sur/Monterey 10.16
## system x86_64, darwin17.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2022-08-20
## pandoc 2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## cluster * 2.1.3 2022-03-28 [1] CRAN (R 4.2.0)
## devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.2.0)
## dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
## ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
## psych * 2.2.5 2022-05-10 [1] CRAN (R 4.2.0)
## reshape * 0.8.9 2022-04-12 [1] CRAN (R 4.2.0)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
## tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
## TraMineR * 2.2-4 2022-06-09 [1] CRAN (R 4.2.0)
## TraMineRextras * 0.6.4 2022-06-13 [1] CRAN (R 4.2.0)
## usethis * 2.1.6 2022-05-25 [1] CRAN (R 4.2.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────