Two-Sample Configural Frequency Analysis Tutorial
Overview
This tutorial provides R code for conducting two-sample configural frequency analysis (Stemmler, 2020; Stemmler & Bingham, 2003; Stemmler & Hammond, 1997; von Eye, 2002). Two-sample configural frequency analysis works quite similarly to configural frequency analysis (e.g., both rely on contingency tables); however, the null hypotheses differ. In configural frequency analysis, the null hypothesis is that the variables that define each axis of the contingency table are independent of each other (e.g., discloser turns are not followed more or less frequently by certain listener turns). In contrast, two-sample configural frequency analysis tests the null hypothesis that the expected frequencies within each cell do not differ across groups – i.e., that any deviations between the two frequency distributions are random (Stemmler, 2020).
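To make the two-sample null hypothesis concrete, the comparison for any single configuration (cell) can be framed as a 2x2 table that crosses "this configuration vs. all other configurations" with group membership. The sketch below uses made-up counts purely for illustration; it is not part of the tutorial's data or of the confreq workflow used later.

```r
# Hypothetical counts, for illustration only: how often one configuration
# (vs. all other configurations) occurs in each of two samples
config_table <- matrix(c(40, 160,   # Sample 1: this configuration, all others
                         15, 185),  # Sample 2: this configuration, all others
                       nrow = 2, byrow = TRUE,
                       dimnames = list(sample = c("Sample1", "Sample2"),
                                       cell   = c("Configuration", "Others")))

# Under the two-sample null hypothesis, the configuration occurs at the
# same rate in both samples; a small p-value flags that configuration
# as discriminating between the groups
chisq.test(config_table)
```

Two-sample configural frequency analysis essentially performs such a comparison for every cell of the table, with appropriate control of the type I error rate across the many tests.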
In this example, we use two-sample configural frequency analysis to compare the use of specific turn transitions during “supportive” versus “unsupportive” conversations between strangers in which one dyad member disclosed about a current problem (N = 59). Each turn in the conversations between stranger dyads was coded (e.g., as an acknowledgement, question, hedged disclosure, etc.; see Bodie et al., 2021 in the Journal of Language and Social Psychology for more details about the creation of the turn typology) and the transitions between turn types were tallied across each sample. We use two-sample configural frequency analysis to determine which turn transitions differed between the two types of conversations.
Note that the accompanying “TwoSampleConfiguralFrequency_Tutorial_2022July26.rmd” file contains all of the code presented in this tutorial and can be opened in RStudio (a somewhat more friendly user interface to R).
Outline
In this tutorial, we’ll cover…
- Reading in the data and loading needed packages.
- Plotting two dyads’ conversation.
- Preparing the data for two-sample configural frequency analysis.
- Conducting two-sample configural frequency analysis.
Read in the data and load needed libraries.
Let’s read the data into R.
We are working with two data sets in this tutorial. One data set contains repeated measures (“StrangerConversations_N59”), specifically the conversation turn codes for each dyad. The other data set contains time-invariant outcome data (“StrangerConversations_N59_Outcomes”), specifically the reports of change in discloser distress following the conversation.
Both data sets were stored as .csv files (comma-separated values file, which can be created by saving an Excel file as a csv document) on my computer’s desktop.
# Set working directory (i.e., where your data file is stored)
# This can be done by going to the top bar of RStudio and selecting
# "Session" --> "Set Working Directory" --> "Choose Directory" -->
# finding the location of your file
setwd("~/Desktop") # Note: You can skip this line if you have
# the data file and this .rmd file stored in the same directory
# Read in the repeated measures data
data <- read.csv(file = "StrangerConversations_N59.csv", header = TRUE, sep = ",")
# View the first 10 rows of the repeated measures data
head(data, 10)
## id turn role turn_type
## 1 105 1 1 Question
## 2 105 2 2 Acknowledgement
## 3 105 3 1 Elaboration
## 4 105 4 2 Acknowledgement
## 5 105 5 1 Elaboration
## 6 105 6 2 Acknowledgement
## 7 105 7 1 Elaboration
## 8 105 8 2 Elaboration
## 9 105 9 1 Elaboration
## 10 105 10 2 Reflection
# Read in the outcomes data
outcomes <- read.csv(file = "StrangerConversations_N59_Outcomes.csv", header = TRUE, sep = ",")
# View the first 10 rows of the outcomes data
head(outcomes, 10)
## id distress
## 1 3 3
## 2 11 1
## 3 12 2
## 4 14 2
## 5 31 1
## 6 38 2
## 7 45 1
## 8 54 3
## 9 55 2
## 10 58 3
In the repeated measures data (“data”), we can see each row contains information for one turn and there are multiple rows (i.e., multiple turns) for each dyad. In this data set, there are columns for:
- Dyad ID (id)
- Time variable - in this case, turn in the conversation (turn)
- Dyad member ID - in this case, role in the conversation (role; discloser = 1, listener = 2)
- Turn type - in this case, based upon a typology derived in Bodie et al. (2021; turn_type)
In the outcome data (“outcomes”), we can see there is one row for each dyad and there are columns for:
- Dyad ID (id)
- Outcome variable - in this case, post-conversation report of distress by the support receiver (distress)
Load the R packages we need.
Packages in R are a collection of functions (and their documentation/explanations) that enable us to conduct particular tasks, such as plotting or fitting a statistical model.
# install.packages("data.table") # Install package if you have never used it before
library(data.table) # For data management: counting turn transitions
# install.packages("devtools") # Install package if you have never used it before
require(devtools) # For version control
# install.packages("confreq") # Install package if you have never used it before
library(confreq) # For conducting configural frequency analysis
# install.packages("dplyr") # Install package if you have never used it before
library(dplyr) # For data management
# install.packages("ggplot2") # Install package if you have never used it before
library(ggplot2) # For plotting
# install.packages("tidyverse")
library(tidyverse) # For data management
Plot Two Dyads’ Conversation.
To get a better feel for the conversation data, let’s plot two dyads’ conversations.
Before creating the plots, it is helpful to set the colors for each turn type so the colors of the turn categories are consistent across plots (i.e., the number of turn types present in a given conversation does not affect the color of each turn type). We do this by creating a vector “cols” that contains color assignments (via hex code: https://www.color-hex.com/) for each turn type.
A note on accessibility: To make your plots accessible, you may consider adopting a colorblind-friendly palette. David Nichols’ website (https://davidmathlogic.com/colorblind/) provides a great explainer on this issue, as well as a color picking tool.
cols <- c("Elaboration" = "#00BA38",
          "Question" = "#619CFF",
          "Acknowledgement" = "#F8766D",
          "Reflection" = "#DB72FB",
          "Advice" = "#93AA00",
          "HedgedDisclosure" = "#00C19F")
We’ll create the dyadic categorical time series plot for each exemplar dyad and save these plots to the objects “dyad105_plot” and “dyad123_plot”.
Dyad 105 plot.
# First partition data of interest
dyad105 <- data[data$id == 105, ]

dyad105_plot <-
# Choose the data, set time variable (turn) for the x-axis
ggplot(dyad105, aes(x = turn)) +
# Create title for plot by combining "Dyad = " with the dyad id variable (id)
ggtitle(paste("Dyad =", unique(dyad105$id))) +
# Create bars for the form of the listeners' turns
# Partition data for listeners (role = 2)
geom_rect(data = dyad105[dyad105$role == 2, ],
# Set the width of each bar as -0.5 and +0.5 the value of the time variable (turn)
mapping = aes(xmin = turn-.5, xmax = turn+.5,
# Set the height of each bar to range from 0 to 5
ymin = 0, ymax = 5,
# Set the color of each bar to correspond to each turn type
fill = turn_type)) +
# Add a horizontal line to separate bars
geom_hline(yintercept = 5, color = "black") +
# Create bars for the form of the disclosers' turns
# Partition data for disclosers (role = 1)
geom_rect(data = dyad105[dyad105$role == 1, ],
# Set the width of each bar as -0.5 and +0.5 the value of the time variable (turn)
mapping = aes(xmin = turn-.5, xmax = turn+.5,
# Set the height of each bar to range from 5 to 10
ymin = 5, ymax = 10,
# Set the color of each bar to correspond to each turn type
fill = turn_type)) +
# Set color of turn types to vector we created earlier ("cols")
scale_fill_manual(values = cols) +
# Label for x-axis
xlab("Turn") +
# Label for y-axis
ylab("Role") +
# X-axis ticks and labels
scale_x_continuous(breaks = seq(0, 110, by = 10)) +
# Y-axis ticks and label
scale_y_continuous(breaks = c(2.5, 7.5),
labels=c("Listener Turn", "Discloser Turn")) +
# Legend label
labs(fill = "Turn Type") +
# Additional plot aesthetics
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text=element_text(color = "black"))
Dyad 123 plot.
# First partition data of interest
dyad123 <- data[data$id == 123, ]

dyad123_plot <-
# Choose the data, set time variable (turn) for the x-axis
ggplot(dyad123, aes(x = turn)) +
# Create title for plot by combining "Dyad = " with the dyad id variable (id)
ggtitle(paste("Dyad =", unique(dyad123$id))) +
# Create bars for the form of the listeners' turns
# Partition data for listeners (role = 2)
geom_rect(data = dyad123[dyad123$role == 2, ],
# Set the width of each bar as -0.5 and +0.5 the value of the time variable (turn)
mapping = aes(xmin = turn-.5, xmax = turn+.5,
# Set the height of each bar to range from 0 to 5
ymin = 0, ymax = 5,
# Set the color of each bar to correspond to each turn type
fill = turn_type)) +
# Add a horizontal line to separate bars
geom_hline(yintercept = 5, color = "black") +
# Create bars for the form of the disclosers' turns
# Partition data for disclosers (role = 1)
geom_rect(data = dyad123[dyad123$role == 1, ],
# Set the width of each bar as -0.5 and +0.5 the value of the time variable (turn)
mapping = aes(xmin = turn-.5, xmax = turn+.5,
# Set the height of each bar to range from 5 to 10
ymin = 5, ymax = 10,
# Set the color of each bar to correspond to each turn type
fill = turn_type)) +
# Set color of turn types to vector we created earlier ("cols")
scale_fill_manual(values = cols) +
# Label for x-axis
xlab("Turn") +
# Label for y-axis
ylab("Role") +
# X-axis ticks and labels
scale_x_continuous(breaks = seq(0, 110, by = 10)) +
# Y-axis ticks and label
scale_y_continuous(breaks = c(2.5, 7.5),
labels=c("Listener Turn", "Discloser Turn")) +
# Legend label
labs(fill = "Turn Type") +
# Additional plot aesthetics
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text=element_text(color = "black"))
Print the plots we just created.
print(dyad105_plot)
print(dyad123_plot)
On the x-axis, we have turn in the conversation. On the y-axis, we have the turn type for the disclosers on the top half and the listeners on the bottom half. Each turn category is represented by a different color, and the gray bars indicate when a particular dyad member is not speaking. We can see that Dyad 105 had greater back-and-forth exchange during their conversation than Dyad 123, as indicated by the greater number of turns. In both dyads, the disclosers spent many of their turns elaborating on their problem (green), while the listeners used a variety of different turn types.
Prepare the Data for Configural Frequency Analysis.
To prepare the conversation data for configural frequency analysis, we must first separate our data into two samples: supportive vs. unsupportive conversations. We consider conversations in which the discloser’s post-conversation distress is low (i.e., distress = 1 or 2) as “supportive” and conversations in which the discloser’s post-conversation distress is high (i.e., distress = 3, 4, or 5) as “unsupportive.”
Merge the outcome data (“outcomes”) into the conversation data (“data”) and split the conversation data (“data”) into supportive/unsupportive conversations.
# Merge outcome data with conversation data
data <- merge(data, outcomes, by = "id")
# View the first 10 rows of the repeated measures data
head(data, 10)
## id turn role turn_type distress
## 1 3 1 1 Elaboration 3
## 2 3 2 2 Acknowledgement 3
## 3 3 3 1 HedgedDisclosure 3
## 4 3 4 2 Acknowledgement 3
## 5 3 5 1 HedgedDisclosure 3
## 6 3 6 2 Acknowledgement 3
## 7 3 7 1 Elaboration 3
## 8 3 8 2 Question 3
## 9 3 9 1 Elaboration 3
## 10 3 10 2 Acknowledgement 3
# Split conversation data
# Distress >= 3 then unsupportive conversation
# Distress < 3 then supportive conversation
data_uns <- data[data$distress >= 3, ]
data_sup <- data[data$distress < 3, ]
Now that we have our two samples, we next count the number of transitions between listener –> discloser turns and the number of transitions between discloser –> listener turns for each sample.
First, let’s make sure each data set is ordered by turn within each dyad.
# Order data by ID and turn number
data_uns <- data_uns[order(data_uns$id, data_uns$turn), ]
data_sup <- data_sup[order(data_sup$id, data_sup$turn), ]
# View the first 10 rows of each data set
head(data_uns, 10)
## id turn role turn_type distress
## 1 3 1 1 Elaboration 3
## 2 3 2 2 Acknowledgement 3
## 3 3 3 1 HedgedDisclosure 3
## 4 3 4 2 Acknowledgement 3
## 5 3 5 1 HedgedDisclosure 3
## 6 3 6 2 Acknowledgement 3
## 7 3 7 1 Elaboration 3
## 8 3 8 2 Question 3
## 9 3 9 1 Elaboration 3
## 10 3 10 2 Acknowledgement 3
head(data_sup, 10)
## id turn role turn_type distress
## 66 11 1 1 Elaboration 1
## 67 11 2 2 Acknowledgement 1
## 68 11 3 1 Elaboration 1
## 69 11 4 2 Question 1
## 70 11 5 1 Elaboration 1
## 71 11 6 2 Question 1
## 72 11 7 1 Elaboration 1
## 73 11 8 2 Question 1
## 74 11 9 1 Elaboration 1
## 75 11 10 2 Question 1
Second, before calculating the number of turn transitions, we need to distinguish between listener and discloser turns that share a label (e.g., listener question vs. discloser question), since role is not encoded in the “turn_type” variable.
# Unsupportive conversations
data_uns <- # Select data
  data_uns %>%
  # Update "turn_type" variable so that if
  # role = 1 (i.e., if it is a discloser turn),
  # then a "D" is added in front of the turn type,
  # otherwise an "L" is added in front of the turn type;
  # the D (or L) and the turn type are separated with a "_"
  mutate(turn_type = paste0(ifelse(role == 1, "D", "L"), "_", turn_type)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the unsupportive conversation data
head(data_uns, 10)
## id turn role turn_type distress
## 1 3 1 1 D_Elaboration 3
## 2 3 2 2 L_Acknowledgement 3
## 3 3 3 1 D_HedgedDisclosure 3
## 4 3 4 2 L_Acknowledgement 3
## 5 3 5 1 D_HedgedDisclosure 3
## 6 3 6 2 L_Acknowledgement 3
## 7 3 7 1 D_Elaboration 3
## 8 3 8 2 L_Question 3
## 9 3 9 1 D_Elaboration 3
## 10 3 10 2 L_Acknowledgement 3
# Supportive conversations
data_sup <- # Select data
  data_sup %>%
  # Update "turn_type" variable so that if
  # role = 1 (i.e., if it is a discloser turn),
  # then a "D" is added in front of the turn type,
  # otherwise an "L" is added in front of the turn type;
  # the D (or L) and the turn type are separated with a "_"
  mutate(turn_type = paste0(ifelse(role == 1, "D", "L"), "_", turn_type)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the supportive conversation data
head(data_sup, 10)
## id turn role turn_type distress
## 66 11 1 1 D_Elaboration 1
## 67 11 2 2 L_Acknowledgement 1
## 68 11 3 1 D_Elaboration 1
## 69 11 4 2 L_Question 1
## 70 11 5 1 D_Elaboration 1
## 71 11 6 2 L_Question 1
## 72 11 7 1 D_Elaboration 1
## 73 11 8 2 L_Question 1
## 74 11 9 1 D_Elaboration 1
## 75 11 10 2 L_Question 1
Third, let’s create a lagged “turn_type” variable.
After running the code, you will see below that the discloser’s first turn is shown as D_Elaboration in the “unsupportive” conversation data. On the same line and in the next column, the first lagged turn is represented as <NA> (missing), because no turn precedes the first turn of a conversation.
# Unsupportive conversations
# Create a lagged variable
data_uns <- # Select data
  data_uns %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create new variable that is a lag of "turn_type"
  mutate(lagged_turn_type = lag(turn_type)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(data_uns, 10)
## id turn role turn_type distress lagged_turn_type
## 1 3 1 1 D_Elaboration 3 <NA>
## 2 3 2 2 L_Acknowledgement 3 D_Elaboration
## 3 3 3 1 D_HedgedDisclosure 3 L_Acknowledgement
## 4 3 4 2 L_Acknowledgement 3 D_HedgedDisclosure
## 5 3 5 1 D_HedgedDisclosure 3 L_Acknowledgement
## 6 3 6 2 L_Acknowledgement 3 D_HedgedDisclosure
## 7 3 7 1 D_Elaboration 3 L_Acknowledgement
## 8 3 8 2 L_Question 3 D_Elaboration
## 9 3 9 1 D_Elaboration 3 L_Question
## 10 3 10 2 L_Acknowledgement 3 D_Elaboration
# Supportive conversations
# Create a lagged variable
data_sup <- # Select data
  data_sup %>%
  # Select grouping variable, in this case, dyad ID (id)
  group_by(id) %>%
  # Create new variable that is a lag of "turn_type"
  mutate(lagged_turn_type = lag(turn_type)) %>%
  # Save the data as a data.frame
  as.data.frame()
# View the first 10 rows of the data
head(data_sup, 10)
## id turn role turn_type distress lagged_turn_type
## 1 11 1 1 D_Elaboration 1 <NA>
## 2 11 2 2 L_Acknowledgement 1 D_Elaboration
## 3 11 3 1 D_Elaboration 1 L_Acknowledgement
## 4 11 4 2 L_Question 1 D_Elaboration
## 5 11 5 1 D_Elaboration 1 L_Question
## 6 11 6 2 L_Question 1 D_Elaboration
## 7 11 7 1 D_Elaboration 1 L_Question
## 8 11 8 2 L_Question 1 D_Elaboration
## 9 11 9 1 D_Elaboration 1 L_Question
## 10 11 10 2 L_Question 1 D_Elaboration
We next generate two data frames containing transition frequencies through cross-tabulation, saving them as “data_uns_counts” for unsupportive conversations and “data_sup_counts” for supportive conversations. Because there are 7 turn types (the 6 coded types plus NA) and 2 roles (listener and discloser), there are 14 turn categories in our data sets. Thus, each crosstab table contains 14 * 14 = 196 cells, representing all possible pairings of two consecutive turns.
# Generating a crosstabs table for unsupportive conversations,
# then turning the table into the long format
data_uns_counts <- as.data.frame(with(data_uns, table(turn_type, lagged_turn_type)))
# View the first 10 rows of the data
head(data_uns_counts, 10)
## turn_type lagged_turn_type Freq
## 1 D_Acknowledgement D_Acknowledgement 0
## 2 D_Advice D_Acknowledgement 0
## 3 D_Elaboration D_Acknowledgement 0
## 4 D_HedgedDisclosure D_Acknowledgement 0
## 5 D_NA D_Acknowledgement 0
## 6 D_Question D_Acknowledgement 0
## 7 D_Reflection D_Acknowledgement 0
## 8 L_Acknowledgement D_Acknowledgement 5
## 9 L_Advice D_Acknowledgement 2
## 10 L_Elaboration D_Acknowledgement 24
# Generating a crosstabs table for supportive conversations,
# then turning the table into the long format
data_sup_counts <- as.data.frame(with(data_sup, table(turn_type, lagged_turn_type)))
# View the first 10 rows of the data
head(data_sup_counts, 10)
## turn_type lagged_turn_type Freq
## 1 D_Acknowledgement D_Acknowledgement 0
## 2 D_Advice D_Acknowledgement 0
## 3 D_Elaboration D_Acknowledgement 0
## 4 D_HedgedDisclosure D_Acknowledgement 0
## 5 D_NA D_Acknowledgement 0
## 6 D_Question D_Acknowledgement 0
## 7 D_Reflection D_Acknowledgement 0
## 8 L_Acknowledgement D_Acknowledgement 11
## 9 L_Advice D_Acknowledgement 6
## 10 L_Elaboration D_Acknowledgement 95
# Examine the number of rows of these two data frames
nrow(data_uns_counts)
## [1] 196
nrow(data_sup_counts)
## [1] 196
Of course, not all of the 196 cells represent legitimate turn transitions. Transitions at the beginning or end of a conversation and transitions involving uncodable turns (tagged _NA) are removed in the two code chunks below. In addition, although they exist in the table, D->D and L->L transitions do not occur in our data; we remove those in a later step.
# Unsupportive conversation
# Remove rows in which the lagged turn in the transition contains a _NA
data_uns_counts <- data_uns_counts[grep("_NA", data_uns_counts$lagged_turn_type, invert = TRUE), ]
# Remove rows in which the following turn in the transition contains a _NA
data_uns_counts <- data_uns_counts[grep("_NA", data_uns_counts$turn_type, invert = TRUE), ]
# Supportive conversation
# Remove rows in which the lagged turn in the transition contains a _NA
data_sup_counts <- data_sup_counts[grep("_NA", data_sup_counts$lagged_turn_type, invert = TRUE), ]
# Remove rows in which the following turn in the transition contains a _NA
data_sup_counts <- data_sup_counts[grep("_NA", data_sup_counts$turn_type, invert = TRUE), ]
Since we will be examining listener –> discloser and discloser –> listener turn transitions separately, we will create two different data sets that contain the counts of these turn transitions for both the “unsupportive” and the “supportive” conversations. We will also remove the L->L and D->D transitions at this step.
# Unsupportive conversations
# Create listener --> discloser transition data set
# Remove transitions that begin with a discloser turn to create the L->D subset
# by dropping all rows in lagged_turn_type with the discloser tag "D_"
list2disc_uns <- data_uns_counts[grep("D_", data_uns_counts$lagged_turn_type, invert = TRUE), ]

# Remove transitions that end with a listener turn
# by dropping all rows in turn_type with the listener tag "L_"
list2disc_uns <- list2disc_uns[grep("L_", list2disc_uns$turn_type, invert = TRUE), ]

nrow(list2disc_uns)
## [1] 36
# View the first 10 rows of the data
head(list2disc_uns, 10)
## turn_type lagged_turn_type Freq
## 99 D_Acknowledgement L_Acknowledgement 9
## 100 D_Advice L_Acknowledgement 3
## 101 D_Elaboration L_Acknowledgement 291
## 102 D_HedgedDisclosure L_Acknowledgement 56
## 104 D_Question L_Acknowledgement 13
## 105 D_Reflection L_Acknowledgement 0
## 113 D_Acknowledgement L_Advice 3
## 114 D_Advice L_Advice 1
## 115 D_Elaboration L_Advice 8
## 116 D_HedgedDisclosure L_Advice 1
# Create discloser --> listener transition data set
# Remove transitions that begin with a listener turn to create the D->L subset
# by dropping all rows in lagged_turn_type with the listener tag "L_"
disc2list_uns <- data_uns_counts[grep("L_", data_uns_counts$lagged_turn_type, invert = TRUE), ]

# Remove transitions that end with a discloser turn
# by dropping all rows in turn_type with the discloser tag "D_"
disc2list_uns <- disc2list_uns[grep("D_", disc2list_uns$turn_type, invert = TRUE), ]

nrow(disc2list_uns)
## [1] 36
# View the first 10 rows of the data
head(disc2list_uns, 10)
## turn_type lagged_turn_type Freq
## 8 L_Acknowledgement D_Acknowledgement 5
## 9 L_Advice D_Acknowledgement 2
## 10 L_Elaboration D_Acknowledgement 24
## 11 L_HedgedDisclosure D_Acknowledgement 7
## 13 L_Question D_Acknowledgement 7
## 14 L_Reflection D_Acknowledgement 19
## 22 L_Acknowledgement D_Advice 6
## 23 L_Advice D_Advice 0
## 24 L_Elaboration D_Advice 1
## 25 L_HedgedDisclosure D_Advice 0
# Supportive conversations
# Create listener --> discloser transition data set
# Remove transitions that begin with a discloser turn to create the L->D subset
# by dropping all rows in lagged_turn_type with the discloser tag "D_"
list2disc_sup <- data_sup_counts[grep("D_", data_sup_counts$lagged_turn_type, invert = TRUE), ]

# Remove transitions that end with a listener turn
# by dropping all rows in turn_type with the listener tag "L_"
list2disc_sup <- list2disc_sup[grep("L_", list2disc_sup$turn_type, invert = TRUE), ]

nrow(list2disc_sup)
## [1] 36
# View the first 10 rows of the data
head(list2disc_sup, 10)
## turn_type lagged_turn_type Freq
## 99 D_Acknowledgement L_Acknowledgement 19
## 100 D_Advice L_Acknowledgement 5
## 101 D_Elaboration L_Acknowledgement 476
## 102 D_HedgedDisclosure L_Acknowledgement 100
## 104 D_Question L_Acknowledgement 15
## 105 D_Reflection L_Acknowledgement 3
## 113 D_Acknowledgement L_Advice 7
## 114 D_Advice L_Advice 0
## 115 D_Elaboration L_Advice 23
## 116 D_HedgedDisclosure L_Advice 4
# Create discloser --> listener transition data set
# Remove transitions that begin with a listener turn to create the D->L subset
# by dropping all rows in lagged_turn_type with the listener tag "L_"
disc2list_sup <- data_sup_counts[grep("L_", data_sup_counts$lagged_turn_type, invert = TRUE), ]

# Remove transitions that end with a discloser turn
# by dropping all rows in turn_type with the discloser tag "D_"
disc2list_sup <- disc2list_sup[grep("D_", disc2list_sup$turn_type, invert = TRUE), ]

nrow(disc2list_sup)
## [1] 36
# View the first 10 rows of the data
head(disc2list_sup, 10)
## turn_type lagged_turn_type Freq
## 8 L_Acknowledgement D_Acknowledgement 11
## 9 L_Advice D_Acknowledgement 6
## 10 L_Elaboration D_Acknowledgement 95
## 11 L_HedgedDisclosure D_Acknowledgement 26
## 13 L_Question D_Acknowledgement 17
## 14 L_Reflection D_Acknowledgement 27
## 22 L_Acknowledgement D_Advice 5
## 23 L_Advice D_Advice 0
## 24 L_Elaboration D_Advice 4
## 25 L_HedgedDisclosure D_Advice 0
Given that we are examining the transitions between six listener turn types and six discloser turn types, each of our data sets should contain 6 * 6 = 36 rows. Transitions that were never observed in a sample are still included as rows, just with a frequency count of 0, such as L_Acknowledgement -> D_Reflection in the unsupportive data set.
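As a quick sanity check (a sketch, not part of the original tutorial code), we can confirm that all four transition data sets contain the full 36 rows and list any transitions that were never observed:

```r
# Verify each transition data set contains all 6 * 6 = 36 transitions
stopifnot(nrow(list2disc_uns) == 36,
          nrow(disc2list_uns) == 36,
          nrow(list2disc_sup) == 36,
          nrow(disc2list_sup) == 36)

# Show the listener --> discloser transitions that never occurred
# in the unsupportive conversations (rows with a frequency of 0)
list2disc_uns[list2disc_uns$Freq == 0, c("lagged_turn_type", "turn_type")]
```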
Finally, the confreq package requires the data to be organized in a specific way to conduct configural frequency analysis. Specifically, each data set should have three columns that represent (1) the type of lagged turn, (2) the type of current turn, and (3) the frequency of that turn transition. In the first two columns, each turn type needs to be represented by a number instead of a category label. So, we number the categories in alphabetical order (1 = acknowledgement, 2 = advice, 3 = elaboration, 4 = hedged disclosure, 5 = question, 6 = reflection). The code chunk below recodes the turn labels into numbers.
# Unsupportive conversations
# Recoding variables in listener --> discloser transition data set
list2disc_uns <- # Select data
list2disc_uns %>%
# Recode lagged turn and following turn variables with numbers
mutate(lagged_turn_type = recode(list2disc_uns$lagged_turn_type,
"L_Acknowledgement" = 1,
"L_Advice" = 2,
"L_Elaboration" = 3,
"L_HedgedDisclosure" = 4,
"L_Question" = 5,
"L_Reflection" = 6),
turn_type = recode(list2disc_uns$turn_type,
"D_Acknowledgement" = 1,
"D_Advice" = 2,
"D_Elaboration" = 3,
"D_HedgedDisclosure" = 4,
"D_Question" = 5,
"D_Reflection" = 6)) %>%
# Save the data as a data.frame
as.data.frame()
# Recoding variables in discloser --> listener transition data set
disc2list_uns <- # Select data
disc2list_uns %>%
# Recode lagged turn and following turn variables with numbers
mutate(lagged_turn_type = recode(disc2list_uns$lagged_turn_type,
"D_Acknowledgement" = 1,
"D_Advice" = 2,
"D_Elaboration" = 3,
"D_HedgedDisclosure" = 4,
"D_Question" = 5,
"D_Reflection" = 6),
turn_type = recode(disc2list_uns$turn_type,
"L_Acknowledgement" = 1,
"L_Advice" = 2,
"L_Elaboration" = 3,
"L_HedgedDisclosure" = 4,
"L_Question" = 5,
"L_Reflection" = 6)) %>%
# Save the data as a data.frame
as.data.frame()
# Supportive conversations
# Recoding variables in listener --> discloser transition data set
list2disc_sup <- # Select data
list2disc_sup %>%
# Recode lagged turn and following turn variables with numbers
mutate(lagged_turn_type = recode(list2disc_sup$lagged_turn_type,
"L_Acknowledgement" = 1,
"L_Advice" = 2,
"L_Elaboration" = 3,
"L_HedgedDisclosure" = 4,
"L_Question" = 5,
"L_Reflection" = 6),
turn_type = recode(list2disc_sup$turn_type,
"D_Acknowledgement" = 1,
"D_Advice" = 2,
"D_Elaboration" = 3,
"D_HedgedDisclosure" = 4,
"D_Question" = 5,
"D_Reflection" = 6)) %>%
# Save the data as a data.frame
as.data.frame()
# Recoding variables in discloser --> listener transition data set
disc2list_sup <- # Select data
disc2list_sup %>%
# Recode lagged turn and following turn variables with numbers
mutate(lagged_turn_type = recode(disc2list_sup$lagged_turn_type,
"D_Acknowledgement" = 1,
"D_Advice" = 2,
"D_Elaboration" = 3,
"D_HedgedDisclosure" = 4,
"D_Question" = 5,
"D_Reflection" = 6),
turn_type = recode(disc2list_sup$turn_type,
"L_Acknowledgement" = 1,
"L_Advice" = 2,
"L_Elaboration" = 3,
"L_HedgedDisclosure" = 4,
"L_Question" = 5,
"L_Reflection" = 6)) %>%
# Save the data as a data.frame
as.data.frame()
Let’s sort the rows by the recoded turn variables (the numbers preserve the alphabetical order of the turn categories) and reorganize the columns so the lagged turns come first.
# Unsupportive conversations
# Order data by lagged_turn_type and turn_type for listener --> discloser turn transitions
list2disc_uns <- list2disc_uns[order(list2disc_uns$lagged_turn_type, list2disc_uns$turn_type), ]
# Reorder the columns for listener --> discloser turn transitions for unsupportive conversations
list2disc_uns <- list2disc_uns[, c(2, 1, 3)]
# View the first 10 rows of the data
head(list2disc_uns, 10)
## lagged_turn_type turn_type Freq
## 99 1 1 9
## 100 1 2 3
## 101 1 3 291
## 102 1 4 56
## 104 1 5 13
## 105 1 6 0
## 113 2 1 3
## 114 2 2 1
## 115 2 3 8
## 116 2 4 1
# Order data by lagged_turn_type and turn_type for discloser --> listener turn transitions
disc2list_uns <- disc2list_uns[order(disc2list_uns$lagged_turn_type, disc2list_uns$turn_type), ]
# Reorder the columns for discloser --> listener turn transitions for unsupportive conversations
disc2list_uns <- disc2list_uns[, c(2, 1, 3)]
# View the first 10 rows of the data
head(disc2list_uns, 10)
## lagged_turn_type turn_type Freq
## 8 1 1 5
## 9 1 2 2
## 10 1 3 24
## 11 1 4 7
## 13 1 5 7
## 14 1 6 19
## 22 2 1 6
## 23 2 2 0
## 24 2 3 1
## 25 2 4 0
# Supportive conversations
# Order data by lagged_turn_type and turn_type for listener --> discloser turn transitions
list2disc_sup <- list2disc_sup[order(list2disc_sup$lagged_turn_type, list2disc_sup$turn_type), ]
# Reorder the columns for listener --> discloser turn transitions for supportive conversations
list2disc_sup <- list2disc_sup[, c(2, 1, 3)]
# View the first 10 rows of the data
head(list2disc_sup, 10)
## lagged_turn_type turn_type Freq
## 99 1 1 19
## 100 1 2 5
## 101 1 3 476
## 102 1 4 100
## 104 1 5 15
## 105 1 6 3
## 113 2 1 7
## 114 2 2 0
## 115 2 3 23
## 116 2 4 4
# Order data by lagged_turn_type and turn_type for discloser --> listener turn transitions
disc2list_sup <- disc2list_sup[order(disc2list_sup$lagged_turn_type, disc2list_sup$turn_type), ]
# Reorder the columns for discloser --> listener turn transitions for supportive conversations
disc2list_sup <- disc2list_sup[, c(2, 1, 3)]
# View the first 10 rows of the data
head(disc2list_sup, 10)
## lagged_turn_type turn_type Freq
## 8 1 1 11
## 9 1 2 6
## 10 1 3 95
## 11 1 4 26
## 13 1 5 17
## 14 1 6 27
## 22 2 1 5
## 23 2 2 0
## 24 2 3 4
## 25 2 4 0
We can also rename the columns to represent the speaker of that turn.
# Unsupportive conversations
# Rename columns in listener --> discloser turn transition data
colnames(list2disc_uns) <- c("listener", "discloser", "freq")
# Rename columns in discloser --> listener turn transition data
colnames(disc2list_uns) <- c("discloser", "listener", "freq")
# Supportive conversations
# Rename columns in listener --> discloser turn transition data
colnames(list2disc_sup) <- c("listener", "discloser", "freq")
# Rename columns in discloser --> listener turn transition data
colnames(disc2list_sup) <- c("discloser", "listener", "freq")
Conduct the Configural Frequency Analysis.
Let’s examine the structure of the data sets.
We need to check whether the variables are in the correct format. Specifically, we need the variables that label the turn types of the listeners and disclosers to be factor variables (instead of integer variables). A factor variable makes sure R interprets the variables as categories instead of integers.
Examine the structure of the listener –> discloser and discloser –> listener data.
# Unsupportive conversation transition data
str(list2disc_uns)
## 'data.frame': 36 obs. of 3 variables:
## $ listener : num 1 1 1 1 1 1 2 2 2 2 ...
## $ discloser: num 1 2 3 4 5 6 1 2 3 4 ...
## $ freq : int 9 3 291 56 13 0 3 1 8 1 ...
str(disc2list_uns)
## 'data.frame': 36 obs. of 3 variables:
## $ discloser: num 1 1 1 1 1 1 2 2 2 2 ...
## $ listener : num 1 2 3 4 5 6 1 2 3 4 ...
## $ freq : int 5 2 24 7 7 19 6 0 1 0 ...
# Supportive conversation transition data
str(list2disc_sup)
## 'data.frame': 36 obs. of 3 variables:
## $ listener : num 1 1 1 1 1 1 2 2 2 2 ...
## $ discloser: num 1 2 3 4 5 6 1 2 3 4 ...
## $ freq : int 19 5 476 100 15 3 7 0 23 4 ...
str(disc2list_sup)
## 'data.frame': 36 obs. of 3 variables:
## $ discloser: num 1 1 1 1 1 1 2 2 2 2 ...
## $ listener : num 1 2 3 4 5 6 1 2 3 4 ...
## $ freq : int 11 6 95 26 17 27 5 0 4 0 ...
We need the discloser and listener variables to be factors in both data sets.
# Unsupportive conversation transition data
list2disc_uns$listener <- as.factor(list2disc_uns$listener)
list2disc_uns$discloser <- as.factor(list2disc_uns$discloser)
disc2list_uns$listener <- as.factor(disc2list_uns$listener)
disc2list_uns$discloser <- as.factor(disc2list_uns$discloser)
# Supportive conversation transition data
list2disc_sup$listener <- as.factor(list2disc_sup$listener)
list2disc_sup$discloser <- as.factor(list2disc_sup$discloser)
disc2list_sup$listener <- as.factor(disc2list_sup$listener)
disc2list_sup$discloser <- as.factor(disc2list_sup$discloser)
The two-sample configural frequency analysis function requires that the data are formatted as a response pattern frequency table.
# Change format of data for configural frequency analysis
# Insert data ("list2disc") in the dat2fre(fre2dat()) function
# Unsupportive conversation transition data
cfa_list2disc_uns <- dat2fre(fre2dat(list2disc_uns))
## Number of categories for each variable
## estimated from data are:
## listener discloser
## 6 6
## --> 36 different configurations
cfa_disc2list_uns <- dat2fre(fre2dat(disc2list_uns))
## Number of categories for each variable
## estimated from data are:
## discloser listener
## 6 6
## --> 36 different configurations
# Supportive conversation transition data
cfa_list2disc_sup <- dat2fre(fre2dat(list2disc_sup))
## Number of categories for each variable
## estimated from data are:
## listener discloser
## 6 6
## --> 36 different configurations
cfa_disc2list_sup <- dat2fre(fre2dat(disc2list_sup))
## Number of categories for each variable
## estimated from data are:
## discloser listener
## 6 6
## --> 36 different configurations
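Conceptually, fre2dat() expands a pattern-frequency table into one row per observed case, and dat2fre() tallies the cases back into a pattern-frequency table, which standardizes the table's format for the analysis. A base-R sketch of the same round trip on made-up toy data (the names toy and raw and the counts here are only for illustration):

```r
# Toy pattern-frequency table: each row is a configuration plus its count
toy <- data.frame(listener  = factor(c(1, 1, 2, 2)),
                  discloser = factor(c(1, 2, 1, 2)),
                  freq      = c(3, 0, 1, 2))
# Expand to one row per case (what fre2dat() does)
raw <- toy[rep(seq_len(nrow(toy)), toy$freq), c("listener", "discloser")]
nrow(raw)  # 6 cases in total
# Tally the cases back into a contingency table (the counts that
# dat2fre() returns as a pattern-frequency data frame)
table(raw$listener, raw$discloser)
```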
Next, we add a grouping variable to the “supportive” and “unsupportive” data sets, which will help us distinguish the two groups once we combine the data sets.
# Unsupportive conversation data
cfa_list2disc_uns$Group <- "U"
cfa_disc2list_uns$Group <- "U"
# Supportive conversation data
cfa_list2disc_sup$Group <- "S"
cfa_disc2list_sup$Group <- "S"
Finally, we combine the “unsupportive” and the “supportive” conversation data sets and make sure the group variable is a factor.
# Combine data sets for the listener --> discloser transitions
cfa_list2disc_supportcomp <- rbind(cfa_list2disc_sup, cfa_list2disc_uns)
# Make group variable a factor variable
cfa_list2disc_supportcomp$Group <- as.factor(cfa_list2disc_supportcomp$Group)
# Rearrange columns
cfa_list2disc_supportcomp <- cfa_list2disc_supportcomp[, c(1, 2, 4, 3)]
# Combine data sets for the discloser --> listener transitions
cfa_disc2list_supportcomp <- rbind(cfa_disc2list_sup, cfa_disc2list_uns)
# Make group variable a factor variable
cfa_disc2list_supportcomp$Group <- as.factor(cfa_disc2list_supportcomp$Group)
# Rearrange columns
cfa_disc2list_supportcomp <- cfa_disc2list_supportcomp[, c(1, 2, 4, 3)]
Conduct the two-sample configural frequency analysis for “supportive” vs “unsupportive” conversations’ listener –> discloser transitions.
cfa_compare1 <- S2CFA(cfa_list2disc_supportcomp)
summary(cfa_compare1)
##
## Grouping by variable: Group , with categories: S U
## pattern based on variables: listener discloser
## -----------------------
## results of local tests:
## -----------------------
## discriminating Type (+) / not discriminating Type (.) based on: ex.fisher.test ;
## with Bonferroni adjusted alpha: 0.001388889
## pat. disc.Type S.exp. S.obs. U.exp. U.obs. ex.fisher.test Chi df pChi
## 1 1 1 . 18.407 19 9.593 9 0.156 0.056 1 0.812
## 2 1 2 . 5.259 5 2.741 3 0.277 0.037 1 0.847
## 3 1 3 + 504.214 476 262.786 291 0.001 6.674 1 0.010
## 4 1 4 . 102.552 100 53.448 56 0.062 0.198 1 0.656
## 5 1 5 . 18.407 15 9.593 13 0.062 1.861 1 0.172
## 6 1 6 . 1.972 3 1.028 0 0.284 1.565 1 0.211
## 7 2 1 . 6.574 7 3.426 3 0.257 0.081 1 0.776
## 8 2 2 . 0.657 0 0.343 1 0.343 1.920 1 0.166
## 9 2 3 . 20.379 23 10.621 8 0.097 0.996 1 0.318
## 10 2 4 . 3.287 4 1.713 1 0.320 0.452 1 0.501
## 11 2 5 . 0.657 1 0.343 0 0.657 0.521 1 0.470
## 12 2 6 . 0.657 0 0.343 1 0.343 1.920 1 0.166
## 13 3 1 . 64.424 76 33.576 22 0.003 6.321 1 0.012
## 14 3 2 . 3.287 2 1.713 3 0.174 1.474 1 0.225
## 15 3 3 + 222.854 250 116.146 89 0.000 11.181 1 0.001
## 16 3 4 . 27.610 27 14.390 15 0.126 0.040 1 0.841
## 17 3 5 . 38.786 30 20.214 29 0.006 5.950 1 0.015
## 18 3 6 . 11.833 16 6.167 2 0.022 4.314 1 0.038
## 19 4 1 . 28.268 32 14.732 11 0.065 1.464 1 0.226
## 20 4 2 . 0.657 0 0.343 1 0.343 1.920 1 0.166
## 21 4 3 . 51.276 52 26.724 26 0.095 0.031 1 0.861
## 22 4 4 . 11.176 11 5.824 6 0.199 0.008 1 0.928
## 23 4 5 . 4.602 6 2.398 1 0.193 1.244 1 0.265
## 24 4 6 . 2.630 4 1.370 0 0.187 2.088 1 0.148
## 25 5 1 . 9.203 8 4.797 6 0.170 0.462 1 0.497
## 26 5 2 . 0.657 1 0.343 0 0.657 0.521 1 0.470
## 27 5 3 . 151.856 155 79.144 76 0.053 0.210 1 0.647
## 28 5 4 . 15.120 15 7.880 8 0.173 0.003 1 0.958
## 29 5 5 . 3.287 3 1.713 2 0.334 0.073 1 0.787
## 30 5 6 . 0.000 0 0.000 0 1.000 NaN 1 NaN
## 31 6 1 . 34.184 42 17.816 10 0.008 5.328 1 0.021
## 32 6 2 . 0.000 0 0.000 0 1.000 NaN 1 NaN
## 33 6 3 . 225.483 211 117.517 132 0.010 3.151 1 0.076
## 34 6 4 . 28.925 27 15.075 17 0.103 0.381 1 0.537
## 35 6 5 . 6.574 7 3.426 3 0.257 0.081 1 0.776
## 36 6 6 . 3.287 1 1.713 4 0.045 4.654 1 0.031
In the results, we can examine the “Type” column to determine which transitions differed between the two groups. Two transitions discriminated between the conversation types: (1) listener acknowledgements followed by discloser elaborations, which occurred more frequently than expected in “unsupportive” conversations and less frequently than expected in “supportive” conversations, and (2) listener elaborations followed by discloser elaborations, which occurred more frequently than expected in “supportive” conversations and less frequently than expected in “unsupportive” conversations.
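A note on the adjusted alpha reported in the output: because 36 configurations are tested, the Bonferroni correction divides the nominal .05 alpha by the number of cells.

```r
# Bonferroni-adjusted alpha: nominal alpha / number of configurations
0.05 / 36  # 0.001388889, matching the value reported in the output
```

Only a transition whose exact Fisher p-value falls below this threshold is flagged as a discriminating type (+).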
Conduct the two-sample configural frequency analysis for “supportive” vs “unsupportive” conversations’ discloser –> listener transitions.
cfa_compare2 <- S2CFA(cfa_disc2list_supportcomp)
summary(cfa_compare2)
##
## Grouping by variable: Group , with categories: S U
## pattern based on variables: discloser listener
## -----------------------
## results of local tests:
## -----------------------
## discriminating Type (+) / not discriminating Type (.) based on: ex.fisher.test ;
## with Bonferroni adjusted alpha: 0.001388889
## pat. disc.Type S.exp. S.obs. U.exp. U.obs. ex.fisher.test Chi df pChi
## 1 1 1 . 10.530 11 5.470 5 0.205 0.062 1 0.804
## 2 1 2 . 5.265 6 2.735 2 0.266 0.301 1 0.583
## 3 1 3 + 78.317 95 40.683 24 0.000 10.915 1 0.001
## 4 1 4 . 21.718 26 11.282 7 0.044 2.502 1 0.114
## 5 1 5 . 15.795 17 8.205 7 0.154 0.271 1 0.602
## 6 1 6 . 30.274 27 15.726 19 0.072 1.055 1 0.304
## 7 2 1 . 7.239 5 3.761 6 0.091 2.035 1 0.154
## 8 2 2 . 0.000 0 0.000 0 1.000 NaN 1 NaN
## 9 2 3 . 3.291 4 1.709 1 0.321 0.448 1 0.503
## 10 2 4 . 0.000 0 0.000 0 1.000 NaN 1 NaN
## 11 2 5 . 1.316 1 0.684 1 0.450 0.222 1 0.637
## 12 2 6 . 0.000 0 0.000 0 1.000 NaN 1 NaN
## 13 3 1 + 519.920 486 270.080 304 0.000 9.467 1 0.002
## 14 3 2 . 21.718 27 11.282 6 0.022 3.808 1 0.051
## 15 3 3 . 216.524 231 112.476 98 0.010 3.260 1 0.071
## 16 3 4 . 61.864 63 32.136 31 0.086 0.063 1 0.801
## 17 3 5 . 128.335 131 66.665 64 0.058 0.176 1 0.675
## 18 3 6 . 237.584 239 123.416 122 0.047 0.029 1 0.865
## 19 4 1 . 105.300 106 54.700 54 0.068 0.015 1 0.904
## 20 4 2 . 1.316 0 0.684 2 0.117 3.853 1 0.050
## 21 4 3 . 21.718 26 11.282 7 0.044 2.502 1 0.114
## 22 4 4 . 8.556 10 4.444 3 0.174 0.717 1 0.397
## 23 4 5 . 25.009 24 12.991 14 0.126 0.121 1 0.728
## 24 4 6 . 28.958 22 15.042 22 0.011 4.977 1 0.026
## 25 5 1 . 14.479 13 7.521 9 0.138 0.446 1 0.504
## 26 5 2 . 1.316 1 0.684 1 0.450 0.222 1 0.637
## 27 5 3 . 46.727 41 24.273 30 0.035 2.113 1 0.146
## 28 5 4 . 7.239 4 3.761 7 0.034 4.259 1 0.039
## 29 5 5 . 2.633 3 1.367 1 0.390 0.150 1 0.698
## 30 5 6 . 1.316 1 0.684 1 0.450 0.222 1 0.637
## 31 6 1 . 5.923 6 3.077 3 0.273 0.003 1 0.957
## 32 6 2 . 0.658 1 0.342 0 0.658 0.520 1 0.471
## 33 6 3 . 9.214 12 4.786 2 0.070 2.478 1 0.115
## 34 6 4 . 0.658 1 0.342 0 0.658 0.520 1 0.471
## 35 6 5 . 1.316 2 0.684 0 0.433 1.040 1 0.308
## 36 6 6 . 1.974 2 1.026 1 0.444 0.001 1 0.975
In the results, we can again examine the “Type” column to determine which transitions differed between the two groups. Two transitions discriminated between the conversation types: (1) discloser acknowledgements followed by listener elaborations, which occurred more frequently than expected in “supportive” conversations and less frequently than expected in “unsupportive” conversations, and (2) discloser elaborations followed by listener acknowledgements, which occurred more frequently than expected in “unsupportive” conversations and less frequently than expected in “supportive” conversations.
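To make the local tests concrete, here is a sketch of how a single one can be reproduced by hand, assuming (following Stemmler, 2020) that each local test crosses group membership with “this configuration” vs. “all other configurations” in a 2x2 table. For the discloser type 1 --> listener type 3 configuration above, S.obs = 95 and U.obs = 24, and summing the S.obs and U.obs columns gives group totals of 1644 and 854.

```r
# Expected frequency in the supportive group: the cell total scaled by
# the supportive group's share of all transitions
(95 + 24) * 1644 / (1644 + 854)  # 78.317, matching S.exp for pattern 1 3
# Exact Fisher test on the 2x2 (group x in-cell/other-cells) table
tab <- matrix(c(95, 1644 - 95,
                24,  854 - 24),
              nrow = 2, byrow = TRUE,
              dimnames = list(Group = c("S", "U"),
                              Cell  = c("1 -> 3", "other")))
fisher.test(tab)$p.value  # falls below the adjusted alpha, hence the "+"
```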
Additional Information
We created this tutorial with a system environment and versions of R and packages that might differ from yours. If R reports errors when you attempt to run this tutorial, running the code chunk below and comparing your output with the session information posted in the tutorial on the LHAMA website may be helpful.
session_info(pkgs = c("attached"))
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.0 (2022-04-22)
## os macOS Big Sur/Monterey 10.16
## system x86_64, darwin17.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2022-07-26
## pandoc 2.17.1.1 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## confreq * 1.5.6-7 2022-04-12 [1] CRAN (R 4.2.0)
## data.table * 1.14.2 2021-09-27 [1] CRAN (R 4.2.0)
## devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.2.0)
## dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.2.0)
## ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
## gmp * 0.6-5 2022-03-17 [1] CRAN (R 4.2.0)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
## readr * 2.1.2 2022-01-30 [1] CRAN (R 4.2.0)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
## tibble * 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
## tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.2.0)
## usethis * 2.1.6 2022-05-25 [1] CRAN (R 4.2.0)
## vcd * 1.4-10 2022-06-09 [1] CRAN (R 4.2.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────