|
@@ -1,17 +1,17 @@
|
|
|
---
|
|
|
title: Supplementary Materials to Establishing the reliability and validity of measures extracted from long-form recordings
|
|
|
output:
|
|
|
+ pdf_document:
|
|
|
+ toc: yes
|
|
|
+ toc_depth: 3
|
|
|
html_document:
|
|
|
toc: yes
|
|
|
toc_depth: '3'
|
|
|
df_print: paged
|
|
|
- pdf_document:
|
|
|
- toc: yes
|
|
|
- toc_depth: 3
|
|
|
---
|
|
|
|
|
|
```{r setup, include=FALSE, eval=TRUE}
|
|
|
-knitr::opts_chunk$set(echo = FALSE, warning = FALSE)
|
|
|
+knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
|
|
|
|
|
|
set.seed(726282)
|
|
|
|
|
@@ -296,6 +296,10 @@ panel.background = element_blank(), legend.key=element_blank(), axis.line = elem
|
|
|
|
|
|
|
|
|
|
|
|
+The majority of measures had ICCs between .3 and .5. `r sum(df.icc.mixed$icc_child_id > .5)` measures had higher ICCs, and surprisingly, `r sum(df.icc.mixed$icc_child_id[grep("och",df.icc.mixed$metric)] > .5)` of them corresponded to the "other child" category, known to have the worst accuracy according to previous analyses (Cristia et al., 2020).
|
|
|
+
|
|
|
+### Checking whether high ICCs for other child measures are due to the presence of siblings
|
|
|
+
|
|
|
```{r explo-och-sibn}
|
|
|
|
|
|
#df.icc.mixed[df.icc.mixed$icc_child_id>.5,c("data_set","metric")]
|
|
@@ -323,26 +327,32 @@ myaclewdat$ch_id[myaclewdat$experiment=="winnipeg"]=gsub(" C"," CW",myaclewdat$c
|
|
|
#sum(myaclewdat$ch_id %in% x$ch_id)
|
|
|
#sum(x$ch_id %in% myaclewdat$ch_id)
|
|
|
|
|
|
-
|
|
|
-mydat2=merge(myaclewdat,x[,c("ch_id","n_of_siblings")],all.x=T,by="ch_id")
|
|
|
+metadata=x[,c("ch_id","n_of_siblings")]
|
|
|
+read.csv("../input/quechua_md.csv")->x
|
|
|
+x$ch_id=paste("que",x$child_id)
|
|
|
+metadata=rbind(metadata,x[,c("ch_id","n_of_siblings")])
|
|
|
+mydat2=merge(myaclewdat,metadata,all.x=T,by="ch_id")
|
|
|
#table(mydat2$n_of_siblings,mydat2$experiment)
|
|
|
-
|
|
|
+has_n_of_sib=table(mydat2$experiment,!is.na(mydat2$n_of_siblings))
|
|
|
+corp_w_sib=levels(factor(mydat2$experiment[!is.na(mydat2$n_of_siblings)]))
|
|
|
+corp_w_sib_clean=corp_w_sib[1]
|
|
|
+corp_w_sib_clean=paste(corp_w_sib,collapse=", ") #collapse in one call; 2:length(x) would misbehave if only one corpus had sibling data
|
|
|
+
|
|
|
model<-lmer(voc_dur_och_ph~ age_s +n_of_siblings + (1|experiment/child_id),data=mydat2)
|
|
|
+
|
|
|
#the full model above is singular, so we refit it without the corpus random effect
|
|
|
model<-lmer(voc_dur_och_ph~ age_s +n_of_siblings + (1|child_id),data=mydat2)
|
|
|
icc.result.split<- t(as.data.frame(icc(model, by_group=TRUE))$ICC)
|
|
|
#names(icc.result.split)=c("icc_child_id", "icc_corpus")
|
|
|
names(icc.result.split)=c("icc_child_id")
|
|
|
|
|
|
-read.csv("../input/quechua_md.csv")->x ##OH this needs to be done!!!
|
|
|
-
|
|
|
-
|
|
|
```
|
|
|
|
|
|
|
|
|
-The majority of measures had ICCs between .3 and .5. `r sum(df.icc.mixed$icc_child_id > .5)` measures had higher ICCs, and surprisingly, `r sum(df.icc.mixed$icc_child_id > .5)` of them
|
|
|
+We reasoned this may be because children in our corpora vary in the number of siblings they have, and siblings' presence may be stable across recordings. To address this possibility, we fit the full model again to predict number of vocalizations from other children, but this time including sibling number as a fixed effect $lmer(metric~ age + sibling_number + (1|corpus/child))$, so that individual variation that was actually due to sibling number was captured by that fixed effect instead of the random effect for child. We had sibling number data for `r sum(has_n_of_sib[,"TRUE"])` recordings from `r length(levels(factor(mydat2$child_id[!is.na(mydat2$n_of_siblings)])))` children in `r length(levels(factor(mydat2$experiment[!is.na(mydat2$n_of_siblings)])))` corpora (`r corp_w_sib_clean`). We fit this model for the metric with the highest Child ICC, ACLEW's total vocalization duration by other children. Results indicated the full model was singular, so we fitted a No Corpus model to be able to extract a Child ICC. In fact, there was no difference in Child ICC between our original analysis (`r round(df.icc.mixed[df.icc.mixed$metric=="voc_dur_och_ph" & df.icc.mixed$data_set=="aclew","icc_child_id"],2)`) and this re-analysis including the number of siblings (`r round(icc.result.split["icc_child_id"],2)`).
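
The reasoning above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variable names, the effect sizes, and the balanced design are hypothetical, and it uses a one-way ANOVA estimator of the ICC in base R rather than the `lmer`-based estimate used in the analysis. It only shows why a stable child-level covariate could, in principle, inflate Child ICC; as reported above, that is not what we observed in our data.

```r
# Simulated data only: names and effect sizes are illustrative assumptions.
set.seed(1)
n_child <- 200   # hypothetical number of children
k <- 2           # recordings per child (balanced design)
child <- factor(rep(seq_len(n_child), each = k))
# Sibling number is constant within child, so it is stable across recordings
n_sib <- rep(sample(0:4, n_child, replace = TRUE), each = k)
child_effect <- rep(rnorm(n_child, sd = 0.5), each = k)
y <- 1.0 * n_sib + child_effect + rnorm(n_child * k, sd = 1)

# One-way ANOVA estimator of the ICC for a balanced design
icc_oneway <- function(y, group, k) {
  ms <- summary(aov(y ~ group))[[1]][["Mean Sq"]]
  (ms[1] - ms[2]) / (ms[1] + (k - 1) * ms[2])
}

# ICC on the raw outcome versus after regressing out sibling number
icc_raw <- icc_oneway(y, child, k)
icc_adj <- icc_oneway(resid(lm(y ~ n_sib)), child, k)
c(raw = icc_raw, adjusted = icc_adj)
```

In this simulation the sibling effect is built in, so the adjusted ICC comes out lower than the raw one; when the covariate carries little child-level variance, the two estimates stay close.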
|
|
|
+
|
|
|
|
|
|
-Six measures had higher ICCs, and surprisingly, they corresponded to the "other child" category, known to have the worst accuracy according to previous analyses (Cristia et al., 2020). We reasoned this may be because children in our corpora vary in terms of the number of siblings they have, and that siblings' presence may be stable across recordings. To address this possibility, we fit the full model again to predict number of vocalizations from other children, but this time including sibling number as a fixed effect $lmer(metric~ age + sibling_number + (1|corpus/child))$, so that individual variation that was actually due to sibling number was captured by that fixed effect instead of the random effect for child. We did this for the metric with the highest Child ICC, ACLEW's total vocalization duration by other children. Results indicated the full model was singular, a first sign that included variables explained shared variance. When we fitted the No Corpus model, Child ICC was indeed reduced from `r round(df.icc.mixed[df.icc.mixed$metric=="voc_dur_och_ph" & df.icc.mixed$data_set=="aclew","icc_child_id"],2)` to `r round(icc.result.split["icc_child_id"],2)`.
|
|
|
+### Code to reproduce text before Table 3
|
|
|
|
|
|
```{r reg model icc}
|
|
|
#I moved this chunk here -- check that nothing is broken by it
|
|
@@ -356,7 +366,7 @@ reg_anova=Anova(lr_icc_chi)
|
|
|
```
|
|
|
|
|
|
|
|
|
-Going back to our overarching analyses, we explored how similar Child ICCs were across different talker types and pipelines. We fit a linear model with the formula lm(icc_child_id ~ type * pipeline), where type indicates whether the measure pertained to the key child, (female/male) adults, other children; and pipeline LENA or ACLEW. We found an adjusted R-squared of `r round(reg_sum$adj.r.squared*100)`%, suggesting much of the variance across Child ICCs was explained by this model. A Type 3 ANOVA on this model revealed only type was a signficant (F(`r reg_anova["Type","Df"]`)=`r round(reg_anova["Type","F value"],1)`, p<.001), whereas neither pipeline nor the interaction between type and pipeline were significant.
|
|
|
+Going back to our overarching analyses, we explored how similar Child ICCs were across different talker types and pipelines. We fit a linear model with the formula $lm(icc_child_id ~ type * pipeline)$, where type indicates whether the measure pertained to the key child, (female/male) adults, or other children; and pipeline LENA or ACLEW. We found an adjusted R-squared of `r round(reg_sum$adj.r.squared*100)`%, suggesting much of the variance across Child ICCs was explained by this model. A Type 3 ANOVA on this model revealed that only type was significant (F(`r reg_anova["Type","Df"]`)=`r round(reg_anova["Type","F value"],1)`, p<.001), whereas neither pipeline nor the interaction between type and pipeline was significant.
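
As a minimal sketch of where the adjusted R-squared reported above comes from, the following base R example fits the same kind of model to simulated values. The data frame `d`, its column names, and the effect sizes are hypothetical and only stand in for the table of Child ICCs; the ANOVA step is not reproduced here.

```r
# Simulated stand-in for a table of Child ICCs (illustrative assumptions only)
set.seed(2)
d <- expand.grid(type = c("adults", "key_child", "other_children"),
                 pipeline = c("aclew", "lena"),
                 rep = 1:8)
type_shift <- c(adults = 0, key_child = 0.05, other_children = 0.2)
d$icc <- 0.35 + type_shift[as.character(d$type)] + rnorm(nrow(d), sd = 0.05)

# Linear model with main effects of type and pipeline plus their interaction
fit <- lm(icc ~ type * pipeline, data = d)
s <- summary(fit)

# Adjusted R-squared recomputed by hand from the unadjusted R-squared
n <- nrow(d)                  # number of observations
p <- length(coef(fit)) - 1    # number of predictors, excluding the intercept
adj_manual <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)
c(summary = s$adj.r.squared, manual = adj_manual)
```

The two values agree, which makes explicit that the adjusted figure penalizes the plain R-squared for the number of model terms.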
|
|
|
|
|
|
|
|
|
|
|
@@ -413,7 +423,7 @@ mydat_aclew <- read.csv(paste0('../data_output/', "aclew",'_metrics_scaled.csv')
|
|
|
#length(dist_contig_aclew$session_id[!(dist_contig_aclew$session_id %in% dist_contig_lena$session_id)]) # in fact, we have lots of sessions not in common!
|
|
|
#length(dist_contig_lena$session_id[!(dist_contig_lena$session_id %in% dist_contig_aclew$session_id)])
|
|
|
# they are present in aclew but not in lena
|
|
|
-# NOTE: I have "winnipeg C175 C175_20151201" "winnipeg C175 C175_20160301" for lena but not aclew; and i have "fausey-trio T066 T066/T066_000700" "quechua 1096 20190630_190025_009107" "quechua 1096 20190702_193551_008712" for aclew but not lena? It may well be a bug I introduced myself when adding the ava standard score (but if that were the case, I'd only have some things present in aclew but not LENA -- the fact that I have some in lena but not aclew would remain unexplained).
|
|
|
+# NOTE: I have "winnipeg C175 C175_20151201" "winnipeg C175 C175_20160301" for lena but not aclew; and I have "fausey-trio T066 T066/T066_000700" "quechua 1096 20190630_190025_009107" "quechua 1096 20190702_193551_008712" for aclew but not lena
|
|
|
|
|
|
#one thing that drove me crazy was that, probably because of the small differences in inclusion (2 recs in aclew & lena respectively), I was ending up with different lists of pairings across aclew & lena. So to simplify, I'll impose the same pairing across both, which involves losing a couple of additional recs in lena
|
|
|
xxx=mydat_aclew[mydat_aclew$session_id %in% mydat_lena$session_id,]
|
|
@@ -505,7 +515,6 @@ ggplot(rval_tab, aes(y = m, x = toupper(p))) +
|
|
|
## Code to reproduce results of regression on correlation values
|
|
|
|
|
|
```{r reg model cor}
|
|
|
-#bug here probably inherited from above
|
|
|
|
|
|
|
|
|
lr_cor <- lm(m ~ Type * p, data=rval_tab)
|
|
@@ -516,14 +525,15 @@ reg_sum_cor=summary(lr_cor)
|
|
|
|
|
|
reg_anova_cor=Anova(lr_icc_chi)
|
|
|
|
|
|
+cor_t=t.test(rval_tab$m ~ rval_tab$p)
|
|
|
+
|
|
|
```
|
|
|
|
|
|
-To see whether correlations in this analysis differed by talker types and pipelines, we fit a linear model with the formula lm(cor ~ type * pipeline), where type indicates whether the measure pertained to the key child, (female/male) adults, other children; and pipeline LENA or ACLEW. We found an adjusted R-squared of `r round(reg_sum_cor$adj.r.squared*100)`%, suggesting this model did not explain a great deal of variance in correlation coefficients. Moreover, a Type 3 ANOVA on this model revealed no significant effects or interactions (all p's > .1). See SMXX for fuller results.
|
|
|
+To see whether correlations in this analysis differed by talker types and pipelines, we fit a linear model with the formula $lm(cor ~ type * pipeline)$, where type indicates whether the measure pertained to the key child, (female/male) adults, other children; and pipeline LENA or ACLEW. We found an adjusted R-squared of `r round(reg_sum_cor$adj.r.squared*100)`%, suggesting this model did not explain a great deal of variance in correlation coefficients. A Type 3 ANOVA on this model revealed a significant effect of pipeline (F = `r round(reg_anova_cor["data_set","F value"],2)`, p = `r round(reg_anova_cor["data_set","Pr(>F)"],2)`), due to higher correlations for ACLEW (m = `r round(cor_t$estimate["mean in group aclew"],2)`) than for LENA metrics (m = `r round(cor_t$estimate["mean in group lena"],2)`). See below for fuller results.
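
The inline code above indexes the t-test estimates by name. A minimal sketch on simulated data shows how those names arise from the levels of the grouping variable; the data frame `rv` and its values are hypothetical and only mimic the two columns used (`m` for the correlation, `p` for the pipeline).

```r
# Simulated stand-in for the table of correlations (illustrative assumptions only)
set.seed(3)
rv <- data.frame(p = rep(c("aclew", "lena"), each = 20),
                 m = c(rnorm(20, mean = 0.6, sd = 0.1),
                       rnorm(20, mean = 0.5, sd = 0.1)))

# Welch two-sample t-test comparing mean correlation across pipelines
tt <- t.test(m ~ p, data = rv)

# The estimates are named after the group levels, in alphabetical order
names(tt$estimate)
round(tt$estimate["mean in group aclew"], 2)
```

If the pipeline labels were coded differently (for instance in upper case), the names of the estimates would change accordingly and the inline lookups would need to match them.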
|
|
|
|
|
|
```{r print out anova results rec on cor}
|
|
|
-#bug here probably inherited from above
|
|
|
|
|
|
-kable(reg_anova_cor)
|
|
|
+kable(round(reg_anova_cor,2))
|
|
|
```
|
|
|
|
|
|
|
|
@@ -538,10 +548,10 @@ df.icc.corpus$Type <- get_type(df.icc.corpus)
|
|
|
|
|
|
```
|
|
|
|
|
|
-Figure 5A addresses this question, showing the distribution of ICC across our 53 metrics in each of the `r length(levels(factor(df.icc.corpus$corpus)))` included corpora. Out of `r dim(df.icc.corpus)[1]` fitted models (53 metrics times `r length(levels(factor(df.icc.corpus$corpus)))` corpora), `r sum(df.icc.corpus$formula=="no_chi_effect")` were singular when including a random intercept per child, and therefore they could not be included in these analyses at all, and the remaining `r sum(df.icc.corpus$formula=="no_exp")` were singular when including a random intercept per corpus.
|
|
|
+Figure 5 addresses this question, showing the distribution of ICC across our `r dim(df.icc.mixed)[1]` metrics in each of the `r length(levels(factor(df.icc.corpus$corpus)))` included corpora. Out of `r dim(df.icc.corpus)[1]` fitted models (`r dim(df.icc.mixed)[1]` metrics times `r length(levels(factor(df.icc.corpus$corpus)))` corpora), `r sum(df.icc.corpus$formula=="no_chi_effect")` were singular when including a random intercept per child, and therefore they could not be included in these analyses at all. (Including a random intercept per corpus is not relevant here, since only data from one corpus is included in each model fit.)
|
|
|
|
|
|
|
|
|
-```{r icc-bycor-fig5A, echo=F,fig.width=4, fig.height=10,fig.cap="Child ICC by metric type and pipeline, when considering each corpus separately."}
|
|
|
+```{r icc-bycor-fig5, echo=F,fig.width=4, fig.height=10,fig.cap="Child ICC by metric type and pipeline, when considering each corpus separately."}
|
|
|
|
|
|
ggplot(df.icc.corpus, aes(y = icc_child_id, x = toupper(data_set))) +
|
|
|
geom_violin(alpha = 0.5) +
|
|
@@ -552,7 +562,7 @@ ggplot(df.icc.corpus, aes(y = icc_child_id, x = toupper(data_set))) +
|
|
|
```
|
|
|
|
|
|
|
|
|
-```{r icc-bycor-fig5B, echo=F,fig.width=4, fig.height=10,fig.cap="Correlations in Child ICC across corpora. Each point indicates the correlation in Child ICC for the corpus named in the x-axis with every other corpus."}
|
|
|
+```{r icc-bycor-fig6, echo=F,fig.width=4, fig.height=10,fig.cap="Correlations in Child ICC across corpora. Each point indicates the correlation in Child ICC for the corpus named in the x-axis with every other corpus."}
|
|
|
|
|
|
|
|
|
|
|
@@ -580,7 +590,7 @@ ggplot(r_X_corpus, aes(y = cor, x = corpusA)) +
|
|
|
|
|
|
|
|
|
|
|
|
-```{r reg model corpusm,eval=F}
|
|
|
+```{r reg model corpusm}
|
|
|
|
|
|
|
|
|
cor_icc <- lm(icc_child_id ~ Type * data_set * corpus, data=df.icc.corpus)
|
|
@@ -593,16 +603,10 @@ reg_anova_cor_icc=Anova(cor_icc)
|
|
|
|
|
|
```
|
|
|
|
|
|
+The fact that we cannot infer reliability from one corpus based on another one was confirmed statistically: We checked whether Child ICC differed by talker types and pipelines across corpora by fitting a linear model with the formula $lm(Child_ICC ~ type * pipeline * corpus)$, where type indicates whether the measure pertained to the key child, (female/male) adults, other children; pipeline LENA or ACLEW; and corpus the corpus ID. We found an adjusted R-squared of `r round(reg_sum_cor_icc$adj.r.squared*100)`%, suggesting this model explained nearly half of the variance in Child ICC. A Type 3 ANOVA on this model revealed several significant effects and interactions, including a three-way interaction of type, pipeline, and corpus; a two-way interaction of type and corpus; and a main effect of corpus. See below for more information.
|
|
|
|
|
|
-
|
|
|
-```{r print out anova results rec on icc by corpus,eval=F}
|
|
|
-kable(reg_anova_cor_icc)
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-# The fact that we cannot infer reliability from one corpus based on another one was confirmed statistically: We checked whether Child ICC differed by talker types and pipelines across corpora by fitting a linear model with the formula $lm(Child_ICC ~ type * pipeline * corpus)$, where type indicates whether the measure pertained to the key child, (female/male) adults, other children; pipeline LENA or ACLEW; and corpus the corpus ID. We found an adjusted R-squared of `r round(reg_sum_cor_icc$adj.r.squared*100)`%, suggesting this model explained over half of the variance in Child ICC. A Type 3 ANOVA on this model revealed several significant effects and interactions, including a three-way interaction of type, pipeline, and corpus; a two-way interaction of type and corpus; and a main effect of corpus. See the Supplementary Materials for more information.
|
|
|
-
|
|
|
+```{r print out anova results rec on icc by corpus}
|
|
|
+kable(round(reg_anova_cor_icc,2))
|
|
|
|
|
|
```
|
|
|
|
|
@@ -622,10 +626,10 @@ df.icc.age$age_bin<-factor(df.icc.age$age_bin,levels=age_levels)
|
|
|
df.icc.age$Type<-get_type(df.icc.age)
|
|
|
```
|
|
|
|
|
|
-Out of `r dim(df.icc.age)[1]` fitted models (53 metrics times `r length(levels(factor(df.icc.age$age_bin)))` age bins), `r sum(df.icc.age$formula=="no_chi_effect")` were singular when including a random intercept per child, and therefore they could not be included in these analyses at all. In addition, `r sum(df.icc.age$formula=="no_exp")` were singular when including a random intercept per corpus. The remaining `r sum(df.icc.age$formula=="full")` could be analyzed with the full model.
|
|
|
+Out of `r dim(df.icc.age)[1]` fitted models (`r dim(df.icc.mixed)[1]` metrics times `r length(levels(factor(df.icc.age$age_bin)))` age bins), `r sum(df.icc.age$formula=="no_chi_effect")` were singular when including a random intercept per child, and therefore they could not be included in these analyses at all. In addition, `r sum(df.icc.age$formula=="no_exp")` were singular when including a random intercept per corpus. The remaining `r sum(df.icc.age$formula=="full")` could be analyzed with the full model.
|
|
|
|
|
|
|
|
|
-```{r relBYage-fig6A, echo=F,fig.width=6, fig.height=10,fig.cap="Distribution of ICC attributed to corpus (a) and children (b), when binning children's age."}
|
|
|
+```{r relBYage-fig7, echo=F,fig.width=6, fig.height=10,fig.cap="Distribution of ICC attributed to corpus (a) and children (b), when binning children's age."}
|
|
|
|
|
|
#this complicated section is just to add N of participants in each facet, we first estimate it:
|
|
|
facet_labels_chi=facet_labels_cor=NULL
|
|
@@ -647,30 +651,15 @@ ggplot(df.icc.age, aes(y = icc_child_id, x = toupper(data_set))) +
|
|
|
geom_violin(alpha = 0.5) +
|
|
|
geom_quasirandom(aes(colour = Type,shape = Type)) +
|
|
|
theme(legend.position="none") +labs( y = "r",x="Pipeline") + facet_wrap(~age_bin, ncol = 3) +
|
|
|
- geom_text(x=1.5,y=max(df.icc.age$icc_child_id,na.rm=T),aes(label=facet_labels_chi),data=f_labels,size=2) +
|
|
|
- geom_text(x=1.5,y=max(df.icc.age$icc_child_id,na.rm=T)*.95,aes(label=facet_labels_cor),data=f_labels,size=2)
|
|
|
+ geom_text(x=1.5,y=max(df.icc.age$icc_child_id,na.rm=T),aes(label=facet_labels_chi),data=f_labels,size=3) +
|
|
|
+ geom_text(x=1.5,y=max(df.icc.age$icc_child_id,na.rm=T)*.95,aes(label=facet_labels_cor),data=f_labels,size=3)
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
-```{r reg model age}
|
|
|
-
|
|
|
|
|
|
-age_icc <- lm(icc_child_id ~ Type * data_set * age_bin, data=df.icc.age)
|
|
|
-#plot(age_icc)
|
|
|
-#binomial could be used, diagnostic plots look good
|
|
|
-
|
|
|
-reg_sum_age_icc=summary(age_icc)
|
|
|
-
|
|
|
-reg_anova_age_icc=Anova(age_icc)
|
|
|
-
|
|
|
-```
|
|
|
-
|
|
|
-As we did in the previous section for corpus, we checked whether Child ICC differed by talker types and pipelines across age bins by fitting a linear model with the formula $lm(Child_ICC ~ type * pipeline * age_bin)$. We found an adjusted R-squared of `r round(reg_sum_age_icc$adj.r.squared*100)`%, suggesting this model explained over half of the variance in Child ICC. However, a Type 3 ANOVA on this model revealed only an interaction of type and age bin, as well as a main effect of age bin, suggesting less complex effects than in the case of corpus. See the Supplementary Materials for more information.
|
|
|
-
|
|
|
-
|
|
|
-```{r icc-bycor-fig6B, echo=F,fig.width=4, fig.height=4,fig.cap="Correlations in Child ICC across corpora. Each point indicates the correlation in Child ICC for the corpus named in the x-axis with every other corpus."}
|
|
|
+```{r icc-bycor-fig8, echo=F,fig.width=4, fig.height=4,fig.cap="Correlations in Child ICC across age bins. Each point indicates the correlation in Child ICC for the age bin named in the x-axis with every other age bin."}
|
|
|
|
|
|
r_X_age = NULL
|
|
|
|
|
@@ -699,6 +688,28 @@ ggplot(r_X_age, aes(y = cor, x = ageA)) +
|
|
|
|
|
|
|
|
|
|
|
|
+```{r reg model age}
|
|
|
+
|
|
|
+
|
|
|
+age_icc <- lm(icc_child_id ~ Type * data_set * age_bin, data=df.icc.age)
|
|
|
+#plot(age_icc)
|
|
|
+#binomial could be used, diagnostic plots look good
|
|
|
+
|
|
|
+reg_sum_age_icc=summary(age_icc)
|
|
|
+
|
|
|
+reg_anova_age_icc=Anova(age_icc)
|
|
|
+
|
|
|
+```
|
|
|
+
|
|
|
+As we did in the previous section for corpus, we checked whether Child ICC differed by talker types and pipelines across age bins by fitting a linear model with the formula $lm(Child_ICC ~ type * pipeline * age_bin)$. We found an adjusted R-squared of `r round(reg_sum_age_icc$adj.r.squared*100)`%, suggesting this model explained about a third of the variance in Child ICC. However, a Type 3 ANOVA on this model revealed only an interaction of type and age bin, as well as a main effect of age bin, suggesting less complex effects than in the case of corpus. See below for more information.
|
|
|
+
|
|
|
+
|
|
|
+```{r print out anova results rec on icc by age}
|
|
|
+kable(round(reg_anova_age_icc,2))
|
|
|
+
|
|
|
+```
|
|
|
+
|
|
|
+
|
|
|
## Save information about packages used
|
|
|
|
|
|
```{r}
|