YODA repo to align the Vandam corpus using the Montreal Forced Aligner (MFA).

mfrebo 707d897fcf Add 'LICENSE' 2 years ago
.datalad c11e500a15 [DATALAD] new dataset 2 years ago
.vscode 385198801d corrected with lucas' review 2 years ago
code dbd9efcf3f updated compare.py + confusion matrices 2 years ago
inputs 2cb46a39ea cleaned unnecessary files 2 years ago
outputs dbd9efcf3f updated compare.py + confusion matrices 2 years ago
.gitattributes c84fb1c33e Apply YODA dataset setup 2 years ago
.gitmodules 1d2b1741b5 [DATALAD] Recorded changes 2 years ago
CHANGELOG.md c84fb1c33e Apply YODA dataset setup 2 years ago
Comparison-summary.md 17093d6efb edited comaprison.md 2 years ago
LICENSE 707d897fcf Add 'LICENSE' 2 years ago
README.md 2820fd10b6 modified readme 2 years ago
datacite.yml dedb8732a6 Add information for publishing with DataCite 2 years ago

README.md

Project

Dataset structure

  • All inputs (i.e. building blocks from other sources) are located in inputs/.
  • All custom code is located in code/.

Steps to generate aligned .csv from vandam-data .cha annotations

  1. Run code/csv2grid with annotations/cha/converted as input. This converts the original .csv files to .TextGrid files.
  2. Run MFA align on the output files of the previous step, using inputs/mfa-models/acoustic and inputs/mfa-models/dictionary as the acoustic model and dictionary.
  3. Run code/grid2csv on the outputs of the previous step to convert the aligned .TextGrid files back to .csv.
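
As a rough illustration of steps 1 and 2, the sketch below shows one way a .csv of segments could be rendered as a Praat TextGrid and how the MFA command line could be assembled. It is not the repository's code/csv2grid script. The column names (segment_onset and segment_offset in milliseconds, transcription), the tier name, the helper function names, and the outputs/textgrids and outputs/aligned directories are assumptions made for the example; only the inputs/mfa-models paths come from the steps above. Segments are assumed not to overlap.

```python
import csv
import io


def read_segments(csv_text):
    """Parse CSV text into (onset_s, offset_s, text) tuples.

    Assumed columns: segment_onset, segment_offset (milliseconds), transcription.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (int(r["segment_onset"]) / 1000, int(r["segment_offset"]) / 1000, r["transcription"])
        for r in reader
    ]


def rows_to_textgrid(rows, tier_name="utterances"):
    """Render non-overlapping segments as a long-format TextGrid with one interval tier."""
    rows = sorted(rows)
    total_end = max(offset for _, offset, _ in rows)
    # An interval tier has to cover the whole time span, so gaps become empty intervals.
    intervals, cursor = [], 0.0
    for onset, offset, text in rows:
        if onset > cursor:
            intervals.append((cursor, onset, ""))
        intervals.append((onset, offset, text))
        cursor = offset
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {total_end}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {total_end}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, text) in enumerate(intervals, start=1):
        escaped = text.replace('"', '""')  # Praat escapes quotes by doubling them
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


def mfa_align_command(corpus_dir, dictionary, acoustic_model, output_dir):
    """Build the argument list for `mfa align` (corpus, dictionary, acoustic model, output)."""
    return ["mfa", "align", corpus_dir, dictionary, acoustic_model, output_dir]


# Made-up example data, not taken from the corpus.
example_csv = (
    "segment_onset,segment_offset,transcription\n"
    "500,1500,hello there\n"
    "2000,3250,want some juice\n"
)
segments = read_segments(example_csv)
textgrid = rows_to_textgrid(segments)
command = mfa_align_command(
    "outputs/textgrids",             # hypothetical corpus directory
    "inputs/mfa-models/dictionary",
    "inputs/mfa-models/acoustic",
    "outputs/aligned",               # hypothetical output directory
)
```

The command list could be passed to subprocess.run once the TextGrids are written to disk. MFA expects each TextGrid to sit next to an audio file with the same base name, and the argument order shown here should be checked against the installed MFA version.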

Steps for comparison of aligned segments with human annotator

  1. Use the child-project sampler to generate five 1-minute high-volubility segments and write them to outputs/.
  2. Run the child-project eaf-builder on the files generated in the previous step, using the templates in inputs/eaf_templates.
  3. Annotate the segments by hand in ELAN.
  4. Create a .csv dataframe listing each segment in outputs/fivesegments-eaf.
  5. Import that .csv with child-project import-annotations.
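
For step 4, a minimal sketch of how the .csv dataframe could be assembled is given below. The column names reflect my understanding of the input expected by child-project import-annotations and should be verified against the ChildProject documentation. The recording name, the segment onsets, the annotation set name, and the .eaf file naming pattern are all hypothetical placeholders, not values from this dataset.

```python
import csv
import io

# Assumed column set for the import-annotations input; verify against the ChildProject docs.
FIELDS = [
    "set", "recording_filename", "time_seek",
    "range_onset", "range_offset", "raw_filename", "format",
]


def build_import_rows(segments, annotation_set="eaf", fmt="eaf"):
    """Build one row per annotated segment from (recording, onset_ms, offset_ms) tuples."""
    rows = []
    for recording, onset_ms, offset_ms in segments:
        stem = recording.rsplit(".", 1)[0]
        rows.append({
            "set": annotation_set,
            "recording_filename": recording,
            "time_seek": 0,
            "range_onset": onset_ms,
            "range_offset": offset_ms,
            # Hypothetical naming pattern for the hand-annotated .eaf files.
            "raw_filename": f"{stem}_{onset_ms}_{offset_ms}.eaf",
            "format": fmt,
        })
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Five hypothetical 1-minute windows (onsets in milliseconds) from a placeholder recording.
onsets = [600000, 1800000, 3000000, 4200000, 5400000]
segments = [("recording.wav", onset, onset + 60000) for onset in onsets]
rows = build_import_rows(segments)
csv_text = rows_to_csv(rows)
```

Writing csv_text to a file gives a table with one line per segment, which is the kind of file step 5 takes as input.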
datacite.yml
Title Alignment of Vandam corpus using Montreal Forced Aligner
Authors FREBOURG,Martin;McGill University
Gautheron,Lucas;École Normale Supérieure - PSL
Cristia,Alejandrina;École Normale Supérieure - PSL
Description YODA repo to align vandam corpus using Montreal Forced Aligner (MFA).
License Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
References Citation1 [doi:10.xxx/zzzz] (IsSupplementTo)
Citation2 [arxiv:mmmm.nnnn] (IsSupplementTo)
Citation3 [pmid:nnnnnnnn] (IsReferencedBy)
Funding DFG, AB1234/5-6
EU, EU.12345
Keywords Neuroscience
Linguistics
Vandam
Montreal Forced Aligner
MFA
annotations
Resource Type Dataset