mfrebo/align-vandam: YODA repo to align vandam corpus using Montreal Forced Aligner.

mfrebo 456da85a3a Update 'datacite.yml'		3 years ago
.datalad	c11e500a15 [DATALAD] new dataset	3 years ago
.vscode	385198801d corrected with lucas' review	3 years ago
code	dbd9efcf3f updated compare.py + confusion matrices	3 years ago
inputs	2cb46a39ea cleaned unnecessary files	3 years ago
outputs	dbd9efcf3f updated compare.py + confusion matrices	3 years ago
.gitattributes	c84fb1c33e Apply YODA dataset setup	3 years ago
.gitmodules	1d2b1741b5 [DATALAD] Recorded changes	3 years ago
CHANGELOG.md	c84fb1c33e Apply YODA dataset setup	3 years ago
Comparison-summary.md	17093d6efb edited comaprison.md	3 years ago
LICENSE	707d897fcf Add 'LICENSE'	3 years ago
README.md	2820fd10b6 modified readme	3 years ago
datacite.yml	456da85a3a Update 'datacite.yml'	3 years ago

Steps to generate aligned .csv from vandam-data .cha annotations

Run code/csv2grid with annotations/cha/converted as input to convert the original .csv files to .TextGrid.

Run MFA Align with the output files of the previous step as input (inputs/mfa-models/acoustic and inputs/mfa-models/dictionary are required).

Run code/grid2csv with the outputs of the previous step as input to convert the aligned .TextGrid files to .csv.
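The first two steps above can be sketched as follows. This is a minimal illustration, not the repository's code/csv2grid script: the column names `segment_onset`, `segment_offset` (in milliseconds) and `transcription`, the tier name, and the `corpus` and `aligned` directories are assumptions made for the example. The `mfa align` form is the one used by recent MFA releases; older releases, such as the one cited in datacite.yml, shipped an `mfa_align` executable instead.

```python
import csv
import io


def read_intervals(csv_text):
    """Parse CSV text into (onset_s, offset_s, text) tuples.

    Column names are hypothetical; adapt them to the converted .csv files.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (int(row["segment_onset"]) / 1000, int(row["segment_offset"]) / 1000,
         row["transcription"])
        for row in reader
    ]


def intervals_to_textgrid(intervals, tier_name="utterances"):
    """Render non-overlapping intervals as a long-format Praat TextGrid."""
    intervals = sorted(intervals)
    xmax = max(offset for _, offset, _ in intervals)
    # An interval tier has to cover the whole time range,
    # so gaps between utterances are padded with empty intervals.
    filled, cursor = [], 0.0
    for onset, offset, text in intervals:
        if onset > cursor:
            filled.append((cursor, onset, ""))
        filled.append((onset, offset, text))
        cursor = offset
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(filled)}",
    ]
    for i, (onset, offset, text) in enumerate(filled, start=1):
        escaped = text.replace('"', '""')  # Praat doubles quotes inside strings
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {onset}",
            f"            xmax = {offset}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


def mfa_align_command(corpus_dir, dictionary, acoustic_model, output_dir):
    """Build the argument list for MFA alignment (recent `mfa align` form)."""
    return ["mfa", "align", corpus_dir, dictionary, acoustic_model, output_dir]


example_csv = (
    "segment_onset,segment_offset,transcription\n"
    "500,1500,hello there\n"
    "2000,3250,bye\n"
)
intervals = read_intervals(example_csv)
textgrid = intervals_to_textgrid(intervals)
command = mfa_align_command(
    "corpus",  # hypothetical folder holding audio + .TextGrid pairs
    "inputs/mfa-models/dictionary",
    "inputs/mfa-models/acoustic",
    "aligned",  # hypothetical output folder
)
```

The command list can be passed to `subprocess.run(command, check=True)` once the audio files sit next to their .TextGrid files in the corpus folder.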

Steps for comparison of aligned segments with human annotator

Use the child-project sampler to generate five 1-minute segments (high-volubility) and output them in outputs/.

Use the child-project eaf-builder with the files generated in the previous step and the templates in inputs/eaf_templates.

Annotate the segments by hand in ELAN.

Create a .csv dataframe with each segment in outputs/fivesegments-eaf.

Import that .csv with child-project import-annotations
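The dataframe from the fourth step could be built along these lines. This is a sketch under assumptions: the column names follow what the ChildProject documentation describes for annotation imports as far as I recall and should be checked against the installed version, and the recording name, set name, and .eaf filenames are hypothetical.

```python
import csv
import io

# Assumed index columns for child-project import-annotations; verify them
# against the ChildProject documentation before use.
INDEX_COLUMNS = [
    "set", "recording_filename", "time_seek",
    "range_onset", "range_offset", "raw_filename", "format",
]


def build_import_index(segments, set_name="eaf"):
    """Build one index row per annotated segment.

    `segments` holds (recording_filename, onset_ms, offset_ms, eaf_path)
    tuples, one per sampled 1-minute segment.
    """
    return [
        {
            "set": set_name,
            "recording_filename": recording,
            "time_seek": 0,
            "range_onset": onset_ms,
            "range_offset": offset_ms,
            "raw_filename": eaf_path,
            "format": "eaf",
        }
        for recording, onset_ms, offset_ms, eaf_path in segments
    ]


def to_csv(rows):
    """Serialize the index rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=INDEX_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical segments: two 1-minute windows of a single recording.
segments = [
    ("recording.wav", 60000, 120000, "recording_60000_120000.eaf"),
    ("recording.wav", 300000, 360000, "recording_300000_360000.eaf"),
]
rows = build_import_index(segments)
index_csv = to_csv(rows)
```

Writing `index_csv` to a file gives the .csv that the import step takes as input.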

datacite.yml

Title

Alignment of Vandam corpus using Montreal Forced Aligner

Authors

FREBOURG,Martin;McGill University
Gautheron,Lucas;École Normale Supérieure - PSL
Cristia,Alejandrina;École Normale Supérieure - PSL

Description

YODA repo to align vandam corpus using Montreal Forced Aligner (MFA).

License

Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)

References

HomeBank VanDam Public Daylong Corpus - Mark VanDam [https://doi.org/10.21415/t5qh5n] (Dataset)
McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, Michael Wagner, and Morgan Sonderegger (2017). Montreal Forced Aligner [Computer program]. Version 0.9.0, retrieved 17 January 2017 from http://montrealcorpustools.github.io/Montreal-Forced-Aligner/. [http://dx.doi.org/10.21437/Interspeech.2017-1386] (Montreal Forced Aligner)

Funding

Keywords

Neuroscience
Linguistics
Vandam
Montreal Forced Aligner
MFA
annotations

Resource Type

Dataset
