CRCNS Dataset: Interference of mid-level sound statistics predicts human speech recognition in natural noise

Alex Clonan d215047b10 Update 'datacite.yml' 7 months ago
Experiment1 62ef3a2334 Delete 'Experiment1/foresounds/Sound0023.wav' 7 months ago
Experiment2 7495d8d5bd gin commit from d127h8.psy.uconn.edu 7 months ago
Experiment3 0a3d88ab65 Delete 'Experiment3/V2_SNR_-18_to_0/.DS_Store' 7 months ago
SupplementalAudio e3ee3800fc Delete 'SupplementalAudio/Audio_2/SNR_Neg9/Jackhammer/.DS_Store' 7 months ago
LICENSE 405f0521c9 Update 'LICENSE' 8 months ago
README.md 4fd14fa518 Update 'README.md' 7 months ago
datacite.yml d215047b10 Update 'datacite.yml' 7 months ago

README.md

Digits-in-Noise Perceptual Dataset

Experimental Data: Interference of mid-level sound statistics predicts human speech recognition in natural noise

DOI: https://doi.org/10.1101/2024.02.13.579526

Experiment 1: Original (OR), Phase Randomized (PR), Spectrum Equalized (SE) Data

  • 825 Foreground Sounds
  • 825 Background Sounds
  • BehavioralDataExp1.mat
    • BackSoundNum: (1-11) as shown in paper (1x825 trials)
    • BackCondition: 'OR' 'PR' 'SE' (1x825 trials)
    • SNR: Signal-to-Noise-Ratio Value Between Fore and Back (1x825 trials)
    • Resp: Behavioral Response (18 participants x 825 trials) - 0 (incorrect response), 1 (correct response)
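
As a rough illustration of how these fields fit together, the sketch below pools `Resp` over participants and reports the proportion correct for each `BackCondition` label. It is a minimal example, not the authors' analysis code. The file path, the helper names, and the use of SciPy's `loadmat` are assumptions; how `BackCondition` decodes (cell array versus character array) depends on how the file was saved, and `loadmat` does not read MATLAB v7.3 files.

```python
from collections import defaultdict


def accuracy_by_condition(resp, conditions):
    """Proportion correct per condition label, pooled over participants.

    resp: one row per participant, each row holding a 0/1 value per trial.
    conditions: one label per trial (for example 'OR', 'PR', 'SE').
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in resp:
        if len(row) != len(conditions):
            raise ValueError("each participant row needs one response per trial")
        for label, r in zip(conditions, row):
            correct[label] += int(r)
            total[label] += 1
    return {label: correct[label] / total[label] for label in total}


def load_exp1(path="Experiment1/BehavioralDataExp1.mat"):
    """Load the behavioral file with SciPy (assumed installed).

    The default path is a guess at where the file sits in the repository.
    """
    from scipy.io import loadmat  # imported here so the helper above needs no SciPy

    return loadmat(path, squeeze_me=True)


if __name__ == "__main__":
    # Toy data: 2 participants x 4 trials, standing in for the 18 x 825 matrix.
    toy_resp = [[1, 0, 1, 1], [0, 0, 1, 1]]
    toy_cond = ["OR", "PR", "OR", "SE"]
    print(accuracy_by_condition(toy_resp, toy_cond))
```

With the real file, something like `d = load_exp1()` followed by `accuracy_by_condition(d["Resp"], list(d["BackCondition"]))` would be the intended use, provided the labels load as plain strings.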

Experiment 2: Gradually Added Texture Statistics in Babble8 and Jackhammer (Behavioral Response Only)

  • BehavioralDataExp2_Babble8.mat

    • BackSoundNum: (1-11) as shown in paper (1x500 trials)
    • SNR: Signal-to-Noise-Ratio Value Between Fore and Back (1x500 trials)
    • TextureStat: {SPEC,MAR,MOD,CORR,ORIG} representing gradually added texture statistics (McDermott 2011)
    • Resp: Behavioral Response (10 participants x 500 trials) - 0 (incorrect response), 1 (correct response)
  • BehavioralDataExp2_Jackhammer.mat

    • BackSoundNum: (1-11) as shown in paper (1x500 trials)
    • SNR: Signal-to-Noise-Ratio Value Between Fore and Back (1x500 trials)
    • TextureStat: {SPEC,MAR,MOD,CORR,ORIG} representing gradually added texture statistics (McDermott 2011)
    • Resp: Behavioral Response (6 participants x 500 trials) - 0 (incorrect response), 1 (correct response)
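
Because the two Experiment 2 files have different numbers of participants (10 and 6), it can be convenient to keep accuracies per participant rather than pooling them. The following sketch computes, for each `TextureStat` level in the order listed above, one proportion-correct value per participant. The function name and the toy data are illustrative assumptions; the inputs are expected as plain Python sequences already read from the `.mat` file.

```python
from statistics import mean

# Order in which statistics are gradually added, as listed in the README.
STAT_ORDER = ["SPEC", "MAR", "MOD", "CORR", "ORIG"]


def per_participant_accuracy(resp, texture_stat):
    """Map each texture-statistic level to a list of per-participant accuracies.

    resp: one row per participant, each row holding a 0/1 value per trial.
    texture_stat: one label per trial, drawn from STAT_ORDER.
    """
    result = {}
    for stat in STAT_ORDER:
        trial_idx = [i for i, s in enumerate(texture_stat) if s == stat]
        if not trial_idx:
            continue  # level not present in this file
        result[stat] = [mean(row[i] for i in trial_idx) for row in resp]
    return result


if __name__ == "__main__":
    toy_resp = [[1, 0, 1, 1], [0, 0, 1, 0]]
    toy_stat = ["SPEC", "SPEC", "ORIG", "ORIG"]
    for stat, accs in per_participant_accuracy(toy_resp, toy_stat).items():
        print(stat, accs)
```

Keeping one value per participant makes it straightforward to compute a mean and spread separately for the Babble8 and Jackhammer groups.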

Experiment 3: Varying SNR Across OR Conditions

  • V1_SNR_-15_-3

    • 1375 Foresounds
    • 1375 Backsounds
    • BehavioralDataExp3_v1.mat
      • BackSoundNum: (1-11) as shown in paper (1x1375 trials)
      • SNR: Signal-to-Noise-Ratio Value Between Fore and Back (1x1375 trials)
      • Resp: Behavioral Response (5 participants x 1375 trials) - 0 (incorrect response), 1 (correct response)
  • V2_SNR_-18_to_0

    • 1925 Foresounds
    • 1925 Backsounds
    • BehavioralDataExp3_v2.mat
      • BackSoundNum: (1-11) as shown in paper (1x1925 trials)
      • SNR: Signal-to-Noise-Ratio Value Between Fore and Back (1x1925 trials)
      • Resp: Behavioral Response (4 participants x 1925 trials) - 0 (incorrect response), 1 (correct response)
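
Since Experiment 3 varies SNR, a natural first summary is accuracy as a function of SNR for a single background. The sketch below is one hedged way to extract those points from `Resp`, `SNR`, and `BackSoundNum`; the function name and toy values are assumptions, and responses are pooled over participants for simplicity.

```python
def psychometric_points(resp, snr, back_sound_num, background):
    """Return (SNR, proportion correct) pairs for one background, sorted by SNR.

    resp: one row per participant, each row holding a 0/1 value per trial.
    snr: SNR value per trial.
    back_sound_num: background number per trial (1-11).
    background: the background number to keep.
    """
    correct = {}
    total = {}
    for row in resp:
        for r, level, back in zip(row, snr, back_sound_num):
            if back != background:
                continue
            correct[level] = correct.get(level, 0) + int(r)
            total[level] = total.get(level, 0) + 1
    return [(level, correct[level] / total[level]) for level in sorted(total)]


if __name__ == "__main__":
    # Toy data: 2 participants x 4 trials.
    toy_resp = [[0, 1, 1, 0], [0, 1, 1, 1]]
    toy_snr = [-15, -3, -3, -15]
    toy_back = [1, 1, 2, 1]
    print(psychometric_points(toy_resp, toy_snr, toy_back, background=1))
```

The resulting pairs can be plotted directly or passed to a curve-fitting routine of the reader's choice.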

Supplemental Audio

  • Audio 1
    • Example audio excerpts for each of the 33 background conditions in Experiment 1. All of the three-digit sequences are delivered at an SNR of -9 dB. Exemplars for each of the 11 background sounds are organized in a directory corresponding to the experimental manipulation performed (OR, PR, SE). Sounds are numbered from 1 to 11 in rank order of perceptual accuracy, as shown in the paper (Sound01.wav, Sound02.wav, etc.).
  • Audio 2
    • Example audio excerpts for each of the background conditions for Experiment 2. The background conditions include the babble-8 and jackhammer backgrounds delivered at variable SNR and variable statistics (Spec, +Mar, +MPS, +Corr, and Orig). Sounds are organized in subdirectories corresponding to the SNR condition (-12, -9, -6, and -3 dB) and background name (EightSpeakerBabble or Jackhammer) and are labeled according to the summary statistics used to synthesize the background (SoundSpec.wav, SoundMar.wav, SoundMPS.wav, SoundCorr.wav, SoundOrig.wav).
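
The Audio 2 layout described above can be sketched as a small path builder. The file names and background directory names come from the description; the `SNR_Neg9`-style directory naming and the `SupplementalAudio/Audio_2` root are inferred from a single path in the repository's commit log, so treat them as assumptions and check them against the actual tree.

```python
from pathlib import PurePosixPath

# Values taken from the Audio 2 description.
STATS = ["Spec", "Mar", "MPS", "Corr", "Orig"]
SNRS_DB = [-12, -9, -6, -3]
BACKGROUNDS = ["EightSpeakerBabble", "Jackhammer"]


def audio2_path(snr_db, background, stat, root="SupplementalAudio/Audio_2"):
    """Build the expected path of one Audio 2 excerpt.

    The 'SNR_Neg<N>' directory pattern and the default root are assumptions.
    """
    if snr_db not in SNRS_DB:
        raise ValueError(f"unexpected SNR: {snr_db}")
    if background not in BACKGROUNDS:
        raise ValueError(f"unexpected background: {background}")
    if stat not in STATS:
        raise ValueError(f"unexpected statistic: {stat}")
    return PurePosixPath(root) / f"SNR_Neg{abs(snr_db)}" / background / f"Sound{stat}.wav"


def all_audio2_paths(root="SupplementalAudio/Audio_2"):
    """Enumerate every SNR x background x statistic combination."""
    return [
        audio2_path(snr, back, stat, root)
        for snr in SNRS_DB
        for back in BACKGROUNDS
        for stat in STATS
    ]


if __name__ == "__main__":
    print(audio2_path(-9, "Jackhammer", "Orig"))
```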
datacite.yml
Title Low-dimensional interference of mid-level sound statistics predicts human speech recognition in natural environmental noise
Authors Clonan,Alex;University of Connecticut;ORCID:0009-0007-1460-6483
Zhai,Xiu;Wentworth Institute of Technology;ORCID:0000-0003-0341-7816
Stevenson,Ian;University of Connecticut;ORCID:0000-0002-1428-5946
Escabi,Monty;University of Connecticut;ORCID:0000-0001-7271-1061
Description This is a supporting dataset for the manuscript "Low-dimensional interference of mid-level sound statistics predicts human speech recognition in natural environmental noise". The dataset comprises three psychoacoustic experiments that investigate human speech recognition in differing natural environments. In the first experiment, participants (n=18) recognize spoken digit triplets in the presence of 11 natural backgrounds and acoustically perturbed variants that whiten the modulation content (Phase Randomized, PR) or the spectrum content (Spectrum Equalized, SE) of the sound. In the second experiment, participants (n=16) recognize spoken digit triplets in the presence of the Jackhammer or the 8-Speaker Babble background, perturbed by gradually adding texture statistics (McDermott 2011). In the third experiment, participants (n=9) recognize spoken digit triplets in the presence of 11 natural backgrounds at 7 different signal-to-noise ratios. The data can be used to replicate the psychoacoustic results presented in the paper and serve as the input for the logistic regression model used in subsequent analysis. The repository contains audio files (.wav format) and behavioral data (MATLAB .mat format).
License Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (https://creativecommons.org/licenses/by-nc-sa/4.0/)
References Alex Clonan, Xiu Zhai, Ian Stevenson, Monty Escabi, Low-dimensional interference of mid-level sound statistics predicts human speech recognition in natural environmental noise [https://doi.org/10.1101/2024.02.13.579526] (isSupplementTo)
Funding NIDCD, DC020097
Keywords Neuroscience
Speech
Perception
Natural Noise
Auditory
Resource Type Dataset