This repository contains the dataset described in:
L. Wang-Nöth, P. Heiler, H. Huang, D. Lichtenstern, A. Reichenbach, L. Flacke, L. Maisch, H. Mayer, "How Much Data is Enough? Optimization of Data Collection for Artifact Detection in EEG Recordings", preprint, 2024.
[DOI coming soon].

This dataset is licensed under the Creative Commons Attribution 4.0 (CC-BY 4.0). Please cite the original article mentioned above.

README.md

BREAD (brainboost EEG Artifact Detection)

The dataset BREAD (BRainboost Eeg Artifact Detection), used for EMG artifact detection in EEG recordings, contains both artifact-contaminated signals and resting-state eyes-open (EO) signals. It includes 932 npy files of EEG recordings from seven subjects, consisting of 664 artifact-containing epochs and 268 EO epochs.

Each subject is identified by a subjectID from 5 to 11 (subjects 1 to 4 are not included in this dataset). Each subject performed seven isometric contraction artifact tasks, each lasting 5 seconds and repeated 10 times, and five continuous movement tasks, each lasting 10 seconds and repeated 5 times. This yields 95 artifact-containing epochs per subject, except for subject 7, who has one fewer repetition of the "kh_a" artifact task.
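The epoch counts quoted in this README can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative and not part of the dataset.

```python
# Task design as stated in this README.
n_subjects = 7
isometric_tasks, isometric_reps = 7, 10    # 5-second epochs
continuous_tasks, continuous_reps = 5, 5   # 10-second epochs

epochs_per_subject = (isometric_tasks * isometric_reps
                      + continuous_tasks * continuous_reps)

missing_epochs = 1   # subject 7 has one fewer "kh_a" repetition
artifact_epochs = n_subjects * epochs_per_subject - missing_epochs

eo_epochs = 268      # EO epoch count stated in the dataset description
total_files = artifact_epochs + eo_epochs

print(epochs_per_subject, artifact_epochs, total_files)  # 95 664 932
```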

Each subject also provided EO recordings, which were segmented into alternating, non-overlapping 10-second and 5-second epochs. On average, each subject contributed 38 ± 7 EO epochs.
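The alternating segmentation can be illustrated as follows. This is a minimal sketch and not the authors' code: it assumes that the sequence starts with a 10-second epoch and that a trailing remainder too short for the next epoch is dropped. Neither detail is stated in this README. The function name `segment_alternating` and the synthetic recording are hypothetical.

```python
import numpy as np


def segment_alternating(recording, sfreq=2048.0, durations_s=(10.0, 5.0)):
    """Split a (channels, samples) array into consecutive, non-overlapping
    epochs whose lengths cycle through durations_s.

    Assumption: a remainder shorter than the next epoch is discarded.
    """
    epochs = []
    start = 0
    index = 0
    n_samples = recording.shape[1]
    while True:
        length = int(round(durations_s[index % len(durations_s)] * sfreq))
        stop = start + length
        if stop > n_samples:
            break
        epochs.append(recording[:, start:stop])
        start = stop
        index += 1
    return epochs


# Synthetic 4-channel, 32-second recording (placeholder data, not from BREAD).
demo = np.zeros((4, int(32 * 2048)))
segments = segment_alternating(demo)
print([seg.shape[1] / 2048 for seg in segments])  # [10.0, 5.0, 10.0, 5.0]
```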

Epochs were extracted from the original EDF files for each subject. All subjects, except subject 5, had one EDF file containing all the necessary epochs. For subject 5, the epochs were spread across two EDF files. Each npy file represents a single epoch. The original EDF files are not published and hence not available.

Npy File Naming Schema

[subjectID]_[edfID]_eeg_[epochID]_[epochNumber]
[subjectID]: Ranges from 5 to 11.
[edfID]: 1 or 2 for subject 5, 1 for all other subjects.
[epochID]: see epochID_mapping.txt
[epochNumber]: Ranges from 0 to 4 for continuous movements and from 0 to 9 for isometric contractions. For EO, it is formatted as [EO recording number]-[segment number].

File Name Examples

5_1_eeg_EO_0-2.npy: This file contains segment 2 of EO recording 0, extracted from the 1st EDF file of subject 5.
5_2_eeg_kb_db_0.npy: This file contains epoch 0 of the "kb_db" artifact, extracted from the 2nd EDF file of subject 5.
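Because an epochID may itself contain underscores (for example "kb_db"), splitting a file name naively on "_" is error-prone. The following sketch parses the schema with a regular expression. The function name `parse_epoch_filename` and the dictionary keys are illustrative choices, and the pattern assumes that the epoch number is always the final underscore-separated field, as in the examples above.

```python
import re
from pathlib import Path

# subjectID _ edfID _ "eeg" _ epochID _ epochNumber
# epochNumber is either "N" (artifact tasks) or "R-S" (EO recording-segment).
_PATTERN = re.compile(
    r"^(?P<subject>\d+)_(?P<edf>\d+)_eeg_(?P<epoch_id>.+)_(?P<number>\d+(?:-\d+)?)$"
)


def parse_epoch_filename(name):
    """Return the schema fields of a BREAD file name as a dict."""
    stem = Path(name).stem  # drop the ".npy" extension
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"File name does not follow the schema: {name!r}")

    info = {
        "subject_id": int(match["subject"]),
        "edf_id": int(match["edf"]),
        "epoch_id": match["epoch_id"],
    }
    number = match["number"]
    if "-" in number:
        # EO files: [EO recording number]-[segment number]
        recording, segment = number.split("-")
        info["eo_recording"] = int(recording)
        info["segment"] = int(segment)
    else:
        info["epoch_number"] = int(number)
    return info


print(parse_epoch_filename("5_1_eeg_EO_0-2.npy"))
print(parse_epoch_filename("5_2_eeg_kb_db_0.npy"))
```

The epochID values returned here can then be looked up in epochID_mapping.txt.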

Additional Information for Reading the Data

Data shape in npy file: (channels, time samples)
Sampling rate: 2048 Hz
Unit: Volts
Channel names: see ch_names.txt
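A minimal sketch of reading one epoch with NumPy, based on the properties listed above (channels on the first axis, 2048 Hz, values in volts). The helper `load_epoch` is illustrative and not part of the dataset. The demo writes a synthetic array to a temporary file so that the example runs without the real data; the array contents and the three-channel shape are placeholders.

```python
import tempfile
from pathlib import Path

import numpy as np

SFREQ_HZ = 2048.0  # sampling rate stated in this README


def load_epoch(path, sfreq=SFREQ_HZ):
    """Load one epoch and return (data in microvolts, time axis in seconds)."""
    data = np.load(path)  # expected shape: (channels, time samples)
    if data.ndim != 2:
        raise ValueError(f"Expected a 2-D array, got shape {data.shape}")
    n_samples = data.shape[1]
    times_s = np.arange(n_samples) / sfreq
    data_uv = data * 1e6  # volts -> microvolts
    return data_uv, times_s


# Demo with synthetic data standing in for a real 5-second epoch.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "5_2_eeg_kb_db_0.npy"
    np.save(demo_path, np.full((3, int(5 * SFREQ_HZ)), 2e-5))
    data_uv, times_s = load_epoch(demo_path)

print(data_uv.shape)                  # (3, 10240)
print(data_uv.shape[1] / SFREQ_HZ)    # 5.0 (epoch duration in seconds)
```

For a real file, each row of the array would correspond to one entry of the channel list; the layout of that text file is not described here, so it is not parsed in this sketch.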

Citation and Further Details

This dataset is licensed under the Creative Commons Attribution 4.0 (CC-BY 4.0). Please cite the original article:
L. Wang-Nöth, P. Heiler, H. Huang, D. Lichtenstern, A. Reichenbach, L. Flacke, L. Maisch, H. Mayer, "How Much Data is Enough? Optimization of Data Collection for Artifact Detection in EEG Recordings", preprint, 2024.
[DOI coming soon]

For further details, refer to the accompanying publication.

datacite.yml
Title: How Much Data is Enough? Optimization of Data Collection for Artifact Detection in EEG Recordings
Authors:
  Wang-Nöth, Lu; brainboost GmbH, Augsburgerstraße 4, 80337 Munich, Germany; Institute for Applied Computer Science, Bundeswehr University Munich, Werner-Heisenberg-Weg 39, 85579 Neubiberg, Germany; ORCID: 0009-0002-7443-121X
  Heiler, Philipp; brainboost GmbH, Augsburgerstraße 4, 80337 Munich, Germany
  Huang, Hai; Institute for Applied Computer Science, Bundeswehr University Munich, Werner-Heisenberg-Weg 39, 85579 Neubiberg, Germany; ORCID: 0000-0001-8745-8142
  Lichtenstern, Daniel; brainboost GmbH, Augsburgerstraße 4, 80337 Munich, Germany
  Reichenbach, Alexandra; Center for Machine Learning, Heilbronn University, Max-Planck-Str. 39, 74081 Heilbronn, Germany; ORCID: 0000-0003-4199-3005
  Flacke, Luis; brainboost GmbH, Augsburgerstraße 4, 80337 Munich, Germany
  Maisch, Linus; brainboost GmbH, Augsburgerstraße 4, 80337 Munich, Germany
  Mayer, Helmut; Institute for Applied Computer Science, Bundeswehr University Munich, Werner-Heisenberg-Weg 39, 85579 Neubiberg, Germany; ORCID: 0000-0002-9439-2695
Description: This EEG dataset, used for EMG artifact detection in EEG recordings, contains both artifact-contaminated signals and resting-state eyes-open (EO) signals. It includes 932 npy files of EEG recordings from seven subjects, consisting of 664 artifact-containing epochs and 268 EO epochs.
License: Creative Commons Attribution 4.0 (https://creativecommons.org/licenses/by/4.0/deed.en)
References: Lu Wang-Nöth, Philipp Heiler, Hai Huang, Daniel Lichtenstern, Alexandra Reichenbach, Luis Flacke, Linus Maisch, Helmut Mayer: How Much Data is Enough? Optimization of Data Collection for Artifact Detection in EEG Recordings. Preprint. [] (IsSupplementTo)
Funding: Federal Ministry for Economic Affairs and Climate Action of Germany, ZIM KK5211501BM0
Keywords: Neuroscience, EEG, EMG, Artifact Detection, Data Cleaning, Data Collection Optimization
Resource Type: Dataset