# BREAD (BRainboost EEG Artifact Detection)
BREAD (BRainboost EEG Artifact Detection) is a dataset for detecting EMG artifacts in EEG recordings. It contains both artifact-contaminated signals and resting-state eyes-open (EO) signals: 932 npy files of EEG recordings from seven subjects, comprising 664 artifact-containing epochs and 268 EO epochs.
Each subject is identified by a subjectID ranging from 5 to 11 (subjects 1 to 4 are not included in this dataset). Each subject performed seven isometric contraction artifact tasks, each lasting 5 seconds and repeated 10 times, and five continuous movement tasks, each lasting 10 seconds and repeated 5 times. This yields 95 artifact-containing epochs per subject (7 × 10 + 5 × 5), except for subject 7, who has one fewer repetition of the "kh_a" artifact task.
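
The epoch totals follow directly from the task design. The short check below recomputes them from the numbers given in this README; the variable names are illustrative only.

```python
# Recompute the epoch counts from the task design described in this README.
N_SUBJECTS = 7                            # subjectIDs 5 to 11
ISOMETRIC_TASKS, ISOMETRIC_REPS = 7, 10   # 5-second isometric contractions
CONTINUOUS_TASKS, CONTINUOUS_REPS = 5, 5  # 10-second continuous movements

per_subject = ISOMETRIC_TASKS * ISOMETRIC_REPS + CONTINUOUS_TASKS * CONTINUOUS_REPS
artifact_total = N_SUBJECTS * per_subject - 1  # subject 7 lacks one "kh_a" repetition
eo_total = 268                                 # EO epochs, as stated in this README

print(per_subject, artifact_total, artifact_total + eo_total)  # 95 664 932
```
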
Each subject also provided EO recordings, which were segmented into alternating, non-overlapping 10-second and 5-second epochs. On average, each subject contributed 38 ± 7 EO epochs.
Epochs were extracted from the original EDF files of each subject. All subjects except subject 5 have a single EDF file containing all epochs; for subject 5, the epochs are spread across two EDF files. Each npy file holds a single epoch. The original EDF files are not published and are therefore not available.
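
As a rough illustration of alternating, non-overlapping segmentation, the sketch below computes segment boundaries in samples. It is not the authors' extraction code: the function name, the choice to start with a 10-second segment, and the dropping of a trailing remainder are all assumptions made for this example.

```python
SAMPLING_RATE_HZ = 2048  # sampling rate stated in this README


def alternating_segments(n_samples, durations_s=(10, 5), fs=SAMPLING_RATE_HZ):
    """Return (start, stop) sample indices of back-to-back segments whose
    lengths cycle through durations_s.

    Assumption: a trailing remainder too short for the next segment is dropped.
    """
    segments = []
    start = 0
    index = 0
    while True:
        length = durations_s[index % len(durations_s)] * fs
        if start + length > n_samples:
            break
        segments.append((start, start + length))
        start += length
        index += 1
    return segments


# Hypothetical 32-second recording: four segments fit, the last 2 s are dropped.
segments = alternating_segments(32 * SAMPLING_RATE_HZ)
print([(stop - start) // SAMPLING_RATE_HZ for start, stop in segments])
```
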
### Npy File Naming Schema
\[subjectID]\_[edfID]\_eeg\_[epochID]\_[epochNumber].npy
\[subjectID]: Ranges from 5 to 11.
\[edfID]: 1 or 2 for subject 5, 1 for all other subjects.
\[epochID]: see epochID_mapping.txt
\[epochNumber]: Ranges from 0 to 4 for continuous movements and from 0 to 9 for isometric contractions. For EO, it is formatted as [EO recording number]-[segment number].
### File Name Examples
5_1_eeg_EO_0-2.npy: Segment number 2 of EO recording 0, taken from the 1st EDF file of subject 5.
5_2_eeg_kb_db_0.npy: Epoch number 0 of the "kb_db" artifact, taken from the 2nd EDF file of subject 5.
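
The naming schema can be split into its fields with a regular expression. The following is a minimal sketch, not part of the dataset: `EpochName` and `parse_epoch_name` are hypothetical helpers. Because an epochID may itself contain underscores (for example "kb_db"), the pattern takes the last underscore-separated token as the epochNumber, which assumes no epochID ends in an underscore followed by digits.

```python
import re
from typing import NamedTuple


class EpochName(NamedTuple):
    subject_id: int
    edf_id: int
    epoch_id: str
    epoch_number: str  # kept as text because EO uses "recording-segment"


# The greedy (.+) lets epochIDs contain underscores; the final token is the
# epoch number, optionally in the EO form "recording-segment".
_PATTERN = re.compile(r"^(\d+)_(\d+)_eeg_(.+)_(\d+(?:-\d+)?)\.npy$")


def parse_epoch_name(file_name: str) -> EpochName:
    """Split a BREAD file name into its schema fields."""
    match = _PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"File name does not follow the schema: {file_name!r}")
    subject_id, edf_id, epoch_id, epoch_number = match.groups()
    return EpochName(int(subject_id), int(edf_id), epoch_id, epoch_number)


print(parse_epoch_name("5_1_eeg_EO_0-2.npy"))
print(parse_epoch_name("5_2_eeg_kb_db_0.npy"))
```
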
### Additional Information for Reading the Data
Data shape in each npy file: channels × time samples
Sampling rate: 2048 Hz
Unit: Volts
Channel names: see ch_name.txt
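
A minimal loading sketch with NumPy, based on the shape, sampling rate, and unit listed above. `load_epoch` is a hypothetical helper, and the conversion to microvolts is only a convenience for plotting. The demo writes a synthetic array with an arbitrary channel count to a temporary file so the snippet runs without the dataset; replace the path with a real BREAD file in practice.

```python
import os
import tempfile

import numpy as np

SAMPLING_RATE_HZ = 2048  # sampling rate stated in this README


def load_epoch(path):
    """Load one epoch and return (data in microvolts, time axis in seconds)."""
    data_volts = np.load(path)  # shape: (n_channels, n_samples), unit: volts
    n_samples = data_volts.shape[1]
    times_s = np.arange(n_samples) / SAMPLING_RATE_HZ
    return data_volts * 1e6, times_s


# Demo with synthetic data: 4 channels (arbitrary), 5 seconds of zeros.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_epoch.npy")
    np.save(demo_path, np.zeros((4, 5 * SAMPLING_RATE_HZ)))
    data_uv, times_s = load_epoch(demo_path)

duration_s = data_uv.shape[1] / SAMPLING_RATE_HZ
print(data_uv.shape, duration_s)
```
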
### Citation and Further Details
This dataset is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). Please cite the original article:
L. Wang-Nöth, P. Heiler, H. Huang, D. Lichtenstern, A. Reichenbach, L. Flacke, L. Maisch, H. Mayer, "How Much Data is Enough? Optimization of Data Collection for Artifact Detection in EEG Recordings", preprint, 2024.
[DOI coming soon]
For further details, refer to the accompanying publication.