Motor evoked potentials for multiple sclerosis: A multiyear follow-up dataset.



Introduction

Multiple sclerosis (MS) is a chronic disease affecting millions of people worldwide. In MS patients, signal conduction through the central nervous system deteriorates. Evoked potential measurements allow clinicians to monitor the degree of deterioration and are used for decision support. We share a dataset of motor evoked potential (MEP) measurements, in which the brain is stimulated and the resulting signal is measured in the hands and feet. Each measurement yields a time series 100 milliseconds long. Typically, both hands and feet are measured in one hospital visit. The dataset covers 5586 visits of 963 patients, performed in day-to-day clinical care over a period of 6 years, and contains approximately 100,000 MEPs. Clinical metadata such as the Expanded Disability Status Scale (EDSS), sex, and age are also available. This dataset can be used to explore the role of evoked potentials in MS research and patient care. It may also be used as a real-world benchmark for machine learning techniques for time series analysis and predictive modelling.

Usage

Downloading the dataset

There are a few ways to download the dataset (data/mep_dataset.zip). Because the file is fairly small (~300 MB), it can simply be downloaded through the web interface, or from the command line:

wget https://gin.g-node.org/JanYperman/motor_evoked_potentials/raw/master/data/mep_dataset.zip

Alternatively, you may clone the repository to your local machine, which will also include the dataset:

git clone https://gin.g-node.org/JanYperman/motor_evoked_potentials.git

For more ways of accessing the data, please refer to GIN's FAQ.
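For those who prefer to stay in Python, the download and unzip steps can be sketched with the standard library alone. This is a minimal sketch, not part of the repository: the function names `download_dataset` and `extract_dataset` are hypothetical, and the URL is the one from the wget command above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL copied from the wget command above.
DATASET_URL = (
    "https://gin.g-node.org/JanYperman/motor_evoked_potentials"
    "/raw/master/data/mep_dataset.zip"
)


def download_dataset(zip_path, url=DATASET_URL):
    """Download the archive to zip_path, unless a file is already there."""
    zip_path = Path(zip_path)
    if not zip_path.exists():
        zip_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(zip_path))
    return zip_path


def extract_dataset(zip_path, target_dir):
    """Unpack the archive into target_dir and return the archive member names."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

Calling `extract_dataset(download_dataset("data/mep_dataset.zip"), "data")` would fetch the archive (about 300 MB) once and unpack it next to the zip file; a second call skips the download because the file already exists.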

Structure

The dataset itself is stored in data/mep_dataset.zip. The general structure is as follows:

patient.csv: Contains the records for the various patients.
visit.csv: Contains the records for the various visits.
test.csv: Contains the records for the various tests.
measurement.csv: Contains the records for the various measurements.
edss.csv: Contains the records for the various EDSS measurements.

Besides these files, the dataset also contains a text file for each of the actual time series. Each filename contains a unique identifier that links back to the "timeseries" column in measurement.csv. Python code to automate this linking is included in code/create_df_from_portable_dataset.py.
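To illustrate the idea behind this linking, here is a rough sketch using only the standard library. It is not the code from code/create_df_from_portable_dataset.py, which remains the reference. The function names are hypothetical, and the sketch assumes that measurement.csv is comma-separated with a header row, that it sits at the top of the unzipped directory, and that the identifier appears in the filename delimited by non-alphanumeric characters.

```python
import csv
import re
from pathlib import Path


def read_measurements(csv_path):
    """Read measurement.csv into a list of dicts, one per measurement."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


def find_timeseries_file(data_dir, identifier):
    """Return the first non-CSV file whose name contains the identifier, or None."""
    if not identifier:
        return None
    # Require a non-alphanumeric boundary around the identifier so that,
    # for example, "12" does not match a file named "ts_123.txt".
    # (Assumption: the real filenames delimit the identifier this way.)
    pattern = re.compile(
        r"(?<![0-9A-Za-z])" + re.escape(identifier) + r"(?![0-9A-Za-z])"
    )
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() != ".csv" and pattern.search(path.name):
            return path
    return None


def link_measurements(data_dir, limit=5):
    """Pair the first `limit` measurement rows with their time series files."""
    rows = read_measurements(Path(data_dir) / "measurement.csv")
    return [
        (row, find_timeseries_file(data_dir, row["timeseries"]))
        for row in rows[:limit]
    ]
```

Each lookup scans the whole directory tree, so this is only suitable for inspecting a handful of rows; linking all measurements at once calls for an index built in a single pass.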

More details about specific fields can be found in the dataset descriptor.
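A quick way to see which fields each table actually contains is to read the header rows straight from the archive, without unzipping it. The sketch below assumes the CSV files are UTF-8 encoded, comma-separated, and start with a header row; `list_table_columns` is a hypothetical helper, not part of the repository.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath

# Table names as listed in the Structure section.
TABLES = ("patient.csv", "visit.csv", "test.csv", "measurement.csv", "edss.csv")


def list_table_columns(zip_path, tables=TABLES):
    """Return {table name: list of column names} for the CSV tables in the archive."""
    columns = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Match on the base name, since the tables may sit in a subfolder.
            name = PurePosixPath(member).name
            if name in tables:
                with archive.open(member) as raw:
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    # The first row is taken to be the header.
                    columns[name] = next(csv.reader(text), [])
    return columns
```

For example, `list_table_columns("data/mep_dataset.zip")` returns a dictionary whose keys are the table names found in the archive, which can then be compared against the dataset descriptor.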

Getting started

Python

We highly recommend looking at the included code/Jupyter notebook to familiarize yourself with the dataset. It includes a sample use case and shows how to work with the dataset.

To run the Jupyter notebook, a few Python packages are required:

Pandas
Numpy
Matplotlib
Scipy
Scikit-learn
Tqdm
Jupyter

For example, with Anaconda this can be done using:

conda create --name mep python=3 pandas numpy matplotlib scipy scikit-learn tqdm jupyter

which creates an environment called "mep" that contains the required packages.
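After activating the environment, a short check can confirm that the packages are importable before opening the notebook. This is a sketch under the assumption that the usual import names apply: scikit-learn is imported as `sklearn`, and Jupyter is detected through its `jupyter_core` dependency. The helper `missing_modules` is hypothetical.

```python
import importlib.util

# Import names for the packages listed above. Note that scikit-learn is
# imported as "sklearn"; "jupyter_core" is installed as a Jupyter dependency.
REQUIRED_MODULES = (
    "pandas",
    "numpy",
    "matplotlib",
    "scipy",
    "sklearn",
    "tqdm",
    "jupyter_core",
)


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

An empty list from `missing_modules()` suggests the environment is ready; any names it returns point to packages that still need to be installed.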

Microsoft Excel

While the recommended starting point for exploring the data is the aforementioned Jupyter notebook, we have also included an Excel file that can serve as a data exploration tool. This file contains all the data except the full time series. The time series can be visualized on demand by clicking the "Visualize time series" button (to the right of the table in the 'EP data' sheet). This creates a new worksheet containing a graph of the currently selected EP measurement.

For this to work you need to:

Allow macro execution: By default, Excel warns that it has disabled macros in the workbook. To use the time series visualization, click the "Enable Content" button to enable macros.
Unzip the time series in place: Extract data/mep_dataset.zip into the 'data' directory of the repository, next to the zip file. The Excel file (which should be in the 'code' directory) looks for the data there by default. To use a different location, modify the 'filename' column in the Excel file.

License

This work, including the provided code, is licensed under a Creative Commons Attribution 4.0 International Public License.

See the LICENSE file for the full license.

datacite.yml
Title	Motor evoked potentials for multiple sclerosis: A multiyear follow-up dataset.
Authors	Yperman, Jan (Hasselt University, Belgium; ORCID 0000-0002-7632-2001); Popescu, Veronica (Hasselt University, Belgium); Van Wijmeersch, Bart (Hasselt University, Belgium); Becker, Thijs (Hasselt University, Belgium); Peeters, Liesbet (Hasselt University, Belgium)
License	CC-BY (https://creativecommons.org/licenses/by/4.0/)
References	Yperman, J., Becker, T., Valkenborg, D., Popescu, V., Hellings, N., Van Wijmeersch, B., & Peeters, L. M. (2020). Machine learning analysis of motor evoked potential time series to predict disability progression in multiple sclerosis. BMC Neurology, 20(1). [doi:10.1186/s12883-020-01672-w] (IsSupplementTo)
	Yperman, J., Becker, T., Valkenborg, D., Hellings, N., Cambron, M., Dive, D., Laureys, G., Popescu, V., Van Wijmeersch, B., & Peeters, L. M. (2020). Deciphering the Morphology of Motor Evoked Potentials. Frontiers in Neuroinformatics, 14. [doi:10.3389/fninf.2020.00028] (IsSupplementTo)
Funding	The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation—Flanders (FWO) and the Flemish Government—department EWI. Research Foundation—Flanders (FWO) for ELIXIR Belgium (I002819N), and Hermesfonds for ELIXIR Belgium, AH.2017.051, IO 17001306
Keywords	Multiple Sclerosis, Prognosis, Time series
Resource Type	Dataset
