Browse Source

업데이트 'README.md'

Hio-Been Han 4 years ago
parent
commit
a6f722356e
1 changed files with 247 additions and 44 deletions
  1. 247 44
      README.md

+ 247 - 44
README.md

@@ -1,6 +1,5 @@
-<br>
 # 1. Dataset information
-<br>
+
 A set of high-density EEG (electroencephalogram) recording obtained from awake, freely-moving mice (*mus musculus*) (n = 6). Details of experimental method are described in the original research article using the same dataset [Hwang et al., 2019, *Brain Structure and Function*].
 
 * Title: High-density EEG recording in mice for auditory steady-state response with optogenetic stimulation in the basal forebrain
@@ -8,12 +7,10 @@ A set of high-density EEG (electroencephalogram) recording obtained from awake,
 * Version: 1.0.0
 * Related publication: [Hwang et al., 2019, *Brain Structure and Function*](https://link.springer.com/article/10.1007/s00429-019-01845-5). 
 
-**Step-by-step tutorial is included, fully functioning with _Google Colaboratory_ environment. [Open in COLAB [*Data_Description_gin-Mouse_hdEEG_ASSR_Hwang_et_al.ipynb*]](https://colab.research.google.com/drive/1LRmR0fxqWFvwxaNIJxMJkMQVblcuTazk)**
+**Step-by-step tutorial is included, fully functioning with _Google Colaboratory_ environment. [Open in COLAB [*data_description.ipynb*]](http://colab.research.google.com)**
 
-<br>
-<br>
 # 2. File organization
-<br>
+
 Raw EEG data are saved in EEGLAB dataset format (*.set). Below are the list of files included in the database.
 
 **a) Meta data file (1 csv file)**
@@ -33,21 +30,19 @@ Raw EEG data are saved in EEGLAB dataset format (*.set). Below are the list of f
     [rawdata/epochs_animal5.set, rawdata/epochs_animal5.fdt]
     [rawdata/epochs_animal6.set, rawdata/epochs_animal6.fdt]
     
-**d) Example python/matlab scripts**
+**d) Example python scripts**
 
-    [data_description.ipynb, data_description.py (written and tested on Python 3 environment)
+    [Data_Description_gin-Mouse_hdEEG_ASSR_Hwang_et_al.ipynb]
+     * written and tested on Google COLAB - Python 3 environment
     
-    
-<br>
-<br>
+
 # 3. How to get started (Python 3 without _gin_)
 As the data are saved in EEGLAB format, you need to install appropriate module to access the data in Python3 environment. The fastest way would be to use <code>read_epochs_eeglab()</code> function in *MNE-python* module. You can download the toolbox from the link below (or use <code>pip install mne</code> in terminal shell).
 
+
 *[MNE-python]* https://martinos.org/mne/stable/index.html
 
-<br>
 ## Part 1. Accessing dataset
-<br>
 
 ### 1-1. Download dataset and MNE-python module
 
@@ -60,12 +55,13 @@ To download dataset and install MNE-python module into your environment (local m
 > Note: Through this step-by-step demonstration, we will use data from one animal (#Animal 2). Unnecessary data files will not be downloaded to prevent long download time. To download whole dataset, change this part; <code>dataset_to_download = [2]</code> into <code>dataset_to_download = [1,2,3,4,5,6]</code>.
 
 
+
+
 ```python
 # Demo 1-1. Setting an enviroment
-dir_origin = '/content/' # <- Change this in local machine
+dir_origin = 'content/' # <- Change this in local machine
 dir_dataset= 'dataset/'
 
-#!rm -rf /content/dataset/
 import requests
 from os import listdir, mkdir, path, system
 def download_dataset( dataset_to_download = range(1,7), dir_dataset = dir_dataset ): 
@@ -115,6 +111,35 @@ print('\n============= End of the list ==================\n\n')
 system('pip install mne');
 ```
 
+    
+    ============= Start Downloading =================
+    
+    ...skipping already existing file [content/dataset/meta.csv]...
+    ...skipping already existing file [content/dataset/montage.csv]...
+    ...skipping already existing file [content/dataset/rawdata/epochs_animal2.set]...
+    ...skipping already existing file [content/dataset/rawdata/epochs_animal2.fdt]...
+    
+    ============= Download finished ==================
+    
+    
+    
+    ==== List of available files in google drive ====
+    
+    ['meta.csv', 'montage.csv', 'rawdata']
+    
+    ============= End of the list ==================
+    
+    
+    
+    ==== List of available files in google drive ====
+    
+    ['epochs_animal2.fdt', 'epochs_animal2.set']
+    
+    ============= End of the list ==================
+    
+    
+
+
 ### 1-2. Accessing meta-data table
 
 File *meta.csv* contains the demographic information of 6 mice. Using <code>read_csv()</code> of *pandas* module, meta-data table can be visualized as follow. 
@@ -128,11 +153,94 @@ meta
 ```
 
 
+
+
+<div>
+<style scoped>
+    .dataframe tbody tr th:only-of-type {
+        vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+        vertical-align: top;
+    }
+
+    .dataframe thead th {
+        text-align: right;
+    }
+</style>
+<table border="1" class="dataframe">
+  <thead>
+    <tr style="text-align: right;">
+      <th></th>
+      <th>subject_name</th>
+      <th>file_name</th>
+      <th>n_trials</th>
+      <th>age_in_week</th>
+      <th>female</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>0</td>
+      <td>animal1</td>
+      <td>epochs_animal1.set</td>
+      <td>589</td>
+      <td>0</td>
+      <td>0</td>
+    </tr>
+    <tr>
+      <td>1</td>
+      <td>animal2</td>
+      <td>epochs_animal2.set</td>
+      <td>557</td>
+      <td>0</td>
+      <td>0</td>
+    </tr>
+    <tr>
+      <td>2</td>
+      <td>animal3</td>
+      <td>epochs_animal3.set</td>
+      <td>349</td>
+      <td>0</td>
+      <td>0</td>
+    </tr>
+    <tr>
+      <td>3</td>
+      <td>animal4</td>
+      <td>epochs_animal4.set</td>
+      <td>493</td>
+      <td>0</td>
+      <td>0</td>
+    </tr>
+    <tr>
+      <td>4</td>
+      <td>animal5</td>
+      <td>epochs_animal5.set</td>
+      <td>956</td>
+      <td>0</td>
+      <td>0</td>
+    </tr>
+    <tr>
+      <td>5</td>
+      <td>animal6</td>
+      <td>epochs_animal6.set</td>
+      <td>959</td>
+      <td>0</td>
+      <td>0</td>
+    </tr>
+  </tbody>
+</table>
+</div>
+
+
+
 ### 1-3. Data loading and dimensionality check
 
 Each _*.fdt_ file is consisted of different number of trials. To load dataset, a function <code>get_eeg_data()</code> is defined below. To maintain original dimensionality order (cf. channel-time-trial in EEGLAB of Matlab), <code>np.moveaxis()</code> was applied. 
 
 
+
 ```python
 # Demo 1-3. Data loading and dimensionality check
 from mne.io import read_epochs_eeglab as loadeeg
@@ -151,14 +259,19 @@ print('File name  :  [%s]'%f_name)
 print('File contains [%d channels, %4d time points, %3d trials]'%(EEG.data.shape))
 ```
 
+    File name  :  [content/dataset/rawdata/epochs_animal2.set]
+    File contains [38 channels, 5200 time points, 557 trials]
+
+
 Note that voltage calibration value (*CAL*) is set to 1e-6 in 0.11.0 version of [eeglab.py](https://github.com/mne-tools/mne-python/blob/master/mne/io/eeglab/eeglab.py])
 
 
-### 1-4. Getting channel coorinates
+###1-4. Getting channel coordinates
 
 As EEG are recorded in 38-electrode array and two electrodes among them were used as ground and reference site, total 36 channel data are avilable. The coordinates information are in the file [data/montage.csv], and can be accessed and visualized by following commands.  
 
 
+
 ```python
 # Demo 1-4. Import montage matrix
 from matplotlib import pyplot as plt; plt.style.use('ggplot')
@@ -206,19 +319,21 @@ def get_boundary():
     -4.7776, -3.9597, -4.8177, -3.7697, -4.8384, -3.5797, -4.8399, -3.3897,-4.8397, -3.1997, -4.8387, -3.0097, -4.8367, -2.8197, -4.8331, -2.6297,
     -4.8277, -2.4397, -4.8200, -2.2497, -4.8095, -2.0597, -4.7960, -1.8697,-4.7789, -1.6797, -4.7579, -1.4897, -4.7326, -1.2997, -4.7025, -1.1097,
     -4.6673, -0.9197, -4.6266, -0.7297, -4.5799, -0.5397, -4.5268, -0.3497,-4.4670, -0.1597, -4.4000, 0.03025]).reshape(-1, 2)
+boundary = get_boundary()
 for p in range(len(boundary)-1): plt.plot(boundary[p:p+2,0],boundary[p:p+2,1], 'k-')
-plt.show();
 ```
 
-<br>
-<br>
+
+![png](figure_output/output_12_0.png)
+
 
 ## Part 2. Plotting Event-Related Potentials
-<br>
+
 ### 2-1. Accessing event info
 
 Event information is saved in struct-type variable EEG.event and you can access it by typing <code>EEG.event</code>. Also, their time trace are avilable in 37-th and 38-th channel of <code>EEG.data</code>. For demonstration purpose, light and sound stimuli of 7 different types of event can be extracted and drawn as follow.
 
+
 ```python
 # Demo 2-1. Event profile (sound/light stimuli)
 condNames = ['In-phase','Out-of-phase','Delayed','Advanced',
@@ -248,12 +363,16 @@ plt.subplots_adjust(wspace=.3, hspace=.8)
 ```
 
 
+![png](figure_output/output_15_0.png)
+
+
 ### 2-2. Visualizing example single-trial trace
 
 If data is successfully loaded, now you're ready! For data visualization, an example function is provided below. 
 
 The function <code>plot_multichan()</code> draws multi-channel time series data, by taking 1D time vector, <code>x</code>, and 2D data matrix, <code>y</code>.
 
+
 ```python
 # Demo 2-2a. Function for multi-channel plotting
 def plot_multichan( x, y, spacing = 3000, figsize = (10,10), ch_names = EEG.ch_names ):
@@ -273,25 +392,27 @@ def plot_multichan( x, y, spacing = 3000, figsize = (10,10), ch_names = EEG.ch_n
   plt.gca().set_facecolor((1,1,1))
   return y_center
 ```
-  
-  
+
 Using <code>plot_multichan()</code> function, example single-trial EEG trace can be visualized as follow.
 
 
-```
+```python
 # Demo 2-2b. Visualization of raw EEG time trace
 trial_index = 0
 y_center = plot_multichan(EEG.times, EEG.data[:,:,trial_index])
-plt.title('Example trial (index=%d) trace'%(1+trial_index));
+plt.title('Example trial (index=%d) trace'%(1+trial_index)); plt.gcf().set_facecolor('w')
 ```
 
 
-Note that channel 1 to 36 contain actual EEG data from 36-channel electrodes (from FP1 to PO8), and channel 37 and 38 contain binary stimulus profile (0: no stimulation, 1: stimulation) of light and sound, respectively. 
+![png](figure_output/output_19_0.png)
 
 
-### 2-3. ERP in time domain
+Note that channel 1 to 36 contain actual EEG data from 36-channel electrode array (from FP1 to PO8), and channel 37 and 38 contain binary stimulus profile (0: no stimulation, 1: stimulation) of light and sound, respectively. 
+
+###2-3. ERP in time domain
+
+Using same function, <code>plot_multichan()</code>, ERP (Event-related potentials) trace can be drawn as follow.
 
-Using <code>plot_multichan()</code>, ERP (Event-related potentials) trace can be drawn as follow.
 
 ```python
 # Demo 2-3. Visualization of ERP time trace
@@ -301,25 +422,30 @@ targetCondition = 5 # <- Try changing this
 trialIdx = np.where((EEG.events[:,2])==targetCondition)[0]
 erp = np.nanmean(EEG.data[:,:,trialIdx],2)
 c = plot_multichan(EEG.times, erp, spacing = 300 )
-plt.title('Condition #%d: %s (n=%d)'%(targetCondition, condNames[targetCondition-1],len(trialIdx)));
+plt.title('Condition #%d: %s (n=%d)'%(targetCondition, condNames[targetCondition-1],len(trialIdx))); plt.gcf().set_facecolor('w')
 ```
 
+
+![png](figure_output/output_22_0.png)
+
+
 ### 2-4. ERP in frequency domain
 
-As the sensory stimuli used in this experiment was 40 Hz click train, one can easily see the 40 Hz evoked (steady-state) response in the time-domain ERP. Through 
+To calculate the amplitude of 40-Hz auditory steady-state response, fast Fourier transform can be applied as follow. 
+
 
 ```python
 # Demo 2-4. Time- and frequency-domain visualization of grand-averaged ERP
 from numpy.fft import fft
 def fft_half(x, Fs=2000): return fft(x)[:int(len(x)/2)]/len(x), np.linspace(0,Fs/2,len(x)/2)
 
-plt.figure(figsize=(14,8))
-
+# Get ERP in frequency domain
 erp = np.nanmean(EEG.data[:,:,np.where((EEG.events[:,2])==2)[0]],2)
 ac_erp = np.mean(erp[:36,],0)
 ac_erp_fft,freq = fft_half(ac_erp)
 
 # Plot time-domain signal
+plt.figure(figsize=(14,8))
 plt.subplot(2,1,1)
 plt.plot( EEG.times, ac_erp, linewidth=.5 )
 plt.xlim((-.8,1.8))
@@ -337,14 +463,19 @@ plt.ylabel('Amplitude (mV/Hz)')
 plt.title('ERP signal in frequency domain')
 plt.gca().set_facecolor((1,1,1))
 
-plt.subplots_adjust(wspace=.3, hspace=.5)
+plt.subplots_adjust(wspace=.3, hspace=.5); plt.gcf().set_facecolor('w')
 ```
 
 
+![png](figure_output/output_24_0.png)
+
+
 ### 2-5. ERP in time-frequency domain
 
 Applying fast Fourier transform with moving temporal window, ERP signal can be drawn in time-frequency domain. To calculate spectrogram, a function <code>get_spectrogram()</code> is defined.
 
+
+
 ```python
 # Demo 2-5. Visualize frequency components in ERP
 def get_spectrogram( data, t=EEG.times, Fs=2000, 
@@ -423,18 +554,22 @@ for targetCondition in conditions:
   plt.xlim([-.5,1.5])
   plt.ylim([5,80])
 
-plt.subplots_adjust(wspace=.3, hspace=.5)
+plt.subplots_adjust(wspace=.3, hspace=.5); plt.gcf().set_facecolor('w')
 ```
-<br>
-<br>
+
+
+![png](figure_output/output_26_0.png)
+
+
 ## Part 3. Drawing topography
-<br>
+
 ### 3-1. Drawing 2D power topography
 
 Using this coordinate, spatial dynamics of ERP can be drawn in 2D plane (i.e., power topography). For visualization purpose, an example function, <code>plot_topo2d( data )</code>, is provided which takes <code>data</code> as 1D data matrix (i.e., 1 x 36 channels) each of which represents the channel power. 
 
 For 2D interpolation, additional *class* of <code>bi_interp2</code> is defined. 
 
+
 ```python
 # Demo 3-1a. Preparation of 2D power topography
 
@@ -534,8 +669,6 @@ class bi_interp2:
     # find the boundary line
     interp_obj = interpolate.PchipInterpolator(xx, yy)
     return interp_obj(xxi)
-
-
   
 """ (2) Function for Topography plot """
 from pandas import read_csv
@@ -579,9 +712,9 @@ def plot_topo2d(data, clim=(-15,25), montage_file='%s%smontage.csv'%(dir_origin,
 
 In typical EEG recordings, large amplitude artifacts coming from few bad channel(s) usually be a problem. To prevent this, researchers have developed various methods of artifact-rejection and bad-channel selection. Here, simple bad channel identification method is implemented using channel correlation. Data from the bad channels identified here will be ignored in topography and replaced by median value hereafter.
 
+
 ```python
 # Demo 3-1b. Identification of bad-channel using correlation
-""" Bad-channel selection (correlation based) """
 from scipy.stats import ttest_1samp as ttest
 
 ga_erp = np.nanmean(EEG.data[:36,:,:],2)
@@ -604,13 +737,53 @@ for chIdx in range(corr_coef.shape[1]):
   print( marker+'Ch=%02d) p = %.3f, R_mean = %.3f, R_std = %.3f'%(
       chIdx+1, pval, np.mean(r_data), np.std(r_data)))
 print('\nLow-correlated (bad) channels: %s'%(bad_channels))
+
 ```
 
+      Ch=01) p = 0.000, R_mean = 0.561, R_std = 0.335
+      Ch=02) p = 0.000, R_mean = 0.537, R_std = 0.340
+      Ch=03) p = 0.000, R_mean = 0.679, R_std = 0.301
+      Ch=04) p = 0.000, R_mean = 0.657, R_std = 0.324
+      Ch=05) p = 0.000, R_mean = 0.651, R_std = 0.321
+      Ch=06) p = 0.000, R_mean = 0.618, R_std = 0.338
+      Ch=07) p = 0.000, R_mean = 0.742, R_std = 0.268
+      Ch=08) p = 0.000, R_mean = 0.736, R_std = 0.289
+      Ch=09) p = 0.000, R_mean = 0.744, R_std = 0.272
+      Ch=10) p = 0.000, R_mean = 0.711, R_std = 0.301
+      Ch=11) p = 0.000, R_mean = 0.762, R_std = 0.223
+      Ch=12) p = 0.000, R_mean = 0.711, R_std = 0.231
+      Ch=13) p = 0.000, R_mean = 0.763, R_std = 0.246
+      Ch=14) p = 0.000, R_mean = 0.715, R_std = 0.282
+      Ch=15) p = 0.000, R_mean = 0.747, R_std = 0.207
+      Ch=16) p = 0.000, R_mean = 0.755, R_std = 0.224
+      Ch=17) p = 0.000, R_mean = 0.753, R_std = 0.214
+      Ch=18) p = 0.000, R_mean = 0.749, R_std = 0.232
+      Ch=19) p = 0.000, R_mean = 0.765, R_std = 0.223
+      Ch=20) p = 0.000, R_mean = 0.721, R_std = 0.270
+      Ch=21) p = 0.000, R_mean = 0.701, R_std = 0.193
+      Ch=22) p = 0.000, R_mean = 0.733, R_std = 0.202
+      Ch=23) p = 0.000, R_mean = 0.724, R_std = 0.194
+      Ch=24) p = 0.000, R_mean = 0.769, R_std = 0.223
+      Ch=25) p = 0.000, R_mean = 0.752, R_std = 0.181
+      Ch=26) p = 0.000, R_mean = 0.710, R_std = 0.257
+      Ch=27) p = 0.000, R_mean = 0.398, R_std = 0.163
+      Ch=28) p = 0.000, R_mean = 0.648, R_std = 0.182
+      Ch=29) p = 0.000, R_mean = 0.647, R_std = 0.188
+      Ch=30) p = 0.000, R_mean = 0.737, R_std = 0.197
+      Ch=31) p = 0.000, R_mean = 0.619, R_std = 0.140
+      Ch=32) p = 0.000, R_mean = 0.658, R_std = 0.228
+    > Ch=33) p = 0.051, R_mean = 0.106, R_std = 0.306
+      Ch=34) p = 0.000, R_mean = 0.262, R_std = 0.135
+    > Ch=35) p = 0.012, R_mean = 0.124, R_std = 0.270
+      Ch=36) p = 0.000, R_mean = 0.281, R_std = 0.123
+    
+    Low-correlated (bad) channels: [32, 34]
 
 
 ### 3-2. Time-course of raw voltage topography
 
-Input data of EEG topography can be defined by any mean; voltage, band-limited power, instantaneous phase, and so on. In this example, spatial distribution of raw voltage at specific time point is drawn. For better understanding of the data, ERP time traces at frontal and parietal area are also drawn.
+Input data of EEG topography can be defined by any mean; voltage, band-limited power, instantaneous angle, and so on. In this example, spatial distribution of raw voltage at specific time point is drawn. For better understanding of the data, ERP time traces at frontal and parietal area are also drawn.
+
 
 ```python
 # Demo 3-2. Raw voltage topography
@@ -665,18 +838,21 @@ for targetCondition in conditions:
     topo_y = np.linspace(topo_shift[tIdx][1]-topo_size[1]*.5,topo_shift[tIdx][1]+topo_size[1]*.5,topos[tIdx].shape[1])
     plt.contourf( topo_x, topo_y, topos[tIdx], cmap=cm.jet, levels=topo_clim )
 
-plt.show()
-plt.close()
+plt.gcf().set_facecolor('w'); plt.show(); plt.close(); 
 
 ```
 
 
-### 3-3. Band-limited power topography
+![png](figure_output/output_33_0.png)
+
+
+###3-3. Band-limited power topography
 
 Other than raw voltage, topography of band-limited power at stimulation frequency (40 Hz) can be drawn as well. In this example, stimulus-evoked 40 Hz power were estimated using <code>bandpower()</code> function.  
 
 To demonstrate the effect of stimulation, stimulus-free periods (e.g., pre- and post-stimulus period) data are also obtained.
 
+
 ```python
 # Demo 3-3. Band-power topography: ERP response as a function of time and space
 def band_power(x, targetBand, Fs=2000):
@@ -710,16 +886,25 @@ for periodIdx in range(len(period)):
   plt.subplot(1,len(period),periodIdx+1)
   plot_topo2d(power, clim=(0,3) )
   plt.title('%s, t = [%.1f, %.1f]s'%(periodName[periodIdx],period[periodIdx][0],period[periodIdx][1]))
+  plt.gcf().set_facecolor('w')
 
 ```
 
+    /Users/hiobeen/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:120: RuntimeWarning: invalid value encountered in greater
+    /Users/hiobeen/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:121: RuntimeWarning: invalid value encountered in less
+
+
+
+![png](figure_output/output_35_1.png)
+
 
 ### 3-4. Band-power topography: Comparison across various experimental conditions
 
 Applying the same routine above, power topography figures of five different experimental conditions can be drawn as below.
 
+
 ```python
-# Demo 3-4. Band-power topography: Summary comparison across various stimulus conditions
+# Demo 13. Band-power topography: Summary comparison across various stimulus conditions
 freq = 40 # Hz
 plt.figure(figsize=(17,4))
 conditions = [6,4,1,3,2]
@@ -737,5 +922,23 @@ for targetCondition in conditions:
   plot_topo2d(power, clim=(0,7) )
   plt.title('%s'%condNames[targetCondition-1])
   if targetCondition is not conditions[0]: plt.ylabel('')
+  plt.gcf().set_facecolor('w')
 ```
 
+    /Users/hiobeen/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:120: RuntimeWarning: invalid value encountered in greater
+    /Users/hiobeen/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:121: RuntimeWarning: invalid value encountered in less
+
+
+
+![png](figure_output/output_37_1.png)
+
+
+<br>
+<br>
+Enjoy!
+
+
+```python
+# Try on your own!
+
+```