day2 tutorial on features

Jan Grewe, 2 years ago
commit eacd5735ef
1 file changed with 180 additions and 0 deletions

+ 180 - 0
day_2/tutorial_1.ipynb

@@ -0,0 +1,180 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# Features\n",
+    "\n",
+    "Tagging via **Tag** and **MultiTag** entities lets us highlight points or regions in stored data.\n",
+    "\n",
+    "So far, a **Tag/MultiTag** only notes that something interesting happened in certain intervals of the data stored in one or more **DataArrays**. The name of the **Tag/MultiTag** entity may tell us that the highlighted intervals represent stimulus regions. Using **Features**, we can add further information that characterizes these regions.\n",
+    "\n",
+    "![multiple regions](resources/multiple_regions.png)\n",
+    "\n",
+    "Let’s assume we want to store the stimulus frequency. The following lines of code can be inserted into the previous example before the file is closed.\n",
+    "\n",
+    "```python\n",
+    "    stim_frequencies = [10, 15, 20, 25]\n",
+    "    frequencies = block.create_data_array(\"stimulus frequency\", \"nix.feature\", data=stim_frequencies, label=\"frequency\", unit=\"Hz\")\n",
+    "    frequencies.append_set_dimension()\n",
+    "\n",
+    "    mtag = block.create_multi_tag(\"stimulus segments\", \"nix.segments.stimulus\", positions=positions, extents=extents)\n",
+    "    mtag.references.append(data_array)\n",
+    "    mtag.create_feature(frequencies, nixio.LinkType.Indexed)\n",
+    "```\n",
+    "\n",
+    "The feature data can be used to create the text labels below the segments in the plot above. Each entry in the frequencies **DataArray** corresponds to one of the tagged sections, so we pass the ``nixio.LinkType.Indexed`` flag when creating the feature. We can read the feature data that belongs to a given position index by calling the ``feature_data`` method on the **MultiTag**.\n",
+    "\n",
+    "```python\n",
+    "ax.text(interval + extent / 2, -1.25, \"%.1f %s\" % (mtag.feature_data(i, \"stimulus frequency\")[:],\n",
+    "                                                   mtag.features[\"stimulus frequency\"].data.unit),\n",
+    "        fontsize=8, ha=\"center\")\n",
+    "```\n"
+   ],
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## The LinkType specifies how Features are interpreted\n",
+    "\n",
+    "The Feature adds the information stored in a DataArray to the Tag/MultiTag. The way this information has to be interpreted is specified via the LinkType. There are three distinct types:\n",
+    "\n",
+    "1. **Indexed:** For each position in the referring Tag/MultiTag there is one entry in the linked DataArray. In case the linked DataArray is multi-dimensional, the number of entries along dimension 0 must match the number of positions.\n",
+    "2. **Tagged:** Positions and extents of the referring Tag/MultiTag need to be applied in the same way to the linked DataArray as to the referenced data (stored in the ‘references’ list).\n",
+    "3. **Untagged:** All data stored in the linked DataArray is a feature of the Tag/MultiTag, ignoring any indexing, positions, or extents.\n",
+    "\n",
+    "Let's go through it step by step..."
+   ],
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Untagged features\n",
+    "\n",
+    "Example scenario: The activity of a neuron was recorded. During a certain period of time, a stimulus was applied, and the stimulus waveform (e.g. a noise stimulus) should be stored alongside the data for later analysis.\n",
+    "\n",
+    "* The neuronal response is stored in **DataArrays**.\n",
+    "* Within these, we highlight the stimulus-on period using a **Tag**.\n",
+    "* The random noise stimulus is stored in another **DataArray**; it is shorter than the response arrays.\n",
+    "\n",
+    "![noise stim](resources/untagged_feature.png)\n",
+    "\n",
+    "The recorded membrane voltage data is 10 s long, and we *tag* the interval between stimulus_onset and stimulus_onset + stimulus_duration (from 1 to 9 seconds). The stimulus itself is only 8 s long and was played in the tagged interval. We use a **Tag** to bind stimulus and recorded signal together. The data stored in the ``untagged`` feature is the whole stimulus; the Tag’s position and extent do not apply to the stimulus trace.\n",
+    "\n",
+    "```python\n",
+    "    stim = block.create_data_array(\"stimulus\", \"nix.sampled.time_series\", data=stimulus,\n",
+    "                                   label=\"current stimulus\", unit=\"nA\")\n",
+    "    stim.append_sampled_dimension(stepsize, label=\"time\", unit=\"s\")\n",
+    "\n",
+    "    # create the Tag to highlight the stimulus-on segment\n",
+    "    tag = block.create_tag(\"stimulus presentation\", \"nix.epoch.stimulus_presentation\", [stim_onset])\n",
+    "    tag.extent = [stim_duration]\n",
+    "    tag.references.append(data)\n",
+    "\n",
+    "    # set stimulus as untagged feature of the tag\n",
+    "    tag.create_feature(stim, nixio.LinkType.Untagged)\n",
+    "```"
+   ],
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Tagged features\n",
+    "\n",
+    "In contrast to the ``Untagged`` **Feature**, the position and extent of the **Tag** are also applied to the data stored in a ``Tagged`` **Feature**.\n",
+    "\n",
+    "In this example, the data and the stimulus trace have the same duration; that is, the times in one trace match the times in the other.\n",
+    "\n",
+    "![tagged feature](resources/tagged_feature.png)\n",
+    "\n",
+    "The spike times are used to tag the recording of the membrane voltage using a **MultiTag**. The stimulus is added to the **MultiTag** as a ``tagged`` **Feature**. That is, the positions of the tag (the spike times) are also applied to the stimulus. Extracting the ``feature_data`` gives the stimulus intensities at the times of the spikes (the orange distribution in the figure).\n",
+    "\n",
+    "```python\n",
+    "    # create the positions DataArray, i.e. the spike times\n",
+    "    positions = block.create_data_array(\"spike times\", \"nix.events.spike_times\", data=spike_times)\n",
+    "    positions.append_range_dimension_using_self()\n",
+    "\n",
+    "    # create a MultiTag\n",
+    "    multi_tag = block.create_multi_tag(\"spike times\", \"nix.events.spike_times\", positions)\n",
+    "    multi_tag.references.append(data)\n",
+    "\n",
+    "    # store the stimulus trace in a DataArray\n",
+    "    stimulus_array = block.create_data_array(\"stimulus\", \"nix.sampled\", data=stimulus, label=\"stimulus\", unit=\"nA\")\n",
+    "    # add a descriptor for the time axis\n",
+    "    stimulus_array.append_sampled_dimension(stepsize, label=\"time\", unit=\"s\")\n",
+    "\n",
+    "    # set stimulus as a tagged feature of the multi_tag\n",
+    "    multi_tag.create_feature(stimulus_array, nixio.LinkType.Tagged)\n",
+    "```\n"
+   ],
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Storing wavelets as an Indexed Feature\n",
+    "\n",
+    "Above, we have seen that single numbers can be stored as an indexed **Feature**.\n",
+    "\n",
+    "In this example we want to store a snippet of the stimulus driving the neuron for each spike to create a Spike Triggered Average (STA), i.e. the average stimulus that leads to an action potential in the recorded neuron.\n",
+    "\n",
+    "We store these stimulus snippets and link them to the events (spikes) by adding a **Feature** to the **MultiTag**. There is one snippet for each spike, so the index of each event is used as the index along the first dimension of the **Feature** data.\n",
+    "\n",
+    "```python\n",
+    "    # sts is 2D, it contains the stimulus snippets centered on the recorded spikes. First dimension represents the number of spikes, second represents time.\n",
+    "    snippets = block.create_data_array(\"spike triggered stimulus\", \"nix.regular_sampled.multiple_series\", data=sts, label=\"stimulus\", unit=\"nA\")\n",
+    "    snippets.append_set_dimension()\n",
+    "    snippets.append_sampled_dimension(stepsize, offset=-sta_offset * stepsize, label=\"time\", unit=\"s\")\n",
+    "\n",
+    "    # set snippets as an indexed feature of the multi_tag\n",
+    "    multi_tag.create_feature(snippets, nixio.LinkType.Indexed)\n",
+    "```\n",
+    "\n",
+    "![sta feature](resources/spike_features.png)\n"
+   ],
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Retrieving the data stored in a feature\n",
+    "\n",
+    "Regardless of the ``LinkType``, one can ask the library to return the feature data by calling the ``feature_data`` method on a **Tag** or **MultiTag**.\n",
+    "\n",
+    "The signatures differ slightly:\n",
+    "\n",
+    "* ``tag.feature_data(name_or_id_or_index)`` takes a single argument: the *name*, *id*, or *index* of one of the referenced **Features**.\n",
+    "* ``mtag.feature_data(pos_index, name_or_id_or_index)`` takes two arguments: the first is the *position index*, and the second is the *name*, *id*, or *index* of one of the referenced **Features**.\n",
+    "\n",
+    "The methods return a **DataView** entity that holds the reference to the data in the file. The data can also be read in one go:\n",
+    "\n",
+    "``data = tag.feature_data(\"name of feature\")[:]``\n"
+   ],
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "source": [],
+   "metadata": {}
+  }
+ ],
+ "metadata": {
+  "orig_nbformat": 4,
+  "language_info": {
+   "name": "python",
+   "version": "3.9.5"
+  },
+  "kernelspec": {
+   "name": "python3",
+   "display_name": "Python 3.9.5 64-bit"
+  },
+  "interpreter": {
+   "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
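
The three link types described in the notebook (Indexed, Tagged, Untagged) can be illustrated without the library. The following is a minimal sketch in plain Python and not the nixio implementation: the helper `feature_slice` and all of its parameters are hypothetical names chosen for this illustration. It also assumes that positions and extents are given directly as sample indices and that a segment is the half-open range from position to position + extent; the library itself works with physical units and dimension descriptors.

```python
def feature_slice(link_type, feature, pos_index=None, positions=None, extents=None):
    """Return the part of `feature` that belongs to one tagged position.

    Hypothetical helper for illustration only. It mimics the meaning of the
    three link types on plain Python lists and is not part of nixio.
    """
    if link_type == "untagged":
        # The whole feature belongs to the tag; positions and extents are ignored.
        return list(feature)
    if link_type == "indexed":
        # One entry (or one row, for nested lists) per tagged position.
        return feature[pos_index]
    if link_type == "tagged":
        # Position and extent are applied to the feature in the same way
        # as they would be applied to the referenced data.
        start = positions[pos_index]
        stop = start + extents[pos_index]
        return feature[start:stop]
    raise ValueError("unknown link type: %r" % (link_type,))


# Made-up example data: a short stimulus trace and two tagged segments.
stimulus = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
positions = [1, 5]      # start index of each tagged segment (assumed: sample indices)
extents = [2, 3]        # length of each segment in samples
frequencies = [10, 15]  # one value per tagged segment

whole = feature_slice("untagged", stimulus)
freq = feature_slice("indexed", frequencies, pos_index=1)
segment = feature_slice("tagged", stimulus, pos_index=1,
                        positions=positions, extents=extents)
```

In this sketch, `whole` is the complete stimulus, `freq` is the single frequency value that belongs to the second segment, and `segment` is the stretch of the stimulus covered by the second segment.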