{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "# Features\n",
    "\n",
    "Tagging via **Tag** & **MultiTags** allows to highlight points or regions in stored data.\n",
    "\n",
    "So far, the **Tag/MultiTag** just notes that there are some  interesting intervals in which something happened in the data stored in one or many **DataArray(s)**. The name of the **Tag/MultiTag** entity may tell us that the highlighted interval(s) represent stimulus regions. Using Features we can now add further information to characterize these regions.\n",
    "\n",
    "![multiple regions](resources/multiple_regions.png)\n",
    "\n",
    "Let’s assume we wanted to store the stimulus frequency. The following lines of code can be inserted into the previous example before the file is closed.\n",
    "\n",
    "```python\n",
    "    stim_frequencies = [10, 15, 20, 25]\n",
    "    frequencies = block.create_data_array(\"stimulus frequency\", \"nix.feature\", data=stim_frequencies, label=\"frequency\", unit=\"Hz\")\n",
    "    frequencies.append_set_dimension()\n",
    "\n",
    "    mtag = block.create_multi_tag(\"stimulus segments\", \"nix.segments.stimulus\", positions=positions, extents=extents)\n",
    "    mtag.references.append(data_array)\n",
    "    mtag.create_feature(frequencies, nixio.LinkType.Indexed)\n",
    "```\n",
    "\n",
    "The feature data can be used to create the text labels below the segments in the plot above. Each entry in the frequencies **DataArray** corresponds to one of the tagged sections. Thus we use the ``nixio.LinkType.Indexed`` flag while creating the feature. We can read the feature data that belongs to the respective position index by calling the feature_data method on the **MultiTag**.\n",
    "\n",
    "```python\n",
    "ax.text(interval + extent / 2, -1.25, \"%.1f %s\" % (mtag.feature_data(i, \"stimulus frequency\")[:],\n",
    "                                                   mtag.features[\"stimulus frequency\"].data.unit),\n",
    "        fontsize=8, ha=\"center\")\n",
    "```\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## The LinkType specifies how Features are interpreted\n",
    "\n",
    "The Feature adds the information stored in a DataArray to the Tag/MultiTag. The way this information has to be interpreted is specified via the LinkType. There are three distinct types:\n",
    "\n",
    "1. **Indexed:** For each position in the referring Tag/MultiTag there is one entry in the linked DataArray. In case the linked DataArray is multi-dimensional, the number of entries along dimension 0 must match the number of positions.\n",
    "2. **Tagged:** Positions and extents of the referring Tag/MultiTag need to be applied in the same way to the linked DataArray as to the referenced data (stored in the ‘references’ list).\n",
    "3. **Untagged:** The whole data stored in the linked Feature is a feature of the Tag/MultiTag ignoring any indexing, positions or extents.\n",
    "\n",
    "Let's go through it step by step..."
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Untagged features\n",
    "\n",
    "Example scenario: The activity of a neuron was recorded. Now, in a certain period of time, a stimulus was applied and the stimulus waveform (e.g. a noise stimulus) should be stored along the data for later analysis.\n",
    "\n",
    "* the neuronal response is stored in **DataArrays**.\n",
    "* within these we highlight the stimulus-on period using a **Tag**.\n",
    "* the random noise stimulus was stored in another **DataArray**, it is shorter than the response arrays.\n",
    "\n",
    "![noise stim](resources/untagged_feature.png)\n",
    "\n",
    "The recorded membrane voltage data is 10s long and we *tag* the interval between stimulus_onset and stimulus_onset + stimulus_duration (from 1 to 9 seconds). The stimulus itself is only 8s long and was played in the tagged interval. We use a Tag to bind stimulus and recorded signal together. The data stored in the ``untagged`` feature is the whole stimulus. The Tag’s position and extent do not apply to the stimulus trace.\n",
    "\n",
    "```python\n",
    "    stim = block.create_data_array(\"stimulus\", \"nix.sampled.time_series\", data=stimulus,\n",
    "                                   label=\"current stimulus\", unit=\"nA\")\n",
    "    stim.append_sampled_dimension(stepsize, label=\"time\", unit=\"s\")\n",
    "\n",
    "    # create the Tag to highlight the stimulus-on segment\n",
    "    tag = block.create_tag(\"stimulus presentation\", \"nix.epoch.stimulus_presentation\", [stim_onset])\n",
    "    tag.extent = [stim_duration]\n",
    "    tag.references.append(data)\n",
    "\n",
    "    # set stimulus as untagged feature of the tag\n",
    "    tag.create_feature(stim, nixio.LinkType.Untagged)\n",
    "```"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Tagged Feature\n",
    "\n",
    "In contrast to the ``Untagged`` **Feature**, position and extent of the **Tag** should also be applied to the data stored in the **Feature**.\n",
    "\n",
    "In this example the data and the stimulus trace have the same duration that is, the times in one trace match the times in the other one. \n",
    "\n",
    "![tagged feature](resources/tagged_feature.png)\n",
    "\n",
    "The spike times are used to tag the recording of the membrane voltage using a **MultiTag**. The stimulus is added to the **MultiTag** as a ``tagged`` **Feature**. That is, the positions of the tag (the spike times) should be applied also to the stimulus. Extracting the ``feature_data`` gives the stimulus intensities at the times of the spikes, the orange distribution.\n",
    "\n",
    "```python\n",
    "    # create the positions DataArray, i.e. the spike times\n",
    "    positions = block.create_data_array(\"spike times\", \"nix.events.spike_times\", data=spike_times)\n",
    "    positions.append_range_dimension_using_self()\n",
    "\n",
    "    # create a MultiTag\n",
    "    multi_tag = block.create_multi_tag(\"spike times\", \"nix.events.spike_times\", positions)\n",
    "    multi_tag.references.append(data)\n",
    "\n",
    "    # save stimulus snippets in a DataArray\n",
    "    stimulus_array = block.create_data_array(\"stimulus\", \"nix.sampled\", data=stimulus, label=\"stimulus\", unit=\"nA\")\n",
    "    # add a descriptor for the time axis\n",
    "    stimulus_array.append_sampled_dimension(stepsize, label=\"time\", unit=\"s\")\n",
    "\n",
    "    # set stimulus as a tagged feature of the multi_tag\n",
    "    multi_tag.create_feature(stimulus_array, nixio.LinkType.Tagged)\n",
    "```\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Storing wavelets as an Indexed Feature\n",
    "\n",
    "Above we have seen, that single numbers can be stored as an indexed **Feature**.\n",
    "\n",
    "In this example we want to store a snippet of the stimulus driving the neuron for each spike to create a Spike Triggered Average (STA), i.e. the average stimulus that leads to an action potential in the recorded neuron.\n",
    "\n",
    "In this example we store these stimulus snippets and link them to the events/spikes by adding a **Feature** to the **MultiTag**. There is one snippet for each spike. The index of each event has to be used as an index in the first dimension of the **Feature** data. \n",
    "\n",
    "```python\n",
    "    # sts is 2D, it contains the stimulus snippets centered on the recorded spikes. First dimension represents the number of spikes, second represents time.\n",
    "    snippets = block.create_data_array(\"spike triggered stimulus\", \"nix.regular_sampled.multiple_series\", data=sts, label=\"stimulus\", unit=\"nA\")\n",
    "    snippets.append_set_dimension()\n",
    "    snippets.append_sampled_dimension(stepsize, offset= -sta_offset * stepsize, label=\"time\", unit=\"s\")\n",
    "\n",
    "    # set snippets as an indexed feature of the multi_tag\n",
    "    multi_tag.create_feature(snippets, nixio.LinkType.Indexed)\n",
    "```\n",
    "\n",
    "![sta feature](resources/spike_features.png)\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Retrieving the data stored in a feature\n",
    "\n",
    "Regardless of the ``LinkType``, one can ask the library to return the feature data by calling the ``feature_data`` methods on **Tag** or **MultiTag**.\n",
    "\n",
    "The signatures differ a little bit:\n",
    "\n",
    "* ``tag.feature_data(name_or_id_or_index)`` takes a single argument, i.e. the *name*, *id* or the *index* of one of the referenced **Features**.\n",
    "* ``mtag.feature_data(pos_index, name_or_id_or_index)`` takes two arguments of which the first is the *position index*, and the second one the  *name*, *id* or the *index* of one of the referenced **Features**.\n",
    "\n",
    "The methods return a **DataView** entity that holds the reference to the data in the file. The data can also be read in one go:\n",
    "\n",
    "``data = tag.feature_data(\"name of feature\")[:]``\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {}
  }
 ],
 "metadata": {
  "orig_nbformat": 4,
  "language_info": {
   "name": "python",
   "version": "3.9.5"
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3.9.5 64-bit"
  },
  "interpreter": {
   "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}