\documentclass[letterpaper, 11pt]{article}
\usepackage{amsmath}
\usepackage[final]{graphicx}
\usepackage[left = 1.5in, right =1.5in, top = 1.25in, bottom = 1.25in]{geometry}
\usepackage{booktabs}
\usepackage{tabularx}
\usepackage{longtable}
\title{FAnalyze v0.1 \\User's Manual}
\author{Dan Valente\\\small{Mitra Lab}\\\small{Cold Spring Harbor Laboratory}}
\date{\today}
\begin{document}
\maketitle
\section{Introduction}
FAnalyze is a suite of functions written in Matlab to aid in the analysis of animal behavior in the
Open Field. It was written and tested primarily for fly exploratory behavior in a circular arena;
a description and illustration of the measures calculated by FAnalyze can be found in Ref.\
\cite{valente1}. FAnalyze assumes that the user has tracked the $(x,y)$ location of the animal to
obtain the trajectory that the animal took over the course of the experiment. FAnalyze smooths the
data, calculates the animal's velocity, and allows one to segment space and speed based on hard
thresholds. In this inaugural version, FAnalyze only allows for segmentation into circular zones
(it is assumed that the Open Field arena is circular). FAnalyze then allows the user to calculate
and display probability distributions of the relevant variables to obtain a quantitative phenotypic
characterization of the exploratory behavior. Please note that these calculations (the
probability distributions) are the focus of the program.

Although FAnalyze has been rigorously tested, this is version 0.1, so please be aware that
multiple bugs may still exist. If you find them, feel free to fix them (if you are
Matlab savvy), or send an email to the author. To use FAnalyze, a basic working knowledge of
Matlab is assumed.
\section{Placing FAnalyze in the Matlab path}
In order to use FAnalyze, the FAnalyze folder must first be placed in the Matlab path.
\begin{enumerate}
\item Open Matlab.
\item Select \textbf{File $>>$ Set Path\ldots} from the toolbar. The Set Path dialogue will open.
\item Click on the \textbf{Add Folder\ldots} button.
\item Find the FAnalyze directory and select the \textbf{functions} folder. Click \textbf{OK}. The path and name of the
FAnalyze functions folder should show up in the Matlab search path window.
\item Click \textbf{Save}, and then click \textbf{Close}. FAnalyze is now ready to use. To test whether
it is installed correctly, type \texttt{help FAnalyze} in the Matlab command window. If this command
returns an error, the folder was not added correctly, so repeat the steps above. If the
installation worked, you will be able to open the program or use any of the functions regardless of
your current directory.
\end{enumerate}
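The same setup can also be done from the command window. The following is a minimal sketch; the
install location is hypothetical, and it assumes that \texttt{FAnalyze.m} sits directly in the
\textbf{functions} folder.

\begin{verbatim}
% Hypothetical install location; replace with your own path.
fanalyzeDir = 'C:\Tools\FAnalyze\functions';

addpath(fanalyzeDir);   % add the functions folder to the search path
savepath;               % keep the change for future Matlab sessions
which FAnalyze          % prints the full file path if the folder was found
\end{verbatim}

Note that \texttt{savepath} may fail if you do not have write permission for the Matlab path file;
in that case the folder remains on the path for the current session only.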
\section{The FAnalyze GUI}
The FAnalyze graphical user interface (GUI) was written to facilitate use of the analysis functions
for quick data exploration. It is not intended for large-scale data analysis (the individual
FAnalyze Matlab functions are best for this), although the user may find the GUI useful in
high-throughput studies.
\begin{figure}
\centering
\hspace{-0cm}
\includegraphics[bb=0.0 0.0 622.0 367.0, clip=true, scale=0.7]{GUI.jpg}
\caption{The FAnalyze GUI}
\label{GUI}
\end{figure}
The interface is shown in Fig.\ \ref{GUI} and is divided into three general sections: Trajectory
Data, Probability Distributions, and Segmentation. The functionality of these sections is described
below.
\subsection*{Loading a Trajectory}
To begin, the user must load a trajectory file for analysis by clicking the \textbf{Load
Trajectory} button. FAnalyze assumes that the data is contained in a \texttt{.mat} file. Within
that file there \emph{must} be variables labeled \texttt{x}, \texttt{y}, and \texttt{t} describing the
spatial coordinates and time of every point in the trajectory. FAnalyze permits analysis of only
one file per session. Once the file is loaded, a message is displayed in the Command Window
informing the user of the chosen file's name.
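As an illustration of the expected file layout, the sketch below writes a synthetic trajectory to a
\texttt{.mat} file containing \texttt{x}, \texttt{y}, and \texttt{t}. The file name, frame rate,
units, and the use of column vectors are assumptions made for this example; check them against
your own tracking output.

\begin{verbatim}
% Synthetic example: an outward spiral sampled at 30 frames per second.
fs = 30;                      % frame rate in Hz (assumed)
t  = (0:1/fs:60)';            % time stamps in seconds
x  = 0.1*t .* cos(0.5*t);     % x-coordinate
y  = 0.1*t .* sin(0.5*t);     % y-coordinate

save('example_traj.mat', 'x', 'y', 't');
whos('-file', 'example_traj.mat')   % confirm that x, y and t are present
\end{verbatim}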
\subsection*{Instructions for Smoothing}
Due to jitter in the object's location caused by non-translational object movements and artifacts
of the tracking method, it is typically good practice to smooth the resulting trajectory. The
smoothing is executed using the function \texttt{runline} from the Chronux neural data analysis
software package (http://www.chronux.org), which performs a local linear regression on the
data.\footnote{\texttt{runline} is included with FAnalyze, so having Chronux is not a prerequisite
for use of this program.}
\begin{enumerate}
\item Select the length of the running window (in samples) to be used in the smoothing process.
\item Select the step size (in samples) that the window will take as it moves across the data (also
known as the Window Overlap). Enter these numbers into the appropriate boxes.
\item Click the \textbf{Smooth Trajectory} button. In the Command Window, a message informs the user that the smoothing process is underway.\footnote{Smoothing may take a long time, depending on the length of the trajectory.} When smoothing has finished, a message
is displayed declaring successful completion of the smoothing algorithm.
\end{enumerate}
Because smoothing can remove a substantial part of the fluctuations in the data, one is advised to
investigate the effects of changing the smoothing parameters on the resulting trajectory. This
will allow you to decide how much fluctuation is physically relevant in your videos. For
instance, if the data is smoothed too drastically, a short but visible stop of the object can be
entirely smoothed into an apparent movement!
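One way to explore the smoothing parameters outside the GUI is to overlay the raw data with
several smoothed versions. The sketch below assumes that the bundled \texttt{runline} has the
Chronux calling form \texttt{runline(y, n, dn)} (data, window length in samples, step size in
samples); verify this with \texttt{help runline}. The window lengths and the file name are
arbitrary choices for illustration.

\begin{verbatim}
load('example_traj.mat');         % hypothetical file providing x, y, t
windows = [5 15 45];              % window lengths in samples (arbitrary)
step    = 1;                      % step size in samples

figure; hold on;
plot(t, x, 'k.');                 % raw x-coordinate
for n = windows
    xs = runline(x, n, step);     % local linear regression (assumed signature)
    plot(t, xs);
end
xlabel('t'); ylabel('x');
legend('raw', 'n = 5', 'n = 15', 'n = 45');
\end{verbatim}

A short stop that is visible in the raw trace but absent from a smoothed trace indicates that the
corresponding window is too long.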
\subsection*{Viewing the Trajectory}
Once the data is smoothed, the user is able to view the resulting trajectory. Simply select the
plot of interest from the pull-down menu in the Trajectory Data section and click \textbf{View
Trajectory}. The following data can be viewed (note that a circular arena is assumed,
hence the availability of data in polar coordinates):

\begin{tabular}{l l}
\texttt{(x,y)} & The complete position trajectory\\
\texttt{x} & The x-coordinate as a function of time\\
\texttt{y} & The y-coordinate as a function of time\\
\texttt{r} & The radial coordinate as a function of time \\
\texttt{theta} & The angular coordinate as a function of time\\
\texttt{vx} & The x-velocity as a function of time\\
\texttt{vy} & The y-velocity as a function of time\\
\texttt{v} & The speed as a function of time\\
\texttt{vtheta} & The direction of the velocity vector as a function of time
\end{tabular}
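\par\medskip\noindent
To make the definitions of these quantities concrete, the sketch below derives them from
\texttt{x}, \texttt{y}, and \texttt{t}. It is an illustration only and not necessarily how
FAnalyze computes them: the arena center (\texttt{xc}, \texttt{yc}) and the use of simple finite
differences for the velocity are assumptions.

\begin{verbatim}
% Assumes x, y and t are already in the workspace.
xc = 0; yc = 0;                        % arena center (assumed known)

r      = sqrt((x - xc).^2 + (y - yc).^2);   % radial coordinate
theta  = atan2(y - yc, x - xc);             % angular coordinate in radians
vx     = gradient(x) ./ gradient(t);        % finite-difference x-velocity
vy     = gradient(y) ./ gradient(t);        % finite-difference y-velocity
v      = sqrt(vx.^2 + vy.^2);               % speed
vtheta = atan2(vy, vx);                     % direction of the velocity vector
\end{verbatim}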
\subsection*{Creating and Viewing Probability Distributions}
The philosophy behind FAnalyze is as follows\footnote{This is adapted from Ref.\ \cite{valente1}.}:
We regard the trajectory as a stochastic process $\mathbf{x}(t)$. This process would be fully
characterized if the joint distributions $P(\mathbf{x}(t_1), \mathbf{x}(t_2), \ldots
,\mathbf{x}(t_n))$ were specified for all choices of time points $(t_1,t_2, \ldots ,t_n )$.
Unfortunately, the full distribution, $P(\mathbf{x}(t_1), \mathbf{x}(t_2), \ldots ,\mathbf{x}(t_n))$, is
difficult (if not impossible) to measure. However, by examining joint distributions of position and
velocity along with distributions of path curvature, reorientation angle, and event durations, we
can obtain a convenient summary of the animal's behavior in the arena and its interaction with the
environment.

The distributions are estimated using histograms of the data, so it is recommended that the
organism be studied for a long period of time to obtain ``clean-looking'' distributions (a ``good'' length
of time will depend on the activity of the animal and the frame rate at which the video was taken).
These probability histograms are calculated with the functions \texttt{ProbDist1D},
\texttt{ProbDist2D}, and \texttt{JointDist}.

When examining histogram estimates of probability distributions, one needs to take care with phase space factors in order to obtain
accurate estimates. For example, if the animal is moving in two dimensions, the probability
density for the speed $v$ along with the phase space factors is given by $p(v)\,v\,dv\,d\theta$ (where
$\theta$ is the polar angle of the point $(v_x,v_y)$ in velocity space). Therefore, binning data in
bins of size $\Delta v\,\Delta\theta$ would yield an estimate for $p(v)\,v$. When this is the case, we
eliminate the need to divide by $v$ (which could be an unstable calculation for small $v$) by
binning in $v^2$, since $p(v)\,v\,dv\,d\theta \sim p(v^2/2)\,d(v^2/2)\,d\theta$. For one-dimensional motion,
such as movement along the arena boundary, there are no phase space factors and it is sufficient to
bin the data in $v$. FAnalyze allows the user to select whether to calculate the distributions
assuming a 1D or 2D phase space. For 2D phase space calculations, the user should take note of
the non-constant bin widths of these histograms.
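
To spell out the change of variables (a restatement of the argument above; the symbols $u$ and
$\tilde{p}$ are introduced here only for illustration), let $u = v^2/2$, so that $du = v\,dv$. Then
\begin{equation*}
p(v)\,v\,dv\,d\theta = p(v)\,du\,d\theta = \tilde{p}(u)\,du\,d\theta,
\qquad \tilde{p}(u) \equiv p\bigl(\sqrt{2u}\bigr).
\end{equation*}
Counts in bins of constant size $\Delta u\,\Delta\theta$ therefore estimate $p$ directly, with no
division by $v$. Assuming the bins are uniform in $u$, a bin of width $\Delta u$ corresponds to a
width in speed of
\begin{equation*}
\Delta v = \sqrt{2(u+\Delta u)} - \sqrt{2u} \approx \frac{\Delta u}{v},
\end{equation*}
which shrinks as $v$ grows. This is the origin of the non-constant bin widths in $v$.
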
As soon as the data is smoothed (and again whenever the data is segmented), the Probability
Distributions list box is populated with the variables that are available for analysis.
The naming convention for variables in the list box is described in Appendix A. To calculate and
view the probability distributions of interest, proceed as follows:
\begin{enumerate}
\item For a single-variable marginal distribution, highlight the variable of interest by clicking
on it. For a joint distribution of two variables, select the first variable of interest, hold down
the CTRL key on the keyboard, and select the second variable of interest.
\item Enter the \textbf{Bins} to use for the calculation. This field will accept any bin description that the Matlab \texttt{hist} or \texttt{hist3} commands accept. See
the help files of those functions for details, and make sure that brackets, commas, and other
necessary punctuation are used. Also note that no other options of \texttt{hist} or
\texttt{hist3} are available in the FAnalyze functions in this release (v0.1).

As an example, if you wish to calculate a joint distribution having 100 bins in the first variable
and 150 in the second, you would enter: \texttt{[100 150]}. If, instead, you wanted bin centers
from 0 to 10 in steps of 0.1 for the first variable, and bin centers from 2 to 4 in steps of 0.3
for the second, you would enter: \texttt{\{[0:0.1:10] [2:0.3:4]\}} (note the curly brackets).
\item Select whether the variable of interest exists in a one- or two-dimensional phase space (see above).
\item Click the \textbf{View Distribution} button. The distribution will be calculated and a plot
of the distribution will be displayed.
\end{enumerate}
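For reference, the bin descriptions above correspond to the following direct calls to
\texttt{hist} and \texttt{hist3} (the latter requires the Statistics Toolbox). This is a sketch of
the underlying Matlab commands only; the normalization by the total count is an assumption made
here and may differ from what the FAnalyze functions do.

\begin{verbatim}
% Assumes x, y and v are already in the workspace.

% 1D: 50 equally spaced bins for the speed v
[counts, centers] = hist(v, 50);
p1 = counts / sum(counts);            % fraction of samples in each bin

% 2D: 100 bins in the first variable, 150 in the second
[counts2, ctrs2] = hist3([x(:) y(:)], [100 150]);

% 2D with explicit bin centers (note the curly brackets)
[counts3, ctrs3] = hist3([x(:) y(:)], {0:0.1:10, 2:0.3:4});
p3 = counts3 / sum(counts3(:));       % fraction of samples in each 2D bin
\end{verbatim}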
Every calculation that is performed is held in memory until FAnalyze is closed. At any point,
you may click the \textbf{Save} button to save your calculations (the structure of the saved data is described
below). Unfortunately, at this point the calculated data cannot be accessed from the command
window until it has been saved and reloaded.
\subsection*{Segmenting Space}
Often, an examination of the joint distributions $p(x,v)$ or $p(r,\theta)$ will show that the
animal has a spatial preference for some part of the arena. FAnalyze allows the user to segment
the arena into any number of circular spatial ``zones.'' Please note that only concentric circular
zones are allowed (or, rather, annular zones). To segment space:
\begin{enumerate}
\item Enter the \textbf{Number of Zones} into which you wish to segment the arena.
\item Click the \textbf{Segment Space} button. A window will pop up asking the user to input
relevant information.
\item Input the location of the threshold defining the boundary between zones 1 and 2, in terms of the radial distance from the center. Enter names for these zones. Click \textbf{OK}.
\item If more than two zones are requested, another window will pop up asking for similar
information. Make sure that the first zone name in this window is the same as the second name in
the previous window and that the second threshold is larger than the first; otherwise, you will get an
error (zone 2 must have a consistent name, and the second threshold must be farther from the center than the first
threshold).
\item Repeat this for all the zones you requested.
\item Once the segmentation is finished, a message will be displayed in the command window, and the
Probability Distributions list box will be populated with variables available for analysis.
\end{enumerate}
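The idea behind the spatial segmentation can be sketched as follows. The threshold values and
zone names are hypothetical, and assigning a point that lies exactly on a threshold to the outer
zone is an assumption; this is not the FAnalyze implementation itself.

\begin{verbatim}
% Assumes the radial coordinate r and the x-coordinate x are in the workspace.
edges = [2.5 4.0];                  % radial thresholds, increasing (hypothetical)
names = {'CZ', 'MZ', 'RZ'};         % zone names, innermost first (hypothetical)

zone = ones(size(r));               % start with every point in zone 1
for k = 1:numel(edges)
    zone(r >= edges(k)) = k + 1;    % points beyond threshold k move outward
end

inRZ = (zone == 3);                 % logical index of points in the outer zone
x_RZ = x(inRZ);                     % x-coordinates recorded in that zone
fprintf('%s: %d points\n', names{3}, nnz(inRZ));
\end{verbatim}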
\subsection*{Segmenting Speed}
As with the spatial distributions, when examining the speed distribution $p(v)$, the user may
find that the distribution appears to be a mixture of a few different types of motion. Because of
this, investigators often find it useful to segment the speed into distinct modes of motion. In
FAnalyze, speed segmentation is performed in almost exactly the same way as space segmentation. To segment
speed:
\begin{enumerate}
\item Enter the \textbf{Noise Threshold} seen in your data. The noise threshold is the lowest speed that
you can accurately resolve. It can be obtained by examining the speed vs.\ time plot and noting
the maximum speed attained in regions where the animal is visibly stationary. Velocity and speed
points below this threshold are assigned a value of 0.
\item Click the \textbf{Segment Speed} button. A window will pop up asking the user to input
relevant information \emph{for each spatial zone}.
\item Input the location of the threshold defining the boundary between segments 1 and 2, in terms of the absolute speed. Enter names for these segments. Click \textbf{OK}.
\item If more than two segments are requested, another window will pop up asking for similar
information. Make sure that the first segment name in this window is the same as the second name
in the previous window and that the second threshold is larger than the first; otherwise, you will get
an error (segment 2 must have a consistent name, and the second threshold must be higher than the
first threshold).
\item Repeat this for all the segments and zones you requested.
\item FAnalyze now segments the data according to the user's requests and also determines where
the animal has stopped (points below the noise threshold). Stops are considered a ``segment.''
\item Once the segmentation is finished, a message will be displayed in the command window, and the
Probability Distributions list box will be populated with variables available for analysis.
\end{enumerate}
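In the same spirit, the speed segmentation amounts to thresholding the speed. The sketch below
uses hypothetical threshold values and two moving segments; it illustrates the logic described in
the steps above rather than the actual FAnalyze code.

\begin{verbatim}
% Assumes vx, vy and v are already in the workspace.
noiseThresh = 0.05;                 % lowest resolvable speed (hypothetical)
segThresh   = 1.0;                  % boundary between segments 1 and 2
                                    % (hypothetical, above noiseThresh)

isStop = v < noiseThresh;           % stops: points below the noise threshold
vx(isStop) = 0;                     % velocity and speed below the noise
vy(isStop) = 0;                     % threshold are set to zero
v(isStop)  = 0;

isSeg1 = ~isStop & v < segThresh;   % slower mode of motion
isSeg2 = v >= segThresh;            % faster mode of motion
\end{verbatim}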
\subsection*{Saving the Data}
To save the data from the session as a \texttt{.mat} file, click on the \textbf{Save} button in the
lower right-hand corner of FAnalyze. The user is asked to choose a location and a filename in which
to save. The data is saved as two cell arrays, \texttt{traj} and \texttt{P}.\\
\noindent\texttt{traj} is a $1 \times N$ cell array, where $N$ is the number of zones in the arena. The
$i^{\text{th}}$ cell contains a structure with the trajectory information from that zone. Within
that structure, each variable is itself a cell array containing one structure per speed segment. For
example, data from the second zone is accessed by typing \texttt{traj\{2\}} and is organized as
follows:
\begin{verbatim}
traj{2} =
    zone_label: {'CZ'}
    seg_label: {'all' 'stops' 'NZS' 'FSS'}
    t: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    x: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    y: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    r: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    theta: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    vx: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    vy: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    v: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    vtheta: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    tau: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    kappa: {[1x1 struct] [1x1 struct] [1x1 struct] [1x1 struct]}
    beta: {[1x1 struct]}
\end{verbatim}
The labels are fairly self-explanatory. Note that \texttt{beta} can only be calculated if all the
points in the trajectory are considered.

The $j^{\text{th}}$ entry of each variable's cell array contains a structure with a single field,
called \texttt{data}. So, to extract the x-position of the
organism in zone \texttt{`CZ'} (the second zone) while the organism walked in the speed segment
labeled \texttt{`FSS'} (the fourth segment), one would type
\begin{verbatim}
traj{2}.x{4}.data
\end{verbatim}
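As a further illustration, the sketch below walks through every zone and speed segment of a saved
session and reports how many points each contains. The file name is hypothetical; the field names
are those shown above.

\begin{verbatim}
load('session.mat');                     % hypothetical file; provides traj and P

for i = 1:numel(traj)
    zoneName = traj{i}.zone_label{1};    % zone_label is a one-element cell
    for j = 1:numel(traj{i}.seg_label)
        segName = traj{i}.seg_label{j};
        nPts    = numel(traj{i}.x{j}.data);
        fprintf('%s / %s: %d points\n', zoneName, segName, nPts);
    end
end
\end{verbatim}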
The probability histograms are saved in the \texttt{P} cell array. \texttt{P} is a $1 \times M$ cell array, where $M$
is the number of times the \textbf{View Distribution} button was pressed during the session. Each entry of
the cell array contains a structure that is organized as follows:
\begin{verbatim}
P{1} =
    label: 'x_Full Arena_all'
    phase_opt: 'phase1D'
    data: [1x50 double]
    bins: [1x50 double]
\end{verbatim}
The field \texttt{label} is the name of the variable from which the probability distribution was
calculated. \texttt{phase\_opt} denotes whether the user chose to calculate the distribution
assuming a 1D or 2D phase space. \texttt{data} contains the bin-by-bin data from the calculated
histogram, and \texttt{bins} contains the bin centers. If an error occurs, the entry is completely
empty.
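
After reloading a saved session, the single-variable distributions can be replotted along the
following lines. This is a sketch that assumes \texttt{P} is in the workspace and that, for
single-variable distributions, \texttt{data} and \texttt{bins} are numeric vectors of equal length
as in the example above; joint distributions are skipped because their layout is not described
here.

\begin{verbatim}
for k = 1:numel(P)
    if isempty(P{k})                     % entry left empty after an error
        continue;
    end
    if isnumeric(P{k}.data) && isvector(P{k}.data)
        figure;
        bar(P{k}.bins, P{k}.data);       % histogram at the saved bin centers
        title(P{k}.label, 'Interpreter', 'none');   % keep underscores literal
    end
end
\end{verbatim}
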
Admittedly, this seems complicated, but the author felt it was a decent way to organize the data
file.
\section{Known Problems}
There are no known problems with FAnalyze at this point, although they undoubtedly exist.
\section{Concluding Comments}
For those who wish to use the functions from the Matlab command line, complete descriptions of
their use and workings can be found in the Matlab help files; simply type \texttt{help
function\_name}. Describing them in detail here would be superfluous. The scripts are also
commented and should therefore be relatively easy to follow. Suggestions for improvements to
the algorithms, the GUI, or the coding style are highly encouraged! Comments on the ease of use of
the GUI and functions are also important for refining this program. Since this is v0.1, FAnalyze
needs quite a bit of testing in order to find all of the bugs (pun intended). Until then, please
check and double-check any results that you obtain from this program, and make sure that they make sense! \\
\noindent Enjoy!
\section{Appendix A: Variable Naming Convention}
There are eleven variables that are available for analysis in FAnalyze. They are:\\
\begin{tabular}{l l}
\texttt{x} & The x-coordinate as a function of time\\
\texttt{y} & The y-coordinate as a function of time\\
\texttt{r} & The radial coordinate as a function of time \\
\texttt{theta} & The angular coordinate as a function of time\\
\texttt{vx} & The x-velocity as a function of time\\
\texttt{vy} & The y-velocity as a function of time\\
\texttt{v} & The speed as a function of time\\
\texttt{vtheta} & The direction of the velocity vector as a function of time\\
\texttt{tau} & Duration of speed segments\\
\texttt{kappa} & Curvature of the path \\
\texttt{beta} & Reorientation angle
\end{tabular}\\
In the list box, these variable names are followed by the zone name and the speed segment name as
given by the user. For example, for a zone named \texttt{`RZ'} and a speed segment within that
zone named \texttt{`FSS'}, the speed would appear as \texttt{v\_RZ\_FSS}. For the full arena, the
zone label is automatically set to \texttt{`Full Arena'}. If all the velocity points are
included, the speed segment label is \texttt{`all'}. Therefore, after smoothing and before any
segmentation, the variables in the list contain the label \texttt{`\_Full Arena\_all'}.
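
Because the \texttt{label} field of each saved distribution has this same form (for example
\texttt{`x\_Full Arena\_all'}), a label can be assembled and used to look up a distribution after a
session has been reloaded. The sketch below assumes that \texttt{P} is in the workspace; the zone
and segment names are the hypothetical ones used above.

\begin{verbatim}
label = sprintf('%s_%s_%s', 'v', 'RZ', 'FSS');   % variable, zone, segment

hit = [];                                % index of the match, if any
for k = 1:numel(P)
    if ~isempty(P{k}) && strcmp(P{k}.label, label)
        hit = k;
        break;
    end
end
\end{verbatim}
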
\begin{thebibliography}{9}
\bibitem{valente1}Valente, D., Golani, I., and P.P. Mitra, ``Analysis of the trajectory of \emph{Drosophila
melanogaster} in a circular open field arena.'' PLoS ONE 2(10), e1083,
doi:10.1371/journal.pone.0001083, (2007)
\bibitem{valente2}Valente, D., Wang, H., Andrews, P., Saar, S., Tchernichovski, O., Benjamini, Y., Golani, I., and
P.P. Mitra, ``Characterization of animal behavior through the use of audio and video signal
processing.'' IEEE Multimedia, 14 (2), 32--41, (2007)
\end{thebibliography}
\end{document}