# Alphabetic Decision Task Analysis

This repository contains code for rerunning the analyses reported in our preprint:

> Taylor, J.E., Sinn, R., Iaia, C., & Fiebach, C. J. (2024). Beyond Letters: Optimal Transport as a Model for Sub-Letter Orthographic Processing. https://doi.org/10.1101/2024.11.11.622929

## Running the Analysis on Docker

The easiest way to re-run these analyses without worrying about package versions is to use a Docker container. For this you will need a [Docker installation](https://docs.docker.com/engine/install/).

### 1. Build the Docker Image

Set the working directory to the directory containing `Dockerfile`, then build an image, e.g., called `lettersim`:

```sh
docker image build -t lettersim .
```

This requires internet access. Building the image will automatically download the raw data in BIDS format from OpenNeuro (https://openneuro.org/datasets/ds005594) into the image.

### 2. Create a Volume

Create a volume, e.g., called `vol1`, from which any outputs can be retrieved:

```sh
docker volume create vol1
```

### 3. Run a Script in a Container

#### Run a Script Non-Interactively

Create a container from the Docker image to run the script you want, mounted to the volume you just created. For example, here we run `01_get_corpus_model_rdms.py` in a container named `LS1`. The last part, `./run.sh 01_get_corpus_model_rdms.py`, tells the container which script to run; passing it through `run.sh` ensures that the correct environment and interpreter are used.

```sh
docker container run -t --mount source=vol1,target=/analysis --name=LS1 lettersim ./run.sh 01_get_corpus_model_rdms.py
```

> Note: see the [Docker documentation](https://docs.docker.com/reference/cli/docker/container/create/) for available options. For example, you can use `--cpus` to limit the number of CPUs available to the container.

#### Run a Script Interactively

Alternatively, you can run scripts interactively, e.g., by starting an interactive shell:

```sh
docker container run -it --mount source=vol1,target=/analysis --name=LSINT lettersim bash
```

and then running a script through `run.sh`:

```sh
./run.sh 01_get_corpus_model_rdms.py
```

### Accessing Data from the Container

To get any output files, you will need to access the mountpoint of the volume that the container is using. By default, volumes can be accessed under `/var/lib/docker/volumes/`. You can also find the location listed in the volume's configuration:

```sh
docker volume inspect vol1
```

```json
[
    {
        "CreatedAt": "1980-1-1T00:00:01Z",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/vol1/_data",
        "Name": "vol1",
        "Options": null,
        "Scope": "local"
    }
]
```

If using Docker Desktop, you can also browse the files in the Volumes tab. The actual location used by Docker Desktop varies by Docker and WSL version. On Docker Desktop v4.34.3 on Windows 11, data in a volume called `vol1` can be found at:

```
\\wsl.localhost\docker-desktop-data\data\docker\volumes\vol1\_data
```

### Running on HPC

Because the analysis is packaged as a Docker image, it can also be run on high-performance computing (HPC) clusters via Singularity: https://docs.sylabs.io/guides/2.6/user-guide/singularity_and_docker.html

## Running the Analysis without Docker

If you prefer not to use Docker, you can run the scripts outside of a container. These instructions assume you have [git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) and a [conda distribution](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) installed.
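As a quick sanity check (a minimal sketch; the commands only print version information), you can confirm both tools are available on your PATH before continuing:

```sh
# Verify that the prerequisites are installed and on PATH
git --version
conda --version
```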
### Download the BIDS Dataset

Download the dataset from OpenNeuro, saving it to a directory called `eeg/`:

```sh
git clone https://github.com/OpenNeuroDatasets/ds005594 eeg
```

### Install Environments

Three conda environments are used to run the scripts:

* `mne`: Python environment for working with EEG data and calculating neural RDMs (conda .yml file in `env/environment-mne.yml`)
* `rdms`: Python environment for calculating model RDMs used in the RSA (conda .yml file in `env/environment-rdms.yml`)
* `r`: R environment for running all .R scripts (conda .yml file in `env/environment-r.yml`)

You can install these environments with:

```sh
conda env create --file=env/environment-mne.yml --force
conda env create --file=env/environment-rdms.yml --force
conda env create --file=env/environment-r.yml --force
```

### Setup for Reproducing Figures

You may need to run these steps if you want to use the scripts in `fig_code/` to reproduce figures.

#### Install LaTeX

Some scripts in `fig_code/` use LaTeX. We use [TeX Live](https://www.tug.org/texlive/). If TeX Live is installed, you can use `tlmgr` to install all the TeX packages that we use:

```sh
tlmgr install xcolor tex-gyre underscore etoolbox pgf
```

#### Add Fonts to `matplotlib`

You may need to add fonts to `matplotlib`. You can do this using:

```sh
conda run -n mne fig_code/mpl_setup_fonts.py
conda run -n rdms fig_code/mpl_setup_fonts.py
```

### Running Scripts

If `conda` is on your PATH, and the environments are set up as in the `.yml` files (including environment names), you can use the script `run.sh` to automatically run a script with the correct environment and interpreter, e.g.:

```sh
./run.sh 01_get_corpus_model_rdms.py
```

## General Notes on the Analysis

* All code is intended to be run with the base directory (i.e., the same directory as `run.sh`) as the working directory.
* Scripts are numbered in the order we ran them. Scripts `00` to `03` must be run strictly in order to reproduce all steps, as each depends on outputs from the previous scripts. Scripts `04` and up can then be run in any order or skipped (see the sketch after this list).
* All outputs are included in this repository, so you can run any script in isolation.
* Scripts for creating figures are in the `fig_code` directory, with the `99` prefix.
* The repository structure is listed and explained in `project_tree.md`.
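As a minimal sketch of running the ordered part of the pipeline, assuming the script filenames start with their two-digit prefix as in `01_get_corpus_model_rdms.py` (exact names may differ; see `project_tree.md`), you could loop over scripts `00` to `03` with `run.sh`:

```sh
# Run the numbered pipeline scripts 00-03 in order via run.sh.
# The glob assumes filenames begin with their two-digit prefix;
# check project_tree.md for the actual script names.
for script in 0[0-3]_*; do
    ./run.sh "$script" || exit 1  # stop if a step fails
done
```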