This repository contains code for rerunning the analyses reported in our preprint:
Taylor, J.E., Sinn, R., Iaia, C., & Fiebach, C. J. (2024). Beyond Letters: Optimal Transport as a Model for Sub-Letter Orthographic Processing. https://doi.org/10.1101/2024.11.11.622929
The easiest way to rerun these analyses without worrying about package versions is to use a Docker container. For this, you will need a Docker installation.
Set the working directory to the directory containing Dockerfile, then build an image, e.g., called lettersim:
docker image build -t lettersim .
This requires internet access. Building the image will automatically download the raw data in BIDS format from OpenNeuro (https://openneuro.org/datasets/ds005594) into the container.
Create a volume, e.g., called vol1, from which any outputs can be retrieved:
docker volume create vol1
Create a container from the Docker image to run the script you want to use, mounted to the volume you just created. For example, here we run 01_get_corpus_model_rdms.py in a container named LS1. The last part, ./run.sh 01_get_corpus_model_rdms.py, tells the container which script to run. Passing it through run.sh ensures that the correct environment and interpreter are used.
docker container run -t --mount source=vol1,target=/analysis --name=LS1 lettersim ./run.sh 01_get_corpus_model_rdms.py
Note: see the Docker documentation for available options. For example, you can use --cpus to limit the number of CPUs available to the container.
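As a minimal sketch of the --cpus option mentioned above, the function below assembles the same run command with a CPU limit and, by default, only prints it. The function name run_limited, the DRY_RUN variable, the limit of 4 CPUs, and the container name LS1_limited are illustrative assumptions and not part of the repository.

```shell
# Sketch: build the run command with a CPU limit.
# run_limited, DRY_RUN, and LS1_limited are illustrative names (assumptions).
run_limited() {
    n_cpus="$1"
    script="$2"
    set -- docker container run -t --cpus="$n_cpus" \
        --mount source=vol1,target=/analysis \
        --name=LS1_limited lettersim ./run.sh "$script"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"    # print the command for inspection only
    else
        "$@"         # actually start the container
    fi
}

run_limited 4 01_get_corpus_model_rdms.py
```

Setting DRY_RUN=0 before the call would start the container instead of printing the command.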
Alternatively, you can run scripts interactively. For example, start an interactive shell:
docker container run -it --mount source=vol1,target=/analysis --name=LSINT lettersim bash
And then run a script through run.sh:
./run.sh 01_get_corpus_model_rdms.py
To get any output files, you will need to access the mountpoint of the volume that the container is using. By default, volumes can be accessed at /var/lib/docker/volumes.
You can also find the location listed in the volume's configuration:
docker volume inspect vol1
[
    {
        "CreatedAt": "1980-1-1T00:00:01Z",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/vol1/_data",
        "Name": "vol1",
        "Options": null,
        "Scope": "local"
    }
]
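If you only need the mountpoint, or would rather copy the outputs out than browse to them, something like the following sketch may help. It assumes the volume and container names used above (vol1 and LS1). The destination directory ./lettersim_outputs is a hypothetical choice.

```shell
# Sketch: find the mountpoint and copy outputs to the host.
VOLUME="${VOLUME:-vol1}"
CONTAINER="${CONTAINER:-LS1}"
DEST="${DEST:-./lettersim_outputs}"   # hypothetical destination directory

if command -v docker >/dev/null 2>&1; then
    # Print only the Mountpoint field from the volume's configuration.
    docker volume inspect --format '{{ .Mountpoint }}' "$VOLUME" \
        || echo "Could not inspect volume $VOLUME" >&2
    # Copy /analysis from the container (running or stopped) to the host.
    docker container cp "$CONTAINER":/analysis "$DEST" \
        || echo "Could not copy from container $CONTAINER" >&2
else
    echo "docker not found on PATH" >&2
fi
```

Copying with docker container cp avoids needing root access to /var/lib/docker on the host.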
If using Docker Desktop, you can also browse the files in the Volumes tab.
The actual location used by Docker Desktop varies by Docker and WSL version. On Docker Desktop v4.34.3 on Windows 11, data in a volume called vol1 can be found at
\\wsl.localhost\docker-desktop-data\data\docker\volumes\vol1\_data
Because it is packaged as a Docker image, this analysis can also be run on high-performance computing (HPC) clusters via Singularity: https://docs.sylabs.io/guides/2.6/user-guide/singularity_and_docker.html
If you prefer not to use Docker, you can run the scripts outside of a container. These instructions assume you have git and a conda distribution installed.
Download the dataset from OpenNeuro, saving it to a directory called eeg/:
git clone https://github.com/OpenNeuroDatasets/ds005594 eeg
There are three conda environments used to run the scripts:
mne: Python environment for working with EEG data and calculating neural RDMs (conda .yml file in env/environment-mne.yml)
rdms: Python environment for calculating model RDMs used in the RSA (conda .yml file in env/environment-rdms.yml)
r: R environment for running all .R scripts (conda .yml file in env/environment-r.yml)
You can install these environments with:
conda env create --file=env/environment-mne.yml --force
conda env create --file=env/environment-rdms.yml --force
conda env create --file=env/environment-r.yml --force
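To confirm that all three environments were created, a check along these lines could be used. It is only a sketch: the helper name missing_envs is hypothetical, and it assumes that `conda env list` prints one environment per line with the name in the first column.

```shell
# Sketch: report which of the expected environments (mne, rdms, r) are missing.
# Reads `conda env list` output on stdin; missing_envs is a hypothetical helper.
missing_envs() {
    awk '
        BEGIN { want["mne"] = 1; want["rdms"] = 1; want["r"] = 1 }
        !/^#/ && ($1 in want) { delete want[$1] }   # found: no longer missing
        END { for (name in want) print name }
    ' | sort
}

if command -v conda >/dev/null 2>&1; then
    conda env list | missing_envs
fi
```

No output means that all three environment names were found.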
You may need to run these steps if you want to use scripts in fig_code/ to reproduce figures.
Some scripts in fig_code/ use LaTeX. We use TeX Live.
If TeX Live is installed, you can use tlmgr to install all the TeX packages that we use:
tlmgr install xcolor tex-gyre underscore etoolbox pgf
matplotlib
You may need to add fonts to matplotlib. You can do this using:
conda run -n mne fig_code/mpl_setup_fonts.py
conda run -n rdms fig_code/mpl_setup_fonts.py
If conda is on your path, and the environments are set up as in the .yml files (including environment names), you can use the script run.sh to automatically run a script with the correct environment and interpreter, e.g.:
./run.sh 01_get_corpus_model_rdms.py
All code is intended to be run with the base directory (i.e., the same directory as run.sh) as the working directory.
Scripts are numbered based on the order we ran them in. To reproduce all steps, scripts 00 to 03 must be run strictly in order, as each depends on outputs from the previous scripts. Scripts 04 and up can then be run in any order or skipped.
All outputs are included in this repository, so you can also run any script in isolation without first rerunning the earlier ones.
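Running scripts 00 to 03 in order could be automated with a loop like the sketch below. It assumes that these scripts sit in the base directory next to run.sh and that their filenames begin with the two-digit prefix followed by an underscore (as in 01_get_corpus_model_rdms.py). The function name run_in_order is hypothetical.

```shell
# Sketch: run scripts 00-03 in sorted order, stopping at the first failure.
# Assumes filenames such as 01_get_corpus_model_rdms.py in the current directory.
run_in_order() {
    for script in 0[0-3]_*; do
        [ -e "$script" ] || continue   # glob matched nothing
        echo "Running $script"
        ./run.sh "$script" || { echo "Failed: $script" >&2; return 1; }
    done
}

run_in_order
```

Shell globs expand in sorted order, so the numeric prefixes determine the sequence. If several scripts share a prefix, they would run in alphabetical order, which may or may not match the intended order.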
Scripts for creating figures are in the fig_code directory, with the 99 prefix.
The repository structure is listed and explained in project_tree.md