
add scripts

Jack, 9 months ago · commit dfc4e23896

+ 79 - 0
04 Analysis/BioSemi64.loc

@@ -0,0 +1,79 @@
+1	-18.0000	0.4344	Fp1.
+2	-36.0000	0.4344	AF7.
+3	-25.0000	0.3494	AF3.
+4	-22.0000	0.2361	F1..
+5	-39.0000	0.2833	F3..
+6	-49.0000	0.3542	F5..
+7	-54.0000	0.4344	F7..
+8	-72.0000	0.4344	FT7.
+9	-69.0000	0.3400	FC5.
+10	-62.0000	0.2361	FC3.
+11	-45.0000	0.1511	FC1.
+12	-90.0000	0.1086	C1..
+13	-90.0000	0.2172	C3..
+14	-90.0000	0.3258	C5..
+15	-90.0000	0.4344	T7..
+16	252.0000	0.4344	TP7.
+17	249.0000	0.3400	CP5.
+18	242.0000	0.2361	CP3.
+19	225.0000	0.1511	CP1.
+20	202.0000	0.2361	P1..
+21	219.0000	0.2833	P3..
+22	229.0000	0.3542	P5..
+23	234.0000	0.4344	P7..
+24	230.0000	0.5667	P9..
+25	216.0000	0.4344	PO7.
+26	205.0000	0.3494	PO3.
+27	198.0000	0.4344	O1..
+28	180.0000	0.5667	Iz..
+29	180.0000	0.4344	Oz..
+30	180.0000	0.3258	POz.
+31	180.0000	0.2172	Pz..
+32	180.0000	0.1086	CPz.
+33	0.0000	0.4344	Fpz.
+34	18.0000	0.4344	Fp2.
+35	36.0000	0.4344	AF8.
+36	25.0000	0.3494	AF4.
+37	0.0000	0.3258	Afz.
+38	0.0000	0.2172	Fz..
+39	22.0000	0.2361	F2..
+40	39.0000	0.2833	F4..
+41	49.0000	0.3542	F6..
+42	54.0000	0.4344	F8..
+43	72.0000	0.4344	FT8.
+44	69.0000	0.3400	FC6.
+45	62.0000	0.2361	FC4.
+46	45.0000	0.1511	FC2.
+47	0.0000	0.1086	FCz.
+48	90.0000	0.0000	Cz..
+49	90.0000	0.1086	C2..
+50	90.0000	0.2172	C4..
+51	90.0000	0.3258	C6..
+52	90.0000	0.4344	T8..
+53	108.0000	0.4344	TP8.
+54	111.0000	0.3400	CP6.
+55	118.0000	0.2361	CP4.
+56	135.0000	0.1511	CP2.
+57	158.0000	0.2361	P2..
+58	141.0000	0.2833	P4..
+59	131.0000	0.3542	P6..
+60	126.0000	0.4344	P8..
+61	130.0000	0.5667	P10.
+62	144.0000	0.4344	PO8.
+63	155.0000	0.3494	PO4.
+64	162.0000	0.4344	O2..
+65	0	0	EXG1.
+66	0	0	EXG2.
+67	0	0	EXG3.
+68	0	0	EXG4.
+69	0	0	EXG5.
+70	0	0	EXG6.
+71	0	0	EXG7.
+72	0	0	EXG8.
+73	0	0	GSR1.
+74	0	0	GSR2.
+75	0	0	Erg1.
+76	0	0	Erg2.
+77	0	0	Resp.
+78	0	0	Plet.
+79	0	0	Temp.

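Each row of the `.loc` file above follows EEGLAB's polar montage format: channel index, polar angle theta in degrees, normalised radius, and a label padded to four characters with dots. The auxiliary channels (EXG1–8, GSR, Erg, Resp, Plet, Temp) are given dummy 0/0 coordinates, since they have no scalp position. A minimal sketch for inspecting the montage, assuming EEGLAB is on the MATLAB path:

```
% read the polar .loc montage with EEGLAB's readlocs
chanlocs = readlocs('BioSemi64.loc', 'filetype', 'loc');
fprintf('%s: theta = %g deg, radius = %g\n', ...
    chanlocs(1).labels, chanlocs(1).theta, chanlocs(1).radius)
```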
+ 31 - 0
04 Analysis/README.md

@@ -0,0 +1,31 @@
+This folder contains the code to run the picture-word experiment, the raw data, and the analysis of the results.
+
+```
+.
+|-- figs: Figures generated from the code in this folder.
+|   `-- sample_level: Figures generated from sample-level mixed-effects model analyses.
+|-- localiser_sample_data: Sample-level EEG data for each participant in .csv format, for the localiser.
+|-- max_elec_data: Data for main analysis with amplitudes extracted from per-participant maximal electrodes and timepoints.
+|-- mods: brms models for behavioural and EEG data.
+|-- raw_data
+|   |-- eeg-pc: Raw data from the EEG PC in .bdf (biosemi data format)
+|   `-- stim-pc: Raw (behavioural) data from the stimulus PC, as well as code used to run the experiment.
+|-- sample_data: Sample-level EEG data for each participant in .csv format, for the picture-word task.
+|-- analyse_01_preprocess_localiser.m: MATLAB script to preprocess the data from the localiser with EEGLAB.
+|-- analyse_02_preprocess_pictureword.m: MATLAB script to preprocess the data from the picture-word task with EEGLAB.
+|-- analyse_03_main_analysis.R: Script for running the main analysis.
+|-- analyse_04_other_maximal_electrode.R: Script for an exploratory analysis using the word-versus-noise maximal electrode.
+|-- analyse_05_roi_avg.R: Script for an exploratory analysis using the region of interest average.
+|-- analyse_06_localiser_results_roi.R: Sample-level analysis of localiser EEG results.
+|-- analyse_07_sample_picture_word_roi.R: Sample-level analysis of the picture-word task.
+|-- analyse_08_pictureword_rt.R: Response time analysis of the picture-word task.
+|-- analyse_09_pictureword_acc.R: Accuracy analysis of the picture-word task.
+|-- analyse_10_localiser_rt_acc.R: Response time and accuracy analyses of localiser.
+|-- analyse_11_preprocess_pictureword_picture.m: Preprocessing to get epochs time-locked to picture presentation in the picture-word task.
+|-- analyse_12_preprocess_pictureword_response.m: Preprocessing to get epochs time-locked to responses in the picture-word task.
+|-- analyse_13_sample_picture_word_topo.R: Script for fitting per-timepoint, per-channel mixed-effects models.
+|-- analyse_14_plot_sample_picture_word_topo.R: Script for plotting results from the models estimated by analyse_13.
+|-- BioSemi64.loc: The 64-channel BioSemi montage used.
+|-- boss.csv: Useful variables from the BOSS norms.
+`-- max_elecs.csv: Per-participant maximal electrodes and time-points, calculated in analyse_01_preprocess_localiser.m
+```

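The numbered scripts are meant to run in order; in particular, analyse_02 reads max_elecs.csv, which analyse_01 writes. A minimal sketch of the two preprocessing stages, assuming EEGLAB and FastICA are on the MATLAB path as each script's header notes:

```
run('analyse_01_preprocess_localiser.m')    % writes max_elecs.csv and localiser_sample_data/*.csv
run('analyse_02_preprocess_pictureword.m')  % writes max_elec_data/*.csv and sample_data/*.csv
```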
+ 669 - 0
04 Analysis/analyse_01_preprocess_localiser.m

@@ -0,0 +1,669 @@
+%% Setup
+
+% paths to data
+eeg_path = fullfile('raw_data', 'eeg-pc', 'localiser');
+beh_path = fullfile('raw_data', 'stim-pc', 'data', 'localiser');
+
+% import eeglab (assumes eeglab has been added to path), e.g.
+addpath('C:/EEGLAB/eeglab2020_0')
+[ALLEEG, EEG, CURRENTSET, ALLCOM] = eeglab;
+
+% As this uses fastica algorithm for ICA, FastICA needs to be on the path, e.g.
+addpath('C:/EEGLAB/FastICA_25')
+
+% region of interest for finding maximal electrodes
+roi = {'TP7', 'CP5', 'P7', 'P5', 'P9', 'PO7', 'PO3', 'O1'};
+
+% cutoff probability for identifying eye and muscle related ICA components with ICLabel
+icl_cutoff = 0.85;
+
+% sigma parameter for ASR
+asr_sigma = 20;
+
+%% Clear Output Folders
+
+delete(fullfile('localiser_sample_data', '*.csv'))
+
+%% Import lab book
+
+% handle commas in vectors
+lab_book_file = fullfile('raw_data', 'stim-pc', 'participants.csv');
+lab_book_raw_dat = fileread(lab_book_file);
+
+[regstart, regend] = regexp(lab_book_raw_dat, '\[.*?\]');
+
+for regmatch_i = 1:numel(regstart)
+    str_i = lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i));
+    str_i(str_i==',') = '.';
+    lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i)) = str_i;
+end
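+% e.g. a bracketed lab-book entry like "['P3','PO7']" becomes "['P3'.'PO7']",
+% so commas inside brackets no longer break the comma-delimited parse; the
+% periods are mapped back to commas wherever the entry is eval'd below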
+
+lab_book_fixed_file = fullfile('raw_data', 'stim-pc', 'participants_tmp.csv');
+lab_book_fixed_conn = fopen(lab_book_fixed_file, 'w');
+fprintf(lab_book_fixed_conn, lab_book_raw_dat);
+fclose(lab_book_fixed_conn);
+
+lab_book_readopts = detectImportOptions(lab_book_fixed_file, 'VariableNamesLine', 1, 'Delimiter', ',');
+% read subject ids as class character
+lab_book_readopts.VariableTypes{strcmp(lab_book_readopts.SelectedVariableNames, 'subj_id')} = 'char';
+lab_book = readtable(lab_book_fixed_file, lab_book_readopts);
+
+delete(lab_book_fixed_file)
+
+%% Count the total number of excluded electrodes
+
+n_bads = 0;
+n_bads_per_s = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    bad_channels = eval(strrep(strrep(strrep(lab_book.loc_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    n_bads_per_s(subject_nr) = numel(bad_channels);
+    n_bads = n_bads + numel(bad_channels);
+end
+
+perc_bads = n_bads / (64 * size(lab_book, 1)) * 100;
+
+%% Set up results table
+
+max_elec_columns = {'subject_id',...
+    'max_elec_bacs', 'max_time_bacs', 'max_diff_bacs',...
+    'max_elec_noise', 'max_time_noise', 'max_diff_noise'};
+empty_tablecells = cell(size(lab_book, 1), numel(max_elec_columns));
+max_elecs = cell2table(empty_tablecells);
+max_elecs.Properties.VariableNames = max_elec_columns;
+
+%% Iterate over subjects
+
+% record trial exclusions
+total_excl_trials_incorr = zeros(1, size(lab_book, 1));
+total_excl_trials_rt = zeros(1, size(lab_book, 1));
+
+n_bad_ica = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    
+    subject_id = lab_book.subj_id{subject_nr};
+    fprintf('\n\n Subject Iteration %g/%g, ID: %s\n', subject_nr, size(lab_book, 1), subject_id)
+    
+    %% get subject-specific info from lab book
+    exclude = lab_book.exclude(subject_nr);
+    bad_channels = eval(strrep(strrep(strrep(lab_book.loc_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    bad_channels_pictureword = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    bad_trigger_indices = eval(strrep(lab_book.loc_bad_trigger_indices{subject_nr}, '.', ','));
+
+    % add PO4, which seems consistently noisy even when not marked as bad, to the bad channels
+    if sum(strcmp('PO4', bad_channels))==0
+        bad_channels(numel(bad_channels)+1) = {'PO4'};
+    end
+    
+    %% abort if excluded
+    
+    if exclude
+        % exclusion is not expected here, but editing the participants.csv
+        % file gives other researchers an easy way to see the effect of
+        % excluding specific participants
+        fprintf('Subject %s excluded. Preprocessing aborted.\n', subject_id)
+        fprintf('Lab book note: %s\n', lab_book.note{subject_nr})
+        continue
+    end
+    
+    if (numel(bad_channels) >= 10) || (numel(bad_channels_pictureword) >= 10)
+        fprintf('Subject %s excluded as >=10 electrodes marked as bad in either task. Preprocessing aborted.\n', subject_id)
+        fprintf('Lab book note: %s\n', lab_book.note{subject_nr})
+        continue
+    end
+    
+    %% load participant's data
+    
+    % load raw eeg
+    raw_datapath = fullfile(eeg_path, append(subject_id, '.bdf'));
+    
+    % abort if no EEG data collected yet
+    if ~isfile(raw_datapath)
+        fprintf('Subject %s skipped: no EEG data found\n', subject_id)
+        continue
+    end
+    
+    EEG = pop_biosig(raw_datapath, 'importevent', 'on', 'rmeventchan', 'off');
+    
+    % load behavioural
+    all_beh_files = dir(beh_path);
+    beh_regex_matches = regexpi({all_beh_files.name}, append('^', subject_id, '_.+\.csv$'), 'match');
+    regex_emptymask = cellfun('isempty', beh_regex_matches);
+    beh_regex_matches(regex_emptymask) = [];
+    subj_beh_files = cellfun(@(x) x{:}, beh_regex_matches, 'UniformOutput', false);
+    
+    if numel(subj_beh_files)>1
+        fprintf('%g behavioural files found?\n', numel(subj_beh_files))
+        break
+    end
+    
+    beh_datapath = fullfile(beh_path, subj_beh_files{1});
+    beh = readtable(beh_datapath);
+    
+    %% Set data features
+    
+    % set channel locations
+    
+    orig_locs = EEG.chanlocs;
+    EEG.chanlocs = pop_chanedit(EEG.chanlocs, 'load', {'BioSemi64.loc', 'filetype', 'loc'});  % loaded order doesn't yet match the data rows; data reordered below
+    
+    % set channel types
+    for ch_nr = 1:64
+        EEG.chanlocs(ch_nr).type = 'EEG';
+    end
+    
+    for ch_nr = 65:72
+        EEG.chanlocs(ch_nr).type = 'EOG';
+    end
+    
+    for ch_nr = 73:79
+        EEG.chanlocs(ch_nr).type = 'MISC';
+    end
+    
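+    % external channels (EOG/MISC) have no scalp position; their coordinate
+    % fields are emptied so topographic plotting and interpolation ignore them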
+    for ch_nr = 65:79
+        EEG.chanlocs(ch_nr).theta = [];
+        EEG.chanlocs(ch_nr).radius = [];
+        EEG.chanlocs(ch_nr).sph_theta = [];
+        EEG.chanlocs(ch_nr).sph_phi = [];
+        EEG.chanlocs(ch_nr).X = [];
+        EEG.chanlocs(ch_nr).Y = [];
+        EEG.chanlocs(ch_nr).Z = [];
+    end
+    
+    % change the order of channels in EEG.data to match the new order in chanlocs
+    data_reordered = EEG.data;
+    for ch_nr = 1:64        
+        % make sure the new eeg data array matches the listed order
+        ch_lab = EEG.chanlocs(ch_nr).labels;
+        orig_locs_idx = find(strcmp(lower({orig_locs.labels}), lower(ch_lab)));
+        data_reordered(ch_nr, :) = EEG.data(orig_locs_idx, :);
+    end
+    EEG.data = data_reordered;
+    
+    % remove unused channels
+    EEG = pop_select(EEG, 'nochannel', 69:79);
+    
+    % plot the ROI for the paper
+    if strcmp(subject_id, '1')
+
+        roi_fig = figure;
+        roi_idx = find(ismember({EEG.chanlocs.labels}, roi));
+
+        hold on;
+        topoplot(zeros(64, 0), EEG.chanlocs, 'electrodes', 'off');
+        % set line width
+        set(findall(gca, 'Type', 'Line'), 'LineWidth', 1);
+        for i = 1:64
+            if ismember(i, roi_idx)
+                markcol = [1, 0, 0];
+            else
+                markcol = [0.75, 0.75, 0.75];
+            end
+            topoplot(zeros(64, 0), EEG.chanlocs, 'colormap', [0,0,0], 'emarker', {'.', markcol, 15, 1}, 'plotchans', i, 'headrad', 0);
+        end
+        hold off
+
+        set(roi_fig, 'Units', 'Inches', 'Position', [0, 0, 1.5, 1.5], 'PaperUnits', 'Inches', 'PaperSize', [1.5, 1.5])
+        exportgraphics(roi_fig, 'figs/roi_channels.pdf', 'BackgroundColor','none')
+
+        close all
+
+    end
+
+    % remove bad channels
+    ur_chanlocs = EEG.chanlocs;  % store a copy of the full channel locations before removing (for later interpolation)
+    bad_channels_indices = find(ismember(lower({EEG.chanlocs.labels}), lower(bad_channels)));
+    EEG = pop_select(EEG, 'nochannel', bad_channels_indices);
+    
+    %% Identify events (trials)
+    
+    % make the sopen function happy
+    x = fileparts( which('sopen') );
+    rmpath(x);
+    addpath(x,'-begin');
+    
+    % build the events manually from the raw eeg file (pop_biosig removes event offsets)
+    % NB: this assumes no resampling between reading the BDF file and now
+    bdf_dat = sopen(raw_datapath, 'r', [0, Inf], 'OVERFLOWDETECTION:OFF');
+    event_types = bdf_dat.BDF.Trigger.TYP;
+    event_pos = bdf_dat.BDF.Trigger.POS;
+    event_time = EEG.times(event_pos);
+    sclose(bdf_dat);
+    clear bdf_dat;
+    
+    triggers = struct(...
+        'off', 0,...
+        'word', 101,...
+        'bacs', 102,...
+        'noise', 103,...
+        'practice', 25);
+    
+    % add 61440 to each trigger value (because of number of bits in pp)
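+    % (61440 = 0xF000: the four high bits of the 16-bit trigger word, which
+    % read as set on this system, offset every recorded parallel-port value)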
+    trigger_labels = fieldnames(triggers);
+    for field_nr = 1:numel(trigger_labels)
+        triggers.(trigger_labels{field_nr}) = triggers.(trigger_labels{field_nr}) + 61440;
+    end
+        
+    % remove the first trigger if it is at time 0 and has a value which isn't a recognised trigger
+    if (event_time(1)==0 && ~ismember(event_types(1), [triggers.off, triggers.word, triggers.bacs, triggers.noise, triggers.practice]))
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % remove the new first trigger if it has a value of off
+    if (event_types(1)==triggers.off)
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % check every second trigger is an offset
+    offset_locs = find(event_types==triggers.off);
+    if any(offset_locs' ~= 2:2:numel(event_types))
+        fprintf('Expected each second trigger to be an off?')
+        break
+    end
+    
+    % check every first trigger is non-zero
+    onset_locs = find(event_types~=triggers.off);
+    if any(onset_locs' ~= 1:2:numel(event_types))
+        fprintf('Expected each first trigger to be an event?')
+        break
+    end
+    
+    % create the events struct manually    
+    events_onset_types = event_types(onset_locs);
+    events_onsets = event_pos(onset_locs);
+    events_offsets = event_pos(offset_locs);
+    events_durations = events_offsets - events_onsets;
+    
+    EEG.event = struct();
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).type = events_onset_types(event_nr);
+        EEG.event(event_nr).latency = events_onsets(event_nr);
+        EEG.event(event_nr).offset = events_offsets(event_nr);
+        EEG.event(event_nr).duration = events_durations(event_nr);
+    end
+    
+    % copy the details over to urevent
+    EEG.urevent = EEG.event;
+    
+    % record the urevent in event, for reference if they change
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).urevent = event_nr;
+    end
+    
+    % remove bad events recorded in lab book (misfired triggers)
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.urevent], bad_trigger_indices)));
+    
+    % remove practice trials
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], triggers.practice)));
+    
+    % check the events make sense
+    if sum(~ismember([EEG.event.type], [triggers.word, triggers.bacs, triggers.noise])) > 0
+        fprintf('Unexpected trial types?\n')
+        break
+    end
+    
+    if numel({EEG.event.type})~=300
+        fprintf('%g trial triggers detected (expected 300)?\n',  numel({EEG.event.type}))
+        break
+    end
+    
+    if sum(ismember([EEG.event.type], [triggers.word])) ~= sum(ismember([EEG.event.type], [triggers.bacs]))
+        fprintf('Unequal number of word and BACS trials?\n')
+        break
+    end
+    
+    if sum(ismember([EEG.event.type], [triggers.word])) ~= sum(ismember([EEG.event.type], [triggers.noise]))
+        fprintf('Unequal number of word and noise trials?\n')
+        break
+    end
+    
+    % add the trials' onsets, offsets, durations, and triggers to the behavioural data
+    beh.event = zeros(size(beh, 1), 1);
+    beh.latency = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        cond_i = beh.condition(row_nr);
+        beh.event(row_nr) = triggers.(cond_i{:});  % look up the trial's expected trigger
+        beh.latency(row_nr) = EEG.event(row_nr).latency;
+        beh.offset(row_nr) = EEG.event(row_nr).offset;
+        beh.duration(row_nr) = EEG.event(row_nr).duration;
+        beh.duration_ms(row_nr) = (EEG.event(row_nr).duration * 1000/EEG.srate) - 500;  % minus 500 as event timer starts at word presentation, but rt timer starts once word turns green
+    end
+    
+    % check events expected in beh are same as those in the events struct
+    if any(beh.event' ~= [EEG.event.type])
+        fprintf('%g mismatches between behavioural data and triggers?\n', sum(beh.event' ~= [EEG.event.type]))
+        break
+    end
+    
+    % check the difference between the durations and the response times (should be very small)
+    % hist(beh.rt - beh.duration_ms, 100)
+    
+    % record trial numbers in EEG.event
+    for row_nr = 1:size(beh, 1)
+        EEG.event(row_nr).trl_nr = beh.trl_nr(row_nr);
+    end
+    
+    %% Remove segments of data that fall outside of blocks
+    
+    % record block starts
+    beh.is_block_start(1) = 1;
+    for row_nr = 2:size(beh, 1)
+        beh.is_block_start(row_nr) = beh.block_nr(row_nr) - beh.block_nr(row_nr-1) == 1;
+    end
+    % record block ends
+    beh.is_block_end(size(beh, 1)) = 1;
+    for row_nr = 1:(size(beh, 1)-1)
+        beh.is_block_end(row_nr) = beh.block_nr(row_nr+1) - beh.block_nr(row_nr) == 1;
+    end
+    % record block boundaries (first start and last end point of each block, with 0.75 seconds buffer)
+    beh.block_boundary = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        if beh.is_block_start(row_nr)
+            beh.block_boundary(row_nr) = beh.latency(row_nr) - (EEG.srate * 0.75);
+        elseif beh.is_block_end(row_nr)
+            beh.block_boundary(row_nr) = beh.offset(row_nr) + (EEG.srate * 0.75);
+        end
+    end
+    
+    % get the boundary indices in required format (start1, end1; start2, end2; start3, end3)
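+    % (column-major reshape pairs consecutive nonzero entries: [s1; e1; s2; e2] -> [s1, e1; s2, e2])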
+    block_boundaries = reshape(beh.block_boundary(beh.block_boundary~=0), 2, [])';
+    
+    % remove anything outside of blocks
+    EEG = pop_select(EEG, 'time', (block_boundaries / EEG.srate));
+    
+    %% Trial selection
+    
+    % include only correct responses
+    beh_filt_acc_only = beh(beh.acc==1, :);
+    excl_trials_incorr = size(beh, 1)-size(beh_filt_acc_only, 1);
+    total_excl_trials_incorr(subject_nr) = excl_trials_incorr;
+    fprintf('Lost %g trials to incorrect responses\n', excl_trials_incorr)
+    
+    % include only responses faster than 1500 ms
+    beh_filt = beh_filt_acc_only(beh_filt_acc_only.rt<=1500, :);
+    excl_trials_rt = size(beh_filt_acc_only, 1)-size(beh_filt, 1);
+    total_excl_trials_rt(subject_nr) = excl_trials_rt;
+    fprintf('Lost %g trials to RTs above 1500\n', excl_trials_rt)
+    
+    fprintf('Lost %g trials in total to behavioural data\n', size(beh, 1)-size(beh_filt, 1))
+    
+    % filter the events structure
+    discarded_trls = beh.trl_nr(~ismember(beh.trl_nr, beh_filt.trl_nr));
+    discarded_events_indices = [];  % (collect in a for loop, as [EEG.event.trl_nr] would remove missing data)
+    for event_nr = 1:size(EEG.event, 2)
+        if ismember(EEG.event(event_nr).trl_nr, discarded_trls)
+            discarded_events_indices = [discarded_events_indices, event_nr];
+        end
+    end
+    EEG = pop_editeventvals(EEG, 'delete', discarded_events_indices);
+    
+    % check the discarded trials are the expected length
+    if numel(discarded_trls) ~= size(beh, 1)-size(beh_filt, 1)
+        fprintf('Mismatch between behavioural data and EEG events in the number of trials to discard?')
+        break
+    end
+    
+    % check the sizes match
+    if numel([EEG.event.trl_nr]) ~= size(beh_filt, 1)
+        fprintf('Inconsistent numbers of trials between events structure and behavioural data after discarding trials?')
+        break
+    end
+    
+    % check the trl numbers match
+    if any([EEG.event.trl_nr]' ~= beh_filt.trl_nr)
+        fprintf('Trial IDs mismatch between events structure and behavioural data after discarding trials?\n')
+        break
+    end
+    
+    %% Rereference, downsample, and filter
+    
+    % rereference
+    EEG = pop_reref(EEG, []);
+    
+    % downsample if necessary
+    if EEG.srate ~= 512
+        EEG = pop_resample(EEG, 512);
+    end
+    
+    % filter
+    % EEG = eeglab_butterworth(EEG, 0.5, 40, 4, 1:size(EEG.chanlocs, 2));  % preregistered filter
+    EEG = eeglab_butterworth(EEG, 0.1, 40, 4, 1:size(EEG.chanlocs, 2));  % filter with lower highpass
+
+    %% ICA
+    
+    % apply ASR
+    %EEG_no_asr = EEG;
+    EEG = clean_asr(EEG, asr_sigma, [], [], [], [], [], [], [], [], 1024);  % The last number is available memory in mb, needed for reproducibility
+
+    rng(3101)  % set seed for reproducibility
+    EEG = pop_runica(EEG, 'icatype', 'fastica', 'approach', 'symm');
+
+    % classify components with ICLabel
+    EEG = iclabel(EEG);
+
+    % store results for easy indexing
+    icl_res = EEG.etc.ic_classification.ICLabel.classifications;
+    icl_classes = EEG.etc.ic_classification.ICLabel.classes;
+    
+    % identify and remove artefact components
+    artefact_comps = find(icl_res(:, strcmp(icl_classes, 'Eye')) >= icl_cutoff | icl_res(:, strcmp(icl_classes, 'Muscle')) >= icl_cutoff);
+    fprintf('Removing %g artefact-related ICA components\n', numel(artefact_comps))
+    n_bad_ica(subject_nr) = numel(artefact_comps);
+    %EEG_no_iclabel = EEG;
+    EEG = pop_subcomp(EEG, artefact_comps);
+            
+    %% Interpolate bad channels
+    
+    % give the original chanlocs structure so EEGLAB interpolates the missing electrode(s)
+    if numel(bad_channels)>0
+        EEG = pop_interp(EEG, ur_chanlocs);
+    end
+    
+    %% Epoch the data
+    
+    % identify and separate into epochs
+    EEG_epo = struct();
+    EEG_epo.word = pop_epoch(EEG, {triggers.word}, [-0.25, 1]);
+    EEG_epo.bacs = pop_epoch(EEG, {triggers.bacs}, [-0.25, 1]);
+    EEG_epo.noise = pop_epoch(EEG, {triggers.noise}, [-0.25, 1]);
+    
+    % remove baseline
+    EEG_epo.word = pop_rmbase(EEG_epo.word, [-200, 0]);
+    EEG_epo.bacs = pop_rmbase(EEG_epo.bacs, [-200, 0]);
+    EEG_epo.noise = pop_rmbase(EEG_epo.noise, [-200, 0]);
+    
+    % check times vectors are identical
+    if ~isequal(EEG_epo.word.times, EEG_epo.bacs.times, EEG_epo.noise.times)
+        fprintf('The times vectors in the epoch structures are not identical!')
+        break
+    end
+    
+    %% Get the maximal electrode
+    % (word vs. BACS for the main analysis, but word vs. noise is also found)
+    
+    fprintf('Getting maximal electrodes...\n')
+    
+    % get channel means for each condition
+    ch_avg = struct();
+    ch_avg.word = mean(EEG_epo.word.data, 3);
+    ch_avg.bacs = mean(EEG_epo.bacs.data, 3);
+    ch_avg.noise = mean(EEG_epo.noise.data, 3);
+    
+    % get index of time window
+    targ_window = [120, 200];
+    targ_window_idx = EEG_epo.word.times >= targ_window(1) & EEG_epo.word.times <= targ_window(2);
+    
+    % get index of roi channels
+    eeg_chan_idx = ismember({EEG.chanlocs.labels}, roi);  % EEG chanlocs same as chanlocs in ch_avg structs as they're copied over
+    
+    % store vectors of times and channels in ch_avg
+    ch_avg.times = EEG_epo.word.times(targ_window_idx);  % taken from word condition but identical across conditions
+    ch_avg.chanlocs = EEG.chanlocs(eeg_chan_idx);
+    
+    % get only roi electrode data in target window
+    ch_avg.word = ch_avg.word(eeg_chan_idx, targ_window_idx);
+    ch_avg.bacs = ch_avg.bacs(eeg_chan_idx, targ_window_idx);
+    ch_avg.noise = ch_avg.noise(eeg_chan_idx, targ_window_idx);
+    
+    % get differences of interest
+    % - directional, so find max of these
+    ch_avg.diff_word_bacs = ch_avg.bacs - ch_avg.word;
+    ch_avg.diff_word_noise = ch_avg.noise - ch_avg.word;
+    
+    % find the maximum difference indices
+    mean_bacs_diff_perchan = mean(ch_avg.diff_word_bacs, 2);
+    max_bacs_ch_idx = mean_bacs_diff_perchan == max(mean_bacs_diff_perchan);
+    
+    mean_noise_diff_perchan = mean(ch_avg.diff_word_noise, 2);
+    max_noise_ch_idx = mean_noise_diff_perchan == max(mean_noise_diff_perchan);
+    
+    % if multiple channels have an equal mean difference, select one randomly (but reproducibly)
+    if sum(max_bacs_ch_idx) > 1
+        rng(42 + subject_nr)
+        perm_idx = randperm(sum(max_bacs_ch_idx));
+        maxes_idx = find(max_bacs_ch_idx);
+        max_bacs_ch_idx = maxes_idx(perm_idx(numel(maxes_idx)));
+    end
+    
+    if sum(max_noise_ch_idx) > 1
+        rng(42 + subject_nr)
+        perm_idx = randperm(sum(max_noise_ch_idx));
+        maxes_idx = find(max_noise_ch_idx);
+        max_noise_ch_idx = maxes_idx(perm_idx(numel(maxes_idx)));
+    end
+    
+    % get the channel names
+    chan_names = {ch_avg.chanlocs.labels};
+    max_chan_bacs = chan_names{max_bacs_ch_idx};
+    max_chan_noise = chan_names{max_noise_ch_idx};
+    
+    % get the timepoint and value of the maximum difference for the max channels
+    [max_chan_bacs_peak_diff, max_chan_bacs_peak_diff_idx] = max(ch_avg.diff_word_bacs(max_bacs_ch_idx, :));
+    max_chan_bacs_peak_diff_signed = ch_avg.diff_word_bacs(max_bacs_ch_idx, max_chan_bacs_peak_diff_idx);
+    max_chan_bacs_peak_time = ch_avg.times(max_chan_bacs_peak_diff_idx);
+    
+    [max_chan_noise_peak_diff, max_chan_noise_peak_diff_idx] = max(ch_avg.diff_word_noise(max_noise_ch_idx, :));
+    max_chan_noise_peak_diff_signed = ch_avg.diff_word_noise(max_noise_ch_idx, max_chan_noise_peak_diff_idx);
+    max_chan_noise_peak_time = ch_avg.times(max_chan_noise_peak_diff_idx);
+    
+    % store the values in the table
+    max_elecs.subject_id(subject_nr) = {subject_id};
+    
+    max_elecs.max_elec_bacs(subject_nr) = {max_chan_bacs};
+    max_elecs.max_time_bacs(subject_nr) = {max_chan_bacs_peak_time};
+    max_elecs.max_diff_bacs(subject_nr) = {max_chan_bacs_peak_diff_signed};
+    
+    max_elecs.max_elec_noise(subject_nr) = {max_chan_noise};
+    max_elecs.max_time_noise(subject_nr) = {max_chan_noise_peak_time};
+    max_elecs.max_diff_noise(subject_nr) = {max_chan_noise_peak_diff_signed};
+    
+    %% Save sample-level data for all electrodes
+    
+    disp('Getting sample-level localiser results...')
+    
+    % resample to 256 Hz
+    EEG_256 = pop_resample(EEG, 256);
+    
+    % get epochs of low-srate data
+    EEG_epo_256 = pop_epoch(EEG_256, {triggers.word, triggers.bacs, triggers.noise}, [-0.25, 0.5]);
+    % remove baseline
+    EEG_epo_256 = pop_rmbase(EEG_epo_256, [-200, 0]);
+    
+    % pre-allocate the table
+    var_names = {'subj_id', 'stim_grp', 'resp_grp', 'trl_nr', 'ch_name', 'time', 'uV'};
+    var_types = {'string', 'string', 'string', 'double', 'string', 'double', 'double'};
+    nrows = 64 * size(EEG_epo_256.times, 2) * size(beh_filt, 1);
+    sample_res = table('Size',[nrows, numel(var_names)], 'VariableTypes',var_types, 'VariableNames',var_names);
+    
+    sample_res.subj_id = repmat(beh_filt.subj_id, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.stim_grp = repmat(beh_filt.stim_grp, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.resp_grp = repmat(beh_filt.resp_grp, 64*size(EEG_epo_256.times, 2), 1);
+    
+    % get the 64 channel eeg data as an array
+    eeg_arr = EEG_epo_256.data(1:64, :, :);
+    
+    % a vector of all eeg data
+    eeg_vec = squeeze(reshape(eeg_arr, 1, 1, []));
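+    % (MATLAB flattens column-major: channel varies fastest, then time, then
+    % trial; the channel, time, and trial vectors below are expanded in the
+    % same way so that all columns stay aligned)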
+    
+    % array and vector of the channel labels for each value in EEG.data
+    channel_labels_arr = cell(size(eeg_arr));
+    channel_label_lookup = {EEG_epo_256.chanlocs.labels};
+    for chan_nr = 1:size(eeg_arr, 1)
+        channel_labels_arr(chan_nr, :, :) = repmat(channel_label_lookup(chan_nr), size(channel_labels_arr, 2), size(channel_labels_arr, 3));
+    end
+    
+    channel_labels_vec = squeeze(reshape(channel_labels_arr, 1, 1, []));
+    
+    % array and vector of the timepoints for each value in EEG.data
+    times_arr = zeros(size(eeg_arr));
+    times_lookup = EEG_epo_256.times;
+    for time_idx = 1:size(eeg_arr, 2)
+        times_arr(:, time_idx, :) = repmat(times_lookup(time_idx), size(times_arr, 1), size(times_arr, 3));
+    end
+    
+    times_vec = squeeze(reshape(times_arr, 1, 1, []));
+    
+    % array and vector of the trial numbers
+    trials_arr = zeros(size(eeg_arr));
+    trials_lookup = beh_filt.trl_nr;
+    for trl_idx = 1:size(eeg_arr, 3)
+        trials_arr(:, :, trl_idx) = repmat(trials_lookup(trl_idx), size(trials_arr, 1), size(trials_arr, 2));
+    end
+    
+    trials_vec = squeeze(reshape(trials_arr, 1, 1, []));
+        
+    % store sample-level results in the table
+    sample_res.ch_name = channel_labels_vec;
+    sample_res.trl_nr = trials_vec;
+    sample_res.time = times_vec;
+    sample_res.uV = eeg_vec;
+    
+    % look up and store some info about the trials
+    trial_info_lookup = beh_filt(:, {'trl_nr', 'condition', 'string', 'item_nr'});
+    sample_res = outerjoin(sample_res, trial_info_lookup, 'MergeKeys', true);
+    
+    % sort by time, channel, trl_nr
+    sample_res = sortrows(sample_res, {'time', 'ch_name', 'trl_nr'});
+    
+    % Save the sample-level results
+    disp('Saving sample-level localiser results...')
+    writetable(sample_res, fullfile('localiser_sample_data', [subject_id, '.csv']));
+    
+end
+
+%% save the results
+
+fprintf('\nSaving results...\n')
+writetable(max_elecs, 'max_elecs.csv');
+
+fprintf('Finished preprocessing localiser data!\n')
+
+%% Functions
+
+% custom function for applying a Butterworth filter to EEGLAB data
+function EEG = eeglab_butterworth(EEG, low, high, order, chanind)
+    fprintf('Applying Butterworth filter between %g and %g Hz (order of %g)\n', low, high, order)
+    % create filter
+    [b, a] = butter(order, [low, high]/(EEG.srate/2));
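+    % (cutoffs are normalised by the Nyquist frequency, EEG.srate/2; filtfilt
+    % runs the filter forwards and backwards, giving zero phase at double the
+    % effective order)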
+    % apply to data (requires transposition for filtfilt)
+    data_trans = single(filtfilt(b, a, double(EEG.data(chanind, :)')));
+    EEG.data(chanind, :) = data_trans';
+end
+
+% custom function for finding the closest timepoint in an EEG dataset
+function [idx, closesttime] = eeglab_closest_time(EEG, time)
+    dists = abs(EEG.times - time);
+    idx = find(dists == min(dists));
+    % in the unlikely case there are two equidistant times, select one randomly
+    if numel(idx) > 1
+        fprintf('Two equidistant times! Selecting one randomly.\n')
+        idx = idx(randperm(numel(idx)));
+        idx = idx(1);
+    end
+    closesttime = EEG.times(idx);
+end

+ 638 - 0
04 Analysis/analyse_02_preprocess_pictureword.m

@@ -0,0 +1,638 @@
+%% Setup
+
+% paths to data
+eeg_path = fullfile('raw_data', 'eeg-pc', 'pictureword');
+beh_path = fullfile('raw_data', 'stim-pc', 'data', 'pictureword');
+
+% import eeglab (assumes eeglab has been added to path), e.g.
+addpath('C:/EEGLAB/eeglab2020_0')
+[ALLEEG, EEG, CURRENTSET, ALLCOM] = eeglab;
+
+% This script uses fastica algorithm for ICA, so FastICA needs to be on the path, e.g.
+addpath('C:/EEGLAB/FastICA_25')
+
+% region of interest for trial-level ROI average
+roi = {'TP7', 'CP5', 'P7', 'P5', 'P9', 'PO7', 'PO3', 'O1'};
+
+% cutoff probability for identifying eye and muscle related ICA components with ICLabel
+icl_cutoff = 0.85;
+
+% sigma parameter for ASR
+asr_sigma = 20;
+
+%% Clear output folders
+
+delete(fullfile('max_elec_data', '*.csv'))
+delete(fullfile('sample_data', '*.csv'))
+
+%% Import lab book
+
+% handle commas in vectors
+lab_book_file = fullfile('raw_data', 'stim-pc', 'participants.csv');
+lab_book_raw_dat = fileread(lab_book_file);
+
+[regstart, regend] = regexp(lab_book_raw_dat, '\[.*?\]');
+
+for regmatch_i = 1:numel(regstart)
+    str_i = lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i));
+    str_i(str_i==',') = '.';
+    lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i)) = str_i;
+end
+
+lab_book_fixed_file = fullfile('raw_data', 'stim-pc', 'participants_tmp.csv');
+lab_book_fixed_conn = fopen(lab_book_fixed_file, 'w');
+fprintf(lab_book_fixed_conn, lab_book_raw_dat);
+fclose(lab_book_fixed_conn);
+
+lab_book_readopts = detectImportOptions(lab_book_fixed_file, 'VariableNamesLine', 1, 'Delimiter', ',');
+% read subject ids as class character
+lab_book_readopts.VariableTypes{strcmp(lab_book_readopts.SelectedVariableNames, 'subj_id')} = 'char';
+lab_book = readtable(lab_book_fixed_file, lab_book_readopts);
+
+delete(lab_book_fixed_file)
+
+%% Count the total number of excluded electrodes
+
+n_bads = 0;
+n_bads_per_s = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    bad_channels = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    n_bads_per_s(subject_nr) = numel(bad_channels);
+    n_bads = n_bads + numel(bad_channels);
+end
+
+perc_bads = n_bads / (64 * size(lab_book, 1)) * 100;
+
+%% Import max electrode info
+
+% this contains participants' maximal electrodes for the N170 from the
+% localisation task
+max_elecs = readtable('max_elecs.csv');
+
+%% Iterate over subjects
+
+% record trial exclusions
+total_excl_trials_incorr = zeros(1, size(lab_book, 1));
+total_excl_trials_rt = zeros(1, size(lab_book, 1));
+
+n_bad_ica = zeros(size(lab_book, 1), 1);
+
+% table that will contain all sample-level tables
+all_sample_level = table();
+
+for subject_nr = 1:size(lab_book, 1)
+    
+    subject_id = lab_book.subj_id{subject_nr};
+    fprintf('\n\n Subject Iteration %g/%g, ID: %s\n', subject_nr, size(lab_book, 1), subject_id)
+    
+    %% get subject-specific info from lab book
+    exclude = lab_book.exclude(subject_nr);
+    bad_channels = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    bad_trigger_indices = eval(strrep(lab_book.pw_bad_trigger_indices{subject_nr}, '.', ','));
+    
+    % add PO4, which seems consistently noisy even when not marked as bad, to the bad channels
+    if sum(strcmp('PO4', bad_channels))==0
+        bad_channels(numel(bad_channels)+1) = {'PO4'};
+    end
+
+    %% abort if excluded
+    
+    if exclude
+        fprintf('Subject %s excluded. Preprocessing aborted.\n', subject_id)
+        fprintf('Lab book note: %s\n', lab_book.note{subject_nr})
+        continue
+    end
+    
+    %% load participant's data
+    
+    % load raw eeg
+    raw_datapath = fullfile(eeg_path, append(subject_id, '.bdf'));
+    
+    % abort if no EEG data collected yet
+    if ~isfile(raw_datapath)
+        fprintf('Subject %s skipped: no EEG data found\n', subject_id)
+        continue
+    end
+    
+    EEG = pop_biosig(raw_datapath, 'importevent', 'on', 'rmeventchan', 'off');
+    
+    % load behavioural
+    all_beh_files = dir(beh_path);
+    beh_regex_matches = regexpi({all_beh_files.name}, append('^', subject_id, '_.+\.csv$'), 'match');
+    regex_emptymask = cellfun('isempty', beh_regex_matches);
+    beh_regex_matches(regex_emptymask) = [];
+    subj_beh_files = cellfun(@(x) x{:}, beh_regex_matches, 'UniformOutput', false);
+    
+    if numel(subj_beh_files)>1
+        fprintf('%g behavioural files found?\n', numel(subj_beh_files))
+        break
+    end
+    
+    beh_datapath = fullfile(beh_path, subj_beh_files{1});
+    beh = readtable(beh_datapath);
+    
+    %% Set data features
+    
+    % set channel locations
+    
+    orig_locs = EEG.chanlocs;
+    EEG.chanlocs = pop_chanedit(EEG.chanlocs, 'load', {'BioSemi64.loc', 'filetype', 'loc'});  % loaded order doesn't yet match the data rows; data reordered below
+    
+    % set channel types
+    for ch_nr = 1:64
+        EEG.chanlocs(ch_nr).type = 'EEG';
+    end
+    
+    for ch_nr = 65:72
+        EEG.chanlocs(ch_nr).type = 'EOG';
+    end
+    
+    for ch_nr = 73:79
+        EEG.chanlocs(ch_nr).type = 'MISC';
+    end
+    
+    for ch_nr = 65:79
+        EEG.chanlocs(ch_nr).theta = [];
+        EEG.chanlocs(ch_nr).radius = [];
+        EEG.chanlocs(ch_nr).sph_theta = [];
+        EEG.chanlocs(ch_nr).sph_phi = [];
+        EEG.chanlocs(ch_nr).X = [];
+        EEG.chanlocs(ch_nr).Y = [];
+        EEG.chanlocs(ch_nr).Z = [];
+    end
+    
+    % change the order of channels in EEG.data to match the new order in chanlocs
+    data_reordered = EEG.data;
+    for ch_nr = 1:64        
+        % make sure the new eeg data array matches the listed order
+        ch_lab = EEG.chanlocs(ch_nr).labels;
+        orig_locs_idx = find(strcmp(lower({orig_locs.labels}), lower(ch_lab)));
+        data_reordered(ch_nr, :) = EEG.data(orig_locs_idx, :);
+    end
+    EEG.data = data_reordered;
+    
+    % remove unused channels
+    EEG = pop_select(EEG, 'nochannel', 69:79);
+    
+    % remove bad channels
+    ur_chanlocs = EEG.chanlocs;  % store a copy of the full channel locations before removing (for later interpolation)
+    bad_channels_indices = find(ismember(lower({EEG.chanlocs.labels}), lower(bad_channels)));
+    EEG = pop_select(EEG, 'nochannel', bad_channels_indices);
+    
+    %% Identify events (trials)
+    
+    % make the sopen function happy
+    x = fileparts( which('sopen') );
+    rmpath(x);
+    addpath(x,'-begin');
+    
+    % build the events manually from the raw eeg file (pop_biosig removes event offsets)
+    % NB: this assumes no resampling between reading the BDF file and now
+    bdf_dat = sopen(raw_datapath, 'r', [0, Inf], 'OVERFLOWDETECTION:OFF');
+    event_types = bdf_dat.BDF.Trigger.TYP;
+    event_pos = bdf_dat.BDF.Trigger.POS;
+    event_time = EEG.times(event_pos);
+    sclose(bdf_dat);
+    clear bdf_dat;
+    
+    triggers = struct(...
+        'off', 0,...
+        'A1', 1,...
+        'A2', 2,...
+        'practice', 25,...
+        'image', 99);
+    
+    % add 61440 to each trigger value (because of number of bits in pp)
+    trigger_labels = fieldnames(triggers);
+    for field_nr = 1:numel(trigger_labels)
+        triggers.(trigger_labels{field_nr}) = triggers.(trigger_labels{field_nr}) + 61440;
+    end
+    
+    % remove the first trigger if it is at time 0 and has a value which isn't a recognised trigger
+    if (event_time(1)==0 && ~ismember(event_types(1), [triggers.off, triggers.A1, triggers.A2, triggers.practice, triggers.image]))
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % remove the new first trigger if it has a value of off
+    if (event_types(1)==triggers.off)
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % check every second trigger is an offset
+    offset_locs = find(event_types==triggers.off);
+    if any(offset_locs' ~= 2:2:numel(event_types))
+        fprintf('Expected each second trigger to be an off?')
+        break
+    end
+    
+    % check every first trigger is non-zero
+    onset_locs = find(event_types~=triggers.off);
+    if any(onset_locs' ~= 1:2:numel(event_types))
+        fprintf('Expected each first trigger to be an event?')
+        break
+    end
+    
+    % create the events struct manually    
+    events_onset_types = event_types(onset_locs);
+    events_onsets = event_pos(onset_locs);
+    events_offsets = event_pos(offset_locs);
+    events_durations = events_offsets - events_onsets;
+    
+    EEG.event = struct();
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).type = events_onset_types(event_nr);
+        EEG.event(event_nr).latency = events_onsets(event_nr);
+        EEG.event(event_nr).offset = events_offsets(event_nr);
+        EEG.event(event_nr).duration = events_durations(event_nr);
+    end
+    
+    % copy the details over to urevent
+    EEG.urevent = EEG.event;
+    
+    % record the urevent
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).urevent = event_nr;
+    end
+    
+    % remove bad events recorded in lab book (misfired triggers)
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.urevent], bad_trigger_indices)));
+    
+    % remove practice trials
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], triggers.practice)));
+    
+    % remove triggers saying that image is displayed
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], triggers.image)));
+    
+    % check the events make sense
+    if sum(~ismember([EEG.event.type], [triggers.A1, triggers.A2])) > 0
+        fprintf('Unexpected trial types?\n')
+        break
+    end
+    
+    if numel({EEG.event.type})~=200
+        fprintf('%g trial triggers detected (expected 200)?\n',  numel({EEG.event.type}))
+        break
+    end
+    
+    if sum(ismember([EEG.event.type], [triggers.A1])) ~= sum(ismember([EEG.event.type], [triggers.A2]))
+        fprintf('Unequal number of congruent and incongruent trials?\n')
+        break
+    end
+    
+    % add the trials' onsets, offsets, durations, and triggers to the behavioural data
+    beh.event = zeros(size(beh, 1), 1);
+    beh.latency = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        cond_i = beh.condition(row_nr);
+        beh.event(row_nr) = triggers.(cond_i{:});
+        beh.latency(row_nr) = EEG.event(row_nr).latency;
+        beh.offset(row_nr) = EEG.event(row_nr).offset;
+        beh.duration(row_nr) = EEG.event(row_nr).duration;
+        beh.duration_ms(row_nr) = (EEG.event(row_nr).duration * 1000/EEG.srate) - 500;  % minus 500 as event timer starts at word presentation, but rt timer starts once word turns green
+    end
+    
+    % check events expected in beh are same as those in the events struct
+    if any(beh.event' ~= [EEG.event.type])
+        fprintf('%g mismatches between behavioural data and triggers?\n', sum(beh.event' ~= [EEG.event.type]))
+        break
+    end
+    
+    % check the difference between the durations and the response times (should be very small)
+    % hist(beh.rt - beh.duration_ms, 100)
+    % plot(beh.rt, beh.duration, 'o')
+    
+    % record trial numbers in EEG.event
+    for row_nr = 1:size(beh, 1)
+        EEG.event(row_nr).trl_nr = beh.trl_nr(row_nr);
+    end
+    
+    %% Remove segments of data that fall outside of blocks
+    
+    % record block starts
+    beh.is_block_start(1) = 1;
+    for row_nr = 2:size(beh, 1)
+        beh.is_block_start(row_nr) = beh.block_nr(row_nr) - beh.block_nr(row_nr-1) == 1;
+    end
+    % record block ends
+    beh.is_block_end(size(beh, 1)) = 1;
+    for row_nr = 1:(size(beh, 1)-1)
+        beh.is_block_end(row_nr) = beh.block_nr(row_nr+1) - beh.block_nr(row_nr) == 1;
+    end
+    % record block boundaries (first start and last end point of each block, with 0.75 seconds buffer)
+    beh.block_boundary = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        if beh.is_block_start(row_nr)
+            beh.block_boundary(row_nr) = beh.latency(row_nr) - (EEG.srate * 0.75);
+        elseif beh.is_block_end(row_nr)
+            beh.block_boundary(row_nr) = beh.offset(row_nr) + (EEG.srate * 0.75);
+        end
+    end
+    
+    % get the boundary indices in required format (start1, end1; start2, end2; start3, end3)
+    block_boundaries = reshape(beh.block_boundary(beh.block_boundary~=0), 2, [])';
+    
+    % remove anything outside of blocks
+    EEG = pop_select(EEG, 'time', (block_boundaries / EEG.srate));
+    
+    %% Trial selection
+    
+    % include only correct responses
+    beh_filt_acc_only = beh(beh.acc==1, :);
+    excl_trials_incorr = size(beh, 1)-size(beh_filt_acc_only, 1);
+    total_excl_trials_incorr(subject_nr) = excl_trials_incorr;
+    fprintf('Lost %g trials to incorrect responses\n', excl_trials_incorr)
+    
+    % include only responses faster than 1500 ms
+    beh_filt = beh_filt_acc_only(beh_filt_acc_only.rt<=1500, :);
+    excl_trials_rt = size(beh_filt_acc_only, 1)-size(beh_filt, 1);
+    total_excl_trials_rt(subject_nr) = excl_trials_rt;
+    fprintf('Lost %g trials to RTs above 1500\n', excl_trials_rt)
+    
+    fprintf('Lost %g trials in total to behavioural data\n', size(beh, 1)-size(beh_filt, 1))
+    
+    % filter the events structure
+    discarded_trls = beh.trl_nr(~ismember(beh.trl_nr, beh_filt.trl_nr));
+    discarded_events_indices = [];  % (collect in a for loop, as [EEG.event.trl_nr] would remove missing data)
+    for event_nr = 1:size(EEG.event, 2)
+        if ismember(EEG.event(event_nr).trl_nr, discarded_trls)
+            discarded_events_indices = [discarded_events_indices, event_nr];
+        end
+    end
+    EEG = pop_editeventvals(EEG, 'delete', discarded_events_indices);
+    
+    % check the discarded trials are the expected length
+    if numel(discarded_trls) ~= size(beh, 1)-size(beh_filt, 1)
+        fprintf('Mismatch between behavioural data and EEG events in the number of trials to discard?')
+        break
+    end
+    
+    % check the sizes match
+    if numel([EEG.event.trl_nr]) ~= size(beh_filt, 1)
+        fprintf('Inconsistent numbers of trials between events structure and behavioural data after discarding trials?')
+        break
+    end
+    
+    % check the trl numbers match
+    if any([EEG.event.trl_nr]' ~= beh_filt.trl_nr)
+        fprintf('Trial IDs mismatch between events structure and behavioural data after discarding trials?\n')
+        break
+    end
+    
+    %% Rereference, downsample, and filter
+    
+    % rereference
+    EEG = pop_reref(EEG, []);
+    
+    % downsample
+    EEG = pop_resample(EEG, 512);
+    
+    % filter
+    % EEG = eeglab_butterworth(EEG, 0.5, 40, 4, 1:size(EEG.chanlocs, 2));  % preregistered filter
+    EEG = eeglab_butterworth(EEG, 0.1, 40, 4, 1:size(EEG.chanlocs, 2));  % filter with lower highpass
+    
+    %% ICA
+    
+    % apply ASR
+    %EEG_no_asr = EEG;
+    EEG = clean_asr(EEG, asr_sigma, [], [], [], [], [], [], [], [], 1024);  % The last number is available memory in mb, needed for reproducibility
+
+    rng(3101)  % set seed for reproducibility
+    EEG = pop_runica(EEG, 'icatype', 'fastica', 'approach', 'symm');
+
+    % classify components with ICLabel
+    EEG = iclabel(EEG);
+
+    % store results for easy indexing
+    icl_res = EEG.etc.ic_classification.ICLabel.classifications;
+    icl_classes = EEG.etc.ic_classification.ICLabel.classes;
+    
+    % identify and remove artefact components
+    artefact_comps = find(icl_res(:, strcmp(icl_classes, 'Eye')) >= icl_cutoff | icl_res(:, strcmp(icl_classes, 'Muscle')) >= icl_cutoff);
+    fprintf('Removing %g artefact-related ICA components\n', numel(artefact_comps))
+    n_bad_ica(subject_nr) = numel(artefact_comps);
+    %EEG_no_iclabel = EEG;
+    EEG = pop_subcomp(EEG, artefact_comps);
+    
+    %% Interpolate bad channels
+    
+    % give the original chanlocs structure so EEGLAB interpolates the missing electrode(s)
+    if numel(bad_channels)>0
+        EEG = pop_interp(EEG, ur_chanlocs);
+    end
+    
+    %% Epoch the data
+    
+    % identify and separate into epochs
+    EEG_epo = pop_epoch(EEG, {triggers.A1, triggers.A2}, [-0.25, 1]);
+    
+    % remove baseline
+    EEG_epo = pop_rmbase(EEG_epo, [-200, 0]);
+    
+    %% Check data
+    
+%     pop_eegplot(EEG);
+%     
+%     
+%     PO7_idx = find(strcmp({EEG_epo.chanlocs.labels}, 'PO7'));
+%     
+%     topo_times = [0, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450, 500];
+%     pop_topoplot(EEG_epo, 1, topo_times)
+%     
+%     
+%     figure;
+%     hold on;
+%     
+%     for ch_nr = 1:64
+%         subplot(8, 8, ch_nr);        
+%         plot(EEG_epo.times, mean(EEG_epo.data(ch_nr, :, :), 3))
+%     end
+%     
+%     break
+    
+    %% Get trial-level microvolts for main analysis
+    
+    disp('Getting trial-level results...')
+    
+    max_elec = max_elecs.max_elec_bacs(max_elecs.subject_id == str2double(subject_id));
+    max_time = max_elecs.max_time_bacs(max_elecs.subject_id == str2double(subject_id));
+    [max_time_idx, max_time_closest] = eeglab_closest_time(EEG_epo, max_time);
+    
+    % get the index of the maximal electrode
+    max_elec_idx = find(strcmp({EEG_epo.chanlocs.labels}, max_elec{:}));
+    
+    % check the maximal electrode is there
+    if isempty(max_elec_idx)
+        fprintf('Could not find maximal electrode %s in EEG.chanlocs?\n', max_elec{:})
+        break
+    end
+    
+    % Plot the average ERP for the maximal electrode, showing the maximal timepoint
+    %plot(EEG_epo.times, mean(EEG_epo.data(max_elec_idx, :, :), 3))
+    %hold on
+    %plot(EEG_epo.times(max_time_idx), mean(EEG_epo.data(max_elec_idx, max_time_idx, :), 'all'), 'o')
+    %line(max_time_closest)
+    
+    % get trial-level microvolts for maximal electrode
+    % (per-trial average of 3 samples centred on maximal timepoint)
+    main_res = beh_filt;
+    max_time_idx_lower = max_time_idx - 1;
+    max_time_idx_upper = max_time_idx + 1;
+    main_res.uV = squeeze(mean(EEG_epo.data(max_elec_idx, max_time_idx_lower:max_time_idx_upper, :)));
+    
+    %% Get trial-level microvolts for maximal electrode from word vs. noise comparison
+    
+    max_elec_noise = max_elecs.max_elec_noise(max_elecs.subject_id == str2double(subject_id));
+    max_time_noise = max_elecs.max_time_noise(max_elecs.subject_id == str2double(subject_id));
+    [max_time_noise_idx, max_time_noise_closest] = eeglab_closest_time(EEG_epo, max_time_noise);
+    
+    % get the index of the maximal electrode
+    max_elec_noise_idx = find(strcmp({EEG_epo.chanlocs.labels}, max_elec_noise));
+    
+    % check the maximal electrode is there
+    if isempty(max_elec_noise_idx)
+        fprintf('Could not find maximal electrode %s (word vs. noise) in EEG.chanlocs?\n', max_elec_noise{:})
+        break
+    end
+    
+    % Plot the average ERP for the maximal electrode, showing the maximal timepoint
+    %plot(EEG_epo.times, mean(EEG_epo.data(max_elec_idx, :, :), 3))
+    %hold on
+    %plot(EEG_epo.times(max_time_idx), mean(EEG_epo.data(max_elec_idx, max_time_idx, :), 'all'), 'o')
+    %xline(max_time_closest)
+    
+    % get trial-level microvolts for maximal electrode
+    % (per-trial average of 3 samples centred on maximal timepoint)
+    max_time_idx_lower_noise = max_time_noise_idx - 1;
+    max_time_idx_upper_noise = max_time_noise_idx + 1;
+    main_res.uV_noise_elec = squeeze(mean(EEG_epo.data(max_elec_noise_idx, max_time_idx_lower_noise:max_time_idx_upper_noise, :)));
+    
+    %% Get trial-level average microvolts in ROI, from 150 to 250 ms
+
+    % get the indices of the ROI
+    roi_chan_idx = ismember({EEG.chanlocs.labels}, roi);
+
+    % check the right number of channels
+    if sum(roi_chan_idx)~=numel(roi)
+        fprintf('Expected %g channels in ROI, but data has %g?\n', numel(roi), sum(roi_chan_idx))
+        break
+    end
+
+    % get indices of time window
+    targ_window = [150, 250];
+    targ_window_idx = EEG_epo.times >= targ_window(1) & EEG_epo.times <= targ_window(2);
+
+    % get per-trial averages across ROI, within the target window
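+    % (the vecdim form mean(X, [1, 2]) averages over channels and time in one
+    % call; it requires MATLAB R2018b or newer)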
+    main_res.uV_roi = squeeze(mean(EEG_epo.data(roi_chan_idx, targ_window_idx, :), [1, 2]));
+
+    %% Get sample level microvolts for exploratory analysis
+    
+    disp('Getting sample-level results...')
+    
+    % resample to 256 Hz
+    EEG_256 = pop_resample(EEG, 256);
+    
+    % get epochs of low-srate data
+    EEG_epo_256 = pop_epoch(EEG_256, {triggers.A1, triggers.A2}, [-0.25, 1]);
+    
+    % remove baseline
+    EEG_epo_256 = pop_rmbase(EEG_epo_256, [-200, 0]);
+    
+    % pre-allocate the table
+    var_names = {'subj_id', 'stim_grp', 'resp_grp', 'item_nr', 'ch_name', 'time', 'uV'};
+    var_types = {'string', 'string', 'string', 'double', 'string', 'double', 'double'};
+    nrows = 64 * size(EEG_epo_256.times, 2) * size(beh_filt, 1);
+    sample_res = table('Size',[nrows, numel(var_names)], 'VariableTypes',var_types, 'VariableNames',var_names);
+    
+    sample_res.subj_id = repmat(beh_filt.subj_id, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.stim_grp = repmat(beh_filt.stim_grp, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.resp_grp = repmat(beh_filt.resp_grp, 64*size(EEG_epo_256.times, 2), 1);
+    
+    % get the 64 channel eeg data as an array
+    eeg_arr = EEG_epo_256.data(1:64, :, :);
+    
+    % a vector of all eeg data
+    eeg_vec = squeeze(reshape(eeg_arr, 1, 1, []));
+    
+    % array and vector of the channel labels for each value in EEG.data
+    channel_labels_arr = cell(size(eeg_arr));
+    channel_label_lookup = {EEG_epo_256.chanlocs.labels};
+    for chan_nr = 1:size(eeg_arr, 1)
+        channel_labels_arr(chan_nr, :, :) = repmat(channel_label_lookup(chan_nr), size(channel_labels_arr, 2), size(channel_labels_arr, 3));
+    end
+    
+    channel_labels_vec = squeeze(reshape(channel_labels_arr, 1, 1, []));
+    
+    % array and vector of the timepoints for each value in EEG.data
+    times_arr = zeros(size(eeg_arr));
+    times_lookup = EEG_epo_256.times;
+    for time_idx = 1:size(eeg_arr, 2)
+        times_arr(:, time_idx, :) = repmat(times_lookup(time_idx), size(times_arr, 1), size(times_arr, 3));
+    end
+    
+    times_vec = squeeze(reshape(times_arr, 1, 1, []));
+    
+    % array and vector of the item numbers for each value in EEG.data
+    trials_arr = zeros(size(eeg_arr));
+    trials_lookup = beh_filt.item_nr;
+    for trl_idx = 1:size(eeg_arr, 3)
+        trials_arr(:, :, trl_idx) = repmat(trials_lookup(trl_idx), size(trials_arr, 1), size(trials_arr, 2));
+    end
+    
+    trials_vec = squeeze(reshape(trials_arr, 1, 1, []));
+    
+    % store sample-level results in the table
+    sample_res.ch_name = channel_labels_vec;
+    sample_res.item_nr = trials_vec;
+    sample_res.time = times_vec;
+    sample_res.uV = eeg_vec;
+    
+    % look up and store some info about the trials
+    trial_info_lookup = beh_filt(:, {'item_nr', 'condition', 'image', 'string'});
+    sample_res = outerjoin(sample_res, trial_info_lookup, 'MergeKeys', true);
+    
+    % sort by time, channel, item_nr
+    sample_res = sortrows(sample_res, {'time', 'ch_name', 'item_nr'});
+    
+    %% save the results
+    
+    disp('Saving results...')
+    writetable(main_res, fullfile('max_elec_data', [subject_id, '.csv']));
+    writetable(sample_res, fullfile('sample_data', [subject_id, '.csv']));
+    
+    %% concatenate the sample-level data
+
+    all_sample_level = [all_sample_level; sample_res];
+
+end
+
+fprintf('\nFinished preprocessing picture-word data!\n')
+
+%% Functions
+
+% custom function for applying a Butterworth filter to EEGLAB data
+function EEG = eeglab_butterworth(EEG, low, high, order, chanind)
+    fprintf('Applying Butterworth filter between %g and %g Hz (order of %g)\n', low, high, order)
+    % create filter
+    [b, a] = butter(order, [low, high]/(EEG.srate/2));
+    % apply to data (requires transposition for filtfilt)
+    data_trans = single(filtfilt(b, a, double(EEG.data(chanind, :)')));
+    EEG.data(chanind, :) = data_trans';
+end
+
+% custom function for finding the closest timepoint in an EEG dataset
+function [idx, closesttime] = eeglab_closest_time(EEG, time)
+    dists = abs(EEG.times - time);
+    idx = find(dists == min(dists));
+    % in the unlikely case there are two equidistant times, select one randomly
+    if numel(idx) > 1
+        fprintf('Two equidistant times! Selecting one randomly.\n')
+        idx = idx(randperm(numel(idx)));
+        idx = idx(1);
+    end
+    closesttime = EEG.times(idx);
+end

+ 375 - 0
04 Analysis/analyse_03_main_analysis.R

@@ -0,0 +1,375 @@
+library(lme4)
+library(dplyr)
+library(purrr)
+library(readr)
+library(tidyr)
+library(ggplot2)
+library(scales)
+library(patchwork)
+library(ggnewscale)
+library(ggdist)
+library(ggforce)
+library(brms)
+
+ggplot2::theme_set(ggplot2::theme_classic() + theme(strip.background = element_rect(fill = "white")))
+
+cong_cols <- c("#E69F00", "#009E73")
+
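+# lighten each congruency colour by averaging its (slightly boosted) RGB
+# values with near-white, giving a paler companion palette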
+cong_cols_light <- sapply(cong_cols, function(x) {
+  x_rgb <- as.numeric(col2rgb(x))
+  x_li_rgb <- round(rowMeans(cbind(x_rgb*1.1, c(255, 255, 255)*0.9)))
+  x_li <- rgb(x_li_rgb[1], x_li_rgb[2], x_li_rgb[3], maxColorValue=255)
+  x_li
+}, USE.NAMES=FALSE)
+
+# function to normalise between 0 and 1
+norm01 <- function(x, ...) (x-min(x, ...))/(max(x, ...)-min(x, ...))
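+# e.g., norm01(c(2, 5, 8)) returns 0.0, 0.5, 1.0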
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+# import the max electrode data from the preprocessing, and set up the variables for the model
+d <- list.files("max_elec_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+  map_dfr(function(f) {
+    read_csv(
+      f,
+      col_types=cols(
+        date = col_date(format = "%d/%m/%Y"),
+        trial_save_time = col_time(format = "%H:%M:%S"),
+        subj_id = col_character(),
+        stim_grp = col_integer(),
+        resp_grp = col_integer(),
+        sex = col_character(),
+        trl_nr = col_integer(),
+        item_nr = col_integer(),
+        condition = col_character(),
+        eeg_trigg = col_integer(),
+        image = col_character(),
+        string = col_character(),
+        corr_ans = col_character(),
+        resp = col_character(),
+        acc = col_integer(),
+        fix1_jitt_flip = col_integer(),
+        fix2_jitt_flip = col_integer(),
+        event = col_integer(),
+        is_block_start = col_logical(),
+        is_block_end = col_logical(),
+        .default = col_double()
+      )
+    ) %>%
+      select(
+        subj_id, stim_grp, resp_grp, item_nr, condition,
+        image, string, acc, rt, uV, uV_noise_elec
+      )
+  }) %>%
+  left_join(stim, by=c("image" = "filename")) %>%
+  mutate(
+    cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE)),  # A1=congruent, A2=incongruent
+    cong_dum_incong = as.numeric(condition=="A1"),  # 0 is at incongruent
+    cong_dum_cong = as.numeric(condition=="A2"),  # 0 is at congruent
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree, na.rm=TRUE),
+    pred_norm_max = pred_norm - 1
+  )
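+
+# note: with the two conditions perfectly balanced, cong_dev is -0.5 for
+# incongruent (A2) and +0.5 for congruent (A1) trials, so its coefficient
+# estimates the congruent-minus-incongruent difference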
+
+# fit the pre-registered model ----
+m <- lmer(
+  uV ~ cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+# the effect of the interaction
+m_no_interact <- update(m, ~. -cong_dev:pred_norm)
+eff_interact <- anova(m, m_no_interact)
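+# eff_interact is the likelihood-ratio test of the nested models; e.g., the
+# chi-square statistic and p-value are eff_interact$Chisq[2] and
+# eff_interact$`Pr(>Chisq)`[2]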
+
+# decompose interaction by congruency ----
+
+# effect of predictability in incongruent trials
+m_incong <- lmer(
+  uV ~ cong_dum_incong * pred_norm +
+    (cong_dum_incong * pred_norm | subj_id) +
+    (cong_dum_incong | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_incong_no_pred <- update(m_incong, ~. -pred_norm)
+eff_pred_incong <- anova(m_incong, m_incong_no_pred)
+
+# effect of predictability in congruent trials
+m_cong <- lmer(
+  uV ~ cong_dum_cong * pred_norm +
+    (cong_dum_cong * pred_norm | subj_id) +
+    (cong_dum_cong | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_cong_no_pred <- update(m_cong, ~. -pred_norm)
+eff_pred_cong <- anova(m_cong, m_cong_no_pred)
+
+# decompose interaction by predictability ----
+
+# effect of congruency at minimum level of predictability is in main model
+
+# effect of congruency at maximum predictability
+m_maxpred <- lmer(
+  uV ~ cong_dev * pred_norm_max +
+    (cong_dev * pred_norm_max | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_maxpred_no_cong <- update(m_maxpred, ~. -cong_dev)
+eff_cong_maxpred <- anova(m_maxpred, m_maxpred_no_cong)
+
+# plot relationship ----
+
+d_cells_pred <- tibble(
+  pred_norm = rep(range(d$pred_norm), 2),
+  perc_name_agree = rep(range(d$perc_name_agree), 2),
+  cong_dev = rep(unique(d$cong_dev), each=2),
+  condition = ifelse(cong_dev==min(cong_dev), "A2", "A1")
+) %>%
+  mutate(pred_uV = predict(m, newdata=., re.form=~0))
+
+# get prediction intervals (no random slopes for feasibility)
+m_no_rand_s <- lmer(
+  uV ~ cong_dev * pred_norm +
+    (1 | subj_id) +
+    (1 | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+d_preds_boot <- expand_grid(
+  cong_dev = unique(d$cong_dev),
+  perc_name_agree = seq(min(d$perc_name_agree), max(d$perc_name_agree), 0.01)
+) %>%
+  mutate(
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree, na.rm=TRUE),
+    condition = ifelse(cong_dev==min(cong_dev), "A2", "A1")
+  )
+
+message("Bootstrapping prediction intervals")
+
+boots <- bootMer(m_no_rand_s, nsim=5000, FUN=function(m_i) {
+  predict(m_i, newdata=d_preds_boot, re.form=~0)
+}, seed=3101, .progress="txt")
+
+ci_norm <- function(x, level=c(0.025, 0.975)) {
+  qnorm(p=level, mean=mean(x), sd=sd(x))
+}
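+# (a normal approximation over the bootstrap draws: for each column of
+# boots$t the interval is mean(x) +/- 1.96 * sd(x))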
+
+d_preds_boot <- d_preds_boot %>%
+  mutate(
+    pred_int_lwr = apply(boots$t, 2, ci_norm, level=.025),
+    pred_int_upr = apply(boots$t, 2, ci_norm, level=.975)
+  )
+
+lmm_plot_scatter <- d %>%
+  ggplot(aes(perc_name_agree, uV, colour = condition)) +
+  geom_point(shape=1, show.legend=FALSE, alpha=0.25) +
+  scale_colour_manual(name = NA, labels = NULL, values = cong_cols_light) +
+  new_scale_colour() +
+  geom_line(aes(y=pred_uV, colour = condition), data=d_cells_pred, linewidth=0.9) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  scale_y_continuous(breaks=scales::extended_breaks(n=6)) +
+  guides(colour = "none") +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="a")
+
+lmm_plot_lines <- ggplot() +
+  geom_ribbon(aes(x=perc_name_agree, ymin=pred_int_lwr, ymax=pred_int_upr, colour=condition), data=d_preds_boot, fill=NA, linetype="dashed", show.legend = FALSE) +
+  geom_line(aes(x=perc_name_agree, y=pred_uV, colour=condition), data=d_cells_pred, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  scale_y_continuous(breaks=scales::extended_breaks(n=6)) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="b") +
+  theme(legend.position = "bottom")
+
+lmm_plot <- (lmm_plot_scatter | lmm_plot_lines) +
+  plot_layout(guides = "collect") &
+  theme(
+    legend.position = "bottom",
+    plot.background = element_blank()
+  ) &
+  scale_x_continuous(expand=c(0, 0))
+
+ggsave(file.path("figs", "03_lmm_summary_plot.png"), lmm_plot, device="png", type="cairo", width=4.9, height=3, dpi=600)
+
+ggsave(file.path("figs", "03_lmm_summary_plot.pdf"), lmm_plot, device="pdf", width=4.9, height=3)
+
+pred_results <- tibble(
+  pred_norm = c(0, 1, 0, 1),
+  perc_name_agree = c(7, 100, 7, 100),
+  condition = c("A2", "A2", "A1", "A1"),
+  pred_uV = c(-5, -5, -5, -4.25)
+)
+
+pred_plot <- ggplot() +
+  geom_line(aes(x=perc_name_agree, y=pred_uV, colour=condition), data=pred_results, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="a")  
+
+pred_actual_plot <- (
+  pred_plot + scale_y_continuous(limits=c(-6.5, -3.25), breaks=seq(-6.5, -3.25, 0.5)) + ggtitle("Predicted Results") |
+    lmm_plot_lines + scale_y_continuous(limits=c(-5, -1.75), breaks=seq(-5, -1.75, 0.5)) + ggtitle("Observed Results")
+) +
+  plot_layout(guides = "collect") &
+  theme(
+    legend.position = "bottom",
+    plot.title = element_text(hjust=0.5),
+    plot.background = element_blank()
+  ) &
+  scale_x_continuous(expand=c(0, 0))
+
+ggsave(file.path("figs", "03_pred_vs_actual.png"), pred_actual_plot, device="png", type="cairo", width=6.5, height=3.75, dpi=600)
+
+ggsave(file.path("figs", "03_pred_vs_actual.pdf"), pred_actual_plot, device="pdf", width=6.5, height=3.75)
+
+pred_results_b <- tibble(
+  pred_norm = c(0, 1, 0, 1),
+  perc_name_agree = c(7, 100, 7, 100),
+  condition = c("A2", "A2", "A1", "A1"),
+  pred_uV = c(-5, -5.75, -5, -5)
+)
+
+pred_results_c <- tibble(
+  pred_norm = c(0, 1, 0, 1),
+  perc_name_agree = c(7, 100, 7, 100),
+  condition = c("A2", "A2", "A1", "A1"),
+  pred_uV = c(-5, -5.375, -5, -4.625)
+)
+
+pred_plot_b <- ggplot() +
+  geom_line(aes(x=perc_name_agree, y=pred_uV, colour=condition), data=pred_results_b, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)")
+
+pred_plot_c <- ggplot() +
+  geom_line(aes(x=perc_name_agree, y=pred_uV, colour=condition), data=pred_results_c, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)")
+
+pred_actual_plot_abc <- (
+  pred_plot + scale_y_continuous(limits=c(-6.5, -3.25), breaks=seq(-6.5, -3.25, 0.5)) + labs(title="Expected", x=NULL, tag=NULL) |
+    pred_plot_b + scale_y_continuous(limits=c(-6.5, -3.25), breaks=seq(-6.5, -3.25, 0.5)) + labs(title="(or)", y=NULL, x=NULL) + theme(axis.ticks.y = element_blank(), axis.text.y=element_blank()) |
+    pred_plot_c + scale_y_continuous(limits=c(-6.5, -3.25), breaks=seq(-6.5, -3.25, 0.5)) + labs(title="(or)", y=NULL) + theme(axis.ticks.y = element_blank(), axis.text.y=element_blank(), axis.title.x=element_text(hjust=20)) |
+    lmm_plot_lines + scale_y_continuous(limits=c(-5, -1.75), breaks=seq(-5, -1.75, 0.5)) + labs(title="Observed", y=NULL, x=NULL, tag=NULL)
+) +
+  plot_annotation(tag_levels = "a") +
+  plot_layout(guides = "collect") &
+  theme(
+    legend.position = "bottom",
+    plot.title = element_text(hjust=0.5, size=12),
+    plot.background = element_blank()
+  ) &
+  scale_x_continuous(expand=c(0, 0))
+
+ggsave(file.path("figs", "03_pred_vs_actual_all_patterns.png"), pred_actual_plot_abc, device="png", type="cairo", width=6.5, height=2.75, dpi=600)
+
+ggsave(file.path("figs", "03_pred_vs_actual_all_patterns.pdf"), pred_actual_plot_abc, device="pdf", width=6.5, height=2.75)
+
+# evidence ratio for effect ------------------------------------------------
+
+p <- c(
+  set_prior("normal(-5, 10)", class="Intercept"),
+  set_prior("normal(0, 5)", class="b", coef="cong_dev"),
+  set_prior("normal(0, 5)", class="b", coef="pred_norm"),
+  set_prior("normal(0, 5)", class="b", coef="cong_dev:pred_norm"),
+  set_prior("lkj_corr_cholesky(1)", class="L")
+)
+
+m_b <- brm(
+  uV ~ cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  data = d,
+  prior = p,
+  chains = 5,
+  cores = 5,
+  iter = 5000,
+  control = list(
+    adapt_delta = 0.8,
+    max_treedepth = 10
+  ),
+  refresh = 25,
+  file = file.path("mods", "m_main.rds"),
+  seed = 420
+)
+
+m_b_h <- hypothesis(m_b, "cong_dev:pred_norm>0")
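+# Evid.Ratio is the posterior odds in favour of the one-sided hypothesis;
+# its reciprocal (1/Evid.Ratio) is annotated as BF01 in the plot below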
+
+m_b_h_hdi <- as_draws_df(m_b, variable="b_cong_dev:pred_norm", regex=FALSE) %>%
+  rename(int_est = `b_cong_dev:pred_norm`) %>%
+  median_hdi(int_est, .width=0.89)
+
+m_b_pl <- as_draws_df(m_b, variable="b_cong_dev:pred_norm", regex=FALSE) %>%
+  rename(int_est = `b_cong_dev:pred_norm`) %>%
+  ggplot(aes(int_est, fill=after_stat(x>0))) +
+  stat_slab(slab_colour="black", slab_size=0.4) +
+  stat_pointinterval(
+    .width=0.89, point_interval="median_hdi",
+    point_size=1, size=1, show.legend = FALSE
+  ) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_text(
+    aes(x = x, y = y),
+    data = tibble(x=-3, y=1),
+    label = deparse(bquote(BF["01"]==.(round(1/m_b_h$hypothesis$Evid.Ratio, 2)))),
+    parse = TRUE,
+    hjust = 0,
+    vjust = 1,
+    size = 4.5
+  ) +
+  scale_y_continuous(expand = expansion(0, 0.04), limits = c(0, NA)) +
+  coord_cartesian() +
+  labs(
+    x = "Congruency-Predictability Interaction (µV)",
+    y = "Posterior Density"
+  ) +
+  theme(legend.position = "none") +
+  scale_fill_manual(
+    values = c("grey90", "#8B0000"),
+    labels = c(bquote(H["0"]), bquote(H["1"]))
+  )
+
+ggsave(file.path("figs", "03_post.png"), width=3.5, height=1.75, device="png", type="cairo")
+ggsave(file.path("figs", "03_post.pdf"), width=3.5, height=1.75, device="pdf")
+
+
+lmm_plot_post <- wrap_plots(
+  lmm_plot,
+  (
+    m_b_pl +
+      labs(
+        fill = NULL,
+        tag = "c",
+        x = "Congruency-Predictability\nInteraction"
+      ) +
+      theme(
+        plot.background = element_blank(),
+        legend.position = "bottom"
+      )
+  )
+) +
+  plot_layout(widths = c(2, 1))
+
+ggsave(file.path("figs", "03_lmm_and_post.pdf"), lmm_plot_post, width=6.5, height=3, device="pdf")

+ 298 - 0
04 Analysis/analyse_04_other_maximal_electrode.R

@@ -0,0 +1,298 @@
+# this script is the same as analyse_03 but uses the data from the word vs noise maximal electrode instead of word vs consonants
+
+library(lme4)
+library(brms)
+library(ggdist)
+library(dplyr)
+library(purrr)
+library(readr)
+library(tidyr)
+library(ggplot2)
+library(scales)
+library(patchwork)
+library(ggnewscale)
+
+ggplot2::theme_set(ggplot2::theme_classic() + theme(strip.background = element_rect(fill = "white")))
+
+cong_cols <- c("#E69F00", "#009E73")
+
+cong_cols_light <- sapply(cong_cols, function(x) {
+  x_rgb <- as.numeric(col2rgb(x))
+  x_li_rgb <- round(rowMeans(cbind(x_rgb*1.25, c(255, 255, 255)*0.75)))
+  x_li <- rgb(x_li_rgb[1], x_li_rgb[2], x_li_rgb[3], maxColorValue=255)
+  x_li
+}, USE.NAMES=FALSE)
+
+# function to normalise between 0 and 1
+norm01 <- function(x, ...) (x-min(x, ...))/(max(x, ...)-min(x, ...))
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+# import the max electrode data from the preprocessing, and set up the variables for the model
+d <- list.files("max_elec_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+  map_dfr(function(f) {
+    read_csv(
+      f,
+      col_types=cols(
+        date = col_date(format = "%d/%m/%Y"),
+        trial_save_time = col_time(format = "%H:%M:%S"),
+        subj_id = col_character(),
+        stim_grp = col_integer(),
+        resp_grp = col_integer(),
+        sex = col_character(),
+        trl_nr = col_integer(),
+        item_nr = col_integer(),
+        condition = col_character(),
+        eeg_trigg = col_integer(),
+        image = col_character(),
+        string = col_character(),
+        corr_ans = col_character(),
+        resp = col_character(),
+        acc = col_integer(),
+        fix1_jitt_flip = col_integer(),
+        fix2_jitt_flip = col_integer(),
+        event = col_integer(),
+        is_block_start = col_logical(),
+        is_block_end = col_logical(),
+        .default = col_double()
+      )
+    ) %>%
+      select(
+        subj_id, stim_grp, resp_grp, item_nr, condition,
+        image, string, acc, rt, uV, uV_noise_elec
+      )
+  }) %>%
+  left_join(stim, by=c("image" = "filename")) %>%
+  mutate(
+    cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE)),
+    cong_dum_incong = as.numeric(condition=="A1"),  # 0 is at incongruent
+    cong_dum_cong = as.numeric(condition=="A2"),  # 0 is at congruent
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree, na.rm=TRUE),
+    pred_norm_max = pred_norm - 1
+  )
+
+# fit the model ----
+m <- lmer(
+  uV_noise_elec ~ cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+# the effect of the interaction
+m_no_interact <- update(m, ~. -cong_dev:pred_norm)
+eff_interact <- anova(m, m_no_interact)
+
+# decompose interaction by congruency ----
+
+# effect of predictability in incongruent trials
+m_incong <- lmer(
+  uV_noise_elec ~ cong_dum_incong * pred_norm +
+    (cong_dum_incong * pred_norm | subj_id) +
+    (cong_dum_incong | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_incong_no_pred <- update(m_incong, ~. -pred_norm)
+eff_pred_incong <- anova(m_incong, m_incong_no_pred)
+
+# effect of predictability in congruent trials
+m_cong <- lmer(
+  uV_noise_elec ~ cong_dum_cong * pred_norm +
+    (cong_dum_cong * pred_norm | subj_id) +
+    (cong_dum_cong | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_cong_no_pred <- update(m_cong, ~. -pred_norm)
+eff_pred_cong <- anova(m_cong, m_cong_no_pred)
+
+# decompose interaction by predictability ----
+
+# effect of congruency at minimum level of predictability is in main model
+
+# effect of congruency at maximum predictability
+m_maxpred <- lmer(
+  uV_noise_elec ~ cong_dev * pred_norm_max +
+    (cong_dev * pred_norm_max | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_maxpred_no_cong <- update(m_maxpred, ~. -cong_dev)
+eff_cong_maxpred <- anova(m_maxpred, m_maxpred_no_cong)
+
+# plot relationship ----
+
+d_cells_pred <- tibble(
+  pred_norm = rep(range(d$pred_norm), 2),
+  perc_name_agree = rep(range(d$perc_name_agree), 2),
+  cong_dev = rep(unique(d$cong_dev), each=2),
+  condition = ifelse(cong_dev==min(cong_dev), "A2", "A1")
+) %>%
+  mutate(pred_uV = predict(m, newdata=., re.form=~0))
+
+# get prediction intervals (no random slopes for feasibility)
+m_no_rand_s <- lmer(
+  uV_noise_elec ~ cong_dev * pred_norm +
+    (1 | subj_id) +
+    (1 | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+d_preds_boot <- expand_grid(
+  cong_dev = unique(d$cong_dev),
+  perc_name_agree = seq(min(d$perc_name_agree), max(d$perc_name_agree), 0.01)
+) %>%
+  mutate(
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree, na.rm=TRUE),
+    condition = ifelse(cong_dev==min(cong_dev), "A2", "A1")
+  )
+
+boots <- bootMer(m_no_rand_s, nsim=5000, FUN=function(m_i) {
+  predict(m_i, newdata=d_preds_boot, re.form=~0)
+}, seed=3101, .progress="txt")
+
+ci_norm <- function(x, level=c(0.025, 0.975)) {
+  qnorm(p=level, mean=mean(x), sd=sd(x))
+}
+
+d_preds_boot <- d_preds_boot %>%
+  mutate(
+    pred_int_lwr = apply(boots$t, 2, ci_norm, level=.025),
+    pred_int_upr = apply(boots$t, 2, ci_norm, level=.975)
+  )
+
+lmm_plot_scatter <- d %>%
+  ggplot(aes(perc_name_agree, uV_noise_elec, colour = condition)) +
+  geom_point(shape=1, show.legend=FALSE, alpha=0.25) +
+  scale_colour_manual(name = NA, labels = NULL, values = cong_cols_light) +
+  new_scale_colour() +
+  geom_line(aes(y=pred_uV, colour = condition), data=d_cells_pred, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  scale_y_continuous(breaks=scales::extended_breaks(n=6)) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="a")
+
+lmm_plot_lines <- ggplot() +
+  geom_ribbon(aes(x=perc_name_agree, ymin=pred_int_lwr, ymax=pred_int_upr, colour=condition), data=d_preds_boot, fill=NA, linetype="dashed", show.legend = FALSE) +
+  geom_line(aes(x=perc_name_agree, y=pred_uV, colour=condition), data=d_cells_pred, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  scale_y_continuous(breaks=scales::extended_breaks(n=6)) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="b")
+
+lmm_plot <- (lmm_plot_scatter | lmm_plot_lines) +
+  plot_layout(guides = "collect") &
+  theme(legend.position = "bottom", plot.background = element_blank()) &
+  scale_x_continuous(expand=c(0, 0))
+
+ggsave(file.path("figs", "04_lmm_summary_plot_noise_elec.png"), lmm_plot, device="png", type="cairo", width=6.5, height=3.75, dpi=600)
+
+ggsave(file.path("figs", "04_lmm_summary_plot_noise_elec.pdf"), lmm_plot, device="pdf", width=6.5, height=3.75)
+
+# evidence ratio for effect ------------------------------------------------
+
+p <- c(
+  set_prior("normal(-5, 10)", class="Intercept"),
+  set_prior("normal(0, 5)", class="b", coef="cong_dev"),
+  set_prior("normal(0, 5)", class="b", coef="pred_norm"),
+  set_prior("normal(0, 5)", class="b", coef="cong_dev:pred_norm"),
+  set_prior("lkj_corr_cholesky(1)", class="L")
+)
+
+m_b <- brm(
+  uV_noise_elec ~ cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  data = d,
+  prior = p,
+  chains = 5,
+  cores = 5,
+  iter = 5000,
+  control = list(
+    adapt_delta = 0.8,
+    max_treedepth = 10
+  ),
+  refresh = 25,
+  seed = 420
+)
+
+m_b_h <- hypothesis(m_b, "cong_dev:pred_norm>0")
+
+m_b_h_hdi <- as_draws_df(m_b, variable="b_cong_dev:pred_norm", regex=FALSE) %>%
+  rename(int_est = `b_cong_dev:pred_norm`) %>%
+  median_hdi(int_est, .width=0.89)
+
+m_b_pl <- as_draws_df(m_b, variable="b_cong_dev:pred_norm", regex=FALSE) %>%
+  rename(int_est = `b_cong_dev:pred_norm`) %>%
+  ggplot(aes(int_est, fill=after_stat(x>0))) +
+  stat_slab(slab_colour="black", slab_size=0.4) +
+  stat_pointinterval(
+    .width=0.89, point_interval="median_hdi",
+    point_size=1, size=1, show.legend = FALSE
+  ) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_text(
+    aes(x = x, y = y),
+    data = tibble(x=-3.9, y=1),
+    label = deparse(bquote(BF["01"]==.(round(1/m_b_h$hypothesis$Evid.Ratio, 2)))),
+    parse = TRUE,
+    hjust = 0,
+    vjust = 1,
+    size = 4.5
+  ) +
+  scale_y_continuous(expand = expansion(0, 0.04), limits = c(0, NA)) +
+  coord_cartesian() +
+  labs(
+    x = "Congruency-Predictability Interaction (µV)",
+    y = "Posterior Density"
+  ) +
+  theme(legend.position = "none") +
+  scale_fill_manual(
+    values = c("grey90", "#8B0000"),
+    labels = c(bquote(H["0"]), bquote(H["1"]))
+  )
+
+ggsave(file.path("figs", "04_post.png"), width=3.5, height=1.75, device="png", type="cairo")
+ggsave(file.path("figs", "04_post.pdf"), width=3.5, height=1.75, device="pdf")
+
+
+lmm_plot_post <- wrap_plots(
+  lmm_plot,
+  (
+    m_b_pl +
+      labs(
+        fill = NULL,
+        tag = "c",
+        x = "Congruency-Predictability\nInteraction"
+      ) +
+      theme(
+        plot.background = element_blank(),
+        legend.position = "bottom"
+      )
+  )
+) +
+  plot_layout(widths = c(2, 1))
+
+ggsave(file.path("figs", "04_lmm_and_post.pdf"), lmm_plot_post, width=6.5, height=3, device="pdf")
+

+ 298 - 0
04 Analysis/analyse_05_roi_avg.R

@@ -0,0 +1,298 @@
+# this script is the same as analyse_03 but uses the ROI average within a post-hoc window
+
+library(lme4)
+library(brms)
+library(ggdist)
+library(dplyr)
+library(purrr)
+library(readr)
+library(tidyr)
+library(ggplot2)
+library(scales)
+library(patchwork)
+library(ggnewscale)
+
+ggplot2::theme_set(ggplot2::theme_classic() + theme(strip.background = element_rect(fill = "white")))
+
+cong_cols <- c("#E69F00", "#009E73")
+
+cong_cols_light <- sapply(cong_cols, function(x) {
+  x_rgb <- as.numeric(col2rgb(x))
+  x_li_rgb <- round(rowMeans(cbind(x_rgb*1.25, c(255, 255, 255)*0.75)))
+  x_li <- rgb(x_li_rgb[1], x_li_rgb[2], x_li_rgb[3], maxColorValue=255)
+  x_li
+}, USE.NAMES=FALSE)
+
+# function to normalise between 0 and 1
+norm01 <- function(x, ...) (x-min(x, ...))/(max(x, ...)-min(x, ...))
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+# import the max electrode data from the preprocessing, and set up the variables for the model
+d <- list.files("max_elec_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+  map_dfr(function(f) {
+    read_csv(
+      f,
+      col_types=cols(
+        date = col_date(format = "%d/%m/%Y"),
+        trial_save_time = col_time(format = "%H:%M:%S"),
+        subj_id = col_character(),
+        stim_grp = col_integer(),
+        resp_grp = col_integer(),
+        sex = col_character(),
+        trl_nr = col_integer(),
+        item_nr = col_integer(),
+        condition = col_character(),
+        eeg_trigg = col_integer(),
+        image = col_character(),
+        string = col_character(),
+        corr_ans = col_character(),
+        resp = col_character(),
+        acc = col_integer(),
+        fix1_jitt_flip = col_integer(),
+        fix2_jitt_flip = col_integer(),
+        event = col_integer(),
+        is_block_start = col_logical(),
+        is_block_end = col_logical(),
+        .default = col_double()
+      )
+    ) %>%
+      select(
+        subj_id, stim_grp, resp_grp, item_nr, condition,
+        image, string, acc, rt, uV, uV_roi
+      )
+  }) %>%
+  left_join(stim, by=c("image" = "filename")) %>%
+  mutate(
+    cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE)),
+    cong_dum_incong = as.numeric(condition=="A1"),  # 0 is at incongruent
+    cong_dum_cong = as.numeric(condition=="A2"),  # 0 is at congruent
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree, na.rm=TRUE),
+    pred_norm_max = pred_norm - 1
+  )
+
+# fit the model ----
+m <- lmer(
+  uV_roi ~ cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+# the effect of the interaction
+m_no_interact <- update(m, ~. -cong_dev:pred_norm)
+eff_interact <- anova(m, m_no_interact)
+
+# decompose interaction by congruency ----
+
+# effect of predictability in incongruent trials
+m_incong <- lmer(
+  uV_roi ~ cong_dum_incong * pred_norm +
+    (cong_dum_incong * pred_norm | subj_id) +
+    (cong_dum_incong | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_incong_no_pred <- update(m_incong, ~. -pred_norm)
+eff_pred_incong <- anova(m_incong, m_incong_no_pred)
+
+# effect of predictability in congruent trials
+m_cong <- lmer(
+  uV_roi ~ cong_dum_cong * pred_norm +
+    (cong_dum_cong * pred_norm | subj_id) +
+    (cong_dum_cong | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_cong_no_pred <- update(m_cong, ~. -pred_norm)
+eff_pred_cong <- anova(m_cong, m_cong_no_pred)
+
+# decompose interaction by predictability ----
+
+# effect of congruency at minimum level of predictability is in main model
+
+# effect of congruency at maximum predictability
+m_maxpred <- lmer(
+  uV_roi ~ cong_dev * pred_norm_max +
+    (cong_dev * pred_norm_max | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+m_maxpred_no_cong <- update(m_maxpred, ~. -cong_dev)
+eff_cong_maxpred <- anova(m_maxpred, m_maxpred_no_cong)
+
+# plot relationship ----
+
+d_cells_pred <- tibble(
+  pred_norm = rep(range(d$pred_norm), 2),
+  perc_name_agree = rep(range(d$perc_name_agree), 2),
+  cong_dev = rep(unique(d$cong_dev), each=2),
+  condition = ifelse(cong_dev==min(cong_dev), "A2", "A1")
+) %>%
+  mutate(pred_uV = predict(m, newdata=., re.form=~0))
+
+# get prediction intervals (no random slopes for feasibility)
+m_no_rand_s <- lmer(
+  uV_roi ~ cong_dev * pred_norm +
+    (1 | subj_id) +
+    (1 | image) +
+    (1 | string),
+  REML=FALSE,
+  control = lmerControl(optimizer="bobyqa"),
+  data=d
+)
+
+d_preds_boot <- expand_grid(
+  cong_dev = unique(d$cong_dev),
+  perc_name_agree = seq(min(d$perc_name_agree), max(d$perc_name_agree), 0.01)
+) %>%
+  mutate(
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree, na.rm=TRUE),
+    condition = ifelse(cong_dev==min(cong_dev), "A2", "A1")
+  )
+
+boots <- bootMer(m_no_rand_s, nsim=5000, FUN=function(m_i) {
+  predict(m_i, newdata=d_preds_boot, re.form=~0)
+}, seed=3101, .progress="txt")
+
+ci_norm <- function(x, level=c(0.025, 0.975)) {
+  qnorm(p=level, mean=mean(x), sd=sd(x))
+}
+
+d_preds_boot <- d_preds_boot %>%
+  mutate(
+    pred_int_lwr = apply(boots$t, 2, ci_norm, level=.025),
+    pred_int_upr = apply(boots$t, 2, ci_norm, level=.975)
+  )
+
+lmm_plot_scatter <- d %>%
+  ggplot(aes(perc_name_agree, uV_roi, colour = condition)) +
+  geom_point(shape=1, show.legend=FALSE, alpha=0.25) +
+  scale_colour_manual(name = NA, labels = NULL, values = cong_cols_light) +
+  new_scale_colour() +
+  geom_line(aes(y=pred_uV, colour = condition), data=d_cells_pred, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  scale_y_continuous(breaks=scales::extended_breaks(n=6)) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="a")
+
+lmm_plot_lines <- ggplot() +
+  geom_ribbon(aes(x=perc_name_agree, ymin=pred_int_lwr, ymax=pred_int_upr, colour=condition), data=d_preds_boot, fill=NA, linetype="dashed", show.legend = FALSE) +
+  geom_line(aes(x=perc_name_agree, y=pred_uV, colour=condition), data=d_cells_pred, linewidth=1.75) +
+  scale_colour_manual(name = "Picture-Word Congruency", labels = c("Congruent", "Incongruent"), values = cong_cols) +
+  scale_y_continuous(breaks=scales::extended_breaks(n=6)) +
+  labs(x = "Predictability (%)", y = "N1 Amplitude (µV)", tag="b")
+
+lmm_plot <- (lmm_plot_scatter | lmm_plot_lines) +
+  plot_layout(guides = "collect") &
+  theme(legend.position = "bottom", plot.background = element_blank()) &
+  scale_x_continuous(expand=c(0, 0))
+
+ggsave(file.path("figs", "05_lmm_summary_plot_roi.png"), lmm_plot, device="png", type="cairo", width=6.5, height=3.75, dpi=600)
+
+ggsave(file.path("figs", "05_lmm_summary_plot_roi.pdf"), lmm_plot, device="pdf", width=6.5, height=3.75)
+
+# evidence ratio for effect ------------------------------------------------
+
+p <- c(
+  set_prior("normal(-5, 10)", class="Intercept"),
+  set_prior("normal(0, 5)", class="b", coef="cong_dev"),
+  set_prior("normal(0, 5)", class="b", coef="pred_norm"),
+  set_prior("normal(0, 5)", class="b", coef="cong_dev:pred_norm"),
+  set_prior("lkj_corr_cholesky(1)", class="L")
+)
+
+m_b <- brm(
+  uV_roi ~ cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  data = d,
+  prior = p,
+  chains = 5,
+  cores = 5,
+  iter = 5000,
+  control = list(
+    adapt_delta = 0.8,
+    max_treedepth = 10
+  ),
+  refresh = 25,
+  seed = 420
+)
+
+m_b_h <- hypothesis(m_b, "cong_dev:pred_norm>0")
+
+m_b_h_hdi <- as_draws_df(m_b, variable="b_cong_dev:pred_norm", regex=FALSE) %>%
+  rename(int_est = `b_cong_dev:pred_norm`) %>%
+  median_hdi(int_est, .width=0.89)
+
+m_b_pl <- as_draws_df(m_b, variable="b_cong_dev:pred_norm", regex=FALSE) %>%
+  rename(int_est = `b_cong_dev:pred_norm`) %>%
+  ggplot(aes(int_est, fill=after_stat(x>0))) +
+  stat_slab(slab_colour="black", slab_size=0.4) +
+  stat_pointinterval(
+    .width=0.89, point_interval="median_hdi",
+    point_size=1, size=1, show.legend = FALSE
+  ) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_text(
+    aes(x = x, y = y),
+    data = tibble(x=-3, y=1),
+    label = deparse(bquote(BF["01"]==.(round(1/m_b_h$hypothesis$Evid.Ratio, 2)))),
+    parse = TRUE,
+    hjust = 0,
+    vjust = 1,
+    size = 4.5
+  ) +
+  scale_y_continuous(expand = expansion(0, 0.04), limits = c(0, NA)) +
+  coord_cartesian() +
+  labs(
+    x = "Congruency-Predictability Interaction (µV)",
+    y = "Posterior Density"
+  ) +
+  theme(legend.position = "none") +
+  scale_fill_manual(
+    values = c("grey90", "#8B0000"),
+    labels = c(bquote(H["0"]), bquote(H["1"]))
+  )
+
+ggsave(file.path("figs", "05_post.png"), width=3.5, height=1.75, device="png", type="cairo")
+ggsave(file.path("figs", "05_post.pdf"), width=3.5, height=1.75, device="pdf")
+
+
+lmm_plot_post <- wrap_plots(
+  lmm_plot,
+  (
+    m_b_pl +
+      labs(
+        fill = NULL,
+        tag = "c",
+        x = "Congruency-Predictability\nInteraction"
+      ) +
+      theme(
+        plot.background = element_blank(),
+        legend.position = "bottom"
+      )
+  )
+) +
+  plot_layout(widths = c(2, 1))
+
+ggsave(file.path("figs", "05_lmm_and_post.pdf"), lmm_plot_post, width=6.5, height=3, device="pdf")
+

+ 250 - 0
04 Analysis/analyse_06_localiser_results_roi.R

@@ -0,0 +1,250 @@
+# this script summarises the ERPs from the localiser task
+
+library(lme4)
+library(dplyr)
+library(purrr)
+library(readr)
+library(tidyr)
+library(ggplot2)
+library(parallel)
+library(patchwork)
+library(svglite)
+
+ggplot2::theme_set(ggplot2::theme_classic() + theme(strip.background = element_rect(fill = "white")))
+
+eff_cols <- c(
+  "none" = "#000000",
+  "Words vs. False Font" = "#D81B60",
+  "Words vs. Phase-Shuffled" = "#1E88E5"
+)
+
+cond_cols <- c(
+  "Words" = "#000000",
+  "False Font" = "#D81B60",
+  "Phase-Shuffled" = "#1E88E5"
+)
+
+# how many cores to parallelise over
+n_cores <- 12
+
+# get a list with the unique values for the times and channel names (taken from the first csv file)
+unique_ids <- list.files("localiser_sample_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+  first() %>%
+  read_csv() %>%
+  select(ch_name, time) %>%
+  as.list() %>%
+  lapply(unique)
+
+# function to import all data for selected channels at the selected times (exact matches)
+get_sample_dat <- function(channels=NA, times=NA) {
+  list.files("localiser_sample_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+    map_dfr(function(f) {
+      read_csv(
+        f,
+        col_types=cols(
+          subj_id = col_character(),
+          stim_grp = col_integer(),
+          resp_grp = col_integer(),
+          trl_nr = col_integer(),
+          ch_name = col_character(),
+          condition = col_character(),
+          string = col_character(),
+          item_nr = col_integer(),
+          .default = col_double()
+        ),
+        progress = FALSE
+      ) %>%
+        # all() keeps each condition length-one even when a vector is passed
+        when(
+          all(is.na(channels)) ~ .,
+          ~ filter(., ch_name %in% channels)
+        ) %>%
+        when(
+          all(is.na(times)) ~ .,
+          ~ filter(., time %in% times)
+        ) %>%
+        # get a unique id for each stimulus and simplify condition column
+        mutate(
+          condition = substr(condition, 1, 1),
+          item_id = paste(item_nr, condition, sep="_")
+        ) %>%
+        # remove unused columns to save memory
+        select(-string, -resp_grp, -stim_grp, -trl_nr)
+    }) %>%
+    # deviation-code condition (with word as the null)
+    mutate(
+      ff_dev = scale(ifelse(condition=="b", 1, 0), center=TRUE, scale=FALSE),
+      noise_dev = scale(ifelse(condition=="n", 1, 0), center=TRUE, scale=FALSE)
+    )
+}
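+
+# with both dummies centred, words act as the reference level: each *_dev
+# coefficient estimates that condition's difference from words, and (with
+# balanced cells) the intercept is the mean of the three condition means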
+
+rois <- list(
+  c("TP7", "CP5", "P5", "P7", "P9", "PO3", "PO7", "O1"),
+  c("TP8", "CP6", "P6", "P8", "P10", "PO4", "PO8", "O2")
+)
+
+# get fixed effect results from models for each timepoint, for the ROI
+message("Importing data")
+
+d_i_list <- get_sample_dat(channels = c(rois[[1]], rois[[2]])) %>%
+  mutate(
+    hemis = ifelse(ch_name %in% rois[[1]], "l", "r"),
+    hemis_dev = scale(ifelse(hemis == "l", 0, 1), center=TRUE, scale=FALSE)
+  ) |>
+  group_split(time)
+
+gc()
+
+message(sprintf("Fitting %g models to timecourse in parallel", length(d_i_list)))
+
+cl <- makeCluster(n_cores)
+cl_packages <- clusterEvalQ(cl, {
+  library(dplyr)
+  library(lme4)
+})
+
+m_fe <- parLapply(cl, d_i_list, function(d_i) {
+  m <- lme4::lmer(
+    uV ~ (ff_dev + noise_dev) * hemis_dev +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | item_nr) +
+      (1 | item_id),
+    REML = FALSE,
+    control = lme4::lmerControl(optimizer="bobyqa"),
+    data = d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(unique_ids$time, each=6))
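+# (each per-timepoint model returns six fixed-effect rows: intercept, two
+# stimulus contrasts, hemisphere, and the two interactions, hence each=6)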
+
+stopCluster(cl)
+
+
+fe_res_tidy <- m_fe %>%
+  mutate(
+    stim_eff_lab = factor(case_when(
+      grepl("ff_dev", fe, fixed=TRUE) ~ "Words vs. False Font",
+      grepl("noise_dev", fe, fixed=TRUE) ~ "Words vs. Phase-Shuffled",
+      TRUE ~ "none"
+    ), levels = c("none", "Words vs. False Font", "Words vs. Phase-Shuffled")),
+    main_or_int = factor(ifelse(
+      grepl(":", fe, fixed=TRUE),
+      "Interaction",
+      "Main Effect"
+    ), levels = c("Main Effect", "Interaction")),
+    eff_lab = factor(case_when(
+      grepl(":", fe, fixed=TRUE) ~ "Stimulus * Hemisphere",
+      fe == "hemis_dev" ~ "Hemisphere",
+      fe == "(Intercept)" ~ "Intercept",
+      TRUE ~ "Stimulus"
+    ), levels = c("Intercept", "Hemisphere", "Stimulus", "Stimulus * Hemisphere")),
+    bound_lower = Estimate - 1.96 * `Std. Error`,
+    bound_upper = Estimate + 1.96 * `Std. Error`
+  )
+
+pl_eff <- fe_res_tidy %>%
+  ggplot(aes(time, Estimate, group=stim_eff_lab, colour=stim_eff_lab, fill=stim_eff_lab)) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  geom_ribbon(aes(ymin=bound_lower, ymax=bound_upper), size=0.2, alpha=0.5) +
+  # geom_line(size=0.5) +
+  facet_wrap(vars(eff_lab), nrow=2) +
+  labs(x = "Time (ms)", y = "Effect Estimate (µV)", fill="Fixed Effect (95% CI)", colour="Fixed Effect (95% CI)", tag="a") +
+  scale_x_continuous(breaks = seq(-200, 1000, 200), expand=c(0, 0)) +
+  scale_y_continuous(breaks = scales::pretty_breaks(n = 5)) +
+  scale_fill_manual(values = eff_cols, breaks = c("Words vs. False Font", "Words vs. Phase-Shuffled")) +
+  scale_colour_manual(values = eff_cols, breaks = c("Words vs. False Font", "Words vs. Phase-Shuffled")) +
+  theme(
+    legend.position = "top",
+    legend.key.height = unit(6, "pt"),
+    strip.background = element_blank(),
+    axis.line.x = element_blank()
+  )
+
+
+preds <- fe_res_tidy %>%
+  select(time, fe, Estimate) %>%
+  pivot_wider(names_from = fe, values_from = Estimate) %>%
+  expand_grid(
+    condition = c("w", "b", "n"),
+    hemis = c("l", "r")
+  ) %>%
+  left_join(
+    d_i_list[[1]] %>%
+      select(condition, hemis, hemis_dev, ff_dev, noise_dev) %>%
+      distinct() %>%
+      rename(
+        ff_dev_val = ff_dev,
+        noise_dev_val = noise_dev,
+        hemis_dev_val = hemis_dev
+      ),
+    by = c("condition", "hemis")
+  ) %>%
+  mutate(
+    pred_uV = `(Intercept)` +
+      ff_dev_val * ff_dev +
+      noise_dev_val * noise_dev +
+      hemis_dev_val * hemis_dev +
+      ff_dev_val * hemis_dev_val * `ff_dev:hemis_dev` +
+      noise_dev_val * hemis_dev_val * `noise_dev:hemis_dev`,
+    cond_lab = factor(recode(
+      condition,
+      w = "Words",
+      b = "False Font",
+      n = "Phase-Shuffled"
+    ), levels = c("Words", "False Font", "Phase-Shuffled")),
+    hemis_lab = factor(
+      ifelse(hemis=="l", "Left Hemisphere", "Right Hemisphere"),
+      levels=c("Left Hemisphere", "Right Hemisphere")
+    )
+  )
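+
+# (preds reconstructs each condition-by-hemisphere ERP by plugging the
+# cell's contrast codes back into the per-timepoint fixed-effect estimates)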
+
+pl_preds <- preds %>%
+  ggplot(aes(time, pred_uV, colour=cond_lab)) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  geom_line() +
+  scale_x_continuous(breaks = seq(-200, 1000, 200), expand=c(0, 0)) +
+  scale_y_continuous(breaks = scales::pretty_breaks(n = 5)) +
+  scale_colour_manual(values = cond_cols) +
+  labs(
+    x = "Time (ms)",
+    y = "Amplitude (µV)",
+    colour = "Stimulus ERP",
+    tag = "b"
+  ) +
+  theme(
+    legend.position = "top",
+    plot.margin = margin(),
+    strip.background = element_blank(),
+    axis.line.x = element_blank()
+  ) +
+  facet_wrap(vars(hemis_lab))
+
+# function to get a ggplot's legend as a ggplot object that patchwork will accept
+get_legend <- function(pl) {
+  tmp <- ggplot_gtable(ggplot_build(pl))
+  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
+  legend <- tmp$grobs[[leg]]
+  legend
+}
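+# (assumes the legend grob is named "guide-box" in the built gtable; newer
+# ggplot2 releases may name it differently, e.g. "guide-box-right")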
+
+eff_leg <- get_legend(pl_eff)
+stim_leg <- get_legend(pl_preds)
+
+pl_leg <- wrap_plots(list(eff_leg, stim_leg), ncol=1)
+
+pl <- (
+  (pl_leg) /
+    (pl_eff + theme(legend.position="none")) /
+    (pl_preds + theme(legend.position="none"))
+  ) +
+  plot_layout(heights = c(0.1, 0.05, 1, 0.4))
+
+ggsave(file.path("figs", "sample_level", "localiser", "roi.svg"), pl, width=6.5, height=6.5)
+ggsave(file.path("figs", "sample_level", "localiser", "roi.pdf"), pl, width=6.5, height=6.5)

+ 463 - 0
04 Analysis/analyse_07_sample_picture_word_roi.R

@@ -0,0 +1,463 @@
+library(lme4)
+library(dplyr)
+library(tidyr)
+library(purrr)
+library(readr)
+library(ggplot2)
+library(ggtext)
+library(cowplot)
+library(patchwork)
+library(parallel)
+library(RColorBrewer)
+
+ggplot2::theme_set(ggplot2::theme_classic())
+
+n_cores <- 14
+
+eff_cols <- c("#000000", brewer.pal(3, "Set1")[1:2], brewer.pal(5, "Set1")[5])
+cong_cols <- c("#E69F00", "#009E73")
+cong_cols_alpha_0.4_white <- c("#F5D899", "#99D8C7")
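+# (cong_cols blended with white at 40% alpha, so legend keys can show the
+# ribbon colour without relying on transparency)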
+
+roi <- c("TP7", "CP5", "P5", "P7", "P9", "PO3", "PO7", "O1")
+time_lims <- c(-250, 650)
+
+# function to normalise between 0 and 1
+norm01 <- function(x, ...) (x-min(x, ...))/(max(x, ...)-min(x, ...))
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+# import the max electrode data from the preprocessing, and set up the variables for the model
+get_sample_data <- function(ch_i = NA) {
+  list.files("sample_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+    map_dfr(function(f) {
+      x <- read_csv(
+        f,
+        col_types=cols(
+          subj_id = col_character(),
+          stim_grp = col_integer(),
+          resp_grp = col_integer(),
+          item_nr = col_integer(),
+          ch_name = col_character(),
+          time = col_double(),
+          uV = col_double(),
+          condition = col_character(),
+          image = col_character(),
+          string = col_character(),
+          .default = col_double()
+        ),
+        progress = FALSE
+      ) |>
+        select(-stim_grp, -resp_grp, -item_nr) |>
+        filter(time > time_lims[[1]] & time < time_lims[[2]])
+      if (all(!is.na(ch_i))) x <- filter(x, ch_name %in% ch_i)
+      x
+    }) %>%
+    left_join(stim, by=c("image" = "filename")) %>%
+    mutate(
+      prop_agree = perc_name_agree/100,
+      pred_norm = norm01(prop_agree, na.rm=TRUE),
+      pred_norm_max = pred_norm - 1
+    ) %>%
+    select(-perc_name_agree)
+}
+
+unique_ids <- list.files("sample_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+  first() %>%
+  read_csv() |>
+  filter(time > time_lims[[1]] & time < time_lims[[2]]) |>
+  select(ch_name, time) %>%
+  as.list() %>%
+  lapply(unique)
+
+target_time_ids <- unique_ids$time
+
+# get models for each timepoint, for each electrode, and extract fixed effects
+d_i_list <- get_sample_data(roi) |>
+  mutate(
+    cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE)),
+    cong_dum_incong = as.numeric(condition=="A1"),  # 0 is at incongruent
+    cong_dum_cong = as.numeric(condition=="A2")  # 0 is at congruent
+  ) |>
+  group_split(time)
+
+gc()
+
+# # sanity check
+# sc_pl <- d_i_list |>
+#   reduce(bind_rows) |>
+#   mutate(pred_bin = factor(case_when(
+#     pred_norm <.33 ~ "low",
+#     pred_norm <.66 ~ "medium",
+#     pred_norm < Inf ~ "high"
+#   ), levels = c("low", "medium", "high"))) |>
+#   group_by(time, condition, pred_bin, ch_name) |>
+#   summarise(mean_uV = mean(uV)) |>
+#   ungroup() |>
+#   pivot_wider(names_from=condition, values_from=mean_uV) |>
+#   mutate(cong_minus_incong = A1-A2) |>
+#   ggplot(aes(time, cong_minus_incong, colour=ch_name)) +
+#   geom_line() +
+#   facet_wrap(vars(pred_bin))
+
+message("Fitting models in parallel to OT channels in picture-word experiment")
+
+# d_i_list <- get_sample_data(roi) %>%
+#   mutate(cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE))) %>%
+#   filter(time <= 655) %>%
+#   arrange(time) %>%
+#   split(time)
+# 
+# gc()
+
+cl <- makeCluster(n_cores)
+cl_packages <- clusterEvalQ(cl, {
+  library(dplyr)
+  library(lme4)
+})
+
+fe_res <- parLapply(cl, d_i_list, function(d_i) {
+  # m <- lme4::lmer(
+  #   uV ~ cong_dev * pred_norm +
+  #     (cong_dev * pred_norm | subj_id) +
+  #     (cong_dev * pred_norm | ch_name:subj_id) +
+  #     (cong_dev | image) +
+  #     (1 | string),
+  #   REML=FALSE,
+  #   control = lmerControl(optimizer="bobyqa"),
+  #   data=d_i
+  # )
+  m <- lme4::lmer(
+    uV ~ cong_dev * pred_norm +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | image) +
+      (1 | string),
+    REML=FALSE,
+    control = lmerControl(optimizer="bobyqa"),
+    data=d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(target_time_ids, each=4))
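+# (four fixed-effect rows per timepoint model: intercept, congruency,
+# predictability, and their interaction, hence each=4)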
+
+stopCluster(cl)
+
+fe_res_tidy <- fe_res %>%
+  mutate(
+    fe_lab = factor(recode(
+      fe,
+      `(Intercept)` = "Intercept",
+      cong_dev = "Congruency",
+      pred_norm = "Predictability",
+      `cong_dev:pred_norm` = "Congruency * Predictability"
+    ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+    fe_lab_newline = fe_lab,
+    bound_lower = Estimate - 1.96 * `Std. Error`,
+    bound_upper = Estimate + 1.96 * `Std. Error`
+  )
+
+levels(fe_res_tidy$fe_lab) <- c("Intercept"="Intercept", "Congruency"="Congruency", "Predictability"="Predictability", "Congruency * Predictability"="Congruency × Predictability")
+
+levels(fe_res_tidy$fe_lab_newline) <- c("Intercept"="Intercept", "Congruency"="Congruency", "Predictability"="Predictability", "Congruency * Predictability"="Congruency<br>× Predictability")
+
+overlay_dat <- fe_res_tidy %>%
+  filter(fe_lab=="Intercept") %>%
+  select(-fe_lab, -fe_lab_newline) %>%
+  expand_grid(fe_lab = unique(fe_res_tidy$fe_lab)) %>%
+  mutate(
+    Estimate = ifelse(fe_lab=="Intercept", NA, Estimate),
+    fe_lab_newline = fe_lab
+  )
+
+levels(overlay_dat$fe_lab) <- c("Intercept"="Intercept", "Congruency"="Congruency", "Predictability"="Predictability", "Congruency * Predictability"="Congruency × Predictability")
+
+levels(overlay_dat$fe_lab_newline) <- c("Intercept"="Intercept", "Congruency"="Congruency", "Predictability"="Predictability", "Congruency * Predictability"="Congruency<br>× Predictability")
+
+ylims <- round(c(min(fe_res_tidy$bound_lower), max(fe_res_tidy$bound_upper)))
+
+pl_roi <- fe_res_tidy %>%
+  ggplot(aes(time, Estimate, ymin=bound_lower, ymax=bound_upper, group=fe_lab)) +
+  geom_line(colour="black", data=overlay_dat, alpha=0.6) +
+  geom_ribbon(alpha=0.5, linewidth=0.1, fill="dodgerblue") +
+  geom_line() +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  facet_wrap(vars(fe_lab), nrow=2) +
+  labs(x = "Time (ms)", y = "Fixed Effect Estimate (µV)", tag="a") +
+  # scale_colour_manual(values = eff_cols) +
+  # scale_fill_manual(values = eff_cols) +
+  scale_x_continuous(expand = c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  theme(
+    legend.position = "none",
+    strip.background = element_blank(),
+    legend.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt"),
+    strip.text.x = ggtext::element_markdown()
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+pl_roi_1row <- fe_res_tidy %>%
+  ggplot(aes(time, Estimate, ymin=bound_lower, ymax=bound_upper, group=fe_lab)) +
+  geom_line(colour="darkgrey", data=overlay_dat) +
+  geom_ribbon(alpha=0.5, linewidth=0.1, fill="dodgerblue") +
+  geom_line() +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  facet_wrap(vars(fe_lab_newline), nrow=1) +
+  labs(x = "Time (ms)", y = "Effect Estimate (µV)", tag="a") +
+  # scale_colour_manual(values = eff_cols) +
+  # scale_fill_manual(values = eff_cols) +
+  scale_x_continuous(expand = c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  theme(
+    legend.position = "none",
+    strip.background = element_blank(),
+    legend.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt"),
+    strip.text.x = ggtext::element_markdown(),
+    strip.clip = "off"
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+preds <- expand_grid(
+  prop_agree = seq(0.1, 1, 0.1),
+  condition = factor(c("Picture-Congruent", "Picture-Incongruent"), levels=c("Picture-Congruent", "Picture-Incongruent"))
+) %>%
+  rowwise() %>%
+  mutate(pred_norm = ifelse(prop_agree==1, 1, norm01(c(0.07, prop_agree, 1))[2])) %>%
+  expand_grid(time = target_time_ids) %>%
+  ungroup() %>%
+  mutate(
+    perc_name_agree = prop_agree*100,
+    cong_dev = as.numeric(scale(ifelse(condition=="Picture-Incongruent", 0, 1), center=TRUE, scale=FALSE))
+  ) %>%
+  left_join(
+    fe_res %>%
+      select(time, fe, Estimate) %>%
+      pivot_wider(names_from=fe, values_from=Estimate, names_prefix="fe_"),
+    by = "time"
+  ) %>%
+  mutate(
+    pred_uV = `fe_(Intercept)` +
+      cong_dev * fe_cong_dev +
+      pred_norm * fe_pred_norm +
+      cong_dev * pred_norm * `fe_cong_dev:pred_norm`
+  )
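+
+# (predicted amplitudes are reconstructed from the per-timepoint fixed
+# effects by plugging in each cell's congruency and predictability codes)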
+
+pl_preds <- preds %>%
+  ggplot(aes(time, pred_uV, colour=prop_agree)) +
+  geom_line(aes(group = as.factor(prop_agree))) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  scale_colour_continuous(
+    type="viridis", breaks=sort(unique(preds$prop_agree)),
+    labels=sprintf("%s%%", sort(unique(preds$prop_agree))*100),
+    # guide=guide_colourbar(barheight = 7.6, barwidth = 1)
+    guide = guide_legend(override.aes = list(linewidth=1), reverse = TRUE)
+  ) +
+  scale_x_continuous(expand=c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  facet_wrap(vars(condition), nrow=1) +
+  labs(
+    x = "Time (ms)",
+    y = "Predicted Amplitude (µV)",
+    colour = "Predictability",
+    tag = "b"
+  ) +
+  theme(
+    legend.position = "right",
+    legend.key.height = unit(11, "pt"),
+    legend.text.align = 1,
+    strip.background = element_blank(),
+    plot.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt")
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+pl <- plot_grid(pl_roi, pl_preds, ncol=1, rel_heights = c(1, 0.8), align="v", axis="l")
+
+ggsave(file.path("figs", "sample_level", "picture_word", "roi.pdf"), pl, width=6.5, height=6, device="pdf")
+ggsave(file.path("figs", "sample_level", "picture_word", "roi.png"), pl, width=6.5, height=6, device="png", type="cairo")
+
+pl_preds_bypred <- preds %>%
+  mutate(
+    perc_lab = factor(
+      sprintf("%s%%", perc_name_agree),
+      levels = sprintf("%s%%", sort(unique(perc_name_agree)))
+    ),
+    cong_lab = ifelse(condition=="Picture-Congruent", "Congruent", "Incongruent")
+  ) %>%
+  ggplot(aes(time, pred_uV, colour=cong_lab)) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  geom_line() +
+  scale_colour_manual(
+    values = cong_cols,
+    guide=guide_legend(nrow=1, override.aes = list(linewidth=1))
+  ) +
+  scale_x_continuous(expand=c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  facet_wrap(vars(perc_lab), nrow=3) +
+  labs(
+    x = "Time (ms)",
+    y = "Predicted Amplitude (µV)",
+    colour = "Picture-Word Congruency"
+  ) +
+  theme(
+    legend.position = c(0.925, -0.025),
+    legend.justification = c(1, 0),
+    legend.margin = margin(),
+    legend.key.height = unit(12, "pt"),
+    strip.background = element_blank(),
+    plot.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt")
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+pl_plus_C <- plot_grid(
+  pl_roi_1row + theme(strip.text = element_text(margin=margin(t=2.5, b=2.5, unit="pt"))),
+  pl_preds +
+    theme(
+      strip.text = element_text(margin=margin(t=2.5, b=2.5, unit="pt")),
+      legend.key.height = unit(9.5, "pt")
+    ),
+  pl_preds_bypred +
+    labs(tag="c") +
+    theme(strip.text = element_text(margin=margin(t=2.5, b=2.5, unit="pt"))),
+  ncol=1, rel_heights = c(0.85, 1.2, 1.5), align="v", axis="l"
+)
+
+
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_3panels.pdf"), pl_plus_C, width=6.5, height=7, device="pdf")
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_3panels.png"), pl_plus_C, width=6.5, height=7, device="png", type="cairo")
+
+
+# simple effects models ---------------------------------------------------
+
+cl <- makeCluster(n_cores)
+cl_packages <- clusterEvalQ(cl, {
+  library(dplyr)
+  library(lme4)
+})
+
+fe_res_cong <- parLapply(cl, d_i_list, function(d_i) {
+  m <- lme4::lmer(
+    uV ~ cong_dum_cong * pred_norm +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | image) +
+      (1 | string),
+    REML=FALSE,
+    control = lmerControl(optimizer="bobyqa"),
+    data=d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(target_time_ids, each=4))
+
+fe_res_incong <- parLapply(cl, d_i_list, function(d_i) {
+  m <- lme4::lmer(
+    uV ~ cong_dum_incong * pred_norm +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | image) +
+      (1 | string),
+    REML=FALSE,
+    control = lmerControl(optimizer="bobyqa"),
+    data=d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(target_time_ids, each=4))
+
+stopCluster(cl)
+
+
+fe_res_tidy_se <- bind_rows(
+  fe_res_cong %>%
+    mutate(
+      cong_level = "Congruent",
+      fe_lab = factor(recode(
+        fe,
+        `(Intercept)` = "Intercept",
+        cong_dum_cong = "Congruency",
+        pred_norm = "Predictability",
+        `cong_dum_cong:pred_norm` = "Congruency * Predictability"
+      ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+      fe_lab_newline = factor(fe_lab, labels = c("Intercept", "Congruency", "Predictability", "Congruency\n* Predictability")),
+      bound_lower = Estimate - 1.96 * `Std. Error`,
+      bound_upper = Estimate + 1.96 * `Std. Error`
+    ),
+  fe_res_incong %>%
+    mutate(
+      cong_level = "Incongruent",
+      fe_lab = factor(recode(
+        fe,
+        `(Intercept)` = "Intercept",
+        cong_dum_incong = "Congruency",
+        pred_norm = "Predictability",
+        `cong_dum_incong:pred_norm` = "Congruency * Predictability"
+      ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+      fe_lab_newline = factor(fe_lab, labels = c("Intercept", "Congruency", "Predictability", "Congruency\n* Predictability")),
+      bound_lower = Estimate - 1.96 * `Std. Error`,
+      bound_upper = Estimate + 1.96 * `Std. Error`
+    )
+)
+
+fe_res_tidy_se_pred <- filter(fe_res_tidy_se, fe_lab=="Predictability")
+
+fe_se_pl_alpha <- 0.4
+
+fe_se_pl <- fe_res_tidy_se_pred %>%
+  ggplot(aes(time, Estimate, ymin=bound_lower, ymax=bound_upper, colour=cong_level, fill=cong_level)) +
+  geom_ribbon(alpha=fe_se_pl_alpha, linewidth=0.25, data=filter(fe_res_tidy_se_pred, cong_level=="Congruent")) +
+  geom_line(data=filter(fe_res_tidy_se_pred, cong_level=="Congruent")) +
+  geom_ribbon(alpha=fe_se_pl_alpha, linewidth=0.25, data=filter(fe_res_tidy_se_pred, cong_level=="Incongruent")) +
+  geom_line(data=filter(fe_res_tidy_se_pred, cong_level=="Incongruent")) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  scale_colour_manual(values = cong_cols) +
+  scale_fill_manual(values = cong_cols) +
+  scale_x_continuous(expand=c(0, 0), breaks = seq(-200, time_lims[2], 100), limits=c(-250, time_lims[2])) +
+  scale_y_continuous(breaks=scales::breaks_pretty(n=6)) +
+  guides(
+    colour = guide_legend(nrow=1),
+    fill = guide_legend(nrow=1, override.aes = list(alpha=1, fill=cong_cols_alpha_0.4_white))
+  ) +
+  labs(
+    x = "Time (ms)",
+    y = "Effect of Predictability (µV)",
+    colour = "Picture-Word Congruency",
+    fill = "Picture-Word Congruency"
+  ) +
+  theme(
+    legend.key.height = unit(12, "pt"),
+    legend.position = "top",
+    axis.line.x = element_blank()
+  )
+
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_simple_effs.pdf"), fe_se_pl, width=4.75, height=2.75, device="pdf")
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_simple_effs.png"), fe_se_pl, width=4.75, height=2.75, device="png", type="cairo")
+

+ 459 - 0
04 Analysis/analyse_07_sample_picture_word_roi_RH.R

@@ -0,0 +1,459 @@
+library(lme4)
+library(dplyr)
+library(tidyr)
+library(purrr)
+library(readr)
+library(ggplot2)
+library(ggtext)
+library(cowplot)
+library(patchwork)
+library(parallel)
+ggplot2::theme_set(ggplot2::theme_classic())
+n_cores <- 14
+
+library(RColorBrewer)
+eff_cols <- c("#000000", brewer.pal(3, "Set1")[1:2], brewer.pal(5, "Set1")[5])
+cong_cols <- c("#E69F00", "#009E73")
+cong_cols_alpha_0.4_white <- c("#F5D899", "#99D8C7")
+
+# roi <- c("TP7", "CP5", "P5", "P7", "P9", "PO3", "PO7", "O1")
+roi <- c("P10", "O2", "PO8", "P8", "TP8", "PO4", "P6", "CP6")
+time_lims <- c(-250, 650)
+
+# function to normalise between 0 and 1
+norm01 <- function(x, ...) (x-min(x, ...))/(max(x, ...)-min(x, ...))
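+# e.g. norm01(c(2, 4, 6)) returns 0.0 0.5 1.0; the dots pass na.rm = TRUE
+# through to min() and max() where needed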
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+# import the sample-level data from the preprocessing, and set up the variables for the model
+get_sample_data <- function(ch_i = NA) {
+  list.files("sample_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+    map_dfr(function(f) {
+      x <- read_csv(
+        f,
+        col_types=cols(
+          subj_id = col_character(),
+          stim_grp = col_integer(),
+          resp_grp = col_integer(),
+          item_nr = col_integer(),
+          ch_name = col_character(),
+          time = col_double(),
+          uV = col_double(),
+          condition = col_character(),
+          image = col_character(),
+          string = col_character(),
+          .default = col_double()
+        ),
+        progress = FALSE
+      ) |>
+        select(-stim_grp, -resp_grp, -item_nr) |>
+        filter(time > time_lims[[1]] & time < time_lims[[2]])
+      if (all(!is.na(ch_i))) x <- filter(x, ch_name %in% ch_i)
+      x
+    }) %>%
+    left_join(stim, by=c("image" = "filename")) %>%
+    mutate(
+      prop_agree = perc_name_agree/100,
+      pred_norm = norm01(prop_agree, na.rm=TRUE),
+      pred_norm_max = pred_norm - 1
+    ) %>%
+    select(-perc_name_agree)
+}
+
+unique_ids <- list.files("sample_data", pattern=".+\\.csv$", full.names=TRUE) %>%
+  first() %>%
+  read_csv() |>
+  filter(time > time_lims[[1]] & time < time_lims[[2]]) |>
+  select(ch_name, time) %>%
+  as.list() %>%
+  lapply(unique)
+
+target_time_ids <- unique_ids$time
+
+# fit a model at each timepoint (pooling over the ROI electrodes) and extract fixed effects
+d_i_list <- get_sample_data(roi) |>
+  mutate(
+    cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE)),
+    cong_dum_incong = as.numeric(condition=="A1"),  # 0 is at incongruent
+    cong_dum_cong = as.numeric(condition=="A2")  # 0 is at congruent
+  ) |>
+  group_split(time)
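+# one data frame per unique timepoint (ascending in time), so each timepoint's
+# model can be fitted independently in parallel below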
+
+gc()
+
+# # sanity check
+# sc_pl <- d_i_list |>
+#   reduce(bind_rows) |>
+#   mutate(pred_bin = factor(case_when(
+#     pred_norm <.33 ~ "low",
+#     pred_norm <.66 ~ "medium",
+#     pred_norm < Inf ~ "high"
+#   ), levels = c("low", "medium", "high"))) |>
+#   group_by(time, condition, pred_bin, ch_name) |>
+#   summarise(mean_uV = mean(uV)) |>
+#   ungroup() |>
+#   pivot_wider(names_from=condition, values_from=mean_uV) |>
+#   mutate(cong_minus_incong = A1-A2) |>
+#   ggplot(aes(time, cong_minus_incong, colour=ch_name)) +
+#   geom_line() +
+#   facet_wrap(vars(pred_bin))
+
+message("Fitting models in parallel to OT channels in picture-word experiment")
+
+# d_i_list <- get_sample_data(roi) %>%
+#   mutate(cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE))) %>%
+#   filter(time <= 655) %>%
+#   arrange(time) %>%
+#   split(time)
+# 
+# gc()
+
+cl <- makeCluster(n_cores)
+cl_packages <- clusterEvalQ(cl, {
+  library(dplyr)
+  library(lme4)
+})
+
+fe_res <- parLapply(cl, d_i_list, function(d_i) {
+  # m <- lme4::lmer(
+  #   uV ~ cong_dev * pred_norm +
+  #     (cong_dev * pred_norm | subj_id) +
+  #     (cong_dev * pred_norm | ch_name:subj_id) +
+  #     (cong_dev | image) +
+  #     (1 | string),
+  #   REML=FALSE,
+  #   control = lmerControl(optimizer="bobyqa"),
+  #   data=d_i
+  # )
+  m <- lme4::lmer(
+    uV ~ cong_dev * pred_norm +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | image) +
+      (1 | string),
+    REML=FALSE,
+    control = lmerControl(optimizer="bobyqa"),
+    data=d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(target_time_ids, each=4))
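+# note: rep(target_time_ids, each=4) assumes every model returns exactly four
+# fixed effects (intercept, congruency, predictability, interaction), in the
+# same ascending time order as d_i_list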
+
+stopCluster(cl)
+
+fe_res_tidy <- fe_res %>%
+  mutate(
+    fe_lab = factor(recode(
+      fe,
+      `(Intercept)` = "Intercept",
+      cong_dev = "Congruency",
+      pred_norm = "Predictability",
+      `cong_dev:pred_norm` = "Congruency * Predictability"
+    ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+    fe_lab_newline = fe_lab,
+    bound_lower = Estimate - 1.96 * `Std. Error`,
+    bound_upper = Estimate + 1.96 * `Std. Error`
+  )
+
+levels(fe_res_tidy$fe_lab) <- c("Intercept", "Congruency", "Predictability", "Congruency × Predictability")
+
+levels(fe_res_tidy$fe_lab_newline) <- c("Intercept", "Congruency", "Predictability", "Congruency<br>× Predictability")
+
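+# replot the intercept trace behind every facet as a reference; Estimate is
+# blanked for the Intercept facet itself so that trace is not drawn twice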
+overlay_dat <- fe_res_tidy %>%
+  filter(fe_lab=="Intercept") %>%
+  select(-fe_lab) %>%
+  expand_grid(fe_lab = unique(fe_res_tidy$fe_lab)) %>%
+  mutate(
+    Estimate = ifelse(fe_lab=="Intercept", NA, Estimate)
+  )
+
+ylims <- round(c(min(fe_res_tidy$bound_lower), max(fe_res_tidy$bound_upper)))
+
+pl_roi <- fe_res_tidy %>%
+  ggplot(aes(time, Estimate, ymin=bound_lower, ymax=bound_upper, group=fe_lab)) +
+  geom_line(colour="black", data=overlay_dat, alpha=0.6) +
+  geom_ribbon(alpha=0.5, linewidth=0.1, fill="dodgerblue") +
+  geom_line() +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  facet_wrap(vars(fe_lab), nrow=2) +
+  labs(x = "Time (ms)", y = "Fixed Effect Estimate (µV)", tag="a") +
+  # scale_colour_manual(values = eff_cols) +
+  # scale_fill_manual(values = eff_cols) +
+  scale_x_continuous(expand = c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  theme(
+    legend.position = "none",
+    strip.background = element_blank(),
+    legend.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt"),
+    strip.text.x = ggtext::element_markdown()
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+pl_roi_1row <- fe_res_tidy %>%
+  ggplot(aes(time, Estimate, ymin=bound_lower, ymax=bound_upper, group=fe_lab)) +
+  geom_line(colour="darkgrey", data=overlay_dat) +
+  geom_ribbon(alpha=0.5, linewidth=0.1, fill="dodgerblue") +
+  geom_line() +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  facet_wrap(vars(fe_lab_newline), nrow=1) +
+  labs(x = "Time (ms)", y = "Effect Estimate (µV)", tag="a") +
+  # scale_colour_manual(values = eff_cols) +
+  # scale_fill_manual(values = eff_cols) +
+  scale_x_continuous(expand = c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  theme(
+    legend.position = "none",
+    strip.background = element_blank(),
+    legend.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt"),
+    strip.text.x = ggtext::element_markdown(),
+    strip.clip = "off"
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+preds <- expand_grid(
+  prop_agree = seq(0.1, 1, 0.1),
+  condition = factor(c("Picture-Congruent", "Picture-Incongruent"), levels=c("Picture-Congruent", "Picture-Incongruent"))
+) %>%
+  rowwise() %>%
+  mutate(pred_norm = ifelse(prop_agree==1, 1, norm01(c(0.07, prop_agree, 1))[2])) %>%
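+  # 0.07 appears to be the minimum prop_agree in the stimulus set, so this
+  # maps the 10%-100% prediction grid onto the same normalised scale the
+  # models were fitted on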
+  expand_grid(time = target_time_ids) %>%
+  ungroup() %>%
+  mutate(
+    perc_name_agree = prop_agree*100,
+    cong_dev = as.numeric(scale(ifelse(condition=="Picture-Incongruent", 0, 1), center=TRUE, scale=FALSE))
+  ) %>%
+  left_join(
+    fe_res %>%
+      select(time, fe, Estimate) %>%
+      pivot_wider(names_from=fe, values_from=Estimate, names_prefix="fe_"),
+    by = "time"
+  ) %>%
+  mutate(
+    pred_uV = `fe_(Intercept)` +
+      cong_dev * fe_cong_dev +
+      pred_norm * fe_pred_norm +
+      cong_dev * pred_norm * `fe_cong_dev:pred_norm`
+  )
+
+pl_preds <- preds %>%
+  ggplot(aes(time, pred_uV, colour=prop_agree)) +
+  geom_line(aes(group = as.factor(prop_agree))) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  scale_colour_continuous(
+    type="viridis", breaks=sort(unique(preds$prop_agree)),
+    labels=sprintf("%s%%", sort(unique(preds$prop_agree))*100),
+    # guide=guide_colourbar(barheight = 7.6, barwidth = 1)
+    guide = guide_legend(override.aes = list(linewidth=1), reverse = TRUE)
+  ) +
+  scale_x_continuous(expand=c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  facet_wrap(vars(condition), nrow=1) +
+  labs(
+    x = "Time (ms)",
+    y = "Predicted Amplitude (µV)",
+    colour = "Predictability",
+    tag = "b"
+  ) +
+  theme(
+    legend.position = "right",
+    legend.key.height = unit(11, "pt"),
+    legend.text.align = 1,
+    strip.background = element_blank(),
+    plot.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt")
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+pl <- plot_grid(pl_roi, pl_preds, ncol=1, rel_heights = c(1, 0.8), align="v", axis="l")
+
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_rh.pdf"), pl, width=6.5, height=6, device="pdf")
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_rh.png"), pl, width=6.5, height=6, device="png", type="cairo")
+
+pl_preds_bypred <- preds %>%
+  mutate(
+    perc_lab = factor(
+      sprintf("%s%%", perc_name_agree),
+      levels = sprintf("%s%%", sort(unique(perc_name_agree)))
+    ),
+    cong_lab = ifelse(condition=="Picture-Congruent", "Congruent", "Incongruent")
+  ) %>%
+  ggplot(aes(time, pred_uV, colour=cong_lab)) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  geom_line() +
+  scale_colour_manual(
+    values = cong_cols,
+    guide=guide_legend(nrow=1, override.aes = list(linewidth=1))
+  ) +
+  scale_x_continuous(expand=c(0, 0)) +
+  scale_y_continuous(limits=ylims) +
+  facet_wrap(vars(perc_lab), nrow=3) +
+  labs(
+    x = "Time (ms)",
+    y = "Predicted Amplitude (µV)",
+    colour = "Picture-Word Congruency"
+  ) +
+  theme(
+    legend.position = c(0.925, -0.025),
+    legend.justification = c(1, 0),
+    legend.margin = margin(),
+    legend.key.height = unit(12, "pt"),
+    strip.background = element_blank(),
+    plot.background = element_blank(),
+    axis.line.x = element_blank(),
+    panel.spacing.x = unit(12, "pt")
+  ) +
+  coord_cartesian(xlim=c(-250, time_lims[2]))
+
+pl_plus_C <- plot_grid(
+  pl_roi_1row + theme(strip.text = element_text(margin=margin(t=2.5, b=2.5, unit="pt"))),
+  pl_preds +
+    theme(
+      strip.text = element_text(margin=margin(t=2.5, b=2.5, unit="pt")),
+      legend.key.height = unit(9.5, "pt")
+    ),
+  pl_preds_bypred +
+    labs(tag="c") +
+    theme(strip.text = element_text(margin=margin(t=2.5, b=2.5, unit="pt"))),
+  ncol=1, rel_heights = c(0.85, 1.2, 1.5), align="v", axis="l"
+)
+
+
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_3panels_rh.pdf"), pl_plus_C, width=6.5, height=7, device="pdf")
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_3panels_rh.png"), pl_plus_C, width=6.5, height=7, device="png", type="cairo")
+
+
+# simple effects models ---------------------------------------------------
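+# refitting with dummy (treatment) codes makes the pred_norm coefficient the
+# simple effect of predictability at whichever congruency level is coded 0,
+# rather than its average across congruency levels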
+
+cl <- makeCluster(n_cores)
+cl_packages <- clusterEvalQ(cl, {
+  library(dplyr)
+  library(lme4)
+})
+
+fe_res_cong <- parLapply(cl, d_i_list, function(d_i) {
+  m <- lme4::lmer(
+    uV ~ cong_dum_cong * pred_norm +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | image) +
+      (1 | string),
+    REML=FALSE,
+    control = lmerControl(optimizer="bobyqa"),
+    data=d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(target_time_ids, each=4))
+
+fe_res_incong <- parLapply(cl, d_i_list, function(d_i) {
+  m <- lme4::lmer(
+    uV ~ cong_dum_incong * pred_norm +
+      (1 | subj_id) +
+      (1 | ch_name:subj_id) +
+      (1 | image) +
+      (1 | string),
+    REML=FALSE,
+    control = lmerControl(optimizer="bobyqa"),
+    data=d_i
+  )
+  
+  m %>%
+    summary() %>%
+    with(coefficients) %>%
+    as_tibble(rownames = "fe")
+}) %>%
+  reduce(bind_rows) %>%
+  mutate(time = rep(target_time_ids, each=4))
+
+stopCluster(cl)
+
+
+fe_res_tidy_se <- bind_rows(
+  fe_res_cong %>%
+    mutate(
+      cong_level = "Congruent",
+      fe_lab = factor(recode(
+        fe,
+        `(Intercept)` = "Intercept",
+        cong_dum_cong = "Congruency",
+        pred_norm = "Predictability",
+        `cong_dum_cong:pred_norm` = "Congruency * Predictability"
+      ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+      fe_lab_newline = factor(fe_lab, labels = c("Intercept", "Congruency", "Predictability", "Congruency\n* Predictability")),
+      bound_lower = Estimate - 1.96 * `Std. Error`,
+      bound_upper = Estimate + 1.96 * `Std. Error`
+    ),
+  fe_res_incong %>%
+    mutate(
+      cong_level = "Incongruent",
+      fe_lab = factor(recode(
+        fe,
+        `(Intercept)` = "Intercept",
+        cong_dum_incong = "Congruency",
+        pred_norm = "Predictability",
+        `cong_dum_incong:pred_norm` = "Congruency * Predictability"
+      ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+      fe_lab_newline = factor(fe_lab, labels = c("Intercept", "Congruency", "Predictability", "Congruency\n* Predictability")),
+      bound_lower = Estimate - 1.96 * `Std. Error`,
+      bound_upper = Estimate + 1.96 * `Std. Error`
+    )
+)
+
+fe_res_tidy_se_pred <- filter(fe_res_tidy_se, fe_lab=="Predictability")
+
+fe_se_pl_alpha <- 0.4
+
+fe_se_pl <- fe_res_tidy_se_pred %>%
+  ggplot(aes(time, Estimate, ymin=bound_lower, ymax=bound_upper, colour=cong_level, fill=cong_level)) +
+  geom_ribbon(alpha=fe_se_pl_alpha, linewidth=0.25, data=filter(fe_res_tidy_se_pred, cong_level=="Congruent")) +
+  geom_line(data=filter(fe_res_tidy_se_pred, cong_level=="Congruent")) +
+  geom_ribbon(alpha=fe_se_pl_alpha, linewidth=0.25, data=filter(fe_res_tidy_se_pred, cong_level=="Incongruent")) +
+  geom_line(data=filter(fe_res_tidy_se_pred, cong_level=="Incongruent")) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  geom_hline(yintercept=0, linetype="dashed") +
+  scale_colour_manual(values = cong_cols) +
+  scale_fill_manual(values = cong_cols) +
+  scale_x_continuous(expand=c(0, 0), breaks = seq(-200, time_lims[2], 100), limits=c(-250, time_lims[2])) +
+  scale_y_continuous(breaks=scales::breaks_pretty(n=6)) +
+  guides(
+    colour = guide_legend(nrow=1),
+    fill = guide_legend(nrow=1, override.aes = list(alpha=1, fill=cong_cols_alpha_0.4_white))
+  ) +
+  labs(
+    x = "Time (ms)",
+    y = "Effect of Predictability (µV)",
+    colour = "Picture-Word Congruency",
+    fill = "Picture-Word Congruency"
+  ) +
+  theme(
+    legend.key.height = unit(12, "pt"),
+    legend.position = "top",
+    axis.line.x = element_blank()
+  )
+
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_simple_effs_rh.pdf"), fe_se_pl, width=4.75, height=2.75, device="pdf")
+ggsave(file.path("figs", "sample_level", "picture_word", "roi_simple_effs_rh.png"), fe_se_pl, width=4.75, height=2.75, device="png", type="cairo")
+

+ 595 - 0
04 Analysis/analyse_08_pictureword_rt.R

@@ -0,0 +1,595 @@
+library(dplyr)
+library(readr)
+library(purrr)
+library(tidyr)
+
+library(brms)
+library(ggdist)
+
+library(ggplot2)
+theme_set(theme_classic() + theme(strip.background = element_rect(fill = "white"), plot.background = element_blank()))
+library(cowplot)
+
+cong_cols <- c("#E69F00", "#009E73")
+
+norm01 <- function(x, ...) (x-min(x, ...)) / (max(x, ...) - min(x, ...))
+norm01_manual <- function(x, min_x, max_x) (x-min_x) / (max_x - min_x)
+
+# summarise posteriors from behavioural validation experiment -------------
+
+# valid_m <- readRDS(file.path("..", "01 Validation", "02 Analysis", "mods", "m_bme.rds"))
+# 
+# valid_m_ests <- valid_m %>%
+#   as_draws_df("^b\\_|^sd\\_|^cor\\_", regex=TRUE) %>%
+#   select(-starts_with(".")) %>%
+#   pivot_longer(cols=everything(), names_to="par", values_to="est")
+# 
+# lapply(unique(valid_m_ests$par), function(p) {
+#   d_i <- filter(valid_m_ests, par == p)
+#   
+#   ggplot(d_i, aes(est)) +
+#     geom_function(fun=dnorm, args=list(mean=mean(d_i$est), sd=sd(d_i$est)), colour="red") +
+#     geom_density() +
+#     geom_vline(xintercept = median(d_i$est), colour="blue") +
+#     facet_wrap(vars(par), scales="free") +
+#     labs(x=NULL, y=NULL)
+# }) %>%
+#   wrap_plots()
+# 
+# valid_m_norm <- valid_m_ests %>%
+#   group_by(par) %>%
+#   summarise(
+#     median_est = median(est),
+#     mean_est = mean(est),
+#     sd_est = sd(est),
+#     sd_est10 = sd_est*10
+#   )
+# 
+# rm(valid_m)
+# gc()
+
+# import data -------------------------------------------------------------
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+d <- file.path("raw_data", "stim-pc", "data", "pictureword") %>%
+  list.files(pattern = "^.*\\.csv$", full.names = TRUE) %>%
+  map_df(read_csv, col_types = cols(sex="c")) %>%
+  filter(acc == 1) %>%
+  left_join(stim, by=c("image" = "filename")) %>%
+  mutate(
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree),
+    cong_dev = scale(if_else(condition == "A1", 1, 0), center = TRUE, scale = FALSE)
+  )
+
+# setup priors for RT model -----------------------------------------------
+
+priors <- c(
+  # FIXED EFFECTS
+  #  mu
+  set_prior("normal(5.75, 0.71)", class = "b", coef = "Intercept"),
+  set_prior("normal(0.472, 0.875)", class = "b", coef = "cong_dev"),
+  set_prior("normal(-0.543, 0.78)", class = "b", coef = "pred_norm"),
+  set_prior("normal(-0.671, 1.29)", class = "b", coef = "cong_dev:pred_norm"),
+  #  sigma
+  set_prior("normal(-0.85, 0.535)", class = "b", coef = "Intercept", dpar="sigma"),
+  set_prior("normal(0.0404, 0.94)", class = "b", coef = "cong_dev", dpar="sigma"),
+  set_prior("normal(0.229, 0.755)", class = "b", coef = "pred_norm", dpar="sigma"),
+  set_prior("normal(0.142, 1.345)", class = "b", coef = "cong_dev:pred_norm", dpar="sigma"),
+  #  delta
+  #  (wider than the other priors, and equivalent to a delay of just
+  #  exp(3) = 20 ms rather than the exp(5.63) from the validation posterior,
+  #  because I expected the forced delay of response (until colour change)
+  #  to greatly reduce non-decision time, but I'm not sure by how much)
+  set_prior("normal(0, 7.5)", class = "b", coef = "Intercept", dpar="ndt"),
+  set_prior("normal(-0.4, 7.5)", class = "b", coef = "cong_dev", dpar="ndt"),
+  set_prior("normal(0.132, 7.5)", class = "b", coef = "pred_norm", dpar="ndt"),
+  set_prior("normal(-0.671, 7.5)", class = "b", coef = "cong_dev:pred_norm", dpar="ndt"),
+  # STANDARD DEVIATIONS OF RANDOM EFFECT DISTRIBUTIONS
+  #  mu
+  #    -subj_id
+  set_prior("student_t(10, 0.29, 0.1)", class = "sd", coef = "Intercept", group = "subj_id"),
+  set_prior("student_t(10, 0.079, 0.1)", class = "sd", coef = "cong_dev", group = "subj_id"),
+  set_prior("student_t(10, 0.128, 0.1)", class = "sd", coef = "pred_norm", group = "subj_id"),
+  set_prior("student_t(10, 0.077, 0.25)", class = "sd", coef = "cong_dev:pred_norm", group = "subj_id"),
+  #    -image
+  set_prior("student_t(10, 0.116, 0.05)", class = "sd", coef = "Intercept", group = "image"),
+  set_prior("student_t(10, 0.137, 0.1)", class = "sd", coef = "cong_dev", group = "image"),
+  #    -string
+  set_prior("student_t(10, 0.379, 0.05)", class = "sd", coef = "Intercept", group = "string"),
+  #  sigma
+  #    -subj_id
+  set_prior("student_t(10, 0.98, 0.1)", class = "sd", coef = "Intercept", group = "subj_id", dpar = "sigma"),
+  set_prior("student_t(10, 0.121, 0.1)", class = "sd", coef = "cong_dev", group = "subj_id", dpar = "sigma"),
+  set_prior("student_t(10, 0.075, 0.1)", class = "sd", coef = "pred_norm", group = "subj_id", dpar = "sigma"),
+  set_prior("student_t(10, 0.084, 0.25)", class = "sd", coef = "cong_dev:pred_norm", group = "subj_id", dpar = "sigma"),
+  #    -image
+  set_prior("student_t(10, 0.068, 0.05)", class = "sd", coef = "Intercept", group = "image", dpar = "sigma"),
+  set_prior("student_t(10, 0.1, 0.1)", class = "sd", coef = "cong_dev", group = "image", dpar = "sigma"),
+  #    -string
+  set_prior("student_t(10, 0.039, 0.05)", class = "sd", coef = "Intercept", group = "string", dpar = "sigma"),
+  #  delta
+  set_prior("student_t(10, 0.096, 0.1)", class = "sd", coef = "Intercept", group = "subj_id", dpar = "ndt"),
+  set_prior("student_t(10, 0.071, 0.1)", class = "sd", coef = "cong_dev", group = "subj_id", dpar = "ndt"),
+  set_prior("student_t(10, 0.028, 0.1)", class = "sd", coef = "pred_norm", group = "subj_id", dpar = "ndt"),
+  set_prior("student_t(10, 0.038, 0.25)", class = "sd", coef = "cong_dev:pred_norm", group = "subj_id", dpar = "ndt"),
+  #    -image
+  set_prior("student_t(10, 0.245, 0.05)", class = "sd", coef = "Intercept", group = "image", dpar = "ndt"),
+  set_prior("student_t(10, 0.023, 0.1)", class = "sd", coef = "cong_dev", group = "image", dpar = "ndt"),
+  #    -string
+  set_prior("student_t(10, 0.015, 0.05)", class = "sd", coef = "Intercept", group = "string", dpar = "ndt")
+)
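+# the numeric means and scales above appear to come from posterior summaries
+# of the behavioural validation model (see the commented-out summary code at
+# the top of this script)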
+
+n_cores <- 7
+
+seed <- 3101
+n_iter <- 10000
+n_warmup <- 7500
+adapt_delta <- 0.99
+max_treedepth <- 10
+n_chains <- 5
+refresh <- 100
+
+f <- brmsformula(
+  rt ~ 0 + Intercept + cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  sigma ~ 0 + Intercept + cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string),
+  ndt ~ 0 + Intercept + cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string)
+)
+
+m_rt <- brm(
+  formula = f,
+  data = d,
+  family = shifted_lognormal(),
+  prior = priors,
+  iter = n_iter,
+  warmup = n_warmup,
+  chains = n_chains,
+  control = list(
+    adapt_delta = adapt_delta,
+    max_treedepth = max_treedepth
+  ),
+  init = replicate(
+    n_chains,
+    list(b_ndt = as.array(rep(-5, 4))),
+    simplify=FALSE
+  ),
+  sample_prior = "no",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  file = file.path("mods", "m_rt.rds"),
+  refresh = refresh
+)
+
+# plot results ------------------------------------------------------------
+
+# get predicted densities
+coding_lookup <- d %>%
+  group_by(condition) %>%
+  summarise(cong_dev = unique(cong_dev))
+
+props <- 1:10/10
+
+fe_tidy <- fixef(m_rt, robust=TRUE) %>%
+  as_tibble(rownames="term")
+
+fe <- sapply(fe_tidy$term, function(term_i) {
+  fe_tidy %>%
+    filter(term==term_i) %>%
+    pull(Estimate)
+})
+
+fe_preds <- tibble(
+  condition = rep(c("A1", "A2"), each = length(props)),
+  condition_label = if_else(condition=="A1", "Congruent", "Incongruent"),
+  prop_agree = rep(props, 2)
+) %>%
+  left_join(coding_lookup, by = "condition") %>%
+  mutate(
+    pred_norm = norm01_manual(prop_agree, min(d$prop_agree), max(d$prop_agree)),
+    int_mu = fe["Intercept"],
+    int_sigma = fe["sigma_Intercept"],
+    int_ndt = fe["ndt_Intercept"],
+    cong_mu = fe["cong_dev"],
+    cong_sigma = fe["sigma_cong_dev"],
+    cong_ndt = fe["ndt_cong_dev"],
+    pred_norm_mu = fe["pred_norm"],
+    pred_norm_sigma = fe["sigma_pred_norm"],
+    pred_norm_ndt = fe["ndt_pred_norm"],
+    interact_mu = fe["cong_dev:pred_norm"],
+    interact_sigma = fe["sigma_cong_dev:pred_norm"],
+    interact_ndt = fe["ndt_cong_dev:pred_norm"],
+    pred_mu = int_mu + cong_dev*cong_mu + pred_norm*pred_norm_mu + cong_dev*pred_norm*interact_mu,
+    pred_sigma = int_sigma + cong_dev*cong_sigma + pred_norm*pred_norm_sigma + cong_dev*pred_norm*interact_sigma,
+    pred_ndt = int_ndt + cong_dev*cong_ndt + pred_norm*pred_norm_ndt + cong_dev*pred_norm*interact_ndt
+  )
+
+quantities <- 0:1000
+
+fe_cond_dens <- map_dfr(quantities, function(q) mutate(fe_preds, rt = q)) %>%
+  mutate(
+    pred_dens = dshifted_lnorm(
+      x = rt,
+      meanlog = pred_mu,
+      sdlog = exp(pred_sigma),
+      shift = exp(pred_ndt)
+    )
+  )
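+# sigma and ndt are modelled on the log scale (brms's default link for these
+# distributional parameters), hence the exp() back-transforms above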
+
+
+# build panel A
+
+panel_A_margin <- theme_get()$plot.margin
+panel_A_margin[[2]] <- unit(0.2, "npc")
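+# widen the right margin so the legend, placed outside the panel at x = 1.15
+# below, has room to render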
+
+pub_panel_A <- fe_cond_dens %>%
+  mutate(condition_label = sprintf("Picture-%s", condition_label)) %>%
+  ggplot(aes(rt, pred_dens, colour = prop_agree)) +
+  geom_line(aes(group = as.factor(prop_agree))) +
+  facet_wrap(~condition_label) +
+  labs(x = "Response Time (ms)", y = "Predicted Density", colour = "Predictability", tag="a") +
+  scale_colour_continuous(
+    type="viridis", breaks=sort(unique(fe_cond_dens$prop_agree)),
+    labels=sprintf("%s%%", sort(unique(fe_cond_dens$prop_agree))*100),
+    # guide=guide_colourbar(barheight = 7.175)
+    guide = guide_legend(override.aes = list(linewidth=1), reverse = TRUE)
+  ) +
+  theme_classic() +
+  theme(
+    plot.margin = panel_A_margin,
+    legend.position = c(1.15, 0.5875),
+    legend.key.height = unit(11, "pt"),
+    legend.text.align = 1,
+    text=element_text(size=12),
+    axis.text.x = element_text(angle=0, hjust=0.5, vjust=0.5),
+    legend.title.align = 0,
+    legend.spacing.y = unit(1, "pt"),
+    # plot.title = element_text(hjust=-0.05),
+    axis.text.y=element_blank(),
+    axis.ticks.y=element_blank(),
+    strip.background = element_rect(fill = "white")
+  )
+
+# get uncertainty in predictions for panel B
+
+# draw all samples from posteriors
+draws_spr <- as_draws_df(m_rt, "^b\\_.*", regex=TRUE)
+
+# function for calculating uncertainty around predictions for a given vector of response times
+get_pred_cr_i <- function(rt_i) {
+  cat(sprintf("\rCalculating densities %s - %s", min(rt_i), max(rt_i)))
+  expand_grid(
+    .draw = unique(draws_spr$.draw),
+    val_cong = unique(d$cong_dev),
+    prop_agree = 1:10/10,
+    rt = rt_i
+  ) %>%
+    left_join(draws_spr, by=".draw") %>%
+    mutate(
+      val_pred = norm01_manual(prop_agree, min(d$prop_agree), max(d$prop_agree)),
+      mu = b_Intercept +
+        (val_cong * b_cong_dev) +
+        (val_pred * b_pred_norm) +
+        (val_cong * val_pred * `b_cong_dev:pred_norm`),
+      sigma = b_sigma_Intercept +
+        (val_cong * b_sigma_cong_dev) +
+        (val_pred * b_sigma_pred_norm) +
+        (val_cong * val_pred * `b_sigma_cong_dev:pred_norm`),
+      delta = b_ndt_Intercept +
+        (val_cong * b_ndt_cong_dev) +
+        (val_pred * b_ndt_pred_norm) +
+        (val_cong * val_pred * `b_ndt_cong_dev:pred_norm`),
+      samp_dens = dshifted_lnorm(
+        x = rt,
+        meanlog = mu,
+        sdlog = exp(sigma),
+        shift = exp(delta)
+      )
+    ) %>%
+    group_by(rt, val_cong, val_pred, prop_agree) %>%
+    summarise(
+      pred_dens = median(samp_dens),
+      cr_i_low = hdi(samp_dens, .width=.89)[1],
+      cr_i_high = hdi(samp_dens, .width=.89)[2],
+      .groups = "drop"
+    )
+}
+
+# get relative likelihoods (chunked into groups of size 25)
+draws_pred_ci <- quantities[quantities>0] %>%
+  split(., ceiling(seq_along(.)/25)) %>%
+  map_dfr(get_pred_cr_i)
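+# chunking bounds memory use: each call only expands draws x congruency x
+# predictability x 25 rt values before summarising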
+
+# join panel A and panel B
+max_y_uncertainty <- round(max(draws_pred_ci$cr_i_high)+.00005, 5)
+
+pub_panel_A_uncertainty <- pub_panel_A +
+  lims(y = c(0, max_y_uncertainty))
+
+pub_panel_B_uncertainty <- draws_pred_ci %>%
+  left_join(coding_lookup, by=c("val_cong" = "cong_dev")) %>%
+  mutate(
+    condition_label = ifelse(condition=="A1", "Congruent", "Incongruent"),
+    pred_label = factor(sprintf("%s%%", prop_agree*100), levels = sprintf("%s%%", seq(10, 100, 10)))
+  ) %>%
+  ggplot(aes(rt, pred_dens, colour = condition_label, fill = condition_label)) +
+  geom_ribbon(aes(ymin = cr_i_low, ymax = cr_i_high), alpha=0.4) +
+  facet_wrap(vars(pred_label), nrow=2) +
+  scale_colour_manual(values = cong_cols) +
+  scale_fill_manual(values = cong_cols) +
+  guides(fill = guide_legend(override.aes = list(alpha = 0.5))) +
+  lims(y = c(0, max_y_uncertainty)) +
+  labs(
+    x = "Response Time (ms)",
+    y = "Predicted Density",
+    colour = "Picture-Word Congruency",
+    fill = "Picture-Word Congruency",
+    tag = "b"
+  ) +
+  theme(
+    legend.position = "bottom",
+    legend.key.height = unit(4, "pt"),
+    axis.text.y=element_blank(),
+    axis.ticks.y=element_blank(),
+    axis.text.x = element_text(angle=22.5, hjust=1, vjust=1),
+    # legend.key.height = grid::unit(0.1, "lines"),
+    # plot.title = element_text(hjust=-0.04),
+    strip.background = element_rect(fill = "white"),
+    legend.margin = margin()
+  )
+
+pub_fig_uncertainty <- plot_grid(pub_panel_A_uncertainty, pub_panel_B_uncertainty, nrow=2, rel_heights=c(2.5, 3.5))
+
+ggsave(file.path("figs", "08_rt_fixed_effects_uncertainty.pdf"), pub_fig_uncertainty, device = "pdf", units = "in", width = 6.5, height=6)
+ggsave(file.path("figs", "08_rt_fixed_effects_uncertainty.png"), pub_fig_uncertainty, device = "png", type="cairo", units = "in", width = 6.5, height=6)
+
+
+# compare priors and posteriors -------------------------------------------
+
+m_rt_prior_samps <- brm(
+  formula = f,
+  data = d,
+  family = shifted_lognormal(),
+  prior = priors,
+  iter = n_iter,
+  warmup = n_warmup,
+  chains = n_chains,
+  control = list(
+    adapt_delta = adapt_delta,
+    max_treedepth = max_treedepth
+  ),
+  init = replicate(
+    n_chains,
+    list(b_ndt = as.array(rep(-5, 4))),
+    simplify=FALSE
+  ),
+  sample_prior = "only",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  refresh = 2500
+)
+
+rm(draws_spr)
+gc()
+
+draws_joined <- bind_rows(
+  as_draws_df(m_rt, "^b\\_.*|^sd\\_.*", regex=TRUE) %>%
+    select(-.chain, -.iteration, -.draw) %>%
+    pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+    mutate(source="posterior"),
+  as_draws_df(m_rt_prior_samps, "^b\\_.*|^sd\\_.*", regex=TRUE) %>%
+    select(-.chain, -.iteration, -.draw) %>%
+    pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+    mutate(source="prior")
+) %>%
+  mutate(source = factor(source, levels = c("prior", "posterior")))
+
+
+pl_prior_post_fe_ints <- draws_joined %>%
+  filter(grepl("^b\\_", par), grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(recode(
+      par,
+      b_Intercept = "mu",
+      b_sigma_Intercept = "sigma",
+      b_ndt_Intercept = "delta"
+    ), levels = c("mu", "sigma", "delta"))
+  ) %>%
+  ggplot(aes(est, "Intercept", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red")) +
+  labs(
+    x = NULL,
+    y = NULL
+  ) +
+  theme(legend.position = "none")
+
+pl_prior_post_fe_slopes <- draws_joined %>%
+  filter(grepl("^b\\_", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta")),
+    eff = factor(case_when(
+      grepl("cong_dev:pred_norm", par, fixed=TRUE) ~ "Congruency\n* Predictability",
+      grepl("cong_dev", par, fixed=TRUE) ~ "Congruency",
+      grepl("pred_norm", par, fixed=TRUE) ~ "Predictability"
+    ), levels = c("Congruency", "Predictability", "Congruency\n* Predictability"))
+  ) %>%
+  ggplot(aes(est, reorder(eff, desc(eff)), colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = "Estimate",
+    y = NULL,
+    colour = NULL
+  ) +
+  theme(
+    legend.position = "bottom",
+    legend.margin = margin(),
+    strip.background = element_blank(),
+    strip.text.x = element_blank()
+  )
+
+pl_prior_post_fe <- plot_grid(pl_prior_post_fe_ints, pl_prior_post_fe_slopes, align="hv", axis="l", ncol=1, rel_heights=c(1.25, 2.85))
+
+ggsave(file.path("figs", "08_rt_prior_post_fixed_effects.pdf"), pl_prior_post_fe, width=6.5, height=3.5)
+ggsave(file.path("figs", "08_rt_prior_post_fixed_effects.png"), pl_prior_post_fe, width=6.5, height=3.5, device="png", type="cairo")
+
+
+# random effects plot
+
+# subject random effects SDs
+pl_prior_post_re_subj_ints <- draws_joined %>%
+  filter(grepl("^sd\\_subj\\_id", par), grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta"))
+  ) %>%
+  ggplot(aes(est, "Intercept", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red")) +
+  labs(
+    x = NULL,
+    y = NULL,
+    title = "Participant Random Effects SDs",
+    tag = "a"
+  ) +
+  theme(legend.position = "none")
+
+pl_prior_post_re_subj_slopes <- draws_joined %>%
+  filter(grepl("^sd\\_subj\\_id", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta")),
+    eff = factor(case_when(
+      grepl("cong_dev:pred_norm", par, fixed=TRUE) ~ "Congruency\n* Predictability",
+      grepl("cong_dev", par, fixed=TRUE) ~ "Congruency",
+      grepl("pred_norm", par, fixed=TRUE) ~ "Predictability"
+    ), levels = c("Congruency", "Predictability", "Congruency\n* Predictability"))
+  ) %>%
+  ggplot(aes(est, reorder(eff, desc(eff)), colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = NULL,
+    y = NULL,
+    colour = NULL
+  ) +
+  theme(
+    legend.position = "none",
+    strip.background = element_blank(),
+    strip.text.x = element_blank()
+  )
+
+# image random effects SDs
+pl_prior_post_re_image_ints <- draws_joined %>%
+  filter(grepl("^sd\\_image", par), grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta"))
+  ) %>%
+  ggplot(aes(est, "Intercept", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red")) +
+  labs(
+    x = NULL,
+    y = NULL,
+    title = "Image Random Effects SDs",
+    tag = "b"
+  ) +
+  theme(legend.position = "none")
+
+pl_prior_post_re_image_slopes <- draws_joined %>%
+  filter(grepl("^sd\\_image", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta"))
+  ) %>%
+  ggplot(aes(est, "Congruency", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = NULL,
+    y = NULL,
+    colour = NULL
+  ) +
+  theme(
+    legend.position = "none",
+    strip.background = element_blank(),
+    strip.text.x = element_blank()
+  )
+  
+# word random effects SDs
+pl_prior_post_re_string_ints <- draws_joined %>%
+  filter(grepl("^sd\\_string", par), grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta"))
+  ) %>%
+  ggplot(aes(est, "Intercept", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels=c("Prior", "Posterior")) +
+  labs(
+    x = "Estimate",
+    y = NULL,
+    title = "Word Random Effects SDs",
+    tag = "c",
+    colour = NULL
+  ) +
+  theme(legend.position = "bottom", legend.margin = margin())
+
+# join random effects SDs plots
+pl_prior_post_re <- plot_grid(
+  pl_prior_post_re_subj_ints, pl_prior_post_re_subj_slopes,
+  pl_prior_post_re_image_ints, pl_prior_post_re_image_slopes,
+  pl_prior_post_re_string_ints,
+  align="hv", axis="l", ncol=1, rel_heights=c(0.9, 1.2, 0.9, 0.5, 1.255)
+)
+
+ggsave(file.path("figs", "08_rt_prior_post_random_effects.pdf"), pl_prior_post_re, width=6.5, height=7.5)
+ggsave(file.path("figs", "08_rt_prior_post_random_effects.png"), pl_prior_post_re, width=6.5, height=7.5, device="png", type="cairo")
+

+ 166 - 0
04 Analysis/analyse_09_pictureword_acc.R

@@ -0,0 +1,166 @@
+library(dplyr)
+library(readr)
+library(purrr)
+library(tidyr)
+library(forcats)
+
+library(brms)
+library(ggdist)
+
+library(ggplot2)
+theme_set(theme_classic() + theme(strip.background = element_rect(fill = "white"), plot.background = element_blank()))
+library(patchwork)
+
+library(RColorBrewer)
+eff_cols <- c("#000000", brewer.pal(3, "Set1")[1:2], brewer.pal(5, "Set1")[5])
+
+cong_cols <- c("#E69F00", "#009E73")
+
+norm01 <- function(x, ...) (x-min(x, ...)) / (max(x, ...) - min(x, ...))
+norm01_manual <- function(x, min_x, max_x) (x-min_x) / (max_x - min_x)
+
+# import data -------------------------------------------------------------
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+d <- file.path("raw_data", "stim-pc", "data", "pictureword") %>%
+  list.files(pattern = "^.*\\.csv$", full.names = TRUE) %>%
+  map_df(read_csv, col_types = cols(sex="c")) %>%
+  filter(rt <= 1500) %>%
+  left_join(stim, by=c("image" = "filename")) %>%
+  mutate(
+    prop_agree = perc_name_agree/100,
+    pred_norm = norm01(prop_agree),
+    cong_dev = scale(if_else(condition == "A1", 1, 0), center = TRUE, scale = FALSE)
+  )
+
+# setup priors for accuracy model -------------------------------------------
+
+priors <- c(
+  set_prior("normal(4, 1)", class="b", coef="Intercept")
+)
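+# only the intercept gets an explicit prior here; all other parameters fall
+# back to brms's defaults (improper flat priors for the remaining
+# fixed-effect coefficients)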
+
+n_cores <- 7
+
+seed <- 3101
+n_iter <- 10000
+n_warmup <- 5000
+adapt_delta <- 0.9
+max_treedepth <- 10
+n_chains <- 5
+refresh <- 100
+
+f <- brmsformula(
+  acc ~ 0 + Intercept + cong_dev * pred_norm +
+    (cong_dev * pred_norm | subj_id) +
+    (cong_dev | image) +
+    (1 | string)
+)
+
+m_acc <- brm(
+  formula = f,
+  data = d,
+  family = bernoulli("logit"),
+  prior = priors,
+  iter = n_iter,
+  warmup = n_warmup,
+  chains = n_chains,
+  control = list(
+    adapt_delta = adapt_delta,
+    max_treedepth = max_treedepth
+  ),
+  sample_prior = "no",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  file = file.path("mods", "m_acc.rds"),
+  refresh = refresh
+)
+
+m_draws <- as_draws_df(m_acc, variable="^b\\_", regex=TRUE) %>%
+  select(-.chain, -.iteration, -.draw)
+
+m_draws_long <- m_draws %>%
+  pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+  mutate(par_lab = factor(recode(
+    par,
+    b_Intercept = "Intercept",
+    b_cong_dev = "Congruency",
+    b_pred_norm = "Predictability",
+    `b_cong_dev:pred_norm` = "Congruency\n* Predictability"
+  ), levels = c("Intercept", "Congruency", "Predictability", "Congruency\n* Predictability")))
+
+pl_intercept <- m_draws_long %>%
+  filter(par == "b_Intercept") %>%
+  ggplot(aes(est, "Intercept")) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89) +
+  labs(x = NULL, y = NULL, tag = "a")
+
+pl_slopes <- m_draws_long %>%
+  filter(par != "b_Intercept") %>%
+  mutate(par_lab = fct_rev(par_lab)) %>%
+  ggplot(aes(est, par_lab)) +
+  geom_vline(xintercept=0, linetype="dashed") +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89) +
+  labs(x = "Estimate (Logits)", y = NULL) +
+  theme(legend.position = "none")
+
+pl_coefs <- (pl_intercept / pl_slopes) +
+  plot_layout(heights = c(1, 5))
+
+m_preds <- seq(0.1, 1, 0.001) %>%
+  split(., ceiling(seq_along(.)/100)) %>%
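+  # evaluate the prediction grid in chunks of 100 values so the draws x grid
+  # expansion stays memory-friendly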
+  map_dfr(function(p) {
+    m_draws %>%
+      expand_grid(
+        prop_agree = p,
+        condition = c("Congruent", "Incongruent")
+      ) %>%
+      mutate(
+        pred_norm = norm01_manual(prop_agree, min_x=min(d$prop_agree), max_x=max(d$prop_agree)),
+        cong_dev = as.numeric(scale(if_else(condition == "Congruent", 1, 0), center = TRUE, scale = FALSE)),
+        pred_logit = b_Intercept +
+          cong_dev * b_cong_dev +
+          pred_norm * b_pred_norm +
+          cong_dev * pred_norm * `b_cong_dev:pred_norm`,
+        pred_odds = exp(pred_logit),
+        prob = pred_odds / (1 + pred_odds)
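+        # manual inverse-logit; equivalent to plogis(pred_logit)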
+      ) %>%
+      group_by(condition, prop_agree) %>%
+      median_hdi(prob, .width=0.89)
+  }) %>%
+  mutate(
+    perc_name_agree = prop_agree * 100,
+    condition = factor(condition, levels = c("Congruent", "Incongruent"))
+  )
+
+pl_preds <- m_preds %>%
+  ggplot(aes(perc_name_agree, prob, colour=condition, fill=condition)) +
+  geom_line(linewidth=0.5) +
+  geom_ribbon(aes(ymin=.lower, ymax=.upper), linewidth=0.001, alpha=0.25) +
+  scale_colour_manual(values = cong_cols, guide=guide_legend(nrow=1)) +
+  scale_fill_manual(values = cong_cols, guide=guide_legend(nrow=1)) +
+  labs(
+    x = "Predictability (%)",
+    y = "Probability of Correct Response",
+    colour = "Picture-Word Congruency",
+    fill = "Picture-Word Congruency",
+    tag = "b"
+  ) +
+  scale_x_continuous(expand = expansion()) +
+  scale_y_continuous(expand = expansion(), limits=c(NA, 1)) +
+  theme(
+    legend.position = c(0.95, 0.05),
+    legend.justification = c(1, 0),
+    legend.key.height = unit(4, "pt")
+  )
+
+res_pl <- (pl_coefs | pl_preds) +
+  plot_layout(widths = c(0.55, 1))
+
+ggsave(file.path("figs", "09_pictureword_acc_res.pdf"), res_pl, width=6.5, height=3)
+ggsave(file.path("figs", "09_pictureword_acc_res.png"), device="png", type="cairo", res_pl, width=6.5, height=3)

+ 613 - 0
04 Analysis/analyse_10_localiser_rt_acc.R

@@ -0,0 +1,613 @@
+library(dplyr)
+library(readr)
+library(purrr)
+library(tidyr)
+
+library(brms)
+library(ggdist)
+
+library(ggplot2)
+theme_set(theme_classic() + theme(strip.background = element_rect(fill = "white"), plot.background = element_blank()))
+library(patchwork)
+
+cond_cols <- c(
+  "Words" = "#000000",
+  "False Font" = "#D81B60",
+  "Phase-Shuffled" = "#1E88E5"
+)
+
+# summarise picture-word RT posteriors ------------------------------------
+
+# pw_rt_post_summ <- file.path("mods", "m_rt.rds") %>%
+#   readRDS() %>%
+#   as_draws_df(variable="^sd\\_", regex=TRUE) %>%
+#   select(-.chain, -.iteration, -.draw) %>%
+#   pivot_longer(cols = everything(), names_to="par", values_to="est") %>%
+#   group_by(par) %>%
+#   summarise(
+#     m = mean(est),
+#     s = sd(est)
+#   )
+
+# import data -------------------------------------------------------------
+
+d <- file.path("raw_data", "stim-pc", "data", "localiser") %>%
+  list.files(pattern = "^.*\\.csv$", full.names = TRUE) %>%
+  map_df(read_csv, col_types = cols(sex="c"))
+
+d_cleaned_for_rt_m <- filter(d, acc == 1) %>%
+  mutate(
+    ff_dev = scale(ifelse(condition=="bacs", 1, 0), center=TRUE, scale=FALSE),
+    noise_dev = scale(ifelse(condition=="noise", 1, 0), center=TRUE, scale=FALSE)
+  )
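+# centring the condition dummies (here and below) gives deviation coding: the
+# intercept sits at the weighted grand mean, while each slope remains the
+# contrast of false font or phase-shuffled noise against words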
+
+d_cleaned_for_acc_m <- filter(d, rt <= 1500) %>%
+  mutate(
+    ff_dev = scale(ifelse(condition=="bacs", 1, 0), center=TRUE, scale=FALSE),
+    noise_dev = scale(ifelse(condition=="noise", 1, 0), center=TRUE, scale=FALSE)
+  )
+
+
+# fit rt model ------------------------------------------------------------
+
+rt_priors <- c(
+  # FIXED EFFECTS
+  #  mu
+  set_prior("normal(5.3, 1)", class = "b", coef = "Intercept"),
+  set_prior("normal(0, 1)", class = "b", coef = "ff_dev"),
+  set_prior("normal(0, 1)", class = "b", coef = "noise_dev"),
+  #  sigma
+  set_prior("normal(-0.561, 1)", class = "b", coef = "Intercept", dpar="sigma"),
+  set_prior("normal(0, 1)", class = "b", coef = "ff_dev", dpar="sigma"),
+  set_prior("normal(0, 1)", class = "b", coef = "noise_dev", dpar="sigma"),
+  #  delta
+  set_prior("normal(-9, 5)", class = "b", coef = "Intercept", dpar="ndt"),
+  # SDs of RANDOM EFFECTS
+  set_prior("student_t(5, 0, 1)", class = "sd")
+)
+
+
+n_cores <- 7
+seed <- 3101
+refresh <- 25
+
+n_chains_rt <- 5
+n_iter_rt <- 10000
+n_warmup_rt <- 7500
+adapt_delta_rt <- 0.9
+max_treedepth_rt <- 10
+
+inits_rt <- replicate(
+  n_chains_rt,
+  list(b_ndt = as.array(c(-5))),
+  simplify=FALSE
+)
+
+m_rt <- brm(
+  formula = brmsformula(
+    rt ~ 0 + Intercept + ff_dev + noise_dev +
+      (ff_dev + noise_dev | subj_id) +
+      (ff_dev + noise_dev | item_nr) +
+      (1 | image),
+    sigma ~ 0 + Intercept + ff_dev + noise_dev +
+      (ff_dev + noise_dev | subj_id) +
+      (ff_dev + noise_dev | item_nr) +
+      (1 | image),
+    ndt ~ 0 + Intercept
+  ),
+  data = d_cleaned_for_rt_m,
+  family = shifted_lognormal(),
+  prior = rt_priors,
+  iter = n_iter_rt,
+  warmup = n_warmup_rt,
+  chains = n_chains_rt,
+  control = list(
+    adapt_delta = adapt_delta_rt,
+    max_treedepth = max_treedepth_rt
+  ),
+  init = inits_rt,
+  sample_prior = "no",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  file = file.path("mods", "m_loc_rt.rds"),
+  refresh = refresh
+)
+
+
+# fit accuracy model ------------------------------------------------------
+
+acc_priors <- c(
+  set_prior("normal(5, 1)", class="b", coef="Intercept"),
+  set_prior("normal(0, 5)", class="b", coef="ff_dev"),
+  set_prior("normal(0, 5)", class="b", coef="noise_dev"),
+  set_prior("student_t(5, 0, 1)", class = "sd")
+)
+
+n_iter_acc <- 10000
+n_warmup_acc <- 7500
+adapt_delta_acc <- 0.99
+max_treedepth_acc <- 10
+n_chains_acc <- 5
+
+m_acc <- brm(
+  formula = acc ~ 0 + Intercept + ff_dev + noise_dev +
+    (ff_dev + noise_dev | subj_id) +
+    (ff_dev + noise_dev | item_nr) +
+    (1 | image),
+  data = d_cleaned_for_acc_m,
+  family = bernoulli("logit"),
+  prior = acc_priors,
+  iter = n_iter_acc,
+  warmup = n_warmup_acc,
+  chains = n_chains_acc,
+  control = list(
+    adapt_delta = adapt_delta_acc,
+    max_treedepth = max_treedepth_acc
+  ),
+  sample_prior = "no",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  file = file.path("mods", "m_loc_acc.rds"),
+  refresh = refresh
+)
+
+draws_preds <- as_draws_df(m_acc, variable="^b\\_", regex=TRUE) %>%
+  expand_grid(condition = unique(d_cleaned_for_acc_m$condition)) %>%
+  left_join(
+    d_cleaned_for_acc_m %>%
+      select(condition, ff_dev, noise_dev) %>%
+      distinct(),
+    by = "condition"
+  ) %>%
+  mutate(
+    pred_logit = b_Intercept + b_ff_dev * ff_dev + b_noise_dev * noise_dev,
+    pred_odds = exp(pred_logit),
+    prob = pred_odds / (1 + pred_odds),
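+    # manual inverse-logit; equivalent to plogis(pred_logit)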
+    cond_lab = factor(recode(
+      condition,
+      word = "Words",
+      bacs = "False Font",
+      noise = "Phase-Shuffled"
+    ), levels = c("Words", "False Font", "Phase-Shuffled"))
+  )
+
+acc_pl <- draws_preds %>%
+  mutate(pointinterval_pos = recode(condition, word=-30, bacs=-60, noise=-90)) %>%
+  ggplot(aes(prob, fill=cond_lab, colour=cond_lab)) +
+  geom_density(alpha=0.4, trim=TRUE) +
+  stat_pointinterval(aes(y=pointinterval_pos), point_interval = median_hdi, .width=.89, interval_size=2, point_size=1.75) +
+  scale_colour_manual(values = cond_cols) +
+  scale_fill_manual(values = cond_cols) +
+  scale_x_continuous(expand = expansion(), limits=c(NA, 1)) +
+  scale_y_continuous(expand = expansion(mult=0.04)) +
+  labs(
+    x = "Probability of Correct Response",
+    y = "Posterior Density",
+    colour = "Stimulus Type",
+    fill = "Stimulus Type"
+  ) +
+  theme(
+    legend.position = "top",
+    axis.ticks.y = element_blank(),
+    axis.text.y = element_blank()
+  )
+
+
+# get predictions of densities for RT -------------------------------------
+
+times <- 1:1000
+
+rt_dens_pred <- as_draws_df(m_rt, "^b\\_.*", regex=TRUE) %>%
+  select(-starts_with(".")) %>%
+  expand_grid(condition = c("bacs", "noise", "word")) %>%
+  left_join(
+    d_cleaned_for_rt_m %>%
+      select(condition, ff_dev, noise_dev) %>%
+      distinct(),
+    by = "condition"
+  ) %>%
+  mutate(
+    pred_mu = b_Intercept + ff_dev * b_ff_dev + noise_dev * b_noise_dev,
+    pred_sigma = b_sigma_Intercept + ff_dev * b_sigma_ff_dev + noise_dev * b_sigma_noise_dev,
+    # pred_ndt = b_ndt_Intercept + ff_dev * b_ndt_ff_dev + noise_dev * b_ndt_noise_dev,
+    pred_ndt = b_ndt_Intercept
+  ) %>%
+  select(condition, starts_with("pred")) %>%
+  expand_grid(rt = times) %>%
+  mutate(
+    pred_dens = dshifted_lnorm(
+      x = rt,
+      meanlog = pred_mu,
+      sdlog = exp(pred_sigma),
+      shift = exp(pred_ndt)
+    )
+  ) %>%
+  group_by(rt, condition) %>%
+  median_hdi(pred_dens, .width=0.89) %>%
+  ungroup() %>%
+  mutate(
+    cond_lab = factor(recode(
+      condition,
+      word = "Words",
+      bacs = "False Font",
+      noise = "Phase-Shuffled"
+    ), levels = c("Words", "False Font", "Phase-Shuffled"))
+  )
+
+rt_pl <- rt_dens_pred %>%
+  filter(pred_dens>0) %>%
+  ggplot(aes(rt, pred_dens, ymin=.lower, ymax=.upper, fill=cond_lab, colour=cond_lab)) +
+  geom_ribbon(alpha=0.4, show.legend = FALSE) +
+  geom_text(aes(y=pred_dens * 1.15, label=""), show.legend=FALSE) +
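+  # the invisible text layer pads headroom above the ribbons (1.15 x the
+  # peak density); presumably a quick alternative to expanding the y scale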
+  scale_colour_manual(values = cond_cols) +
+  scale_fill_manual(values = cond_cols) +
+  scale_x_continuous(expand = expansion(), limits=c(0, NA)) +
+  scale_y_continuous(expand = expansion()) +
+  labs(
+    x = "Response Time (ms)",
+    y = "Predicted Density",
+    colour = "Stimulus Type",
+    fill = "Stimulus Type"
+  ) +
+  theme(
+    axis.text.y = element_blank(),
+    axis.ticks.y = element_blank()
+  )
+
+preds_pl <- (acc_pl | rt_pl) +
+  plot_layout(guides = "collect") +
+  plot_annotation(tag_levels = "a") &
+  theme(legend.position = "bottom")
+
+ggsave(file.path("figs", "10_loc_rt_acc_preds.pdf"), preds_pl, width=5.5, height=2.5)
+
+
+# compare priors and posteriors -------------------------------------------
+
+priors_m_rt <- brm(
+  formula = brmsformula(
+    rt ~ 0 + Intercept + ff_dev + noise_dev +
+      (ff_dev + noise_dev | subj_id) +
+      (ff_dev + noise_dev | item_nr) +
+      (1 | image),
+    sigma ~ 0 + Intercept + ff_dev + noise_dev +
+      (ff_dev + noise_dev | subj_id) +
+      (ff_dev + noise_dev | item_nr) +
+      (1 | image),
+    ndt ~ 0 + Intercept
+  ),
+  data = d_cleaned_for_rt_m,
+  family = shifted_lognormal(),
+  prior = rt_priors,
+  iter = n_iter_rt,
+  warmup = n_warmup_rt,
+  chains = n_chains_rt,
+  control = list(
+    adapt_delta = adapt_delta_rt,
+    max_treedepth = max_treedepth_rt
+  ),
+  init = inits_rt,
+  sample_prior = "only",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  refresh = 1000
+)
+
+priors_m_acc <- brm(
+  formula = acc ~ 0 + Intercept + ff_dev + noise_dev +
+    (ff_dev + noise_dev | subj_id) +
+    (ff_dev + noise_dev | item_nr) +
+    (1 | image),
+  data = d_cleaned_for_acc_m,
+  family = bernoulli("logit"),
+  prior = acc_priors,
+  iter = n_iter_acc,
+  warmup = n_warmup_acc,
+  chains = n_chains_acc,
+  control = list(
+    adapt_delta = adapt_delta_acc,
+    max_treedepth = max_treedepth_acc
+  ),
+  sample_prior = "only",
+  silent = TRUE,
+  cores = n_cores,
+  seed = seed,
+  thin = 1,
+  refresh = 1000
+)
+
+prior_post_rt <- bind_rows(
+  as_draws_df(m_rt, "^b\\_.*|^sd\\_.*", regex=TRUE) %>%
+    select(-.chain, -.iteration, -.draw) %>%
+    pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+    mutate(source="posterior"),
+  as_draws_df(priors_m_rt, "^b\\_.*|^sd\\_.*", regex=TRUE) %>%
+    select(-.chain, -.iteration, -.draw) %>%
+    pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+    mutate(source="prior")
+) %>%
+  mutate(source = factor(source, levels = c("prior", "posterior")))
+
+prior_post_acc <- bind_rows(
+  as_draws_df(m_acc, "^b\\_.*|^sd\\_.*", regex=TRUE) %>%
+    select(-.chain, -.iteration, -.draw) %>%
+    pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+    mutate(source="posterior"),
+  as_draws_df(priors_m_acc, "^b\\_.*|^sd\\_.*", regex=TRUE) %>%
+    select(-.chain, -.iteration, -.draw) %>%
+    pivot_longer(cols=everything(), names_to="par", values_to="est") %>%
+    mutate(source="prior")
+) %>%
+  mutate(source = factor(source, levels = c("prior", "posterior")))
+
+
+# plots comparing priors and posteriors -----------------------------------
+
+# fixed effects
+pl_prior_post_fe_ints_rt <- prior_post_rt %>%
+  filter(grepl("^b\\_", par), grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    par_lab = factor(recode(
+      par,
+      b_Intercept = "mu",
+      b_sigma_Intercept = "sigma",
+      b_ndt_Intercept = "delta"
+    ), levels = c("mu", "sigma", "delta"))
+  ) %>%
+  ggplot(aes(est, "Intercept", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4), show.legend = FALSE) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = NULL,
+    y = NULL,
+    colour = NULL,
+    tag = "b"
+  ) +
+  theme(
+    legend.position = "none",
+    axis.ticks.y = element_blank(),
+    axis.text.y = element_blank()
+  )
+
+pl_prior_post_fe_slopes_rt <- prior_post_rt %>%
+  filter(grepl("^b\\_", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+  add_row(
+    source = factor(c("prior", "prior", "posterior", "posterior"), levels=c("prior", "posterior")),
+    par = rep(c("b_ndt_ff_dev", "b_ndt_noise_dev"), 2)
+  ) %>%
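+  # placeholder rows (est = NA) keep the delta facet aligned with mu and
+  # sigma, since ndt was modelled with an intercept only and has no slopes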
+  mutate(
+    par_lab = factor(case_when(
+      grepl("sigma", par, fixed=TRUE) ~ "sigma",
+      grepl("ndt", par, fixed=TRUE) ~ "delta",
+      TRUE ~ "mu"
+    ), levels = c("mu", "sigma", "delta")),
+    eff = factor(case_when(
+      grepl("ff_dev", par, fixed=TRUE) ~ "Words Vs.\nFalse Font",
+      grepl("noise_dev", par, fixed=TRUE) ~ "Words Vs.\nPhase-Shuffled",
+    ))
+  ) %>%
+  ggplot(aes(est, reorder(eff, desc(eff)), colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4), show.legend = FALSE) +
+  facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = "RT Model Estimate",
+    y = NULL,
+    colour = NULL
+  ) +
+  theme(
+    legend.position = "bottom",
+    legend.margin = margin(),
+    strip.background = element_blank(),
+    strip.text.x = element_blank(),
+    axis.ticks.y = element_blank(),
+    axis.text.y = element_blank()
+  )
+
+
+pl_prior_post_fe_ints_acc <- prior_post_acc %>%
+  filter(grepl("^b\\_", par), grepl("Intercept", par, fixed=TRUE)) %>%
+  ggplot(aes(est, "Intercept", colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4), show.legend = FALSE) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = NULL,
+    y = NULL,
+    colour = NULL,
+    tag = "a"
+  ) +
+  theme(
+    legend.position = "none"
+  )
+
+pl_prior_post_fe_slopes_acc <- prior_post_acc %>%
+  filter(grepl("^b\\_", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+  mutate(
+    eff = factor(case_when(
+      grepl("ff_dev", par, fixed=TRUE) ~ "Words Vs.\nFalse Font",
+      grepl("noise_dev", par, fixed=TRUE) ~ "Words Vs.\nPhase-Shuffled",
+    ))
+  ) %>%
+  ggplot(aes(est, reorder(eff, desc(eff)), colour=source)) +
+  stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+  scale_y_discrete(expand = expansion(0.1, 0)) +
+  scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+  labs(
+    x = "Accuracy Model\nEstimate",
+    y = NULL,
+    colour = NULL
+  ) +
+  theme(
+    legend.position = "bottom",
+    legend.margin = margin(),
+    strip.background = element_blank()
+  )
+
+pl_prior_post_fe <- pl_prior_post_fe_ints_acc + pl_prior_post_fe_ints_rt +
+  pl_prior_post_fe_slopes_acc + pl_prior_post_fe_slopes_rt +
+  plot_layout(guides = "collect", widths = c(1, 3), heights = c(1, 1.75)) &
+  theme(legend.position = "bottom")
+
+ggsave(file.path("figs", "10_localiser_beh_prior_post_fes.pdf"), pl_prior_post_fe, width=6.5, height=3.25, device="pdf")
+
+
+
+# random effects
+
+# # subject random effects SDs
+# pl_prior_post_re_subj_ints_rt <- prior_post_rt %>%
+#   filter(grepl("^sd\\_subj\\_id", par), grepl("Intercept", par, fixed=TRUE)) %>%
+#   add_row(
+#     source = factor(c("prior", "posterior"), levels=c("prior", "posterior")),
+#     par = rep("sd_subj_id__ndt_Intercept", 2)
+#   ) %>%
+#   mutate(
+#     par_lab = factor(case_when(
+#       grepl("sigma", par, fixed=TRUE) ~ "sigma",
+#       grepl("ndt", par, fixed=TRUE) ~ "delta",
+#       TRUE ~ "mu"
+#     ), levels = c("mu", "sigma", "delta"))
+#   ) %>%
+#   ggplot(aes(est, "Intercept", colour=source)) +
+#   stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+#   facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+#   scale_y_discrete(expand = expansion(0.1, 0)) +
+#   scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+#   labs(
+#     x = NULL,
+#     y = NULL,
+#     title = "Participant Random Effects SDs",
+#     tag = "A"
+#   ) +
+#   theme(legend.position = "none")
+# 
+# pl_prior_post_re_subj_slopes_rt <- prior_post_rt %>%
+#   filter(grepl("^sd\\_subj\\_id", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+#   add_row(
+#     source = factor(c("prior", "prior", "posterior", "posterior"), levels=c("prior", "posterior")),
+#     par = rep(c("sd_subj_id__ndt_ff_dev", "sd_subj_id__ndt_noise_dev"), 2)
+#   ) %>%
+#   mutate(
+#     par_lab = factor(case_when(
+#       grepl("sigma", par, fixed=TRUE) ~ "sigma",
+#       grepl("ndt", par, fixed=TRUE) ~ "delta",
+#       TRUE ~ "mu"
+#     ), levels = c("mu", "sigma", "delta")),
+#     eff = factor(case_when(
+#       grepl("ff_dev", par, fixed=TRUE) ~ "Words Vs.\nFalse Font",
+#       grepl("noise_dev", par, fixed=TRUE) ~ "Words Vs.\nPhase-Shuffled",
+#     ))
+#   ) %>%
+#   ggplot(aes(est, reorder(eff, desc(eff)), colour=source)) +
+#   stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+#   facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+#   scale_y_discrete(expand = expansion(0.1, 0)) +
+#   scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+#   labs(
+#     x = NULL,
+#     y = NULL,
+#     colour = NULL
+#   ) +
+#   theme(
+#     legend.position = "none",
+#     strip.background = element_blank(),
+#     strip.text.x = element_blank()
+#   )
+# 
+# # item (match set) random effects SDs
+# pl_prior_post_re_item_ints_rt <- prior_post_rt %>%
+#   filter(grepl("^sd\\_item", par), grepl("Intercept", par, fixed=TRUE)) %>%
+#   add_row(
+#     source = factor(c("prior", "posterior"), levels=c("prior", "posterior")),
+#     par = rep("sd_item_nr__ndt_Intercept", 2)
+#   ) %>%
+#   mutate(
+#     par_lab = factor(case_when(
+#       grepl("sigma", par, fixed=TRUE) ~ "sigma",
+#       grepl("ndt", par, fixed=TRUE) ~ "delta",
+#       TRUE ~ "mu"
+#     ), levels = c("mu", "sigma", "delta"))
+#   ) %>%
+#   ggplot(aes(est, "Intercept", colour=source)) +
+#   stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+#   facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+#   scale_y_discrete(expand = expansion(0.1, 0)) +
+#   scale_colour_manual(values = c("black", "red")) +
+#   labs(
+#     x = NULL,
+#     y = NULL,
+#     title = "Image Random Effects SDs",
+#     tag = "B"
+#   ) +
+#   theme(legend.position = "none")
+# 
+# pl_prior_post_re_item_slopes_rt <- prior_post_rt %>%
+#   filter(grepl("^sd\\_item", par), !grepl("Intercept", par, fixed=TRUE)) %>%
+#   add_row(
+#     source = factor(c("prior", "prior", "posterior", "posterior"), levels=c("prior", "posterior")),
+#     par = rep(c("sd_item_nr__ndt_ff_dev", "sd_item_nr__ndt_noise_dev"), 2)
+#   ) %>%
+#   mutate(
+#     par_lab = factor(case_when(
+#       grepl("sigma", par, fixed=TRUE) ~ "sigma",
+#       grepl("ndt", par, fixed=TRUE) ~ "delta",
+#       TRUE ~ "mu"
+#     ), levels = c("mu", "sigma", "delta")),
+#     eff = factor(case_when(
+#       grepl("ff_dev", par, fixed=TRUE) ~ "Words Vs.\nFalse Font",
+#       grepl("noise_dev", par, fixed=TRUE) ~ "Words Vs.\nPhase-Shuffled",
+#     ))
+#   ) %>%
+#   ggplot(aes(est, eff, colour=source)) +
+#   stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+#   facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+#   scale_y_discrete(expand = expansion(0.1, 0)) +
+#   scale_colour_manual(values = c("black", "red"), labels = c("Prior", "Posterior")) +
+#   labs(
+#     x = NULL,
+#     y = NULL,
+#     colour = NULL
+#   ) +
+#   theme(
+#     legend.position = "none",
+#     strip.background = element_blank(),
+#     strip.text.x = element_blank()
+#   )
+# 
+# # image random effects SDs
+# pl_prior_post_re_string_ints_rt <- prior_post_rt %>%
+#   filter(grepl("^sd\\_image", par), grepl("Intercept", par, fixed=TRUE)) %>%
+#   add_row(
+#     source = factor(c("prior", "posterior"), levels=c("prior", "posterior")),
+#     par = rep("sd_image__ndt_Intercept", 2)
+#   ) %>%
+#   mutate(
+#     par_lab = factor(case_when(
+#       grepl("sigma", par, fixed=TRUE) ~ "sigma",
+#       grepl("ndt", par, fixed=TRUE) ~ "delta",
+#       TRUE ~ "mu"
+#     ), levels = c("mu", "sigma", "delta"))
+#   ) %>%
+#   ggplot(aes(est, "Intercept", colour=source)) +
+#   stat_pointinterval(point_interval = "median_hdi", .width=.89, position=position_dodge(width=-0.4)) +
+#   facet_wrap(vars(par_lab), scales = "free_x", labeller = label_parsed) +
+#   scale_y_discrete(expand = expansion(0.1, 0)) +
+#   scale_colour_manual(values = c("black", "red"), labels=c("Prior", "Posterior")) +
+#   labs(
+#     x = "Estimate",
+#     y = NULL,
+#     title = "Word Random Effects SDs",
+#     tag = "C",
+#     colour = NULL
+#   ) +
+#   theme(legend.position = "bottom", legend.margin = margin())

+ 516 - 0
04 Analysis/analyse_11_preprocess_pictureword_picture.m

@@ -0,0 +1,516 @@
+%% Setup
+
+% paths to data
+eeg_path = fullfile('raw_data', 'eeg-pc', 'pictureword');
+beh_path = fullfile('raw_data', 'stim-pc', 'data', 'pictureword');
+
+% import eeglab (assumes eeglab has been added to path), e.g.
+addpath('C:/EEGLAB/eeglab2020_0')
+[ALLEEG, EEG, CURRENTSET, ALLCOM] = eeglab;
+
+% This script uses fastica algorithm for ICA, so FastICA needs to be on the path, e.g.
+addpath('C:/EEGLAB/FastICA_25')
+
+% region of interest for trial-level ROI average
+roi = {'TP7', 'CP5', 'P7', 'P5', 'P9', 'PO7', 'PO3', 'O1'};
+
+% cutoff probability for identifying eye and muscle related ICA components with ICLabel
+icl_cutoff = 0.85;
+
+% sigma parameter for ASR
+asr_sigma = 20;
+
+%% Clear output folders
+
+delete(fullfile('sample_data_picture', '*.csv'))
+
+%% Import lab book
+
+% handle commas in vectors
+lab_book_file = fullfile('raw_data', 'stim-pc', 'participants.csv');
+lab_book_raw_dat = fileread(lab_book_file);
+
+[regstart, regend] = regexp(lab_book_raw_dat, '\[.*?\]');
+
+for regmatch_i = 1:numel(regstart)
+    str_i = lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i));
+    str_i(str_i==',') = '.';
+    lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i)) = str_i;
+end
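+
+% e.g. a lab book entry like "[AF7,PO4]" becomes "[AF7.PO4]", so commas inside
+% bracketed vectors no longer collide with the CSV delimiter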
+
+lab_book_fixed_file = fullfile('raw_data', 'stim-pc', 'participants_tmp.csv');
+lab_book_fixed_conn = fopen(lab_book_fixed_file, 'w');
+fprintf(lab_book_fixed_conn, lab_book_raw_dat);
+fclose(lab_book_fixed_conn);
+
+lab_book_readopts = detectImportOptions(lab_book_fixed_file, 'VariableNamesLine', 1, 'Delimiter', ',');
+% read subject ids as class character
+lab_book_readopts.VariableTypes{strcmp(lab_book_readopts.SelectedVariableNames, 'subj_id')} = 'char';
+lab_book = readtable(lab_book_fixed_file, lab_book_readopts);
+
+delete(lab_book_fixed_file)
+
+%% Count the total number of excluded electrodes
+
+n_bads = 0;
+n_bads_per_s = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    bad_channels = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    n_bads_per_s(subject_nr) = numel(bad_channels);
+    n_bads = n_bads + numel(bad_channels);
+end
+
+perc_bads = n_bads / (64 * size(lab_book, 1)) * 100;
+
+%% Import max electrode info
+
+% this contains participants' maximal electrodes for the N170 from the
+% localisation task
+max_elecs = readtable('max_elecs.csv');
+
+%% Iterate over subjects
+
+% record trial exclusions
+total_excl_trials_incorr = zeros(1, size(lab_book, 1));
+total_excl_trials_rt = zeros(1, size(lab_book, 1));
+
+n_bad_ica = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    
+    subject_id = lab_book.subj_id{subject_nr};
+    fprintf('\n\n Subject Iteration %g/%g, ID: %s\n', subject_nr, size(lab_book, 1), subject_id)
+    
+    %% get subject-specific info from lab book
+    exclude = lab_book.exclude(subject_nr);
+    bad_channels = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    bad_trigger_indices = eval(strrep(lab_book.pw_bad_trigger_indices{subject_nr}, '.', ','));
+    
+    % add PO4 to bad channels, which seems to be consistently noisy, even when not marked as bad
+    if sum(strcmp('PO4', bad_channels))==0
+        bad_channels(numel(bad_channels)+1) = {'PO4'};
+    end
+
+    %% abort if excluded
+    
+    if exclude
+        fprintf('Subject %s excluded. Preprocessing aborted.\n', subject_id)
+        fprintf('Lab book note: %s\n', lab_book.note{subject_nr})
+        continue
+    end
+    
+    %% load participant's data
+    
+    % load raw eeg
+    raw_datapath = fullfile(eeg_path, append(subject_id, '.bdf'));
+    
+    % abort if no EEG data collected yet
+    if ~isfile(raw_datapath)
+        fprintf('Subject %s skipped: no EEG data found\n', subject_id)
+        continue
+    end
+    
+    EEG = pop_biosig(raw_datapath, 'importevent', 'on', 'rmeventchan', 'off');
+    
+    % load behavioural
+    all_beh_files = dir(beh_path);
+    beh_regex_matches = regexpi({all_beh_files.name}, append('^', subject_id, '_.+\.csv$'), 'match');
+    regex_emptymask = cellfun('isempty', beh_regex_matches);
+    beh_regex_matches(regex_emptymask) = [];
+    subj_beh_files = cellfun(@(x) x{:}, beh_regex_matches, 'UniformOutput', false);
+    
+    if numel(subj_beh_files)>1
+        fprintf('%g behavioural files found?\n', numel(subj_beh_files))
+        break
+    end
+    
+    beh_datapath = fullfile(beh_path, subj_beh_files{1});
+    beh = readtable(beh_datapath);
+    
+    %% Set data features
+    
+    % set channel locations
+    
+    orig_locs = EEG.chanlocs;
+    EEG.chanlocs = pop_chanedit(EEG.chanlocs, 'load', {'BioSemi64.loc', 'filetype', 'loc'});  % NB: the template's channel order doesn't match the data; the data are reordered below
+    
+    % set channel types
+    for ch_nr = 1:64
+        EEG.chanlocs(ch_nr).type = 'EEG';
+    end
+    
+    for ch_nr = 65:72
+        EEG.chanlocs(ch_nr).type = 'EOG';
+    end
+    
+    for ch_nr = 73:79
+        EEG.chanlocs(ch_nr).type = 'MISC';
+    end
+    
+    for ch_nr = 65:79
+        EEG.chanlocs(ch_nr).theta = [];
+        EEG.chanlocs(ch_nr).radius = [];
+        EEG.chanlocs(ch_nr).sph_theta = [];
+        EEG.chanlocs(ch_nr).sph_phi = [];
+        EEG.chanlocs(ch_nr).X = [];
+        EEG.chanlocs(ch_nr).Y = [];
+        EEG.chanlocs(ch_nr).Z = [];
+    end
+    
+    % change the order of channels in EEG.data to match the new order in chanlocs
+    data_reordered = EEG.data;
+    for ch_nr = 1:64        
+        % make sure the new eeg data array matches the listed order
+        ch_lab = EEG.chanlocs(ch_nr).labels;
+        orig_locs_idx = find(strcmp(lower({orig_locs.labels}), lower(ch_lab)));
+        data_reordered(ch_nr, :) = EEG.data(orig_locs_idx, :);
+    end
+    EEG.data = data_reordered;
+    
+    % remove unused channels
+    EEG = pop_select(EEG, 'nochannel', 69:79);
+    
+    % remove bad channels
+    ur_chanlocs = EEG.chanlocs;  % store a copy of the full channel locations before removing (for later interpolation)
+    bad_channels_indices = find(ismember(lower({EEG.chanlocs.labels}), lower(bad_channels)));
+    EEG = pop_select(EEG, 'nochannel', bad_channels_indices);
+    
+    %% Identify events (trials) - getting the picture as the trigger instead of the word
+    
+    % make the sopen function happy
+    x = fileparts( which('sopen') );
+    rmpath(x);
+    addpath(x,'-begin');
+    
+    % build the events manually from the raw eeg file (pop_biosig removes event offsets)
+    % NB: this assumes no resampling between reading the BDF file and now
+    bdf_dat = sopen(raw_datapath, 'r', [0, Inf], 'OVERFLOWDETECTION:OFF');
+    event_types = bdf_dat.BDF.Trigger.TYP;
+    event_pos = bdf_dat.BDF.Trigger.POS;
+    event_time = EEG.times(event_pos);
+    sclose(bdf_dat);
+    clear bdf_dat;
+    
+    triggers = struct(...
+        'off', 0,...
+        'A1', 1,...
+        'A2', 2,...
+        'practice', 25,...
+        'image', 99);
+    
+    % add 61440 to each trigger value (because of number of bits in pp)
+    trigger_labels = fieldnames(triggers);
+    for field_nr = 1:numel(trigger_labels)
+        triggers.(trigger_labels{field_nr}) = triggers.(trigger_labels{field_nr}) + 61440;
+    end
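+    % (61440 = 0xF000, i.e. the four highest bits of the 16-bit trigger word,
+    % which the parallel port appears to hold high)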
+    
+    % remove the first trigger if it is at time 0 and has a value which isn't a recognised trigger
+    if (event_time(1)==0 && ~ismember(event_types(1), [triggers.off, triggers.A1, triggers.A2, triggers.practice, triggers.image]))
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % remove the new first trigger if it has a value of off
+    if (event_types(1)==triggers.off)
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % check every second trigger is an offset
+    offset_locs = find(event_types==triggers.off);
+    if any(offset_locs' ~= 2:2:numel(event_types))
+        fprintf('Expected each second trigger to be an off?')
+        break
+    end
+    
+    % check every first trigger is non-zero
+    onset_locs = find(event_types~=triggers.off);
+    if any(onset_locs' ~= 1:2:numel(event_types))
+        fprintf('Expected each first trigger to be an event?')
+        break
+    end
+    
+    % create the events struct manually    
+    events_onset_types = event_types(onset_locs);
+    events_onsets = event_pos(onset_locs);
+    events_offsets = event_pos(offset_locs);
+    events_durations = events_offsets - events_onsets;
+    
+    EEG.event = struct();
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).type = events_onset_types(event_nr);
+        EEG.event(event_nr).latency = events_onsets(event_nr);
+        EEG.event(event_nr).offset = events_offsets(event_nr);
+        EEG.event(event_nr).duration = events_durations(event_nr);
+    end
+    
+    % copy the details over to urevent
+    EEG.urevent = EEG.event;
+    
+    % record the urevent
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).urevent = event_nr;
+    end
+    
+    % remove bad events recorded in lab book (misfired triggers)
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.urevent], bad_trigger_indices)));
+    
+    % remove practice trials
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], triggers.practice)));
+    
+    % remove triggers for words
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], [triggers.A1, triggers.A2])));
+
+    % remove triggers for all but the last 200 triggers (i.e., remove the practice images)
+    EEG = pop_editeventvals(EEG, 'delete', find( fliplr( 1:numel([EEG.event.type]) ) > 200 ));
+    
+    % check the events make sense
+    if sum(~ismember([EEG.event.type], triggers.image)) > 0
+        fprintf('Unexpected trial types?\n')
+        break
+    end
+    
+    if numel({EEG.event.type})~=200
+        fprintf('%g trial triggers detected?\n',  numel({EEG.event.type}))
+        break
+    end
+    
+    % add the trials' onsets, offsets, durations, and triggers to the behavioural data
+    beh.event = zeros(size(beh, 1), 1);
+    beh.latency = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        cond_i = beh.condition(row_nr);
+        beh.event(row_nr) = triggers.(cond_i{:});
+        beh.latency(row_nr) = EEG.event(row_nr).latency;
+        beh.offset(row_nr) = EEG.event(row_nr).offset;
+        beh.duration(row_nr) = EEG.event(row_nr).duration;
+        beh.duration_ms(row_nr) = (EEG.event(row_nr).duration * 1000/EEG.srate) - 500;  % minus 500 as event timer starts at word presentation, but rt timer starts once word turns green
+    end
+    
+    % record trial numbers in EEG.event
+    for row_nr = 1:size(beh, 1)
+        EEG.event(row_nr).trl_nr = beh.trl_nr(row_nr);
+    end
+    
+    %% Remove segments of data that fall outside of blocks
+    
+    % record block starts
+    beh.is_block_start(1) = 1;
+    for row_nr = 2:size(beh, 1)
+        beh.is_block_start(row_nr) = beh.block_nr(row_nr) - beh.block_nr(row_nr-1) == 1;
+    end
+    % record block ends
+    beh.is_block_end(size(beh, 1)) = 1;
+    for row_nr = 1:(size(beh, 1)-1)
+        beh.is_block_end(row_nr) = beh.block_nr(row_nr+1) - beh.block_nr(row_nr) == 1;
+    end
+
+    % record block boundaries (first start and last end point of each block, with a 1-second buffer)
+    beh.block_boundary = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        if beh.is_block_start(row_nr)
+            beh.block_boundary(row_nr) = beh.latency(row_nr) - (EEG.srate * 1);
+        elseif beh.is_block_end(row_nr)
+            beh.block_boundary(row_nr) = beh.offset(row_nr) + (EEG.srate * 1);
+        end
+    end
+    
+    % get the boundary indices in required format (start1, end1; start2, end2; start3, end3)
+    block_boundaries = reshape(beh.block_boundary(beh.block_boundary~=0), 2, [])';
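+    % e.g. nonzero boundaries [s1; e1; s2; e2] become [s1, e1; s2, e2]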
+    
+    % remove anything outside of blocks
+    EEG = pop_select(EEG, 'time', (block_boundaries / EEG.srate));
+    
+    %% Trial selection
+    
+    % include only correct responses
+    beh_filt_acc_only = beh(beh.acc==1, :);
+    excl_trials_incorr = size(beh, 1)-size(beh_filt_acc_only, 1);
+    total_excl_trials_incorr(subject_nr) = excl_trials_incorr;
+    fprintf('Lost %g trials to incorrect responses\n', excl_trials_incorr)
+    
+    % include only responses of at most 1500 ms
+    beh_filt = beh_filt_acc_only(beh_filt_acc_only.rt<=1500, :);
+    excl_trials_rt = size(beh_filt_acc_only, 1)-size(beh_filt, 1);
+    total_excl_trials_rt(subject_nr) = excl_trials_rt;
+    fprintf('Lost %g trials to RTs above 1500\n', excl_trials_rt)
+    
+    fprintf('Lost %g trials in total to behavioural data\n', size(beh, 1)-size(beh_filt, 1))
+    
+    % filter the events structure
+    discarded_trls = beh.trl_nr(~ismember(beh.trl_nr, beh_filt.trl_nr));
+    discarded_events_indices = [];  % (collect in a for loop, as [EEG.event.trl_nr] would remove missing data)
+    for event_nr = 1:size(EEG.event, 2)
+        if ismember(EEG.event(event_nr).trl_nr, discarded_trls)
+            discarded_events_indices = [discarded_events_indices, event_nr];
+        end
+    end
+    EEG = pop_editeventvals(EEG, 'delete', discarded_events_indices);
+    
+    % check the discarded trials are the expected length
+    if numel(discarded_trls) ~= size(beh, 1)-size(beh_filt, 1)
+        fprintf('Mismatch between behavioural data and EEG events in the number of trials to discard?')
+        break
+    end
+    
+    % check the sizes match
+    if numel([EEG.event.trl_nr]) ~= size(beh_filt, 1)
+        fprintf('Inconsistent numbers of trials between events structure and behavioural data after discarding trials?')
+        break
+    end
+    
+    % check the trl numbers match
+    if any([EEG.event.trl_nr]' ~= beh_filt.trl_nr)
+        fprintf('Trial IDs mismatch between events structure and behavioural data after discarding trials?\n')
+        break
+    end
+    
+    %% Rereference, downsample, and filter
+    
+    % rereference
+    EEG = pop_reref(EEG, []);
+    
+    % downsample
+    EEG = pop_resample(EEG, 512);
+    
+    % filter
+    % EEG = eeglab_butterworth(EEG, 0.5, 40, 4, 1:size(EEG.chanlocs, 2));  % preregistered filter
+    EEG = eeglab_butterworth(EEG, 0.1, 40, 4, 1:size(EEG.chanlocs, 2));  % filter with lower highpass
+    
+    %% ICA
+    
+    % apply ASR
+    %EEG_no_asr = EEG;
+    %EEG = clean_asr(EEG, asr_sigma, [], [], [], [], [], [], [], [], 1024);  % The last number is available memory in mb, needed for reproducibility
+    
+    % ASR is not used in this exploratory analysis
+
+    rng(3101)  % set seed for reproducibility
+    EEG = pop_runica(EEG, 'icatype', 'fastica', 'approach', 'symm');
+
+    % classify components with ICLabel
+    EEG = iclabel(EEG);
+
+    % store results for easy indexing
+    icl_res = EEG.etc.ic_classification.ICLabel.classifications;
+    icl_classes = EEG.etc.ic_classification.ICLabel.classes;
+    
+    % identify and remove artefact components
+    artefact_comps = find(icl_res(:, strcmp(icl_classes, 'Eye')) >= icl_cutoff | icl_res(:, strcmp(icl_classes, 'Muscle')) >= icl_cutoff);
+    fprintf('Removing %g artefact-related ICA components\n', numel(artefact_comps))
+    n_bad_ica(subject_nr) = numel(artefact_comps);
+    %EEG_no_iclabel = EEG;
+    EEG = pop_subcomp(EEG, artefact_comps);
+    
+    %% Interpolate bad channels
+    
+    % give the original chanlocs structure so EEGLAB interpolates the missing electrode(s)
+    if numel(bad_channels)>0
+        EEG = pop_interp(EEG, ur_chanlocs);
+    end
+    
+    %% Get sample level microvolts for exploratory analysis checking image ERPs
+    
+    disp('Getting sample-level results...')
+    
+    % resample to 256 Hz
+    EEG_256 = pop_resample(EEG, 256);
+    
+    % get epochs of low-srate data
+    EEG_epo_256 = pop_epoch(EEG_256, {triggers.image}, [-0.25, 1.8]);
+    
+    % remove baseline
+    EEG_epo_256 = pop_rmbase(EEG_epo_256, [-200, 0]);
+    
+    % pre-allocate the table
+    var_names = {'subj_id', 'stim_grp', 'resp_grp', 'item_nr', 'ch_name', 'time', 'uV'};
+    var_types = {'string', 'string', 'string', 'double', 'string', 'double', 'double'};
+    nrows = 64 * size(EEG_epo_256.times, 2) * size(beh_filt, 1);
+    sample_res = table('Size',[nrows, numel(var_names)], 'VariableTypes',var_types, 'VariableNames',var_names);
+    
+    sample_res.subj_id = repmat(beh_filt.subj_id, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.stim_grp = repmat(beh_filt.stim_grp, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.resp_grp = repmat(beh_filt.resp_grp, 64*size(EEG_epo_256.times, 2), 1);
+    
+    % get the 64 channel eeg data as an array
+    eeg_arr = EEG_epo_256.data(1:64, :, :);
+    
+    % a vector of all eeg data
+    eeg_vec = squeeze(reshape(eeg_arr, 1, 1, []));
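+    % (column-major flattening: channel varies fastest, then time, then trial;
+    % the label arrays below share eeg_arr's shape, so their flattened vectors
+    % align with eeg_vec element by element)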
+    
+    % array and vector of the channel labels for each value in EEG.data
+    channel_labels_arr = cell(size(eeg_arr));
+    channel_label_lookup = {EEG_epo_256.chanlocs.labels};
+    for chan_nr = 1:size(eeg_arr, 1)
+        channel_labels_arr(chan_nr, :, :) = repmat(channel_label_lookup(chan_nr), size(channel_labels_arr, 2), size(channel_labels_arr, 3));
+    end
+    
+    channel_labels_vec = squeeze(reshape(channel_labels_arr, 1, 1, []));
+    
+    % array and vector of the item numbers for each value in EEG.data
+    times_arr = zeros(size(eeg_arr));
+    times_lookup = EEG_epo_256.times;
+    for time_idx = 1:size(eeg_arr, 2)
+        times_arr(:, time_idx, :) = repmat(times_lookup(time_idx), size(times_arr, 1), size(times_arr, 3));
+    end
+    
+    times_vec = squeeze(reshape(times_arr, 1, 1, []));
+    
+    % array and vector of the trial numbers
+    trials_arr = zeros(size(eeg_arr));
+    trials_lookup = beh_filt.item_nr;
+    for trl_idx = 1:size(eeg_arr, 3)
+        trials_arr(:, :, trl_idx) = repmat(trials_lookup(trl_idx), size(trials_arr, 1), size(trials_arr, 2));
+    end
+    
+    trials_vec = squeeze(reshape(trials_arr, 1, 1, []));
+    
+    % store sample-level results in the table
+    sample_res.ch_name = channel_labels_vec;
+    sample_res.item_nr = trials_vec;
+    sample_res.time = times_vec;
+    sample_res.uV = eeg_vec;
+    
+    % look up and store some info about the trials
+    trial_info_lookup = beh_filt(:, {'item_nr', 'condition', 'image', 'string'});
+    sample_res = outerjoin(sample_res, trial_info_lookup, 'MergeKeys', true);
+    
+    % sort by time, channel, item_nr
+    sample_res = sortrows(sample_res, {'time', 'ch_name', 'item_nr'});
+    
+    %% save the results
+    
+    disp('Saving results...')
+    writetable(sample_res, fullfile('sample_data_picture', [subject_id, '.csv']));
+    
+end
+
+fprintf('\nFinished preprocessing picture-word data!\n')
+
+%% Functions
+
+% custom function for applying a Butterworth filter to EEGLAB data
+function EEG = eeglab_butterworth(EEG, low, high, order, chanind)
+    fprintf('Applying Butterworth filter between %g and %g Hz (order of %g)\n', low, high, order)
+    % create filter
+    [b, a] = butter(order, [low, high]/(EEG.srate/2));
+    % apply to data (requires transposition for filtfilt)
+    data_trans = single(filtfilt(b, a, double(EEG.data(chanind, :)')));
+    EEG.data(chanind, :) = data_trans';
+end
+
+% custom function for finding the closest timepoint in an EEG dataset
+function [idx, closesttime] = eeglab_closest_time(EEG, time)
+    dists = abs(EEG.times - time);
+    idx = find(dists == min(dists));
+    % in the unlikely case there are two equidistant times, select one randomly
+    if numel(idx) > 1
+        fprintf('Two equidistant times! Selecting one randomly.')
+        idx = idx(randperm(numel(idx)));
+        idx = idx(1);
+    end
+    closesttime = EEG.times(idx);
+end

+ 529 - 0
04 Analysis/analyse_12_preprocess_pictureword_response.m

@@ -0,0 +1,529 @@
+%% Setup
+
+% paths to data
+eeg_path = fullfile('raw_data', 'eeg-pc', 'pictureword');
+beh_path = fullfile('raw_data', 'stim-pc', 'data', 'pictureword');
+
+% import eeglab (assumes eeglab has been added to path), e.g.
+addpath('C:/EEGLAB/eeglab2020_0')
+[ALLEEG, EEG, CURRENTSET, ALLCOM] = eeglab;
+
+% This script uses fastica algorithm for ICA, so FastICA needs to be on the path, e.g.
+addpath('C:/EEGLAB/FastICA_25')
+
+% region of interest for trial-level ROI average
+roi = {'TP7', 'CP5', 'P7', 'P5', 'P9', 'PO7', 'PO3', 'O1'};
+
+% cutoff probability for identifying eye and muscle related ICA components with ICLabel
+icl_cutoff = 0.85;
+
+% sigma parameter for ASR
+asr_sigma = 20;
+
+%% Clear output folders
+
+delete(fullfile('sample_data_response', '*.csv'))
+
+%% Import lab book
+
+% handle commas in vectors
+lab_book_file = fullfile('raw_data', 'stim-pc', 'participants.csv');
+lab_book_raw_dat = fileread(lab_book_file);
+
+[regstart, regend] = regexp(lab_book_raw_dat, '\[.*?\]');
+
+for regmatch_i = 1:numel(regstart)
+    str_i = lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i));
+    str_i(str_i==',') = '.';
+    lab_book_raw_dat(regstart(regmatch_i):regend(regmatch_i)) = str_i;
+end
+
+lab_book_fixed_file = fullfile('raw_data', 'stim-pc', 'participants_tmp.csv');
+lab_book_fixed_conn = fopen(lab_book_fixed_file, 'w');
+fprintf(lab_book_fixed_conn, lab_book_raw_dat);
+fclose(lab_book_fixed_conn);
+
+lab_book_readopts = detectImportOptions(lab_book_fixed_file, 'VariableNamesLine', 1, 'Delimiter', ',');
+% read subject ids as class character
+lab_book_readopts.VariableTypes{strcmp(lab_book_readopts.SelectedVariableNames, 'subj_id')} = 'char';
+lab_book = readtable(lab_book_fixed_file, lab_book_readopts);
+
+delete(lab_book_fixed_file)
+
+%% Count the total number of excluded electrodes
+
+n_bads = 0;
+n_bads_per_s = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    bad_channels = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    n_bads_per_s(subject_nr) = numel(bad_channels);
+    n_bads = n_bads + numel(bad_channels);
+end
+
+perc_bads = n_bads / (64 * size(lab_book, 1)) * 100;
+
+%% Import max electrode info
+
+% this contains participants' maximal electrodes for the N170 from the
+% localisation task
+max_elecs = readtable('max_elecs.csv');
+
+%% Iterate over subjects
+
+% record trial exclusions
+total_excl_trials_incorr = zeros(1, size(lab_book, 1));
+total_excl_trials_rt = zeros(1, size(lab_book, 1));
+
+n_bad_ica = zeros(size(lab_book, 1), 1);
+
+for subject_nr = 1:size(lab_book, 1)
+    
+    subject_id = lab_book.subj_id{subject_nr};
+    fprintf('\n\n Subject Iteration %g/%g, ID: %s\n', subject_nr, size(lab_book, 1), subject_id)
+    
+    %% get subject-specific info from lab book
+    exclude = lab_book.exclude(subject_nr);
+    bad_channels = eval(strrep(strrep(strrep(lab_book.pw_bad_channels{subject_nr}, '[', '{'), ']', '}'), '.', ','));
+    bad_trigger_indices = eval(strrep(lab_book.pw_bad_trigger_indices{subject_nr}, '.', ','));
+    
+    % add PO4 to bad channels, which seems to be consistently noisy, even when not marked as bad
+    if sum(strcmp('PO4', bad_channels))==0
+        bad_channels(numel(bad_channels)+1) = {'PO4'};
+    end
+
+    %% abort if excluded
+    
+    if exclude
+        fprintf('Subject %s excluded. Preprocessing aborted.\n', subject_id)
+        fprintf('Lab book note: %s\n', lab_book.note{subject_nr})
+        continue
+    end
+    
+    %% load participant's data
+    
+    % load raw eeg
+    raw_datapath = fullfile(eeg_path, append(subject_id, '.bdf'));
+    
+    % abort if no EEG data collected yet
+    if ~isfile(raw_datapath)
+        fprintf('Subject %s skipped: no EEG data found\n', subject_id)
+        continue
+    end
+    
+    EEG = pop_biosig(raw_datapath, 'importevent', 'on', 'rmeventchan', 'off');
+    
+    % load behavioural
+    all_beh_files = dir(beh_path);
+    beh_regex_matches = regexpi({all_beh_files.name}, append('^', subject_id, '_.+\.csv$'), 'match');
+    regex_emptymask = cellfun('isempty', beh_regex_matches);
+    beh_regex_matches(regex_emptymask) = [];
+    subj_beh_files = cellfun(@(x) x{:}, beh_regex_matches, 'UniformOutput', false);
+    
+    if numel(subj_beh_files)>1
+        fprintf('%g behavioural files found?\n', numel(subj_beh_files))
+        break
+    end
+    
+    beh_datapath = fullfile(beh_path, subj_beh_files{1});
+    beh = readtable(beh_datapath);
+    
+    %% Set data features
+    
+    % set channel locations
+    
+    orig_locs = EEG.chanlocs;
+    EEG.chanlocs = pop_chanedit(EEG.chanlocs, 'load', {'BioSemi64.loc', 'filetype', 'loc'});  % NB: the template's channel order doesn't match the data; the data are reordered below
+    
+    % set channel types
+    for ch_nr = 1:64
+        EEG.chanlocs(ch_nr).type = 'EEG';
+    end
+    
+    for ch_nr = 65:72
+        EEG.chanlocs(ch_nr).type = 'EOG';
+    end
+    
+    for ch_nr = 73:79
+        EEG.chanlocs(ch_nr).type = 'MISC';
+    end
+    
+    for ch_nr = 65:79
+        EEG.chanlocs(ch_nr).theta = [];
+        EEG.chanlocs(ch_nr).radius = [];
+        EEG.chanlocs(ch_nr).sph_theta = [];
+        EEG.chanlocs(ch_nr).sph_phi = [];
+        EEG.chanlocs(ch_nr).X = [];
+        EEG.chanlocs(ch_nr).Y = [];
+        EEG.chanlocs(ch_nr).Z = [];
+    end
+    
+    % change the order of channels in EEG.data to match the new order in chanlocs
+    data_reordered = EEG.data;
+    for ch_nr = 1:64        
+        % make sure the new eeg data array matches the listed order
+        ch_lab = EEG.chanlocs(ch_nr).labels;
+        orig_locs_idx = find(strcmp(lower({orig_locs.labels}), lower(ch_lab)));
+        data_reordered(ch_nr, :) = EEG.data(orig_locs_idx, :);
+    end
+    EEG.data = data_reordered;
+    
+    % remove unused channels
+    EEG = pop_select(EEG, 'nochannel', 69:79);
+    
+    % remove bad channels
+    ur_chanlocs = EEG.chanlocs;  % store a copy of the full channel locations before removing (for later interpolation)
+    bad_channels_indices = find(ismember(lower({EEG.chanlocs.labels}), lower(bad_channels)));
+    EEG = pop_select(EEG, 'nochannel', bad_channels_indices);
+    
+    %% Identify events (trials) - getting the response as the trigger instead of the word
+    
+    % make the sopen function happy
+    x = fileparts( which('sopen') );
+    rmpath(x);
+    addpath(x,'-begin');
+    
+    % build the events manually from the raw eeg file (pop_biosig removes event offsets)
+    % NB: this assumes no resampling between reading the BDF file and now
+    bdf_dat = sopen(raw_datapath, 'r', [0, Inf], 'OVERFLOWDETECTION:OFF');
+    event_types = bdf_dat.BDF.Trigger.TYP;
+    event_pos = bdf_dat.BDF.Trigger.POS;
+    event_time = EEG.times(event_pos);
+    sclose(bdf_dat);
+    clear bdf_dat;
+    
+    triggers = struct(...
+        'off', 0,...
+        'A1', 1,...
+        'A2', 2,...
+        'practice', 25,...
+        'image', 99);
+    
+    % add 61440 to each trigger value (because of number of bits in pp)
+    trigger_labels = fieldnames(triggers);
+    for field_nr = 1:numel(trigger_labels)
+        triggers.(trigger_labels{field_nr}) = triggers.(trigger_labels{field_nr}) + 61440;
+    end
+    
+    % remove the first trigger if it is at time 0 and has a value which isn't a recognised trigger
+    if (event_time(1)==0 && ~ismember(event_types(1), [triggers.off, triggers.A1, triggers.A2, triggers.practice, triggers.image]))
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % remove the new first trigger if it has a value of off
+    if (event_types(1)==triggers.off)
+        event_types(1) = [];
+        event_pos(1) = [];
+        event_time(1) = [];
+    end
+    
+    % check every second trigger is an offset
+    offset_locs = find(event_types==triggers.off);
+    if any(offset_locs' ~= 2:2:numel(event_types))
+        fprintf('Expected each second trigger to be an off?')
+        break
+    end
+    
+    % check every first trigger is non-zero
+    onset_locs = find(event_types~=triggers.off);
+    if any(onset_locs' ~= 1:2:numel(event_types))
+        fprintf('Expected each first trigger to be an event?')
+        break
+    end
+    
+    % create the events struct manually
+    events_onset_types = event_types(onset_locs);
+%     events_onsets = event_pos(onset_locs);
+    events_offsets = event_pos(offset_locs);
+%     events_durations = events_offsets - events_onsets;
+
+    % adjust manually so that we timelock to responses
+    events_onsets = events_offsets;
+    events_offsets = events_onsets + 0.5 * EEG.srate;  % place offsets 500 ms (in samples) after the responses
+    events_durations = events_offsets - events_onsets;
+    
+    EEG.event = struct();
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).type = events_onset_types(event_nr);
+        EEG.event(event_nr).latency = events_onsets(event_nr);
+        EEG.event(event_nr).offset = events_offsets(event_nr);
+        EEG.event(event_nr).duration = events_durations(event_nr);
+    end
+    
+    % copy the details over to urevent
+    EEG.urevent = EEG.event;
+    
+    % record the urevent
+    for event_nr = 1:numel(events_onsets)
+        EEG.event(event_nr).urevent = event_nr;
+    end
+    
+    % remove bad events recorded in lab book (misfired triggers)
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.urevent], bad_trigger_indices)));
+    
+    % remove practice trials
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], triggers.practice)));
+    
+    % remove triggers saying that image is displayed
+    EEG = pop_editeventvals(EEG, 'delete', find(ismember([EEG.event.type], triggers.image)));
+    
+    % check the events make sense
+    if sum(~ismember([EEG.event.type], [triggers.A1, triggers.A2])) > 0
+        fprintf('Unexpected trial types?\n')
+        break
+    end
+    
+    if numel({EEG.event.type})~=200
+        fprintf('%g trial triggers detected?\n',  numel({EEG.event.type}))
+        break
+    end
+    
+    if sum(ismember([EEG.event.type], [triggers.A1])) ~= sum(ismember([EEG.event.type], [triggers.A2]))
+        fprintf('Unequal number of congruent and incongruent trials?\n')
+        break
+    end
+    
+    % add the trials' onsets, offsets, durations, and triggers to the behavioural data
+    beh.event = zeros(size(beh, 1), 1);
+    beh.latency = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        cond_i = beh.condition(row_nr);
+        beh.event(row_nr) = triggers.(cond_i{:});
+        beh.latency(row_nr) = EEG.event(row_nr).latency;
+        beh.offset(row_nr) = EEG.event(row_nr).offset;
+        beh.duration(row_nr) = EEG.event(row_nr).duration;
+        beh.duration_ms(row_nr) = (EEG.event(row_nr).duration * 1000/EEG.srate) - 500;  % minus 500 as event timer starts at word presentation, but rt timer starts once word turns green
+    end
+    
+    % check events expected in beh are same as those in the events struct
+    if any(beh.event' ~= [EEG.event.type])
+        fprintf('%g mismatches between behavioural data and triggers?\n', sum(beh.event' ~= [EEG.event.type]))
+        break
+    end
+    
+    % record trial numbers in EEG.event
+    for row_nr = 1:size(beh, 1)
+        EEG.event(row_nr).trl_nr = beh.trl_nr(row_nr);
+    end
+            
+    %% Remove segments of data that fall outside of blocks
+    
+    % record block starts
+    beh.is_block_start(1) = 1;
+    for row_nr = 2:size(beh, 1)
+        beh.is_block_start(row_nr) = beh.block_nr(row_nr) - beh.block_nr(row_nr-1) == 1;
+    end
+    % record block ends
+    beh.is_block_end(size(beh, 1)) = 1;
+    for row_nr = 1:(size(beh, 1)-1)
+        beh.is_block_end(row_nr) = beh.block_nr(row_nr+1) - beh.block_nr(row_nr) == 1;
+    end
+
+    % record block boundaries (a 1.5-second buffer before each block start, and a 1-second buffer after each block end)
+    beh.block_boundary = zeros(size(beh, 1), 1);
+    for row_nr = 1:size(beh, 1)
+        if beh.is_block_start(row_nr)
+            beh.block_boundary(row_nr) = beh.latency(row_nr) - (EEG.srate * 1.5);
+        elseif beh.is_block_end(row_nr)
+            beh.block_boundary(row_nr) = beh.offset(row_nr) + (EEG.srate * 1);
+        end
+    end
+    
+    % get the boundary indices in required format (start1, end1; start2, end2; start3, end3)
+    block_boundaries = reshape(beh.block_boundary(beh.block_boundary~=0), 2, [])';
+    
+    % remove anything outside of blocks
+    EEG = pop_select(EEG, 'time', (block_boundaries / EEG.srate));
+    
+    %% Trial selection
+    
+    % include only correct responses
+    beh_filt_acc_only = beh(beh.acc==1, :);
+    excl_trials_incorr = size(beh, 1)-size(beh_filt_acc_only, 1);
+    total_excl_trials_incorr(subject_nr) = excl_trials_incorr;
+    fprintf('Lost %g trials to incorrect responses\n', excl_trials_incorr)
+    
+    % include only responses of at most 1500 ms
+    beh_filt = beh_filt_acc_only(beh_filt_acc_only.rt<=1500, :);
+    excl_trials_rt = size(beh_filt_acc_only, 1)-size(beh_filt, 1);
+    total_excl_trials_rt(subject_nr) = excl_trials_rt;
+    fprintf('Lost %g trials to RTs above 1500\n', excl_trials_rt)
+    
+    fprintf('Lost %g trials in total to behavioural data\n', size(beh, 1)-size(beh_filt, 1))
+    
+    % filter the events structure
+    discarded_trls = beh.trl_nr(~ismember(beh.trl_nr, beh_filt.trl_nr));
+    discarded_events_indices = [];  % (collect in a for loop, as [EEG.event.trl_nr] would remove missing data)
+    for event_nr = 1:size(EEG.event, 2)
+        if ismember(EEG.event(event_nr).trl_nr, discarded_trls)
+            discarded_events_indices = [discarded_events_indices, event_nr];
+        end
+    end
+    EEG = pop_editeventvals(EEG, 'delete', discarded_events_indices);
+    
+    % check the discarded trials are the expected length
+    if numel(discarded_trls) ~= size(beh, 1)-size(beh_filt, 1)
+        fprintf('Mismatch between behavioural data and EEG events in the number of trials to discard?')
+        break
+    end
+    
+    % check the sizes match
+    if numel([EEG.event.trl_nr]) ~= size(beh_filt, 1)
+        fprintf('Inconsistent numbers of trials between events structure and behavioural data after discarding trials?')
+        break
+    end
+    
+    % check the trl numbers match
+    if any([EEG.event.trl_nr]' ~= beh_filt.trl_nr)
+        fprintf('Trial IDs mismatch between events structure and behavioural data after discarding trials?\n')
+        break
+    end
+    
+    %% Rereference, downsample, and filter
+    
+    % rereference
+    EEG = pop_reref(EEG, []);
+    
+    % downsample
+    EEG = pop_resample(EEG, 512);
+    
+    % filter
+    % EEG = eeglab_butterworth(EEG, 0.5, 40, 4, 1:size(EEG.chanlocs, 2));  % preregistered filter
+    EEG = eeglab_butterworth(EEG, 0.1, 40, 4, 1:size(EEG.chanlocs, 2));  % filter with lower highpass
+    
+    %% ICA
+    
+    % apply ASR
+    %EEG_no_asr = EEG;
+    %EEG = clean_asr(EEG, asr_sigma, [], [], [], [], [], [], [], [], 1024);  % The last number is available memory in mb, needed for reproducibility
+    
+    % ASR is not used in this exploratory analysis
+
+    rng(3101)  % set seed for reproducibility
+    EEG = pop_runica(EEG, 'icatype', 'fastica', 'approach', 'symm');
+
+    % classify components with ICLabel
+    EEG = iclabel(EEG);
+
+    % store results for easy indexing
+    icl_res = EEG.etc.ic_classification.ICLabel.classifications;
+    icl_classes = EEG.etc.ic_classification.ICLabel.classes;
+    
+    % identify and remove artefact components
+    artefact_comps = find(icl_res(:, strcmp(icl_classes, 'Eye')) >= icl_cutoff | icl_res(:, strcmp(icl_classes, 'Muscle')) >= icl_cutoff);
+    fprintf('Removing %g artefact-related ICA components\n', numel(artefact_comps))
+    n_bad_ica(subject_nr) = numel(artefact_comps);
+    %EEG_no_iclabel = EEG;
+    EEG = pop_subcomp(EEG, artefact_comps);
+    
+    %% Interpolate bad channels
+    
+    % give the original chanlocs structure so EEGLAB interpolates the missing electrode(s)
+    if numel(bad_channels)>0
+        EEG = pop_interp(EEG, ur_chanlocs);
+    end
+    
+    %% Get sample level microvolts for exploratory analysis checking response ERPs
+    
+    disp('Getting sample-level results...')
+    
+    % resample to 256 Hz
+    EEG_256 = pop_resample(EEG, 256);
+    
+    % get epochs of low-srate data
+    EEG_epo_256 = pop_epoch(EEG_256, {triggers.A1, triggers.A2}, [-1, 0.5]);
+    
+    % remove baseline  - don't do this for the response ERP at present
+%     EEG_epo_256 = pop_rmbase(EEG_epo_256, [-200, 0]);
+    
+    % pre-allocate the table
+    var_names = {'subj_id', 'stim_grp', 'resp_grp', 'item_nr', 'ch_name', 'time', 'uV'};
+    var_types = {'string', 'string', 'string', 'double', 'string', 'double', 'double'};
+    nrows = 64 * size(EEG_epo_256.times, 2) * size(beh_filt, 1);
+    sample_res = table('Size',[nrows, numel(var_names)], 'VariableTypes',var_types, 'VariableNames',var_names);
+    
+    sample_res.subj_id = repmat(beh_filt.subj_id, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.stim_grp = repmat(beh_filt.stim_grp, 64*size(EEG_epo_256.times, 2), 1);
+    sample_res.resp_grp = repmat(beh_filt.resp_grp, 64*size(EEG_epo_256.times, 2), 1);
+    
+    % get the 64 channel eeg data as an array
+    eeg_arr = EEG_epo_256.data(1:64, :, :);
+    
+    % a vector of all eeg data
+    eeg_vec = squeeze(reshape(eeg_arr, 1, 1, []));
+    
+    % array and vector of the channel labels for each value in EEG.data
+    channel_labels_arr = cell(size(eeg_arr));
+    channel_label_lookup = {EEG_epo_256.chanlocs.labels};
+    for chan_nr = 1:size(eeg_arr, 1)
+        channel_labels_arr(chan_nr, :, :) = repmat(channel_label_lookup(chan_nr), size(channel_labels_arr, 2), size(channel_labels_arr, 3));
+    end
+    
+    channel_labels_vec = squeeze(reshape(channel_labels_arr, 1, 1, []));
+    
+    % array and vector of the item numbers for each value in EEG.data
+    times_arr = zeros(size(eeg_arr));
+    times_lookup = EEG_epo_256.times;
+    for time_idx = 1:size(eeg_arr, 2)
+        times_arr(:, time_idx, :) = repmat(times_lookup(time_idx), size(times_arr, 1), size(times_arr, 3));
+    end
+    
+    times_vec = squeeze(reshape(times_arr, 1, 1, []));
+    
+    % array and vector of the trial numbers
+    trials_arr = zeros(size(eeg_arr));
+    trials_lookup = beh_filt.item_nr;
+    for trl_idx = 1:size(eeg_arr, 3)
+        trials_arr(:, :, trl_idx) = repmat(trials_lookup(trl_idx), size(trials_arr, 1), size(trials_arr, 2));
+    end
+    
+    trials_vec = squeeze(reshape(trials_arr, 1, 1, []));
+    
+    % store sample-level results in the table
+    sample_res.ch_name = channel_labels_vec;
+    sample_res.item_nr = trials_vec;
+    sample_res.time = times_vec;
+    sample_res.uV = eeg_vec;
+    
+    % look up and store some info about the trials
+    trial_info_lookup = beh_filt(:, {'item_nr', 'condition', 'image', 'string'});
+    sample_res = outerjoin(sample_res, trial_info_lookup, 'MergeKeys', true);
+    
+    % sort by time, channel, item_nr
+    sample_res = sortrows(sample_res, {'time', 'ch_name', 'item_nr'});
+    
+    %% save the results
+    
+    disp('Saving results...')
+    writetable(sample_res, fullfile('sample_data_response', [subject_id, '.csv']));
+    
+end
+
+fprintf('\nFinished preprocessing picture-word data!\n')
+
+%% Functions
+
+% custom function for applying a Butterworth filter to EEGLAB data
+function EEG = eeglab_butterworth(EEG, low, high, order, chanind)
+    fprintf('Applying Butterworth filter between %g and %g Hz (order of %g)\n', low, high, order)
+    % create filter
+    [b, a] = butter(order, [low, high]/(EEG.srate/2));
+    % apply to data (requires transposition for filtfilt)
+    data_trans = single(filtfilt(b, a, double(EEG.data(chanind, :)')));
+    EEG.data(chanind, :) = data_trans';
+end
+
+% custom function for finding the closest timepoint in an EEG dataset
+function [idx, closesttime] = eeglab_closest_time(EEG, time)
+    dists = abs(EEG.times - time);
+    idx = find(dists == min(dists));
+    % in the unlikely case there are two equidistant times, select one randomly
+    if numel(idx) > 1
+        fprintf('Two equidistant times! Selecting one randomly.')
+        idx = idx(randperm(numel(idx)));
+        idx = idx(1);
+    end
+    closesttime = EEG.times(idx);
+end

File diff suppressed because it is too large
+ 168001 - 0
04 Analysis/analyse_13_sample_picture_res.csv


+ 143 - 0
04 Analysis/analyse_13_sample_picture_word_topo.R

@@ -0,0 +1,143 @@
+# This script takes the sample-level data from analyse_02, analyse_11, and analyse_12, fits linear mixed-effects models, and saves the fixed effects
+
+library(lme4)
+library(dplyr)
+library(purrr)
+library(readr)
+library(parallel)
+
+n_cores <- parallel::detectCores(all.tests=FALSE, logical=TRUE) - 2
+# n_cores <- 7
+
+# function to normalise between 0 and 1
+norm01 <- function(x, ...) (x-min(x, ...))/(max(x, ...)-min(x, ...))
+
+# get the stimuli's percentage of name agreement values
+stim <- read_csv("boss.csv", col_types = cols(perc_name_agree_denom_fq_inputs = col_number())) %>%
+  select(filename, perc_name_agree_denom_fq_inputs) %>%
+  rename(perc_name_agree = perc_name_agree_denom_fq_inputs)
+
+# import the max electrode data from the preprocessing, and set up the variables for the model
+get_sample_data <- function(path="sample_data", ch_i = NA) {
+  list.files(path, pattern=".+\\.csv$", full.names=TRUE) %>%
+    map_dfr(function(f) {
+      message(sprintf("  - importing %s", f))
+      x <- read_csv(
+        f,
+        col_types=cols(
+          subj_id = col_character(),
+          stim_grp = col_integer(),
+          resp_grp = col_integer(),
+          item_nr = col_integer(),
+          ch_name = col_character(),
+          time = col_double(),
+          uV = col_double(),
+          condition = col_character(),
+          image = col_character(),
+          string = col_character(),
+          .default = col_double()
+        ),
+        progress = FALSE
+      ) %>%
+        select(-stim_grp, -resp_grp, -item_nr)
+      if (all(!is.na(ch_i))) x <- filter(x, ch_name %in% ch_i)
+      x
+    }) %>%
+    left_join(stim, by=c("image" = "filename")) |>
+    mutate(
+      prop_agree = perc_name_agree/100,
+      pred_norm = norm01(prop_agree, na.rm=TRUE),
+      # as factors for size efficiency
+      ch_name = factor(ch_name),
+      image = factor(image),
+      string = factor(string),
+      time = factor(time),
+      subj_id = factor(subj_id)
+    ) |>
+    select(
+      time, condition, pred_norm,
+      subj_id, ch_name, image, string,
+      uV
+    )
+}
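+
+# example usage (hypothetical channel subset):
+# d <- get_sample_data(path = "sample_data", ch_i = c("PO7", "PO3", "O1"))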
+
+# get models for each timepoint, for each electrode, and extract fixed effects, for each time locked event
+
+paths <- c("sample_data", "sample_data_picture", "sample_data_response")
+
+for (p in paths) {
+  
+  message(sprintf("Modelling time points and channels individually for %s", p))
+  
+  # get list of data frames for each time point
+  d_list <- get_sample_data(path=p)  |>
+    mutate(cong_dev = as.numeric(scale(ifelse(condition=="A2", 0, 1), center=TRUE, scale=FALSE))) |>
+    group_split(time, ch_name)
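+  # (cong_dev above deviation-codes congruency around zero, ±0.5 when the two
+  # conditions have equal trial counts, so the remaining fixed effects are
+  # estimated at the average of the two conditions)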
+  
+  gc_out <- gc()
+  
+  message(sprintf("  - fitting models on %g cores", n_cores))
+  
+  # fit models in parallel
+  cl <- makeCluster(n_cores)
+  cl_packages <- clusterEvalQ(cl, {
+    library(dplyr)
+    library(lme4)
+  })
+  
+  fe_res <- parLapply(cl, d_list, function(d_i) {
+    # m <- lme4::lmer(
+    #   uV ~ cong_dev * pred_norm +
+    #     (cong_dev * pred_norm | subj_id) +
+    #     (cong_dev | image) +
+    #     (1 | string),
+    #   REML=FALSE,
+    #   control = lmerControl(optimizer="bobyqa"),
+    #   data=d_i
+    # )
+    m <- lme4::lmer(
+      uV ~ cong_dev * pred_norm +
+        (1 | subj_id) +
+        (1 | image) +
+        (1 | string),
+      REML=FALSE,
+      control = lmerControl(optimizer="bobyqa"),
+      data=d_i
+    )
+    
+    m |>
+      summary() |>
+      with(coefficients) |>
+      as_tibble(rownames = "fe") |>
+      mutate(
+        time = unique(d_i$time),
+        ch_name = unique(d_i$ch_name)
+      )
+  }) |>
+    reduce(bind_rows)
+  
+  stopCluster(cl)
+  
+  fe_res_tidy <- fe_res |>
+    mutate(
+      fe_lab = factor(recode(
+        fe,
+        `(Intercept)` = "Intercept",
+        cong_dev = "Congruency",
+        pred_norm = "Predictability",
+        `cong_dev:pred_norm` = "Congruency * Predictability"
+      ), levels = c("Intercept", "Congruency", "Predictability", "Congruency * Predictability")),
+      fe_lab_newline = factor(fe_lab, labels = c("Intercept", "Congruency", "Predictability", "Congruency\n* Predictability")),
+      bound_lower = Estimate - 1.96 * `Std. Error`,
+      bound_upper = Estimate + 1.96 * `Std. Error`
+    )
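+  
+  # (bound_lower / bound_upper are Wald 95% intervals: estimate ± 1.96 SE
+  # from the lmer coefficient table)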
+  
+  p_short <- sub("_data", "", p, fixed=TRUE)
+  
+  write_csv(fe_res_tidy, sprintf("analyse_13_%s_res.csv", p_short))
+  
+  rm("d_list")
+  gc_out <- gc()
+  
+}
+

File diff suppressed because it is too large
+ 102401 - 0
04 Analysis/analyse_13_sample_res.csv


File diff suppressed because it is too large
+ 122881 - 0
04 Analysis/analyse_13_sample_response_res.csv


+ 401 - 0
04 Analysis/analyse_14_plot_sample_picture_word_topo.m

@@ -0,0 +1,401 @@
+%% Setup
+
+addpath('matlab_extensions');
+
+% import eeglab (assumes eeglab has been added to path), e.g.
+addpath('C:/EEGLAB/eeglab2020_0')
+[ALLEEG, EEG, CURRENTSET, ALLCOM] = eeglab;
+
+erps = {'sample', 'sample_picture', 'sample_response'};
+
+% settings for x ticks
+time_mins = [-200, -200, -1000];
+time_steps = [100, 200, 200];
+
+% special settings for y range of fixed effect slopes
+slope_y_lims = {[], [-4, 4], []};
+slope_y_ticks = {[], -4:2:4, []};
+
+% settings for the event lines and topo plots
+event_times = {[500], [1000], []};
+event_rects = {[], [1150, 1650], []};
+
+erp_topo_times = {[105, 185, 235, 300, 400, 660], [110, 150, 225, 600, 800, 1150], [-800, -600, -400, -200, 0, 200]};
+
+for e_nr = 1:numel(erps)
+
+    e = erps{e_nr};
+
+    % import LMM estimates
+    res_file = sprintf('analyse_13_%s_res.csv', e);
+    fe_res = readtable(res_file);
+    fe_res.ch_name = upper(fe_res.ch_name);  % upper case for case insensitivity in joins
+
+    % get summary variables
+    channels = table2array( unique(fe_res(:, 'ch_name')) );
+    times = table2array( sortrows(unique(fe_res(:, 'time'))) );
+    fes = table2array( unique(fe_res(:, 'fe_lab')) );
+    
+    % get channel locations
+    chanlocs = pop_chanedit([], 'load', {'BioSemi64.loc', 'filetype', 'loc'});
+    
+    % get channel order
+    ch_name = upper({chanlocs.labels}');
+    ch_order = cell2mat({chanlocs.urchan}');
+    chan_orders = table(ch_name, ch_order);
+    
+    % only keep channels in the data in the table, and the chanlocs struct
+    ch_in_fe = false(numel(chan_orders.ch_name), 1);
+    
+    for i = 1:numel(chan_orders.ch_name)
+        ch = chan_orders.ch_name{i};
+        ch_in_fe(i) = sum( strcmp(unique(fe_res.ch_name), ch) );
+    end
+    
+    chan_orders = chan_orders(ch_in_fe, :);
+    chanlocs = pop_chanedit(chanlocs, 'delete', find(~ch_in_fe));
+    
+    % join channel order to fe_res
+    fe_res = join(fe_res, chan_orders, 'Keys', 'ch_name');
+    
+    %% get xyz colours
+    
+    % note: order of x and y are flipped, and y is reversed, to approximate the MNE coordinates
+    rgb = [-[chanlocs.Y]; [chanlocs.X]; [chanlocs.Z]];
+    
+    for colour_chan = 1:3
+        rgb(colour_chan, :) = normalize(rgb(colour_chan, :), 'range');
+    end
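+    % each coordinate is scaled to [0, 1] so an electrode's position maps
+    % directly onto its marker colour in the legend topoplot below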
+    
+    %% get fixed effects in matrices
+    
+    % coerce fixed effects into matrices of time x channel
+    mats = struct();
+    
+    for fe = fes'
+        % subset relevant rows of table
+        fe_res_i = fe_res(strcmp(fe_res.fe_lab, fe), :);
+        % sort rows of table
+        fe_res_i = sortrows(fe_res_i, {'ch_order', 'time'});
+        % extract fixed effect estimates and put in matrix
+        % (MATLAB reshape uses Fortran/column-major order, so time must vary fastest)
+        field_name = strrep(fe{:}, '*', '');  % asterisk disallowed in field name
+        field_name = regexprep(field_name, '\s\s', '_');  % space disallowed in field name
+        mats.(field_name) = reshape( fe_res_i.Estimate, [numel(times), numel(channels)] );
+    end
+    
+    %% Join all plots together
+    
+    fig = figure;
+    set(0, 'DefaultLineLineWidth', 1);
+    set(0, 'DefaultAxesLineWidth', 1);
+    vhline_width = 1.25;
+    
+    fe_names = {'Intercept', 'Congruency', 'Predictability', 'Congruency_Predictability'};
+    
+    % get tidy labels
+    fe_labels = {};
+    max_nchar = 0;
+    for fe = 1:numel(fe_names)
+        fe_labels{fe} = split( strcat( regexprep(fe_names{fe}, '_', ' ×nl'), ' (µV)' ), 'nl' );
+        if max(strlength(fe_labels{fe})) > max_nchar
+            max_nchar = max(strlength(fe_labels{fe}));
+        end
+    end
+    
+    % pad and justify with spaces
+    for fe = 1:numel(fe_names)
+        fe_labels{fe} = strjust(pad(fe_labels{fe}, max_nchar), 'center');
+    end
+    
+    % get topoplot settings
+    y_lims = [floor(min(fe_res.Estimate)), ceil(max(fe_res.Estimate))];
+    map_lims = [-max(abs(y_lims)), max(abs(y_lims))];
+    map_range = map_lims(2) - map_lims(1);
+    map_ticks = map_lims(1):(map_range/4):map_lims(2);
+    
+    cmap_resolution = 101;
+    topo_resolution = 101;
+    bwr = [0, 0, 1; 1, 1, 1; 1, 0, 0];
+    %cmap_bwr = interp1([-cmap_resolution/2; 0; cmap_resolution/2], bwr, (-cmap_resolution/2:cmap_resolution/2));
+    
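+    % twilight_shifted is assumed to be provided by matlab_extensions (a diverging
+    % colormap analogous to matplotlib's twilight_shifted)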
+    cmap_bwr = twilight_shifted(cmap_resolution);
+    
+    % timepoints (ms) at which to draw topographies (linearly interpolated from the estimates below)
+    topo_times = erp_topo_times{e_nr};
+    
+    % x axis for timecourse
+    time_lims = [round(min(times), -1), round(max(times), -1)];
+    time_ticks = time_mins(e_nr):time_steps(e_nr):time_lims(2);
+    
+    % parameters for adjusting plotting space
+    x_adj_l = 0.025;  % offset space to be added to the left
+    x_adj_r = 0.01;  % space to be added to width
+    
+    y_adj_top = -0.05; % space to be added (or removed if negative) from the top
+    
+    % define the shape of the subplots
+    sp_ncol = numel(topo_times);
+    sp_nrow = 2 * numel(fe_names);
+    
+    % generate subplot coordinates in normalised units
+    % xmin = .175;
+    xmin = .255;
+    xmax = .975;
+    ymin = -0.05;
+    ymax = 0.89;
+    
+    x_width = (xmax-xmin)/sp_ncol;
+    x_props = xmin:x_width:xmax;
+    y_height = (ymax-ymin)/sp_nrow;
+    y_props = flip(ymin:y_height:ymax);
+    
+    y_props_topo_adj = y_props + y_height*0.075;
+    topo_yheight = y_height * 0.8;
+    
+    % plot!
+    
+    % plot channel colours for the timecourse legend (drawn first so the colorbars added later are preserved)
+    colour_legend_pos = [xmin/2 - x_width/1.5, y_props_topo_adj(1), x_width, topo_yheight];  % x position may need adjusting if fig size changes
+    subplot('Position', colour_legend_pos);
+    hold on;
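+    % this appears to rely on topoplot treating an empty data vector (here n x 0) as
+    % "plot electrode locations only", drawing the head and markers with no scalp map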
+    topoplot(zeros(numel(chan_orders.ch_name), 0), chanlocs, 'electrodes', 'off');
+    % set line width
+    set(findall(gca, 'Type', 'Line'), 'LineWidth', 1);
+    for i = 1:numel(chan_orders.ch_name)
+        topoplot(zeros(numel(chan_orders.ch_name), 0), chanlocs, 'colormap', [0,0,0], 'emarker', {'.', rgb(:, i)', 11, 1}, 'plotchans', i, 'headrad', 0);
+    end
+    hold off
+    
+    % now, plot everything else
+    for fe = 1:numel(fe_names)
+        fe_name = fe_names{fe};
+    
+        mat_fe = mats.(fe_name);
+    
+        
+        if strcmp(fe_name, 'Intercept')
+            fe_topo_lims = map_lims;
+        else
+            if numel(slope_y_lims{e_nr})>0
+                fe_topo_lims = slope_y_lims{e_nr};
+            else
+                fe_topo_lims = map_lims;
+            end
+        end
+
+        topo_nfus = zeros(numel(topo_times), 2);
+    
+        % plot topoplots
+        for time_nr = 1:numel(topo_times)
+            subplot('Position', [x_props(time_nr), y_props_topo_adj( fe * 2 - 1 ), x_width, topo_yheight]);   
+            mat_fe_topo_t = interp1(times, mat_fe, topo_times(time_nr));
+            topoplot(mat_fe_topo_t, chanlocs, 'colormap', cmap_bwr, 'gridscale', topo_resolution, 'electrodes', 'off', 'maplimits', fe_topo_lims);
+            % get lines
+            lines = findall(gca, 'Type', 'Line');
+            % set line width
+            set(lines, 'LineWidth', 1);
+            % get surface to extract y locations (surf_h avoids shadowing the built-in surf)
+            surf_h = findall(gca, 'Type', 'Surface');
+            % get topo location to draw line to
+    %         [tx, ty] = ds2nfu(mean(lines(4).XData), min(lines(4).YData));
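+            % ds2nfu converts data-space coordinates to normalised figure units
+            % (presumably the File Exchange utility bundled in matlab_extensions)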
+            [tx, ty] = ds2nfu(0, min(surf_h.YData(:)));
+            topo_nfus(time_nr, :) = [tx, ty];
+        end
+        
+        % plot timecourse
+        subplot('Position', [x_props(1), y_props( fe * 2 ), x_width*sp_ncol, y_height])
+        hold on
+
+        % plot rectangle for jittered period
+        rect = event_rects{e_nr};
+        if ~isempty(rect)
+            rectangle('Position', [rect(1) map_lims(1) rect(2)-rect(1) map_range], 'FaceColor', [0.4, 0.4, 0.4], 'EdgeColor', 'none')
+        end
+        
+        if ~strcmp(fe_name, 'Intercept')
+            plot(times, mats.('Intercept'), 'Color', [0.8, 0.8, 0.8]);
+        end
+        
+        for ch = 1:size(mat_fe, 2)
+            plot(times, mat_fe(:, ch), 'Color', rgb(:, ch), 'LineWidth', 0.55);
+        end
+    
+        xlim(time_lims)
+        xticks(time_ticks)
+        set(gca,'FontSize', 8)
+    
+        ylim(fe_topo_lims)
+
+        xline(0, '--', 'LineWidth', vhline_width)
+        yline(0, '--', 'LineWidth', vhline_width)
+
+        for t = event_times{e_nr}
+            xline(t, '--', 'LineWidth', vhline_width)
+        end
+    
+        tc_nfus = zeros(numel(topo_times), 2);
+    
+        for time_nr = 1:numel(topo_times)
+            xline(topo_times(time_nr), 'LineWidth', vhline_width, 'Color', [0.4, 0.4, 0.4])
+            % get the location of the top of the line
+            [tcx, tcy] = ds2nfu(topo_times(time_nr), max(fe_topo_lims));
+            tc_nfus(time_nr, :) = [tcx, tcy];
+        end
+    
+        hold off
+    
+        % draw annotation lines between lines and topoplots
+        for time_nr = 1:numel(topo_times)
+            % colour is set to 0.6 intensity rather than the 0.4 used for the vertical lines above - hacky workaround for an apparent rendering bug
+            annotation('line', [tc_nfus(time_nr, 1), topo_nfus(time_nr, 1)], [tc_nfus(time_nr, 2), topo_nfus(time_nr, 2)], 'Color', [0.6, 0.6, 0.6], 'LineWidth', vhline_width);
+        end
+    
+        % plot styling
+    
+        % add colorbar to the y axis to show corresponding y and colour values
+        ax = gca;
+        ax_pos = get(ax, 'Position');
+        cb = colorbar(ax);
+
+        if strcmp(fe_name, 'Intercept')
+            cb.Ticks = map_ticks;
+        else
+            if numel(slope_y_ticks{e_nr})>0
+                cb.Ticks = slope_y_ticks{e_nr};
+            else
+                cb.Ticks = map_ticks;
+            end
+        end
+
+%         cb.Ticks = map_ticks;
+        cb.TickDirection = 'out';
+        cb.Position = [ax_pos(1)-0.015, ax_pos(2), .015, ax_pos(4)];
+    
+        cb.Label.String = fe_labels{fe};
+        cb.Label.Rotation = 0;
+        cb.Label.VerticalAlignment = 'middle';
+        cb.Label.HorizontalAlignment = 'right';
+        cb.FontSize = 8;
+        cb.Label.FontSize = 11;
+        
+        ax.Colormap = cmap_bwr;
+        ax.CLim = fe_topo_lims;
+        ax.YTickLabels = [];
+        ax.YLabel.String = '';
+        ax.YAxis.Visible = 'off';
+    
+        % ticks outside plot
+        set(gca,'TickDir','out');
+    
+        if strcmp(fe_name, 'Congruency_Predictability')
+            % add x label
+            xlabel('Time (ms)', 'FontSize', 12)
+        else
+            % remove x tick labels
+            set(gca,'XTickLabel', {[]});
+        end
+    
+        % overlay a white-coloured axes on top of the default one, hiding its box while keeping the ticks and labels (hacky)
+        ax2 = axes('Position',gca().Position,...
+        'XColor',[1 1 1],...
+        'YColor',[1 1 1],... 
+        'Color','none',...
+        'XTick',[],...
+        'YTick',[]);
+    
+    end
+    
+    set(fig, 'Units', 'Inches', 'Position', [0, 0, 6.5, 7], 'PaperUnits', 'Inches', 'PaperSize', [6.5, 7])
+    
+    plot_file = sprintf('figs/14_topo_timecourse_%s.pdf', e);
+    exportgraphics(fig, plot_file, 'BackgroundColor','none')
+
+    if strcmp(e, 'sample')
+
+        % plot the N400 model predictions
+        N400_t = 400;
+    
+        pred_levels_uncoded = 0.1:0.1:1;
+        pred_levels = interp1([0.07, 1], [0, 1], pred_levels_uncoded);  % predictability was rescaled from its observed range [0.07, 1] to [0, 1] before modelling
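+        % equivalently: pred_levels = (pred_levels_uncoded - 0.07) / (1 - 0.07)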
+        %cong_levels = [-0.5, 0.5];
+        cong_levels = [-0.4938494, 0.5061506];  % [incong, cong] - copied from the unique deviation values in the script that fits the models
+    
+        B = struct();
+        for fe_name = fe_names
+            B.(fe_name{:}) = interp1(times, mats.(fe_name{:}), N400_t);
+        end
+    
+        model_preds = zeros(numel(pred_levels), numel(cong_levels), height(chan_orders));
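+        % predicted amplitude per channel at the N400 timepoint, from the fixed effects alone:
+        % y = B.Intercept + B.Congruency*c + B.Predictability*p + B.Congruency_Predictability*(c*p)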
+
+%         topoplot(B.Congruency, chanlocs, 'colormap', cmap_bwr, 'gridscale', topo_resolution, 'electrodes', 'off');
+    
+        for pred_lvl_nr = 1:numel(pred_levels)
+            p = pred_levels(pred_lvl_nr);
+    
+            for cong_lvl_nr = 1:numel(cong_levels)
+                c = cong_levels(cong_lvl_nr);
+    
+                model_preds(pred_lvl_nr, cong_lvl_nr, :) = B.Intercept +...
+                    B.Congruency * c +...
+                    B.Predictability * p +...
+                    B.Congruency_Predictability * p * c;
+    
+            end
+        end
+    
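+        % round the limits outward to the nearest 0.5 µV and make them symmetric about zero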
+        abs_max = max(abs([floor(min(model_preds, [], 'all')*2)/2, ceil(max(model_preds, [], 'all')*2)/2]));
+        preds_topo_lims = [-abs_max, abs_max];
+        N400_fig = figure;
+
+        t = tiledlayout(numel(cong_levels), numel(pred_levels) + 2, 'TileSpacing', 'none', 'Padding', 'none');
+
+        for cong_lvl_nr = [2, 1]  % congruent row first, then incongruent
+            
+            nexttile;
+            
+            if cong_lvl_nr == 1
+                cong_lab = 'Incongruent';
+            else
+                cong_lab = 'Congruent';
+            end
+
+            text(0.33, 0.5, cong_lab, 'FontWeight', 'normal', 'HorizontalAlignment', 'center', 'FontSize', 9)
+            ax = gca;
+            ax.Visible = 0;
+
+            for pred_lvl_nr = 1:numel(pred_levels)
+                t_i = nexttile;
+                topoplot(model_preds(pred_lvl_nr, cong_lvl_nr, :), chanlocs, 'colormap', cmap_bwr, 'gridscale', topo_resolution, 'electrodes', 'off', 'maplimits', preds_topo_lims);
+                set(findall(gca, 'Type', 'Line'), 'LineWidth', 0.5);
+
+                if cong_lvl_nr == 2
+                    title( sprintf('%g%%', pred_levels_uncoded(pred_lvl_nr)*100), 'FontWeight', 'normal', 'FontSize', 9)
+                end
+            end
+
+            if cong_lvl_nr == 2
+                nexttile([2 1])
+                ax = gca;
+                ax.Visible = 0;
+
+                cb = colorbar(t_i);
+                cb.Ticks = preds_topo_lims(1):2.5:preds_topo_lims(2);
+                cb.TickDirection = 'out';
+                cb.Label.String = 'N400 Amplitude (µV)';
+                cb.FontSize = 8;
+                cb.Label.FontSize = 9;
+                cb.Position = [0.92,0.13,0.0125,0.75];
+                cb.Label.Position(1) = 0.8 * cb.Label.Position(1);
+            end
+
+        end
+
+        set(N400_fig, 'Units', 'Inches', 'Position', [0, 0, 6.5, 1.4], 'PaperUnits', 'Inches', 'PaperSize', [6.5, 1.4])
+
+        N400_plot_file = sprintf('figs/14_N400_%s.pdf', e);
+        exportgraphics(N400_fig, N400_plot_file, 'BackgroundColor','none')
+
+    end
+
+end

File diff suppressed because it is too large
+ 1469 - 0
04 Analysis/boss.csv


BIN
04 Analysis/erplab_running_version.erpm


+ 70 - 0
04 Analysis/max_elecs.csv

@@ -0,0 +1,70 @@
+subject_id,max_elec_bacs,max_time_bacs,max_diff_bacs,max_elec_noise,max_time_noise,max_diff_noise
+,,,,,,
+1,P9,136.71875,1.555241,P9,199.21875,5.496669
+2,TP7,146.484375,1.883921,PO7,199.21875,23.87398
+3,P9,121.09375,1.502857,TP7,193.359375,2.961614
+4,O1,199.21875,5.966243,O1,199.21875,6.576681
+5,TP7,154.296875,3.565121,CP5,150.390625,2.309417
+6,P5,142.578125,0.7830833,TP7,121.09375,1.709324
+7,TP7,150.390625,0.7797322,P9,199.21875,3.854183
+8,TP7,199.21875,3.41,O1,199.21875,20.64417
+9,P7,199.21875,3.668735,PO7,136.71875,8.711228
+10,CP5,150.390625,0.9335607,CP5,138.671875,1.580629
+11,P5,199.21875,3.511495,P9,199.21875,7.025156
+12,P5,199.21875,2.217614,P9,199.21875,9.104761
+13,PO7,199.21875,5.567117,PO7,199.21875,14.40367
+14,P7,199.21875,1.894452,P7,199.21875,6.690656
+15,PO7,199.21875,6.056674,O1,199.21875,12.43033
+16,CP5,162.109375,1.58543,TP7,197.265625,4.671831
+17,PO3,121.09375,2.466407,PO7,199.21875,9.661812
+18,PO3,132.8125,-0.02289961,CP5,199.21875,3.332692
+19,TP7,199.21875,3.400432,P9,199.21875,7.166287
+20,P5,199.21875,1.349259,P7,199.21875,6.417466
+21,P7,199.21875,2.822012,P7,199.21875,4.152136
+22,CP5,177.734375,0.6779526,PO7,132.8125,4.176888
+23,CP5,164.0625,2.344434,TP7,152.34375,3.858768
+24,CP5,197.265625,2.182065,PO7,199.21875,14.23336
+25,CP5,152.34375,1.23536,PO7,199.21875,14.43208
+26,TP7,199.21875,1.74915,PO7,199.21875,8.897171
+27,CP5,175.78125,1.160977,PO7,199.21875,22.96928
+28,PO7,191.40625,2.49857,P9,199.21875,5.424606
+29,CP5,199.21875,1.570606,P7,195.3125,7.983541
+30,CP5,175.78125,2.815941,P9,199.21875,10.37645
+31,P9,199.21875,4.782439,PO7,199.21875,9.887265
+32,CP5,156.25,0.9955507,O1,199.21875,9.634882
+33,PO7,199.21875,7.743178,PO7,199.21875,12.29184
+34,TP7,167.96875,1.280173,P7,199.21875,8.193825
+35,P7,199.21875,4.069245,PO7,199.21875,18.24561
+36,P9,199.21875,5.196513,PO7,199.21875,17.75576
+37,P9,121.09375,1.981554,P9,193.359375,2.278572
+38,PO3,199.21875,2.185105,P9,195.3125,16.96531
+39,TP7,185.546875,2.120515,TP7,185.546875,3.376915
+40,P7,121.09375,1.788295,TP7,136.71875,1.759744
+41,CP5,148.4375,1.817063,P9,199.21875,3.956536
+42,P7,199.21875,7.929962,PO7,199.21875,17.72324
+43,PO7,199.21875,8.668878,P9,199.21875,17.06076
+44,P7,142.578125,2.331278,P7,199.21875,5.105011
+45,P9,193.359375,5.297216,PO7,199.21875,16.17698
+46,TP7,166.015625,0.7291226,P9,199.21875,6.769402
+47,P5,193.359375,0.838177,TP7,191.40625,1.534477
+48,P9,199.21875,2.119571,P9,199.21875,5.703169
+49,P5,187.5,2.717711,PO7,199.21875,4.820813
+50,PO7,199.21875,1.921271,PO7,199.21875,9.987026
+51,CP5,138.671875,0.5824715,PO7,199.21875,18.37492
+52,P5,199.21875,2.306229,P9,195.3125,9.622493
+53,TP7,123.046875,3.42209,P9,199.21875,11.19392
+54,PO7,199.21875,8.316064,PO7,199.21875,15.9758
+55,P9,199.21875,4.639625,PO7,199.21875,21.08067
+56,P9,136.71875,1.963759,P9,199.21875,3.785243
+57,P9,136.71875,3.178551,O1,199.21875,8.29003
+58,O1,199.21875,6.043978,O1,193.359375,13.11263
+59,PO7,199.21875,3.767185,PO7,199.21875,12.67805
+60,TP7,199.21875,3.076091,PO7,199.21875,13.26992
+61,O1,138.671875,1.799824,PO7,142.578125,3.1345
+62,CP5,181.640625,2.055486,PO7,199.21875,23.85325
+63,P7,199.21875,6.139647,P7,191.40625,7.60163
+64,P7,125,4.379202,P9,199.21875,10.34344
+65,PO3,199.21875,5.218065,P7,199.21875,8.154162
+66,CP5,158.203125,1.408301,CP5,142.578125,1.091037
+67,P7,152.34375,3.889155,P7,199.21875,9.807606
+68,CP5,136.71875,0.7233823,P7,199.21875,3.693493