- function [goodtrials, sample_wise_goodtrials] = remove_outliers(data,zthres)
- % THIS FUNCTION TREATS DATA AS TIME X TRIALS
- % input:
- % - data (time x trials)
- % - zthres (z-score threshold for the trial means; required, 3 is the
- % suggested value)
- %
- % output:
- % - goodtrials (1 x trials vector of ones and zeros; trials whose mean
- % is an outlier are set to 0)
- % - sample_wise_goodtrials (as goodtrials, but trials containing any
- % outlying sample are also set to 0)
- %
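- % Example (an illustrative sketch, not from the original author; the
- % synthetic data and the variable names below are assumptions):
- %   rng(1);                          % reproducible random data
- %   data = randn(500, 40);           % 500 samples x 40 trials
- %   data(:, 7) = data(:, 7) + 5;     % shift one trial so its mean stands out
- %   [good, good_sw] = remove_outliers(data, 3);
- %   clean = data(:, logical(good));  % keep only trials flagged as good
- %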
- % Rob Teeuwen 20190815
- % trial-based outlier removal:
- % average across samples, giving 1 value per trial. z-score these
- % values, and flag trials whose absolute z-score exceeds zthres.
- q = mean(data,1); % average over time (dim 1), also correct for single-row data
- mn = mean(q);
- st = std(q);
- z = abs((q-mn)./st);
- badtrials = find(z>zthres);
-
- % in addition, z-score all samples against the mean and standard
- % deviation of the whole data matrix, and flag trials containing any
- % sample whose absolute z-score exceeds 10 (hard-coded threshold)
- z2 = zscore(data,0,'all');
- z2 = (abs(z2)>10);
- badtrials2 = find(any(z2,1));
- goodtrials = ones(1,size(data,2));
- goodtrials(badtrials) = 0;
-
- sample_wise_goodtrials = goodtrials;
- sample_wise_goodtrials(badtrials2) = 0;
- % check for extreme z-scores (above 20). such outliers inflate the
- % mean and standard deviation and can mask milder outliers, so
- % repeat the trial-based removal, with flagged trials excluded,
- % until no remaining trial has a z-score above 20.
- data_adj = data;
- while max(z) > 20
-
- % we have to treat outliers as missing values rather
- % than remove them, otherwise trial numbers won't match
- % up because 'data' will be of different size
- data_adj(:,badtrials) = NaN;
-
- q2 = mean(data_adj,1,'omitnan');
- mn2 = mean(q2,'omitnan');
- st2 = std(q2,'omitnan');
- z = abs((q2-mn2)./st2);
- badtrials = find(z>zthres);
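- % remove bad trials from both output arrays, so the sample-wise
- % result also excludes trials flagged in this pass
- goodtrials(badtrials) = 0;
- sample_wise_goodtrials(badtrials) = 0;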
- end
- end