I've been working on an fMRI dataset with 80+ subjects. For second-level thresholding, I've started using AFNI's 3dClustSim to estimate the appropriate cluster extent threshold given a particular height threshold (e.g. p < .005). The best (i.e. least likely to inflate Type I errors) method requires specific smoothness estimates from the first-level analysis. The basic workflow is:
- Write residuals from model estimation
- Estimate smoothness from the residuals
- Average smoothness estimates across participants
- Use average estimates to condition 3dClustSim when determining thresholds
Step 1. Write residuals
The first step is the have SPM write out the residuals images during model estimation. There are two relatively straightforward was to accomplish this:
- If you are using the Batch Editor, set "Write residuals" to Yes
- If you are instead scripting jobs, turn on model estimate like so:
matlabbatch{1}.spm.stats.fmri_est.spmmat = {'/path/to/subject/SPM.mat'};
matlabbatch{1}.spm.stats.fmri_est.write_residuals = 1; % save residuals
matlabbatch{1}.spm.stats.fmri_est.method.Classical = 1;
spm_jobman('initcfg')
spm_jobman('run', matlabbatch);
Once you have the residuals (one for each EPI image using during the model estimation), merge them into a single 4D file. I think the easiest is using FSL's fslmerge:
> fslmerge -t Residuals.nii `imglob Res_*`
If you want to stay in MATLAB, you can also get the same result using the Batch Editor, using the "3D to 4D File Conversion" utility. Again, you can script it instead:
stat_dir = '/path/to/stats/';
% Read in all Res_* files
res_files = spm_select('ExtFPList', repaired_dir, '^Res_.*.nii');
% Preallocate a cell to hold the files in the way SPM12 needs them
n_res = size(res_files, 1);
res_paths = cell(n_res,1); % must be a column vector, not a row vector
for ri=1:n_res
res_paths{ri} = res_files(ri,:);
end
res_batch{1}.spm.util.cat.vols = res_paths;
res_batch{1}.spm.util.cat.name = 'Residuals.nii';
res_batch{1}.spm.util.cat.dtype = 4;
spm_jobman('initcfg')
spm_jobman('run', res_batch);
I've also had to use ArtRepair on these data. Unfortunately, when ArtRepair re-estimates the design and generates new contrasts, it does not write out new residuals. I spent a day or so working on scripting up a solution to manually re-estimate the designs, until I discovered that SPM includes a function designed to accomplish this exact thing.
If you call spm_write_residuals() without any arguments, a GUI opens asking you to select an SPM.mat file, then indicate whether to adjust for the contrasts that have already been estimates (not desirable for these purposes).
You can also use it in a script:
sdat = load('/path/to/SPM.mat');
spm_write_residuals(sdat.SPM, 0);
Step 2. Estimate smoothness from residuals
The next step is to use AFNI's 3dFWHMx to estimate the smoothness parameters from the residuals. Setting the -acf flag is crucial. From the 3dFHWM man page:
A completely new method for estimating and using noise smoothness values is now available in 3dFWHMx and 3dClustSim. This method is implemented in the
https://afni.nimh.nih.gov/pub/dist/doc/program_help/3dFWHMx.html
'-acf' options to both programs. 'ACF' stands for (spatial) AutoCorrelation
Function, and it is estimated by calculating moments of differences out to
a larger radius than before.
Notably, real FMRI data does not actually have a Gaussian-shaped ACF, so the
estimated ACF is then fit (in 3dFWHMx) to a mixed model (Gaussian plus
mono-exponential) of the form
ACF(r) = a * exp(-r*r/(2*b*b)) + (1-a)*exp(-r/c)
where 'r' is the radius, and 'a', 'b', 'c' are the fitted parameters.
The apparent FWHM from this model is usually somewhat larger in real data
than the FWHM estimated from just the nearest-neighbor differences used
in the 'classic' analysis.
It's a good idea to mask the image with the mask.nii that SPM automatically generates. You'll also want to save the output to a text file to use later. I've found it easiest to use the terminal and navigate to the stats directory for the given subject:
> 3dFWHMx -mask mask.nii -acf tmp.txt Residuals.nii > acf_estimates.txt
Step 3. Average smoothness estimates across participants
In order to average the smoothness estimates, I've found it easiest to collect all of them into a single file. Specifically, the goal is to extract the first three values from the acf_estimates.txt file, which correspond to a, b, and c in the AutoCorrelation Function that 3dClustSim will use to run its Monte Carlo simulations. Again, I've found it easiest to use the terminal to accomplish this. The bash commands assumes your data is laid out like this:
- Study
- sub-01
- func
- stats
- func
- sub-02
- func
- stats
- func
- sub-01
You'll need to modify the exactly path specifications for this to run on your local system, but this bash loop will find every subject, then read the last line from the acf_estimates.txt file and append the Subject ID along with the first three values to a new file, acf_all_estimates.txt in the Study folder.
> STUDY_DIR="/path/to/Study/"
> cd STUDY_DIR
> for SUB in sub-[0-9][0-9]; do tail -n1 "${SUB}/stats/acf_estimates.txt" | awk -v subj=$SUB 'BEGIN { OFS="," } {print subj,$1,$2,$3}' >> acf_all_estimates.txt; done
Now that the estimates are merged into a single file, use awk again to average them
> awk -F ',' '{sum1 += $2; sum2 += $3; sum3 += $4;} END {print sum1/NR,sum2/NR,sum3/NR}' < acf_all_estimates.txt > smoothness_estimates.txt
Step 4. Use average estimates to condition 3dClustSim when determining thresholds
Finally, you can read in the averaged a, b, and c values to start up 3dClustSim. Here, it's helpful to use the mask.nii file generated by SPM at the second-level:
> 3dClustSim -acf `cat smoothness_estimates.txt` -mask 2nd_level/con_0001_A_gt_B/mask.nii > 3dClust_thresholds.txt
The above script saves the tables generated by 3dClustSim to a file called 3dClust_thresholds.txt. Its contents contain the tables of cluster extents, based on your desired alpha level and various height thresholds. It includes nine tables, using nearest-neighbor counting of 1, 2, or 3, first for one-sided thresholding, then two-sided thresholding, then bi-directional thresholding. For this study, I'm interested in the bi-directional values, using the nearest neighbor of 3, so my table is:
# CLUSTER SIZE THRESHOLD(pthr,alpha) in Voxels
# -NN 3 | alpha = Prob(Cluster >= given size)
# pthr | .10000 .05000 .02000 .01000
# ------ | ------ ------ ------ ------
0.050000 1980.5 2355.0 2854.5 3234.0
0.020000 949.0 1168.0 1455.3 1674.0
0.010000 569.0 730.8 921.0 1102.0
0.005000 344.5 466.5 623.0 740.7
0.002000 176.2 253.2 369.5 452.0
0.001000 101.6 155.0 239.7 320.0
0.000500 54.0 89.6 152.5 209.0
0.000200 18.7 38.7 80.3 114.0
0.000100 6.6 17.6 43.0 71.2
This means that for a height threshold of p < .005 and alpha = .05, the cluster extent threshold should be 466.