Run cifti_conn_wrapper.py to generate correlation matrices from BOLD dense or parcellated time series data.
Clone this repository and save it somewhere on the Linux system that you want to use it from.
- Python v3.5.2 or greater
- MathWorks MATLAB Runtime Environment (MRE) version 9.1 (2016b)
- Washington University Workbench Command (wb_command)
This wrapper uses .conc text files for several of its inputs. All of them should only contain valid file paths. Although relative paths can be used, they will only work when running the wrapper from the right directory, so using absolute paths is recommended. Each line of each .conc file should have one path and nothing else.
All .conc files used as input to this wrapper should have the same number of file paths, and therefore the same number of lines. That is required because the wrapper assumes that each path represents the same subject session as the path at the same line in every other input .conc file.
None of the time series files listed in the .conc files should have exactly the same base file name. If any do, then the wrapper will overwrite one of their outputs with the other one's outputs.
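The two requirements above (equal line counts across `.conc` files, and no repeated base file names) can be checked before a run. The following is a minimal sketch, not part of the wrapper; the helper names `read_conc` and `check_conc_lists` are hypothetical.

```python
import os
from collections import Counter


def read_conc(conc_path):
    """Return the non-empty lines of a .conc file as a list of paths."""
    with open(conc_path) as infile:
        return [line.strip() for line in infile if line.strip()]


def check_conc_lists(conc_lists):
    """Return a list of problems found in a {name: [paths]} dictionary."""
    problems = []
    # Every .conc file must list the same number of paths.
    lengths = {name: len(paths) for name, paths in conc_lists.items()}
    if len(set(lengths.values())) > 1:
        problems.append("line counts differ: {}".format(lengths))
    # No two paths in one .conc file may share a base file name.
    for name, paths in conc_lists.items():
        counts = Counter(os.path.basename(path) for path in paths)
        repeated = sorted(base for base, n in counts.items() if n > 1)
        if repeated:
            problems.append("{} repeats base names: {}".format(name, repeated))
    return problems
```

To use it on real files, build the dictionary with `read_conc`, for example `check_conc_lists({"series": read_conc(series_conc), "motion": read_conc(motion_conc)})`; an empty list means no problems were found.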
If you want to run this wrapper on only one subject session, you do not need to use a .conc file. Instead, you can use the file paths that would have gone into the .conc files directly. For example, you can use your *.ptseries.nii file path for the series-file argument, the *power_2014_FD_only.mat file for the --motion argument, etc.
This wrapper processes dense (dtseries) or parcellated (ptseries) time series data. For dtseries, each subject session is expected to have one file following the naming convention XXXXX_Atlas.dtseries.nii. Smoothing will create an additional dtseries called XXXX_Atlas_SMOOTHED.dtseries.nii.
The wrapper expects file naming conventions to follow BIDS specifications. So, every file must have a different name. If multiple files have the same name, then they will collide and all but one of their output files will be overwritten.
When run in matrix mode with the --make-conn-conc flag, the wrapper will create a .conc file listing the paths to the connectivity matrix files. It will automatically generate the filename, and this filename depends on several parameters (including series-file, --fd-threshold, and --motion). So a .conc file made from a run with specific values of those parameters may not work for runs with other values of those parameters. Changing the name of the .conc file may fix this problem.
- Run `matrix` mode on your `dtseries` or `ptseries` to generate a correlation matrix (in Fisher-Z by default) of all the greyordinates/parcellations to all other greyordinates/parcellations.
- Run `template` mode to build an average template connectivity matrix of a list of subjects. It adds all the connectivity matrices one by one, then divides by the number of subjects.
- Run `pairwise_corr` mode to make a correlation of correlation matrices. This compares the connectivity matrix of each individual to the template, and provides a vector where each element is the correlation of connectivity to that greyordinate/parcellation.
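The arithmetic behind the template and pairwise correlation steps can be sketched in plain Python on small nested lists. This is only an illustration of the description above, not the compiled MATLAB code; the function names are hypothetical, and whether the real code includes the diagonal in each row-wise correlation is not stated here.

```python
import math


def fisher_z(r):
    """Fisher Z-transform of a correlation coefficient (requires |r| < 1)."""
    return math.atanh(r)


def average_matrices(matrices):
    """Add the matrices element by element, then divide by their count."""
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for matrix in matrices:
        for i in range(n_rows):
            for j in range(n_cols):
                total[i][j] += matrix[i][j]
    return [[value / len(matrices) for value in row] for row in total]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pairwise_corr(subject, template):
    """One value per row: correlation of the subject row with the template row."""
    return [pearson(s_row, t_row) for s_row, t_row in zip(subject, template)]
```

Each element of the list returned by `pairwise_corr` corresponds to one greyordinate or parcel, which matches the "vector" described for `pairwise_corr` mode.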
cifti_conn_wrapper.py can run any of these modes. It requires 4 positional arguments, and accepts many optional arguments. Each mode corresponds to one compiled MATLAB file in the ./src/ directory.
These arguments must be given in order:
- `series-file` takes one path to the `.conc` file with a list of paths to each dense (`dtseries`) or parcellated (`ptseries`) time series file.
- `tr` takes the repetition time (time interval between frames of a scan) for your data as a floating-point number.
- `output` takes a path to the directory which will be filled with output data.
- `scripts-to-run` takes one or more arguments, the names of each mode that you want to run.
For more usage information, call this script with the --help flag: python3 cifti_conn_wrapper.py --help
Since only the four positional arguments are technically required, this is a valid command:
python3.5 cifti_conn_wrapper.py ./raw/group_ptseries.conc 2.5 ./data/ matrix
However, running with no optional arguments will not do any motion correction or smoothing. It will also include subject sessions with any amount of good data.
Here is a common use case of the cifti_conn_wrapper in matrix mode:
python3.5 cifti_conn_wrapper.py --motion raw/round1_motion.conc --mre-dir /home/code/MATLAB_MCR/v91 --wb-command /home/code/workbench/bin_linux64/wb_command --minutes 4 --fd-threshold 0.3 /home/data/processed/round1_ptseries.conc 2.5 /home/data/processed matrix
This case includes the 4 required arguments, as well as the fd-threshold, minutes, motion, mre-dir, and wb-command optional arguments.
This wrapper can run any combination of the three modes in any order. To run multiple modes, list all of their names in the order that you want the wrapper to run them. Here is an example command which will run all three scripts in order:
python3.5 cifti_conn_wrapper.py ./raw/group_ptseries.conc 4 ./data/ matrix template pairwise_corr
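When the wrapper is launched from another Python script, the same commands can be assembled as an argument list. The sketch below is a hypothetical helper (`build_wrapper_command` is not part of this repository); it only arranges the positional arguments and flags documented in this README and does not validate them.

```python
def build_wrapper_command(series_file, tr, output, modes, **options):
    """Assemble an argument list for cifti_conn_wrapper.py.

    Keyword options become flags: fd_threshold=0.3 gives "--fd-threshold 0.3",
    and a value of True gives a bare flag such as "--make-conn-conc".
    """
    command = ["python3", "cifti_conn_wrapper.py"]
    for name, value in sorted(options.items()):
        flag = "--" + name.replace("_", "-")
        if value is True:
            command.append(flag)
        elif value is not None and value is not False:
            command.extend([flag, str(value)])
    # Positional arguments must be given in this order.
    command.extend([str(series_file), str(tr), str(output)])
    # Modes run in the order they are listed.
    command.extend(modes)
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` from the directory containing `cifti_conn_wrapper.py`.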
There are three kinds of options: Server-dependent flags used by all 3 run modes, flags used by matrix and template mode, and flags used by template and pairwise_corr mode.
| | matrix | template | pairwise_corr |
|---|---|---|---|
| `--mre-dir` | Y | Y | Y |
| `--wb-command` | Y | Y | Y |
| `--additional-mask` | Y | Y | |
| `--beta8` | Y | Y | |
| `--dtseries` | Y | Y | |
| `--fd-threshold` | Y | Y | |
| `--left` and `--right` | Y | Y | |
| `--make-conn-conc` | Y | Y | |
| `--minutes` | Y | Y | |
| `--motion` | Y | Y | |
| `--remove-outliers` | Y | Y | |
| `--smoothing-kernel` | Y | Y | |
| `--suppress-warnings` | Y | Y | |
| `--keep-conn-matrices` | | Y | Y |
| `--template` | | Y | Y |
These arguments apply to all 3 run modes. If they are excluded, then by default the wrapper will use hardcoded paths which are only valid on the RUSHMORE or Exacloud servers. So these arguments are required if this script is run on a different server or locally. However, if there is already a wb_command file in the user's BASH PATH variable, then the script will use that.
- `--mre-dir` takes one argument, a valid path to the MATLAB Runtime Environment directory. Example: `--mre-dir /usr/home/code/Matlab2016bRuntime/v91`
- `--wb-command` takes one argument, a valid path to the workbench command file. Example: `--wb-command /usr/local/home/wb_command`
These optional arguments apply only to the first 2 run modes, matrix and template.
- `--smoothing-kernel` takes the smoothing kernel as one floating-point number. Include this argument to use smoothing, but not otherwise. If the `series-file` has `ptseries` files, then smoothing will use the `--dtseries` argument. By default, the wrapper will not do smoothing.
- `--minutes` takes the minutes limit as one floating-point number. The minutes limit is the minimum number of minutes of good data necessary for a subject session to be included in the correlation matrix. By default, the wrapper will have no minutes limit and make an `allframesbelowFDX` matrix. In that case, subjects will have different numbers of time points in their connectivity matrices.
- `--fd-threshold` takes the motion threshold distinguishing good from bad data. This floating-point number is the maximum amount of acceptable motion between frames of a scan. Lowering the FD threshold excludes more data by setting a more stringent quality requirement. The default value is `0.2`.
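How `--fd-threshold`, `tr`, and `--minutes` interact can be illustrated with a short sketch. It assumes that a frame counts as good when its framewise displacement (FD) is at or below the threshold, and that minutes of good data equal the number of good frames times `tr` divided by 60; the exact comparison used by the compiled code is not documented here, and the function names are hypothetical.

```python
def good_frame_mask(fd_values, fd_threshold=0.2):
    """Return 1 for frames whose FD is at or below the threshold, else 0."""
    return [1 if fd <= fd_threshold else 0 for fd in fd_values]


def minutes_of_good_data(mask, tr):
    """Convert a count of good frames into minutes, given tr in seconds."""
    return sum(mask) * tr / 60.0


def passes_minutes_limit(fd_values, tr, fd_threshold=0.2, minutes=None):
    """With no minutes limit, every subject session is included."""
    if minutes is None:
        return True
    mask = good_frame_mask(fd_values, fd_threshold)
    return minutes_of_good_data(mask, tr) >= minutes
```

For a 120-frame scan with `tr` of 2.5 seconds, 96 good frames give exactly 4 minutes of good data, so that session would just meet `--minutes 4`.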
These arguments can be included without a value.
- `--remove-outliers` will remove outliers from the BOLD signal. By default, frames will only be censored by the FD threshold.
- `--make-conn-conc` will make a list of connectivity matrices created by this wrapper. By default, the wrapper will not make a list. Running `pairwise_corr` mode will only work if there is already a list of connectivity matrices. Create one by running `matrix` or `template` mode with the `--make-conn-conc` flag. That will automatically build the `.conc` file name and save that file in the `output` folder.
- `--beta8` will run a beta version to reduce file size. Include this argument to reduce floating-point precision and discard the lower triangle of the matrix. Exclude it to leave the matrix unchanged. If included, this will produce 8 GB `.dconn` files. Otherwise, this will make 33 GB `.dconn` files. This option does nothing for `ptseries`.
- `--suppress-warnings` will prevent the wrapper from asking the user for confirmation if the `.dconn` files created by the wrapper will exceed a certain threshold. By default, the wrapper will warn the user if it will create files totaling over 100 GB. This argument does nothing for `ptseries`.
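The 33 GB and 8 GB figures are consistent with a back-of-the-envelope estimate. The sketch below assumes the standard 91,282-greyordinate CIFTI space, 4-byte values for the full matrix, and 2-byte values for an upper triangle that includes the diagonal; none of these numbers are stated in this README, so treat them as assumptions rather than a description of the actual file layout.

```python
GREYORDINATES = 91282  # assumption: standard 91k greyordinate space


def full_matrix_bytes(n, bytes_per_value=4):
    """Size of a full n-by-n matrix (assumed 4-byte values)."""
    return n * n * bytes_per_value


def upper_triangle_bytes(n, bytes_per_value=2):
    """Size of the upper triangle with diagonal (assumed 2-byte values)."""
    return n * (n + 1) // 2 * bytes_per_value


def exceeds_warning_limit(n_sessions, bytes_per_file, limit_gb=100):
    """True if the total output size would pass the 100 GB warning limit."""
    return n_sessions * bytes_per_file > limit_gb * 1e9


full_gb = full_matrix_bytes(GREYORDINATES) / 1e9
beta8_gb = upper_triangle_bytes(GREYORDINATES) / 1e9
print("full: {:.1f} GB, beta8: {:.1f} GB".format(full_gb, beta8_gb))
```

Under these assumptions, four full-size `.dconn` files are already enough to pass the 100 GB warning limit.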
Each of the arguments below accepts one value, a valid file or directory path. Each argument has a default value, but the user can optionally specify a different path.
- `--input` takes a path to the directory containing all input `.conc` files. By default, this will be the directory containing `series-file`.
- `--motion` takes the name of a `.conc` text file listing paths pointing to FNL motion mat files for each dt or ptseries. This flag is necessary for motion correction, since by default the wrapper will not do any motion correction.
- `--left` and `--right` take the `.conc` file name of subjects' left and right midthickness files, respectively. These arguments are only needed for smoothing. If these flags are included without a filename, then `ADHD_DVARS_group_left/right_midthickness_surfaces.conc` will be used as the default name. If only a filename is given, then the wrapper will look for the file in the `--input` directory. However, this argument also accepts absolute paths.
- `--dtseries` takes the path to one `.conc` file with a list of `.dtseries.nii` file paths. If the `series-file` has a list of paths to `.ptseries.nii` files, then a dtseries `.conc` file is still needed for outlier detection and removal. If `--dtseries` is excluded, then this script will try to find a dtseries `.conc` file in the same location as the `series-file` argument.
- `--additional-mask` takes the path to an additional mask on top of the FD threshold. The mask should be a `.txt` file with `0`s and `1`s, where `0`s are frames to be discarded and `1`s are frames to be used to make your matrix. This mask can be used to extract rests between blocks of task. This vector of `0`s and `1`s should be the same length as your dtseries. By default, no additional mask is used.
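Applying an additional mask on top of the FD mask amounts to keeping only the frames that both masks mark with `1`. The following sketch assumes the `.txt` mask holds whitespace-separated `0`s and `1`s (for example one per line), which is an assumption about the file layout; the function names are hypothetical.

```python
def parse_mask_text(text):
    """Parse whitespace-separated 0s and 1s into a list of integers."""
    values = [int(token) for token in text.split()]
    if any(value not in (0, 1) for value in values):
        raise ValueError("mask may only contain 0s and 1s")
    return values


def combine_masks(fd_mask, additional_mask):
    """Keep a frame only if both masks mark it as usable (1)."""
    if len(fd_mask) != len(additional_mask):
        raise ValueError("masks must have one entry per frame")
    return [a & b for a, b in zip(fd_mask, additional_mask)]
```

For a task scan, the additional mask would hold `1` for the rest frames between task blocks and `0` elsewhere, so the combined mask selects low-motion rest frames only.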
These arguments only apply to the second and last run modes, template and pairwise_corr.
- `--keep-conn-matrices` will make the wrapper keep the `dconn`/`pconn` files after creating them. Otherwise, it will delete the `d/pconns` after adding them to the average `d/pconn`. The `d/pconn` files are needed to run the `pairwise_corr` mode, which compares the `d/pconn` files to the average file created by the `template` mode. So if this flag is excluded and `pairwise_corr` is run, then the `d/pconn` files will be kept until `pairwise_corr` finishes.
- `--template` takes the full path to a template file created by running `template` mode. If `pairwise_corr` mode is run before `template` mode, then the `--template` file should already exist at the specified path. By default, the wrapper will build the name of this file by combining the values of the `series-file`, `--motion`, `--fd-threshold`, and `--minutes` arguments. The wrapper will then create, or look for, a file by that name in the `output` directory if needed.
Within the output folder, here is what the outputs will look like:
- The output of `cifti_conn_matrix` will look like `./data/XXXXXX-X_FNL_preproc_Gordon_subcortical.ptseries.nii_5_minutes_of_data_at_FD_0.2.pconn.nii`
- The output of `cifti_conn_template` will look like `./data/dtseries_AVG.dconn.nii`
- The output of `cifti_conn_pairwise_corr` will look like `./data/XXXXXX-X_FNL_preproc_Gordon_subcortical.ptseries.nii_5_minutes_of_data_at_FD_0.2.pconn.nii`
This code builds a connectivity matrix from BOLD time series data. The code has the option of using motion censoring (highly encouraged) to ensure that the connectivity matrix is accurate. It takes these arguments from the wrapper: series-file, time-series, motion-file, fd-threshold, tr, minutes-limit, smoothing-kernel, left, right, and beta8.
The time-series argument passed to cifti_conn_matrix is either 'dtseries' or 'ptseries'. The wrapper infers it from the series-file.
To avoid conflating multiple files with the same name listed in the input .conc, for each connectivity matrix, this wrapper generates a hash composed of the first character of each folder name in its .d/ptseries.nii file's path. The wrapper also appends the first 5 characters of the .d/ptseries.nii file name. Because the hash is derived from the path rather than generated randomly, each output connectivity matrix gets a unique but consistent filename.
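The naming rule described above can be sketched as follows. This is an illustration only: the function name `path_tag` is hypothetical, and the exact way the wrapper joins the two parts or embeds them in the output filename is an assumption.

```python
import os


def path_tag(series_path):
    """Build a tag from folder initials plus the start of the file name."""
    directory, file_name = os.path.split(os.path.abspath(series_path))
    # First character of each folder name along the path.
    folder_initials = "".join(
        part[0] for part in directory.split(os.sep) if part
    )
    # First 5 characters of the time series file name.
    return folder_initials + file_name[:5]
```

Two files with the same name in different folders, such as `/data/a/x_Atlas.dtseries.nii` and `/data/b/x_Atlas.dtseries.nii`, get different tags as long as the folder initials differ. Folders that share all their initials would still produce the same tag.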
This code builds a template d/pconn from a list of d/ptseries. If the d/pconn exists, it will load it instead of making it anew (calls cifti_conn_matrix_for_wrapper). It takes the same arguments from the wrapper as cifti_conn_matrix_for_wrapper, as well as keep-conn-matrices.
This compares the connectivity matrix of each individual to the template (see cifti_conn_template_for_wrapper), and provides a vector where each element is the correlation of connectivity to that greyordinate/parcellation. It takes these arguments from the wrapper: template, p-or-dconn, make-conn-conc, and keep-conn-matrices.
The p-or-dconn argument passed to cifti_conn_pairwise_corr_exaversion is simply the time-series argument reformatted: dtseries becomes dconn, and ptseries becomes pconn. By default, the wrapper builds the name of the matrices-conc file by combining the series-file, time-series, minutes, and fd-threshold arguments.
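The two derived arguments, time-series and p-or-dconn, can be sketched as simple lookups. This sketch assumes the type is read from the file extension of a time series path; how the wrapper actually inspects the `series-file` (for example, by reading the first path in a `.conc` file) is not specified here, and both function names are hypothetical.

```python
def infer_time_series(series_path):
    """Return 'dtseries' or 'ptseries' based on the file extension."""
    for kind in ("dtseries", "ptseries"):
        if series_path.endswith("." + kind + ".nii"):
            return kind
    raise ValueError("not a dtseries or ptseries file: " + series_path)


def to_conn_type(time_series):
    """Reformat the time-series argument: dtseries -> dconn, ptseries -> pconn."""
    return {"dtseries": "dconn", "ptseries": "pconn"}[time_series]
```

For example, `to_conn_type(infer_time_series("sub_Atlas.dtseries.nii"))` gives `"dconn"`.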
- The code added a pause function each time it computes outliers. The pause function was added because sometimes the outlier file was read before it was done being written.
- The outlier file persists after creation (currently saved as `dtseries_temp.txt`). This file needs to be deleted as part of regular clean-up, but that has not been implemented.
- v1: added an option to make an "all-frames" dconn (using "none" as minutes limit). This will make dconns from subjects with differing amounts of time series data, but for subjects with a considerable amount of data, this becomes less of an issue.
- added an option to keep dconns in cifti_conn_template.m rather than automatically deleting them.
- v1: add an option to remove outliers from the bold signal.
- Added an option to provide an additional mask in addition to the FD mask provided. This allows one to use periods of rest between tasks.
- Added an option to not make a list of dconns after running. When this code was run in parallel, it would often create many small files.
- Wrote Python CLI wrapper to run all three scripts, wrote documentation for the wrapper, and reorganized directory structure
- Fixed a bug where the wrapper would not run unless the connectivity matrix list `.conc` file already existed, even though the wrapper includes the functionality to make that list.
- Fixed a bug where `cifti-conn-matrix` would treat two different files with the same name in different folders as if they were the same file, skipping over one. Also, updated README to reflect that `--remove-outliers`, `--additional-mask`, and `--make-conn-conc` were added.
- Added the `--dtseries` parameter.
- Updated description of how the hash appended to each connectivity matrix filename is generated.
- Added details about file naming limits/conventions which may cause problems if overlooked