
Simulate a longitudinal brain surface dataset with associated phenotype data
Source: R/simulate_dataset.R
simulate_dataset.Rd
Generates a synthetic longitudinal dataset for multiple sites/cohorts, each with multiple timepoints/sessions per subject. The function produces:
- Brain surface data in FreeSurfer .mgh format, organised in a verywise folder structure (see vignettes for details).
- A matching pheno data frame with mock participant sex and age, saved as "phenotype.csv" in the path directory.
This is useful for testing pipelines or demonstrations where realistic FreeSurfer-style data and phenotypic information are required.
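A minimal call might look like the sketch below. It assumes the function is exported by the verywise package (inferred from the folder-structure reference above) and that the arguments can be passed by the names documented on this page. The cohort names, session labels, and output directory are hypothetical. The call is guarded so the snippet does nothing if the package is not installed.

```r
# Hypothetical cohort layout: names and session labels are illustrative only.
data_structure <- list(
  cohort1 = list(sessions = c("01", "02"), n_subjects = 10L),
  cohort2 = list(sessions = c("01", "02", "03"), n_subjects = 5L)
)

# Write into a temporary directory so nothing persists after the R session.
out_dir <- file.path(tempdir(), "simulated_data")

# Assumption: simulate_dataset() is exported by the 'verywise' package.
if (requireNamespace("verywise", quietly = TRUE)) {
  verywise::simulate_dataset(
    path = out_dir,
    data_structure = data_structure,
    fs_template = "fsaverage3", # lowest-resolution template, quickest to write
    seed = 3108,
    verbose = FALSE
  )

  # The phenotype table is documented as being saved as "phenotype.csv" in `path`.
  pheno <- read.csv(file.path(out_dir, "phenotype.csv"))
  str(pheno)
}
```

Using a low-resolution template and a temporary directory keeps a test run fast and leaves no files behind.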
Arguments
- path
Character string. Directory where the dataset should be created. Will be created if it does not exist.
- data_structure
Named list defining cohorts/sites. Each element is a list with:
  "sessions": Character vector of session labels.
  "n_subjects": Integer number of subjects.
- fs_template
Character (default = "fsaverage"). FreeSurfer template for vertex registration. This is used to determine the size of the synthetic brain surface data. Options:
  "fsaverage" = 163842 vertices (highest resolution)
  "fsaverage6" = 40962 vertices
  "fsaverage5" = 10242 vertices
  "fsaverage4" = 2562 vertices
  "fsaverage3" = 642 vertices
- simulate_association
Optional. If numeric, must be of length equal to the number of generated files; if character, must have the format "<beta> * <variable_name>". Associations are injected into one small region (the entorhinal cortex).
- overwrite
Logical (default = TRUE). Whether to overwrite an existing phenotype file.
- seed
Integer (default = 3108). Random seed.
- verbose
Logical (default = TRUE). If TRUE, print progress messages.
- ...
Additional arguments passed to simulate_freesurfer_data.
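The argument values above can be sanity-checked with a few lines of base R. The sketch below is illustrative only. The helper n_vertices() is hypothetical and not part of the package; it reproduces the listed vertex counts from the icosahedral subdivision rule 10 * 4^n + 2, treating plain "fsaverage" as order 7. The effect size 0.05 and the column name "age" are assumptions chosen for the example. The subject-session count at the end may or may not equal the number of generated files that a numeric simulate_association must match.

```r
# Hypothetical helper (not part of the package): vertex count of an
# icosahedral template of order n, computed as 10 * 4^n + 2.
n_vertices <- function(fs_template) {
  # Plain "fsaverage" is order 7; "fsaverage6" is order 6, and so on.
  order <- if (fs_template == "fsaverage") {
    7L
  } else {
    as.integer(sub("fsaverage", "", fs_template))
  }
  10 * 4^order + 2
}

templates <- c("fsaverage", "fsaverage6", "fsaverage5", "fsaverage4", "fsaverage3")
vertex_counts <- vapply(templates, n_vertices, numeric(1))
print(vertex_counts)

# Character form of simulate_association: "<beta> * <variable_name>".
# Assumption: the phenotype table has a column named "age".
association <- paste(0.05, "*", "age")
print(association)

# Hypothetical layout, used only to count subject-session combinations.
data_structure <- list(
  cohort1 = list(sessions = c("01", "02"), n_subjects = 10L),
  cohort2 = list(sessions = c("01", "02", "03"), n_subjects = 5L)
)
n_obs <- sum(vapply(
  data_structure,
  function(x) as.numeric(x$n_subjects) * length(x$sessions),
  numeric(1)
))
print(n_obs)
```

Building the association string programmatically avoids formatting slips when looping over several effect sizes or variables.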