Splits
We often need to divide the original dataset into subgroups to see whether they show different distributions. For example, how do the distributions of babies who died a few days after birth compare with those of the general population?
In this program, this is handled by the notion of split.
A split generates subgroups from the original groups.
In general, a study has a split with a single subgroup that contains the whole dataset.
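One way to picture a split is as a function that assigns each record to a subgroup label. The sketch below is only an illustration of that idea, not this program's implementation: the names `AGE_BINS`, `age_split`, `all_split` and `apply_split` are hypothetical, records are assumed to be `(birth, death)` date pairs, and the age bins are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical age bins in days: (label, lower bound inclusive, upper bound exclusive).
# These are illustrative only and are not the program's actual bins.
AGE_BINS = [
    ("0-2days", 0, 2),
    ("2days-1year", 2, 365),
    ("1year-150years", 365, 150 * 365),
]

def age_split(record):
    """Return the subgroup label for one (birth, death) record, or None."""
    birth, death = record
    age_days = (death - birth).days
    for label, low, high in AGE_BINS:
        if low <= age_days < high:
            return label
    return None  # the record falls outside every bin

def all_split(record):
    """A split with a single subgroup that contains the whole dataset."""
    return "0-150years"

def apply_split(records, split):
    """Group records by the label that the split function assigns to each."""
    subgroups = defaultdict(list)
    for record in records:
        label = split(record)
        if label is not None:
            subgroups[label].append(record)
    return dict(subgroups)

records = [
    (date(1900, 1, 1), date(1900, 1, 2)),  # died after 1 day
    (date(1900, 3, 1), date(1900, 9, 1)),  # died after about 6 months
    (date(1900, 5, 5), date(1980, 5, 5)),  # died after 80 years
]

by_age = apply_split(records, age_split)
everyone = apply_split(records, all_split)
```

Here `by_age` holds one list of records per age bin, while `everyone` holds a single subgroup with all three records, which mirrors the split that contains the whole dataset.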
The working directory
Taking the example of a dataset with birth and death dates, and a study which defines one split, called "age" (making sub-groups depending on the age at death):

    working-dir
    ├── controls
    │   ├── control-001
    │   │   ├── birth
    │   │   ├── birth-death
    │   │   └── death
    │   ├── ...
    │   └── control-100
    │       ├── birth
    │       ├── birth-death
    │       └── death
    ├── split-age
    │   ├── 01--0-2days
    │   │   ├── chi2
    │   │   ├── data.bz2
    │   │   ├── expected
    │   │   │   ├── birth
    │   │   │   ├── birth-death
    │   │   │   └── death
    │   │   └── observed
    │   │       ├── birth
    │   │       ├── birth-death
    │   │       └── death
    │   ├── ...
    │   └── 08--90years-150years
    │       ├── chi2
    │       ├── data.bz2
    │       ├── expected
    │       │   ├── birth
    │       │   ├── birth-death
    │       │   └── death
    │       └── observed
    │           ├── birth
    │           ├── birth-death
    │           └── death
    └── split-all
        └── 01--0-150years
            ├── chi2
            ├── data.bz2
            ├── expected
            │   ├── birth
            │   ├── birth-death
            │   └── death
            └── observed
                ├── birth
                ├── birth-death
                └── death
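The layout above follows a regular naming pattern, which can be sketched with a few path helpers. This is a minimal sketch under assumptions: the helper names (`subgroup_dir`, `subgroup_entries`, `control_dir`) are hypothetical, and the patterns (`split-<name>`, a two-digit index followed by `--` and a label, `control-` followed by a three-digit number) are inferred only from the names visible in the tree. Nothing here is taken from the program's own code.

```python
from pathlib import Path

# Entry names that appear under controls, expected and observed in the tree.
DISTRIBUTIONS = ("birth", "birth-death", "death")

def subgroup_dir(working_dir, split, index, label):
    """Directory of one subgroup, e.g. working-dir/split-age/01--0-2days."""
    # Assumed pattern: split-<name>/<two-digit index>--<label>
    return Path(working_dir) / f"split-{split}" / f"{index:02d}--{label}"

def subgroup_entries(working_dir, split, index, label):
    """List every path shown in the tree under one subgroup directory."""
    base = subgroup_dir(working_dir, split, index, label)
    paths = [base / "chi2", base / "data.bz2"]
    for kind in ("expected", "observed"):
        paths.extend(base / kind / name for name in DISTRIBUTIONS)
    return paths

def control_dir(working_dir, number):
    """Directory of one control, e.g. working-dir/controls/control-001."""
    # Assumed pattern: controls/control-<three-digit number>
    return Path(working_dir) / "controls" / f"control-{number:03d}"

paths = subgroup_entries("working-dir", "age", 1, "0-2days")
for path in paths:
    print(path.as_posix())
```

The helpers only build paths and do not touch the file system, so they can be used to check an existing working directory or to locate one subgroup's `expected` and `observed` entries.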