Metric Data Format
Experimental and clinical data are represented in DigiPopData.jl as a collection of metrics, i.e. aggregated summary statistics describing real patient populations. Each metric defines a population-level experimental target that can be compared to individual-level simulations.
Internal Julia representation
Internally, each row of experimental data is represented as a MetricBinding. A MetricBinding links:
- an experimental metric (e.g. mean, quantile, category, survival),
- a scenario (e.g. treatment arm),
- and an endpoint column in the simulation table.
Example:
mb = MetricBinding(
"m_mean_conc24_Tx", # metric id
"Tx", # scenario (e.g. treatment arm)
MeanMetric(
40, # experimental sample size
2.1, # mean value
0.2 # standard deviation
),
"conc_t24", # endpoint column in simulation data
true, # active flag
2.0 # optional loss weight
)
Tabular metric definition
For practical workflows, metrics are usually defined in a table and loaded in bulk (e.g. from CSV or DataFrame) using parse_metric_bindings. Each row corresponds to one metric.
Core columns
A metric table typically includes:
- id: unique object identifier
- active: whether the metric is included in the loss (1 or 0, default is 1)
- weight: optional multiplier applied to this row's contribution to the total loss (default is 1.0)
- scenario: scenario identifier used to match simulation conditions
- endpoint: name of the simulation output column used for comparison
- metric.type: metric type (e.g. mean, mean_sd, category, quantile, survival)
- metric.size: experimental sample size
- metric.<prop>: additional metric-specific properties; see the Overview for details
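The defaults listed above can be illustrated with a small sketch in plain Julia. It does not call DigiPopData.jl; the helper `row_defaults` is hypothetical and only mirrors the documented defaults (active is 1, weight is 1.0) for a row given as a `Dict`. How the package itself fills in missing columns may differ.

```julia
# Hypothetical helper, not part of DigiPopData.jl: fill in the documented
# defaults for the optional `active` and `weight` columns of one table row.
function row_defaults(row::Dict{String,Any})
    active = get(row, "active", 1)      # documented default: included in the loss
    weight = get(row, "weight", 1.0)    # documented default: unit weight
    defaults = Dict{String,Any}(
        "active" => active == 1,        # store the 1/0 flag as a Bool
        "weight" => Float64(weight),
    )
    return merge(row, defaults)
end

# A row that omits both optional columns.
row = Dict{String,Any}(
    "id" => "m_conc24_mean_Tx",
    "scenario" => "Tx",
    "endpoint" => "conc_t24",
    "metric.type" => "mean",
    "metric.size" => 40,
)

filled = row_defaults(row)
println(filled["active"], " ", filled["weight"])
```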
Example table
The table below defines two metrics for the same scenario Tx:
| id | active | scenario | metric.type | metric.size | weight | endpoint | metric.mean | metric.sd | metric.levels | metric.values |
|---|---|---|---|---|---|---|---|---|---|---|
| m_conc24_mean_Tx | 1 | Tx | mean | 40 | 1.0 | conc_t24 | 2.10 | 0.2 | | |
| m_biomarker_q_Tx | 1 | Tx | quantile | 40 | 0.5 | biomarker | | | 0.25;0.50;0.75 | 0.1;1.35;10.1 |
Interpretation:
- m_conc24_mean_Tx targets the mean of conc_t24 in the experimental population.
- m_biomarker_q_Tx targets the quantiles (0.25, 0.50, 0.75) of biomarker.
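To make the interpretation concrete, the sketch below computes the simulated counterparts of both targets using only the `Statistics` standard library. The simulated values are made-up illustrative numbers, and the printed comparison is not the loss that DigiPopData.jl computes; it only shows which statistic of the simulation column each metric refers to.

```julia
using Statistics

# Hypothetical simulated endpoint values for scenario Tx (illustrative only).
sim_conc_t24  = [1.9, 2.0, 2.2, 2.3, 2.1]
sim_biomarker = [0.05, 0.4, 1.2, 3.0, 8.0, 15.0]

# Experimental targets copied from the example table.
target_mean   = 2.10
target_levels = [0.25, 0.50, 0.75]
target_values = [0.1, 1.35, 10.1]

# Simulated statistics corresponding to each metric.
sim_mean      = mean(sim_conc_t24)
sim_quantiles = quantile(sim_biomarker, target_levels)

println("mean: simulated = ", sim_mean, ", target = ", target_mean)
for (p, q_sim, q_exp) in zip(target_levels, sim_quantiles, target_values)
    println("quantile ", p, ": simulated = ", q_sim, ", target = ", q_exp)
end
```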
In practice, you may store only the columns required by the metric types used in your dataset. When a table mixes several metric types, some columns are left empty for certain rows (e.g. metric.mean and metric.sd are not used for quantile metrics).
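As a sketch of how such cells could be handled, the hypothetical helper below (not part of DigiPopData.jl) turns a semicolon-separated cell like the ones in the example table into a numeric vector and treats empty or missing cells as "not used". The semicolon delimiter is inferred from the example table; the package's own parsing rules may differ.

```julia
# Hypothetical helper, not part of DigiPopData.jl: parse a cell such as
# "0.25;0.50;0.75" into a Vector{Float64}. Empty, `missing`, or `nothing`
# cells yield an empty vector.
function parse_number_list(cell)
    (cell === missing || cell === nothing) && return Float64[]
    text = strip(string(cell))
    isempty(text) && return Float64[]
    return [parse(Float64, strip(part)) for part in split(text, ';')]
end

levels          = parse_number_list("0.25;0.50;0.75")  # metric.levels
quantile_values = parse_number_list("0.1;1.35;10.1")   # metric.values
unused          = parse_number_list("")                # e.g. metric.mean on a quantile row

println(levels)
println(quantile_values)
println(isempty(unused))
```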
Loading from CSV
using CSV, DataFrames, DigiPopData
# Read the metric table, then convert its rows into MetricBinding objects
metrics_df = CSV.read("metrics.csv", DataFrame)
metrics = parse_metric_bindings(metrics_df)