Metric Data Format
Experimental and clinical data are represented in DigiPopData.jl as a collection of metrics, i.e. aggregated summary statistics describing real patient populations. Each metric defines a population-level experimental target that can be compared to individual-level simulations.
Internal Julia representation
Internally, each row of experimental data is represented as a MetricBinding. A MetricBinding links:
- an experimental metric (e.g. mean, quantile, category, survival),
- a
scenario(e.g. treatment arm), - and an
endpointcolumn in the simulation table.
Example:
mb = MetricBinding(
"m_mean_conc24_Tx", # metric id
"Tx", # scenario (e.g. treatment arm)
MeanMetric(
40, # experimental sample size
2.1, # mean value
0.2 # standard deviation
),
"conc_t24", # endpoint column in simulation data
true # active flag
)Tabular metric definition
For practical workflows, metrics are usually defined in a table and loaded in bulk (e.g. from CSV or DataFrame) using parse_metric_bindings. Each row corresponds to one metric.
Core columns
A metric table typically includes:
id— unique object identifieractive— whether the metric is included in the loss (1or0)scenario— scenario identifier used to match simulation conditionsendpoint— name of the simulation output column used for comparisonmetric.type— metric type (e.g.mean,mean_sd,category,quantile,survival)metric.size— experimental sample sizemetric.<prop>— additional metric-specific properties, see more details in Overview
Example table
The table below defines two metrics for the same scenario Tx:
| id | active | scenario | metric.type | metric.size | endpoint | metric.mean | metric.sd | metric.levels | metric.values |
|---|---|---|---|---|---|---|---|---|---|
| m_conc24_mean_Tx | 1 | Tx | mean | 40 | conc_t24 | 2.10 | 0.2 | ||
| m_biomarker_q_Tx | 1 | Tx | quantile | 40 | biomarker | 0.25;0.50;0.75 | 0.1;1.35;10.1 |
Interpretation:
m_conc24_mean_Txtargets the mean ofconc_t24in the experimental population.m_biomarker_q_Txtargets the quantiles (0.25, 0.50, 0.75) ofbiomarker.
In practice, you may store only the columns required by the metric types used in your dataset.
Loading from CSV
using CSV, DataFrames
metrics_df = CSV.File("metrics.csv") |> DataFrame
metrics = parse_metric_bindings(metrics_df)