DigiPopData.jl API
DigiPopData.AbstractMetric — Type
AbstractMetric
Abstract supertype for all metric descriptors used by DigiPopData.
Purpose
Group together heterogeneous metrics (Mean, MeanSD, Category, …) so they can share the same dispatch points (mismatch, mismatch_expression, get_loss, ...).
Required interface
mismatch: Function to calculate the loss for a given metric and simulated data as a value.
mismatch_expression: Function to calculate the loss for a given metric and simulated data as a JuMP expression.
validate: Function to validate the simulated data against the metric.
The parsing rules for each metric type are defined in the PARSERS dictionary, which converts a DataFrame row into the specific metric structure. The dictionary is used by the parse_metric_bindings method.
PARSERS["<metric_type>"] = (row) -> begin
    # parsing logic
end
DigiPopData.CategoryMetric — Type
CategoryMetric <: AbstractMetric
CategoryMetric is a metric descriptor for categorical data. It is based on the multinomial distribution over the groups.
Fields
size::Int: The size of the dataset.
groups::Vector{String}: The names of the groups.
rates::Vector{Float64}: The probabilities of each group.
cov_inv::Matrix{Float64}: The inverse of the covariance matrix of the groups.
group_active::Vector{Bool}: A boolean vector indicating which groups are active (non-zero rates).
Constructor
CategoryMetric(size::Int, groups::Vector{String}, rates::Vector{Float64}): Creates a new instance of CategoryMetric. It validates the input data and calculates the inverse covariance matrix.
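As an illustration of what an inverse covariance matrix for multinomial groups can look like, the following is a minimal sketch and not the package's implementation. It assumes the covariance of group counts is n * (diag(p) - p p'), drops the last active group because the full matrix is singular, and uses the result in a quadratic form. The helper names multinomial_cov_inv and category_mismatch_sketch are hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper, not part of DigiPopData: inverse covariance of
# multinomial counts for the active groups, with the last active group dropped
# because the full covariance matrix is singular (the counts sum to n).
function multinomial_cov_inv(n::Int, rates::Vector{Float64})
    active = rates .> 0.0               # groups with non-zero probability
    p = rates[active][1:end-1]          # drop the last active group
    cov = n .* (Diagonal(p) - p * p')   # Cov(counts) = n * (diag(p) - p p')
    return inv(Matrix(cov)), active
end

# Quadratic-form mismatch between observed and expected group counts.
function category_mismatch_sketch(counts::Vector{Int}, n::Int, rates::Vector{Float64})
    cov_inv, active = multinomial_cov_inv(n, rates)
    d = (counts .- n .* rates)[active][1:end-1]
    return dot(d, cov_inv * d)
end

rates = [0.5, 0.3, 0.2, 0.0]            # the last group is inactive
counts = [22, 11, 7, 0]                 # 40 simulated observations
loss = category_mismatch_sketch(counts, 40, rates)

# Pearson chi-square statistic over the active groups, for comparison.
pearson = sum((counts[i] - 40 * rates[i])^2 / (40 * rates[i]) for i in 1:3)
```

Under these assumptions the quadratic form coincides with the Pearson chi-square statistic, which is one reason this construction is a common choice for categorical targets.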
DigiPopData.MeanMetric — Type
MeanMetric <: AbstractMetric
A metric that compares the mean of a simulated dataset to a target mean.
Fields
size::Int: The size of the dataset.
mean::Float64: The target mean value.
sd::Float64: The target standard deviation value.
DigiPopData.MeanSDMetric — Type
MeanSDMetric <: AbstractMetric
A metric that compares the mean and standard deviation (SD) of a simulated dataset to a target mean and SD.
Fields
size::Int: The size of the dataset.
mean::Float64: The target mean value.
sd::Float64: The target standard deviation value.
DigiPopData.MetricBinding — Type
MetricBinding(id, scenario, metric, endpoint; active = true, weight = 1.0)
MetricBinding(id, scenario, metric, endpoint, active, weight)
Bind one experimental metric to one simulated endpoint in one scenario.
A MetricBinding tells loss calculations which simulated column should be compared against a concrete AbstractMetric, and how that metric should contribute to the total loss.
Arguments
id::String: Unique identifier for this binding.
scenario::String: Scenario label used to select rows from the simulated data.
metric::AbstractMetric: Experimental target, such as MeanMetric or CategoryMetric.
endpoint::String: Name of the simulated data column used for comparison.
active::Bool: Whether this binding is included in loss calculations.
weight::Real: Non-negative, finite multiplier applied to this binding's loss.
Defaults
active defaults to true when omitted from the keyword constructor.
weight defaults to 1.0 when omitted from the keyword constructor.
Throws
Throws ArgumentError when weight is negative, infinite, or NaN.
Examples
metric = MeanMetric(40, 2.1, 0.2)
binding = MetricBinding("m_conc_mean", "Tx", metric, "conc_t24"; weight = 2.0)
DigiPopData.QuantileMetric — Type
QuantileMetric <: AbstractMetric
QuantileMetric is a metric descriptor for quantile data. It is based on the quantiles of the data and their corresponding values.
Fields
size::Int: The size of the dataset.
levels::Vector{Float64}: The quantile levels (e.g. 0.25, 0.5, 0.75).
values::Vector{Float64}: The corresponding values for the quantile levels.
skip_nan::Bool: If true, NaN values are allowed in simulated data and will be ignored. If false, NaN values are not allowed.
cov_inv::Matrix{Float64}: The inverse of the covariance matrix of the groups.
group_active::Vector{Bool}: A boolean vector indicating which groups are active (non-zero rates).
rates::Vector{Float64}: The probabilities of each group.
Constructor
QuantileMetric(size::Int, levels::Vector{Float64}, values::Vector{Float64}; skip_nan::Bool = false): Creates a new instance of QuantileMetric. It validates the input data and calculates the inverse covariance matrix.
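To illustrate how quantile levels can give rise to the groups and rates listed in the fields, here is a minimal sketch under stated assumptions; it is not taken from the package source. It assumes that the groups are the bins delimited by the quantile values, that each bin's rate is the difference between adjacent levels, and that NaN handling follows the skip_nan description. The helpers bin_rates and bin_counts are hypothetical.

```julia
# Hypothetical helpers, not part of DigiPopData.

# Probability mass of each bin delimited by the quantile levels:
# (0, l1], (l1, l2], ..., (lk, 1).
bin_rates(levels::Vector{Float64}) = diff(vcat(0.0, levels, 1.0))

# Count simulated values per bin; `qvalues` must be sorted in ascending order.
function bin_counts(sim::Vector{Float64}, qvalues::Vector{Float64}; skip_nan::Bool = false)
    counts = zeros(Int, length(qvalues) + 1)
    for x in sim
        if isnan(x)
            skip_nan || throw(ArgumentError("NaN found in simulated data"))
            continue                      # ignore NaN when skip_nan is true
        end
        # Index of the first quantile value >= x; values above all
        # quantile values fall into the last bin.
        counts[searchsortedfirst(qvalues, x)] += 1
    end
    return counts
end

rates = bin_rates([0.25, 0.5, 0.75])
counts = bin_counts([0.5, 1.5, 2.5, 3.5, NaN], [1.0, 2.0, 3.0]; skip_nan = true)
```

With bin counts and bin rates in hand, the comparison reduces to the same kind of categorical problem as CategoryMetric, which would explain why the two types share the rates, cov_inv, and group_active fields.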
DigiPopData.SurvivalMetric — Type
SurvivalMetric <: AbstractMetric
Fields
size::Int: The size of the dataset.
levels::Vector{Float64}: The survival levels (e.g. 0.9, 0.8, 0.7).
values::Vector{Float64}: The corresponding values for the survival levels.
cov_inv::Matrix{Float64}: The inverse of the covariance matrix of the groups.
group_active::Vector{Bool}: A boolean vector indicating which groups are active (non-zero rates).
rates::Vector{Float64}: The probabilities of each group.
Constructor
SurvivalMetric(size::Int, levels::Vector{Float64}, values::Vector{Float64}): Creates a new instance of SurvivalMetric. It validates the input data and calculates the inverse covariance matrix.
DigiPopData.add_loss_expression! — Method
add_loss_expression!(prob::GenericModel, sim::AbstractVector, b::MetricBinding, X::Vector{VariableRef}, X_len::Int) -> QuadExpr
Create a metric mismatch expression, push it to prob[:LOSS] taking into account the binding's weight and active status, and return it.
DigiPopData.get_loss — Method
get_loss(simulated::DataFrame, metric_bindings::Vector{MetricBinding}, cohort::Vector{String}) -> Float64
Calculate total loss after restricting simulated to rows whose id is in cohort.
This method first filters the simulation table, then calls get_loss(simulated_subset, metric_bindings).
DigiPopData.get_loss — Method
get_loss(simulated::DataFrame, metric_bindings::Vector{MetricBinding}) -> Float64
Calculate the total weighted loss for all active metric bindings.
For each active MetricBinding, this function selects simulated values from rows where simulated.scenario .== binding.scenario and from the column named by binding.endpoint. It then adds the binding's weighted mismatch to the total. Inactive bindings are ignored.
Arguments
simulated::DataFrame: Simulation table. It must contain a scenario column and every endpoint column referenced by active bindings.
metric_bindings::Vector{MetricBinding}: Bindings that define which metrics to evaluate and how they are weighted.
Returns
The sum of weighted mismatches as a Float64.
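The aggregation described above can be sketched without the package, as a rough illustration only. The sketch uses a NamedTuple of columns as a stand-in for a DataFrame, a hypothetical BindingSketch struct instead of MetricBinding, and an assumed mismatch formula (squared distance between simulated and target mean, scaled by the standard error). None of these names or formulas are confirmed by the package.

```julia
using Statistics

# Hypothetical stand-in for a binding with a mean-type target.
struct BindingSketch
    scenario::String
    endpoint::Symbol
    target_mean::Float64
    target_sd::Float64
    size::Int
    active::Bool
    weight::Float64
end

# Assumed mismatch, for illustration only: squared distance between the
# simulated mean and the target mean, scaled by the squared standard error.
mean_mismatch_sketch(sim, b::BindingSketch) =
    (mean(sim) - b.target_mean)^2 / (b.target_sd^2 / b.size)

# `table` is a NamedTuple of equal-length columns standing in for a DataFrame.
function total_loss_sketch(table::NamedTuple, bindings::Vector{BindingSketch})
    total = 0.0
    for b in bindings
        b.active || continue                        # inactive bindings are ignored
        rows = table.scenario .== b.scenario        # rows of this scenario
        sim = getproperty(table, b.endpoint)[rows]  # values of the endpoint column
        total += b.weight * mean_mismatch_sketch(sim, b)
    end
    return total
end

table = (scenario = ["Tx", "Tx", "Ctrl"], conc_t24 = [2.0, 2.4, 9.0])
bindings = [
    BindingSketch("Tx", :conc_t24, 2.1, 0.2, 40, true, 2.0),
    BindingSketch("Ctrl", :conc_t24, 2.1, 0.2, 40, false, 1.0),
]
loss = total_loss_sketch(table, bindings)
```

Only the "Tx" binding contributes here, because the "Ctrl" binding is inactive; its weight of 2.0 doubles the contribution of its mismatch.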
Examples
df = DataFrame(scenario = ["Tx", "Tx", "Tx"], conc_t24 = [2.0, 2.1, 2.2])
metric = MeanMetric(40, 2.1, 0.2)
binding = MetricBinding("m_conc_mean", "Tx", metric, "conc_t24")
loss = get_loss(df, [binding])
DigiPopData.mismatch — Method
mismatch(sim::AbstractVector, metric::AbstractMetric) -> Float64
Arguments
sim::AbstractVector: A vector of simulated data.
metric::AbstractMetric: An instance of a metric descriptor (e.g., MeanMetric, CategoryMetric, etc.).
Return a loss that quantifies the mismatch between simulated data sim and the target metric metric. The concrete formula depends on the subtype of AbstractMetric.
DigiPopData.mismatch_expression — Method
mismatch_expression(prob::GenericModel, sim::AbstractVector, dp::AbstractMetric, X::Vector{VariableRef}, X_len::Int) -> QuadExpr
Arguments
prob::GenericModel: A JuMP optimization model.
sim::AbstractVector: A vector of simulated data.
dp::AbstractMetric: An instance of a metric descriptor (e.g., MeanMetric, CategoryMetric, etc.).
X::Vector{VariableRef}: A vector of JuMP variable references.
X_len::Int: The length of the vector of JuMP variable references.
Return an unweighted expression that quantifies the mismatch between simulated data sim and the target metric dp. The concrete formula depends on the subtype of AbstractMetric.
DigiPopData.parse_metric_bindings — Method
parse_metric_bindings(df::DataFrame) -> Vector{MetricBinding}
Parse a metric definition table into MetricBinding objects.
Each row in df describes one metric binding: which experimental metric to construct, which simulated scenario and endpoint it applies to, whether it is active, and how much it contributes to the total loss.
Required columns
id: Unique identifier for the binding.
scenario: Scenario label used to match rows in simulated data.
endpoint: Name of the simulated data column used for comparison.
metric.type: Metric parser key, such as "mean", "mean_sd", "category", "quantile", or "survival".
metric.<property>: Metric-specific columns required by the selected metric.type, for example metric.mean, metric.sd, or metric.size.
Optional columns
active: Whether the binding is included in loss calculations. Missing or empty values default to true. Accepted values are Bool, 0/1, and the strings "false", "true", "0", and "1".
weight: Non-negative, finite loss multiplier. Missing or empty values default to 1.0.
Throws
Throws ErrorException with the row number when a row cannot be parsed. The wrapped error may come from an unknown metric.type, an invalid metric parameter, an invalid active value, or an invalid weight.
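The rules for the optional columns can be sketched as two small helpers. This is an illustrative sketch based only on the rules listed above, not the package's parsing code. The names is_blank, parse_active_sketch, and parse_weight_sketch are hypothetical, and treating whitespace-only strings as empty is an assumption.

```julia
# Hypothetical helpers, not part of DigiPopData.

# Missing values and empty (assumed: also whitespace-only) strings count as blank.
is_blank(x) = ismissing(x) || (x isa AbstractString && isempty(strip(x)))

function parse_active_sketch(x)
    is_blank(x) && return true                    # missing or empty -> default true
    x isa Bool && return x
    x isa Integer && x in (0, 1) && return x == 1
    if x isa AbstractString
        s = strip(x)
        s in ("true", "1") && return true
        s in ("false", "0") && return false
    end
    throw(ArgumentError("invalid `active` value: $(repr(x))"))
end

function parse_weight_sketch(x)
    is_blank(x) && return 1.0                     # missing or empty -> default 1.0
    x isa Real || throw(ArgumentError("invalid `weight` value: $(repr(x))"))
    w = Float64(x)
    (isfinite(w) && w >= 0) || throw(ArgumentError("invalid `weight` value: $(repr(x))"))
    return w
end

parsed = [parse_active_sketch(v) for v in (missing, "", true, 0, "1", "false")]
```

In a row-wise parser, errors like these would typically be caught and rethrown with the row number attached, which matches the ErrorException behavior described above.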
Examples
df = DataFrame(
    id = ["m_conc_mean"],
    active = [1],
    scenario = ["Tx"],
    endpoint = ["conc_t24"],
    var"metric.type" = ["mean"],
    var"metric.size" = [40],
    var"metric.mean" = [2.1],
    var"metric.sd" = [0.2],
    weight = [2.0],
)
bindings = parse_metric_bindings(df)
DigiPopData.validate — Method
validate(sim::AbstractVector{<:Real}, dp::AbstractMetric)
Arguments
sim::AbstractVector{<:Real}: A vector of simulated data.
dp::AbstractMetric: An instance of a metric descriptor (e.g., MeanMetric, CategoryMetric, etc.).
Validate the simulated data sim against the target metric dp. It throws an error if the validation fails. The concrete validation rules depend on the subtype of AbstractMetric.