DigiPopData.jl API

DigiPopData.AbstractMetricType
AbstractMetric

Abstract super-type for all metric descriptors used by DigiPopData.

Purpose

Group together heterogeneous metrics (Mean, MeanSD, Category, …) so they can share the same dispatch points (mismatch, mismatch_expression, get_loss, ...).

Required interface

  • mismatch: Function to calculate the loss for a given metric and simulated data as a value.
  • mismatch_expression: Function to calculate the loss for a given metric and simulated data as a JuMP expression.
  • validate: Function to validate the simulated data against the metric.
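
The interface above maps naturally onto multiple dispatch. The following is a minimal sketch only: ToyMetric, ToyMean, toy_validate and toy_mismatch are hypothetical stand-ins, and the squared z-score loss is an assumption chosen for illustration, not necessarily the formula DigiPopData uses.

```julia
using Statistics

# Hypothetical stand-ins for AbstractMetric and one concrete metric.
abstract type ToyMetric end

struct ToyMean <: ToyMetric
    size::Int
    mean::Float64
    sd::Float64
end

# Validation hook: reject empty input and NaN values.
function toy_validate(sim::AbstractVector{<:Real}, m::ToyMean)
    isempty(sim) && throw(ArgumentError("simulated data is empty"))
    any(isnan, sim) && throw(ArgumentError("simulated data contains NaN"))
    return nothing
end

# Numeric loss hook: squared z-score of the simulated mean (assumed form).
function toy_mismatch(sim::AbstractVector{<:Real}, m::ToyMean)
    toy_validate(sim, m)
    se = m.sd / sqrt(m.size)          # standard error of the target mean
    return ((mean(sim) - m.mean) / se)^2
end

loss = toy_mismatch([2.0, 2.1, 2.2], ToyMean(40, 2.1, 0.2))
```

A new metric kind would add its own struct plus methods for the same function names, and callers would not need to change.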

The parsing rules for each metric type are defined in the PARSERS dictionary, which converts a DataFrame row into the specific metric structure. The dictionary is used by the parse_metric_bindings method.

PARSERS["<metric_type>"] = (row) -> begin
    # parsing logic
end
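
To make the registry pattern concrete, here is a hedged sketch of a dictionary of parser closures. TOY_PARSERS, toy_parse and the simplified row fields (type, size, mean, sd) are illustrative assumptions; the real table uses metric.-prefixed column names and the real parsers return metric structs.

```julia
# Hypothetical registry mirroring the PARSERS idea; keys and fields are illustrative.
TOY_PARSERS = Dict{String,Function}()

# Each entry converts one row into a metric-like object (a NamedTuple here).
TOY_PARSERS["mean"] = row -> (size = Int(row.size), mean = Float64(row.mean), sd = Float64(row.sd))

function toy_parse(row)
    parser = get(TOY_PARSERS, row.type, nothing)
    parser === nothing && throw(ArgumentError("unknown metric type: $(row.type)"))
    return parser(row)
end

metric = toy_parse((type = "mean", size = 40, mean = 2.1, sd = 0.2))
```

Registering a parser under a new key is enough to make that metric type available to the lookup function.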
DigiPopData.CategoryMetricType
CategoryMetric <: AbstractMetric

CategoryMetric is a metric descriptor for categorical data. It is based on the multinomial distribution over the groups.

Fields

  • size::Int: The size of the dataset.
  • groups::Vector{String}: The names of the groups.
  • rates::Vector{Float64}: The probabilities of each group.
  • cov_inv::Matrix{Float64}: The inverse of the covariance matrix of the groups.
  • group_active::Vector{Bool}: A boolean vector indicating which groups are active (non-zero rates).

Constructor

  • CategoryMetric(size::Int, groups::Vector{String}, rates::Vector{Float64}): Creates a new instance of CategoryMetric. It validates the input data and calculates the inverse covariance matrix.
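
For multinomial proportions the full covariance matrix (diag(p) - p*p')/n is singular, so it cannot be inverted directly. One common workaround is to drop one active group before inverting. The sketch below assumes that approach; reduced_cov_inv is a hypothetical helper, and DigiPopData may handle the singularity differently.

```julia
using LinearAlgebra

# Hypothetical helper, not part of DigiPopData.
function reduced_cov_inv(n::Int, rates::Vector{Float64})
    isapprox(sum(rates), 1.0; atol = 1e-8) || throw(ArgumentError("rates must sum to 1"))
    active = rates .> 0.0                   # groups with non-zero probability
    p = rates[active][1:end-1]              # drop the last active group
    cov = (Diagonal(p) - p * p') ./ n       # covariance of the observed proportions
    return inv(cov), collect(active)
end

rates = [0.2, 0.3, 0.5, 0.0]
observed = [0.25, 0.25, 0.5, 0.0]
cov_inv, active = reduced_cov_inv(100, rates)

# Quadratic-form mismatch on the reduced vector of differences.
d = (observed .- rates)[active][1:end-1]
loss = dot(d, cov_inv * d)
```

With this choice the quadratic form equals Pearson's chi-square statistic, n * sum((observed - rates).^2 ./ rates) over the active groups, which gives a quick way to sanity-check the result.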
DigiPopData.MeanMetricType
MeanMetric <: AbstractMetric

A metric that compares the mean of a simulated dataset to a target mean.

Fields

  • size::Int: The size of the dataset.
  • mean::Float64: The target mean value.
  • sd::Float64: The target standard deviation value.
DigiPopData.MeanSDMetricType
MeanSDMetric <: AbstractMetric

A metric that compares the mean and standard deviation (SD) of a simulated dataset to a target mean and SD.

Fields

  • size::Int: The size of the dataset.
  • mean::Float64: The target mean value.
  • sd::Float64: The target standard deviation value.
DigiPopData.MetricBindingType
MetricBinding(id, scenario, metric, endpoint; active = true, weight = 1.0)
MetricBinding(id, scenario, metric, endpoint, active, weight)

Bind one experimental metric to one simulated endpoint in one scenario.

A MetricBinding tells loss calculations which simulated column should be compared against a concrete AbstractMetric, and how that metric should contribute to the total loss.

Arguments

  • id::String: Unique identifier for this binding.
  • scenario::String: Scenario label used to select rows from the simulated data.
  • metric::AbstractMetric: Experimental target, such as MeanMetric or CategoryMetric.
  • endpoint::String: Name of the simulated data column used for comparison.
  • active::Bool: Whether this binding is included in loss calculations.
  • weight::Real: Non-negative, finite multiplier applied to this binding's loss.

Defaults

  • active defaults to true when omitted from the keyword constructor.
  • weight defaults to 1.0 when omitted from the keyword constructor.

Throws

Throws ArgumentError when weight is negative, infinite, or NaN.

Examples

metric = MeanMetric(40, 2.1, 0.2)
binding = MetricBinding("m_conc_mean", "Tx", metric, "conc_t24"; weight = 2.0)
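
The weight rule described under Throws can be illustrated with an inner constructor. This is a sketch under stated assumptions: ToyBinding is a hypothetical, reduced stand-in that models only id, active and weight, and the error message is invented.

```julia
# Hypothetical stand-in for MetricBinding that only models active and weight.
struct ToyBinding
    id::String
    active::Bool
    weight::Float64
    function ToyBinding(id::AbstractString; active::Bool = true, weight::Real = 1.0)
        # Reject negative, infinite and NaN weights in one check.
        (isfinite(weight) && weight >= 0) ||
            throw(ArgumentError("weight must be non-negative and finite, got $weight"))
        return new(String(id), active, Float64(weight))
    end
end

b = ToyBinding("m_conc_mean"; weight = 2.0)
```

Checking in the constructor means an invalid weight fails when the binding is built rather than later, during loss evaluation.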
DigiPopData.QuantileMetricType
QuantileMetric <: AbstractMetric

QuantileMetric is a metric descriptor for quantile data. It is based on the quantiles of the data and their corresponding values.

Fields

  • size::Int: The size of the dataset.
  • levels::Vector{Float64}: The quantile levels (e.g. 0.25, 0.5, 0.75).
  • values::Vector{Float64}: The corresponding values for the quantile levels.
  • skip_nan::Bool: If true, NaN values are allowed in simulated data and will be ignored. If false, NaN values are not allowed.
  • cov_inv::Matrix{Float64}: The inverse of the covariance matrix of the groups.
  • group_active::Vector{Bool}: A boolean vector indicating which groups are active (non-zero rates).
  • rates::Vector{Float64}: The probabilities of each group.

Constructor

  • QuantileMetric(size::Int, levels::Vector{Float64}, values::Vector{Float64}; skip_nan::Bool = false): Creates a new instance of QuantileMetric. It validates the input data and calculates the inverse covariance matrix.
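
The rates and group_active fields suggest that quantile targets are treated as categories. The sketch below illustrates that idea under an assumption the text above does not confirm: consecutive quantile levels delimit bins whose probabilities are the differences between levels. All variable names are illustrative.

```julia
levels = [0.25, 0.5, 0.75]         # quantile levels
qvalues = [1.0, 2.0, 3.0]          # target values at those levels
skip_nan = true

# Assumed bin probabilities: differences between consecutive levels.
rates = diff([0.0; levels; 1.0])

sim = [0.5, 1.5, 1.7, NaN, 2.5, 3.5, 4.0]
if any(isnan, sim)
    skip_nan || throw(ArgumentError("NaN values are not allowed in simulated data"))
    sim = filter(!isnan, sim)      # ignore NaN values when skip_nan is true
end

# Count the simulated values falling into each bin (left-open, right-closed).
edges = [-Inf; qvalues; Inf]
counts = [count(x -> edges[i] < x <= edges[i + 1], sim) for i in eachindex(rates)]
observed = counts ./ length(sim)
```

The observed proportions can then be compared with rates in the same way as for a categorical metric.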
DigiPopData.SurvivalMetricType
SurvivalMetric <: AbstractMetric

Fields

  • size::Int: The size of the dataset.
  • levels::Vector{Float64}: The survival levels (e.g. 0.9, 0.8, 0.7).
  • values::Vector{Float64}: The corresponding values for the survival levels.
  • cov_inv::Matrix{Float64}: The inverse of the covariance matrix of the groups.
  • group_active::Vector{Bool}: A boolean vector indicating which groups are active (non-zero rates).
  • rates::Vector{Float64}: The probabilities of each group.

Constructor

  • SurvivalMetric(size::Int, levels::Vector{Float64}, values::Vector{Float64}): Creates a new instance of SurvivalMetric. It validates the input data and calculates the inverse covariance matrix.
DigiPopData.add_loss_expression!Method
add_loss_expression!(prob::GenericModel, sim::AbstractVector, b::MetricBinding, X::Vector{VariableRef}, X_len::Int) -> QuadExpr

Create a metric mismatch expression, push it to prob[:LOSS] taking into account weight and active status, and return it.

DigiPopData.get_lossMethod
get_loss(simulated::DataFrame, metric_bindings::Vector{MetricBinding}, cohort::Vector{String}) -> Float64

Calculate total loss after restricting simulated to rows whose id is in cohort.

This method first filters the simulation table, then calls get_loss(simulated_subset, metric_bindings).
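
The filtering step amounts to a row mask on the id column. A minimal sketch with plain vectors standing in for the table columns (the column contents are made up for illustration):

```julia
ids = ["p1", "p2", "p3", "p4"]            # id column of the simulation table
conc = [2.0, 2.1, 2.2, 2.3]               # one endpoint column
cohort = ["p2", "p4"]

# Row mask: true where the id is a cohort member. Ref keeps the Set from being broadcast over.
keep = in.(ids, Ref(Set(cohort)))
conc_subset = conc[keep]
```

Using a Set makes each membership check independent of the cohort size, which matters for large virtual populations.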

DigiPopData.get_lossMethod
get_loss(simulated::DataFrame, metric_bindings::Vector{MetricBinding}) -> Float64

Calculate the total weighted loss for all active metric bindings.

For each active MetricBinding, this function selects simulated values from rows where simulated.scenario .== binding.scenario and from the column named by binding.endpoint. It then adds the binding's weighted mismatch to the total. Inactive bindings are ignored.

Arguments

  • simulated::DataFrame: Simulation table. It must contain a scenario column and every endpoint column referenced by active bindings.
  • metric_bindings::Vector{MetricBinding}: Bindings that define which metrics to evaluate and how they are weighted.

Returns

The sum of weighted mismatches as a Float64.

Examples

df = DataFrame(scenario = ["Tx", "Tx", "Tx"], conc_t24 = [2.0, 2.1, 2.2])
metric = MeanMetric(40, 2.1, 0.2)
binding = MetricBinding("m_conc_mean", "Tx", metric, "conc_t24")

loss = get_loss(df, [binding])
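
The accumulation described above can be sketched without DataFrames. Everything here is a toy stand-in: the table is a NamedTuple of columns, bindings are NamedTuples, and toy_mismatch (squared difference of means) is an assumed placeholder for the metric-specific mismatch.

```julia
using Statistics

# Toy stand-ins; none of these names are DigiPopData API.
sim = (scenario = ["Tx", "Tx", "Ctrl"], conc_t24 = [2.0, 2.2, 1.0])
bindings = [
    (scenario = "Tx", endpoint = :conc_t24, target = 2.0, active = true, weight = 2.0),
    (scenario = "Ctrl", endpoint = :conc_t24, target = 0.0, active = false, weight = 1.0),
]

toy_mismatch(values, target) = (mean(values) - target)^2

function toy_total_loss(sim, bindings)
    total = 0.0
    for b in bindings
        b.active || continue                          # inactive bindings are ignored
        rows = sim.scenario .== b.scenario            # select rows of this scenario
        values = getproperty(sim, b.endpoint)[rows]   # select the endpoint column
        total += b.weight * toy_mismatch(values, b.target)
    end
    return total
end

loss = toy_total_loss(sim, bindings)
```

Only the "Tx" binding contributes here: the mean of its two values is 2.1, so the weighted term is 2.0 * 0.1^2.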
DigiPopData.mismatchMethod
mismatch(sim::AbstractVector, metric::AbstractMetric) -> Float64

Arguments

  • sim::AbstractVector: A vector of simulated data.
  • metric::AbstractMetric: An instance of a metric descriptor (e.g., MeanMetric, CategoryMetric, etc.).

Return a loss that quantifies the mismatch between simulated data sim and the target metric metric. The concrete formula depends on the subtype of AbstractMetric.

DigiPopData.mismatch_expressionMethod
mismatch_expression(prob::GenericModel, sim::AbstractVector, dp::AbstractMetric, X::Vector{VariableRef}, X_len::Int) -> QuadExpr

Arguments

  • prob::GenericModel: A JuMP optimization model.
  • sim::AbstractVector: A vector of simulated data.
  • dp::AbstractMetric: An instance of a metric descriptor (e.g., MeanMetric, CategoryMetric, etc.).
  • X::Vector{VariableRef}: A vector of JuMP variable references.
  • X_len::Int: The length of the vector of JuMP variable references.

Return an unweighted expression that quantifies the mismatch between simulated data sim and the target metric dp. The concrete formula depends on the subtype of AbstractMetric.

DigiPopData.parse_metric_bindingsMethod
parse_metric_bindings(df::DataFrame) -> Vector{MetricBinding}

Parse a metric definition table into MetricBinding objects.

Each row in df describes one metric binding: which experimental metric to construct, which simulated scenario and endpoint it applies to, whether it is active, and how much it contributes to the total loss.

Required columns

  • id: Unique identifier for the binding.
  • scenario: Scenario label used to match rows in simulated data.
  • endpoint: Name of the simulated data column used for comparison.
  • metric.type: Metric parser key, such as "mean", "mean_sd", "category", "quantile", or "survival".
  • metric.<property>: Metric-specific columns required by the selected metric.type, for example metric.mean, metric.sd, or metric.size.

Optional columns

  • active: Whether the binding is included in loss calculations. Missing or empty values default to true. Accepted values are Bool, 0/1, and the strings "false", "true", "0", and "1".
  • weight: Non-negative, finite loss multiplier. Missing or empty values default to 1.0.
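
The accepted active values listed above could be normalized along these lines. This is only a sketch: toy_parse_active is a hypothetical helper, and the choice to raise ArgumentError for anything else is an assumption.

```julia
# Hypothetical helper: map an `active` cell to Bool following the rules listed above.
function toy_parse_active(x)
    if ismissing(x) || (x isa AbstractString && isempty(x))
        return true                                  # missing or empty defaults to true
    elseif x isa Bool
        return x
    elseif x isa Integer && (x == 0 || x == 1)
        return x == 1
    elseif x isa AbstractString && x in ("true", "1")
        return true
    elseif x isa AbstractString && x in ("false", "0")
        return false
    end
    throw(ArgumentError("invalid active value: $(repr(x))"))
end

flags = [toy_parse_active(v) for v in (missing, "", true, 0, "1", "false")]
```

The Bool branch comes before the Integer branch because Bool is a subtype of Integer in Julia.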

Throws

Throws ErrorException with the row number when a row cannot be parsed. The wrapped error may come from an unknown metric.type, an invalid metric parameter, an invalid active value, or an invalid weight.
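
Wrapping a per-row failure with its row number can be sketched as follows. toy_parse_rows and the message wording are hypothetical; the real function may format its errors differently.

```julia
rows = [(id = "a", weight = "2.0"), (id = "b", weight = "oops")]

# Hypothetical row loop: rethrow any parsing failure with the row number attached.
function toy_parse_rows(rows)
    out = Float64[]
    for (i, row) in enumerate(rows)
        try
            push!(out, parse(Float64, row.weight))
        catch err
            # error(...) raises an ErrorException that embeds the original message.
            error("Failed to parse row $i: $(sprint(showerror, err))")
        end
    end
    return out
end
```

Calling toy_parse_rows(rows) fails on the second row, and the resulting message names that row alongside the original parsing error.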

Examples

df = DataFrame(
    id = ["m_conc_mean"],
    active = [1],
    scenario = ["Tx"],
    endpoint = ["conc_t24"],
    var"metric.type" = ["mean"],
    var"metric.size" = [40],
    var"metric.mean" = [2.1],
    var"metric.sd" = [0.2],
    weight = [2.0],
)

bindings = parse_metric_bindings(df)
DigiPopData.validateMethod
validate(sim::AbstractVector{<:Real}, dp::AbstractMetric)

Arguments

  • sim::AbstractVector{<:Real}: A vector of simulated data.
  • dp::AbstractMetric: An instance of a metric descriptor (e.g., MeanMetric, CategoryMetric, etc.).

Validate the simulated data sim against the target metric dp. Throws an error if validation fails. The concrete validation rules depend on the subtype of AbstractMetric.
