
Kaplan-Meier Estimator
Calls survival::survfit() during training to estimate the survival function
at the unique event times from the training set. During prediction, each
test observation is assigned the same survival probabilities at these times.
The risk score (crank) is calculated as the sum of the cumulative hazard
function (also called expected mortality) at the event times.
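The description above can be sketched in base R on a toy data set. This is an illustration only, not the learner's actual implementation: the data values are hypothetical, and the cumulative hazard is taken as -log(S(t)), which is an assumption (the learner may use a different estimate, such as Nelson-Aalen).

```r
# Toy right-censored data (hypothetical values, for illustration only)
time   = c(2, 3, 3, 5, 8)
status = c(1, 1, 0, 1, 0)   # 1 = event, 0 = censored

# Unique event times from the "training" data
event_times = sort(unique(time[status == 1]))

# Number at risk and number of events at each event time
n_risk  = sapply(event_times, function(tt) sum(time >= tt))
n_event = sapply(event_times, function(tt) sum(time == tt & status == 1))

# Kaplan-Meier survival estimate: running product of (1 - d_i / n_i)
surv = cumprod(1 - n_event / n_risk)

# Cumulative hazard, taken here as -log(S(t)) (assumption, see above)
cumhaz = -log(surv)

# Risk score: sum of the cumulative hazard over the event times
crank = sum(cumhaz)

# Every test observation receives the same curve and the same risk score
n_test = 3
surv_test = matrix(surv, nrow = n_test, ncol = length(surv), byrow = TRUE,
  dimnames = list(NULL, as.character(event_times)))
crank_test = rep(crank, n_test)
```

Because no features enter the computation, the rows of surv_test are identical, which is why the learner serves as a featureless baseline.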
Dictionary
This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():
mlr_learners$get("surv.kaplan")
lrn("surv.kaplan")
Meta Information
Task type: “surv”
Predict Types: “crank”, “distr”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3survival, survival
References
Kaplan EL, Meier P (1958). “Nonparametric Estimation from Incomplete Observations.” Journal of the American Statistical Association, 53(282), 457–481. doi:10.1080/01621459.1958.10501452.
See also
Other survival learners:
mlr_learners_surv.coxph
Super classes
mlr3::Learner -> mlr3survival::LearnerSurv -> LearnerSurvKaplan
Active bindings
native_model (survival::survfit)
The fitted model.
Methods
Method importance()
All features have a score of 0 for this learner.
This method exists solely for compatibility with the mlr3 ecosystem,
as this learner is used as a fallback for other survival learners that
require an importance() method.
Returns
Named numeric().
Method selected_features()
Selected features are always the empty set for this learner.
This method is implemented only for compatibility with the mlr3 API,
as this learner does not perform feature selection.
Examples
# Define the Learner
learner = lrn("surv.kaplan")
learner
#>
#> ── <LearnerSurvKaplan> (surv.kaplan): Kaplan-Meier Estimator ───────────────────
#> • Model: -
#> • Parameters: list()
#> • Packages: mlr3, mlr3survival, and survival
#> • Predict Types: [crank] and distr
#> • Feature Types: logical, integer, numeric, character, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: importance, missings, selected_features, and weights
#> • Other settings: use_weights = 'use'
# Define a Task
task = tsk("lung")
# Stratification based on event
task$set_col_roles(cols = "status", add_to = "stratum")
# Create train and test set
part = partition(task)
# Train the learner on the train set
learner$train(task, row_ids = part$train)
learner$native_model
#> Call: survfit(formula = task$formula(1), data = task$data(cols = task$target_names),
#>     weights = NULL)
#>
#>        n events median 0.95LCL 0.95UCL
#> [1,] 112     81    310     267     390
# Make predictions for the test set
predictions = learner$predict(task, row_ids = part$test)
predictions
#>
#> ── <PredictionSurv> for 56 observations: ───────────────────────────────────────
#>  row_ids time status    crank     distr
#>        9  567   TRUE 52.63894 <list[1]>
#>       13  301   TRUE 52.63894 <list[1]>
#>       21  473   TRUE 52.63894 <list[1]>
#>      ---  ---    ---      ---       ---
#>      158  185  FALSE 52.63894 <list[1]>
#>      159  222  FALSE 52.63894 <list[1]>
#>      168  177  FALSE 52.63894 <list[1]>
# Score the predictions
predictions$score()
#> surv.cindex
#>         0.5
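The score of 0.5 follows from every observation receiving the same crank. The base R sketch below illustrates this with a hand-written concordance index. The helper harrell_c() is hypothetical, the data are toy values, and counting tied predictions as one half is an assumption based on the common convention for Harrell's C; the exact measure behind surv.cindex is not specified here.

```r
# Hypothetical helper: Harrell's C with tied predictions counted as 1/2
harrell_c = function(time, status, crank) {
  concordant = 0
  comparable = 0
  n = length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # A pair is comparable if i has an observed event strictly before j's time
      if (status[i] == 1 && time[i] < time[j]) {
        comparable = comparable + 1
        if (crank[i] > crank[j]) {
          concordant = concordant + 1     # higher risk, earlier event
        } else if (crank[i] == crank[j]) {
          concordant = concordant + 0.5   # tied prediction
        }
      }
    }
  }
  concordant / comparable
}

# Toy right-censored data (hypothetical values)
time   = c(2, 3, 3, 5, 8)
status = c(1, 1, 0, 1, 0)

# Constant risk score, as produced by a model that ignores all features
c_const = harrell_c(time, status, crank = rep(1.9, 5))

# Risk score that ranks every comparable pair correctly, for contrast
c_perfect = harrell_c(time, status, crank = -time)
```

With a constant risk score every comparable pair is a tie, so c_const is 0.5 regardless of the data, while c_perfect is 1.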