Risk scores are sparse linear models that map an integer linear combination of covariates to the probability of an outcome occurring. Unlike regression models, risk score models consist of integer coefficients for often dichotomous variables. This allows risk score predictions to be easily computed by adding or subtracting a few small numbers.
Risk scores developed heuristically by altering logistic regression models tend to lose predictive performance, because there is a fundamental trade-off between a model’s simplicity and its predictive accuracy. In contrast, this package takes an optimization approach to learning risk scores: the constraints for sparsity and integer coefficients are integrated into the model-fitting process rather than imposed afterward.
You can install the development version of riskscores from GitHub with:
# install.packages("devtools")
devtools::install_github("hjeglinton/riskscores", build_vignettes = TRUE)
We’ll fit a risk score model to predict breast cancer from biopsy data. More details can be found in the package’s vignette.
library(riskscores)
# Prepare data
y <- breastcancer[,1]
X <- as.matrix(breastcancer[,-1])
# Fit risk score model
mod <- risk_mod(X, y, lambda = 0.0392)
The integer risk score model can be viewed by calling `mod$model_card`. An individual’s risk score can be calculated by multiplying each covariate response by its respective number of points and then adding all points together. In our example below, a patient with a ClumpThickness value of 5, a BareNuclei value of 1, and a BlandChromatin value of 3 would receive a score of \(9(5) + 7(1) + 8(3) = 76\).
| | Points |
|---|---|
| ClumpThickness | 9 |
| BareNuclei | 7 |
| BlandChromatin | 8 |
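
The score arithmetic described above can be reproduced in base R. This is a minimal sketch that hard-codes the points from the model card shown here rather than reading them from `mod$model_card`; the `points` and `patient` vectors are illustrative names, not objects provided by the package.

```r
# Points per covariate, copied by hand from the model card above
# (assumption: not extracted from mod$model_card programmatically)
points <- c(ClumpThickness = 9, BareNuclei = 7, BlandChromatin = 8)

# Covariate values for the example patient described in the text
patient <- c(ClumpThickness = 5, BareNuclei = 1, BlandChromatin = 3)

# Multiply each covariate value by its points, then add everything up.
# Indexing by name keeps the two vectors aligned even if their order differs.
score <- sum(points * patient[names(points)])
score  # 76
```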
Each score can then be mapped to a risk probability. The `mod$score_map` dataframe maps an integer range of scores to their associated risk. We can see that a patient who received a score of 120 would have a 78.86% risk of their tissue sample being malignant.
Score | Risk |
---|---|
30 | 0.0012 |
60 | 0.0176 |
90 | 0.2052 |
120 | 0.7886 |
150 | 0.9818 |
180 | 0.9987 |
210 | 0.9999 |
240 | 1.0000 |
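
Looking up a risk for a given score can be sketched as a simple table lookup. The data frame below is typed in by hand from the rows shown above, so it is only a stand-in for `mod$score_map`, whose exact rows and column names may differ; `lookup_risk` is a hypothetical helper, not a function from the package.

```r
# Stand-in for mod$score_map, copied from the table above (assumption:
# the real object may contain more scores and other column names)
score_map <- data.frame(
  Score = seq(30, 240, by = 30),
  Risk  = c(0.0012, 0.0176, 0.2052, 0.7886, 0.9818, 0.9987, 0.9999, 1.0000)
)

# Hypothetical helper: return the risk listed for an exact score,
# and fail loudly if that score is not in the map
lookup_risk <- function(score, map) {
  idx <- match(score, map$Score)
  if (is.na(idx)) stop("score not present in the map")
  map$Risk[idx]
}

lookup_risk(120, score_map)  # 0.7886
```

Scores that fall between the listed rows, such as the 76 computed earlier, are not in this hand-copied table, which is why the helper stops rather than guessing a value.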