The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
modelc is an R model object to SQL compiler. It generates SQL select statements from linear and generalized linear models.
Its interface currently consists of a single function,
modelc
, which takes a single input, namely an
lm
or glm
model object.
It currently supports Gaussian and gamma family distributions using log or identity link functions.
To import linear models directly to your SQL Server database,
consider using Castpack, which depends
on modelc
.
Supposing the following data
<- 1:10
a <- 2*1:10 + runif(1) * 1.5
b <- as.factor(1:10)
c <- data.frame(a,b,c)
df = b ~ a + c formula
A vanilla linear model
<- lm(formula, data=df)
linear_model modelc(linear_model)
generates the following SQL
0.231808555545287 + 2 * `a` + (
CASE
WHEN c = 2 THEN -0.00000000000000193216758587821 * c
WHEN c = 3 THEN -0.000000000000000776180314897008 * c
WHEN c = 4 THEN -0.000000000000000665297412768863 * c
WHEN c = 5 THEN -0.00000000000000055441451064072 * c
WHEN c = 6 THEN -0.000000000000000887620818362638 * c
WHEN c = 7 THEN -0.000000000000000332648706384432 * c
WHEN c = 8 THEN -0.00000000000000110994422395641 * c
WHEN c = 9 THEN -0.00000000000000188723974152839 * c
WHEN c = 10 THEN 0 * c
END
)
GLMs are also supported with log or identity link functions
<- glm(formula, data=df, family=Gamma(link="log"))
glm_model modelc(glm_model)
EXP(
0.557874070609732 + 0.244938197625494 * `a` + (
CASE
WHEN c = 2 THEN 0.394878990324516 * c
WHEN c = 3 THEN 0.536977925025217 * c
WHEN c = 4 THEN 0.570378881020516 * c
WHEN c = 5 THEN 0.542936294999294 * c
WHEN c = 6 THEN 0.476536561025273 * c
WHEN c = 7 THEN 0.383038044594683 * c
WHEN c = 8 THEN 0.269593156578649 * c
WHEN c = 9 THEN 0.140849942185343 * c
WHEN c = 10 THEN 0 * c
END
) )
<- glm(formula, data=df, family=Gamma(link="identity"))
glm_model_idlink modelc(glm_model_idlink)
0.231808555545287 + 2 * `a` + (
CASE
WHEN c = 2 THEN 0.00000000000000139594865689472 * c
WHEN c = 3 THEN -0.000000000000000581567338978993 * c
WHEN c = 4 THEN -0.00000000000000111588502938831 * c
WHEN c = 5 THEN 0.000000000000000967650035758108 * c
WHEN c = 6 THEN -0.00000000000000149265067586469 * c
WHEN c = 7 THEN -0.000000000000000100985345060517 * c
WHEN c = 8 THEN -0.0000000000000000673235633736781 * c
WHEN c = 9 THEN 0.00000000000000199047558220559 * c
WHEN c = 10 THEN 0 * c
END
)
In order to avoid generating invalid SQL, modelc
temporarily sets your scipen
option to 999.
Using devtools
:
install.packages("devtools")
install.packages("remotes")
::install_github("sparkfish/modelc") remotes
Note that you may encounter minor differences between the output of your R and generated SQL models depending on the precision with which your numeric types are represented in the database. To ensure parity between the two models, numeric types should have a precision of at least 17.
Tests are written using testthat
. To run them, simply
do
::test() devtools
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.