
Fitting Multinomial Logistic Regression Models to Large Data Sets with the Divide and Recombine Approach

Using the function drglm.multinom(), multinomial logistic regression models can be fitted to large data sets.
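
The divide and recombine idea can be sketched by hand as follows. This is only an illustration under stated assumptions, not the internals of drglm.multinom(): it fits nnet::multinom() on each chunk and pools every coefficient with inverse-variance weights, a common recombination rule that ignores correlations between coefficients. The toy data, the names toy, y, x1, x2 and the choice k = 3 are made up for the example.

```r
library(nnet)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 3000
# Hypothetical toy data: three-level response, two numeric predictors
toy <- data.frame(
  y  = factor(sample(c("0", "1", "2"), n, replace = TRUE)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)

# Divide: assign rows to k chunks of equal size
k <- 3
chunk_id <- rep(seq_len(k), length.out = n)
chunks <- split(toy, chunk_id)

# Fit the same multinomial model on every chunk and keep its summary,
# which holds matrices of coefficients and standard errors
fits <- lapply(chunks, function(chunk) {
  summary(multinom(y ~ x1 + x2, data = chunk, trace = FALSE))
})

# Recombine: inverse-variance weighted average, coefficient by coefficient
# (assumed rule for this sketch; off-diagonal covariances are ignored)
weights  <- lapply(fits, function(s) 1 / s$standard.errors^2)
weighted <- Map(function(s, w) s$coefficients * w, fits, weights)
sum_w    <- Reduce(`+`, weights)

pooled_est <- Reduce(`+`, weighted) / sum_w
pooled_se  <- sqrt(1 / sum_w)

pooled_est
pooled_se
```

Each chunk is small enough to fit in memory on its own, and only the per-chunk coefficient and standard error matrices need to be kept for the recombination step.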

#Generating a Data Set

set.seed(123)
# Number of rows to be generated
n <- 1000000
# Create the data set
dataset <- data.frame(
  Var_1 = round(rnorm(n, mean = 50, sd = 10)),
  Var_2 = round(rnorm(n, mean = 7.5, sd = 2.1)),
  Var_3 = as.factor(sample(c("0", "1"), n, replace = TRUE)),
  Var_4 = as.factor(sample(c("0", "1", "2"), n, replace = TRUE)),
  Var_5 = as.factor(sample(0:15, n, replace = TRUE)),
  Var_6 = round(rnorm(n, mean = 60, sd = 5))
)

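This data set contains six variables. Three of them (Var_1, Var_2 and Var_6) are numeric, generated from normal distributions and rounded to whole numbers. The other three are categorical factors: Var_3 has two levels, Var_4 has three levels, and Var_5 has sixteen levels (0 to 15). We fit a multinomial logistic regression model to this data set below.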
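
The variable types can be checked quickly on a smaller copy of the data. The sketch below reuses the generating code with a hypothetical size n_small = 1000 (the name check and this size are assumptions made only to keep the example fast).

```r
set.seed(123)
n_small <- 1000  # hypothetical smaller size, used only for this check

check <- data.frame(
  Var_1 = round(rnorm(n_small, mean = 50, sd = 10)),
  Var_2 = round(rnorm(n_small, mean = 7.5, sd = 2.1)),
  Var_3 = as.factor(sample(c("0", "1"), n_small, replace = TRUE)),
  Var_4 = as.factor(sample(c("0", "1", "2"), n_small, replace = TRUE)),
  Var_5 = as.factor(sample(0:15, n_small, replace = TRUE)),
  Var_6 = round(rnorm(n_small, mean = 60, sd = 5))
)

# Class of every column: numeric for Var_1, Var_2, Var_6; factor for the rest
var_classes <- sapply(check, class)
var_classes

# Number of levels of the response variable
nlevels(check$Var_4)
```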

Fitting Multinomial Logistic Regression Model

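Next, we fit a multinomial logistic regression model to the data set, with Var_4 as the response variable and all other variables as predictors. With k = 10, the data are processed in ten subsets of 100,000 rows each, so ten convergence traces appear in the output.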
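
For reference, a multinomial logistic regression with a three-level response is usually written in baseline-category form. The formulation below assumes that level "0" of Var_4 is the reference category (the first factor level, which is the default in nnet::multinom(), whose trace format the output below resembles) and that the suffixes .1 and .2 in the printed table refer to levels "1" and "2". Here x denotes the covariate vector including the intercept and the dummy variables for Var_3 and Var_5.

```latex
\log \frac{P(\mathrm{Var\_4} = j \mid x)}{P(\mathrm{Var\_4} = 0 \mid x)}
  = x^{\top} \beta_j , \qquad j = 1, 2
```

Under this assumption, each predictor has one coefficient per non-reference category, which is consistent with the two estimate columns in the output.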

mmodel <- drglm::drglm.multinom(Var_4 ~ Var_1 + Var_2 + Var_3 + Var_5 + Var_6,
                                data = dataset, k = 10)
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## final  value 109861.228162 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109842.503510
## iter  20 value 109840.273128
## final  value 109838.002508 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109850.296686
## iter  20 value 109846.528490
## final  value 109842.945823 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109847.393856
## iter  20 value 109841.079169
## final  value 109840.175418 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109842.805655
## iter  20 value 109840.979230
## iter  30 value 109838.911934
## final  value 109838.864166 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109841.472994
## iter  20 value 109839.598647
## final  value 109837.733262 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109851.271296
## iter  20 value 109846.660324
## iter  30 value 109839.769091
## iter  40 value 109838.903624
## iter  40 value 109838.903182
## iter  40 value 109838.903178
## final  value 109838.903178 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109840.806578
## iter  20 value 109837.263429
## final  value 109834.528438 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109850.031314
## iter  20 value 109849.169972
## final  value 109846.685488 
## converged
## # weights:  63 (40 variable)
## initial  value 109861.228867 
## iter  10 value 109848.501910
## iter  20 value 109846.077070
## final  value 109845.048526 
## converged
#Output
print(mmodel)
##                Estimate.1    Estimate.2 standard error.1 standard error.2
## (Intercept)  4.081904e-02  2.071676e-03     0.0344340641     0.0344561368
## Var_1       -9.984185e-05  1.415146e-05     0.0002448509     0.0002449696
## Var_2        1.402186e-03  2.012445e-04     0.0011559414     0.0011565234
## Var_31      -1.835696e-03 -5.230905e-05     0.0048983192     0.0049007854
## Var_51      -2.570995e-03  6.045345e-03     0.0138744940     0.0138774417
## Var_52       2.589983e-03  7.659461e-03     0.0138717808     0.0138809960
## Var_53      -4.951806e-03 -1.604007e-02     0.0138678622     0.0139049681
## Var_54       1.456459e-03  1.530690e-02     0.0138888131     0.0138830238
## Var_55      -2.225580e-02 -2.838295e-02     0.0138490644     0.0138778135
## Var_56      -1.001576e-02 -1.472764e-02     0.0138454752     0.0138710823
## Var_57       3.229535e-03 -1.157117e-03     0.0138747222     0.0139001413
## Var_58       2.181392e-05 -1.234939e-03     0.0138865421     0.0139071506
## Var_59      -1.823170e-02 -1.626911e-02     0.0138691698     0.0138837951
## Var_510     -1.050656e-02 -1.295762e-02     0.0138664395     0.0138884550
## Var_511     -1.114918e-02  6.444328e-03     0.0138833413     0.0138709237
## Var_512     -5.482693e-03  1.265131e-03     0.0138716161     0.0138773671
## Var_513     -1.979504e-02 -2.113650e-02     0.0138717368     0.0138919857
## Var_514     -3.300604e-02 -1.611510e-02     0.0138574025     0.0138463110
## Var_515     -8.855361e-03  3.537469e-03     0.0138783001     0.0138751272
## Var_6       -6.124825e-04 -1.379973e-05     0.0004887467     0.0004889809
##                z value.1   z value.2 Pr(>|z|).1 Pr(>|z|).2 95% lower CI.1
## (Intercept)  1.185426192  0.06012503 0.23584898 0.95205606  -0.0266704840
## Var_1       -0.407765881  0.05776822 0.68344556 0.95393325  -0.0005797408
## Var_2        1.213025485  0.17400812 0.22512008 0.86185908  -0.0008634171
## Var_31      -0.374760305 -0.01067361 0.70783874 0.99148386  -0.0114362249
## Var_51      -0.185303723  0.43562392 0.85299082 0.66310961  -0.0297645039
## Var_52       0.186708754  0.55179480 0.85188899 0.58108895  -0.0245982079
## Var_53      -0.357070594 -1.15354944 0.72103896 0.24868494  -0.0321323162
## Var_54       0.104865617  1.10256256 0.91648244 0.27021717  -0.0257651146
## Var_55      -1.607025597 -2.04520340 0.10804875 0.04083481  -0.0493994683
## Var_56      -0.723395603 -1.06175109 0.46943687 0.28834870  -0.0371523886
## Var_57       0.232763905 -0.08324495 0.81594474 0.93365677  -0.0239644213
## Var_58       0.001570868 -0.08879882 0.99874663 0.92924180  -0.0271953084
## Var_59      -1.314548921 -1.17180582 0.18866155 0.24127502  -0.0454147755
## Var_510     -0.757696897 -0.93297782 0.44863246 0.35083142  -0.0376842803
## Var_511     -0.803061741  0.46459254 0.42193905 0.64222327  -0.0383600292
## Var_512     -0.395245419  0.09116504 0.69266178 0.92736145  -0.0326705607
## Var_513     -1.427005454 -1.52148900 0.15357832 0.12813717  -0.0469831484
## Var_514     -2.381834365 -1.16385533 0.01722664 0.24448265  -0.0601660472
## Var_515     -0.638072450  0.25495039 0.52342652 0.79876141  -0.0360563293
## Var_6       -1.253169669 -0.02822140 0.21014397 0.97748557  -0.0015704084
##             95% lower CI.2 95% upper CI.1 95% upper CI.2
## (Intercept)  -0.0654611110   0.1083085670   0.0696044634
## Var_1        -0.0004659801   0.0003800571   0.0004942831
## Var_2        -0.0020654997   0.0036677897   0.0024679886
## Var_31       -0.0096576719   0.0077648337   0.0095530538
## Var_51       -0.0211539403   0.0246225131   0.0332446313
## Var_52       -0.0195467909   0.0297781737   0.0348657137
## Var_53       -0.0432933050   0.0222287046   0.0112131685
## Var_54       -0.0119033243   0.0286780325   0.0425171290
## Var_55       -0.0555829659   0.0048878664  -0.0011829367
## Var_56       -0.0419144585   0.0171208768   0.0124591850
## Var_57       -0.0284008929   0.0304234903   0.0260866597
## Var_58       -0.0284924528   0.0272389363   0.0260225757
## Var_59       -0.0434808504   0.0089513711   0.0109426265
## Var_510      -0.0401784920   0.0166711639   0.0142632510
## Var_511      -0.0207421833   0.0160616687   0.0336308387
## Var_512      -0.0259340090   0.0217051753   0.0284642705
## Var_513      -0.0483642950   0.0073930604   0.0060912882
## Var_514      -0.0432533737  -0.0058460277   0.0110231681
## Var_515      -0.0236572804   0.0183456074   0.0307322187
## Var_6        -0.0009721847   0.0003454434   0.0009445853
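
The columns of the table appear to follow the usual Wald formulas. As a check, the sketch below recomputes the z value, p-value and 95% confidence interval for one row from its estimate and standard error. It assumes a normal reference distribution and a two-sided test; the input values are copied from the Var_514 row for the first category.

```r
# Values copied from the Var_514 row of the printed output (category 1)
est <- -3.300604e-02
se  <- 0.0138574025

z  <- est / se                             # Wald z statistic
p  <- 2 * pnorm(-abs(z))                   # two-sided p-value
ci <- est + c(-1, 1) * qnorm(0.975) * se   # 95% Wald confidence interval

round(c(z = z, p = p, lower = ci[1], upper = ci[2]), 6)

# Odds ratio relative to the reference level of Var_5
exp(est)
```

The recomputed values agree with the printed row up to rounding. Since all variables were simulated independently of Var_4, most coefficients are close to zero, and the few p-values below 0.05 are in line with what chance alone would produce across 40 tests.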
