The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
The mathematical representation of lactation curves has significant applications in various areas of animal science. Models that simulate milk production under different conditions are valuable for physiologists, nutritionists, and geneticists, allowing them to study mammary gland function and test hypotheses. These models also support management decisions related to timing and efficiency. Over the past decades, numerous models have been proposed for representing lactation curves in dairy species. These models differ mainly in their regression type (linear or nonlinear), number of parameters, relationships among parameters, and ability to represent lactation patterns such as peak yield, time at peak, and persistency.
Despite the advantages of having a diverse range of models, selecting a single model to represent the lactation curve can present limitations. Typically, model selection is based on comparing metrics such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). However, this metric-based approach assumes that the chosen metric is an optimal criterion for model selection. In practice, these metrics may fail to correctly identify the best model, especially when no single model is clearly superior. Furthermore, using a single model selected based on these metrics can lead to overfitting and introduce bias, particularly when the dataset is noisy or contains numerous variables.
Ensemble modeling and model averaging are powerful techniques that enhance robustness, accuracy, and generalization in predictive modeling. Instead of relying on a single model, ensemble methods combine multiple models to reduce variance, mitigate overfitting, and improve predictive performance. Model averaging incorporates model uncertainty by weighting predictions according to their posterior probabilities, leading to more reliable estimates. These approaches are particularly valuable when individual models exhibit varying performance across different datasets or conditions. By integrating multiple models, ensemble methods improve stability and resilience, making them especially useful in complex biological and ecological systems where uncertainty quantification is crucial.
The EMOTIONS package provides a set of tools for fitting 47 different lactation curve models previously reported in the literature. Some of these models and the pre-defined starting parameters were obtained from the lactcurves R package (https://CRAN.R-project.org/package=lactcurves). Once the data is fitted to each model, ensemble predictions are generated using bagging based on AIC, BIC, root mean square percentage error (RMSPE), mean absolute error (MAE), and variance. Additionally, the package provides predictions for daily milk records using Bayesian Model Averaging (BMA) and calculates cosine similarity for each model’s predictions. The ranking of models across individual predictions can be visualized using the RidgeModels and ModelRankRange functions, which help users better understand the weight assigned to each model. Furthermore, the PlotWeightLac function allows users to compare predicted and actual daily milk records. Lastly, EMOTIONS enables the estimation of resilience indicators based on lag-1 autocorrelation, logarithm of residual variance, and residual skewness using predicted daily milking records.
Use the following code to install EMOTIONS:
EMOTIONS includes a dummy dataset containing daily milk records (up to 210 days) for 100 unique individuals. This dataset can be accessed as follows:
# Load the dummy dataset
data("LacData")
# Display the first rows
head(LacData)
#> ID DIM DMY
#> 1 ID2 1 0.920000
#> 2 ID2 2 1.533911
#> 3 ID2 3 1.720438
#> 4 ID2 4 1.968925
#> 5 ID2 5 2.201476
#> 6 ID2 6 2.339388
The dataset consists of three columns:
The core function of EMOTIONS is LacCurveFit. This wrapper function integrates multiple supplementary functions that fit 47 lactation curve models. It takes daily milk production and days in milk from LacData as input. The following arguments must be provided:
Example usage:
# Running model fitting and ensemble modeling
out.ensemble <- LacCurveFit(
data = LacData, ID = "ID", trait = "DMY",
dim = "DIM", alpha = 0.1,
models = "All", param_list = NULL, silent=TRUE
)
The output of LacCurveFit consists of three main lists:
Checking the first six fitted models for individual ID2:
head(out.ensemble$converged_models$ID2)
#> $MM
#> Nonlinear regression model
#> model: DMY ~ (a * DIM)/(b + DIM)
#> data: x
#> a b
#> 1.678 -1.426
#> residual sum-of-squares: 102.4
#>
#> Number of iterations to convergence: 16
#> Achieved convergence tolerance: 8.24e-06
#>
#> $MMR
#> Nonlinear regression model
#> model: DMY ~ 1/(1 + a/(b + DIM))
#> data: x
#> a b
#> -0.4402 -11.2975
#> residual sum-of-squares: 222
#>
#> Number of iterations to convergence: 10
#> Achieved convergence tolerance: 7.544e-06
#>
#> $brody23
#> Nonlinear regression model
#> model: DMY ~ a * exp(-b * DIM)
#> data: x
#> a b
#> 2.472794 0.002908
#> residual sum-of-squares: 40.12
#>
#> Number of iterations to convergence: 5
#> Achieved convergence tolerance: 1.871e-06
#>
#> $brody24
#> Nonlinear regression model
#> model: DMY ~ a * exp(-b * DIM) - a * exp(-c * DIM)
#> data: x
#> a b c
#> 2.608615 0.003347 0.409834
#> residual sum-of-squares: 35.6
#>
#> Number of iterations to convergence: 6
#> Achieved convergence tolerance: 1.757e-06
#>
#> $SCH
#> Nonlinear regression model
#> model: DMY ~ exp(a + b/DIM)
#> data: x
#> a b
#> 0.6086 0.0746
#> residual sum-of-squares: 61.22
#>
#> Number of iterations to convergence: 24
#> Achieved convergence tolerance: 5.52e-06
#>
#> $SCHL
#> Nonlinear regression model
#> model: DMY ~ a * exp(b * DIM/(DIM + 1))
#> data: x
#> a b
#> 2.9461 -0.4806
#> residual sum-of-squares: 60.6
#>
#> Number of iterations to convergence: 12
#> Achieved convergence tolerance: 6.89e-06
Checking the first six rows of the models_weight list:
head(out.ensemble$models_weight$ID2)
#> Model AIC BIC RMSPE MAE delta_AIC delta_BIC delta_RMSPE
#> 17 GS1 248.8716 262.6586 0.4066434 0.2980797 64.06626 60.61952 0.03673873
#> 19 LQ 221.7219 235.5088 0.3835325 0.2764029 36.91651 33.46977 0.01362778
#> 16 PapBo6 236.7595 250.5465 0.3961658 0.2892085 51.95413 48.50739 0.02626116
#> 37 qntReg 184.8054 202.0391 0.4409210 0.2986891 0.00000 0.00000 0.07101635
#> 9 DHA 254.6571 268.4441 0.4117455 0.3025809 69.85173 66.40499 0.04184078
#> 8 wood 254.6571 268.4441 0.4117455 0.3025810 69.85173 66.40499 0.04184078
#> delta_MAE AIC_weight BIC_weight RMSPE_weight MAE_weight Var_weight
#> 17 0.04155304 0.0010871462 0.0015945608 0.02708594 0.02706331 0.0011616592
#> 19 0.01987616 0.0164205111 0.0240846192 0.02714861 0.02712204 0.0009363620
#> 16 0.03268181 0.0036501551 0.0053538282 0.02711434 0.02708733 0.0009997349
#> 37 0.04216239 0.6586434175 0.6844056210 0.02699326 0.02706166 0.0009299743
#> 9 0.04605421 0.0006095767 0.0008940905 0.02707212 0.02705113 0.0011220927
#> 8 0.04605427 0.0006095767 0.0008940905 0.02707212 0.02705113 0.0011220929
#> CosSquared_weight BMA_weight RMSPE_rank MAE_rank AIC_rank BIC_rank BMA_rank
#> 17 0.02717808 0.007028742 22 24 21 19 26
#> 19 0.02717127 0.026882458 8 8 8 7 11
#> 16 0.02716655 0.013334229 12 19 13 10 20
#> 37 0.02716609 0.027536974 33 25 1 1 10
#> 9 0.02716575 0.005804098 26 27 27 27 28
#> 8 0.02716575 0.005804086 24 29 25 25 30
#> Var_rank CosSquared_rank
#> 17 9 1
#> 19 30 2
#> 16 24 3
#> 37 31 4
#> 9 14 5
#> 8 13 6
Checking the first six rows of the production list:
head(out.ensemble$production$ID2)
#> ID DMY DIM weight_AIC weight_BIC Weight_RMSPE weight_MAE weight_BMA
#> 1 ID2 0.920000 1 1.831143 1.875771 1.467181 1.465492 1.194834
#> 2 ID2 1.533911 2 2.052090 2.075498 2.006070 2.007553 1.630353
#> 3 ID2 1.720438 3 2.171604 2.185195 2.096008 2.096483 1.865502
#> 4 ID2 1.968925 4 2.247112 2.254831 2.179450 2.179661 2.011144
#> 5 ID2 2.201476 5 2.299417 2.303232 2.236370 2.236451 2.108247
#> 6 ID2 2.339388 6 2.337805 2.338881 2.274461 2.274465 2.175942
#> weight_Var weight_CosSquared SMA
#> 1 1.912619 1.474490 1.463731
#> 2 1.837443 2.000676 2.007605
#> 3 1.811571 2.094073 2.095208
#> 4 1.798359 2.178289 2.177863
#> 5 1.790319 2.235544 2.234341
#> 6 1.784906 2.273813 2.272145
The ensemble predictions are obtained using different weighting methods:
Model ranking across individuals can be visualized using RidgeModels and ModelRankRange.
Note that the function ModelRankRange displays the number of individuals which the model was converged in front of each line.
Another visualization option provided by EMOTIONS is the PlotWeightLac function. This function plots the actual daily milk daily production and the predicted values obtained by the ensemble model.The arguments that must be provided to this function are:
PlotWeightLac(
data = out.ensemble, ID = "ID2",
trait = "DMY", metric = "weight_AIC",
dim = "DIM", col = c("red", "blue")
)
The function LacCurveFit allows the customization of the models to be included in the ensemble as well as the parameters of the models.
The EMOTIONS package has a dataset with the list of models, and its respective acronyms, available in the package. This dataset can be accessed using the following command.
data("models_EMOTIONS")
head(models_EMOTIONS)
#> Model Model_acronym
#> 1 Michaelis-Menten MM
#> 2 Michaelis-Menten (Rook) MMR
#> 3 Michaelis-Menten + exponential (Rook) MME
#> 4 Brody (1923) brody23
#> 5 Brody (1924) brody24
#> 6 Schumacher SCH
#> Authors Year
#> 1 Michaelis, L. and M.L. Menten 1913
#> 2 Rook, A.J., J. France, and M.S. Dhanoa 1993
#> 3 Rook, A.J., J. France, and M.S. Dhanoa 1993
#> 4 Brody, S., A.C. Ragsdale, and C.W. Turner 1923
#> 5 Brody, S., C.W. Tuner, and A.C. Ragsdale 1924
#> 6 Schumacher, F.X., Thornley, J.H.M. and J. France 2007
#> Reference
#> 1 https://www.chem.uwec.edu/Chem352_Resources/pages/readings/media/Michaelis_&_Menton_1913.pdf
#> 2 doi:10.1017/S002185960007684X
#> 3 doi:10.1017/S002185960007684X
#> 4 doi:10.1085/jgp.5.6.777
#> 5 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2140670/
#> 6 https://books.google.com.au/books/about/Mathematical_Models_in_ Agriculture.html?id=rlwBCRSHobcC&redir_esc=y
Users can select specific models to include in the ensemble. For
example, the following models will be used to generate the ensemble:
wil
, wilk
, wilycsml
,
DiG
, DiGpw
, legpol3
,
legpol4
, legpolWil
, cubsplin3
,
cubsplin4
, cubsplin5
,
cubsplindef
, wilminkPop
, and
qntReg
.
The ranking of the selected models can be evaluated using a ridge density plot based on AIC scores:
Users can access the list of models that support parameter
customization and their respective parameters using the
model_pars
dataset:
data(model_pars)
head(model_pars)
#> Model_acronym Parameters
#> 1 MM c(a=19.8,b=-1.65)
#> 2 MMR c(a=-10,b=-1.4)
#> 3 MME c(a=-0.06608,b=317.49,c=-328.06,d=-0.027)
#> 4 brody23 c(a=25.6,b=0.0015)
#> 5 brody24 c(a=26.127,b=0.0017,c=0.2575)
#> 6 SCH c(a=1,b=16)
For instance, the starting values of the parameters for models
MM
and wil
can be modified as follows:
edited_list <- list(
MM = c(a = 20, b = -2),
wil = c(a = 35, b = -5, c = -0.01, k = 0.2)
)
out.ensemble.edited <- LacCurveFit(
data = LacData,
ID = "ID",
trait = "DMY",
dim = "DIM",
alpha = 0.1,
models = "All",
param_list = edited_list
)
Model convergence can be checked using:
out.ensemble.edited$converged_models$ID2[["MM"]]
#> NULL
out.ensemble.edited$converged_models$ID2[["wil"]]
#> Nonlinear regression model
#> model: DMY ~ a + b * exp(-k * DIM) + c * DIM
#> data: x
#> a b c k
#> 2.465928 -2.537452 -0.005709 0.482613
#> residual sum-of-squares: 37.37
#>
#> Number of iterations to convergence: 8
#> Achieved convergence tolerance: 9.786e-06
The ResInd
function calculates resilience estimators,
including logarithm of variance, lag-1 autocorrelation, and skewness,
based on the weighted predictions. The required parameters are:
production_df
: A list containing data frames with daily
production records (either actual or predicted) from
LacCurveFit
.dim_filter_range
: A vector defining the lower and upper
limits for filtering lactation records at the beginning and end of
lactation. If no filtering is needed, set the first two values as the
minimum DIM and the last two as the maximum DIM.outlier_sd_threshold
: A threshold specifying the
maximum standard deviations for identifying outlier resilience
indicators.weight
: The column name containing the selected
ensemble prediction (default: weight_AIC
).trait
: The column name containing daily milk yield
records.DIM
: The column name containing days in milk
records.ID_col
: The column name containing unique animal
IDs.out.res <- ResInd(
out.ensemble$production,
dim_filter_range = c(1, 7, 203, 210),
outlier_sd_threshold = 4,
weight = "weight_AIC",
trait = "DMY",
DIM = "DIM",
ID_col = "ID"
)
The output of ResInd
is a list containing:
ri_filtered
: A data frame with daily milk production
after filtering and the estimated resilience indicators.dev_list
: A list of deviations for each animal and
DIM.removed_samples
: A list of animals identified as
outliers and removed from analysis.ri_stats
: A data frame summarizing the resilience
indicators.The first six rows of the filtered dataset can be accessed as follows:
head(out.res$ri_filtered)
#> # A tibble: 6 × 5
#> CodGen log_varianza autocorrelacion_lag1 skewness mean_Prod
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 ID102 -1.05 0.467 -0.000713 3.09
#> 2 ID104 -1.17 0.493 -0.593 3.76
#> 3 ID105 -1.77 0.494 0.214 2.18
#> 4 ID121 -0.553 0.460 1.06 3.22
#> 5 ID125 -2.30 0.452 1.58 1.91
#> 6 ID126 -1.50 0.498 1.15 2.86
The summary statistics for the resilience indicators can be retrieved using:
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.