The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

introduction

library(JSDNE)

Abstract

Age estimation plays a significant role in forensic anthropology and bioarchaeology. However, widely-used traditional methods involving macroscopic observation suffer from subjectivity and statistical bias. The present research aims to minimize both issues by applying computational and mathematical approaches. A laser scanner was used to reconstruct 890 auricular surfaces of adult individuals from three known-age European skeletal collections. Dirichlet Normal Energy (DNE) was applied to assess the curvature of the auricular surface and its relationship with known age-at-death. Ten variables had high correlations, including total DNE per Total polygon faces, Mean value of DNE on apex, proportion of polygon faces with DNE of less than 0.0001 and proportion of polygon faces with DNE of over 0.6. The variables were used to develop age prediction models which are freely available in a novel R package, JSDNE. The package predicts age mathematically, objectively, and user-independently. It includes three functions: principal component quadratic discriminant analysis (PCQDA), principal component regression analysis (PCR), and principal component logistic regression analysis (PCLR), which produce age estimates with 91%, 76%, and 92.9% levels of accuracy, respectively. JSDNE package (https://cran.r-project.org/package=JSDNE) can be downloaded automatically using install.packages(“JSDNE”). The detailed code and the raw data developed the findings of this study are openly available at https://github.com/jisunjang19/cran-JSDNE or doi: 10.5281/zenodo.12708779.

Introduction

Age estimation, together with sex and stature estimation, is crucial in constructing the biological profile of human remains. In bioarchaeology, the biological profile is the foundation for reconstructing past demographic information, leading to the proper understanding of human life history. Forensic anthropology uses the biological profile to identify unknown individuals, helping to investigate cases of missing persons, mass disasters, genocide, and violent crime (Márquez-Grant, 2015; Moraitis et al., 2014; Rissech et al., 2012). With increased globalisation, natural disasters, and international terrorism, attention has been paid to developing more accurate and objective age estimation methods (Zhang et al., 2022). Age estimation methods have been developed based on observing bone changes that occur throughout life (Byers, 2007). These bone changes can be divided into developmental and degenerative changes (Moraitis et al., 2014); developmental changes are based on the maturation rate of bones, such as epiphyseal ossification (Schmeling et al., 2004), and tooth eruption (Kumar and Sridhar, 1990). These are used to estimate subadult age. As for adults who have completed growth, age is estimated by examining the degree of degenerative change (Schwartz, 1995), including the pubic symphysis (Brooks and Suchey, 1990), the sternal ends of the ribs (Işcan et al., 1984), and the auricular surface (Lovejoy et al., 1985). Among these, the auricular surface is considered one of the most significant indicators as it is dense, highly durable and often well-preserved against decomposition (Rivera-Sandoval et al., 2018). Moreover, it is argued that degenerative changes on the auricular surface continue after 60 years of age, making it possible to estimate older adults’ age more accurately than other age indicators (Rissech et al., 2012). Even though age estimation methods have been developed and revised over several decades, there is still no way to precisely interpret the speed and degree of change for different individuals. This is because bone changes are affected by intrinsic and extrinsic factors, such as genes, diet, socioeconomics, and activity throughout the life, with rates of change varying between individuals. Therefore, estimating adult age can be complicated and inaccurate (Miranker, 2016). Age estimation methods traditionally suffer from two common issues: subjectivity and statistical bias. Most age estimation methods are based on visual assessment, which increases subjectivity. When assessing degenerative changes observers rely on their own expertise and experience to interpret the morphological features. Thus, the interpretation can differ depending on the observer, and the estimated result also varies. Even if the same observer assesses the same bone on different occasions, the results can be different. These are called interobserver and intraobserver errors, respectively (Garvin and Passalqua, 2011; Moraitis et al., 2014). Statistical bias occurs due to differences in the age structure of the reference and target samples. Fundamentally, the estimated results for a target sample can be biased by the spread of ages of individuals that made up the reference sample. This is known as age structure mimicry (Mensforth, 1990:90). Regional and diachronic variations can also influence the degree of degeneration displayed by different morphological features, biasing results towards the reference sample and causing inaccurate and unreliable results. Overall, these issues impact the accuracy, reliability, reproducibility and repeatability of results. To address these limitations, more objective and less observer-dependent age estimation methods have been sought (Botha et al., 2019; Curate et al., 2022). Recently, virtual anthropology has gained attention due to its potential to improve and revise traditional approaches through the application of computational imaging technology, such as computed tomography and laser scanning (Warrier et al., 2023). This allows a mathematical approach to extract subtle surface changes and more complex visual information not limited by human vision (Zhang et al., 2022; Warrier et al., 2023). It is also more user-independent, so the extracted results are objective, reproducible, and repeatable (Zhang et al., 2022). Currently, the most widely used mathematical approach in age estimation is curvature analysis of the joint surface. Curvature analysis is used to evaluate the degree to which surfaces are undulated or flat, and can therefore be used to analyse surface texture irregularities (Villa et al., 2015). To date, most research has focused on the pubic symphysis (Biwasaka et al., 2013; Stoyanova et al., 2015; Stoyanova et al., 2017; Slice and Agee-Hewitt, 2015), with only three papers exploring curvature analysis on the auricular surface (Jang et al., 2024; Štepanovský et al., 2023; Villa et al., 2015). Villa et al. (2015) used Gaussian curvature analysis to examine the correlations between five curvature variables and age: mean, highest, lowest, convex, and flat curvatures. The variables had moderate correlations with age, though it was noted that both young and older adults displayed high curvature values; young individuals because of transverse organisation, and older individuals due to increased macroporosity. Middle-aged individuals, who generally showed no transverse organisation or macroporosity (reflecting a flatter auricular surface), had lower curvature values. However, due to the sensitivity limitations of the equipment, detailed surface features such as surface texture and microporosity could not be analysed (Villa et al., 2015). While Villa et al. (2015) were able to show a correlation between curvature values and adult age, Štepanovský et al. (2023) developed this method further to produce age estimation software, CoxAGE3D, which uses an automatic data mining model. CoxAGE3D is fully automated, with age estimations produced after the user uploads a 3D laser scan of an auricular surface to the program. Calculations are based on multi-population auricular surfaces from five known-age Asian and European collections, reducing statistical bias levels. While this method appears to be able to identify a number of age-related coefficients, accuracy rates were not provided and it is yet to be subjected to independent validation. Jang et al. (2024) examined age-related change to the auricular surface using a different type of curvature analysis, Dirichlet Normal Energy (DNE). This requires a surface to be converted into polygon faces before DNE is used to measure how much the surface bends, which it does by quantifying the deviation of surface energy from a stable state, which is 0 or flat. Each polygon face (ρ) has two principal curvatures: the largest curvature (v) and the smallest curvature (υ), and the points at which these meet are called vertices (Figure 1). Each vertex has a normal (red arrow), which is the average value of the adjacent polygonal normal vectors (blue arrows). A polygon is formed by translating the normal vectors to a common vertex. This polygon has edge vectors (nν and nυ), which are derivatives of v and υ (Bunn et al., 2011; Pampush et al., 2016; Winchester, 2016). The Dirichlet Energy Density (DED) is obtained from the sum of square roots of edge vectors. Then, DNE is calculated as the total sum of the values multiplied by DNE and area (Pampush et al., 2016:398):

\[DED(\rho)=(n_v)^2(n_u)^2\] \[DNE=\sum DED(\rho) \times area(\rho)\]

This makes it possible to identify subtle changes in surface curvature more sensitively than other methods of curvature analysis (Bunn et al., 2011; Godfrey et al., 2012). As Jang et al. (2024) tested this method on archaeological skeletons with no biological information, age was estimated using the Lovejoy (1985) and Buckberry-Chamberlain (2002) methods. It was found that four DNE variables correlated with the estimated age phases, those relating to surface undulation, apical activity, macroporosity, and the total DNE per total number of polygon faces. In terms of total DNE per total number of polygon faces, younger adults had higher values than middle-aged adults due to the presence of transverse organisation (surface undulation), which occurred more frequently in the former. Younger adults had the highest DNE values for surface undulation, with these values falling with increasing age. Middle-aged adults had lower total DNE values because, as initially noted by Villa et al. (2015), their auricular surfaces were flatter due to a lack of transverse organisation or macroporosity. Older adults had the highest total DNE values and the highest DNE values for macroporosity and apical activity, which both showed a positive correlation with age phase (Jang et al., 2024). Jang et al. (2024) tested the levels of intraobserver error associated with this method and found very high levels of agreement. However, there was still a degree of user dependency as the user must manually select a region of interest (ROI) before DNE values can be calculated (Jang et al., 2024). Moreover, as this study used archaeological skeletons, the age of the individuals was not documented, potentially obscuring relationships between DNE values and age at death. To address these limitations, the current paper builds on the work of Jang et al. (2024) by reducing the need to select a ROI manually, and examining the relationship between DNE values and age-at-death in known-age individuals. Age-related variables will be identified via three statistical analysis methods: principal component discriminant analysis (PCQDA), principal component regression analysis (PCR), and principal component logistic regression analysis (PCLR). The results have been used to develop an R package, JSDNE, which can be used to estimate age-at-death in unknown individuals.

Datasets in the JSDNE package

JSDNE package includes seven datasets: RawData, PCQDA_Train, PCQDA_Test, PCR_Train, PCR_Test, PCLR_Train, PCLR_Test.

‘RawData’

RawData offers the raw information of DNE variables and biological information of auricular surfaces (age and sex). It consists of Age, Cluster1, Cluster2, Collection, Sex, MeanDNE.Apex, MedianDNE.Apex, IQRDNE.Apex, TotalDNE.TotalPolygonFaces, MedianDNE.Whole, IQRDNE.Whole, MeanDNE.Convex, MeanDNE.Concave, Proportion.DNEunder0.0001 and Proportion.DNEover0.6. Cluster1 and Cluster2 is the clustering information for PCLR and PCQDA, respectively. The number of rows is 890.

‘PCQDA_Train’

PCQDA_Train is the train set of the PCQDA model. It consists of Cluster2, Age, MeanDNE.Apex, TotalDNE.TotalPolygonFaces, Proportion.DNEunder0.0001, and Proportion.DNEover0.6. The number of rows is 704. This dataset is also used to calculate the Accuracy (ACC), Bias, Average Absolute Deviation (AAD) with PCQDA method. The followed code is used to create PCQDA_Train and PCQDA_Test.

PCQDA_data<-subset(RawData,select=c(Cluster2,Age, MeanDNE.Apex, TotalDNE.TotalPolygonFaces, Proportion.DNEunder0.0001, Proportion.DNEover0.6))
sample <- sample(c(TRUE, FALSE), nrow(PCQDA_data), replace=TRUE, prob=c(0.8,0.2))
PCQDA_Train <- PCQDA_data[sample, ]
PCQDA_Test <- PCQDA_data[!sample, ]

‘PCQDA_Test’

PCQDA_Test is the test set of the PCQDA model. It consists of Cluster2, Age, MeanDNE.Apex, TotalDNE.TotalPolygonFaces, Proportion.DNEunder0.0001, and Proportion.DNEover0.6. The number of rows is 186. This dataset is also used to calculate the Accuracy (ACC), Bias, Average Absolute Deviation (AAD) with PCQDA method.

‘PCR_Train’

PCR_Train is the train set of the PCR model. It consists of Age, MeanDNE.Apex, IQRDNE.Apex, TotalDNE.TotalPolygonFaces, MeanDNE.Convex and Proportion.DNEunder0.0001. The number of rows is 702. This dataset is also used to calculate the Accuracy (ACC), Bias, Average Absolute Deviation (AAD) with PCR method. The followed code is used to create PCR_Train and PCR_Test.

PCR_data<-subset(RawData,select=c(Age, MeanDNE.Apex, IQRDNE.Apex, TotalDNE.TotalPolygonFaces, MeanDNE.Convex, Proportion.DNEunder0.0001))
sample <- sample(c(TRUE, FALSE), nrow(PCR_data), replace=TRUE, prob=c(0.8,0.2))
PCR_Train <- PCR_data[sample, ]
PCR_Test <- PCR_data[!sample, ]

‘PCR_Test’

PCR_Test is the test set of the PCR model. It consists of Age, MeanDNE.Apex, IQRDNE.Apex, TotalDNE.TotalPolygonFaces, MeanDNE.Convex and Proportion.DNEunder0.0001. The number of rows is 188. This dataset is also used to calculate the Accuracy (ACC), Bias, Average Absolute Deviation (AAD) with PCR method.

‘PCLR_Train’

PCLR_Train is the train set of the PCR model. It consists of Age, Cluster1, MeanDNE.Apex, TotalDNE.TotalPolygonFaces, MedianDNE.Whole, IQRDNE.Whole and MeanDNE.Convex. The number of rows is 699. This dataset is also used to calculate the Accuracy (ACC), Bias, Average Absolute Deviation (AAD) with PCLR method. The followed code is used to create PCR_Train and PCR_Test.

PCLR_data<-subset(RawData,select=c(Age, Cluster1, MeanDNE.Apex, TotalDNE.TotalPolygonFaces, MedianDNE.Whole, IQRDNE.Whole, MeanDNE.Convex))
sample <- sample(c(TRUE, FALSE), nrow(PCLR_data), replace=TRUE, prob=c(0.8,0.2))
PCLR_Train <- PCLR_data[sample, ]
PCLR_Test <- PCLR_data[!sample, ]

‘PCLR_Test’

PCLR_Test is the test set of the PCR model. It consists of Age, Cluster1, MeanDNE.Apex, TotalDNE.TotalPolygonFaces, MedianDNE.Whole, IQRDNE.Whole and MeanDNE.Convex. The number of rows is 191. This dataset is also used to calculate the Accuracy (ACC), Bias, Average Absolute Deviation (AAD) with PCLR method.

Materials

Data for the current study was collected from three European anatomical skeletal collections; the Luís Lopes Skeletal Collection, the 21st Century Identified Skeletal Collection, and the CAL Milano Cemetery Skeletal Collection. The Luís Lopes Skeletal Collection is curated by the National Museum of Natural History and Science (MUHNAC), Lisbon, Portugal. It contains 1,692 skeletons in total, with biographical information available for 699 individuals, including age-at-death, place of birth and residence, and cause of death. The remains were collected from a modern cemetery in Lisbon between 19th and 20th centuries (Cardoso, 2006). The 21st Century Identified Skeletal Collection is curated by the University of Coimbra, Portugal. This contained 159 adult individuals, ranging in age from 29 to 99 years at death. Most individuals were Portuguese, dying between 1995 and 2008. Biographical information, including sex and age-at-death, is available for the entire collection, with documentation relating to the cause of death available for 12 individuals (Ferreira et al., 2014). Lastly, the CAL Milano Cemetery Skeletal Collection is curated by the Laboratorio di Antropologia e Odontologia Forense (LABANOF) in the Department of Biomedical Sciences for Health of the University of Milan (Italy) and part of the Collezione Antropologica LABANOF (CAL), Italy. It contains 2127 skeletons from Cimitero Maggiore, Cimitero di Lambrate, and Cimitero di Baggio. Biographical information, such as sex, age-at-death, date of birth and death, cause of death, and pathology (if it is specified), were all documented. The individuals died between 1910 and 2001, but most of them died after 1980 (Cattaneo et al., 2018). The current research consists of 890 (402 females, 488 males) adult individuals with a known age-at-death from these three collections. All 890 individuals had fully preserved auricular surfaces that displayed no taphonomic damage or pathological conditions.

Methods

Preparation and importing of reconstructed models Auricular surfaces were scanned using an EXScan SP laser scanner (resolution: 1.3 megapixels). Each surface was scanned from a minimum of five different angles and the resultant images were combined using the software EXScanS_v3.1.0.1 to produce a single 3D model which was saved as a .ply file. Each .ply file was then edited manually in Meshmixer to produce two ROIs; the auricular surface and the apical area. The auricular surface was isolated using the outer margin as a guide (following the criterion of Jang et al., 2024). The apical area was delineated by measuring polygons within a 1cm distance from both the left and right sides of the terminated point of the arcuate line. The 3D models were imported into R studio with the vcgPlyRead function from Rvcg package (Schlager, 2015).

Analysis Principal component analysis (PCA), quadratic discriminant analysis (PCDQA), and regression analysis (PCLR) were used to identify DNE variables that were highly correlated with known age-at-death. These correlations were then used to develop a model that estimates an unknown auricular surface’s age-at-death. In order to make this model available to other researchers, an R package, JSDNE, was created, which requires R version 4.3.2 (R Core Team, 2024). JSDNE can be installed automatically by operating the install function, install.packages(“JSDNE”). Once the package is installed, it can be imported by the library function, library(JSDNE). Four packages are automatically imported to the user’s library: dplyr (Wickham et al., 2019), MASS (Venables and Ripley, 2002), molaR (Pampush et al., 2016), and nnet (Venables and Ripley, 2002). After importing the .ply files with the vcgPlyRead function, the auricular surface .ply file should be replaced with x and the apical area with y. If oppositely applied, incorrect results will be calculated. The JSDNE package consists of three functions to estimate age. The PCQDA function can assign an individual to one of four broad age ranges: phase 1 (20-44yrs), phase 2(31-74yrs), phase 3 (63-93yrs), phase 4 (over 76) The PCR function provides an output of estimated age in years and the standard error (8.8yrs). The PCLR function also assigns an individual to a broad age range, but with a more limited range of under 67 years (phase 1) or over 63 years (phase 2).

Results

Age-related variables Initial analyses identified ten DNE variables that were highly correlated with age-at-death in the known-age sample (Table 3 and Figure 3). Nine of these variables (total DNE per total number of polygon faces, median value of DNE on whole surface, interquartile range (IQR) on whole surface, mean value of DNE on convex and concave surfaces, proportion of polygon faces with DNE of over 0.6, and mean, median and IQR of DNE on apex) had high positive correlations, while the proportion of polygon faces with DNE of less than 0.0001, had a high negative correlation.

Identifying the suitable variables and the cluster The best combination of variables was chosen on the basis of three criteria: (1) high total accuracy, (2) high accuracy in younger auricular surfaces and older auricular surfaces, and (3) low bias. For PCR analysis, an additional criterion was used: low standard error. For the PCQDA analysis, four variables provided the best combination: proportion of polygon faces with DNE of less than 0.0001, proportion of polygon faces with DNE of over 0.6, mean value of DNE on apex, and total DNE/total polygon faces. Grouping estimated ages into four age phases (Table 4 and Figure 4). The age intervals of each phase are 24 years (phase 1), 43 years (phase 2) and 30 years (phase 3), with an open-ended phase 4. For PCR analysis, seven variables provided the best combination: total DNE per total number of polygon faces, mean value of DNE on apex, IQR value of DNE on apex, mean value of DNE on convex surface, proportion of polygon faces with DNE of less than 0.0001. As this is a regression model no age phase was produced. Instead, the output provides an estimated age in years with a standard error of ±10.1 yrs. Finally, the PCLR analysis was developed using five variables: total DNE per total number of polygon faces, median value of DNE on whole surface, IQR of DNE on whole surface, mean value of DNE on convex surface, and mean value of DNE on apex. Estimated age was divided into two phases to determine if the auricular surface is under 67 years or over 63 years (Table 5 and Figure 5). The age phases for each of the three types of analysis were designated in order of the lowest minimum value of the age range. The minimum values of age phases 1,2,3 and 4 in PCQDA were 20,31,63 and 76, respectively, and minimum values of age phases 1 and 2 in PCLR were 20 and 63, respectively. When converting the variables to principal components, the first and second principal components contained 98%, 96%, and 96% information in PCQDA, PCLR, and PCR methods, respectively. Thus, the first and second principal components were used as variables. Then, the statistical models were developed using cross-validation. The training and test sets contained 80% and 20% of the data set randomly.

Accuracy, bias, and the average absolute deviation

The training set, test set, and combined set showed similar results. Accordingly, the results were described based on the combined set. Accuracy was defined as JSDNE age estimates that fell within the age phase that contained the known age-at-death. Bias was calculated as the mean value of the subtracted values between the mean age of the estimated age phase and the known age at death. The average absolute deviation (AAD) was the mean value of the absolute values subtracted between the mean age of the estimated age phase and the known age at death:

\[Bias=\sum(mean \ age \ of \ the \ estimated \ age \ phase - known \ age \ at \ death)/N\] \[AAD=\sum|mean \ age \ of \ the \ estimated \ age \ phase - known \ age \ at \ death|/N\]

Summary of the Results of PCQDA model

The total accuracy and the average absolute deviation (AAD) are 91% and 9.58 years, respectively (Table 6). This demonstrates that the PCQDA function estimates age with a high level of accuracy and precision. Phase 2, which estimated individuals to be over 31 and under 74 years of age, had the lowest accuracy (88.8%) and the highest AAD (11.52 years). The highest accuracy and the lowest AAD were shown in phase 1 (20-44 years of age), with 97.8% and 5.53 years, respectively. While the cluster analysis identified this four-phase grouping system as the best fit for the prediction model, levels of accuracy were also examined using 10-year age intervals (Table 7), in order to compare the relative applicability of each of the functions across age cohorts. This showed that there was a tendency to underestimate the age of older individuals (those in their 60s and older) and overestimate the age of younger and middle aged individuals (in their 50s and younger). Individuals in their 20s were estimated with the lowest level of accuracy (65.9%). The highest AAD (94.9 years) was in their 30s. The highest accuracy levels were seen for individuals estimated to be in their 40s (98.2%).

Summary of the Results of PCR model

PCR Accuracy was defined as JSDNE age estimates that fell within the standard error (8.8yrs) of the known age-at-death. The total accuracy and ADD are 76% and 6.46 years, respectively (Table 8). When collated into 10-year age intervals, the accuracy levels range from poor to moderate, but this is due to the high precision of the age estimate in the majority of intervals. Individuals in their 20s and over 80 were predicted with the lowest levels of accuracy (52.4% and 56.3%), and highest levels of AAD (9.37 years and 9.4 years). The highest accuracy (92.2%) was found for individuals estimated to be in their 50s. As with the PCQDA function, there was a tendency for younger and middle aged adults (those in their 40s and younger) to be overestimated, while the age of individuals over 50 was underestimated.

Summary of the Results of PCLR model

Accuracy was defined as JSDNE age estimates that fell within the age phase that contained the known age-at-death. PCLR produced a high level of accuracy overall (92.9%), and high precision (9.93 years) (Table 9). Phase 1 has higher accuracy (95.2%) than phase 2 (90.2%). However, phase 2 has higher precision than phase 1, indicated by the lower AAD value (11.48 years vs. 8.09 years). When classifying individuals with 10-year age intervals (Table 10), the highest accuracy was shown for individuals in their 30s (100%), and the lowest accuracy for individuals in their 60s (80.6%). In terms of precision, individuals in 20s had the lowest precision (ADD: 20.45 years), and those in their 40s had the highest precision (ADD: 2.68 years). Yet again, individuals under 40 years of age tended to be overestimated, while individuals over 40 were underestimated.

Discussion

Initial analyses involving the DNE values and known age-at-death revealed ten DNE variables that were highly correlated with age (Table 3 and Figure 3). The majority of these variables had positive correlations, demonstrating that the auricular surface becomes more curved with increasing adult age. The only variable to show a negative correlation was the proportion of polygon faces with DNE of less than 0.0001, which demonstrates that flatter areas of the auricular surface (the number of polygons that are mathematically flat as a product of nv and nu), become reduced with increasing age as the surface becomes more irregular. These findings are in agreement with traditional methods of age estimation, which have long been aware that areas of macroporosity and osteophytic growth become more visible on the auricular surfaces of older individuals (Buckberry and Chamberlain, 2002; Lovejoy et al., 1985). Previous research with curvature analysis has suggested that young adults have high curvature values (Jang et al., 2024; Villa et al., 2015), a finding which is not echoed here. It is well known that the auricular surfaces of younger adults display distinctive areas of transverse organization over a large proportion of the surface (Buckberry and Chamberlain, 2002; Lovejoy et al., 1985) and it was previously thought that this feature contributed to the high levels of curvature reported by Jang et al. (2024) and Villa et al. (2015). However, while transverse organization is visible on the reconstructed models used in the current study, it appears to have had no measurable effect on curvature. It is possible that results for younger adults in Jang et al. (2024) and Villa et al. (2015) are highly affected by the presence of subchondral defects (which produce very high DNE values (>0.6)). While subchondral defects are present in all age groups, they have a limited on impact on the variance and median DNE values when the sample is large. However, when the sample is small, the inclusion of several individuals with subchondral defects can increase the variance dramatically and raise median DNE values for the entire group, leading to improper generalisation due to the small sample size. These results demonstrate that variables relating to a number of the morphological, degenerative features used in traditional age estimation methods can be identified through DNE analysis. However, instead of relying on the experience of the observer, DNE can quantify mathematical aspects of surface geometry that are correlated with age, reducing the subjectivity and user-dependence. The JSDNE R package uses DNE variables that are correlated with age-at-death to estimate age for unknown auricular surfaces. Three functions are available; PCQDA, PCR, and PCLR. The total accuracy of the PCQDA function produces a high level of accuracy (91%), likely a result of wide age ranges of each age phase (Millán et al., 2013). The PCLR function also produces the highest level of accuracy (92.9%), but is used to determine whether an individual is under 67 or over 63 years of age. The PCR function produces a narrower age range (SE: 8.8yrs) with the high level of accuracy (76%). These results are similar to the accuracy rates gained by the majority of researchers who carry out validation studies of the Buckberry and Chamberlain (94% - 86%) and Lovejoy (63% - 27%) methods of traditional analysis (Hughes et al., 2024; Milán et al., 2013; Rissech, et al., 2012). However, while the traditional methods rely on subjectivity and the researchers own experience in assigning an individual to an age phase (Štepanovský et al., 2023), the JSDNE results are completely objective and have excellent levels of repeatability. When choosing which function to use, the nature of the research must be considered. PCQDA and PCLR may be more suitable for estimating age in archaeological individuals as these functions can identify broad age ranges that correspond to various stages of the lifecourse with moderate to high levels of accuracy. For forensic cases, which require a more precise estimate of age-at-death, the PCR function may be more appropriate. It is also possible to use all three functions in conjunction with one another. Although able to provide an objective and user-independent estimate of age-at-death, these results still demonstrate a degree of bias, as the majority of the functions tended to overestimate the age of younger individuals and underestimate the age of older individuals. This is a common tendency that has been observed in other studies (Falys et al., 2006; Moraitis et al., 2014; Navitainuck et al., 2022; Rissech et al., 2012; Rivera-Sandoval et al., 2018; Štepanovský et al., 2023), and relates to the age structure of skeletal samples used to develop the methods. Generally, the number of younger and older adults is fewer than those who are middle-aged, meaning that a wider range of variation is captured for the middle-aged auricular surfaces and the resulting age estimates are also biased towards this age group. There is currently no way to alleviate this problem with the known-age skeletal collections that are available worldwide. However, profile likelihood analysis could be a good alternative to reduce the underestimation and overestimation issue with predicted ages, albeit with broader prediction intervals. However, there is no R package for profile likelihood, so this method of analysis would have to be performed in another statistical programme.

Conclusion

The present research successfully identified ten geometric variables of the auricular surface that were associated with known age-at-death and used these to create an objective, mathematical age estimation method that is available as a novel R package, JSDNE (https://cran.r-project.org/package=JSDNE). The detailed code and the raw data developed the findings of this study are openly available at https://github.com/jisunjang19/cran-JSDNE or doi: 10.5281/zenodo.12708779. This estimates age using three functions which offers the high levels of accuracy (91-92.9%). This method does not require a great degree of experience to gather and interpret the results, only familiarity with R studio and access to a laser scanner. Moreover, the results can be applied to research in bioarchaeology and forensic anthropology.

Acknowledgements

We would like to appreciate to Dr. Maria Susana De Jesus Garcia with the Luís Lopes Skeletal Collection, curated by the National Museum of Natural History and Science (MUHNAC), in Lisbon, Portugal, Dr. Maria Teresa Ferreira with the 21st Century Identified Skeletal Collection, curated at the University of Coimbra, in Portugal, Dr. Mirko Mattia and Dr. Cristina Cattaneo with the CAL Milano Cemetery Skeletal Collection, curated at the University of Milan, in Italy, allowing to access the collections.

Code Summary

PCQDA_result()

PCQDA method involves four age phases: age phase 1 (age range:20-44), age phase 2 (age range:31-74), age phase 3 (age range:63-93), and age phase (age range: over 76). The result is the estimated age phase and its age range. This is the example how to operate the PCQDA model. The WholeSurface and Apex files are imported ply file of the scanned auricular surface. Both file can be downloaded though https://github.com/jisunjang19/cran-JSDNE (in data folder). DNE variables (MeanDNE.Apex, TotalDNE.TotalPolygonFaces, Proportion.DNEunder0.0001, Proportion.DNEover0.6) are calcaulted automatically in the PCQDA_result() function. Then, the function is estimated the age phase.The output is the estimated age phase and age range.

library(JSDNE)
PCQDA_output <- PCQDA_result(WholeSurface, Apex)
#> Total Surface DNE = 665.766 
#> Convex DNE= 367.5897 
#> Concave DNE= 298.1763 
#> Total Surface DNE = 209.9028 
#> Convex DNE= 140.7082 
#> Concave DNE= 69.19456

PCR_result()

PCR method estimates the age as the regression analysis. The result is the estimated age and the standard error (8.8 yrs).

This is the example how to operate the PCR model. The WholeSurface and Apex files are imported ply file of the scanned auricular surface. Both file can be downloaded though https://github.com/jisunjang19/cran-JSDNE (in data folder). DNE variables (MeanDNE.Apex, IQRDNE.Apex, TotalDNE.TotalPolygonFaces, MeanDNE.Convex, Proportion.DNEunder0.0001) are calcaulted automatically in the PCR_result() function. Then, the function is estimated the age. The output is the estimated age with standard error (8.8 yrs).

library(JSDNE)
PCR_output <- PCR_result(WholeSurface, Apex)
#> Total Surface DNE = 665.766 
#> Convex DNE= 367.5897 
#> Concave DNE= 298.1763 
#> Total Surface DNE = 209.9028 
#> Convex DNE= 140.7082 
#> Concave DNE= 69.19456

PCLR_result()

PCLR method involves two age phases: age phase 1 (age range: under 67), and age phase 2 (age range: over 63). The result is the estimated age phase and its age range. This is the example how to operate the PCLR model. The WholeSurface and Apex files are imported ply file of the scanned auricular surface. Both file can be downloaded though https://github.com/jisunjang19/cran-JSDNE (in data folder). DNE variables (TotalDNE.TotalPolygonFaces, MeanDNE.Apex, MeanDNE.Convex, MedianDNE.Whole, IQRDNE.Whole) are calculated automatically in the PCR_result() function. Then, the function is estimated the age phase. The output is the estimated age phase and age range.

library(JSDNE)
PCR_output <- PCR_result(WholeSurface, Apex)
#> Total Surface DNE = 665.766 
#> Convex DNE= 367.5897 
#> Concave DNE= 298.1763 
#> Total Surface DNE = 209.9028 
#> Convex DNE= 140.7082 
#> Concave DNE= 69.19456

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.