This document describes methods for quantitative analysis implemented within powdR via a range of reproducible examples that use open source data from the package.

One of the most powerful properties of XRPD data is that the intensities of crystalline (e.g., quartz, calcite and gypsum), disordered (e.g., clay minerals), and amorphous (e.g., volcanic glass and organic matter) signals within a diffractogram can be related to their concentrations within the mixture. This principle facilitates the quantification of phase concentrations from XRPD data.

Of the approaches available for quantitative XRPD analysis, the simple Reference Intensity Ratio (RIR) method has consistently proven accurate. A RIR is a measure of the diffracting power of a phase relative to that of a standard (most often corundum, Al₂O₃), usually measured in a 50:50 mixture by weight. The RIR of a detectable phase within a mixture is required for its quantification.

A given diffractogram can be modeled as the sum of pure diffractograms for all detectable phases, each scaled by different amounts (scaling factors). By combining these scaling factors with RIRs, phase concentrations can be calculated. Hereafter this approach is referred to as full pattern summation. Full pattern summation is particularly suitable for mixtures containing crystalline mineral components in combination with disordered and/or X-ray amorphous phases (e.g. soil), and further details on its implementation in powdR are provided in Butler and Hillier (2021b).
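
As a rough illustration of this idea, the base R sketch below builds two synthetic "pure" patterns, mixes them with known scaling factors, recovers those factors by ordinary least squares, and converts them to weight percentages. The Gaussian peak shapes, the peak positions, the RIR values and the helper name gauss_peak() are all hypothetical. The conversion assumes consistently normalised reference patterns and that all phases sum to 100 %; it is a sketch of the principle, not the powdR implementation.

```r
# Hypothetical helper: a single Gaussian peak on a 2theta axis
gauss_peak <- function(tth, centre, height, width = 0.15) {
  height * exp(-0.5 * ((tth - centre) / width)^2)
}

tth <- seq(5, 65, by = 0.02)

# Two synthetic "pure" reference patterns (made-up peak positions)
ref_a <- gauss_peak(tth, 20.9, 200) + gauss_peak(tth, 26.6, 1000)
ref_b <- gauss_peak(tth, 29.4, 1000) + gauss_peak(tth, 39.4, 180)
refs <- cbind(A = ref_a, B = ref_b)

# A noise-free mixture built from known scaling factors
true_scale <- c(A = 0.6, B = 0.4)
mixture <- as.numeric(refs %*% true_scale)

# Recover the scaling factors by ordinary least squares
scale_fit <- qr.solve(refs, mixture)

# Hypothetical RIRs: divide each scaling factor by its RIR, then normalise
rir <- c(A = 4, B = 1)
wt_percent <- (scale_fit / rir) / sum(scale_fit / rir) * 100

round(scale_fit, 3)
round(wt_percent, 2)
```

Because phase A is assigned the larger RIR in this toy case, its weight percentage is lower than its share of the scaling factors would suggest.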

1 Full pattern summation with powdR

1.1 The `powdRlib` object

A key component of the full pattern summation functions within powdR is the library of reference patterns. These are stored within a powdRlib object created from two basic components using the powdRlib() constructor function. The first component, specified via the xrd_table argument of powdRlib(), is a data frame of the count intensities of the reference patterns, with their 2θ axis as the first column. The column for a given reference pattern must be named using a unique identifier (a phase ID). An example of such a format is provided in the minerals_xrd data:

library(powdR)

data(minerals_xrd)

head(minerals_xrd)
#>       tth QUA.1 QUA.2 FEL ORT SAN ALB OLI DOL.1 DOL.2  ILL KAO GOE.1 GOE.2  ORG
#> 1 4.00973    69    91 546 599 638 308 343   268   362 3078 525  3549 10000 3225
#> 2 4.04865    69    92 524 570 609 294 332   256   345 2960 500  3511  9592 3180
#> 3 4.08757    64    86 505 555 582 286 328   250   343 2888 486  3401  9323 3135
#> 4 4.12649    64    83 512 543 558 277 310   247   327 2753 474  3290  9042 3092
#> 5 4.16541    62    83 478 518 536 275 304   241   318 2718 478  3194  9248 3050
#> 6 4.20433    60    81 459 514 517 261 298   228   314 2720 447  3113  8557 3010

The second component required to build a powdRlib object, specified via the phases_table argument of powdRlib(), is a data frame containing 3 columns in the following order.

phase_id: a string of unique IDs corresponding to the names of each reference pattern in the data provided to the xrd_table argument outlined above.
phase_name: the name of the phase group that this reference pattern belongs to (e.g. quartz, plagioclase, illite etc.).
rir: the reference intensity ratios of the reference patterns (relative to a known standard, usually corundum).

An example of the format required for the phases_table argument of powdRlib() is provided in the minerals_phases data.

data(minerals_phases)

minerals_phases
#>    phase_id     phase_name  rir
#> 1     QUA.1         Quartz 4.62
#> 2     QUA.2         Quartz 4.34
#> 3       FEL     K-feldspar 0.75
#> 4       ORT     K-feldspar 1.03
#> 5       SAN     K-feldspar 0.93
#> 6       ALB    Plagioclase 1.31
#> 7       OLI    Plagioclase 1.06
#> 8     DOL.1       Dolomite 2.35
#> 9     DOL.2       Dolomite 2.39
#> 10      ILL         Illite 0.22
#> 11      KAO      Kaolinite 0.91
#> 12    GOE.1       Goethite 0.93
#> 13    GOE.2       Goethite 0.37
#> 14      ORG Organic-Matter 0.07

Crucially, when building the powdRlib object, all phase IDs in the first column of the phases_table must match the column names of the xrd_table (excluding the name of the first column, which is the 2θ axis). For example:

identical(names(minerals_xrd[-1]),
          minerals_phases$phase_id)
#> [1] TRUE
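
When the check returns FALSE it helps to know which IDs are at fault. The sketch below uses a hypothetical helper, check_phase_ids(), written in base R for illustration only (it is not part of powdR), together with toy tables in which the RIR given for "CAL" is made up.

```r
# Hypothetical helper: report phase IDs that appear in only one of the tables
check_phase_ids <- function(xrd_table, phases_table) {
  xrd_ids <- names(xrd_table)[-1]               # drop the 2theta column
  phase_ids <- as.character(phases_table[[1]])  # first column holds the IDs
  list(missing_from_phases = setdiff(xrd_ids, phase_ids),
       missing_from_xrd = setdiff(phase_ids, xrd_ids),
       same_order = identical(xrd_ids, phase_ids))
}

# Toy tables with a deliberate mismatch ("CAL" has no reference pattern)
toy_xrd <- data.frame(tth = c(4.00, 4.04, 4.08),
                      QUA = c(69, 69, 64),
                      ALB = c(308, 294, 286))
toy_phases <- data.frame(phase_id = c("QUA", "ALB", "CAL"),
                         phase_name = c("Quartz", "Plagioclase", "Calcite"),
                         rir = c(4.62, 1.31, 3.00),  # 3.00 is a made-up value
                         stringsAsFactors = FALSE)

res <- check_phase_ids(toy_xrd, toy_phases)
res$missing_from_xrd
res$same_order
```

The same_order element mirrors the identical() check above, which is sensitive to the order of the IDs as well as their spelling.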

Once created, powdRlib objects can easily be visualised using the associated plot() method (see ?plot.powdRlib), which accepts the wavelength, refs and interactive arguments that are used to specify the X-ray wavelength, the reference patterns to plot, and the output format, respectively. In all cases where plot() is used in this document, the use of interactive = TRUE in the function call will produce an interactive html graph that can be viewed in RStudio or a web browser.

my_lib <- powdRlib(minerals_xrd, minerals_phases)

plot(my_lib, wavelength = "Cu",
     refs = c("ALB", "DOL.1",
              "QUA.1", "GOE.2"),
     interactive = FALSE)

Figure 1.1: Plotting selected reference patterns from a powdRlib object.

1.1.1 Pre-loaded `powdRlib` objects

There are three powdRlib objects provided as part of the powdR package:

minerals [accessed via data(minerals)], which is a simple and low resolution library designed to facilitate fast computation of basic examples.
rockjock [accessed via data(rockjock)], which is a comprehensive library of 168 reference patterns covering most phases that might be encountered in geological and soil samples. The rockjock library in powdR uses data from the original RockJock program (Eberl 2003) thanks to the permission of Dennis Eberl. In rockjock, each reference pattern from the original RockJock program has been scaled to a maximum intensity of 10000 counts, and the RIRs normalised relative to corundum. All rockjock data were analysed using Cu Kα radiation.
afsis [accessed via data(afsis)], which contains 21 reference patterns measured on a Bruker D2 Phaser as part of the XRPD data analysis undertaken for the Africa Soil Information Service Sentinel Site programme. These are designed to supplement the rockjock library when analysing soil XRPD data.

To accompany the rockjock reference library, a list of eight synthetic mixtures from the original RockJock program is also included in powdR in the rockjock_mixtures data [accessed via data(rockjock_mixtures)], and the known compositions of these mixtures are provided in the rockjock_weights data [accessed via data(rockjock_weights)].

1.1.2 Subsetting a `powdRlib` object

Occasionally it may be useful to subset a reference library to a smaller selection. This can be achieved using subset(), which for powdRlib objects accepts three arguments: x, refs and mode (see ?subset.powdRlib). The x argument specifies the powdRlib object to be subset, refs specifies the IDs and/or names of phases to select, and mode specifies whether these phases are kept (mode = "keep") or removed (mode = "remove").

data(rockjock)

#Have a look at the phase IDs in rockjock
rockjock$phases$phase_id[1:10]
#>  [1] "CORUNDUM"                "BACK_POS"               
#>  [3] "BACK_NEG"                "QUARTZ"                 
#>  [5] "ORDERED_MICROCLINE"      "INTERMEDIATE_MICROCLINE"
#>  [7] "SANIDINE"                "ORTHOCLASE"             
#>  [9] "ANORTHOCLASE"            "ALBITE_CLEAVELANDITE"

#Remove reference patterns from rockjock
rockjock_1 <- subset(rockjock,
                     refs = c("ALUNITE", #phase ID
                              "AMPHIBOLE", #phase ID
                              "ANALCIME", #phase ID
                              "Plagioclase"), #phase name
                     mode = "remove")

#Check number of reference patterns remaining in library
nrow(rockjock_1$phases)
#> [1] 157

#Keep certain reference patterns of rockjock
rockjock_2 <- subset(rockjock,
                     refs = c("ALUNITE", #phase ID
                              "AMPHIBOLE", #phase ID
                              "ANALCIME", #phase ID
                              "Plagioclase"), #phase name
                     mode = "keep")

#Check number of reference patterns remaining
nrow(rockjock_2$phases)
#> [1] 11
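
The two modes are complementary: every reference pattern ends up in exactly one of the two subsets, which is why the counts above (157 and 11) add up to the size of the full library. The base R sketch below mimics this selection logic on a toy phases table. The helper select_phases() is hypothetical and only illustrates matching refs against either the phase ID or the phase name; it makes no claim about how subset() is implemented in powdR.

```r
# Toy phases table (IDs and names copied from the minerals_phases example)
phases <- data.frame(
  phase_id = c("QUA.1", "QUA.2", "ALB", "OLI", "ILL"),
  phase_name = c("Quartz", "Quartz", "Plagioclase", "Plagioclase", "Illite"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: match refs against either the ID or the name column
select_phases <- function(phases, refs, mode = c("keep", "remove")) {
  mode <- match.arg(mode)
  hit <- phases$phase_id %in% refs | phases$phase_name %in% refs
  if (mode == "keep") phases[hit, ] else phases[!hit, ]
}

refs <- c("QUA.1",        # a phase ID
          "Plagioclase")  # a phase name, which selects ALB and OLI

kept <- select_phases(phases, refs, mode = "keep")
removed <- select_phases(phases, refs, mode = "remove")

kept$phase_id
removed$phase_id
```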

1.1.3 Interpolating and merging `powdRlib` objects

Two powdRlib objects from different instruments can be interpolated and then merged using the interpolate and merge methods (see ?interpolate.powdRlib and ?merge.powdRlib), respectively. For example, the minerals library can be merged with the rockjock library after interpolation using:

#Load the minerals library
data(minerals)

#Check the number of reference patterns
nrow(minerals$phases)
#> [1] 14

#Load the rockjock library
data(rockjock)

#Check the number of reference patterns
nrow(rockjock$phases)
#> [1] 168

#interpolate minerals library onto same 2theta as rockjock
minerals_i <- interpolate(minerals, new_tth = rockjock$tth)

#merge the libraries
merged_lib <- merge(rockjock, minerals_i)

#Check the number of reference patterns in the merged library
nrow(merged_lib$phases)
#> [1] 182
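
To show what placing a pattern on a new 2θ axis involves, the base R sketch below resamples a coarse toy pattern onto a finer axis. It assumes simple linear interpolation via approx(); the method used internally by interpolate() may well differ, and the toy pattern is made up.

```r
# A coarse toy pattern on its own 2theta axis
tth_old <- seq(10, 20, by = 0.5)
counts_old <- 100 + 50 * exp(-0.5 * ((tth_old - 15) / 1)^2)

# The target axis (standing in for the axis of another library)
tth_new <- seq(10, 20, by = 0.2)

# Linear interpolation onto the new axis; rule = 2 holds the end values
# constant if the new axis extends marginally beyond the old one
counts_new <- approx(x = tth_old, y = counts_old,
                     xout = tth_new, rule = 2)$y

length(counts_old)
length(counts_new)
```

After this step the pattern has one intensity per point of the target axis, which is the precondition for combining it with patterns from the other library.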

In simpler cases where two libraries are already on the same 2θ axis and were measured using the same instrumental parameters, only the use of merge() would be required.

#Load the afsis library
data(afsis)

identical(rockjock$tth, afsis$tth)
#> [1] TRUE

rockjock_afsis <- merge(rockjock, afsis)

1.2 Full pattern summation with `fps()`

Once you have a powdRlib reference library and diffractogram(s) loaded into R, you have everything needed for quantitative analysis via full pattern summation. Full pattern summation in powdR is provided via the fps() function, whilst an automated version is provided in afps(). Details on these functions are provided in Butler and Hillier (2021a) and Butler and Hillier (2021b).

fps() is specifically applied to powdRlib objects, and accepts a wide range of arguments that are detailed in the package documentation (see ?fps.powdRlib). Here the rockjock and rockjock_mixtures data will be used to demonstrate the main features of fps() and the various ways in which it can be used.

1.2.1 Full pattern summation with an internal standard

Often samples are prepared for XRPD analysis with an internal standard of known concentration. If this is the case, then the std and std_conc arguments of fps() can be used to define the internal standard and its concentration (in weight %), respectively, which is then used in combination with the reference intensity ratios to compute phase concentrations. For example, all samples in the rockjock_mixtures data were prepared with 20 % corundum as the internal standard, thus this can be specified using std = "CORUNDUM" and std_conc = 20 in the call to fps(). In addition, setting the omit_std argument to TRUE makes sure that the internal standard concentration will be omitted from the output and the phase concentrations recomputed accordingly. In such cases the phase specified as the internal standard can also be used in combination with the value specified in the align argument to ensure that the measured diffractogram is appropriately aligned on the 2θ axis. These principles are used in the example below, which passes the following seven arguments to fps():

lib is used to define the powdRlib object containing the reference patterns and their RIRs.
smpl is used to define the data frame or XY object containing the sample diffractogram.
refs is used to define a string of phase IDs (lib$phases$phase_id) and/or phase names (lib$phases$phase_name) of the reference patterns to be used in the fitting process.
std is used to define the phase ID of the reference pattern to be used as the internal standard.
std_conc is used to define the concentration of the internal standard in weight %.
omit_std is used to define whether the internal standard is omitted from the output and phase concentrations recomputed accordingly.
align is used to define the maximum positive or negative shift in 2θ that is permitted during alignment of the sample to the reference pattern that is specified in the std argument.

data(rockjock_mixtures)

fit1 <- fps(lib = rockjock,
            smpl = rockjock_mixtures$Mix5,
            refs = c("ORDERED_MICROCLINE",
                     "Plagioclase",
                     "KAOLINITE_DRY_BRANCH",
                     "MONTMORILLONITE_WYO",
                     "CORUNDUM",
                     "QUARTZ"),
            std = "CORUNDUM",
            std_conc = 20,
            omit_std = TRUE,
            align = 0.3)
#> 
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Removing negative coefficients and reoptimising...
#> -Removing negative coefficients and reoptimising...
#> -Computing phase concentrations
#> -Using internal standard concentration of 20 % to compute phase concentrations
#> -Omitting internal standard from phase concentrations
#> ***Full pattern summation complete***

Once computed, the fps() function produces a powdRfps object, which is a bundle of data in list format that contains the outputs (see ?fps.powdRlib).

summary(fit1)
#>                        Length Class      Mode   
#> tth                    2992   -none-     numeric
#> fitted                 2992   -none-     numeric
#> measured               2992   -none-     numeric
#> residuals              2992   -none-     numeric
#> phases                    4   data.frame list   
#> phases_grouped            2   data.frame list   
#> obj                       3   -none-     numeric
#> weighted_pure_patterns    9   data.frame list   
#> coefficients              9   -none-     numeric
#> inputs                   16   -none-     list

The phase concentrations can be accessed in the phases or phases_grouped data frames of the powdRfps object:

#All phases
fit1$phases
#>               phase_id    phase_name       rir phase_percent
#> 1             CORUNDUM      Corundum 1.0000000            NA
#> 2               QUARTZ        Quartz 3.5404393     24.955000
#> 3   ORDERED_MICROCLINE    K-feldspar 0.9654312     40.098375
#> 4         ANORTHOCLASE   Plagioclase 0.5804293      3.612375
#> 5             ANDESINE   Plagioclase 0.8206422      2.969625
#> 6          LABRADORITE   Plagioclase 0.8113040      3.350375
#> 7            ANORTHITE   Plagioclase 0.5294816      2.485875
#> 8 KAOLINITE_DRY_BRANCH     Kaolinite 0.5812875      5.302750
#> 9  MONTMORILLONITE_WYO Smectite (Di) 0.3202779     12.908250

#Phases grouped and summed by the phase name
fit1$phases_grouped
#>      phase_name phase_percent
#> 1      Corundum            NA
#> 2        Quartz      24.95500
#> 3    K-feldspar      40.09837
#> 4   Plagioclase      12.41837
#> 5     Kaolinite       5.30275
#> 6 Smectite (Di)      12.90825

Further, notice that when the concentration of the internal standard is specified then the phase concentrations do not necessarily sum to 100 %:

sum(fit1$phases$phase_percent, na.rm = TRUE)
#> [1] 95.68263

It’s also possible to “close” the mineral composition so that the weight percentages sum to 100. This can be achieved in two ways:

By defining closed = TRUE in the fps() function call.
By applying the close_quant() function to the powdRfps output.

For example, the phase composition in fit1 created above can be closed using:

fit1c <- close_quant(fit1)

sum(fit1c$phases$phase_percent, na.rm = TRUE)
#> [1] 100
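
The arithmetic behind closure can be sketched in base R using the phase percentages printed for fit1 above. This assumes that closing simply rescales the non-missing percentages by their sum; close_quant() may handle further details that are not shown here.

```r
# Phase percentages from fit1$phases above
# (the omitted internal standard is NA)
phase_percent <- c(CORUNDUM = NA,
                   QUARTZ = 24.955000,
                   ORDERED_MICROCLINE = 40.098375,
                   ANORTHOCLASE = 3.612375,
                   ANDESINE = 2.969625,
                   LABRADORITE = 3.350375,
                   ANORTHITE = 2.485875,
                   KAOLINITE_DRY_BRANCH = 5.302750,
                   MONTMORILLONITE_WYO = 12.908250)

# Assumed closure: rescale so that the non-missing values sum to 100
closed <- phase_percent / sum(phase_percent, na.rm = TRUE) * 100

round(closed, 2)
sum(closed, na.rm = TRUE)
```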

1.2.2 Full pattern summation without an internal standard

In cases where an internal standard is not added to a sample, phase quantification can be achieved by assuming that all detectable phases can be identified and that they sum to 100 weight %. By setting the std_conc argument of fps() to NA, or leaving it out of the function call, it will be assumed that the sample has been prepared without an internal standard and the phase concentrations computed accordingly.

fit2 <- fps(lib = rockjock,
            smpl = rockjock_mixtures$Mix5,
            refs = c("ORDERED_MICROCLINE",
                     "Plagioclase",
                     "KAOLINITE_DRY_BRANCH",
                     "MONTMORILLONITE_WYO",
                     "CORUNDUM",
                     "QUARTZ"),
            std = "CORUNDUM",
            align = 0.3)
#> 
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Removing negative coefficients and reoptimising...
#> -Removing negative coefficients and reoptimising...
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***

In this case the phase specified in the std argument is only used for 2θ alignment, and is always included in the computed phase concentrations.

fit2$phases
#>               phase_id    phase_name       rir phase_percent
#> 1             CORUNDUM      Corundum 1.0000000       20.7155
#> 2               QUARTZ        Quartz 3.5404393       20.6782
#> 3   ORDERED_MICROCLINE    K-feldspar 0.9654312       33.2262
#> 4         ANORTHOCLASE   Plagioclase 0.5804293        2.9933
#> 5             ANDESINE   Plagioclase 0.8206422        2.4607
#> 6          LABRADORITE   Plagioclase 0.8113040        2.7762
#> 7            ANORTHITE   Plagioclase 0.5294816        2.0599
#> 8 KAOLINITE_DRY_BRANCH     Kaolinite 0.5812875        4.3940
#> 9  MONTMORILLONITE_WYO Smectite (Di) 0.3202779       10.6961

Furthermore, the phase concentrations computed using this approach will always sum to 100 %.

sum(fit2$phases$phase_percent)
#> [1] 100.0001
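
The fit1 and fit2 outputs are related by simple arithmetic, which the base R sketch below reproduces from the printed fit2 percentages. It assumes that both fits rest on the same scaling coefficients, so that the internal standard approach amounts to rescaling the result until corundum equals its known 20 % and then re-expressing the remaining phases relative to the standard-free sample. This is an interpretation of the printed numbers rather than a description of the powdR source code.

```r
# Phase percentages from fit2$phases above (corundum included, sum to 100 %)
fit2_percent <- c(CORUNDUM = 20.7155, QUARTZ = 20.6782,
                  ORDERED_MICROCLINE = 33.2262, ANORTHOCLASE = 2.9933,
                  ANDESINE = 2.4607, LABRADORITE = 2.7762,
                  ANORTHITE = 2.0599, KAOLINITE_DRY_BRANCH = 4.3940,
                  MONTMORILLONITE_WYO = 10.6961)

std_conc <- 20  # known corundum addition in weight %

# Rescale so that the internal standard equals its known concentration
scaled <- fit2_percent * std_conc / fit2_percent[["CORUNDUM"]]

# Drop the standard and express the rest relative to the standard-free sample
omit_std <- scaled[names(scaled) != "CORUNDUM"] / (100 - std_conc) * 100

round(omit_std, 2)
sum(omit_std)
```

Within rounding of the printed inputs, these values agree with the phase_percent column of fit1 and with its sum of roughly 95.68 %.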

1.2.3 Non-negative least squares

The fitted patterns resulting from full pattern summation are most commonly derived by minimising an objective function. This process is computationally intensive and can therefore prove slow when a large number of scaling coefficients (i.e. a large number of reference patterns) are used. As a fast alternative to this approach, non-negative least squares [NNLS; Mullen and van Stokkum (2012)] is also implemented in fps() and can be defined using the solver argument:

#Create a timestamp
a <- Sys.time()

fit2_n <- fps(lib = rockjock,
              smpl = rockjock_mixtures$Mix5,
              refs = c("ORDERED_MICROCLINE",
                       "Plagioclase",
                       "KAOLINITE_DRY_BRANCH",
                       "MONTMORILLONITE_WYO",
                       "CORUNDUM",
                       "QUARTZ"),
              solver = "NNLS",
              std = "CORUNDUM",
              align = 0.3)
#> 
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Applying non-negative least squares
#> -Computing phase concentrations
#> -Internal standard concentration unknown. Assuming phases sum to 100 %
#> ***Full pattern summation complete***

#Calculate computation time
Sys.time() - a
#> Time difference of 0.2005031 secs

resulting in a computation time of less than half a second. Whilst the use of NNLS is fast, there is a small compromise in accuracy compared to the minimisation of an objective function (see Supplementary Material in Butler and Hillier 2021b).
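
To illustrate what the non-negativity constraint does, the base R sketch below fits three synthetic reference patterns to a mixture, once without constraints and once with all coefficients bounded at zero. Note the swapped-in technique: the constrained fit uses optim() with method "L-BFGS-B", not the Lawson-Hanson algorithm of the nnls package cited above. The peak positions and the small negative artefact added at the position of phase C are made up for illustration.

```r
# Hypothetical helper: a unit-height Gaussian peak
gauss_peak <- function(tth, centre, width = 0.15) {
  exp(-0.5 * ((tth - centre) / width)^2)
}

tth <- seq(20, 40, by = 0.02)
refs <- cbind(A = gauss_peak(tth, 26.6),
              B = gauss_peak(tth, 29.4),
              C = gauss_peak(tth, 33.2))

# Mixture of A and B, with a small negative artefact where C has its peak
mixture <- as.numeric(refs %*% c(0.7, 0.3, -0.05))

# Unconstrained least squares returns a negative coefficient for C
unconstrained <- qr.solve(refs, mixture)

# Sum of squared residuals and its gradient
sse <- function(b) sum((mixture - refs %*% b)^2)
sse_grad <- function(b) as.numeric(-2 * crossprod(refs, mixture - refs %*% b))

# Bounded optimisation: all coefficients constrained to be >= 0
fit <- optim(par = rep(0.1, ncol(refs)), fn = sse, gr = sse_grad,
             method = "L-BFGS-B", lower = 0)
coefs <- setNames(fit$par, colnames(refs))

round(unconstrained, 3)
round(coefs, 3)
```

In this toy case the constrained fit pins the coefficient of C at zero, which is the outcome that the "Removing negative coefficients and reoptimising" messages in the earlier output are working towards.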

1.3 Automated full pattern summation

The selection of suitable reference patterns for full pattern summation can often be challenging and time consuming. An attempt to automate this process is provided in the afps() function, which can select appropriate reference patterns from a reference library and subsequently exclude reference patterns based on limit of detection estimates. Such an approach is considered particularly advantageous when quantifying high-throughput XRPD datasets that display considerable mineralogical variation such as the Reynolds Cup (Butler and Hillier 2021a).

All of the principles and arguments outlined above for the fps() function also apply to the use of afps(). However, there are a few additional arguments for afps() that need to be defined:

force is used to specify phase IDs (lib$phases$phase_id) or phase names (lib$phases$phase_name) that must be retained in the output, even if their concentrations are estimated to be below the limit of detection or negative.
lod is used to define the limit of detection (LOD; in weight %) of the phase specified as the internal standard in the std argument. This limit of detection for the defined phase is then used in combination with the RIRs to estimate the LODs of all other phases (Butler and Hillier 2021b).
amorphous is used to specify which, if any, phases should be treated as amorphous. This is used because the assumptions used to estimate the LODs of crystalline and disordered phases are not appropriate for amorphous phases.
amorphous_lod is used to define the LOD (in weight %) of the phases specified in the amorphous argument.
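
The way in which the lod argument and the RIRs might combine can be sketched as follows. The text above does not give the exact formula, so the inverse proportionality used here (a phase with a lower RIR diffracts more weakly and therefore has a higher LOD) is an assumption made for illustration; Butler and Hillier (2021b) should be consulted for the relationship actually used by afps().

```r
# RIRs taken from the fit1$phases output shown earlier
rir <- c(CORUNDUM = 1.0000000,
         QUARTZ = 3.5404393,
         KAOLINITE_DRY_BRANCH = 0.5812875,
         MONTMORILLONITE_WYO = 0.3202779)

# LOD of the internal standard in weight % (the value given to lod)
lod_std <- 1

# ASSUMPTION: LOD scales inversely with RIR relative to the standard
lod_est <- lod_std * rir[["CORUNDUM"]] / rir

round(lod_est, 2)
```

Under this assumption a strongly diffracting phase such as quartz would be detectable at a lower concentration than the standard, whereas a weakly diffracting clay mineral would need a higher concentration.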

Here the rockjock library, containing 168 reference patterns, will be used to quantify one of the samples in the rockjock_mixtures data. Note that when using afps(), omission of the refs argument in the function call will automatically result in all phases from the reference library being used in the fitting process.

#Produce the fit
a_fit1 <- afps(lib = rockjock,
               smpl = rockjock_mixtures$Mix5,
               std = "CORUNDUM",
               align = 0.3,
               lod = 1)

1.4 Additional `fps()` and `afps()` functionality

1.4.1 Shifting of reference patterns

Both fps() and afps() accept a shift argument, which when set to a value greater than zero results in optimisation of a small 2θ shift for each reference pattern in order to improve the quality of the fit. The value supplied to the shift argument defines the maximum (either positive or negative) shift that can be applied to each reference pattern before the shift is reset to zero.

This shifting process is designed to correct for small linear differences in the peak positions of the standards relative to the sample, which may result from a combination of instrumental aberrations, mineralogical variation and/or uncorrected errors in the library patterns. Whilst this shifting routine provides more accurate results, the process can substantially increase computation time.
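
The idea of a bounded shift can be sketched in base R with a simple grid search: a synthetic reference peak is moved along the 2θ axis in small steps, rescaled to the sample at each step, and the shift with the lowest misfit is kept. The grid search, the peak shape and the variable names are assumptions made for illustration; fps() and afps() optimise the shift with their own routine.

```r
# Hypothetical helper: a unit-height Gaussian peak
gauss_peak <- function(tth, centre, width = 0.1) {
  exp(-0.5 * ((tth - centre) / width)^2)
}

tth <- seq(20, 30, by = 0.02)
reference <- gauss_peak(tth, 26.60)
smpl <- 0.8 * gauss_peak(tth, 26.66)  # sample peak displaced by +0.06 degrees

# Trial shifts within the permitted maximum (analogous to the shift argument)
max_shift <- 0.1
shifts <- seq(-max_shift, max_shift, by = 0.02)

# For each trial shift: move the reference, refit its scale, record the misfit
misfit <- sapply(shifts, function(s) {
  shifted <- approx(x = tth + s, y = reference, xout = tth, rule = 2)$y
  k <- sum(shifted * smpl) / sum(shifted^2)  # least squares scaling factor
  sum((smpl - k * shifted)^2)
})

best_shift <- shifts[which.min(misfit)]
best_shift
```

Each additional trial shift requires the reference pattern to be resampled and refitted, which hints at why enabling shifting for many reference patterns increases computation time.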

1.4.2 Regrouping phases in `powdRfps` and `powdRafps` objects

Occasionally it can be useful to apply a different grouping structure to the phases quantified within a powdRfps or powdRafps object. This can be achieved using the regroup function (see ?regroup.powdRfps and ?regroup.powdRafps):

#View the phases of the fit1 output
fit1$phases
#>               phase_id    phase_name       rir phase_percent
#> 1             CORUNDUM      Corundum 1.0000000            NA
#> 2               QUARTZ        Quartz 3.5404393     24.955000
#> 3   ORDERED_MICROCLINE    K-feldspar 0.9654312     40.098375
#> 4         ANORTHOCLASE   Plagioclase 0.5804293      3.612375
#> 5             ANDESINE   Plagioclase 0.8206422      2.969625
#> 6          LABRADORITE   Plagioclase 0.8113040      3.350375
#> 7            ANORTHITE   Plagioclase 0.5294816      2.485875
#> 8 KAOLINITE_DRY_BRANCH     Kaolinite 0.5812875      5.302750
#> 9  MONTMORILLONITE_WYO Smectite (Di) 0.3202779     12.908250

#Load the rockjock regrouping structure
data(rockjock_regroup)

#View the first 6 rows
head(rockjock_regroup)
#>                  phase_id phase_name_grouped phase_name_grouped2
#> 1                CORUNDUM           Corundum            Non-clay
#> 2                BACK_POS         Background          Background
#> 3                BACK_NEG         Background          Background
#> 4                  QUARTZ             Quartz            Non-clay
#> 5      ORDERED_MICROCLINE         K-feldspar            Non-clay
#> 6 INTERMEDIATE_MICROCLINE         K-feldspar            Non-clay

#Regroup the data in fit1 using the coarsest description
fit1_rg <- regroup(fit1, rockjock_regroup[c(1,3)])

#Check the regrouped data
fit1_rg$phases_grouped
#>   phase_name phase_percent
#> 1       Clay      18.21100
#> 2   Non-clay      77.47162

2 Plotting `powdRfps` and `powdRafps` objects

Plotting of powdRfps and powdRafps objects, derived from fps() and afps(), respectively, is achieved using plot() (see ?plot.powdRfps and ?plot.powdRafps).

plot(fit1, wavelength = "Cu", interactive = FALSE)

Figure 2.1: Example output from plotting a powdRfps or powdRafps object.

When plotting powdRfps or powdRafps objects the wavelength must be defined because it is required to compute d-spacings that are shown when interactive = TRUE.

In addition to the above, plotting for powdRfps and powdRafps objects can be further adjusted by the group, mode and xlim arguments. When the group argument is set to TRUE, the patterns within the fit are grouped and summed according to phase names, which can help simplify the plot:

plot(fit1, wavelength = "Cu",
     group = TRUE,
     interactive = FALSE)

Figure 2.2: Plotting a powdRfps or powdRafps object with the reference patterns grouped.

The mode argument can be one of "fit" (the default), "residuals" or "both", for example:

plot(fit1, wavelength = "Cu",
     mode = "residuals",
     interactive = FALSE)

Figure 2.3: Plotting the residuals of a powdRfps or powdRafps object.

or alternatively both the fit and residuals can be plotted using mode = "both" and the 2θ axis restricted using the xlim argument:

plot(fit1, wavelength = "Cu",
     mode = "both", xlim = c(20,30),
     interactive = FALSE)

Figure 2.4: Plotting both the fit and residuals of a powdRfps or powdRafps object.

3 Quantifying multiple samples

3.1 `lapply()`

The simplest way to quantify multiple samples via either fps() or afps() is to wrap the function in lapply() and supply a list of diffractograms. The following example wraps the fps() function in lapply() and applies it to the first two items within the rockjock_mixtures data.

multi_fit <- lapply(rockjock_mixtures[1:2], fps,
                    lib = rockjock,
                    std = "CORUNDUM",
                    refs = c("ORDERED_MICROCLINE",
                             "Plagioclase",
                             "KAOLINITE_DRY_BRANCH",
                             "MONTMORILLONITE_WYO",
                             "ILLITE_1M_RM30",
                             "CORUNDUM",
                             "QUARTZ"),
                    align = 0.3,
                    std_conc = 20,
                    omit_std = TRUE)
#> 
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Removing negative coefficients and reoptimising...
#> -Removing negative coefficients and reoptimising...
#> -Computing phase concentrations
#> -Using internal standard concentration of 20 % to compute phase concentrations
#> -Omitting internal standard from phase concentrations
#> ***Full pattern summation complete***
#> 
#> -Aligning sample to the internal standard
#> -Interpolating library to same 2theta scale as aligned sample
#> -Optimising...
#> -Removing negative coefficients and reoptimising...
#> -Removing negative coefficients and reoptimising...
#> -Computing phase concentrations
#> -Using internal standard concentration of 20 % to compute phase concentrations
#> -Omitting internal standard from phase concentrations
#> ***Full pattern summation complete***

When using lapply in this way, the names of the items within the list or multiXY object supplied to the function are inherited by the output:

identical(names(rockjock_mixtures[1:2]),
          names(multi_fit))
#> [1] TRUE

3.2 Parallel processing

Whilst lapply is a simple way to quantify multiple samples, the computation remains restricted to a single core. Computation time can be reduced many-fold by allowing different cores of your machine to process one sample at a time, which can be achieved using the doParallel and foreach packages, for example:

#Install the foreach and doParallel package
install.packages(c("foreach", "doParallel"))

#load the packages
library(foreach)
library(doParallel)

#Detect number of cores on machine
UseCores <- detectCores()

#Register the cluster using n - 1 cores 
cl <- makeCluster(UseCores-1)

registerDoParallel(cl)

#Use foreach loop and %dopar% to compute in parallel
multi_fit <- foreach(i = 1:2) %dopar%
  (powdR::fps(lib = rockjock,
               smpl = rockjock_mixtures[[i]],
               std = "CORUNDUM",
               refs = c("ORDERED_MICROCLINE",
                        "LABRADORITE",
                        "KAOLINITE_DRY_BRANCH",
                        "MONTMORILLONITE_WYO",
                        "ILLITE_1M_RM30",
                        "CORUNDUM",
                        "QUARTZ"),
               align = 0.3))

#name the items in the multi_fit list
names(multi_fit) <- names(rockjock_mixtures)[1:2]

#stop the cluster
stopCluster(cl)

Note how the call to fps uses the notation powdR::fps(), which specifies the accessing of the fps() function from the powdR package.

4 Summarising mineralogy

When multiple samples are quantified it is often useful to report the phase concentrations of all of the samples in a single table. For a given list of powdRfps and/or powdRafps objects, the summarise_mineralogy() function yields such summary tables, for example:

summarise_mineralogy(multi_fit, type = "grouped", order = TRUE)
#>   sample_id Plagioclase Smectite (Di) Kaolinite    Illite K-feldspar Quartz
#> 1      Mix1     25.0575     50.955000  14.65437  7.666375    3.77750     NA
#> 2      Mix2     44.7300      3.303375  24.91338 11.850625    8.57225 5.5575
#>   Corundum
#> 1       NA
#> 2       NA

where type = "grouped" denotes that phases with the same phase_name will be summed together, and order = TRUE specifies that the columns will be ordered from most common to least common (assessed by the sum of each column). Using type = "all" instead would result in tabulation of all phase IDs.
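
For readers curious about how such a table can be assembled, the base R sketch below builds a wide summary by hand from mock fit outputs. The mock_fits list is a stand-in that holds only a phases_grouped data frame per sample, filled with a subset of the values from the table above. It is not a real powdRfps object, and the code is not the implementation of summarise_mineralogy().

```r
# Mock fit outputs, each holding only a phases_grouped data frame
# (a subset of the values shown in the summary table above)
mock_fits <- list(
  Mix1 = list(phases_grouped = data.frame(
    phase_name = c("Plagioclase", "Smectite (Di)", "Kaolinite"),
    phase_percent = c(25.0575, 50.955, 14.65437),
    stringsAsFactors = FALSE)),
  Mix2 = list(phases_grouped = data.frame(
    phase_name = c("Plagioclase", "Smectite (Di)", "Kaolinite", "Quartz"),
    phase_percent = c(44.73, 3.303375, 24.91338, 5.5575),
    stringsAsFactors = FALSE))
)

# All phase names that occur in at least one sample
all_names <- unique(unlist(lapply(mock_fits, function(f) {
  f$phases_grouped$phase_name
})))

# One row per sample; phases absent from a sample become NA
wide <- t(sapply(mock_fits, function(f) {
  pg <- f$phases_grouped
  setNames(pg$phase_percent[match(all_names, pg$phase_name)], all_names)
}))

wide
```

The sample names are carried over as row names, and quartz is NA for Mix1 because it does not occur in that sample's mock output.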

In addition to the quantitative mineral data, three objective parameters that summarise the quality of the fit can be appended to the table via the logical rwp, r and delta arguments.

summarise_mineralogy(multi_fit, type = "grouped", order = TRUE,
                     rwp = TRUE, r = TRUE, delta = TRUE)
#>   sample_id Plagioclase Smectite (Di) Kaolinite    Illite K-feldspar Quartz
#> 1      Mix1     25.0575     50.955000  14.65437  7.666375    3.77750     NA
#> 2      Mix2     44.7300      3.303375  24.91338 11.850625    8.57225 5.5575
#>   Corundum       Rwp         R    Delta
#> 1       NA 0.1187660 0.1149397 39602.26
#> 2       NA 0.1212723 0.1068299 36312.52

For each of these parameters, lower values represent a smaller difference between the measured and fitted patterns, and hence are indicative of a better fit.
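
The sketch below shows one common way of defining three such statistics in base R. The definitions are assumptions that follow common usage in the XRPD literature, and the helper names (rwp_stat, r_stat, delta_stat) are hypothetical; the exact formulae used by summarise_mineralogy() should be checked against the package documentation. The measured and fitted values are made up.

```r
# ASSUMED definitions of the three fit statistics (hypothetical helpers)

# Weighted profile R-factor, with weights of 1 / measured
rwp_stat <- function(measured, fitted) {
  sqrt(sum((measured - fitted)^2 / measured) / sum(measured))
}

# Unweighted R-factor
r_stat <- function(measured, fitted) {
  sqrt(sum((measured - fitted)^2) / sum(measured^2))
}

# Sum of absolute differences between measured and fitted counts
delta_stat <- function(measured, fitted) {
  sum(abs(measured - fitted))
}

# Toy measured and fitted intensities (made-up counts)
measured <- c(100, 400, 900, 400, 100)
fitted <- c(110, 380, 930, 390, 100)

rwp_stat(measured, fitted)
r_stat(measured, fitted)
delta_stat(measured, fitted)
```

All three return zero for a perfect fit. Under these definitions the first two are normalised by the measured intensities, whereas the third scales with the total counts, which would be consistent with Delta being several orders of magnitude larger than Rwp and R in the table above.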

5 The powdR Shiny app

All of the above examples showcase the use of R code to carry out full pattern summation. It is also possible to run much of this functionality of powdR via a Shiny web application. This Shiny app can be loaded in your default web browser by running run_powdR(). The resulting application has six tabs:

Reference Library Builder: Allows you to create and export a powdRlib reference library from two ‘.csv’ files: one for the XRPD measurements, and the other for the ID, name and reference intensity ratio of each pattern.
Reference Library Viewer: Facilitates quick inspection of the phases within a powdRlib reference library.
Reference Library Editor: Allows the user to easily subset a powdRlib reference library.
Full Pattern Summation: A user friendly interface for iterative full pattern summation of single samples using fps() or afps().
Results Viewer/Editor: Allows for results from previously saved powdRfps and powdRafps objects to be viewed and edited via addition or removal of reference patterns.
Help: Provides a series of video tutorials (via YouTube) detailing the use of the powdR Shiny application.

References

Butler, Benjamin M., and Stephen Hillier. 2021a. “Automated full-pattern summation of X-ray powder diffraction data for high-throughput quantification of clay-bearing mixtures.” Clays and Clay Minerals. https://doi.org/10.1007/s42860-020-00105-6.

———. 2021b. “powdR: An R package for quantitative mineralogy using full pattern summation of X-ray powder diffraction data.” Computers and Geosciences 147: 104662. https://doi.org/10.1016/j.cageo.2020.104662.

Eberl, D. D. 2003. “User’s guide to ROCKJOCK - A program for determining quantitative mineralogy from powder X-ray diffraction data.” Boulder, CO: USGS.

Mullen, Katharine M., and Ivo H. M. van Stokkum. 2012. Nnls: The Lawson-Hanson Algorithm for Non-Negative Least Squares (NNLS). https://CRAN.R-project.org/package=nnls.
