Trajectories of overall adherence to antiretroviral therapy

Introduction

Principal component analysis (PCA) provides an effective statistical approach for exploiting the patterns in CD4 count and viral load data over time. The method uses the feature extraction properties of PCA to define a new variable, related to adherence to antiretroviral therapy (ART), that captures information from longitudinal data. It is demonstrated using data from patients who acquired HIV-1 during follow-up in an ART cohort and were subsequently followed prospectively from early infection.

The PCA scores for each patient obtained by this method serve as informative summary statistics for the CD4-count and viral-load trajectories. Similar to baseline CD4 count or viral load, the first PCA score can be interpreted as a single-value summary measure of an individual’s overall treatment response to ART. Unlike most single-value summaries of CD4-count or viral-load trajectories, however, the first PCA score summarizes the dynamics of these quantities and reveals specific features of the trajectories associated with the effectiveness of adherence to ART. Moreover, PCA scores can serve as more powerful prognostic factors than other common summaries when used in predictive analysis.

Doing a PCA under tidy principles requires running the function PCA() from the FactoMineR package on the matrix of numeric predictor variables (PCA() standardizes them by default through scale.unit = TRUE), and then visualizing the result using the factoextra package.

In general, when performing PCA, three things are of interest: 1) the data in PC coordinates; 2) the rotation matrix; and 3) the variance explained by each principal component (PC).

library(viruslearner)
library(FactoMineR)
library(factoextra)  # provides get_eigenvalue(), used further down
data(viral_new, package = "viruslearner")
res_pca <- PCA(viral_new, graph = FALSE)
print(res_pca)
#> **Results for the Principal Component Analysis (PCA)**
#> The analysis was performed on 87 individuals, described by 9 variables
#> *The results are available in the following objects:
#> 
#>    name               description                          
#> 1  "$eig"             "eigenvalues"                        
#> 2  "$var"             "results for the variables"          
#> 3  "$var$coord"       "coord. for the variables"           
#> 4  "$var$cor"         "correlations variables - dimensions"
#> 5  "$var$cos2"        "cos2 for the variables"             
#> 6  "$var$contrib"     "contributions of the variables"     
#> 7  "$ind"             "results for the individuals"        
#> 8  "$ind$coord"       "coord. for the individuals"         
#> 9  "$ind$cos2"        "cos2 for the individuals"           
#> 10 "$ind$contrib"     "contributions of the individuals"   
#> 11 "$call"            "summary statistics"                 
#> 12 "$call$centre"     "mean of the variables"              
#> 13 "$call$ecart.type" "standard error of the variables"    
#> 14 "$call$row.w"      "weights for the individuals"        
#> 15 "$call$col.w"      "weights for the variables"

Looking at the data in PC coordinates requires combining the PC coordinates with the original dataset. The coordinates are available in the res_pca$ind$coord object, whose columns are called Dim.1, Dim.2, etc.
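A minimal sketch of this step, assuming the viruslearner and FactoMineR packages are installed. The names pc_coords and viral_pcs are illustrative and are not part of either package.

```r
library(FactoMineR)

# Refit the PCA so that this chunk runs on its own
data(viral_new, package = "viruslearner")
res_pca <- PCA(viral_new, graph = FALSE)

# Individual coordinates (Dim.1, Dim.2, ...) as a data frame
pc_coords <- as.data.frame(res_pca$ind$coord)

# Bind the PC coordinates to the original measurements, row by row
viral_pcs <- cbind(as.data.frame(viral_new), pc_coords)

# Inspect a few original columns next to the first two PCs
head(viral_pcs[, c("cd_2023", "vl_2023", "Dim.1", "Dim.2")])
```

Because the rows of res_pca$ind$coord follow the row order of the input data, a plain cbind() is enough here; no key-based join is needed.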

The transformation of variables shown in Figure @ref(fig:plotcoords), in which PCA scores are prognostic factors, reflects the clinical belief that adherence to ART may be affected by the change and other patterns of CD4 counts and viral loads over time. The approach uses PCA with data from the entire cohort to extract the primary structure in individual trajectories and provides a concise summary of each trajectory (PCA score) to reveal clinical features of the CD4 and viral load trajectories that serve as an ART adherence transformed predictor variable.

PC1 vs PC2 dispersion. Specific features of the shape of ART trajectory as prognostic factor for overall treatment.

Rotation matrix

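In FactoMineR the coordinates of the variables are stored in the res_pca$var$coord object. For a PCA on scaled variables these coordinates are the correlations between each variable and each PC (the eigenvectors scaled by the square root of the corresponding eigenvalue), and they are used here in the role of the rotation matrix. They are essential for understanding the relationship between the original variables and the principal components.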
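The link between eigenvectors and variable-component correlations can be checked numerically. The sketch below uses base R's prcomp() on simulated data (an assumption made so that the example is self-contained; it does not use viral_new). Scaling each eigenvector by the standard deviation of its component reproduces the correlations between the original variables and the component scores.

```r
set.seed(1)

# Simulated stand-in data: 87 rows, 4 variables, two of them correlated
X <- matrix(rnorm(87 * 4), ncol = 4,
            dimnames = list(NULL, paste0("v", 1:4)))
X[, 2] <- X[, 1] + 0.5 * X[, 2]

# PCA on centred and scaled variables
fit <- prcomp(X, center = TRUE, scale. = TRUE)

# Eigenvectors (rotation) scaled by the component standard deviations,
# i.e. by the square roots of the eigenvalues
loadings <- fit$rotation %*% diag(fit$sdev)

# Correlations between the original variables and the component scores
cors <- cor(X, fit$x)

# The two matrices agree up to numerical tolerance
all.equal(unname(loadings), unname(cors))
```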

res_pca$var$coord
#>               Dim.1       Dim.2       Dim.3        Dim.4       Dim.5
#> cd_2018  0.27870555 -0.04062273 -0.26163183  0.873137932  0.21538696
#> cd_2019  0.64518533  0.03420379 -0.12376197  0.065834990  0.24795163
#> vl_2019 -0.07573479 -0.07003117  0.78505366  0.031053421  0.60176292
#> cd_2021  0.76117069  0.13314906  0.25218852 -0.036074808 -0.08770293
#> vl_2021 -0.34552296  0.74720041 -0.07602424  0.071507915  0.08454595
#> cd_2022  0.85936771  0.18392938  0.10428379 -0.005222839 -0.15593741
#> vl_2022 -0.17860498  0.82464513  0.11759695  0.051856613 -0.02878516
#> cd_2023  0.86609530  0.11313863  0.11020329 -0.100126342 -0.15036423
#> vl_2023 -0.30954010 -0.10341100  0.54940982  0.422680202 -0.58356162

A negative value indicates an inverse relationship between a variable and a PC, and a positive value indicates a direct relationship. Patients with lower baseline CD4 counts contribute more to the negative direction of the PCs, whereas patients with higher baseline viral loads contribute more to the positive direction of the PCs.

Figure @ref(fig:plotrot) shows the rotation matrix in the context of a plot. The correlation between a variable and a PC is used as the coordinate of the variable on that PC. The correlation circle plot shows the relationships between all variables: positively correlated variables are grouped together, negatively correlated variables are positioned on opposite sides of the plot origin (opposed quadrants), and the distance between a variable and the origin measures the quality of its representation on the factor map.
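The quality of representation can be worked out by hand from the coordinates. The sketch below copies the Dim.1 and Dim.2 columns of the res_pca$var$coord output (rounded to four decimals) and computes the squared distance of each variable from the origin. The object names var_coord and quality are illustrative.

```r
# Coordinates of the variables on Dim.1 and Dim.2, copied from the
# res_pca$var$coord output and rounded to four decimals
var_coord <- rbind(
  cd_2018 = c( 0.2787, -0.0406),
  cd_2019 = c( 0.6452,  0.0342),
  vl_2019 = c(-0.0757, -0.0700),
  cd_2021 = c( 0.7612,  0.1331),
  vl_2021 = c(-0.3455,  0.7472),
  cd_2022 = c( 0.8594,  0.1839),
  vl_2022 = c(-0.1786,  0.8246),
  cd_2023 = c( 0.8661,  0.1131),
  vl_2023 = c(-0.3095, -0.1034)
)
colnames(var_coord) <- c("Dim.1", "Dim.2")

# The squared coordinate is the cos2 on one axis; summing over both axes
# gives the squared distance from the origin on the PC1-PC2 plane
quality <- rowSums(var_coord^2)

# Variables sorted from best to worst represented on this plane
round(sort(quality, decreasing = TRUE), 3)
```

Values close to 1 correspond to arrows that nearly reach the correlation circle; values close to 0 correspond to variables that are mostly described by later components.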

Distribution of patients in the PC1 vs PC2 space. Contributions of baseline and longitudinal measures are reflected by the directions and magnitudes of the arrows.

Patients with lower baseline values are positioned more towards the negative direction of PC1 and patients with higher baseline values are positioned more towards the positive direction of PC1. Patients with slightly decreasing baseline values are positioned more towards the negative direction of PC2, and those with slightly increasing baseline values are positioned more towards the positive direction of PC2.

Patients with similar patterns in CD4 counts and viral loads are likely to cluster together in the PC1 vs PC2 dispersion graph with the direction of the contribution values indicating how each variable influences the position of patients along PC1 and PC2 axes. Lower baseline CD4 counts, higher baseline viral loads, decreasing CD4 counts, and decreasing viral loads tend to be grouped together based on the negative directions of PC1 and PC2. Conversely, slightly decreasing baseline CD4 counts, slightly increasing baseline viral loads, decreasing CD4 counts, and increasing viral loads are grouped together based on the positive directions of PC1 and PC2.

Variance explained by each PC

The variance explained by each PC can be extracted via the function get_eigenvalue() from the factoextra package.

eig_val <- get_eigenvalue(res_pca)
eig_val
#>       eigenvalue variance.percent cumulative.variance.percent
#> Dim.1  2.8147923        31.275470                    31.27547
#> Dim.2  1.3211254        14.679171                    45.95464
#> Dim.3  1.1081563        12.312848                    58.26749
#> Dim.4  0.9654834        10.727593                    68.99508
#> Dim.5  0.8731286         9.701429                    78.69651
#> Dim.6  0.6741441         7.490490                    86.18700
#> Dim.7  0.5923136         6.581263                    92.76826
#> Dim.8  0.4196698         4.662998                    97.43126
#> Dim.9  0.2311864         2.568738                   100.00000

The variance explained by each PC is also shown as a plot in Figure @ref(fig:plotvarpc).
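One possible way to draw such a plot is fviz_eig() from factoextra. This is a sketch that assumes the viruslearner, FactoMineR and factoextra packages are installed, and it may differ from how the figure in this document was produced. The name scree is illustrative.

```r
library(FactoMineR)
library(factoextra)

# Refit the PCA so that this chunk runs on its own
data(viral_new, package = "viruslearner")
res_pca <- PCA(viral_new, graph = FALSE)

# Scree plot: percentage of variance per component, with labels on the bars
scree <- fviz_eig(res_pca, addlabels = TRUE)
print(scree)
```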

Variance captured by each PC. Together, PC1 and PC2 provide structure into the factors influencing ART adherence and treatment effectiveness.

The first component captures 31.3% of the variation in the data, and together with the second component approximately 46% of the variability is captured. An eigenvalue greater than 1 indicates a PC that accounts for more variance than one of the original variables in standardized data, so this is a commonly used cutoff for deciding which PCs are retained; here the first three PCs meet it.
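The percentages follow directly from the eigenvalues, since each standardized variable contributes one unit of variance. The sketch below retypes the eigenvalue column of the get_eigenvalue() output and recomputes the percentages and the eigenvalue-greater-than-1 cutoff in base R. The object names are illustrative.

```r
# Eigenvalues copied from the get_eigenvalue() output
eig <- c(Dim.1 = 2.8147923, Dim.2 = 1.3211254, Dim.3 = 1.1081563,
         Dim.4 = 0.9654834, Dim.5 = 0.8731286, Dim.6 = 0.6741441,
         Dim.7 = 0.5923136, Dim.8 = 0.4196698, Dim.9 = 0.2311864)

# With 9 standardized variables the eigenvalues sum to (about) 9
sum(eig)

# Share of variance per component and its running total, in percent
variance_percent   <- 100 * eig / sum(eig)
cumulative_percent <- cumsum(variance_percent)
round(rbind(variance_percent, cumulative_percent), 2)

# Components whose eigenvalue exceeds 1
retained <- names(eig)[eig > 1]
retained
```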

PCs and other variables in the data

The association of the outcome features (cd_2022 and vl_2022) of the data with the first two principal components is shown in the scatter plots of Figure @ref(fig:outvspc).
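The same association can be summarized numerically. The sketch below, which assumes the viruslearner and FactoMineR packages are installed and that viral_new has no missing values, computes the Pearson correlations between the two outcome variables and the first two PC scores. The names scores, outcomes and assoc are illustrative.

```r
library(FactoMineR)

# Refit the PCA so that this chunk runs on its own
data(viral_new, package = "viruslearner")
res_pca <- PCA(viral_new, graph = FALSE)

# Scores of each patient on the first two components
scores <- res_pca$ind$coord[, c("Dim.1", "Dim.2")]

# The two outcome variables from the original data
outcomes <- as.data.frame(viral_new)[, c("cd_2022", "vl_2022")]

# 2 x 2 matrix of correlations: outcomes in rows, PCs in columns
assoc <- cor(outcomes, scores)
round(assoc, 3)
```

In a PCA on scaled variables these correlations should agree with the cd_2022 and vl_2022 rows of res_pca$var$coord, which gives a quick consistency check against the scatter plots.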

Outcome variables vs first two PCs. A, Association of PC1 with cd_2022. B, Association of PC1 with vl_2022. C, Association of PC2 with cd_2022. D, Association of PC2 with vl_2022.

Individual coordinates as scores

The extracted coordinates for the individual observations serve as the transformed variable for scoring adherence to ART. These values are stored in the res_pca$ind$coord object.

res_pca$ind$coord
#>          Dim.1         Dim.2        Dim.3       Dim.4         Dim.5
#> 1  -0.98870520 -0.5405155102 -0.383502511  0.11758005  0.3573502180
#> 2  -0.68151256 -0.3917371875 -0.536657820  0.63130683  0.2126490172
#> 3   0.57755011  0.0003973339 -0.100869418 -0.09672156 -0.0800287624
#> 4   2.43507136  0.4466142907  0.699071390 -0.52886808 -0.6855489745
#> 5   0.19217912 -0.1142184061 -0.028523841  0.21826272 -0.1810635238
#> 6  -1.51331100 -0.5031996885 -0.448863209 -0.04448105  0.2568046717
#> 7   1.40048059 -0.5320656166 -2.337653919  7.53067393  1.9670461536
#> 8   1.55749377  0.1253492882  0.284136941 -0.07535803 -0.2796376847
#> 9  -1.64727635 -0.5781447220 -0.482786717 -0.27219243  0.1612452440
#> 10 -1.67709265 -0.6461470595 -0.616179382 -0.36691068  0.4267772631
#> 11 -2.68138741 -0.7014226723 -0.533090186 -0.42818750 -0.0446504283
#> 12 -0.34904282 -0.3529301054 -0.553990858 -0.11787367  0.4174126575
#> 13  3.05438607  0.3235781005 -0.323566625  0.40517362  0.1227363909
#> 14 -1.31307839  0.5159140137  0.900200003 -0.16693948  1.1948880288
#> 15  0.41711257 -0.1168868326 -0.043978876 -0.17905349 -0.1864074227
#> 16  0.40319571 -0.1713163002 -0.557921669  0.61280572  0.6199189632
#> 17 -1.44667518  0.0311466462 -0.224201788 -0.37717503  0.1422834188
#> 18  1.32279186  0.0716430706 -0.023473630  0.11602289 -0.0850725955
#> 19 -1.18967950 -0.4688077427 -0.384614984 -0.31641020  0.0140775889
#> 20  2.44633402  0.2508076295  0.039074793  0.18016519  0.0383323379
#> 21  0.62139003  0.0502922717  0.322809044 -0.81226127 -0.7921883905
#> 22  0.38001562 -0.1303155323 -0.250956909  0.62257892 -0.0560720836
#> 23  0.48687394  0.0806233493 -0.027685311 -0.51955124 -0.3001967630
#> 24 -1.79190896 -0.5895877612 -0.429458041 -0.47314270 -0.0829786697
#> 25  0.12407424 -0.3117160968 -0.296564871 -0.04460880  0.2570393396
#> 26  1.55601245 -0.0069240447 -0.653396942  0.36426775  0.5093770890
#> 27  0.22951246 -0.1510644844  0.019957383 -0.28813007 -0.1457492868
#> 28  0.57390223  0.0046820600  0.046540072 -0.43762365 -0.3902294005
#> 29  0.73309854 -0.1101504847 -0.276041977  1.01379683 -0.0271487190
#> 30 -4.62647652 -1.5830336096  5.353325336  3.81577233 -5.0454807764
#> 31  0.53180432 -0.0395805398  0.115362542 -0.72131663 -0.4635237465
#> 32 -0.87666139 -0.4228828798 -0.276286449 -0.21336956  0.0129220628
#> 33  1.77438416  0.1777073962 -0.069467327  0.48157466 -0.0812141341
#> 34  0.74548656 -0.0367837441 -0.059557020  0.01273757 -0.0785535028
#> 35 -0.62274769 -0.3441353458 -0.202906963 -0.12698436 -0.0827287253
#> 36  2.36639454  0.2830740834  0.482457920 -0.16748537 -0.4061329146
#> 37 -0.25158190 -0.3276692868 -0.486333352 -0.01520842  0.7739732192
#> 38 -1.31337910 -0.4043508899 -0.129280826 -0.60200783 -0.1982142235
#> 39 -2.14767808  8.3465508538  0.930016658  0.52344357 -0.5234965121
#> 40  0.85515110  0.0266804074  0.265865181 -0.40060584 -0.4398120854
#> 41  0.55270440 -0.0792691160  0.007468191 -0.14163775 -0.2321123096
#> 42 -1.20220290 -0.4900904558 -0.372030506 -0.42609660  0.0352072741
#> 43 -0.83946495 -0.3412849549 -0.161729926 -0.42599418 -0.1711841934
#> 44  0.39343074 -0.1939855459 -0.346757257 -0.13461730  0.2219048244
#> 45 -1.45420619 -0.5016847594 -0.196978869 -0.57656854 -0.0177663974
#> 46  0.48486528 -0.0204741505 -0.069404591 -0.47503675 -0.0887807342
#> 47 -4.91847286  5.4350935095 -1.126008885  0.51424312  0.9472752765
#> 48  1.13479028  0.1370867673  0.264652052 -0.07473402 -0.8316267457
#> 49  2.27879028  0.2770138821 -0.120278510 -0.16561978 -0.4332859531
#> 50  2.71605796  0.2629746063  0.185099586  0.27460016  0.0956020524
#> 51  0.23240636 -0.0369567364  0.339201950 -0.74403402 -0.5957382520
#> 52 -0.40654486 -0.2129938055  0.007364296 -0.51889936 -0.4270906197
#> 53  1.24428239  0.1493784888 -0.031411269 -0.48587751 -0.2972151901
#> 54 -0.29539979 -0.2117351780 -0.012914211 -0.13535101 -0.3680289293
#> 55 -0.83044701 -0.3450922252 -0.229181356 -0.26270782 -0.0421493294
#> 56 -1.05362652 -0.4926583759 -0.453822550  0.07096787  0.1738621313
#> 57  0.01031647 -0.2307256977 -0.338203562  0.22398656  0.1209286166
#> 58  1.60733633  0.1558112991  0.141661709 -0.45410522 -0.2835030517
#> 59 -2.73214608 -0.8254271651 -0.614658499 -0.46792962  0.3046034945
#> 60  0.74642343 -0.0768789780  0.067742274 -0.19745048 -0.2340123804
#> 61 -1.22977262 -0.4115264528 -0.063764598 -0.62886181 -0.2280905536
#> 62  0.71621951 -0.0294710947 -0.012392441 -0.18788655 -0.2850253030
#> 63 -1.23238510 -0.4517404722 -0.547464672  0.07551622  0.1127870993
#> 64 -1.10159090 -0.5078295697 -0.719780641 -0.14225313  0.5863538021
#> 65  0.58735640 -0.0685325227 -0.044355792 -0.26500383 -0.0040050172
#> 66  1.00314451 -0.0288961090  0.071213248 -0.18368757 -0.0972814613
#> 67  0.79295082  0.0678617926  0.112018151 -0.30775994 -0.1565951400
#> 68 -2.71981554 -0.7944818559 -0.695617684  0.03478114  0.0283567255
#> 69 -0.16434052 -0.3126216383 -0.547949389 -0.13846149  0.2903203341
#> 70  3.93328382  0.6072039993  0.313854170  0.52356243 -0.3724959129
#> 71 -2.06398342 -0.6271244583 -0.266960841 -0.58047286 -0.1158176662
#> 72  0.40636505 -0.1103113427 -0.299305868  0.07230995  0.2284518215
#> 73 -0.30923708  0.2077146091 -0.252024774 -0.06269505  0.2561422396
#> 74 -1.90970358 -0.4929084137 -0.131379749 -0.52761419 -0.4296979594
#> 75  1.98265024  0.3976080356  0.850942882 -0.74286984 -1.1541846965
#> 76 -3.09730941 -0.8451052258 -0.605125093  0.06885556  0.3872850584
#> 77 -1.37492155 -0.5191609467 -0.374699021 -0.25565095  0.0003256777
#> 78  3.09281666  0.4757441197  0.492893156 -0.11716075 -0.5384380202
#> 79  2.20764458  0.3445328342  0.095728043 -0.07602299 -0.3573542786
#> 80 -0.86556815 -0.4551266749  0.853133716 -0.25119432  1.1167943694
#> 81  4.36106362  0.6709446328  0.494252043  0.33963788 -0.2263232092
#> 82 -0.72834646 -0.3606536878 -0.358613386 -0.15867011  0.1403901733
#> 83  0.85408000  1.0833476133  0.248846014 -0.28400861 -0.2581958862
#> 84 -0.80503884 -0.5735560973 -1.026307994  0.48184166  1.1058518237
#> 85 -0.67080711 -0.4005282405 -0.354056712 -0.37497694  0.1570103861
#> 86 -0.14576387 -0.5495637998  6.891136868 -0.16009620  5.6521120979
#> 87  1.14561551  0.1666053059  0.214964455 -0.40199106 -0.5442704268