The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

2. Discretized Non-negative Tri-Matrix Factorization (dNMTF)

Koki Tsuyuzaki

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research
k.t.the-answer@hotmail.co.jp

2024-05-11

Introduction

In this vignette, we consider approximating a binary or non-negative matrix as a product of three non-negative low-rank matrices (a.k.a., factor matrices).

Test data is available from toyModel.

library("dcTensor")
X <- dcTensor::toyModel("dNMF")

You will see that there are five blocks in the data matrix as follows.

suppressMessages(library("fields"))
image.plot(X, main="Original Data", legend.mar=8)

Binary Matrix Tri-Factorization (BMTF)

Here, we consider the approximation of a binary data matrix \(X\) (\(N \times M\)) as a matrix product of \(U\) (\(N \times J1\)), \(S\) (\(J1 \times J2\)), and \(V\) (\(M \times J2\)):

\[ X \approx U S V' \ \mathrm{s.t.}\ U,V \in \{0,1\}, S \geq 0 \]

Here, we call this Binary Matrix Tri-Factorization (BMTF). BMTF is based on Non-negative Matrix Tri-Factorization (NMTF (Copar e2019; Long 2005; Ding 2006)) and Binary Matrix Factorization (BMF (Zhang 2007)). For the details of NMTF, see also NMTF function of nnTensor package.

Basic Usage

In BMTF, two rank parameters \(J1\) (\(\leq N\)) and \(J2\) (\(\leq M)\)) is needed to be set in advance. Other settings such as the number of iterations (num.iter) or factorization algorithm (algorithm) are also available. For the details of arguments of dNMTF, see ?dNMTF. After the calculation, various objects are returned by dNMTF.

set.seed(123456)
out_BMTF <- dNMTF(X, Bin_U=10, Bin_V=10, rank=c(5,5))
str(out_BMTF, 2)
## List of 8
##  $ U            : num [1:100, 1:5] 0.124 0.124 0.124 0.124 0.124 ...
##  $ S            : num [1:5, 1:5] 9.54e-15 7.71e-02 7.65e-01 4.55 3.01 ...
##  $ V            : num [1:300, 1:5] 9.80e-08 2.12e-13 1.36e-56 1.07e-07 1.34e-11 ...
##  $ rank         : num [1:2] 5 5
##  $ RecError     : Named num [1:101] 1.00e-09 6.99e+01 6.59e+01 6.35e+01 6.19e+01 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 6.99e+01 6.59e+01 6.35e+01 6.19e+01 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 7.22 6.06e-02 3.89e-02 2.46e-02 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...

The reconstruction error (RecError) and relative error (RelChange, the amount of change from the reconstruction error in the previous step) can be used to diagnose whether the calculation is converged or not.

layout(t(1:2))
plot(log10(out_BMTF$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_BMTF$RelChange[-1]), type="b", main="Relative Change")

The product of \(U\), \(S\), and \(V\) shows whether the original data is well-recovered by dNMTF.

recX <- out_BMTF$U %*% out_BMTF$S %*% t(out_BMTF$V)
layout(t(1:2))
image.plot(X, main="Original Data", legend.mar=8)
image.plot(recX, main="Reconstructed Data (BMF)", legend.mar=8)

The histograms of \(U\), \(S\), and \(V\) show that these take values close to 0 and 1.

layout(t(1:3))
hist(out_BMTF$U, breaks=100)
hist(out_BMTF$S, breaks=100)
hist(out_BMTF$V, breaks=100)

Note that these factor matrices do not always take the values of 0 and 1 completely. This is because the binarization in BMTF is based on the regularization to softly set the values as close to {0,1} as possible, and is not a hard binarization.

out_BMTF$U[1:3,1:3]
##           [,1]         [,2]         [,3]
## [1,] 0.1238387 2.015980e-45 8.933262e-20
## [2,] 0.1238387 3.694257e-44 5.997028e-20
## [3,] 0.1238387 3.067534e-44 7.937174e-19
out_BMTF$S
##              [,1]         [,2]         [,3]        [,4]         [,5]
## [1,] 9.535194e-15 7.902857e+00 1.213441e-40 0.006942176 8.308241e-06
## [2,] 7.707013e-02 4.897875e-28 7.171875e+00 0.001055494 1.150494e-01
## [3,] 7.649903e-01 8.425523e-34 2.629017e-01 0.021129672 3.997469e-01
## [4,] 4.546750e+00 1.047616e-15 8.178169e-02 0.173047727 1.173014e-01
## [5,] 3.014648e+00 3.017382e-31 2.199562e-01 0.100951552 1.204483e-01
out_BMTF$V[1:3,1:3]
##              [,1]     [,2]         [,3]
## [1,] 9.798525e-08 1.002167 7.419010e-10
## [2,] 2.121454e-13 1.002167 6.227816e-12
## [3,] 1.362156e-56 1.001178 4.707794e-72

If you want to get the {0,1} values, use the round function as below:

round(out_BMTF$U[1:3,1:3], 0)
##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    0    0    0
## [3,]    0    0    0
round(out_BMTF$S, 0)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    8    0    0    0
## [2,]    0    0    7    0    0
## [3,]    1    0    0    0    0
## [4,]    5    0    0    0    0
## [5,]    3    0    0    0    0
round(out_BMTF$V[1:3,1:3], 0)
##      [,1] [,2] [,3]
## [1,]    0    1    0
## [2,]    0    1    0
## [3,]    0    1    0

Semi-Binary Matrix Tri-Factorization (SBMTF)

Next, we consider the approximation of a non-negative data matrix \(X\) (\(N \times M\)) as the matrix product of binary matrix \(U\) (\(N \times J1\)) and non-negative matrices, \(S\) (\(J1 \times J2\)) and \(V\) (\(M \times J2\)):

\[ X \approx U S V' \ \mathrm{s.t.}\ U \in \{0,1\}, S, V \geq 0 \]

Here, we define this formalization as Semi-Binary Matrix Tri-Factorization (SBMTF). SBMTF can capture discrete patterns from a non-negative matrix.

To demonstrate SBMTF, next we use a non-negative matrix from the nnTensor package.

suppressMessages(library("nnTensor"))
X2 <- nnTensor::toyModel("NMF")

You will see that there are five blocks in the data matrix as follows.

image.plot(X2, main="Original Data", legend.mar=8)

Basic Usage

Switching from BMTF to SBMTF is quite easy; SBMTF is achieved by specifying the binary regularization parameter as a large value like below:

set.seed(123456)
out_SBMTF <- dNMTF(X2, Bin_U=1E+6, rank=c(5,5))
str(out_SBMTF, 2)
## List of 8
##  $ U            : num [1:100, 1:5] 0.977 0.983 0.985 0.987 0.975 ...
##  $ S            : num [1:5, 1:5] 0.000028 0.988919 0.200886 0.101142 0.119332 ...
##  $ V            : num [1:300, 1:5] 1.60e-10 2.01e-10 7.10e-10 4.99e-09 2.69e-11 ...
##  $ rank         : num [1:2] 5 5
##  $ RecError     : Named num [1:101] 1.00e-09 1.02e+04 9.64e+03 9.10e+03 8.67e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 1.02e+04 9.64e+03 9.10e+03 8.67e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 2.35e-01 6.25e-02 5.94e-02 4.95e-02 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...

RecError and RelChange can be used to diagnose whether the calculation is converged or not.

layout(t(1:2))
plot(log10(out_SBMTF$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_SBMTF$RelChange[-1]), type="b", main="Relative Change")

The product of \(U\), \(S\), and \(V\) shows whether the original data is well-recovered by dNMTF.

recX2 <- out_SBMTF$U %*% out_SBMTF$S %*% t(out_SBMTF$V)
layout(t(1:2))
image.plot(X2, main="Original Data", legend.mar=8)
image.plot(recX2, main="Reconstructed Data (SBMF)", legend.mar=8)

The histograms of \(U\), \(S\), and \(V\) show that \(U\) looks binary but \(S\) and \(V\) do not.

layout(t(1:3))
hist(out_SBMTF$U, breaks=100)
hist(out_SBMTF$S, breaks=100)
hist(out_SBMTF$V, breaks=100)

Semi-Ternary Matrix Tri-Factorization (STMTF)

Finally, we expand the binary regularization to ternary regularization to take {0,1,2} values as below:

\[ X \approx U S V' \ \mathrm{s.t.}\ U \in \{0,1,2\}, S, V \geq 0, \] where \(X\) (\(N \times M\)) is a non-negative data matrix, \(U\) (\(N \times J1\)) is a ternary matrix, and \(S\) (\(J1 \times J2\)) and \(V\) (\(M \times J2\)) are non-negative matrices.

Basic Usage

STMTF is achieved by specifying the ternary regularization parameter as a large value like the below:

set.seed(123456)
out_STMTF <- dNMTF(X2, Ter_U=1E+5, rank=c(5,5))
str(out_STMTF, 2)
## List of 8
##  $ U            : num [1:100, 1:5] 0.963 0.972 0.976 0.98 0.959 ...
##  $ S            : num [1:5, 1:5] 0.000277 0.86879 0.376385 0.194511 0.153944 ...
##  $ V            : num [1:300, 1:5] 5.27e-16 6.35e-16 7.18e-16 3.77e-15 1.08e-16 ...
##  $ rank         : num [1:2] 5 5
##  $ RecError     : Named num [1:101] 1.00e-09 1.02e+04 9.71e+03 9.26e+03 8.86e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 1.02e+04 9.71e+03 9.26e+03 8.86e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 2.40e-01 5.06e-02 4.91e-02 4.51e-02 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...

RecError and RelChange can be used to diagnose whether the calculation is converging or not.

layout(t(1:2))
plot(log10(out_STMTF$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_STMTF$RelChange[-1]), type="b", main="Relative Change")

The product of \(U\), \(S\), and \(V\) shows that the original data is well-recovered by dNMTF.

recX <- out_STMTF$U %*%  out_STMTF$S %*% t(out_STMTF$V)
layout(t(1:2))
image.plot(X2, main="Original Data", legend.mar=8)
image.plot(recX, main="Reconstructed Data (STMF)", legend.mar=8)

The histograms of \(U\), \(S\), and \(V\) show that \(U\) looks ternary but \(S\) and \(V\) do not.

layout(t(1:3))
hist(out_STMTF$U, breaks=100)
hist(out_STMTF$S, breaks=100)
hist(out_STMTF$V, breaks=100)

Session Information

## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] nnTensor_1.2.0    fields_15.2       viridisLite_0.4.2 spam_2.9-1       
## [5] dcTensor_1.3.0   
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.4       jsonlite_1.8.7     highr_0.10         compiler_4.3.1    
##  [5] maps_3.4.1         Rcpp_1.0.11        plot3D_1.4         tagcloud_0.6      
##  [9] jquerylib_0.1.4    scales_1.2.1       yaml_2.3.7         fastmap_1.1.1     
## [13] ggplot2_3.4.3      R6_2.5.1           tcltk_4.3.1        knitr_1.43        
## [17] MASS_7.3-60        dotCall64_1.0-2    misc3d_0.9-1       tibble_3.2.1      
## [21] munsell_0.5.0      pillar_1.9.0       bslib_0.5.1        RColorBrewer_1.1-3
## [25] rlang_1.1.1        utf8_1.2.3         cachem_1.0.8       xfun_0.40         
## [29] sass_0.4.7         cli_3.6.1          magrittr_2.0.3     digest_0.6.33     
## [33] grid_4.3.1         rTensor_1.4.8      lifecycle_1.0.3    vctrs_0.6.3       
## [37] evaluate_0.21      glue_1.6.2         fansi_1.0.4        colorspace_2.1-0  
## [41] rmarkdown_2.24     pkgconfig_2.0.3    tools_4.3.1        htmltools_0.5.6

References

Copar, A. et al. e2019. “Fast Optimization of Non-Negative Matrix Tri-Factorization: Supporting Information.” PLOE ONE 14(6) (e2019): e0217994.
Ding, C. et al. 2006. “Orthogonal Nonnegative Matrix Tri-Factorizations for Clustering.” SIGKDD’06, 126–35.
Long, B. et al. 2005. “Co-Clustering by Block Value Decomposition.” SIGKDD’05, 635–40.
Zhang, Z. et al. 2007. “Binary Matrix Factorization with Applications.” ICDM 2007, 391–400.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.