Implementation of the Frequent-Directions algorithm for efficient matrix sketching [E. Liberty, SIGKDD2013]
# Not yet on CRAN
install.packages("frequentdirections")
# Or the development version from GitHub:
install.packages("devtools")
devtools::install_github("shinichi-takayanagi/frequentdirections")
Here, we use the USPS handwritten digits dataset as sample data. The following example assumes that you have saved this dataset to the /tmp directory.
The dataset contains 7291 training and 2007 test images in h5 format. Each image is 16 x 16 grayscale pixels.
library("h5")
file <- h5file("/tmp/usps.h5")
x <- file["train/data"][]
y <- file["train/target"][]
str(x)
#> num [1:7291, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
Example: the digit 8
image(matrix(x[338,], nrow=16, byrow = FALSE))
Plot the original data on the plane spanned by the first and second singular vectors.
x <- scale(x)
frequentdirections::plot_svd(x, y)
eps <- 10^(-8)
# 7291 x 256 -> 8 x 256 matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)
# 7291 x 256 -> 32 x 256 matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)
# 7291 x 256 -> 128 x 256 matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)
This result is almost the same as the SVD plot of the original data. This suggests that the original data can be represented well by a sketch of only 128 rows.
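To make "almost the same" more concrete, one option is to measure the covariance error between the data and its sketch, i.e. the spectral norm of A^T A - B^T B, which Liberty's paper bounds by 2 * ||A||_F^2 / l for a sketch of l rows. The following is a minimal base-R sketch of that idea on synthetic data. The helper `fd_sketch()` is a hypothetical illustration of the Frequent-Directions update and is not the implementation behind `frequentdirections::sketching()`; the random matrix is likewise an assumption standing in for the USPS data.

```r
# Minimal base-R illustration of Frequent Directions (Liberty, 2013).
# NOTE: fd_sketch() is a hypothetical helper written for this example;
# it is not the code used by frequentdirections::sketching().
fd_sketch <- function(a, l) {
  d <- ncol(a)
  stopifnot(l >= 2, l <= d)
  b <- matrix(0, nrow = l, ncol = d)  # the sketch, initially all zeros
  next_zero <- 1                      # index of the next all-zero row of b
  for (i in seq_len(nrow(a))) {
    if (next_zero > l) {
      # Sketch is full: shrink every direction by the squared
      # singular value at position ceiling(l / 2).
      s <- svd(b, nu = 0, nv = l)
      delta <- s$d[ceiling(l / 2)]^2
      shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      # Scale row i of t(V) by shrunk[i], i.e. diag(shrunk) %*% t(V).
      b <- shrunk * t(s$v)
      # Rows from ceiling(l / 2) onward are now zero and can be reused.
      next_zero <- ceiling(l / 2)
    }
    b[next_zero, ] <- a[i, ]
    next_zero <- next_zero + 1
  }
  b
}

# Synthetic data (an assumption; not the USPS dataset).
set.seed(1)
a <- matrix(rnorm(500 * 20), nrow = 500, ncol = 20)
l <- 8
b <- fd_sketch(a, l)

# Spectral norm of the covariance error, and the theoretical bound.
err <- norm(crossprod(a) - crossprod(b), type = "2")
bound <- 2 * sum(a^2) / l
stopifnot(err <= bound)
```

The same error measure could be applied to a sketch returned by `frequentdirections::sketching()`, assuming it returns the sketch as a plain numeric matrix with the same number of columns as `x`.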