Matrix Decomposition

2018-12-01  balding

1. CLUSTERED MATRIX APPROXIMATION

SIAM [Society for Industrial and Applied Mathematics] 2016

  1. develop a clustered matrix approximation framework
    1. divide the rows and columns of the matrix into groups
  2. particularly well suited for problems with large-scale sparse matrices

Abstract

benefits
  1. preserves the important structure present in the original matrix;
  2. contains both global-scale and local-scale information;
  3. efficient in both computational speed and memory usage;
  4. the resulting approximations are more accurate, while using less memory, than truncated SVD approximations (which are optimal with respect to rank);
  5. flexible: can be modified in various ways [what ways?]
method
  1. derive a probabilistic approach that uses randomness to compute a clustered matrix approximation within the developed framework;
  2. further prove deterministic and probabilistic bounds on the resulting approximation error.

Introduction

steps
  1. a clustering or co-clustering step, so that the rows and columns of a given matrix are partitioned into a number of groups;
  2. reordering the rows and columns according to cluster affiliation and extracting sufficiently dense blocks;
  3. computing low-rank approximations of these dense blocks;
  4. combining the blockwise approximations into an approximation of the entire matrix (a code sketch follows this list).
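
As a reading aid, here is a minimal numpy sketch of these four steps. Everything named here (`clustered_approximation`, `row_labels`, `col_labels`, the choice of per-cluster bases) is my own illustration, not the paper's code; the paper computes the bases within its own framework (e.g. probabilistically), while this sketch simply takes the top-k singular vectors of each cluster's rows and columns:

```python
import numpy as np

def clustered_approximation(A, row_labels, col_labels, k):
    """Blockwise approximation A_ij ~ U_i S_ij V_j^T, reassembled into A_hat.

    Assumes A is a dense numpy array (the paper targets large sparse matrices)
    and that cluster labels come from any clustering/co-clustering method.
    """
    row_groups = [np.flatnonzero(row_labels == g) for g in np.unique(row_labels)]
    col_groups = [np.flatnonzero(col_labels == g) for g in np.unique(col_labels)]

    # Step 3 (one simple choice): an orthonormal basis per row/column cluster,
    # taken from the top-k singular vectors of that cluster's rows/columns.
    U = [np.linalg.svd(A[r, :], full_matrices=False)[0][:, :k] for r in row_groups]
    V = [np.linalg.svd(A[:, c], full_matrices=False)[2][:k, :].T for c in col_groups]

    # Steps 2 and 4: extract each block and combine the blockwise approximations.
    A_hat = np.zeros(A.shape)
    for i, r in enumerate(row_groups):
        for j, c in enumerate(col_groups):
            S_ij = U[i].T @ A[np.ix_(r, c)] @ V[j]      # k x k core matrix
            A_hat[np.ix_(r, c)] = U[i] @ S_ij @ V[j].T
    return A_hat
```

The core matrices S_ij = U_i^T A_ij V_j are revisited below, where the formal notation is set up.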
explanation

Computing truncated SVD approximations is relatively expensive, so one may wish to trade off computation time against some slight loss in accuracy. Probabilistic algorithms for matrix approximation [21] provide the means for this kind of trade-off.

Preliminaries

The framework relies on a number of concepts:

Bipartite graph: a graph whose vertices can be divided into two disjoint and independent sets U and V such that every edge connects a vertex in U to one in V. U and V are usually called the parts of the graph.

  1. assume that we can partition the graph and obtain c disjoint sets

    \mathcal{V}_1, \dots, \mathcal{V}_c , with m_i = |\mathcal{V}_i| . Without loss of generality, assume the vertices are reordered so that those in \mathcal{V}_1, \dots, \mathcal{V}_c appear consecutively, cluster by cluster. The matrix then has the block form (a tiny illustration follows)

    A = \begin{bmatrix} A_{11} & \dots & A_{1c}\\ \vdots & \ddots & \vdots\\ A_{c1} & \dots & A_{cc} \end{bmatrix}
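
A tiny numpy illustration of this reordering (the labels and matrix here are made up):

```python
import numpy as np

labels = np.array([1, 0, 1, 0, 2])        # hypothetical cluster labels
perm = np.argsort(labels, kind="stable")  # group vertices cluster by cluster
A = np.arange(25.0).reshape(5, 5)         # any 5 x 5 adjacency matrix
A_blocked = A[np.ix_(perm, perm)]         # rows/columns now in cluster order
```

After the permutation, the (i, j) block of A_blocked is exactly A_{ij}.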

Randomized low-rank approximation:

  1. given A \in R^{m \times n} and a target rank k
  2. draw a standard Gaussian matrix \Omega \in R^{n \times (k+p)} , where p is a small oversampling parameter (typically 5 to 10)
  3. Y = A \Omega
  4. Q = orth(Y) : an orthonormal basis for the range of Y
  5. the corresponding low-rank approximation is given by A \approx \tilde{A} = QQ^TA
  6. compute an SVD of the small matrix: Q^T A = \bar{W} \bar{\Sigma} \bar{V}^T
  7. \hat{A} = (Q \bar{W}) \bar{\Sigma} \bar{V}^T = \bar{U} \bar{\Sigma} \bar{V}^T (a code sketch follows this list)
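
A minimal numpy sketch of steps 1 to 7 (the function name and defaults are mine; this is the standard randomized range-finder followed by an SVD of the small matrix Q^T A):

```python
import numpy as np

def randomized_svd(A, k, p=5, rng=None):
    """Rank-k approximation of A via a Gaussian test matrix with oversampling p."""
    rng = np.random.default_rng(rng)
    n = A.shape[1]
    Omega = rng.standard_normal((n, k + p))   # step 2: Gaussian test matrix
    Y = A @ Omega                             # step 3: sample the range of A
    Q, _ = np.linalg.qr(Y)                    # step 4: Q = orth(Y)
    # steps 6-7: SVD of the small (k+p) x n matrix Q^T A, then lift back with Q
    W_bar, Sigma_bar, Vt_bar = np.linalg.svd(Q.T @ A, full_matrices=False)
    U_bar = Q @ W_bar
    return U_bar[:, :k], Sigma_bar[:k], Vt_bar[:k, :]
```

Truncating to the leading k singular triplets recovers the rank-k approximation \hat{A} of step 7.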

Clustered low-rank matrix approximations

Note, however, that the paper relies on existing clustering methods; its contribution is mainly in how to compute the factors U_i, V_i, and S_{ij} of the blockwise approximations A_{ij} \approx U_i S_{ij} V_j^T .
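
One step worth recording here (a standard least-squares fact, in my own notation, which also justifies the S_ij line in the sketch above): once orthonormal bases U_i and V_j are fixed, each block's Frobenius-optimal core matrix has a closed form,

\min_{S_{ij}} \| A_{ij} - U_i S_{ij} V_j^T \|_F \quad \Rightarrow \quad S_{ij} = U_i^T A_{ij} V_j

because U_i^T U_i = I and V_j^T V_j = I , so the best S_{ij} simply projects A_{ij} onto the two cluster bases.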

End.

2. Improved Nyström Low-Rank Approximation and Error Analysis

This paper presents detailed studies of the Nyström sampling scheme and, in particular, an error analysis that directly relates the Nyström approximation quality to the encoding power of the landmark points.
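
For orientation, a minimal numpy sketch of the standard Nyström approximation that the paper builds on (not the paper's improved variant; how the landmarks are chosen is the paper's concern):

```python
import numpy as np

def nystrom(K, landmarks):
    """K: (n, n) PSD kernel matrix; landmarks: indices of m landmark points."""
    C = K[:, landmarks]                  # n x m block of sampled columns
    W = K[np.ix_(landmarks, landmarks)]  # m x m kernel among the landmarks
    return C @ np.linalg.pinv(W) @ C.T   # rank-(at most m) approximation of K
```

The paper's error analysis ties the quality of this approximation to how well the chosen landmarks encode the data points.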

Introduction

Kernel methods play a central role in machine learning and have demonstrated huge success in modeling real-world data with highly complex, nonlinear structures.
