A Random Algorithm for Low-Rank Decomposition of Large-Scale Matrices With Missing Entries

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 24(2015), 11, 24 Nov., pages 4502-11
Main author:	Liu, Yiguang (Author)
Other authors:	Lei, Yinjie, Li, Chunguang, Xu, Wenzheng, Pu, Yifei
Format:	Online article
Language:	English
Published:	2015
Collection access:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article; Research Support, Non-U.S. Gov't

Description
Abstract:	A random submatrix method (RSM) is proposed to calculate the low-rank decomposition $U_{m\times r}V_{n\times r}^{T}$ ($r < m, n$) of the matrix $Y\in\mathbb{R}^{m\times n}$ (assuming $m > n$ generally) with known entry percentage $0 < \rho \le 1$. RSM is very fast as only $O(mr^{2}\rho^{r})$ or $O(n^{3}\rho^{3r})$ floating-point operations (flops) are required, which compares favorably with the $O(mnr + r^{2}(m+n))$ flops required by the state-of-the-art algorithms. Meanwhile, RSM has the advantage of a small memory requirement, as only $\max(n^{2}, mr+nr)$ real values need to be saved. With the assumption that known entries are uniformly distributed in $Y$, submatrices formed by known entries are randomly selected from $Y$ with statistical size $k\times n\rho^{k}$ or $m\rho^{l}\times l$, where $k$ or $l$ usually takes the value $r+1$. We propose and prove a theorem: under random noise, the probability that the subspace associated with a smaller singular value will turn into the space associated with any one of the $r$ largest singular values is smaller. Based on the theorem, the $n\rho^{k}-k$ null vectors or the $l-r$ right singular vectors associated with the minor singular values are calculated for each submatrix. The vectors ought to be the null vectors of the submatrix formed by the chosen $n\rho^{k}$ or $l$ columns of the ground truth of $V^{T}$. If enough submatrices are randomly chosen, $V$ and $U$ can be estimated accordingly. The experimental results on random synthetic matrices with sizes such as $131072\times 1024$ and on real data sets such as dinosaur indicate that RSM is 4.30 to 197.95 times faster than the state-of-the-art algorithms. It, meanwhile, has considerably high precision achieving or approximating to the best
Description:	Date Completed 16.09.2015; Date Revised 10.09.2015; published: Print-Electronic; Citation Status PubMed-not-MEDLINE
ISSN:	1941-0042
DOI:	10.1109/TIP.2015.2458176
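
The null-vector property stated in the abstract can be illustrated with a small noise-free sketch. This is not the authors' implementation: the function name `submatrix_null_vectors`, the matrix sizes, the value of ρ, and the random seed are all assumptions chosen for illustration. The sketch picks k = r + 1 random rows, keeps the columns whose entries are known in all of those rows (about nρ^k of them on average), and checks that the right null vectors of that submatrix also annihilate the corresponding columns of the ground-truth V^T.

```python
import numpy as np


def submatrix_null_vectors(Y, mask, rows):
    """Return (cols, N) for one fully observed submatrix (hypothetical helper).

    cols: indices of the columns known in every one of `rows`.
    N:    matrix whose columns are right null vectors of Y[rows][:, cols].
    """
    k = len(rows)
    # Keep only the columns whose entries are known in all k chosen rows.
    cols = np.flatnonzero(mask[rows].all(axis=0))
    S = Y[np.ix_(rows, cols)]  # k x c submatrix of known entries
    # Full SVD: rows k.. of Vt are orthogonal to the row space of S,
    # so they lie in its right null space (requires c > k).
    _, _, Vt = np.linalg.svd(S, full_matrices=True)
    return cols, Vt[k:].T


# Illustrative noise-free setup (all values are assumptions).
rng = np.random.default_rng(0)
m, n, r, rho = 200, 60, 3, 0.8
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
Y = U @ V.T                       # exact rank-r matrix
mask = rng.random((m, n)) < rho   # True where an entry is known

k = r + 1
rows = rng.choice(m, size=k, replace=False)
cols, N = submatrix_null_vectors(Y, mask, rows)

# The null vectors should also annihilate the chosen columns of V^T.
residual = np.abs(V[cols].T @ N).max()
print(len(cols), N.shape, residual)
```

In this noise-free setting the residual is at the level of floating-point round-off, because the chosen rows of U have full column rank with probability one. Collecting such null vectors over many random submatrices yields linear constraints on V, which is the route to estimating V and U that the abstract describes. With noisy data the constraints hold only approximately.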