Commit 937c4ae

Committed Nov 27, 2017
update readme
1 parent 0cd21bc commit 937c4ae

8 files changed: +33 -14 lines changed
 

‎DESCRIPTION

+1 -1
@@ -5,7 +5,7 @@ Version: 0.2.0
 Author: Dmitriy Selivanov
 Maintainer: Dmitriy Selivanov <selivanov.dmitriy@gmail.com>
 Description: Implements many (sparse) matrix factorization algorithms.
-    Focus is applications for recommender systems.
+    Focus is on applications for recommender systems.
     Following algorithms are implemented at the moment:
     1) Weighted Regularized Matrix Factorization with Alternating Least Squares (ALS)
     for implicit feedback (including approximate Conjugate Gradient solver).

‎README.md

+32 -13
@@ -1,23 +1,42 @@
 # reco
 
-`reco` is an R package which implements several algrithms for matrix factorization targeting recommender systems.
+`reco` is an R package which implements many algorithms for **sparse matrix factorization**. The focus is on applications for **recommender systems**.
 
-1. Weighted Regularized Matrix Factorization (WRMF) from [Collaborative Filtering for Implicit Feedback Datasets](http://yifanhu.net/PUB/cf.pdf) (by Yifan Hu, Yehuda Koren, Chris Volinsky). One of the most efficient (benchmarks below) solvers.
-1. Linear-Flow from [Practical Linear Models for Large-Scale One-Class Collaborative Filtering](http://www.bkveton.com/docs/ijcai2016.pdf). This algorithm is similar to [SLIM](http://glaros.dtc.umn.edu/gkhome/node/774) but looks for factorized low-rank item-item similarity matrix.
-1. Regularized Matrix Factorization (MF) - classic approch for "rating" prediction.
+## Algorithms
 
-Package is **quite fast**:
+1. Vanilla **Maximum Margin Matrix Factorization** - a classic approach to "rating" prediction. See the `WRMF` class and the constructor option `feedback = "explicit"`. The original paper which introduced MMMF can be found [here](http://ttic.uchicago.edu/~nati/Publications/MMMFnips04.pdf).
+    * <img src="docs/img/MMMF.png" width="400">
+1. **Weighted Regularized Matrix Factorization (WRMF)** from [Collaborative Filtering for Implicit Feedback Datasets](http://yifanhu.net/PUB/cf.pdf). See the `WRMF` class and the constructor option `feedback = "implicit"`.
+    We provide two solvers:
+    1. Exact, based on Cholesky factorization
+    1. Approximate, based on a fixed number of **Conjugate Gradient** steps.
+    See details in [Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering](https://pdfs.semanticscholar.org/bfdf/7af6cf7fd7bb5e6b6db5bbd91be11597eaf0.pdf) and [Faster Implicit Matrix Factorization](www.benfrederickson.com/fast-implicit-matrix-factorization/).
+    * <img src="docs/img/WRMF.png" width="400">
+1. **Linear-Flow** from [Practical Linear Models for Large-Scale One-Class Collaborative Filtering](http://www.bkveton.com/docs/ijcai2016.pdf). The algorithm looks for a factorized low-rank item-item similarity matrix (in some sense it is similar to [SLIM](http://glaros.dtc.umn.edu/gkhome/node/774)).
+    * <img src="docs/img/LinearFlow.png" width="300">
+1. **Soft-SVD** via fast Alternating Least Squares as described in [Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares](https://arxiv.org/pdf/1410.2596.pdf).
+    * <img src="docs/img/soft-svd.png" width="600">
+1. **Soft-Impute** via fast Alternating Least Squares as described in [Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares](https://arxiv.org/pdf/1410.2596.pdf).
+    * <img src="docs/img/soft-impute.png" width="400">
+    * with a solution in SVD form <img src="docs/img/soft-impute-svd-form.png" width="150">
 
-* Built on top of `RcppArmadillo`
-* extensively use **BLAS** and parallelized with **OpenMP**
-* implements **Conjugate Gradient solver** as dicribed in [Applications of the Conjugate Gradient Method for Implicit
-Feedback Collaborative Filtering](https://pdfs.semanticscholar.org/bfdf/7af6cf7fd7bb5e6b6db5bbd91be11597eaf0.pdf) and [Faster Implicit Matrix Factorization](www.benfrederickson.com/fast-implicit-matrix-factorization/)
-* Top-k items inference is `O(n*log(k))` and use **BLAS** + **OpenMP**
 
-![benchmark](https://github.com/dselivanov/bench-wals/raw/master/img/wals-bench-cg.png)
+## Efficiency
 
-# Tutorials
+The package is reasonably fast and scales nicely to datasets with millions of rows and millions of columns:
 
+* built on top of `RcppArmadillo`
+* extensively uses **BLAS** and is parallelized with **OpenMP**
+
+Here is an example of `reco::WRMF` on the [lastfm360k](https://www.upf.edu/web/mtg/lastfm360k) dataset compared with other good implementations:
+
+<img src="https://github.com/dselivanov/bench-wals/raw/master/img/wals-bench-cg.png" width="600">
+
+# Materials
+
+**Note that the syntax may not be up to date since the package is under active development.**
+
+1. [Slides from DataFest Tbilisi (2017-11-16)](https://www.slideshare.net/DmitriySelivanov/matrix-factorizations-for-recommender-systems)
 1. [Introduction to matrix factorization with Weighted-ALS algorithm](http://dsnotes.com/post/2017-05-28-matrix-factorization-for-recommender-systems/) - collaborative filtering for implicit feedback datasets.
 1. [Music recommendations using LastFM-360K dataset](http://dsnotes.com/post/2017-06-28-matrix-factorization-for-recommender-systems-part-2/)
     * evaluation metrics for ranking
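The hunk above introduces the `WRMF` class with a `feedback` constructor option, and the next hunk notes that the package follows mlapi conventions. Here is a minimal usage sketch under those assumptions: the `rank`, `lambda` and `n_iter` argument names and the `fit_transform()` return value are my guesses at the interface, not something stated in this commit.

```r
library(Matrix)
library(reco)

# A toy implicit-feedback user-item matrix (e.g. play counts), stored sparse.
x = sparseMatrix(
  i    = c(1, 1, 2, 3, 3, 3),
  j    = c(1, 3, 2, 1, 2, 4),
  x    = c(5, 1, 2, 4, 1, 3),
  dims = c(3, 4)
)

# `feedback = "implicit"` picks the WRMF solver described in the README;
# `rank`, `lambda` and `n_iter` are assumed argument names, not taken from this commit.
model = WRMF$new(rank = 2, lambda = 0.1, feedback = "implicit")

# mlapi-style interface (the README says the package follows mlapi conventions):
# fit_transform() is assumed to return the user embeddings for `x`.
user_embeddings = model$fit_transform(x, n_iter = 5)
dim(user_embeddings)   # expected: n_users x rank, here 3 x 2
```

For rating prediction, the MMMF item above suggests constructing the same class with `feedback = "explicit"`.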
@@ -32,7 +51,7 @@ Feedback Collaborative Filtering](https://pdfs.semanticscholar.org/bfdf/7af6cf7f
 
 We follow [mlapi](https://github.com/dselivanov/mlapi) conventions.
 
-# Notes on multithreading and BLAS
+### Notes on multithreading and BLAS
 
 **VERY IMPORTANT**: if you use a multithreaded BLAS (you generally should), such as OpenBLAS, Intel MKL or Apple Accelerate, I **highly recommend disabling its internal multithreading**. This leads to **substantial speedups** for this package (easily 10x and more); matrix factorization is already parallelized in the package with OpenMP. Disable BLAS threading by setting the corresponding environment variables **before starting `R`**:
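The note above recommends disabling BLAS-internal multithreading via environment variables set before starting `R` (for example `OPENBLAS_NUM_THREADS=1` for OpenBLAS or `MKL_NUM_THREADS=1` for MKL). As an alternative, swapped-in approach not mentioned in this commit, the `RhpcBLASctl` CRAN package can limit BLAS threads from inside an R session; a small sketch:

```r
# Not from the README: it recommends environment variables set before starting R.
# RhpcBLASctl is an alternative that limits BLAS threads at runtime (OpenBLAS / MKL).
library(RhpcBLASctl)

blas_get_num_procs()      # how many cores BLAS could otherwise use
blas_set_num_threads(1)   # leave the parallelism to the package's OpenMP code
```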

‎docs/img/LinearFlow.png

53.6 KB

docs/img/MMMF.png

19.2 KB

docs/img/WRMF.png

27 KB

docs/img/soft-impute-svd-form.png

20 KB

docs/img/soft-impute.png

35.5 KB

docs/img/soft-svd.png

76.2 KB
