Sunday, February 01, 2009

Summary of the recommendation system survey paper

I found a good survey paper on recommendation systems:

Gediminas Adomavicius, Alexander Tuzhilin, "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, June, 2005.

A short summary:

-. A short definition of a recommendation system: an extrapolation problem, i.e. estimating the unknown ratings of items a user has not yet seen from the ratings that are known

-. 3 approaches: i) Content-based, ii) Collaborative, and iii) Hybrid

i) Content-based RS (Recommendation System): a user’s features are computed solely from that user’s own activity history. Limitations:
a. Feature extraction can be hard in some domains, such as multimedia or images
b. Overspecialization: recommendations need diversity. Remedies include randomness, genetic algorithms, or filtering out outputs that are too similar or too different
c. New user problem: there is no history to base recommendations on

Known algorithms: (Naive) Bayesian classifier, Rocchio, winnow, ANN, …
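
A minimal sketch of the content-based idea, not one of the algorithms listed above: it assumes items are described by short text, builds the user's profile from the term counts of items the user liked, and ranks unseen items by cosine similarity to that profile. The function names and the toy catalogue are hypothetical and only for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Term-frequency vector of a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def score_items(items, liked_ids):
    """Score every item the user has not liked yet against the user's profile."""
    profile = Counter()
    for item_id in liked_ids:
        profile.update(tf_vector(items[item_id]))  # profile = sum of liked items
    return {
        item_id: cosine(profile, tf_vector(text))
        for item_id, text in items.items()
        if item_id not in liked_ids
    }

def recommend(items, liked_ids, top_n=1):
    """Return the ids of the top_n highest-scoring unseen items."""
    scores = score_items(items, liked_ids)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy catalogue (made-up data).
items = {
    "a": "space opera with starships and aliens",
    "b": "romantic comedy set in paris",
    "c": "aliens invade earth with starships",
    "d": "cooking show about french pastry",
}

print(recommend(items, ["a"]))   # item "c" shares the most terms with "a"
print(score_items(items, []))    # new user: empty profile, every score is 0.0
```

The second call shows the new user problem: with no history the profile is empty and all items score the same. The first call hints at overspecialization, since only items that look like what the user already liked can rank highly.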

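ii) Collaborative RS: a user’s features are computed from a group of like-minded people, or peers. Limitations:
a. New user problem [83][89]: the same as in content-based RS
b. New item problem: a new item cannot be recommended until enough users have rated it
c. Sparsity: known ratings are few compared with the ratings to be predicted. Workaround ideas include the use of demographic information, dimension reduction, …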

Known algorithms: clustering, Bayesian network, SVD, maximum entropy, …
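
A minimal sketch of the collaborative idea, assuming a simple user-based neighbourhood scheme rather than any of the model-based algorithms listed above: the unknown rating is predicted as a similarity-weighted average of the ratings that peers gave to the same item. The names `cosine_sim`, `predict_rating` and the toy ratings are hypothetical.

```python
import math

def cosine_sim(ra, rb):
    """Cosine similarity computed over the items both users have rated."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    dot = sum(ra[i] * rb[i] for i in common)
    norm_a = math.sqrt(sum(ra[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(rb[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item):
    """Similarity-weighted average of peers' ratings for `item`.

    Returns None when no peer with non-zero similarity has rated the item.
    """
    num = den = 0.0
    for peer, peer_ratings in ratings.items():
        if peer == user or item not in peer_ratings:
            continue
        sim = cosine_sim(ratings[user], peer_ratings)
        num += sim * peer_ratings[item]
        den += abs(sim)
    return num / den if den > 0 else None

# Toy ratings on a 1-5 scale (made-up data).
ratings = {
    "ann": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m2": 4, "m3": 5},
    "cat": {"m1": 1, "m2": 5, "m3": 1},
    "dan": {},  # new user with no ratings yet
}

print(predict_rating(ratings, "ann", "m3"))  # pulled toward bob, ann's closest peer
print(predict_rating(ratings, "dan", "m3"))  # None: new user problem
print(predict_rating(ratings, "ann", "m4"))  # None: new item problem
```

The two `None` results correspond to limitations a. and b. above. Sparsity shows up the same way at scale: the fewer co-rated items there are, the more often the similarities are zero or unreliable.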

iii) Hybrid RS: combines the content-based and collaborative approaches.
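
One simple way to combine the two, sketched here under the assumption that both recommenders emit scores on the same 0-to-1 scale: blend them linearly and fall back to whichever score exists when the other is missing. The function `hybrid_score` and the weight `alpha` are hypothetical; this is only one of several possible ways to build a hybrid.

```python
def hybrid_score(content_score, collab_score, alpha=0.5):
    """Blend two scores; alpha is the weight given to the content-based one.

    Either score may be None (e.g. no peer ratings for a new item), in which
    case the other score is used unchanged.
    """
    if content_score is None and collab_score is None:
        return None
    if collab_score is None:
        return content_score
    if content_score is None:
        return collab_score
    return alpha * content_score + (1 - alpha) * collab_score

print(hybrid_score(0.8, 0.4))        # equal-weight blend of both signals
print(hybrid_score(0.8, None))       # new item: only the content score is available
print(hybrid_score(0.8, 0.4, 1.0))   # alpha=1.0 reduces to pure content-based
```

The fallback is the point of the design: a new item with no ratings can still be scored from its content, which eases the new item problem of the purely collaborative approach.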