Compound quality model for recommender system evaluation

Abstract

The study examines approaches to quantifying effects in recommender systems such as position bias and popularity bias. A new quality model for recommendation algorithms is proposed that reduces the selected metrics to a single unit of measurement and determines the impact of each effect on the system. The resulting scores allow a deeper comparative analysis of different algorithms and an investigation of how an algorithm behaves in different user segments. Within the model, two conditional marginal distribution densities are built for each metric: one from relevant recommendations and one from irrelevant ones. By comparing these densities, the set of possible metric values is divided into a normal region and a critical region. The model evaluates the impact of each effect on the system by the frequency with which the corresponding metric falls into its critical region. To demonstrate how the model works, four recommendation algorithms were analyzed on the academic MovieLens-100K dataset. Three effects were evaluated during testing: popularity bias, the lack of novelty in recommendations, and the tendency of algorithms to recommend items solely on the basis of user demographic data. For each effect, an estimate of its impact on the system is constructed, and an example is given of predicting an upper estimate of system quality if that effect were eliminated. The study showed that, for metrics of effects such as popularity bias or position bias, the distribution of absolute values can change from system to system. The proposed quality model is one way to compare recommendation algorithms more reliably. The model is suitable for evaluating personalized recommendations regardless of the application domain and of the algorithm used to build them.
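
The density comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the densities are approximated by histograms over shared bins, that a bin counts as critical when the density of irrelevant recommendations exceeds the density of relevant ones, and that relevance is binary. The function name `critical_region_impact` and its parameters are hypothetical.

```python
import numpy as np


def critical_region_impact(metric, relevant, bins=10):
    """Estimate how often a metric falls into its critical region.

    Assumptions (not taken from the source): histogram densities,
    binary relevance labels, and the rule "critical = irrelevant
    density strictly greater than relevant density".
    """
    metric = np.asarray(metric, dtype=float)
    relevant = np.asarray(relevant, dtype=bool)
    if relevant.all() or not relevant.any():
        raise ValueError("need both relevant and irrelevant recommendations")

    # Shared bin edges so that the two densities are directly comparable.
    edges = np.histogram_bin_edges(metric, bins=bins)
    dens_rel, _ = np.histogram(metric[relevant], bins=edges, density=True)
    dens_irr, _ = np.histogram(metric[~relevant], bins=edges, density=True)

    # A bin is critical where irrelevant recommendations are denser.
    critical = dens_irr > dens_rel

    # Map every observation to its bin using the inner edges only, so the
    # maximum value lands in the last bin, matching np.histogram.
    bin_index = np.digitize(metric, edges[1:-1])

    # Impact score: share of recommendations whose metric value is critical.
    impact = float(np.mean(critical[bin_index]))
    return edges, critical, impact


# Toy data: a popularity-like metric where high values are mostly irrelevant.
metric = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
relevant = [1, 1, 1, 0, 1, 0, 0, 0]
edges, critical, impact = critical_region_impact(metric, relevant, bins=2)
print(critical, impact)
```

On this toy data the upper half of the metric range is flagged as critical, and half of the recommendations fall into it. In practice a smoother density estimate and a tolerance threshold on the density difference may be preferable to raw histogram comparison.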

Keywords
