Game Recommendation System
an inference engine for recommending video games based on critic reviews
Felix Gonda
felix.e.gonda@gmail.com
Final Project
Data Science (CS 109), Harvard University, Fall 2013
(Screencast) | (IPython Notebook) | (Run Similarity Explorer)
Abstract
Video games have become an important part of our entertainment culture. According to the latest
game statistics released by the ESA (Entertainment Software Association), in 2012 alone consumers spent
$20 billion on gaming software and hardware globally, and that number is expected to increase dramatically
in 2013 with the advent of a new generation of hardware from Sony and Microsoft. Moreover, in the
United States, according to the ESA, 58% of Americans play video games and the average U.S. household
owns at least one gaming console. These statistics indicate how gaming has become an integral activity for many.
However, one problem remains largely unresolved: how consumers should decide which video
games to purchase. With the price of a new video game ranging from $30 to $65, it is important
for consumers to make the right decision when purchasing a game. A lot of data and statistics exist
in disparate locations, but they are not readily available to consumers when making purchase decisions.
This project analyzes video game ratings and sales data to build a recommendation system that uses collaborative
filtering to recommend games based on their similarity to one another. The design and implementation include the
following components:
(1) Evaluation of two classification systems
- A Naive Bayes model that predicted rating categories based on critic review comments
- A Random Forest model that predicted ratings based on the comments as well as critic ratings.
(2) An inference engine based on a similarity matrix of games computed from video game critic reviews.
(3) An interactive similarity explorer based on the matrix computed by the inference engine.
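To illustrate the first classifier in component (1), here is a minimal sketch of a multinomial Naive Bayes model that predicts a rating category from a review comment. It is not the project's code: the toy reviews, the "positive"/"negative" categories, the whitespace tokenizer, and the function names are all assumptions made for illustration, and add-one smoothing is one common choice among several.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(samples):
    """Count labels and per-label word frequencies from (comment, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for comment, label in samples:
        label_counts[label] += 1
        for word in tokenize(comment):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, comment):
    """Return the label with the highest log-posterior for the comment."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(comment):
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up review comments (not data from the project).
reviews = [
    ("a brilliant and polished adventure", "positive"),
    ("great fun with brilliant level design", "positive"),
    ("a dull and buggy mess", "negative"),
    ("boring missions and buggy controls", "negative"),
]
model = train_naive_bayes(reviews)
print(predict(model, "brilliant fun"))
```

In practice a library implementation trained on the scraped review comments would replace this toy version; the sketch only shows the counting and scoring steps behind the technique.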
Data Sources:
-------------
The data for this project was scraped from the following gaming statistics sites:
1. Metacritic (http://www.metacritic.com)
A site that aggregates game reviews from critics (publications) and computes
an average meta score that indicates the quality of a game.
2. Video Game Charts (http://vgchartz.com)
A site that aggregates sales data for every video game release.
Results:
--------
The recommendation engine was built in Python and run on the Amazon Elastic MapReduce framework to compute a
similarity matrix of games. The initial run produced a matrix that was 1.6 GB in size. The final
results contained in the notebook are limited to 10 similarities per game.
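The idea of a game-to-game similarity matrix trimmed to the top 10 entries per game can be sketched on a single machine as follows. This is an illustration under stated assumptions, not the project's implementation: the text above does not name the similarity measure, so the sketch assumes cosine similarity over critic-mean-centered scores (often called adjusted cosine), and the game names, critic ids, scores, and function names are hypothetical. The MapReduce distribution step is not reproduced.

```python
import heapq
import math
from collections import defaultdict


def center_by_critic(scores):
    """Subtract each critic's mean score so harsh and lenient critics compare fairly."""
    totals, counts = defaultdict(float), defaultdict(int)
    for ratings in scores.values():
        for critic, score in ratings.items():
            totals[critic] += score
            counts[critic] += 1
    means = {critic: totals[critic] / counts[critic] for critic in totals}
    return {
        game: {critic: score - means[critic] for critic, score in ratings.items()}
        for game, ratings in scores.items()
    }


def similarity(a, b):
    """Cosine similarity restricted to critics who scored both games (assumed measure)."""
    shared = set(a) & set(b)
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(a[c] ** 2 for c in shared))
    norm_b = math.sqrt(sum(b[c] ** 2 for c in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no shared critics, or no variation to compare
    return dot / (norm_a * norm_b)


def top_similar(scores, k=10):
    """For each game, keep only its k most similar games as (game, similarity) pairs."""
    centered = center_by_critic(scores)
    result = {}
    for game in centered:
        pairs = (
            (similarity(centered[game], centered[other]), other)
            for other in centered
            if other != game
        )
        result[game] = [(other, sim) for sim, other in heapq.nlargest(k, pairs)]
    return result


# Made-up critic scores: game -> {critic: score}.
scores = {
    "Game A": {"c1": 90, "c2": 80, "c3": 70},
    "Game B": {"c1": 85, "c2": 75, "c3": 65},
    "Game C": {"c1": 40, "c2": 95, "c3": 90},
}
neighbors = top_similar(scores, k=1)
print(neighbors["Game A"])
```

Keeping only the top k neighbors per game turns a dense matrix that grows with the square of the number of games into a list that grows linearly, which is the kind of reduction that makes the results small enough to ship in a notebook.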
An interactive similarity visualization was also developed to explore game similarities based on the matrix
computed by the recommendation engine. The screenshots below show some of the results:
The links to the IPython notebook and similarity explorer provide more details.
For questions and feedback you can contact me at: felix.e.gonda@gmail.com