Game Recommendation System
an inference engine for recommending video games based on critic reviews
Felix Gonda
Final Project
Data Science (CS 109)
Harvard University, Fall 2013
(Screencast) | (IPython Notebook) | (Run Similarity Explorer)

Abstract
--------
Video games have become an important part of our entertainment culture. According to the latest game statistics released by the ESA (Entertainment Software Association), consumers spent $20 billion on gaming software and hardware globally in 2012 alone, and that number is expected to increase dramatically in 2013 with the arrival of a new generation of hardware from Sony and Microsoft. Moreover, according to the ESA, 58% of Americans play video games and the average U.S. household owns at least one gaming console. These statistics indicate that gaming has become an integral activity for many people.

However, one problem remains largely unresolved: how consumers should decide which video games to purchase. With the price of a new video game ranging from $30 to $65, it is all the more important for consumers to make the right decision when purchasing a game. A lot of data and statistics exist in disparate locations, but they are not readily available to consumers when they make purchase decisions.

This project analyzes video game ratings and sales data to build a recommendation system that uses collaborative filtering to recommend games based on similarities. The design and implementation include the following components:

(1) Evaluation of two classification systems:
    - A Naive Bayes model that predicted rating categories based on critic review comments.
    - A Random Forest model that predicted ratings based on the comments as well as critic ratings.
(2) An inference engine based on a similarity matrix of games computed from video game critic reviews.
(3) An interactive similarity explorer based on the matrix computed by the inference engine.

Data Sources:
-------------
The data for this project was scraped from the following gaming statistics sites:

1. Metacritic: a site that aggregates game reviews from critics (publications) and computes an average meta score that indicates the quality of a game.
2. Video Game Charts: a site that aggregates sales data for every video game release.

Results:
--------
The recommendation engine was built in Python and run on Amazon Elastic MapReduce to compute a similarity matrix of games. The initial results produced a matrix that was 1.6 GB in size. The final results contained in the notebook are limited to 10 similarities per game. An interactive similarity visualization was also developed to explore game similarities based on the matrix computed by the recommendation engine.

The screenshots below show some of the results. The links to the IPython notebook and similarity explorer provide more details.

For questions and feedback you can contact me at:
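
As a rough illustration of the similarity computation described under Results, the sketch below builds a small game-by-game similarity matrix from critic scores and keeps only a fixed number of neighbors per game. It is not the project's actual code: the choice of cosine similarity, the rule of keeping the highest-scoring neighbors, the function name `top_k_similar`, and the toy games and scores are all assumptions made for this example, and it runs in memory with NumPy rather than on Elastic MapReduce.

```python
import numpy as np


def top_k_similar(scores, names, k=10):
    """Return the k most similar games for each game.

    scores: games x critics matrix of review scores. A 0 is treated as
            "no review", which is a simplification made for this sketch.
    names:  list of game names, one per row of `scores`.
    """
    scores = np.asarray(scores, dtype=float)

    # Normalize each game's score vector to unit length so that the dot
    # product of two rows equals their cosine similarity.
    norms = np.linalg.norm(scores, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for unreviewed games
    unit = scores / norms
    sim = unit @ unit.T  # full games x games similarity matrix

    result = {}
    for i, name in enumerate(names):
        # Sort neighbors by descending similarity, skip the game itself,
        # and keep at most k entries.
        order = [j for j in np.argsort(-sim[i]) if j != i][:k]
        result[name] = [(names[j], float(sim[i, j])) for j in order]
    return result


# Hypothetical data: three games scored by three critics.
names = ["Game A", "Game B", "Game C"]
scores = [
    [90, 85, 0],
    [88, 80, 0],
    [0, 30, 95],
]

neighbors = top_k_similar(scores, names, k=2)
for game, similar in neighbors.items():
    print(game, similar)
```

Keeping only a few neighbors per game, as in the 10-per-game limit mentioned above, shrinks the stored output from one entry per pair of games to a short list per game.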