Data sources
Info
You can find all data sources here.

Caution
All user data has been anonymised.
Some numbers
Music
- 9 075 unique tracks
- 729 995 user - track - rating
Movies
- 58 099 unique movies
- 27 753 445 user - movie - rating
Series
- 73 504 unique series
- 5 171 349 unique episodes
- 86 389 462 user - series - rating
Games
- 5 154 unique games
- 128 793 user - game - played hours - purchase
Books
- 271 380 unique books
- 1 031 176 user - book - rating
Applications (mobile)
- 9 661 unique apps
- 35 930 user - app - review - review popularity - review subjectivity - rating
Content sources
Music
- MusicBrainz Dataset: This data includes information about artists, release groups, releases, recordings, works, and labels, as well as the many relationships between them.
- Taste Profile (subset): The dataset contains real user play counts from undisclosed partners, with all songs already matched to the Million Song Dataset (MSD).
- LastFM (Implicit) (subset): This dataset contains social networking, tagging, and music artist listening information from a set of 2K users of the Last.fm online music system.
- Million Song Dataset (subset): The Million Song Dataset is a freely-available collection of audio features and metadata for a million contemporary popular music tracks.
Movies
- MovieLens: The data sets were collected over various periods of time, depending on the size of the set.
- TMDB: Metadata for all movies and series.
Series
- TMDB: Metadata for all movies and series.
Games
- Steam video games Kaggle: This dataset is a list of user behaviors, with columns: user-id, game-title, behavior-name, value. The included behaviors are 'purchase' and 'play'.
- Steam app list: List of (appid, name) pairs for Steam apps.
- Steam open API: API that returns application details for a given app_id.
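
The Steam Kaggle file stores one row per behavior, while the game figures under "Some numbers" are counted as user - game - played hours - purchase records. A minimal sketch of how such rows could be collapsed into one record per (user, game) pair follows. The sample rows are made up, `pivot_behaviors` is a hypothetical helper, and the code assumes that the value on a 'play' row is the number of hours played; it is not necessarily how this project builds its data.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the column order described above:
# user-id, game-title, behavior-name, value
SAMPLE = """\
u1,Game A,purchase,1.0
u1,Game A,play,273.0
u1,Game B,purchase,1.0
u2,Game B,purchase,1.0
u2,Game B,play,12.5
"""


def pivot_behaviors(lines):
    """Collapse 'purchase' and 'play' rows into one record per (user, game)."""
    records = defaultdict(lambda: {"played_hours": 0.0, "purchase": False})
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        # Only the first four columns are used; any extra columns are ignored.
        user_id, game_title, behavior, value = row[:4]
        record = records[(user_id, game_title)]
        if behavior == "purchase":
            record["purchase"] = True
        elif behavior == "play":
            # Assumption: for 'play' rows the value is hours played.
            record["played_hours"] += float(value)
    return dict(records)


records = pivot_behaviors(io.StringIO(SAMPLE))
for (user_id, game_title), record in records.items():
    print(user_id, game_title, record["played_hours"], record["purchase"])
```

A game that was purchased but never played keeps `played_hours` at 0.0, so both behaviors stay visible in a single record.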
Books
- Book Crossing: The BookCrossing (BX) dataset was collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) from the Book-Crossing community.