I had a little free time on my hand and decided to quickly complete the coursera-course „Data Science at Scale – Practical predictive analytics“ of the University of Washington by Bill Howe. The last assignment was to participate in a kaggle competition.
For this assignment I chose the „San Francisco Crime Classification“ challenge. The task is to predict the Category of a crime given the time and location. The dataset contains incidents from the SFPD Crime Incident Reporting system from 2003 to 2015 (878049 datapoints for training) with the following variables: