CS-199: Big Data

Smartphone Sensor Data Reveals What You Do  :-)
Smartphone sensors are widely used in mobile apps for health monitoring, gaming, localization, activity recognition, and more. For instance, accelerometers are used to recognize a user's activity (e.g., sitting, standing, walking, running, biking). Moreover, when a user walks or runs, the same sensors can be used to count the number of steps the user takes, which can in turn feed a calorie counter or activity logger.

In this project, you can design your own activity logger by classifying accelerometer data into one of 5 activities: still, walk, run, cycling, and in-vehicle. This project gives you data for each of these activities, collected from 3 different users. Your task is to collect your own accelerometer data from your smartphone and classify it into one of these 5 classes.
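
One possible starting point for the classification task is sketched below. It is only an illustration, not a prescribed method: it assumes each window of (x, y, z) samples is summarized by the mean and standard deviation of the acceleration magnitude, and it uses a simple nearest-centroid rule. The function names and the choice of features are hypothetical; richer features or another classifier may work better.

```python
import math
from statistics import mean, pstdev

def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def window_features(samples):
    """Summarize a window of samples as (mean, std) of the magnitude."""
    mags = [magnitude(s) for s in samples]
    return (mean(mags), pstdev(mags))

def train_centroids(labeled_windows):
    """labeled_windows: list of (label, samples).
    Returns a dict mapping each label to its mean feature vector."""
    grouped = {}
    for label, samples in labeled_windows:
        grouped.setdefault(label, []).append(window_features(samples))
    return {
        label: tuple(mean(col) for col in zip(*feats))
        for label, feats in grouped.items()
    }

def classify(samples, centroids):
    """Return the label whose centroid is closest to this window's features."""
    feats = window_features(samples)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))
```

In practice you would train on windows cut from the provided files for the 3 users, then call classify() on windows of your own recordings.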

If you wish, you can also count the number of steps while a person is walking or running, or the number of rotations while a person is cycling. You could even translate these data into other statistics, such as calories burnt, steps per day, etc. 
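
A minimal step-counting sketch is shown below, assuming that each step appears as one upward crossing of a threshold in the mean-removed acceleration magnitude. The threshold and the minimum interval between steps are hypothetical values that would need tuning on real recordings; the 24 Hz default matches the sampling rate of the activity dataset.

```python
import math

def count_steps(samples, sample_rate_hz=24, threshold=1.0,
                min_step_interval_s=0.25):
    """Count upward threshold crossings of the mean-removed magnitude.

    samples: list of (x, y, z) accelerometer readings.
    threshold and min_step_interval_s are illustrative guesses, not
    calibrated values.
    """
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    if not mags:
        return 0
    baseline = sum(mags) / len(mags)  # removes gravity and any constant offset
    min_gap = int(min_step_interval_s * sample_rate_hz)
    steps = 0
    last_step = -min_gap
    above = False
    for i, m in enumerate(mags):
        is_above = (m - baseline) > threshold
        # A step is a rising edge that is far enough from the previous step.
        if is_above and not above and i - last_step >= min_gap:
            steps += 1
            last_step = i
        above = is_above
    return steps
```

The same idea could be tried for counting pedal rotations while cycling, although the signal shape and a suitable threshold will likely differ.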


This dataset contains one text file for each <activity X, user Y> pair, all
sampled at 24 Hz. Each text file contains 6 columns: the first three columns
correspond to the X, Y, and Z axes of the accelerometer readings, while
columns 4 to 6 correspond to the X, Y, and Z axes of the smartphone's
magnetometer.
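
The following sketch shows one way to read such a file. The column separator is not stated here, so the parser assumes columns are separated by whitespace and/or commas; the names Reading, parse_activity_lines, and duration_seconds are hypothetical.

```python
from typing import NamedTuple

class Reading(NamedTuple):
    accel: tuple  # (x, y, z) accelerometer, columns 1 to 3
    mag: tuple    # (x, y, z) magnetometer, columns 4 to 6

def parse_activity_lines(lines):
    """Parse lines of six numeric columns into Reading tuples.

    Assumes whitespace and/or comma separators; blank lines are skipped.
    """
    readings = []
    for line in lines:
        fields = line.replace(",", " ").split()
        if not fields:
            continue
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        values = [float(f) for f in fields]
        readings.append(Reading(tuple(values[:3]), tuple(values[3:])))
    return readings

def duration_seconds(readings, sample_rate_hz=24):
    """Length of a recording in seconds at the stated 24 Hz rate."""
    return len(readings) / sample_rate_hz
```

To read a file from disk, pass an open file object, e.g. `with open(path) as f: readings = parse_activity_lines(f)`.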

Smartphone Sensor Data Reveals Who You Are  :-(
Some people have hypothesized that smartphone sensors (e.g., accelerometers) possess unique fingerprints. In other words, if you share your accelerometer data once with an application in the cloud, that application can track you later using your accelerometer data. The proponents of this hypothesis believe that such fingerprints arise from hardware imperfections introduced during the sensor manufacturing process, causing every sensor chip to respond slightly differently to the same motion stimulus. The differences in accelerometer responses are subtle enough that they do not affect most higher-level functions (e.g., activity recognition), but close inspection can reveal that the chips differ from each other.

This project gives you data that allows you to test this hypothesis. You are asked to cluster accelerometer data from 107 accelerometer chips and test whether these clusters are "sufficiently spread out" to deem them a fingerprint. If you show that they are indeed spread out, and that any new data from one of these sensors can be correctly classified, then you have shown a serious security hole in today's sensor-based apps. If you are successful, it probably means that apps should be required to get permission from users before accessing these sensors, as they do today before accessing location information.
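
One rough way to check whether new data from a chip "can be correctly classified" is a leave-one-out test, sketched below. It assumes each round of data has already been reduced to a numeric feature vector (how to choose those features is part of the project), and it uses a nearest-neighbor rule purely as an illustration; the function name is hypothetical.

```python
import math

def leave_one_out_accuracy(rounds):
    """rounds: list of (chip_id, feature_vector).

    Each round is held out in turn and assigned the chip of its nearest
    neighbor (Euclidean distance) among the remaining rounds. Returns the
    fraction of rounds assigned to the correct chip.
    """
    if len(rounds) < 2:
        raise ValueError("need at least two rounds")
    correct = 0
    for i, (chip_id, feats) in enumerate(rounds):
        others = rounds[:i] + rounds[i + 1:]
        nearest_chip, _ = min(others, key=lambda r: math.dist(feats, r[1]))
        if nearest_chip == chip_id:
            correct += 1
    return correct / len(rounds)
```

An accuracy close to 1.0 across all chips would support the fingerprint hypothesis, while an accuracy near chance (about 1/107 for 107 chips with equal numbers of rounds) would argue against it.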


This dataset contains text files of accelerometer data, collected from different sensor chips and phones. Each text file is from one round of data collection, where a round indicates that the phone was vibrated for 2 seconds using the phone's built-in vibration motor. Each text file contains 4 columns: the first column corresponds to the timestamp at which the data was collected, and columns 2-4 correspond to the X, Y, and Z axes of the accelerometer.
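
As an illustration of how one round might be turned into a feature vector, the sketch below parses a file and computes per-axis means and standard deviations. It assumes four columns per line (timestamp, X, Y, Z) separated by whitespace and/or commas; the timestamp unit is not stated here, so the code does not interpret it. The function names and the choice of features are hypothetical.

```python
from statistics import mean, pstdev

def parse_round(lines):
    """Parse lines of 'timestamp x y z' into (timestamps, xyz_samples)."""
    timestamps, samples = [], []
    for line in lines:
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        t, x, y, z = (float(f) for f in fields)
        timestamps.append(t)
        samples.append((x, y, z))
    return timestamps, samples

def round_features(samples):
    """Return (mean_x, mean_y, mean_z, std_x, std_y, std_z) for one round."""
    axes = list(zip(*samples))
    return tuple(mean(a) for a in axes) + tuple(pstdev(a) for a in axes)
```

These six numbers are only a baseline; other time-domain or frequency-domain features may separate the chips better.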

For more information, contact:
Romit Roy Choudhury
Assoc. Professor, ECE and CS, UIUC
Email: croy@illinois.edu