Smarter Smartmirror

[vc_row][vc_column][heading title=”Projects” subtitle=”Smarter smartmirror”][/vc_column][/vc_row][vc_row][vc_column][vc_column_text]This is the high-level project presentation; code snippets and in-depth technical details can be found on the corresponding blog page.

Motivation

I want to satisfy my daily need for information in an unobtrusive way, without holding a device or sacrificing extra time:

Extend the bathroom mirror to both mirror your face and display information!

While brushing your teeth you cannot do much else anyway. The use cases are:

  • Show general information (time, weather, etc.)
  • Show user-specific information (calendar)
  • Control video/music playback

Requirement mapping

In the design process, I first gathered all requirements that arise from the use cases and then mapped them to requirements on the hardware and software.

As most information sources can be accessed through JSON APIs over the web, and I wanted an easy way to design a GUI for displaying them, I chose HTML5/JS/CSS3 and run a browser in kiosk mode.

I want to be able to control the mirror without touching it, so voice control, gesture recognition using a webcam, or a LeapMotion controller are the options.

To display user-specific information, I use computer vision and face recognition.

Principle of a smartmirror

To provide a mirror image while showing information from a screen mounted behind it, we have to place a two-way mirror in front of the display. Whereas a regular mirror (left) only reflects light, a two-way mirror (right) additionally allows light to pass through from behind:

 

Hardware-Setup

This is the planned configuration before building it:

And here is the final assembled prototype, as it has lived in our bathroom for the past several months (in this picture, the webcam is missing):

 

Software and GUI

The backend is implemented in Python using multiple processes. An always-on process constantly polls a motion sensor; when motion is detected, it activates the screen and spawns the other processes (face, speech, and gesture recognition). All data fetching and rendering is done with JavaScript in the browser. Communication between backend and frontend happens via automated keypresses (xdotool).
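The supervisor process can be sketched roughly as follows. This is a minimal sketch, not the actual implementation: `read_motion_sensor` is a stub for the real GPIO read, the `xset dpms` calls are an assumption about how the screen is woken, and the idle timeout is an invented value.

```python
import multiprocessing as mp
import subprocess
import time

SCREEN_TIMEOUT_S = 60  # assumed idle timeout before the screen turns off


def screen_should_be_on(last_motion, now, timeout=SCREEN_TIMEOUT_S):
    """Keep the display on while motion was seen within the timeout window."""
    return (now - last_motion) < timeout


def read_motion_sensor():
    """Stub: replace with a real sensor read (e.g. a PIR sensor on a GPIO pin)."""
    return False


def recognition_worker(name):
    """Placeholder for the face-, speech-, and gesture-recognition processes."""
    pass


def main_loop():
    last_motion = 0.0
    workers = []
    while True:
        now = time.time()
        if read_motion_sensor():
            last_motion = now
            if not workers:
                # Wake the screen and start the recognition processes once.
                subprocess.call(["xset", "dpms", "force", "on"])
                for name in ("face", "speech", "gesture"):
                    p = mp.Process(target=recognition_worker, args=(name,))
                    p.start()
                    workers.append(p)
        elif workers and not screen_should_be_on(last_motion, now):
            # No motion for a while: blank the screen, stop the workers.
            subprocess.call(["xset", "dpms", "force", "off"])
            for p in workers:
                p.terminate()
            workers = []
        time.sleep(0.2)
```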

The GUI is an HTML5 website displayed with Chromium in kiosk mode. JavaScript modules gather and display the information; the low-level control interface consists of keypresses (on Linux, any program can inject keypresses into the system, so this is a simple, if naive, control channel).
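On the backend side, injecting such a keypress is a one-liner around `xdotool`. The event-to-key mapping below is purely hypothetical (the actual keys the mirror uses are not documented here); the frontend would pick these up with an ordinary `keydown` listener.

```python
import subprocess

# Hypothetical mapping from backend events to keys the frontend listens for.
EVENT_KEYS = {
    "user_tobi": "F1",
    "user_mari": "F2",
    "playback_toggle": "space",
}


def xdotool_command(event):
    """Build the xdotool invocation that injects the key for this event."""
    return ["xdotool", "key", EVENT_KEYS[event]]


def send_event(event):
    """Inject the keypress into the running X session; the browser in kiosk
    mode receives it like any other keyboard input."""
    subprocess.call(xdotool_command(event))
```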

 

Face detection and user identification

Developing this part took the longest, because I had to gather a few hundred to a few thousand images of my girlfriend, myself, and others to train a convolutional neural network (CNN).

I first identified the coarse range of distances between user and webcam that occur in a regular smart-mirror setting. It is highly unlikely that a user tries to interact with the mirror from more than 5 m away, which constrains the expected face scales.

 

As OpenCV’s cascade classifier already implements detection at several scales (and works really well for detecting faces), I chose it as the first stage.

In order to train a CNN, I needed a large dataset of my girlfriend and myself. At first, I saved every image patch that OpenCV’s Haar cascade classifier detected as a face to disk and sorted it into the right folder manually. Once I had about 200 images per class (Tobi, Mari, Other, Negative), the network started to perform above chance, with an accuracy of approximately 70%. From then on I let it run, had it save the classified patches to dedicated folders, and only moved the misclassified patches in the evening. Currently, each set contains 10,000 images and the network reaches an accuracy of 99.5%.
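One way to organize that semi-automated labeling loop is to route each patch by the network’s predicted class and only queue uncertain predictions for manual review. The confidence threshold and folder layout below are my own assumptions for illustration, not the scheme actually used.

```python
import os
import time

CLASSES = ("Tobi", "Mari", "Other", "Negative")
REVIEW_THRESHOLD = 0.9  # assumed: below this, patches are queued for review


def patch_path(label, confidence, root="dataset"):
    """Route a detected face patch to its destination on disk.

    Confident predictions land directly in the class folder; uncertain ones
    go to a 'review' subfolder to be sorted by hand later. The filename is
    a millisecond timestamp to keep patches unique and chronological.
    """
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label}")
    sub = label if confidence >= REVIEW_THRESHOLD else os.path.join("review", label)
    return os.path.join(root, sub, f"{int(time.time() * 1000)}.png")
```

With this layout, the evening chore reduces to emptying the `review` folders, and every corrected patch grows the training set for the next run.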

Even though the camera is mounted in a bathroom without windows, there is huge variation in the appearance of individual image patches.

Below is an everyday sequence from our morning routine, with detections drawn as boxes and the recognized name written above:

[/vc_column_text][/vc_column][/vc_row]