TinyML Person Detection

TinyML vision has made it possible to recognize people in video streams from Arduino and ESP32 cameras running on battery power with no internet or cloud connectivity. This would have been impossible only a few years ago because large deep learning vision models demanded too much power, and both hardware and software support were lacking: Arduino boards previously had a meager 16 KB of RAM, and neural network support for embedded systems was non-existent.

However, much has changed recently, and person detection is now one of TinyML's early deep learning vision applications running on embedded devices.

Arducam 5MP+OV5642 Camera

What is TinyML Person Detection?

TinyML Person Detection is a deep learning vision application that identifies whether a person is in the camera frame. The person can be replaced with any other animal or object, and the task is sometimes called visual wake-word detection.

Building a Person Detection Application with a Low-Code Platform

Person detection is a classic supervised deep learning application. It requires a dataset of a couple of thousand labelled 'person' and 'no-person' images, with resolution as small as 48x48x3. The dataset is uploaded to a Cainvas notebook server to build a deep learning model. The model is then compiled into a static library with deepSea and integrated into a C++ application, which is built with a lower-level C++ compiler. The resulting binary is flashed to an Arduino Nano board using the Arduino IDE. You can find technical details of the hardware and software setup in an AITS Journal article.
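To see why such a small input resolution matters on a microcontroller, the short Python sketch below estimates the memory needed just to hold one 48x48x3 frame. It is a back-of-the-envelope illustration, not part of the Cainvas or deepSea toolchain: the function names, the 50% RAM reserve, and the 16 KB budget (taken from the figure quoted earlier in this article) are assumptions chosen for the example, and the model's weights and intermediate activations would need additional memory on top of this.

```python
# Rough memory estimate for a single input frame.
# All names and the reserve fraction are illustrative assumptions.

def tensor_bytes(height, width, channels, bytes_per_value):
    """Bytes needed to store one dense image tensor."""
    return height * width * channels * bytes_per_value


def fits_in_ram(required_bytes, ram_bytes, reserve_fraction=0.5):
    """True if the buffer fits after reserving part of RAM for everything else.

    reserve_fraction is a hypothetical allowance for the stack, the model's
    activations, and the rest of the application.
    """
    return required_bytes <= ram_bytes * (1 - reserve_fraction)


# One 48x48 RGB frame stored as 8-bit integers vs. 32-bit floats.
uint8_input = tensor_bytes(48, 48, 3, 1)    # 6912 bytes
float32_input = tensor_bytes(48, 48, 3, 4)  # 27648 bytes

# Hypothetical RAM budget: the 16 KB figure mentioned in the introduction.
SMALL_BOARD_RAM = 16 * 1024

if __name__ == "__main__":
    for name, size in (("uint8", uint8_input), ("float32", float32_input)):
        verdict = "fits" if fits_in_ram(size, SMALL_BOARD_RAM) else "does not fit"
        print(f"{name} frame: {size} bytes, {verdict} in the assumed budget")
```

Under these assumptions, an 8-bit frame fits in half of a 16 KB budget while a 32-bit float frame does not, which is one reason small, quantized inputs are attractive for embedded vision.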

Here is a 2-minute video that takes you step by step through low-code development of a person detection app.

Building a Visual Wake-Word Detection Application with a No-Code Platform

No-code person detection uses the Cainvas playground ViWW feature to build the application. Cainvas lets you use the camera on your laptop or cell phone to capture labelled images. Once you are done capturing images, you start training. During training, Cainvas automatically extracts the images, performs image processing steps such as resizing and augmentation to create a dataset suitable for two-class supervised classification, and trains the deep learning model. You can test the model in a follow-up step. The last step is to compile and download the application targeted at your IoT device or microcontroller.
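The resizing and augmentation steps mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the general idea and not how Cainvas implements it: the function names, the nearest-neighbour resize, and the horizontal-flip augmentation are assumptions chosen for the example, and images are represented as nested lists of pixel values to keep the sketch free of dependencies.

```python
# Illustrative preprocessing for a two-class image dataset.
# Images are lists of rows; each row is a list of pixels (e.g. (r, g, b) tuples).

def resize_nearest(image, new_h, new_w):
    """Resize by nearest-neighbour sampling: each output pixel copies one source pixel."""
    old_h = len(image)
    old_w = len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def flip_horizontal(image):
    """Simple augmentation: mirror every row left to right."""
    return [row[::-1] for row in image]


def build_dataset(labelled_images, size=48):
    """Resize every (image, label) pair and add a mirrored copy with the same label."""
    dataset = []
    for image, label in labelled_images:
        resized = resize_nearest(image, size, size)
        dataset.append((resized, label))
        dataset.append((flip_horizontal(resized), label))
    return dataset


if __name__ == "__main__":
    # Two tiny 2x2 "captures" standing in for camera frames.
    apple = [[(255, 0, 0), (200, 0, 0)], [(150, 0, 0), (100, 0, 0)]]
    banana = [[(255, 255, 0), (200, 200, 0)], [(150, 150, 0), (100, 100, 0)]]
    data = build_dataset([(apple, "apple"), (banana, "banana")], size=48)
    print(len(data), "samples of size", len(data[0][0]), "x", len(data[0][0][0]))
```

Mirroring doubles the number of training samples without any new captures, which helps when the dataset is only a few images per class; a real pipeline would typically add further augmentations such as brightness changes or small rotations.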

Here is a 3-minute video that takes you step by step through creating an apple vs. banana application.


Smart homes, smart retail, smart cities, doorbells, and intelligent buildings are just a few areas where sensing person/no-person or other objects can play an important role in automating routine tasks. The main contribution of TinyML wake-word detection on embedded devices is that it lets the application run on battery power in areas with no connectivity.