Whether it’s countertop surfing, unraveling toilet paper rolls, clawing furniture, knocking over houseplants or other innocent hijinks, monitoring your clever kitty’s mischievous behavior can help prevent destruction around your home. That’s why we set out to develop a smart cat detector using Amazon Kinesis Video Streams that can reliably detect cats and alert owners in near real time when their feline friends are someplace they shouldn’t be.
With advanced video analytics powered by edge machine learning, even the slyest cat’s activities can be monitored 24/7. This case study addresses these playful shenanigans through a sophisticated IoT engineering application — a smart cat detector — that harnesses Amazon Kinesis Video Streams’ low-latency video ingestion and machine learning capabilities.
Even better, this project demonstrates the potential of Amazon Kinesis Video Streams in remote monitoring scenarios:
- Baby/elderly care monitoring
- Home security systems
- Smart doorbells
- Traffic monitoring
- Weather monitoring
- Retail store surveillance
- Industrial quality control
Amazon Kinesis Video Streams-Driven Cat Detector
Our smart cat detector is an innovative IoT application that utilizes a trained machine learning model to detect the presence of cats in a given environment and alert owners in near real time. We utilized a model trained using the ImageNet dataset, a benchmark for computer vision research. When harnessing the power of machine learning, using pretrained models out of the box is much more common than wasting precious resources training your own model from scratch.
The cat detector project saw us implement an end-to-end pipeline, ingesting live video from a Raspberry Pi camera into Amazon Kinesis Video Streams. Kinesis Video Streams, or KVS, is a fully managed AWS video streaming service that enables secure video ingestion into the cloud. As a recognized AWS IoT Service Delivery partner, we specialize in harnessing KVS to create real-time video-enabled applications supporting live, recorded and two-way streaming.
Real-time pet detection at the edge — transforming smart pet monitoring through the integration of IoT, machine learning and AWS services.
Offering features such as indexing, video recognition, analysis and machine learning capabilities, Amazon Kinesis Video Streams is utilized to stream live video from devices to the AWS cloud or build applications for real-time video processing or batch-oriented video analytics.
Pet surveillance cameras — with KVS WebRTC, a managed P2P WebRTC streaming infrastructure — can provide real-time monitoring, streaming video data to AWS using KVS for secure storage and analysis. The video can be analyzed in real time to identify any unusual behaviors or incidents, enabling immediate response in case of emergencies.
Designed with built-in machine learning inferencing, our smart cat detector consists of two components — an IoT application running on the Pi and a website hosted in the cloud — to analyze video frames and identify cats, triggering alerts and audio feedback when you catch the culprit red-pawed.
KVS Cat Detector Components
Cat Detector Raspberry Pi IoT Application
Our smart cat detector is an AWS IoT Greengrass application that runs continuously on the Raspberry Pi as a set of background processes. The application includes the following components:
AWS IoT Greengrass Nucleus
The main controller is AWS IoT Greengrass Nucleus, which coordinates the other components: it starts and stops them and handles configuration, logging and component software updates.
Video source
The video source streams video from the camera via a GStreamer pipeline and shares it with the other components in two formats:
- The first stream sends 640×360 resolution 25 frames per second I420 BT.601 raw video to the WebRTC client component.
- The second stream sends 640×360 resolution one frame per second RGB 1:1:16:4 raw video to the object detector component.
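To give a feel for what the two raw streams cost in memory bandwidth, here is a minimal sketch that computes their data rates from the resolutions, formats and frame rates listed above. The function names are illustrative, and the calculation assumes tightly packed frames with no row padding.

```python
# Bits per pixel for the two raw formats named above.
# I420 stores full-resolution luma plus two quarter-resolution chroma planes (12 bpp);
# packed RGB stores three 8-bit channels per pixel (24 bpp).
BITS_PER_PIXEL = {"I420": 12, "RGB": 24}


def raw_frame_bytes(width: int, height: int, pixel_format: str) -> int:
    """Size of one uncompressed frame in bytes, assuming no row padding."""
    return width * height * BITS_PER_PIXEL[pixel_format] // 8


def stream_bytes_per_second(width: int, height: int, pixel_format: str, fps: int) -> int:
    """Raw data rate of a stream in bytes per second."""
    return raw_frame_bytes(width, height, pixel_format) * fps


webrtc_rate = stream_bytes_per_second(640, 360, "I420", 25)   # 8,640,000 bytes/s
detector_rate = stream_bytes_per_second(640, 360, "RGB", 1)   # 691,200 bytes/s
print(webrtc_rate, detector_rate)
```

Dropping the detector stream to one frame per second keeps its raw data rate at under a tenth of the WebRTC stream's, even though each RGB frame is twice the size of an I420 frame.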
Amazon Kinesis Video Streams WebRTC
The WebRTC client component is a C application based on an example from the Amazon Kinesis Video Streams WebRTC SDK in C. It accepts WebRTC calls from the website video player via the Kinesis Video Streams service. When a call starts, the WebRTC client starts its own GStreamer pipeline using the video source and converts the video into the required H.264 format using hardware acceleration. When a call ends, the component releases its resources so that it is ready to accept the next call.
Object detector
The object detector is a Python application that reads RGB frames from the second video source stream once per second and checks each one for cats. A pretrained MobileNet model, which classifies images against 1,000 object categories, runs on the TensorFlow Lite library to classify the frames. Seven of these categories correspond to cats. If the sum of the cat category probabilities in the model output exceeds a threshold value, the system reports a cat. We wrote unit tests using still images to help improve cat detection reliability. Once a cat is detected, this component publishes messages to IoT MQTT topics and Greengrass IPC topics.
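The decision rule described above can be sketched in a few lines of Python. This is an illustration, not the project's code: the class indices assume the standard 1,000-class ImageNet ordering, in which indices 281 to 287 are tabby, tiger cat, Persian, Siamese, Egyptian cat, cougar and lynx, and the threshold of 0.5 is a made-up value. Some TensorFlow Lite MobileNet exports add a background class at index 0, in which case the indices shift by one.

```python
from typing import Sequence

# Assumption: standard 1,000-class ImageNet ordering (indices 281-287 are the cat classes).
CAT_CLASS_INDICES = range(281, 288)

# Hypothetical threshold; the article does not state the value actually used.
CAT_THRESHOLD = 0.5


def cat_score(probabilities: Sequence[float]) -> float:
    """Sum the model's output probabilities over the seven cat categories."""
    return sum(probabilities[i] for i in CAT_CLASS_INDICES)


def is_cat(probabilities: Sequence[float], threshold: float = CAT_THRESHOLD) -> bool:
    """Report a cat when the combined cat probability exceeds the threshold."""
    return cat_score(probabilities) > threshold


# Example: probability mass split between "tabby" and "Egyptian cat".
example = [0.0] * 1000
example[281] = 0.3
example[285] = 0.3
print(is_cat(example))
```

Summing across the cat categories means a frame still counts as a cat when the model splits its confidence between, say, tabby and tiger cat, neither of which would clear the threshold alone.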
Audio deterrent
The audio player subscribes to the relevant Greengrass IPC topic. When it receives a cat detection message, it plays a predefined audio clip through a speaker plugged into the Pi’s headphone jack.
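A message handler along these lines could drive the audio player. It is a rough sketch under several assumptions: the JSON payload shape (a "label" field), the clip path and the use of the ALSA `aplay` command are all hypothetical, and the Greengrass IPC subscription that would deliver the payload is omitted.

```python
import json
import subprocess
from typing import Callable, List

# Hypothetical location of the predefined audio clip.
AUDIO_CLIP_PATH = "/home/pi/sounds/deterrent.wav"


def build_play_command(clip_path: str = AUDIO_CLIP_PATH) -> List[str]:
    """Command line for the ALSA player; using aplay is an assumption."""
    return ["aplay", "--quiet", clip_path]


def on_detection_message(payload: bytes, run: Callable = subprocess.run) -> bool:
    """Play the clip if the message reports a cat; return True when it was played.

    The payload format ({"label": "cat"}) is invented for this example.
    """
    message = json.loads(payload)
    if message.get("label") != "cat":
        return False
    run(build_play_command(), check=False)
    return True
```

Passing the `run` callable as a parameter lets the handler be tested with still-image style fixtures and no speaker attached.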
While robust edge analytics power real-time detection and response, our smart cat detector also leverages secure AWS cloud services to enable remote monitoring, scalable video ingestion and back-end integration.
Amazon Kinesis Video Streams Cat Detector Cloud Components
Complementing the real-time edge capabilities, our smart cat detector system uses several AWS cloud services to extend its functionality and to handle communication between the device and the cloud, which is secured with TLS.
Website hosted in the cloud
Written in React and hosted via GitHub Pages, the website uses the AWS Amplify SDK and Amazon Kinesis Video Streams WebRTC SDK libraries, allowing users to view live video from the camera.
AWS IoT Greengrass
The initial set-up and configuration of the Pi and subsequent component software updates are performed manually using a combination of the AWS console and local Greengrass CLI commands. This was helpful when deploying the improved v2 object detection component to all cat detector devices in the field.
Amazon Kinesis Video Streams signaling channels
This service is used for the WebRTC video calls, but no manual configuration steps are necessary.
Amazon Cognito
User authentication on the webpage is handled by Amazon Cognito.
Identity and Access Management (IAM) roles and profiles
The Cognito identity pool has an authenticated access role attached, with a policy that allows access to IoT and Kinesis Video Streams. The Pi is likewise granted the access it needs, limited to the minimum necessary permissions.
To bolster security, we assigned permissions to the various system components following the principle of least privilege. With core detection driven by edge machine learning and video management handled via secure cloud infrastructure, our smart cat detector solution is primed for use in smart home environments.
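As an illustration of what least privilege might look like for the website's authenticated role, the sketch below builds an IAM policy document that names specific actions and resources instead of wildcards. The region, account ID, channel name, client ID prefix and topic are placeholders, and the action list is our guess at a reasonable minimum for a WebRTC viewer, not the policy used in the project.

```python
import json


def viewer_policy(region: str, account_id: str, channel_name: str, topic: str) -> dict:
    """Build an IAM policy for a web client that views video and receives alerts.

    All resource names passed in are hypothetical placeholders.
    """
    channel_arn = f"arn:aws:kinesisvideo:{region}:{account_id}:channel/{channel_name}/*"
    iot_prefix = f"arn:aws:iot:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Join the signaling channel as a viewer only.
                "Effect": "Allow",
                "Action": [
                    "kinesisvideo:DescribeSignalingChannel",
                    "kinesisvideo:GetSignalingChannelEndpoint",
                    "kinesisvideo:GetIceServerConfig",
                    "kinesisvideo:ConnectAsViewer",
                ],
                "Resource": channel_arn,
            },
            {   # Connect to AWS IoT Core with a client ID under a fixed prefix.
                "Effect": "Allow",
                "Action": "iot:Connect",
                "Resource": f"{iot_prefix}:client/cat-detector-web-*",
            },
            {   # Subscribe to the detection topic.
                "Effect": "Allow",
                "Action": "iot:Subscribe",
                "Resource": f"{iot_prefix}:topicfilter/{topic}",
            },
            {   # Receive messages published on the detection topic.
                "Effect": "Allow",
                "Action": "iot:Receive",
                "Resource": f"{iot_prefix}:topic/{topic}",
            },
        ],
    }


policy = viewer_policy("us-west-2", "123456789012", "cat-detector-channel",
                       "cat-detector/detections")
print(json.dumps(policy, indent=2))
```

Note that the web client gets `ConnectAsViewer` but not `ConnectAsMaster`, and can receive detection messages but not publish them.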
Smart Cat Detector Development Challenges
During development, we experienced some issues with connecting KVS WebRTC to Amazon Kinesis Video Streams:
- Storing video while also allowing real-time WebRTC streaming.
- Cloud-based AI/ML on video while also allowing real-time WebRTC streaming.
We worked around both concerns by writing a service, running on AWS, that joins the P2P KVS WebRTC mesh and either decimates the frames and sends them to AI/ML or stores the video on Amazon S3.
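Frame decimation, as used by the workaround service, simply means forwarding only every Nth frame. A minimal sketch, with a hypothetical function name and integers standing in for decoded frames:

```python
from typing import Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def decimate(frames: Iterable[Frame], keep_every: int) -> Iterator[Frame]:
    """Yield the first frame and then every keep_every-th frame after it."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    for index, frame in enumerate(frames):
        if index % keep_every == 0:
            yield frame


# Reduce a 25 fps stream to 1 fps: four seconds of frames numbered 0..99.
kept = list(decimate(range(100), keep_every=25))
print(kept)  # [0, 25, 50, 75]
```

Because the function is a generator, it can sit between a live frame source and an inference call without buffering the whole stream.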
Other development challenges included the following:
- Edge computing resource constraints: The first version of the cat detector object detection component used a 25 million-parameter PyTorch ResNet50 machine learning model that taxed the Raspberry Pi, with cat detection latency of about five seconds and CPU usage around 75%. We improved performance by switching to a smaller, 4.2 million-parameter TensorFlow Lite MobileNet model specifically designed to run on mobile devices. Cat detection latency dropped to less than one second, and CPU usage decreased to around 25%.
- Video streaming: Simplifying the logic and employing adaptive video encoding techniques and frame subsampling allowed us to reduce the latency when starting WebRTC calls between the website and the device.
- Cat detection: By implementing a lightweight and efficient model architecture, the system can swiftly process video frames on the edge device, minimizing computational load and accelerating near real-time cat detection.
But how does the smart cat detector work to alert you the moment your sneaky feline is causing trouble?
How Our Amazon Kinesis Video Streams-Driven Cat Detector Works
Cardinal Peak’s cat detector solution delivers a seamless experience for owners to monitor their pets and protect their belongings in just a few simple steps:
- Launch the user interface: Sign in using your credentials and click to view your live video feed(s) and access system controls.
- Connect the camera: Plug in the Raspberry Pi camera module and ensure it’s connected to Wi-Fi.
- Let cats be cats: The detectors constantly analyze incoming video. Within five seconds of undesirable cat activity, the system triggers automatic responses.
- Receive alerts: When cats are detected on kitchen counters, the system plays a pre-recorded “meow” while the webpage shows a detected cat.
- Reinforce good behaviors: Our cat detector allows for defining areas where cats are allowed and issuing positive reinforcement sounds when they inhabit those spaces, guiding pets over time toward preferable areas and activities using conditioning versus punishment.
With our cat detector powered by Amazon Kinesis Video Streams, pet owners can enjoy peace of mind knowing their valuables are protected and their mischievous pets stay entertained without destructive consequences, even when no one is home to intervene.
Choose Cardinal Peak for Amazon Kinesis Video Remote Monitoring and Other Innovative Use Cases
When left to their own devices, cats have a knack for keeping themselves entertained — often to the dismay of their owners. In our modern world, feline antics meet ingenious technology designed to keep a vigilant eye, offering peace of mind to pet owners everywhere.
The Amazon Kinesis Video Streams-powered cat detector system highlights the integration of IoT, machine learning and AWS services to address a unique need and real customer problems. Together, these technologies enable continuous cat monitoring with real-time response when furry felines wander into forbidden areas.
With machine learning models rapidly classifying incidents and AWS IoT services connecting the device to the cloud, this smart cat detector project showcases Cardinal Peak’s expertise in delivering inventive solutions that harness AWS IoT Greengrass, machine learning and edge capabilities. Contact us to explore your custom Amazon Kinesis Video Streams solution with our team of IoT engineering and streaming video experts!