build a machine learning model


Client: Global Online Survey and Insights Pure Play Company


Azure, Linux, Java SE Runtime, Spring & Hibernate, Apache Tomcat, JavaScript/Angular.js, Quartz, Apache HTTP Client, JUnit, Jenkins and Apache Maven, Azure IoT Hub, Event Hubs, Cosmos DB, Kubernetes, Docker, AWS, Edge Computing, Custom Protocol Gateways.


An advanced machine learning model that runs over a video to identify and tag human behaviors inside a retail store.



The Client was looking to build a custom AI-based solution that would allow for in-store activity supervision without breach of confidentiality.


Vid.Supervisor is a machine-learning model that runs over a video to identify and tag behaviors.

The goal is to reduce the manpower needed to codify a video, using software flexible enough to fit multiple retail business cases. In the complete solution, human input will be minimal, with only the role of reviewer remaining.

As the monotonous tasks are completed by AI, the client team will confirm the automatically assigned tags (improving the algorithm’s accuracy), while client employees can concentrate their energies on work that requires thinking and analysis.

Vid.Supervisor is ready for use in a variety of retail projects, first easing the creation of codebooks and ultimately reducing the time needed to codify a video by over 90%. The solution is being developed together with the client team, leveraging the latest machine learning and video analytics trends.

The solution features the following capabilities:

  1. System setup, user management, project definition
  2. Uploading videos into the System and defining codebook tags: actions/quantities/locations
  3. Streaming videos for manual tagging
  4. Generating codebook spreadsheets
  5. Automatic assignment of tag values 
  6. User confirmation/correction of tags
  7. Automated identification of patterns and behavior insight 
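
As an illustration of capabilities 2, 4 and 6, the sketch below shows one way a codebook row could be represented and exported as a spreadsheet-friendly CSV. The `TagRow` fields, the `codebook_csv` helper, and the sample values are all hypothetical; the actual Vid.Supervisor schema is not described here.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List


@dataclass
class TagRow:
    """One codebook entry (hypothetical schema, for illustration only)."""
    video: str
    start_s: float   # tag start, in seconds from the beginning of the video
    end_s: float     # tag end, in seconds
    action: str
    quantity: int
    location: str
    confirmed: bool  # True once a reviewer has confirmed or corrected the tag


def codebook_csv(rows: List[TagRow]) -> str:
    """Serialize codebook rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(TagRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


rows = [TagRow("store_cam_1.mp4", 12.0, 18.5, "restocking", 3, "aisle_4", False)]
print(codebook_csv(rows))
```

Keeping a `confirmed` flag on each row is one simple way to separate automatically assigned tags from those a reviewer has already checked.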

Key Features

People Detection

For staff tracking, we use a people detector that determines whether a person is at a pre-defined location (e.g. at a workstation) at any moment in time. The system tracks every person in the frame and can optionally display a bounding box around each one for easy visualization. It also assigns each person a unique ID (a detection ID) for future compatibility.
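
The idea of placing a detected person at a pre-defined location can be sketched as a simple geometric test. This is a minimal illustration, assuming rectangular zones and pixel-coordinate bounding boxes; the `Detection` type, the `zone_for` helper, and the zone names are hypothetical and not taken from the actual system.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Detection:
    detection_id: int  # unique ID assigned to the tracked person
    box: Box           # bounding box around the person


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def zone_for(detection: Detection, zones: Dict[str, Box]) -> Optional[str]:
    """Return the name of the first zone containing the box center, or None."""
    cx, cy = box_center(detection.box)
    for name, (zx0, zy0, zx1, zy1) in zones.items():
        if zx0 <= cx <= zx1 and zy0 <= cy <= zy1:
            return name
    return None


# Hypothetical zones for a single camera view.
zones = {"workstation_1": (0, 0, 200, 200), "checkout": (300, 0, 500, 200)}
person = Detection(detection_id=7, box=(50, 40, 150, 190))
print(zone_for(person, zones))  # workstation_1
```

Using the box center is only one possible rule; a production system might instead use the feet position or an overlap ratio between the box and the zone.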

Facial Recognition

To extend the capabilities of the People Detection feature, facial recognition is incorporated to link a person’s activity across any span of time (day, week, year …). As long as a person’s face is visible at some point while they are in the frame, their entire time in the frame can be tagged retroactively, which makes per-person video cutting and activity identification possible.
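
Retroactive tagging can be illustrated with a two-pass approach over tracked observations: first collect any recognized face label per detection ID, then apply it to every observation of that track. This is a sketch under stated assumptions; the `Observation` type and the `propagate_identities` function are hypothetical, and the face recognition step itself is assumed to happen elsewhere.

```python
from typing import Dict, List, NamedTuple, Optional


class Observation(NamedTuple):
    frame: int
    detection_id: int          # tracker ID, stable while the person stays in frame
    face_label: Optional[str]  # set only on frames where a face was recognized


def propagate_identities(observations: List[Observation]) -> List[Observation]:
    """Label every observation of a track with the first face label seen for it."""
    # Pass 1: remember the first recognized label for each track.
    label_by_track: Dict[int, str] = {}
    for obs in observations:
        if obs.face_label is not None:
            label_by_track.setdefault(obs.detection_id, obs.face_label)
    # Pass 2: apply that label to all observations of the track, earlier ones included.
    return [
        obs._replace(face_label=label_by_track.get(obs.detection_id))
        for obs in observations
    ]


track = [
    Observation(frame=0, detection_id=1, face_label=None),
    Observation(frame=1, detection_id=1, face_label=None),
    Observation(frame=2, detection_id=1, face_label="staff_A"),  # face seen here
    Observation(frame=2, detection_id=2, face_label=None),       # face never seen
]
for obs in propagate_identities(track):
    print(obs.frame, obs.detection_id, obs.face_label)
```

Tracks whose face is never recognized keep a label of `None`, so they stay anonymous rather than being guessed.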

Automatic Video Cutting & Creation

A separate video can be created for each individual staff member/tag/location, containing only the activity of that person/tag/location. 
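
One way to derive the cut points for such a per-person (or per-tag, per-location) video is to merge the frames in which the subject appears into continuous time segments. The sketch below assumes frame indices and a known frame rate; the `frames_to_segments` function and its `max_gap` parameter are hypothetical, and the actual cutting of the video file is left to whatever video tool is in use.

```python
from typing import List, Tuple


def frames_to_segments(
    frames: List[int], fps: float, max_gap: int = 1
) -> List[Tuple[float, float]]:
    """Merge frame indices into (start_s, end_s) segments.

    Frames whose indices differ by at most `max_gap` are treated as
    belonging to the same continuous segment.
    """
    if not frames:
        return []
    ordered = sorted(set(frames))
    segments: List[Tuple[float, float]] = []
    start = prev = ordered[0]
    for f in ordered[1:]:
        if f - prev > max_gap:
            # Close the current segment at the end of the last frame seen.
            segments.append((start / fps, (prev + 1) / fps))
            start = f
        prev = f
    segments.append((start / fps, (prev + 1) / fps))
    return segments


# Frames in which a given person was detected, at 1 frame per second.
print(frames_to_segments([0, 1, 2, 10, 11], fps=1.0))
# [(0.0, 3.0), (10.0, 12.0)]
```

Raising `max_gap` tolerates short detection dropouts, at the cost of including a few frames where the person was not detected.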




  • 98% reduction of irrelevant content – reducing 10,000 hours of surveillance to just 2 hours of relevant footage.
  • High accuracy in task recognition increases the relevance of performance analysis.
Team members: 1 Dev, 1 Designer, 1 PM
