AI & ML in Security & Surveillance

By Milind Borkar
MD, Systematica Suyog Security Consultants
(Sr. Consultant & Security Expert)

Surveillance and security in the traditional sense are advancing in leaps and bounds. Analog CCTV cameras are being replaced by digital cameras, which allow video analytics to be performed on the incoming digital stream. In addition, between 2005 and 2010 there was a major push to standardize the interface between the camera and the software that talks to it over an Ethernet cable. This standard is ONVIF (Open Network Video Interface Forum). Although many camera manufacturers claim ONVIF compliance, one should verify it against the list of conformant products maintained by onvif.org: https://www.onvif.org/conformant-products/. This development broke the stranglehold of camera manufacturers and their partners and allowed many other players to enter the market, as proprietary protocols were no longer required.

Most of the terabytes of stored video are useless because they carry no useful information. Manual searches must be conducted to find the relevant footage. This is a time-consuming process, and by the time the information is found it may be out of date. Video analytics can help to some degree by looking only for relevant information, thereby saving time and resources. Even though video analytics saves a considerable amount of time, it still does not eliminate the manual process of looking at video instead of data.

This is where Artificial Intelligence (AI) and Machine Learning (ML) come in. AI, in the form of neural networks, builds a model from a few initial parameters supplied by the user. Without getting into the details, it quickly builds a neural network and reports a confidence level for each object found in the video frame. This is a highly mathematical process involving convolution, calculus, probability and statistics.

Based on the confidence level of each object found in the frame, one can fine-tune the neural network by changing the input parameters. This fine-tuning is the machine learning step, through which the neural network comes to give confidence levels above 95% for each object found. We have done this in our product, where object confidence levels went from as low as 60% to as high as 98%. One can also put the neural network in training mode and tell it the end result the user wants. The machine then learns by itself, varying the hundreds of input parameters until the target is met. At this stage the model is what the user was expecting, and the user can continue to use this highly accurate model to build applications that solve problems specific to their market vertical.

What AI/ML has achieved, then, is that the video itself no longer needs to be examined; instead, the data extracted from the video stream is examined. This is a far more intelligent and efficient way of examining video streams, and it allows the end user to build multiple intelligent applications on top of it. This is the wave of the future, as multiple petabytes of data cannot be examined after the fact. With the number of cameras increasing exponentially across the globe, the best way to process video is on the fly, in real time, as it saves time, money and resources across the board.

However, for a particular use case some time and money have to be invested in fine-tuning the neural network model. Once this process and methodology are mastered, they can be applied to other use cases. In our case, some of our models took up to 30 minutes to bring the confidence level above 95%, while in other cases it took up to a week.
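
To make the idea of examining extracted data rather than video concrete, here is a minimal sketch in Python. It is not our product's interface: the Detection class, the to_records function and the sample values are hypothetical, and the 0.95 threshold simply mirrors the 95% confidence level mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One object reported by the model for one frame (hypothetical structure)."""
    label: str         # object class reported by the model
    confidence: float  # model confidence in the range 0..1
    frame_time: float  # seconds since the start of the stream


def to_records(detections, threshold=0.95):
    """Keep detections at or above the threshold and return plain data rows."""
    return [
        {"time": d.frame_time, "label": d.label, "confidence": d.confidence}
        for d in detections
        if d.confidence >= threshold
    ]


# Illustrative values only; they are not measurements from a real system.
detections = [
    Detection("person", 0.98, 12.0),
    Detection("flask", 0.60, 12.0),
    Detection("microscope", 0.96, 12.5),
]
records = to_records(detections)
```

An application would then query rows like these (time, label, confidence) instead of replaying the footage.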

The factors that affect this training period are the following:
(a) Lighting,
(b) Number of objects in the frame, and
(c) Complexity of the shape of the object.

Diagram describing our Object Classification Engine

We will now describe a couple of use cases to make this clear:

Use case 1: Implementing a Standard Operating Procedure (SOP)

Suppose an SOP is defined for a drug testing methodology in a pharmaceutical laboratory. The requirements are as follows:
● Capture and time stamp when an employee enters and exits the laboratory.
● Record when the drug testing procedure starts.
● Identify colored flasks and test tubes and their movement from one step to the next.
● Identify microscopes and other medical instruments used in measurement and how they are being used.
● Flag any deviation from SOP and report to administrators.

As one can see, identifying objects in the video stream makes it possible to determine whether the SOP is being followed. The laboratory management team can use this to improve the overall efficiency of the laboratory and the performance of its employees without looking at video streams.
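
The SOP check described above can be sketched as a comparison between the required order of steps and the time-stamped events extracted from the video. This is only an illustration under stated assumptions: the step names, the check_sop function and the event format are hypothetical and are not part of our Object Classification Engine.

```python
def check_sop(expected_steps, observed_events):
    """Return a list of deviation messages; an empty list means the SOP was followed.

    expected_steps: list of step names in the required order (hypothetical names).
    observed_events: (timestamp, step_name) tuples extracted from the video stream.
    """
    deviations = []
    next_index = 0  # position of the next step we expect to see
    for timestamp, step in sorted(observed_events):
        if step in expected_steps[next_index:]:
            found = expected_steps.index(step, next_index)
            # Any step between the expected position and the one seen was skipped.
            for skipped in expected_steps[next_index:found]:
                deviations.append(
                    f"{timestamp}: step '{skipped}' skipped before '{step}'"
                )
            next_index = found + 1
        else:
            deviations.append(f"{timestamp}: unexpected or out-of-order step '{step}'")
    # Steps never observed by the end of the recording.
    for step in expected_steps[next_index:]:
        deviations.append(f"missing step '{step}'")
    return deviations


expected = ["enter_lab", "start_test", "move_flask", "use_microscope", "exit_lab"]
observed = [(0, "enter_lab"), (5, "start_test"), (9, "use_microscope"), (20, "exit_lab")]
report = check_sop(expected, observed)
```

In this example the flask movement was never seen, so the report contains one deviation that can be forwarded to administrators.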

A snapshot of our current Object Classification Engine for illustrative purposes

Use case 2: Measuring queue lengths at bank counters, airport check-in lines, hospitals, etc.

● Measure queue lengths to estimate arrival and service rates.
● Queue lengths will increase if the average service time is greater than the average time between arrivals.
● Flag such cases so that service efficiency can be improved.
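
The queue rule above can be written as a small calculation. The sketch below assumes a single counter and assumes that the arrival timestamps and per-person service durations have already been extracted from the video; the queue_status function and the sample numbers are hypothetical.

```python
from statistics import mean


def queue_status(arrival_times, service_durations):
    """Estimate arrival rate, service rate and utilization for a single counter.

    arrival_times: sorted timestamps (seconds) at which people joined the queue.
    service_durations: seconds each served person spent at the counter.
    """
    # Time between consecutive arrivals.
    gaps = [later - earlier for earlier, later in zip(arrival_times, arrival_times[1:])]
    mean_gap = mean(gaps)                   # average time between arrivals
    mean_service = mean(service_durations)  # average time to serve one person
    utilization = mean_service / mean_gap   # equals arrival rate / service rate
    return {
        "arrival_rate": 1 / mean_gap,       # people per second
        "service_rate": 1 / mean_service,   # people per second
        "utilization": utilization,
        "flag": utilization > 1,            # queue tends to keep growing
    }


# Illustrative values only: one arrival every 30 s, about 45 s to serve each person.
status = queue_status([0, 30, 60, 90, 120], [40, 50, 45])
```

A utilization above 1 means people arrive faster than they are served, which is the condition to flag.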

Summary

The neural network model has over 25 million predefined objects in its database. These have been developed using artificial intelligence techniques. A typical end-user case requires only a very small subset of these 25 million predefined objects. New objects are continuously being added to the database. The model can also be put into training mode based on what the end user really wants. Our Object Classification Engine takes advantage of this and provides interfaces so that end-user applications can be developed rapidly and put to use. We provide the extracted data, an interface to the ML neural network model, and application development services for the customer.
