AI for the Security Industry: Real-World Applications
In recent years, Artificial Intelligence (AI) has been the buzzword in the video analytics domain. Trade show stands are rife with AI demos promoting ambitious functionality set to change the face of CCTV in security. Impressive as many of these demonstrations are, there is a definite air of scepticism on the part of the end-user. Is the hype around AI warranted, and can the science actually deliver?

This feels reminiscent of a decade ago, when video analytics promised to revolutionise CCTV monitoring. Today, reliable and effective analytics is mainstream and is driving tangible business value. That said, there is no denying that the last five years of AI innovation have led to practical solutions, with the security industry finally starting to reap the benefits. However, AI is now at a precipice, on the cusp of what industry experts call an ‘AI winter’, so everyone is wondering what comes next and what is possible. This paper investigates precisely that, focusing on the physical security space.

What is AI?

One formal definition of Artificial Intelligence (AI) identifies the technology with the “development of computer systems able to perform tasks normally requiring human intelligence such as visual perception, speech recognition, decision-making, and translation between languages.” In practice, the term AI covers a wide range of applications and tends to refer to whatever problem is currently being tackled, which of course is constantly evolving. When we think of AI in the security industry, this usually translates to a few key areas: asset protection and monitoring, access control, business intelligence, and decision support.

Machine Learning (ML) is the process of teaching a system to perform a task, while Deep Learning (DL) is a subset of Machine Learning. There are many other ML methods that are not based on deep learning; for the purposes of this paper, these will be referred to as traditional ML approaches.
Often, when AI is mentioned, what is really being referenced is the Machine Learning (ML) or Deep Learning (DL) algorithm powering that solution. For example, license plate recognition (LPR) is often the application of a DL model to locate and extract a license plate from an image, coupled with ML algorithms that cross-reference information from a database. This application should therefore be described as a combination of ML and DL, not simply as AI.

The distinction between traditional ML and DL is an important one, as the recent boom in AI solutions largely reflects advances in Deep Learning techniques. In the majority of cases, the use of Deep Learning has led to a significant jump in accuracy over traditional ML techniques. For example, in the ImageNet challenge, a well-known academic image classification challenge in which images must be classified into one of a thousand different classes, accuracy has risen from 50% of images classified correctly in 2011, using traditional ML techniques, to nearly 90% today using modern DL techniques. The figure below illustrates the improvement in the ImageNet challenge over time.

Machine Learning vs Deep Learning

To understand Deep Learning’s dramatic improvement over traditional Machine Learning techniques, let’s look at how an example asset protection use case could be approached with both methodologies. The goal is to decide whether an object in the field of view of a particular camera represents a threat that should generate an alarm (person, vehicle etc.), or is mere background noise that can be ignored. To begin, a movement-based tracker (another ML system) has detected motion in the camera’s view and defined a region of interest around the object.
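The step of detecting motion and defining a region of interest can be illustrated with a minimal sketch. It assumes simple frame differencing between two greyscale frames, which is a deliberately simplified stand-in for a movement-based tracker and not the method used by any particular product. The function name `motion_region`, the frame representation, and the threshold value are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]          # greyscale image: rows of 0-255 pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def motion_region(prev: Frame, curr: Frame, threshold: int = 25) -> Optional[Box]:
    """Return the bounding box of pixels that changed by more than `threshold`.

    Returns None when no pixel changed enough to count as motion.
    The threshold of 25 is an arbitrary illustrative value.
    """
    # Collect the coordinates of every pixel whose brightness changed noticeably.
    changed = [
        (y, x)
        for y, (prev_row, curr_row) in enumerate(zip(prev, curr))
        for x, (p, c) in enumerate(zip(prev_row, curr_row))
        if abs(c - p) > threshold
    ]
    if not changed:
        return None
    ys = [y for y, _ in changed]
    xs = [x for _, x in changed]
    # The region of interest is the smallest box enclosing all changed pixels.
    return min(ys), min(xs), max(ys), max(xs)


# Toy example: a uniform dark background, then a bright object appears.
background = [[10] * 8 for _ in range(6)]
current = [row[:] for row in background]
for y in (2, 3):
    for x in (3, 4, 5):
        current[y][x] = 200

print(motion_region(background, current))  # (2, 3, 3, 5)
```

The resulting box is what would be handed to the classification stage discussed next. A real tracker would also need to cope with noise, lighting changes, and multiple objects, all of which this sketch ignores.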
Machine Learning (ML)

The traditional Machine Learning pipeline generally requires the developer to convert an input (e.g., a region of interest in an image) into a structured feature descriptor of that input: for example, a set of numbers that represents the shape in the image (HOG, SIFT), or another property of the image (colour, texture etc.). The model is then trained by feeding it labelled examples of the feature descriptors of objects you want to recognise (person, vehicle) and of objects you expect to see but want to ignore (trees, shadows, animals etc.). The Machine Learning algorithm learns to group these feature descriptors into categories so that, when a new unlabelled feature descriptor is fed to the system, it can assess which category it most likely falls into.

A system’s accuracy hinges on the developer’s ability to come up with a feature descriptor that the Machine Learning algorithm can easily separate into classes to detect and classes to ignore. One of the biggest advantages of using human-designed feature descriptors is that less data is required to train the ML model. Creating labelled datasets to train any Machine Learning algorithm takes significant time and therefore resources. As a consequence, traditional Machine Learning techniques are still very much relevant because of this saving in time and cost.

Deep Learning (DL)

Deep Learning follows a similar process. However, instead of relying on a human to design a robust feature descriptor, the Deep Learning system itself looks at the labelled input data to learn the best way of grouping the images. By being shown large numbers of samples (training), the system refines its model to best describe the data. The disadvantage is that, for a Deep Learning model to learn that best representation from the data, a notably larger amount of data is necessary.
However, although the data requirements are more significant, the Deep Learning approach removes the guesswork of a developer trying to define the optimal representation of an input for the system to learn from. It also has the advantage that the same approach can be applied to a range of different problems, whereas traditional ML may require the feature descriptor to be redesigned for each application. Deep Learning has demonstrated its advantages over traditional methods. The real question, however, is how it can be used to improve business processes or increase precision in detection while reducing costs for security businesses…
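To make the contrast concrete, the traditional ML pipeline described earlier (hand-designed feature descriptor, then a classifier trained on labelled descriptors) can be sketched as follows. This is a toy illustration under stated assumptions, not a production method: the descriptor is a simplified gradient-orientation histogram loosely inspired by HOG rather than HOG itself, the classifier is a nearest-centroid rule chosen for brevity, and the training data are synthetic bar patterns standing in for real regions of interest. All function names and the "alarm" and "ignore" labels are hypothetical.

```python
import math
from typing import Dict, List

Patch = List[List[float]]  # greyscale region of interest, values in [0, 1]


def orientation_histogram(patch: Patch, bins: int = 4) -> List[float]:
    """Hand-designed descriptor: magnitude-weighted histogram of gradient
    orientations, normalised to sum to 1 (a simplified, HOG-like idea)."""
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation in [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def train_centroids(labelled: Dict[str, List[List[float]]]) -> Dict[str, List[float]]:
    """Training step: average the descriptors of each labelled class."""
    return {
        label: [sum(column) / len(descriptors) for column in zip(*descriptors)]
        for label, descriptors in labelled.items()
    }


def classify(descriptor: List[float], centroids: Dict[str, List[float]]) -> str:
    """Assign a new descriptor to the class with the closest centroid."""
    return min(centroids, key=lambda label: math.dist(descriptor, centroids[label]))


def bar_patch(vertical: bool, offset: int, size: int = 8) -> Patch:
    """Synthetic toy data: a bright 2-pixel-wide bar on a dark background."""
    return [
        [1.0 if (x if vertical else y) in (offset, offset + 1) else 0.0
         for x in range(size)]
        for y in range(size)
    ]


# Labelled examples: upright shapes to detect, flat shapes to ignore (toy stand-ins).
centroids = train_centroids({
    "alarm": [orientation_histogram(bar_patch(True, o)) for o in (2, 3, 4)],
    "ignore": [orientation_histogram(bar_patch(False, o)) for o in (2, 3, 4)],
})

unseen = orientation_histogram(bar_patch(True, 5))
print(classify(unseen, centroids))  # alarm
```

Only three examples per class are needed here because the descriptor already encodes the property that separates the classes. A Deep Learning model would instead have to discover a useful representation from the raw pixels, which is why it typically needs far more labelled data, but it would not require the descriptor to be redesigned for a different problem.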