The future is here - when surveillance meets AI

Hangzhou, China

Schema for Dahua facial recognition technology

What will the world be like when surveillance meets AI (Analytics Intelligence)?

According to an IHS report from September 2016, security camera shipments continue to grow at a 14.1% CAGR and are predicted to reach 190 million units in 2020, even though revenue growth has slowed to an 8.1% CAGR. There are simply too many cameras, and too much video footage, for human operators to digest. Most security video footage is erased or overwritten without ever being watched. Video Analytics (VA) technology was once seen as a way to automate the use of this abundant footage. By identifying and tagging the appearance of certain patterns in a video, a VA or AI system can search the footage and run statistics on it.

Such output could then be accumulated and analysed to find trends and correlations. However, this potential has not translated into business momentum. The complexity of analytic algorithms made it difficult to develop new software to detect a desired pattern, and the tremendous demand for CPU processing power made it difficult to obtain timely analytics output. Artificial Intelligence may be the key to unlocking this potential.

Video Analytics technology has been evolving over the past 10 years. It has been making headlines more often lately because of the use of Artificial Intelligence. Machine learning greatly simplifies the software development process, and the processing power of GPUs has made near-real-time video analysis possible. For example, at the 2016 G20 summit, China deployed a security solution developed by Dahua Technology that used deep learning to automatically screen pedestrians in airports and train stations to try to identify crime suspects.

Deep learning refers to artificial neural networks that are composed of many layers. It aims to emulate a human’s ability to analyse and learn. It imitates the mechanism of the brain in order to interpret data such as images, voice and text. Deep learning has been successfully applied in image and voice recognition and is set to be a future development direction. In 2013, deep learning was listed by MIT Technology Review as one of the top ten breakthrough technologies.

In the security industry, the application of deep learning is important for two reasons. On the one hand, deep learning improves the accuracy of some algorithms. On the other hand, it enables functions that cannot be achieved without it. For instance, facial recognition comprises three key parts: face detection, facial feature alignment, and feature extraction and comparison. When deep learning is adopted, the performance of each part improves dramatically. With deep learning, facial expression, gender, age, hair colour, accessories, emotion and so on can all be recognised more reliably. Moreover, GPUs can be used to accelerate the computation of deep learning algorithms. Traditional intelligent analysis is unable to cover a large-scale scene with more than 300 people, let alone perform group analysis of moving scenes. A system based on deep learning and GPUs can easily handle 300 targets simultaneously, estimate crowd density and identify crowd movement, providing more useful information to security staff.
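The comparison stage of the three-part pipeline described above can be illustrated with a minimal sketch. It assumes that detection, alignment and feature extraction have already produced fixed-length feature vectors. The function names, the toy four-dimensional vectors and the 0.8 threshold are hypothetical, and cosine similarity is only one common choice of comparison measure; none of this is taken from Dahua's system.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)

def best_match(probe, gallery, threshold=0.8):
    """Return (name, score) for the most similar gallery entry,
    or None if no entry reaches the (assumed) threshold."""
    best_name, best_score = None, -1.0
    for name, features in gallery.items():
        score = cosine_similarity(probe, features)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, best_score
    return None

# Hypothetical enrolled feature vectors (real systems use far more dimensions).
gallery = {
    "person_a": [0.9, 0.1, 0.0, 0.4],
    "person_b": [0.1, 0.8, 0.5, 0.0],
}

# Hypothetical feature vector extracted from a newly detected face.
probe = [0.85, 0.15, 0.05, 0.35]
result = best_match(probe, gallery)
```

The threshold trades missed matches against false alarms; in practice it would be tuned on labelled data rather than fixed in advance.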

Clearly, deep learning is accelerating the development of intelligent surveillance. On 7 March 2017, Dahua, working with Nvidia, a world-leading Artificial Intelligence (AI) computing company, launched the “Deep Sense” server for smart video structure analysis. Dahua has also cooperated with many renowned universities inside and outside China to advance research on deep learning. As a result, Dahua’s face recognition algorithm ranked number one on the authoritative public testing platform LFW, ahead of Tencent, Google and other top academic and commercial groups around the world.

Dahua Technology made an early start in AI applications among the players in the global security industry. In 2009, Dahua established a department to research intelligent algorithms and explore potential applications in security solutions. The department was later merged with other research groups to form the Institute of Advanced Technology, which focuses on advanced technologies in AI, optics, codecs, ISP and more. Dahua’s ANPR (automatic number plate recognition) has greatly improved traffic and parking management, promoting sustainable urban development. Deep learning is also being applied to the recognition of vehicles and people. People can be classified by clothing, hair colour, eyeglasses, backpack, gender, age range and even facial expression. Vehicles can be classified by colour, make, model and type in addition to the license plate.

The ability to use AI to identify and analyse vehicles is going to be very valuable. A witness may remember the colour and make of a vehicle but not its plate. Since deep learning was applied, AI-powered security applications have improved noticeably. On the one hand, the number plate recognition rate has increased significantly. On the other hand, it is now possible to identify vehicle features such as type, make, model and colour in a more systematic way. By combining various elements in one search, it becomes possible to identify a target vehicle even if the license plate is not captured.
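As a rough illustration of combining several remembered elements in one search, the sketch below ranks vehicle sightings by how many attributes agree with a witness description, so that a sighting with no captured plate can still surface. The `VehicleSighting` record, its field names and the sample data are hypothetical, and ranking by match count is an assumed design choice, not a description of Dahua's search.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VehicleSighting:
    camera: str
    colour: str
    make: str
    vehicle_type: str
    plate: Optional[str] = None  # None when the plate was not captured

def rank_vehicles(sightings, **remembered):
    """Score each sighting by the number of remembered attributes it matches
    and return (score, sighting) pairs, best first, dropping zero scores."""
    scored = []
    for sighting in sightings:
        score = sum(
            1 for field, value in remembered.items()
            if getattr(sighting, field) == value
        )
        if score > 0:
            scored.append((score, sighting))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

# Hypothetical sightings; "MakeA" and "MakeB" are placeholder manufacturers.
sightings = [
    VehicleSighting("gate-1", "white", "MakeA", "sedan", "ABC123"),
    VehicleSighting("gate-2", "red", "MakeB", "suv"),
    VehicleSighting("car-park", "red", "MakeB", "sedan"),
]

# The witness remembers colour, make and type, but not the plate.
ranked = rank_vehicles(sightings, colour="red", make="MakeB", vehicle_type="suv")
```

Ranking rather than strict filtering tolerates a witness who misremembers one detail, at the cost of returning more candidates to review.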

Traditional intelligent video analysis technology was not able to recognise body shape, gender, age, hair colour or hair length, but Dahua’s deep learning technology has made all of this possible. The deep learning video analytics server handles recognition of up to 80 people within 40 ms. Human recognition is also well suited to crowded places with continuous flows of people, such as escalators, crossroads, business centres and the gates of exhibition centres, where its accuracy rate reaches up to 95%. As long as enough training has been done, the recognition rate is constrained only by how much of the target is exposed to the camera and by how fast the target is moving. The result is much as if a human operator were watching the video full time.

In recent years, the American TV series "Person of Interest" has been very popular. The series depicts crimes being predicted by AI. A software genius called Finch invents a programme that identifies potential violent criminals in advance by observing patterns. It sounds like science fiction, but it is close to becoming reality with AI deep learning.

A GPU-powered “Deep Sense” server can cover 192 channels of HD video. Previous Intelligent Video Analytics (IVA) could only monitor key entrances because of cost and capacity limitations; it is now technically and economically viable to fully monitor the surveillance system of a typical building campus. With a rich set of search criteria, a match is much more likely even without a clear face shot of the target. The system can trace the trail of a target to screen for “behaviour of interest”. This helps police solve crimes faster and deter criminals, thereby improving security. For example, if the police want to find a suspect who is a middle-aged man with a red umbrella, they can search the system using key words such as “red umbrella”, “male” and “30 to 50 years old”. The AI system can perform a quick search and therefore save a great deal of manual work.
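The red-umbrella search above can be sketched as a filter over structured records that a video analytics system might emit for each detected person. The record layout, the field names (`gender`, `age_range`, `tags`) and the sample data are assumptions made for illustration only; the article does not describe the actual query interface.

```python
def age_ranges_overlap(a, b):
    """Return True if two inclusive (low, high) age ranges share any age."""
    return a[0] <= b[1] and b[0] <= a[1]

def search_people(records, gender=None, age_range=None, required_tags=()):
    """Return the records that satisfy every criterion that was supplied."""
    required = set(required_tags)
    results = []
    for record in records:
        if gender is not None and record["gender"] != gender:
            continue
        if age_range is not None and not age_ranges_overlap(
            record["age_range"], age_range
        ):
            continue
        if not required <= set(record["tags"]):
            continue  # at least one required tag is missing
        results.append(record)
    return results

# Hypothetical per-person metadata; estimated ages are stored as ranges
# because an analytics system would estimate age rather than know it.
records = [
    {"id": 1, "gender": "male", "age_range": (35, 45), "tags": ["red umbrella", "backpack"]},
    {"id": 2, "gender": "female", "age_range": (30, 40), "tags": ["red umbrella"]},
    {"id": 3, "gender": "male", "age_range": (18, 25), "tags": ["red umbrella"]},
    {"id": 4, "gender": "male", "age_range": (40, 50), "tags": ["glasses"]},
]

hits = search_people(
    records,
    gender="male",
    age_range=(30, 50),
    required_tags=["red umbrella"],
)
```

Here only record 1 satisfies all three key words; each of the others fails on gender, age range or the umbrella tag.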

The development of AI applications will likely face many obstacles and difficulties, but the outlook is optimistic. Advances in human and vehicle recognition have had a significant impact on security applications. Voice recognition is likely to be the next driver. Acoustic patterns can be combined with human behaviour patterns or vehicle characteristics to narrow down a search faster and reduce false alarms. Voice can also be a form of data entry or interaction. Hand gestures, body gestures or a combination of these could help the “machine” to understand the context of what is happening.
