Real Time Facial Recognition

December 13, 2017

By: Dr. Peyman Askari

Last week we discussed the perils of brute-force facial recognition and how, with only 6,000 surveillance cameras, we could grind the US economy to a halt in just one year. Today we’re going to leverage a simple trick to dramatically increase that number.

With computation, the true enemy is complexity. Slow computation is bad, but computation that keeps getting slower is far worse, and that is the problem we encountered with our brute-force method. Just as accelerating an object adds to its inertial mass, which makes it harder to accelerate further, adding a face to the database increases its size, which makes it harder to match the next face. In the former case we will never reach light speed because that would require infinite energy. In the latter case we will never be able to retain all faces indefinitely because that would require infinite hardware. What we can do, however, is leverage statistics and mathematics to reduce the performance hit considerably.
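To make the slowdown concrete, here is a minimal sketch that counts comparisons, assuming the brute-force method compares each new face against every face already stored (a plain linear scan). The function name and the linear-scan model are illustrative assumptions, not the actual implementation.

```python
def brute_force_comparisons(num_faces: int) -> int:
    """Count comparisons when every new face is checked against all stored faces."""
    total = 0
    database_size = 0
    for _ in range(num_faces):
        total += database_size  # compare the new face against everything stored so far
        database_size += 1      # then add the new face to the database
    return total


# Doubling the number of faces roughly quadruples the total work.
for n in (1_000, 2_000, 4_000):
    print(f"{n:>5} faces -> {brute_force_comparisons(n):>9,} comparisons")
```

Under this assumption the total is n(n-1)/2, so the work grows with the square of the number of faces rather than in step with it.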

Using statistics, or basic logic, we can guess that at 30 frames per second (fps), a face is more likely to be a repeat than a new face, simply because people do not move at insanely fast speeds. Now we tweak our algorithm to group similar faces together, or in other words, to create the concept of a person. Using this person approach, a 1-hour video contains 1 hour x 60 minutes / hour x 60 seconds / minute x 30 frames per second = 108,000 frames, and with 1 face to match every 5 seconds, or every 150 frames, we get 108,000 frames x 1 face / 150 frames = 720 faces to match. This reduction in workload brings our yearly costs on 100 cameras down from $1.2B to only $55K, and America can now perform facial recognition on 700,000 cameras per year without being forced to suspend social security and universal health care.
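The frame arithmetic above can be checked with a few lines. This is only a sketch of the counting; the constant and function names are made up for illustration, and it assumes the brute-force approach matches one face per frame.

```python
FPS = 30
SECONDS_PER_HOUR = 60 * 60


def frames_per_hour(fps: int = FPS) -> int:
    """Frames in one hour of video, i.e. brute-force matches at one face per frame."""
    return SECONDS_PER_HOUR * fps


def person_matches_per_hour(seconds_per_face: int = 5) -> int:
    """Matches per hour when only one face is matched every few seconds."""
    return SECONDS_PER_HOUR // seconds_per_face


frames = frames_per_hour()            # 108,000 frames
matches = person_matches_per_hour()   # 720 faces to match
reduction = frames // matches         # 150 times fewer matches

print(f"{frames:,} frames, {matches:,} matches, {reduction}x fewer")
```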

That may seem like a lot of cameras, but with nearly 7 billion people inhabiting this planet, that isn’t enough cameras to surveil 1% of the population. To put it in perspective, China alone has 140 million surveillance cameras. Let’s instead ask the question like so: how many cameras would we need to perform facial recognition on every single person on the planet at all times? The earth has roughly 150 million square kilometers of land, so let’s take a distribution of 1 camera per square kilometer, partly to average out areas of high camera concentration (e.g. London Heathrow) and areas of low camera concentration (e.g. the Sahara Desert), but also partly because it gives us a nice round number of 150 million surveillance cameras. With this many cameras, how much money would we need to carry out facial recognition for 1 year? About $125 quadrillion, or roughly 1,600 times the total world economy of $75 trillion. In fact, with $75 trillion, you could run 150 million cameras for only a pitiful two and a half days.
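As a quick sanity check on the scale, the sketch below reproduces the camera count and the cost ratio from the figures quoted in the text. The variable names are illustrative, and the yearly cost is taken as given rather than derived.

```python
LAND_AREA_KM2 = 150_000_000          # rough land area of the earth
CAMERAS_PER_KM2 = 1                  # assumed average camera density
YEARLY_COST_USD = 125 * 10**15       # $125 quadrillion, as quoted in the text
WORLD_ECONOMY_USD = 75 * 10**12      # $75 trillion, as quoted in the text

cameras = LAND_AREA_KM2 * CAMERAS_PER_KM2
cost_ratio = YEARLY_COST_USD / WORLD_ECONOMY_USD

print(f"{cameras:,} cameras")
print(f"yearly cost is about {cost_ratio:,.0f} times the world economy")
```

With these rounded inputs the ratio comes out near 1,667, in the same neighborhood as the figure quoted above.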

This happens because, although we addressed one of our challenges by reducing the total number of matches, we did not address our main issue: the incremental growth in complexity over time. To address that, we will turn to math.

It turns out our main bottleneck is that dastardly database. Remember that we add no more cameras as time goes on; we simply add more time, or to be more specific, more faces over time. Reducing the frequency at which we match against the database (one face every five seconds as opposed to every frame) is one way to address this. Another is to postpone the matching against the large database by creating a small intermediary database stored locally on each camera. All faces are first matched against the local database on the camera for a duration slightly longer than 5 seconds, say 60 seconds, and only once all unique faces (i.e. people) have been identified are they matched against the large database. For crowded areas, such as subway stations or concerts, the uniqueness factor, which here means the percentage of faces belonging to a person already seen in the local window, is essentially zero, as every face is new. For most other places, such as an office or commercial setting, the uniqueness factor is closer to 75%, meaning that a face picked out of a frame has a 75% chance of belonging to a person seen up to 60 seconds ago. Skipping the math, which gets very complicated, we see that with a local database, running facial recognition on 150 million cameras for one year now costs only $2.2 trillion (of course, I use the term “only” lightly here), and with the world economy we can run close to a billion cameras for a year before the UN has to assume control and declare global bankruptcy.
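The local database idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual system: faces are represented as hypothetical embedding tuples, two faces count as the same person when their Euclidean distance falls below an assumed threshold, and the class and method names are invented for the example.

```python
import math
from dataclasses import dataclass, field

Embedding = tuple[float, ...]  # hypothetical face descriptor


@dataclass
class LocalFaceDatabase:
    window_seconds: float = 60.0   # how long faces are held on the camera
    match_threshold: float = 0.5   # assumed distance cutoff for "same person"
    window_start: float = 0.0
    people: list[Embedding] = field(default_factory=list)

    def observe(self, face: Embedding, timestamp: float) -> list[Embedding]:
        """Record a face; return the people to send upstream if the window closed."""
        to_upload: list[Embedding] = []
        if timestamp - self.window_start >= self.window_seconds:
            to_upload = self.people      # unique people from the finished window
            self.people = []
            self.window_start = timestamp
        # Only store the face if it matches nobody already seen in this window.
        if not any(math.dist(face, known) < self.match_threshold
                   for known in self.people):
            self.people.append(face)
        return to_upload


# Two people alternate in front of the camera, one face per second, for two minutes.
db = LocalFaceDatabase()
alice, bob = (0.0, 0.0), (5.0, 5.0)
uploads = [db.observe(alice if s % 2 == 0 else bob, float(s)) for s in range(121)]
sent = [batch for batch in uploads if batch]
faces_sent = sum(len(batch) for batch in sent)
print(f"{len(uploads)} faces observed, {faces_sent} sent to the large database")
```

In this toy run, 121 observed faces collapse into two batches of two people each, so the large database is queried 4 times instead of 121. The saving in practice depends on how many faces in each window are repeats.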

Although these numbers are far more accommodating, they are only postponing the inevitable. What we really want is an algorithm that is as sophisticated as the human brain. Such a solution does exist, and gets pretty close to the level of sophistication of the brain. This will be the topic of discussion for next week.

© 2018 Prilyx Research & Development Ltd. | 2220 – 1050 W Pender St Vancouver, BC. Canada, V6E 3S7
+1 604. 428. 5030 | info@prilyx.com