Facebook and Arm ML: Beyond the Perfect Selfie

How Facebook Artificial Intelligence and the Arm Compute Library are working together to enable optimal ML applications

Once upon a time, machine learning (ML) on your smartphone meant predictive text or simple speech and face recognition, often sharing the workload between the device itself and computers in the cloud. But with the advent of new, turbo-charged CPUs and GPUs capable of accelerating ML algorithms – and of specialised ML processors such as the Arm ML processor – more and more processing is being done at the edge, on-device. The capabilities of our mobile devices are growing fast, opening the door to novel, more ambitious use cases.

Edge compute brings numerous benefits: by sending less data to the cloud for processing, cost, power and bandwidth demands are reduced – not to mention latency – while security and reliability are improved. With faster processors, edge ML can also add new levels of sophistication to existing apps, bringing, for example, a professional touch to one of the most popular smartphone use cases.

Superlative Selfies

Smartphone camera lenses tend to keep everything more or less in focus – which is great for an even, homogeneous image, such as a landscape, but less flattering for the smartphone’s most ubiquitous shot, the selfie.

When professional photographers take a portrait, they add depth by focusing on the face and softly blurring the background. This has the dual benefit of drawing attention to the area of interest – in this case, the face – and creating distinct layers within the frame. One way to achieve this on a smartphone is to use an image segmentation network to determine what is foreground and what is background. This technique works with a single camera – in theory, any smartphone camera – and is what the Facebook AI team implemented for the first version of Instagram Focus.
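The compositing step behind this technique can be sketched in a few lines of NumPy: given a per-pixel foreground mask (which, in practice, would come from the segmentation network), the sharp foreground is alpha-blended over a blurred copy of the frame. The `portrait_blur` function and the simple box blur below are illustrative assumptions, not Facebook's actual implementation.

```python
import numpy as np

def portrait_blur(image, mask, kernel=9):
    """Composite a sharp foreground over a blurred background.

    image: HxWx3 float array in [0, 1].
    mask:  HxW float array in [0, 1], 1.0 where the segmentation
           network predicts foreground (e.g. the face).
    The mask is taken as an input here so the compositing step can
    be shown on its own, independent of any particular network.
    """
    # Separable-style box blur as a stand-in for photographic bokeh;
    # a production pipeline would use a more realistic disc blur.
    pad = kernel // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    blurred = np.zeros_like(image)
    for dy in range(kernel):
        for dx in range(kernel):
            blurred += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    blurred /= kernel * kernel
    # Alpha-blend: foreground stays sharp, background is blurred.
    alpha = mask[..., None]
    return alpha * image + (1.0 - alpha) * blurred
```

Everything outside the mask is softened, while masked pixels pass through untouched – the same foreground/background split the segmentation network provides in Focus.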

Helping to perfect the selfie: Instagram Focus in action

Instagram Focus

Whether using selfie mode or the standard, rear-facing camera, Focus uses the image segmentation network to automatically home in on the subject of the image while blurring the background to create a professional-looking shot. As you might imagine, this is a complex technique that requires significant additional processing to run quickly and efficiently, and as a result it was deployed selectively to higher-end platforms supporting the necessary optimizations. Thanks to a close collaboration with Arm and the Compute Library team, these platforms also include a number of devices with Arm Mali GPUs.

A Key Collaboration

The desire to move other key ML workloads on-device was the catalyst for an exciting collaboration between Facebook and Arm, which began back in 2017 with work to integrate the Compute Library with Caffe2.

And now that Facebook has integrated the performance of Caffe2 into the new PyTorch 1.0 deep learning platform, it will be even easier to implement a range of ML features across a wide variety of mobile applications. The Compute Library aims to provide best-in-class ML optimizations for Mali GPUs – as well as Arm CPUs – so it's a natural choice for bringing these technologies together, and a great way to test the functionality and performance of the Compute Library through real-world use cases.

The code can be sourced directly from the Compute Library.

ML workloads are typified by large amounts of compute and data transfer. Both are expensive on energy-constrained devices such as mobile phones, so working out how best to distribute the workload across the available cores is complex, but vital. Much of the collaboration between Arm and Facebook has focused – and continues to focus – on how best to architect the processing and data flow for a particular network. Ultimately, the aim is to make it easy for developers to create optimal, portable ML applications. This work continues and will target new processors and techniques as they become available.

Looking Forward

As ML technologies mature, there will be many more examples of ML running locally within applications, reducing reliance on the cloud for key functions and eliminating round-trip latency.

Arm is excited to continue our collaboration with Facebook and the PyTorch community to integrate our hardware and software ML technologies. As hardware and software evolve – alongside the constant improvements in AI and ML technology – the capabilities of these features will become more sophisticated and more ambitious. And as the technology is made available to the developer community, it will be fascinating to see what use cases emerge.

The latest versions of the software that were part of this collaboration are available from the links below, so why not download them and see what benefits machine learning can bring to your Arm platforms?

Download the Compute Library