
Google explains how the 3D depth photos in Google Photos work



Last December, Google Photos added a notable new feature: photos with an automatic 3D effect. Google calls them 'cinematic photos', and the app generates them automatically; you can find them by tapping the recent highlights section.



On its blog, Google has explained how it adds movement to these photos and gives them such a striking 3D effect. As always, it relies on its neural networks and computational expertise.








The technology behind Google's 'cinematic photos'





According to Google, cinematic photos are meant to help the user relive "the feeling of immersion of the moment in which the photo was taken" by simulating both the movement of the camera and 3D parallax. So how do you convert a 2D image into a 3D one?

Google uses neural networks trained on photos taken with the Pixel 4 to estimate depth from a single RGB image



Google explains that, like portrait mode and augmented reality, cinematic photos need a depth map that provides information about the 3D structure of the scene. To achieve the effect on any phone, including those without a dual camera, it trained a convolutional neural network to predict a depth map from a single RGB image.














With only one point of view (the plane of the photo), the network estimates depth from monocular cues such as the relative sizes of objects, the perspective of the photo, blur and more. To make this information more complete, Google uses data collected with the Pixel 4's camera and combines it with other photos taken by the Google team with professional cameras.
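To make the "relative size" cue concrete, here is a small illustrative sketch based on the standard pinhole-camera relation, in which an object's apparent height shrinks in proportion to its distance. This is not Google's model, which learns such cues from data; the function name and the numbers are assumptions chosen purely for illustration.

```python
def distance_from_apparent_size(focal_length_px, real_height_m, image_height_px):
    """Pinhole-camera relation: distance = focal_length * real_height / image_height.

    All names and values here are hypothetical; a learned depth network
    does not compute this formula explicitly.
    """
    if image_height_px <= 0:
        raise ValueError("image_height_px must be positive")
    return focal_length_px * real_height_m / image_height_px


# Two objects of the same real height (2 m), seen with the same camera.
near = distance_from_apparent_size(1000.0, 2.0, 500.0)
far = distance_from_apparent_size(1000.0, 2.0, 250.0)

# The object that looks half as tall is twice as far away.
print(near, far, far / near)
```

The same reasoning only works when the real size of the object is known or can be guessed, which is part of why a network trained on many photos is useful here.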



Basically, the technique is similar to the Pixel's portrait mode: the image is analyzed and segmented and, once the background is isolated, motion is simulated by moving the background. In practice it is considerably more complex, because the photo needs several corrections and additional analysis: a few misinterpreted pixels could ruin the final result.
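The idea of simulating motion from a depth map can be sketched in a few lines. The example below is a deliberately simplified, hypothetical illustration and not Google's pipeline: it assumes a depth map with values between 0 (nearest) and 1 (farthest), and shifts each pixel sideways by an amount that grows as the pixel gets closer to the camera. The function name `parallax_shift` and its parameters are invented for this sketch.

```python
def parallax_shift(image, depth, camera_dx, hole=None):
    """Shift pixels horizontally to mimic a small sideways camera move.

    image: 2D list of pixel values (rows of equal length).
    depth: 2D list of the same shape, 0.0 = nearest, 1.0 = farthest (assumed).
    camera_dx: shift in pixels applied to the nearest pixels.
    hole: value left where no source pixel lands.
    """
    height, width = len(image), len(image[0])
    out = [[hole] * width for _ in range(height)]
    for y in range(height):
        # Paint far pixels first so that nearer pixels overwrite (occlude) them.
        order = sorted(range(width), key=lambda x: depth[y][x], reverse=True)
        for x in order:
            # Nearer pixels move more; the farthest pixels do not move at all.
            shift = round(camera_dx * (1.0 - depth[y][x]))
            nx = x + shift
            if 0 <= nx < width:
                out[y][nx] = image[y][x]
    return out


# One row: "c" is a foreground object, everything else is far background.
image = [["a", "b", "c", "d"]]
depth = [[1.0, 1.0, 0.0, 1.0]]
frame = parallax_shift(image, depth, camera_dx=1)
print(frame)  # [['a', 'b', None, 'c']]
```

The foreground pixel moves one step to the right, covers the background pixel there, and leaves a hole where it used to be. Filling such holes plausibly, and avoiding wrong shifts when the depth of a pixel is misjudged, is the kind of correction that makes the real task hard.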



More information | Google