After various leaks and rumors, Google finally unveiled the Pixel 5 and Pixel 4a 5G in September this year. Unsurprisingly, the devices arrived with a number of new Google Camera features that set them apart from other Android phones on the market, including Cinematic Pan for shake-free panning in videos, Active and Locked stabilization modes, Night Sight support in Portrait Mode, and a Portrait Light feature that automatically adjusts the lighting of portrait shots. A few weeks after launch, Google rolled out most of these features to older Pixel devices via an update to Google Photos, and the company has now shared some details about the technology behind Portrait Light.
According to a recent company blog post, the Portrait Light feature was inspired by the off-camera lights portrait photographers use. It enhances portraits by modeling a repositionable light source that can be added to the scene; when applied automatically, the artificial light source's direction and intensity are adjusted by machine learning to complement the photo's existing lighting.
As Google explains, the feature makes use of new machine learning models that were trained on a diverse data set of photos captured on the Light Stage computational lighting system. These models allow for two algorithmic capabilities:
- Automatic Directional Light Placement – Guided by the machine learning model, the feature places an artificial light source in a way that is consistent with how a professional photographer would have positioned an off-camera light in the real world.
- Post-Capture Synthetic Relighting – Given the direction and intensity of the existing light in a portrait, the machine learning model adds synthetic light that looks realistic and natural.
For automatic directional light placement, Google trained a machine learning model to estimate a high dynamic range, omnidirectional lighting profile for a scene based on an input portrait. This lighting estimation model infers the direction, relative intensity, and color of all light sources in the scene coming from all directions, treating the face as a light probe. The system also estimates the subject's head pose using MediaPipe Face Mesh. Based on this data, the algorithm determines the direction for the synthetic light.
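To make the head-pose part of this step concrete, here is a minimal sketch using MediaPipe Face Mesh. Google's lighting estimation model is not public, so the `estimate_omnidirectional_lighting` function and the light-placement heuristic below are assumptions for illustration only, not the actual algorithm.

```python
# Hedged sketch: face landmarks via MediaPipe Face Mesh plus a placeholder for
# the (non-public) HDR lighting-estimation model and light-placement logic.
import cv2
import numpy as np
import mediapipe as mp

def head_landmarks(image_bgr):
    """Return normalized face-mesh landmarks (468 x 3) for the first detected face."""
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True,
                                         max_num_faces=1) as face_mesh:
        results = face_mesh.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_face_landmarks:
        return None
    lms = results.multi_face_landmarks[0].landmark
    return np.array([[lm.x, lm.y, lm.z] for lm in lms])

def estimate_omnidirectional_lighting(image_bgr):
    """Placeholder for Google's lighting-estimation model (not public).
    Returns a dominant light direction as a unit vector in camera space."""
    d = np.array([0.5, -0.5, -0.7], dtype=np.float32)
    return d / np.linalg.norm(d)

def choose_synthetic_light_direction(image_bgr):
    """Pick a synthetic key-light direction; the heuristic here is a stand-in
    for the learned placement described in the blog post."""
    dominant = estimate_omnidirectional_lighting(image_bgr)
    lms = head_landmarks(image_bgr)
    if lms is None:
        return dominant, dominant
    nose, chin = lms[1], lms[152]                 # commonly used Face Mesh indices
    facing = 1.0 if nose[0] >= chin[0] else -1.0  # crude yaw cue
    key = np.array([0.7 * facing, -0.4, -0.6])    # roughly 45 degrees off-axis
    return key / np.linalg.norm(key), dominant

# Usage (assuming a local file): key_dir, dominant_dir =
#     choose_synthetic_light_direction(cv2.imread("portrait.jpg"))
```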
Once the direction and intensity of the synthetic lighting are established, a second machine learning model adds the synthetic light source to the original photo. This model was trained on millions of pairs of portraits, both with and without extra lights. The dataset was generated by photographing seventy different people using the Light Stage computational lighting system, a spherical lighting rig that includes 64 cameras with different viewpoints and 331 individually programmable LED light sources.
Each of the seventy subjects was captured while illuminated one light at a time (OLAT) by each of the 331 LEDs. This produced their reflectance field, that is, their appearance as illuminated by the discrete sections of the spherical environment. The reflectance field encoded the unique color and light-reflecting properties of the subject's skin, hair, and clothing, and determined how shiny or dull each material appeared in photos.
These OLAT images were then linearly combined to generate realistic images of each subject as they would appear in any image-based lighting environment, with complex light transport phenomena such as subsurface scattering correctly represented.
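The linear combination step itself is straightforward: because light transport is linear, relighting reduces to a weighted sum of the OLAT captures. The sketch below illustrates the idea with assumed array shapes and toy data; it is not Google's actual pipeline.

```python
# Hedged sketch of OLAT-based relighting: re-render a subject under a new
# environment as a weighted sum of one-light-at-a-time (OLAT) captures.
import numpy as np

def relight_from_olat(olat_images, light_weights):
    """olat_images: (N, H, W, 3) linear-radiance OLAT captures.
    light_weights: (N, 3) per-LED RGB intensities sampled from the target
    environment map (one sample per LED direction). Returns (H, W, 3)."""
    # Light transport is linear, so the relit image is a weighted sum of the
    # OLAT images; effects like subsurface scattering are baked into each capture.
    return np.einsum('nc,nhwc->hwc', light_weights, olat_images)

# Toy usage with random data standing in for real captures.
rng = np.random.default_rng(0)
olat = rng.random((331, 64, 64, 3)).astype(np.float32)
weights = rng.random((331, 3)).astype(np.float32) / 331.0
relit = relight_from_olat(olat, weights)   # shape (64, 64, 3)
```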
Then, instead of training the machine learning model to predict the relit image directly, Google trained it to output a low-resolution quotient (ratio) image that could be applied to the original input to produce the desired output. This approach is computationally efficient and encourages only low-frequency lighting changes, leaving the high-frequency image details to be transferred directly from the input image to preserve quality.
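Applying such a quotient image is essentially an upsample-and-multiply, as in the hedged sketch below; the bilinear upsampling, the [0, 1] value range, and the clipping are assumptions, since the post does not spell out those details.

```python
# Hedged sketch of applying a predicted low-resolution quotient (ratio) image.
import cv2
import numpy as np

def apply_quotient_image(input_image, low_res_ratio):
    """input_image: (H, W, 3) float image, assumed in [0, 1].
    low_res_ratio: (h, w, 3) predicted relit/original ratio, with h << H."""
    H, W = input_image.shape[:2]
    # Bilinear upsampling is an assumption; Google does not specify the filter.
    ratio_full = cv2.resize(low_res_ratio.astype(np.float32), (W, H),
                            interpolation=cv2.INTER_LINEAR)
    # Multiplying by the ratio changes lighting at low frequencies while the
    # high-frequency detail comes straight from the input image.
    return np.clip(input_image * ratio_full, 0.0, 1.0)
```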
Additionally, Google trained the model to emulate the optical behavior of light sources reflecting off relatively matte surfaces. To do so, the company trained the model to estimate the surface normals from the input photo and then applied Lambert's law to calculate a "light visibility map" for the desired lighting direction. This light visibility map is then provided as input to the quotient image predictor, ensuring that the model is trained with physics-based insight.
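Lambert's law makes this visibility map a simple clamped dot product between the per-pixel surface normals and the light direction. The sketch below assumes the normals are already available as an array; in Google's pipeline they come from a learned estimator.

```python
# Hedged sketch of a Lambertian "light visibility map" from surface normals.
import numpy as np

def light_visibility_map(normals, light_dir):
    """normals: (H, W, 3) unit surface normals; light_dir: (3,) vector pointing
    from the surface toward the light. Returns an (H, W) visibility map."""
    light_dir = np.asarray(light_dir, dtype=np.float32)
    light_dir = light_dir / np.linalg.norm(light_dir)
    # Lambert's cosine law, clamped so back-facing pixels receive no light.
    return np.clip(np.einsum('hwc,c->hw', normals, light_dir), 0.0, None)
```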
While all of this may sound like a lengthy process that would tax the Pixel 5's mid-range hardware, Google claims that Portrait Light has been optimized to run at interactive frame rates on mobile devices, with a total model size of less than 10MB.