Amazon Goes 3D (plus more from Disney & Facebook)

Amazon’s 3D body scanner, Disney’s expressive AI voices & Facebook’s music fetcher

1. Amazon – predictive personalised 3D body models

Amazon is working on generating 3D body models of people based on 2D images of them.

Currently, 3D modelling of human bodies requires large and expensive scanners, making it impractical to do in your own home.

Amazon is looking to create an app that instructs users to take photos of themselves from different directions. From these images, Amazon will build a 3D model of the user based on estimated body measurements. For example, Amazon will estimate a user’s weight, body fat, body dimensions (e.g. arm length), and skin texture / colour.

Users could then interact with this model. For example, a user may wonder what they would look like if they lost some body fat and gained some muscle; Amazon’s interface will provide sliders to adjust these parameters.
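To make the idea concrete, here’s a minimal sketch of what such an adjustable body model could look like in code. The parameter names and the crude linear weight adjustment are my own assumptions for illustration, not anything from Amazon’s filing:

```python
from dataclasses import dataclass, replace

# Hypothetical parameters of the kind the filing says Amazon would estimate.
@dataclass(frozen=True)
class BodyModel:
    weight_kg: float
    body_fat_pct: float
    arm_length_cm: float

def adjust(model: BodyModel, body_fat_delta: float, muscle_delta_kg: float) -> BodyModel:
    """Apply slider deltas to the model. The linear maths here is purely
    illustrative; a real system would re-render the 3D mesh."""
    new_fat = max(3.0, model.body_fat_pct + body_fat_delta)
    # Assume fat-percentage changes scale weight proportionally,
    # and muscle changes add weight directly.
    new_weight = model.weight_kg + muscle_delta_kg + model.weight_kg * body_fat_delta / 100.0
    return replace(model, body_fat_pct=new_fat, weight_kg=new_weight)

me = BodyModel(weight_kg=80.0, body_fat_pct=22.0, arm_length_cm=64.0)
slimmer = adjust(me, body_fat_delta=-5.0, muscle_delta_kg=2.0)
```

The `replace` call keeps each adjusted model as a new snapshot, which maps naturally onto a slider UI where the user can drag back and forth without mutating the original scan.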

What’s Amazon’s game plan?

Well, in this issue of Patent Drop, we saw that Amazon filed a patent application for users to be able to order customised clothing. In that filing, there was a small mention of Amazon looking to capture a user’s measurements via submitted photos or through a camera.

This latest filing suggests how Amazon might enable users to order customised clothing that fits their body measurements.

But before that, having 3D body models of users could let them virtually try on items of clothing to make sure they’re ordering the right size. In theory, this could help minimise returns and, in turn, save on costs. It could also become a separate technology layer that Amazon sells to other e-commerce platforms.

And maybe more wildly, could there be interesting media applications from creating a database of 3D human body scans? For example, imagine if you could ‘inject’ yourself into an Amazon Prime movie as one of the main actors.

Will keep an eye out for any future developments around this.

2. Disney – emotionally expressive synthesized voice

Disney is working on making speech generation sound more emotionally expressive.

The difficulty of generating emotionally expressive voices is that one word can be enunciated in numerous ways, each of which would reveal a different emotional state of the speaker. Most speech generation models are trained to generate the most likely averaged speech, which tends to be neutral in tone.

Without getting lost in the details: Disney is essentially looking to train neural networks to generate speech that takes on different emotional contexts, such as happiness, sadness, anger, fear, excitement, affection and dislike. As shown in the above image, there’ll be audio templates where speech is generated to fill in the gaps, in a way that aligns with the emotional context of the whole sentence.
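One common way to condition a speech model on emotion (my assumption of how such a system might work, not Disney’s actual architecture) is to tag every frame of text features with an emotion embedding before the decoder synthesises audio. A minimal sketch:

```python
# Illustrative sketch: condition a speech generator on an emotion label
# by appending a one-hot emotion vector to each text-feature frame.
# The emotion list and feature shapes are assumptions for demonstration.
EMOTIONS = ["neutral", "happy", "sad", "angry", "fearful", "excited"]

def emotion_one_hot(emotion: str) -> list[float]:
    vec = [0.0] * len(EMOTIONS)
    vec[EMOTIONS.index(emotion)] = 1.0
    return vec

def condition_frames(text_frames: list[list[float]], emotion: str) -> list[list[float]]:
    """Append the emotion vector to every frame of text features,
    so a downstream decoder can vary prosody by emotional context."""
    tag = emotion_one_hot(emotion)
    return [frame + tag for frame in text_frames]

frames = [[0.1, 0.2], [0.3, 0.4]]          # stand-in phoneme features
conditioned = condition_frames(frames, "happy")
```

In practice the emotion vector would usually be a learned embedding rather than a one-hot tag, but the principle is the same: the same sentence produces different audio depending on the emotional context fed in alongside it.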

Why is this interesting?

In a previous issue of Patent Drop, we saw that Disney is working on AI role-playing experiences where an AI chatbot plays an intelligent character with kids. Moreover, in #014, we saw that Disney is looking at deploying robot actors that interact with people in a life-like fashion.

Putting this all together, Disney is looking to bring its characters to life as physical, embodied, intelligent, emotionally expressive creatures that people can interact with. I wouldn’t be surprised if Disney becomes one of the major breakout brands for consumerising robotics.

If you’re interested in this space, check out what Sonantic is doing with expressive AI voices for the game voiceover industry.

3. Facebook – fetching music

If you’ve been following the last few editions of Patent Drop, there’s been this recurring music theme in the weekly Facebook patent filings.

In this latest filing, Facebook describes a system for easily transferring music from one platform to another. For example, say you’re in Apple Music and you take a screenshot of the song you’re playing. Facebook’s filing would let a user upload that screenshot in another app and have the song fetched there.

Why?

In theory, it could create a layer of interoperability between music apps. If your friend uses Spotify and recommends a song, they could send you a screenshot, which you could upload into Apple Music to pull up the right track.

For Facebook and Instagram specifically, this music fetching could help users easily bring in songs they want to use in their Stories or Reels, without needing to search for the song by text. Moreover, viewers of a Story or Reel with music could screenshot the song playing and easily find it in whatever music app they use.

From the last few weeks of filings, it feels like Facebook is exploring how music can become a more integral part of its ecosystem, beyond just being a piece of content to add to Stories & Reels.