“Imagine J.A.R.V.I.S. from Iron Man – that’s really what we’re building.”
New startup Corto is harnessing the power of artificial intelligence (AI) and data analysis to reimagine the creative process for the entertainment and marketing industries.
Every month, Corto collects 5 million unique conversations on media and entertainment in North America, China, Southeast Asia and Latin America. This knowledge bank forms the basis of their work and constitutes the largest repository of audience data in any of these regions. Yves Bergquist, co-founder and CEO of Corto, plans to leverage this wealth of data for the media and entertainment industries to drive performance and deliver deeper connections with audiences by unearthing the cognitive relationship between content and audience.
Here, Bergquist speaks with us about the holy grail of creative novelty, the highly personalized future of entertainment and how he’s using AI and data to enhance storytelling and marketing.
What is Corto?
Corto is an early-stage AI startup that is leveraging about 10 years of research and development that my team and I have been doing in the field, as well as the R&D we have been doing for the past two years at the Entertainment Technology Center at USC, applying deep AI solutions to media and entertainment, creative and storytelling products.
You’re a data scientist. How does your background and expertise lend itself to creativity?
My passions for creativity and math collided when I started thinking about the creative process: it seems to be very algorithmic, sequential and procedural. It seemed complex, but not infinitely complex, and that is a pretty good problem to try to break down using quantitative methods.
The world seems random, but stories are extremely logical and in fact, there’s an opportunity to make them a little less logical and create more innovation in the storytelling process.
In a previous interview you said you and your team set out to “revolutionize Hollywood.” What did you mean by that?
I regret saying that. It was a little arrogant. But what I would say is that the media and entertainment industry is not currently leveraging the full range of algorithms and systems that have come out of AI in the past few years. All the insight tools in the industry right now are about 10 years old, and we want to leverage all the exciting R&D that has been going on in the AI and machine learning community to solve problems in the industry. For example, how do you map the virality of a piece of content? How do you map how certain communities influence other communities? How do you separate the hardcore fans of something from the groups of people who aren’t fans yet but probably will become fans with a bit of nudging?
How granular can your analysis get to help brands or films understand their audience?
In the entertainment space, as long as there’s passion in the conversation, we can get very granular. For example, we have sentiment analysis that can tell you if people feel neurotic, happy or type A about your product. Also, since we have a huge database of scripts, films, TV shows and even political speeches, we can take the narrative structure of your brand and tell you what kind of film or TV show, or what kind of character in a film or TV show, your brand feels like to an audience. We can do a lot of work around audience affinity and can get down in the weeds with how people are talking about you.
We also have the ability to tell, for every single zip code in America, what people like and dislike when it comes to brands, TV shows and restaurants. We’re able to infer and see in the data which brands are good candidates to sponsor a show because we see a direct affinity between the brands, the audience and the type of show. We’re also able to suggest topics to cover in the show that will resonate with the largest possible audience, and ways to evolve the show that will grow its audience.
On the topic of being granular, you’re able to parse any text for 60 different dimensions of emotional tonality…
Yes. We’re using a natural language processing application that is probably the most sophisticated at identifying the emotional tonality of a text, based not just on which words are present but also on the sequence of those words, and on what signals that gives us about the people writing and saying them. This works really well for scripts and social media conversations because it picks up on the nuances. We’ve now indexed a few thousand scripts, we’ve parsed them, and we can give you the emotional tonality of every single character and every single line in a script.
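Corto’s actual models aren’t public, but the basic idea of scoring emotional tonality per line and per character can be sketched with a toy lexicon-based approach. The lexicon, dimensions and scores below are invented for illustration (Corto reportedly tracks around 60 dimensions and also uses word order, which this sketch ignores):

```python
from collections import defaultdict

# Toy emotion lexicon: word -> (dimension, weight).
# Two dimensions shown here, purely for illustration.
LEXICON = {
    "love": ("joy", 1.0),
    "wonderful": ("joy", 0.8),
    "afraid": ("fear", 0.9),
    "dark": ("fear", 0.4),
}

def score_line(line):
    """Return emotional tonality scores for one line of dialogue."""
    scores = defaultdict(float)
    for word in line.lower().split():
        word = word.strip(".,!?")
        if word in LEXICON:
            dim, weight = LEXICON[word]
            scores[dim] += weight
    return dict(scores)

def score_script(script):
    """Aggregate tonality per character from (character, line) pairs."""
    per_character = defaultdict(lambda: defaultdict(float))
    for character, line in script:
        for dim, val in score_line(line).items():
            per_character[character][dim] += val
    return {c: dict(d) for c, d in per_character.items()}

script = [
    ("TONY", "I love this wonderful machine."),
    ("PEPPER", "I'm afraid of what's in the dark."),
]
print(score_script(script))
```

A production system would replace the hand-built lexicon with a trained model that also captures word sequence, but the per-line and per-character aggregation would look much the same.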
I like to compare what we do to the microscope and how the microscope revolutionized biology: you now have the tools to look deeply into your audiences and ask why people like this story, character, product or brand, and to look very scientifically and mathematically into that.
How does this work in the world of visuals?
In the visual sense it’s even more ambitious and comprehensive. We’re working on an application called Vid2Vec, which takes every single visual attribute, in every frame, every second, every scene: color, composition, objects, etc. We even classify the tonalities of the audio together with the emotional tonality being explored at that particular moment. It gives us a very semantic representation of everything that’s going on in a frame, so you can create some very granular content recommendation models.
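Vid2Vec’s internals haven’t been published, but the general idea, turning per-frame attributes into a vector so scenes can be compared for recommendation, can be sketched as follows. The attribute vocabulary and example scenes are invented for illustration, not Corto’s feature set:

```python
import math

# Illustrative attribute vocabulary; the real feature set is not public.
VOCAB = ["warm_palette", "cool_palette", "close_up", "wide_shot",
         "faces", "dialogue", "tense_score", "upbeat_score"]

def frame_to_vec(attributes):
    """One-hot encode a frame's detected attributes over a fixed vocabulary."""
    return [1.0 if a in attributes else 0.0 for a in VOCAB]

def cosine(u, v):
    """Cosine similarity between two attribute vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

scene_a = frame_to_vec({"warm_palette", "close_up", "faces", "dialogue"})
scene_b = frame_to_vec({"warm_palette", "close_up", "faces", "upbeat_score"})
scene_c = frame_to_vec({"cool_palette", "wide_shot", "tense_score"})

print(cosine(scene_a, scene_b))  # similar scenes score high
print(cosine(scene_a, scene_c))  # dissimilar scenes score low
```

A recommendation model can then rank content by how similar its scene vectors are to scenes a viewer already responds to; real systems would use learned embeddings rather than one-hot vectors, but the comparison step is the same.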
When it comes to storytelling, there seems to be an appetite from audiences for something more complex. How can Corto help develop unique stories for the entertainment industry?
This is the most important and, I think, probably the coolest thing we’re doing. We have a mathematical and cognitive representation of interestingness: everything our brain finds interesting. That applies to everything, from media content and people to relationships and food, literally everything our brain perceives. In everything our brain finds interesting, there is a specific ratio between attributes that are known and traditionally associated with that thing and attributes that are novel.
For example, if you’re watching a film and you feel like you’ve seen this scene, character or story a million times before (this happens a lot of the time), those attributes of the film are in the “known” category, which we call the canonical category, because they’re canons of the genre. On the other hand, if I put you in a completely new environment, or make you listen to music that is completely random and that your brain can’t represent because it’s never experienced anything like it before, all the attributes of that track are in the “novelty” domain. If too much of that novelty happens, it’s overwhelming. The sweet spot is in the middle. We think that somewhere in this methodology is the holy grail of creative novelty and surprise in media.
Are there any overarching themes you’ve noticed from the global data you have gathered?
What I’ve noticed is that audiences are not as different as we think across cultures. There are some variants, of course, but it’s not as much as people think. There is a common human cognitive way of approaching stories and characters that is the same everywhere.
We also found a lot of surprising things [from our data]. In a film there are generally two problems within the story that need to be solved: usually, the main character has a problem that needs to be solved, and there’s a problem at the community level that also needs to be solved. We found that audiences prefer that the problem be solved at the community level rather than the personal level. In other words, if both problems are solved at the end of the film, then great, the film does really well. If neither problem gets solved, the film will do poorly. If the main character gets what they want but the community-level problem isn’t solved, that film will also do poorly. But if the problem gets solved at the community level and the main character doesn’t get what they want, then the film tends to do very well. These are the kinds of things we’re noticing, and there are a million of them out there.
We’re looking to raise a little bit of money. We have a prototype that we’ve been able to show that’s very exciting. The way you interact with all this data is through natural language. Imagine J.A.R.V.I.S. from Iron Man – that’s really what we’re building. We’re looking to launch a very early version of our prototype, which is already in the testing phase, at the end of this year.