Voice messages will be great once they come with accurate transcriptions
The people who love voice messages love voice messages. Vox.com’s Kaitlyn Tiffany and The Verge’s Ashley Carman aren’t those people. On this week’s Why’d You Push That Button, they discuss voice messages and why people send them. They also try to figure out why people like them in the first place.
Ashley talks to her best friend, Casey, about her habit of sending voice messages, and Kaitlyn interviews The Verge’s very own AI reporter James Vincent and his mom, Bridget, about their family texting dynamics. It’s heartwarming. Then, Ashley and Kaitlyn take all that they’ve learned to Djamel Agaoua, the CEO of messaging app Viber, to learn more about why people use voice messages and how they’ve become more popular around the world. Agaoua posits a few theories on why they’ve bloomed in popularity and previews how voice messages will evolve in the future.
Listen to the show below and follow along with Agaoua’s transcript. Of course, feel free to subscribe anywhere you typically get your podcasts. You know our usual places: Apple Podcasts, Pocket Casts, Spotify, Google Podcasts, and our RSS feed. Subscribe your friends, too! Steal their phones and just sign them up for the podcast; they’ll love it.
This interview has been lightly edited for clarity.
Ashley: So we are back with Djamel Agaoua. He is the CEO of Viber, a messaging app. Maybe just to start, could you tell us a little bit about Viber?
Djamel Agaoua: Yeah, Viber was founded eight years ago by a couple of guys in Israel. It grew very fast to a couple of hundred million users in three or four years; got acquired by Rakuten, a Japanese giant of e-commerce, for about a billion dollars in February 2014. So now we’re still part of the group Rakuten, and we have now about a billion users registered and on the platform. Our big countries, big regions are all the Eastern European countries, Middle East, Southeast Asia, those are our main countries, and we have a more challenging position in Europe, US, with a lot of, of course, immigrants from the countries where we are very strong. We’re one of the very global messaging platforms in the world.
Ashley: Are most of your users on Android or iOS?
About 75 percent Android, 25 percent iOS.
Ashley: Okay. Interesting, and can you say what percentage are in the US?
Five percent maybe, 5 to 7 percent.
Ashley: So today we are talking about voice messages, and right off the bat we’re curious, why do people use voice messages?
There are a whole bunch of reasons, but the first reason is probably because of the speed. Most people just talk faster than they type on a keyboard, and we all know that speed is a very important topic. The other reason that we can see in some specific countries is probably also the literacy skills, or maybe just the fear of getting caught with some spelling mistakes. People prefer to talk than to send a message with a lot of spelling mistakes, and also because it’s convenient for some specific situations or people, because it frees up the hands, so you can still send a message while you’re doing something else, like painting or typing on another, bigger keyboard. So there are a bunch of reasons, but we saw voice messages growing very fast the last two years.
Kaitlyn Tiffany: Do you have rough numbers on how often people are using voice messages on Viber, as opposed to texting?
Texting is still the vast majority of the messages that we send out on our platform. That’s obvious, but voice messages have been growing very fast, especially the last two years. This year, for example, we saw voice messages growing more than 50 percent. And it’s a bit surprising to us because it was pretty stable three, four years ago, and it started to grow this year dramatically. And there are probably a couple of reasons that explain it.
Ashley: What are those reasons?
First of all, there is clearly a boom of the home assistant that is starting to… first of all, the technology for voice recognition is much better now than it used to be three years ago. The boom of the home assistant also educates some people to talk to a device, talk to a machine, to get something done. To launch a search, for example. So it’s also something that we see, and we also saw that, for example, the youngsters are progressing faster on this kind of message than other populations. So that’s probably the reason we see a lot of youngsters that are used to the home assistant at home and that will prefer to use voice messages over anything else.
Ashley: When you say youngster, what is a youngster? Are we youngsters?
I’m not sure because I don’t see you very well, but I’m sure you are much, much, much younger than me. I’m a dinosaur.
So I mean, I have three young kids, and they’re four to nine, so they’re not on Viber, of course, or any other platform. When my three kids want to play some music at home, they don’t move from the couch. They just talk to their favorite assistant and call his or her, I don’t want to offend anyone, a little name to launch some music. They definitely get it like that, and we cannot imagine that our kids in five or 10 years from now will type on this small keyboard to send a message to someone because they will have spent their entire life talking to a machine to get the door closed, or to launch music, or to switch off the lights.
So we clearly see this kind of behavior. If we look at teenagers now, and I would say, to give you some numbers, from 15 to 25, this category of users is more excited by voice messages than other users.
Kaitlyn: Something that I think, based on no substantiated evidence and just my own recent experience watching When Harry Met Sally and being jealous of them for having voicemails, I kind of wonder if part of the reason young people are into voice memos is just this whole nostalgia, like analog trend, like, “Oh my God wouldn’t it be so cool and romantic to have little voice notes?” I don’t know if you have any insight into that. That’s just something that I suspect.
Yeah, what I think is that the major problem of text messages is the lack of emotions sometimes, and that’s the reason why, for example, for years, emojis or stickers or GIFs have been so, so popular on the Viber platform, for example. We have a huge amount of users that are sending stickers and GIFs on our platform, and voice remains the best way to send emotions to your friends or family. So the combined fact that the technology is now much better, and then when you talk to a machine the machine understands much better than in the past, and the fact that it’s still the best way to send your emotion away, it’s probably one of the reasons. I mean clearly. So maybe it’s not romantic enough, to use the same words that you used, but yes I think it’s better to say, “I love you,” than to just write the three words.
Ashley: I’m going to assume you don’t listen to the contents of people’s voice messages, but do you have any sense of situations in which a voice message would be used over a text? Are they actually more emotional messages being sent over voice versus the logistical?
I have no clue how to answer this question for a simple reason: at Viber, all of our messages, all our calls, are end-to-end encrypted, which means that we don’t have access to the content. So it’s a very big difference compared to all the blue apps that you can imagine. We don’t listen to the calls. We don’t look at the messages. We don’t listen to the voice messages. We just can’t do it. We don’t have the keys to read into that. So I have no idea about that.
What we saw is a higher demand for longer messages. So that’s a reason why, for example, we recently extended the maximum length of a voice message, and it was a surprise for me, the dinosaur, because I was thinking that a short message is always better than a long one. But we extended the maximum length to 15 minutes, and we saw immediately a huge usage of this function. So it seems that a lot of users like to tell stories or declare their flame in a very long and romantic way. But it’s all in pieces because we don’t have access to the content of those messages.
Ashley: So people are actually taking advantage of 15 minutes worth of a voice message?
I know it’s a surprise. It was a surprise to me, to be very honest, when my VP of product came to me and told me, “I want to do that.” I didn’t see the point. He had a hard time convincing me, and he showed me some numbers, showed me some focus groups that we did with some users, and we saw that yes, to make it more convenient, we even created a button, which is a lock button, because as you know, when you leave a voice message you have to push a button and keep it pushed while you’re recording. So we decided to lock this button to allow the visitor a button because 15 minutes can be really long to press on a screen. And yes, we have an increasing usage of long voice messages.
Kaitlyn: Oh my god, what are they doing, like reenacting entire episodes of sitcoms to each other or something? That’s one example of how the tech is changing, but how else are you evolving the voice product, and what are some of the technical challenges?
It’s a long journey, and we are on this journey and it’s a step-by-step approach. So as I said, the first step was to increase the length of the message. Second is make it more convenient to send. The next one, very soon, will be to combine it better with the call services. Our users are big callers. When we look at the numbers of our competitors, we have more calls than they have, so we want to just make it much simpler to leave a voice message if the call is not answered. The next generation, probably next year, will be to offer transcription because if the quickest way to send a message is to talk, the quickest way to receive a message is clearly to read. So to offer the ability to send a voice message and to transcribe it into text will be extremely useful for our users.
We probably will roll that out country by country because the technology of transcription is not the same quality in every country. So progressively, we do that. My ultimate dream, especially for someone like me, I’m French, and I have to fight with my broken English sometimes to make myself understood, will be to combine our existing translation feature that’s already live on Viber for all our platforms, and you can translate a message that you receive in French into English, or whatever, to combine transcription with translation, and then we will get to the point where somebody will be able to talk his native language to send a message, and his message will be received in another language, indexed in the natural language of the other person. Maybe next time I do this podcast I’ll just speak in French, and you will read the script in English and won’t have to fight with my broken English.
Ashley: Your English is great. We kind of talked about this, but why are specific parts of the world more interested in voice messages?
We see some differences. That’s true. I’m not sure I can explain them completely today, but one of the things that we think is true is the literacy level. I mean the average level of literacy of the population in general. It’s true that in some countries we are, for example, very strong in some countries, like Myanmar. And we see a lot of voice messages in this country, probably related to the fact that people are uncomfortable to write, in some cases, especially in the countries where Viber is used not only for personal and light conversation but also for very official types of conversation. Because we are very strong over there, so we can have discussions on Viber about your job or about things that you don’t want to be seen as weak in some spelling situation for example.
Some other reasons also are that some devices themselves are very difficult to handle for some languages because of the alphabet of those languages, and then sometimes the devices that the users have in those countries are not iOS, but can be cheaper Android phones, and those phones may have some weakness in terms of the way they present the keyboard in the local language, so it also explains some difficulties. So this is one of the reasons we see. The rest is really about early adopters or not, so the other trend we see is that when a user starts to use voice messages, it doesn’t stop. It does increase. It’s like you have a cliff, and once you get there, once you have taken this habit to talk, you really like it and you go on more and more and more.
It’s true it’s not really very popular in the US, but in a very, very specific situation, the wife of my very good friend is a super addict to the voice messages, and she’s doing voice recognition. She’s not doing voice messages. She’s using a voice to write the text, and she loves it, and she’s the only person from whom I receive texts that are 20 lines. She just talks to the phone like this, “boom,” and then sends a text after that.
So it’s coming, and I think it’s just something that was not in our habit, but will be pushed more and more and more by the fact that the technology’s out there and voice recognition is pretty good in some languages, not all, and it’s faster. So that’s the reason why we strongly believe it’s going to grow because it’s just faster to send a message this way.
Ashley: Do you send voice messages ever?
Not a lot to be very honest. But some, yeah. To be very frank, like I said, the first time my VP of product came to me, I was not convinced, and then I started to want to try to use it. And it’s actually very convenient and especially with the lock button, which allows you to do a lot of stuff. So I have started, yes, but I was not convinced at all at the beginning, but you know, I’m just a dinosaur.