When last we checked in on the state of realistic-sounding, AI-generated human voices, the technology was getting better but still had a way to go before it was going to fool most people. But sitting here today, having just listened to this fake Joe Rogan, I feel very confident saying that my god, we’ve reached holy-shit-I-would-totally-fall-for-that territory.
I’m sure somewhere there’s a diehard Joe Rogan fan screaming at me through his monitor, saying “Come on, man!” But no. You come on, man! It might not be 100 percent perfect, but as far as the general population is going to be concerned, goddammit, that’s Joe Rogan talking.
Here’s the voice of actual Joe by way of a random episode of his podcast for comparison.
Seriously, don’t even try to tell me that the two aren’t scarily close, at least not until you try this real vs. fake quiz, creatively named Faux Rogan. Somehow I managed to pass it with 5 out of 8 correct, but it was far from easy. Honestly, I’m not sure I would get a similar result if I took it again after some time had passed. They really do sound damn near identical.
So where did this come from?
It’s the work of a company called Dessa and the RealTalk deep learning model its engineers have developed. That link will take you to a fairly technical explanation of how they did all of this, if you’re interested.
Importantly, the company seems to be aware of what they’ve done here and the implications of it, so much so that they have no plans to release any of their actual research or datasets to the public. That’s nice, even though it’s surely only a matter of time before some other company does.