The (New) Social Dilemma
Back in 2020, I watched The Social Dilemma on Netflix and it shook me. Not because anything in it was a surprise, but because it articulated so effectively many of the concerns I’ve had growing up in the first generation with access to social media from a young age.
There was one quote in particular from the show that stood out to me, and continues to haunt me on an ongoing basis:
“With technology, you don’t have to overwhelm people’s strengths. You just have to overwhelm their weaknesses. This is overpowering human nature. And this is checkmate on humanity.” - Tristan Harris (Former design ethicist at Google)
In the show, the quote is referring to the addictive properties intentionally built into social media platforms, but I think it applies much more broadly, and especially rings true when it comes to generative AI.
The last couple of years have seen incredible advances in the quality, application, and adoption of generative AI. While this has unlocked amazing new capabilities for humanity, it comes with a much darker new reality as well.
I believe that rapid, ongoing improvements in generative AI mean we can no longer rely on our own senses to discern what is real from what is fake online.
Specifically, going back to Tristan’s quote, I believe that AI has now surpassed human weakness, and is quickly approaching human strength.
The result is that we have entered a new era of digital communication. An era where we can no longer trust a familiar voice. An era where we can no longer believe what we are seeing or hearing in real time.
A level of doubt and uncertainty has been introduced into all digital communication and media. Things we once took for granted as real can now be questioned. Anyone can claim something is AI-generated, and people have to legitimately consider that this might be the case.
This is a dramatic shift with far-reaching implications across society.
So how do we solve this?
I believe it starts with helping people trust what they hear again.
Generative audio is progressing the fastest, and humans have never been used to questioning a familiar voice.
It used to be that when we heard someone’s voice over the phone or on a call, we could confidently conclude that it was that person. No longer.
Companies like ElevenLabs, Typecast, Speechify, HeyGen, Respeecher, and WellSaid have all built incredible platforms that enable the cheap, quick recreation of voices at a level of quality never before imagined. These platforms have gotten so good that an everyday person cannot reliably distinguish AI-generated speech from real speech when they are listening for it, let alone when they aren’t.
Beyond this, we have never had to think about protecting our own voices, or even consider where they might appear. From social media to YouTube to voicemail inbox messages, our voices are extremely accessible. In fact, a 2023 McAfee report found that 53% of adults share their voice online at least once a week, with 49% doing so over 10 times per week. When only 3 seconds of audio is needed to create a voice clone, anyone’s likeness is up for grabs.
The end result is that generative audio presents the highest risk of misuse in the near term. We are already seeing increasingly frequent headlines about social engineering and phishing attacks that used deepfake voices to target both individuals and businesses.
While social engineering and voice phishing (vishing) have always existed, attackers have now been handed powerful tools to launch these attacks more effectively, more cheaply, and at scale.
Let’s revisit the first half of the quote I referenced before:
“With technology, you don’t have to overwhelm people’s strengths. You just have to overwhelm their weaknesses.” - Tristan Harris (Former design ethicist at Google)
For businesses, and especially remote or distributed businesses, this dramatically increases the need for call security across voice and video communication channels. These are two channels that historically haven’t required as much security, because humans could be reliably counted on to identify potential imposters.
The reality is that this has changed, and the majority of businesses outside of banking and financial services have not taken meaningful action to secure their voice calls, and certainly not their video calling platforms. Even companies in banking and financial services are quickly realizing that these communication channels are often woefully underprotected compared to the rest of their organizations.
This is where we aim to help at DeepTrust.
We are building a call security platform to help protect employees across voice and video communication channels. DeepTrust doesn’t just identify deepfakes; it takes a layered approach that allows security teams to confidently protect their employees from all forms of social engineering, voice phishing, and fraud.
So what does this look like in practice?
It means that all VoIP platforms are protected with real-time audio source verification, deepfake detection, and context analysis that can identify high-risk conversations and provide just-in-time security training directly in the call.
If an employee joins a call using an unauthorized device, it’s flagged. If a deepfake is used on a call, it’s flagged. If someone attempts to manipulate an employee into taking a compromising action or giving up sensitive information on a call, both the employee and security team are alerted in real time and provided with recommended actions.
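To make the layered idea a bit more concrete, here is a minimal Python sketch of how three independent checks like these could feed a single list of alerts. It is purely illustrative and is not a description of how DeepTrust is actually built: the class names, field names, scores, thresholds, device allowlist, and recommended actions are all hypothetical, and who gets notified for the first two checks is an assumption (the only routing stated above is that manipulation attempts alert both the employee and the security team).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CallSignals:
    """Hypothetical per-call signals; every field here is an assumption."""
    device_id: str
    deepfake_score: float      # 0.0 (likely real) to 1.0 (likely synthetic)
    manipulation_score: float  # 0.0 (benign) to 1.0 (likely social engineering)


@dataclass
class Alert:
    reason: str
    notify: List[str]
    recommended_action: str


# Illustrative values only; real thresholds would need tuning.
AUTHORIZED_DEVICES = {"laptop-123", "phone-456"}
DEEPFAKE_THRESHOLD = 0.8
MANIPULATION_THRESHOLD = 0.7


def evaluate_call(signals: CallSignals) -> List[Alert]:
    """Run each layer independently and collect every alert that fires."""
    alerts: List[Alert] = []

    # Layer 1: audio source verification (is the device on the allowlist?)
    if signals.device_id not in AUTHORIZED_DEVICES:
        alerts.append(Alert(
            "unauthorized_device",
            ["security_team"],  # assumed routing
            "Verify the device before continuing the call.",
        ))

    # Layer 2: deepfake detection
    if signals.deepfake_score >= DEEPFAKE_THRESHOLD:
        alerts.append(Alert(
            "suspected_deepfake",
            ["security_team"],  # assumed routing
            "Confirm the caller's identity through a second channel.",
        ))

    # Layer 3: context analysis for manipulation attempts
    if signals.manipulation_score >= MANIPULATION_THRESHOLD:
        alerts.append(Alert(
            "suspected_manipulation",
            ["employee", "security_team"],  # both are alerted, per the text
            "Pause before sharing information or taking the requested action.",
        ))

    return alerts


if __name__ == "__main__":
    # Example: an unknown device with a high deepfake score.
    for alert in evaluate_call(CallSignals("unknown-device", 0.95, 0.1)):
        print(alert.reason, "->", ", ".join(alert.notify))
```

The point of keeping the checks independent is that one call can trip several layers at once, and a miss in one layer (say, a deepfake that goes undetected) can still be caught by another (the manipulation attempt itself).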
Our goal at DeepTrust is to ensure that businesses can confidently protect their employees from all types of voice- and video-based social engineering, and that attackers can’t use generative AI to exploit the human element of security.
I’ll close by noting that this is just the beginning at DeepTrust.
Call security is critical, and it is our primary focus at this time, but generative AI has introduced a problem much larger than call security alone.
Long term, we at DeepTrust are setting out to protect human authenticity by building the trust layer for the internet and enabling full provenance for digital content and communication.
If you’re interested in learning more, or partnering with us to turn this vision into a reality, we would love to connect!