
AI Vishing

Leveraging voice-cloning tech in social engineering

We are not prepared

Having used AI voice cloning during social engineering engagements and CTF labs this year, it quickly became clear to me that most support desks, and most organizations overall, are not prepared to deal with this threat. To demonstrate the ease of these attacks, consider the following attack setup:

1) Obtain the impersonation victim's phone number. (This can be done in a variety of ways, from simply looking on LinkedIn to phishing.)

2) Obtain a voice sample of the victim. (This is surprisingly easy in practice and can also be done in a variety of ways, for example by cutting it from a podcast or investor call, or simply dialing the person and ripping it from their voicemail greeting.)

3) Use ElevenLabs to clone the voice. (This breaks their ToS; don't do this unless given permission to do so.)

4) Create a voicemail that requests a password reset or a callback to a spoofed number. (The ElevenLabs sound effect generator can be used along with Audacity to make the call extremely realistic, adding background noise, distortion, and distractions the voicemail can replicate, like a bus or airplane going by.)

5) Use SpoofTel or SpoofCard to spoof the impersonated victim's number. (I'm sure there are *more operationally secure* ways to do this, but for testing it works well.) You can also use the AI voice to record a voicemail greeting for a callback number created through something like Google Voice, making the attack seem even more real. (Number spoofing is not legal; do not do this unless given permission to do so.)

6) Send the AI voicemail and gain a foothold in the network through a password reset or another payload. (Look at me, I am the CEO now!)

In this way you are calling from the CEO's number, using the CEO's voice, and requesting a callback to a number whose voicemail also uses the CEO's voice. And this is all done with easily available tools.

We are almost at the point where text can be translated to speech through an AI-cloned voice in real time and in a realistic fashion. At that point you wouldn't need to rely on asynchronous communications at all, vastly increasing the effectiveness of the technique. Furthermore, I'm sure some readers have seen the AI video scams as well; those are also rapidly approaching the point where a stressed and overloaded help desk employee would very likely be unable to tell that a Teams video message was AI-generated.

One other interesting and more sophisticated way this AI vishing attack could be conducted is by leveraging a bot that is synced to the cloned voice and uses an LLM to communicate, primed with the goal of obtaining the payload (password reset, callback, etc.). The conversational nature of the call, with realistic back-and-forth and pushback driven by social engineering strategy, puts the victim even more at ease. This could be further enhanced by training the LLM on a victim's writings (for example, emails, social media, or books), allowing the bot to impersonate their speaking patterns as well. This would be a very sophisticated attack requiring large amounts of resources, but it demonstrates the larger possibilities of AI social engineering.

What does this all mean? What is the action item from an organization's perspective? Checklists and verification procedures must be followed EVERY time. Does your organization have written-down procedures, and are they always followed?
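To make the "verify every time" rule concrete, here is a minimal sketch of what a help desk verification step might look like if encoded as policy logic. The directory data and function names here are hypothetical, not from any real system; the point is that caller ID and voice are treated as untrusted inputs, and the only accepted path is a callback to a number already on record.

```python
# Hypothetical sketch of an out-of-band verification rule for a help desk.
# Both caller ID and voice are treated as untrusted: caller ID can be
# spoofed and voices can be cloned, so neither ever satisfies verification.

# employee_id -> phone number on record (the ONLY number we ever call back)
HR_DIRECTORY = {
    "emp-1001": "+1-555-0100",
}

def handle_reset_request(employee_id: str, caller_id: str) -> str:
    """Decide how to handle an inbound password-reset request."""
    on_record = HR_DIRECTORY.get(employee_id)
    if on_record is None:
        return "deny: unknown employee"
    # Even if caller_id matches the number on record, it proves nothing --
    # spoofing the on-record number is exactly the attack described above.
    # The agent must hang up and dial the directory number themselves.
    return f"callback required: hang up and dial {on_record}"
```

The design point is that there is no branch in which a matching caller ID or a familiar-sounding voice short-circuits the callback; removing that shortcut is what defeats the spoofed-number, cloned-voice attack chain described above.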

All forms of voice-based security should also be deemed insecure and obsolete; the ease of cloning voices and the availability of samples make the risk of bypass very high. Day by day it becomes harder to differentiate the artificially generated from the real, and this is not a trend that will change anytime soon. Social engineering is already a common real-world attack vector that results in breaches, and I predict it will only increase in effectiveness due to AI tooling.

(Note: don't break the law)

All hail the AI overlords
