AI experts say leaked audio is most likely real

April 14, 2025

Tobago Correspondent

Two AI analysts say a leaked audio clip, allegedly of a conversation between THA Chief Secretary Farley Augustine and Opposition Leader Kamla Persad-Bissessar, is most likely real.

After the clip was posted on social media last weekend, Augustine denounced it as “foolishness” and an attempt to link his Tobago People’s Party with the United National Congress, while Persad-Bissessar described the audio as “fake news.”

However, according to one expert, the technology tells a different story.

“According to its synthesis model . . . it’s a two per cent possibility (it’s AI-generated),” said Steven Williams, a Barbados-based AI applications consultant with more than 30 years of experience in technology and a background in cybersecurity, data privacy, and digital media.

Williams analysed the 577-second audio file using three separate tools, including ElevenLabs, which he described as “the leading speech synthesis model.”

“It is very unlikely that this audio is generated either by them or the audio was manipulated in any way,” he explained. “I found no tool that gave me anything above two to five per cent that it is AI-generated.”

His report also highlighted the natural sound and rhythm in the recording, features that are difficult for AI to replicate.

“The audio was clean and consistent, in terms of things such as natural sound or ambience, ambient noise. Noise in the natural world is random. A bird chirping, a car passing by . . . AI could put certain elements in, but it would have a certain consistency to it,” Williams said.

He also noted that “there was no abrupt clipping,” which is a red flag that could suggest audio splicing or manipulation.

When evaluating voice patterns, Williams said he looks for emotion, variation, and rhythm, all of which appeared in the clip.

“Voice dynamics vary in loudness and inflexion, so we find that AI hardly has the ability to do this and that and be expressive with sound and go up and down,” he explained.

“There was no existence of a pretence or an artificial instance of that. The tone was natural, right, and the human emotions conveyed right was that of a human being.”

He added: “Pauses and speech time appeared natural and responsive, not overly polished or robotic.”

Williams said AI systems struggle with Caribbean dialects, especially Trinidadian and Tobagonian tones.

“Most AI tools are not trained on the Caribbean tonality and accenting. The likelihood that you would have some person train AI to obfuscate a single conversation is relatively low,” he said.

“You have to train in a completely new language.”

In the leaked recording, a male voice is heard discussing funds allegedly sent for two elections, a proposed political alliance, and what appeared to be a plan to undermine contractors in Tobago. While no speaker has been officially confirmed, the contents of the alleged recording have triggered debate and confusion in Tobago.

Williams was careful to note that his results were not a forensic ruling.

“Even AI, there’s no one or zero in terms of either yes or no. We talk about percentages,” he said.

However, based on the tests he conducted, he concluded that the natural timing, tone and speech pattern in the clip point away from AI involvement.

Travis Sookoo, a cybersecurity advisor and machine learning researcher, had a more cautious but insightful perspective on how to detect AI-generated voices, focusing on patterns the public can learn to recognise.

“There’s a sort of perfection that comes with it, even in speech,” he said.

“Even with dialect, you know, you will hear some sort of perfection.”

“Real speech has flaws. If the voice sounds too smooth, too perfect, that’s when you should be suspicious.”

Sookoo said with most AI tools, the rhythm and delivery might be too polished.

“If we look at how AI writes and how it says things back, it is usually in a perfectionist base,” he said.

Sookoo added that attention should be paid to tone, movement, dialect, and small details.

“There may be something that will catch your ear if you are listening to it, the intonation, how they use the accent, the dialect,” he said.

Both analysts called for more attention to be paid to digital literacy.

So far, no one has confirmed or denied who is speaking in the clip, but with no signs of voice merging, no audio manipulation detected, and clear indicators of natural human speech, the tools seem to point in one direction.

“This is either a real conversation, or it’s the most advanced fake I’ve ever seen,” Williams said.

