Author Topic: DaveClone  (Read 13183 times)


Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
DaveClone
« on: May 07, 2023, 11:45:18 am »
I'm experimenting with various AI voice cloning tools.
Just tried ElevenLabs and it's, well, not me at all.
No surprise given the limit of 10MB of source learning material and the instant nature of it.
FAIL



https://www.eevblog.com/forum/chatgptai/daveclone/?action=dlattach;attach=1777136
« Last Edit: May 08, 2023, 12:45:40 am by EEVblog »
 

Online MK14

  • Super Contributor
  • ***
  • Posts: 4544
  • Country: gb
Re: DaveClone
« Reply #1 on: May 07, 2023, 12:42:58 pm »
My understanding (from memory, and I could be wrong) is that people have difficulty properly judging their own real voice from recordings, because, unlike when other people speak, you hear your own voice more directly, partly via vibrations conducted through your own head.

I.e. you are NOT (contrary to a person's own opinion) necessarily the best independent judge of how real it sounds.

To me, (although it does sound a fair bit off), it is a (somewhat vaguely) reasonable representation of your voice (via the supplied MP3 file).  But by no means, anywhere near perfect.

Also, I suspect opinions (perceptions) on how good it is (or not), will vary a fair bit, from person to person.

N.B. I agree, it is not especially brilliant (the MP3), and fairly easy to tell it is not you.

EDIT: Here is an article, which attempts to explain the differences, between hearing your own voice, while speaking and a recording.

https://www.wonderopolis.org/wonder/why-does-my-voice-sound-different-on-a-recording
« Last Edit: May 07, 2023, 01:11:58 pm by MK14 »
 

Offline RoGeorge

  • Super Contributor
  • ***
  • Posts: 6227
  • Country: ro
Re: DaveClone
« Reply #2 on: May 07, 2023, 01:17:25 pm »
Very few will notice the differences if you overlay that sample MP3 onto one of your videos.
Not a perfect imitation, but very close.  :-+
 
The following users thanked this post: MK14

Offline schmitt trigger

  • Super Contributor
  • ***
  • Posts: 2225
  • Country: mx
Re: DaveClone
« Reply #3 on: May 07, 2023, 01:24:24 pm »
The genie is out of the bottle.
It is only a matter of time before voice clones become indistinguishable, not only in pitch, tone and inflection, but also in the unique combination of mannerisms that everyone has.
 
The following users thanked this post: MK14

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9890
  • Country: us
Re: DaveClone
« Reply #4 on: May 07, 2023, 02:24:52 pm »
I'm experimenting with various AI voice cloning tools.
Just tried ElevenLabs and it's, well, not me at all.

From a distance (say California), it's close enough to call it 'Dave'.

Part of the reason that I accept it as 'Dave' is that I don't hear audio from videos on a frequent basis.  The subtle differences are probably lost on me.

It's  'Dave' for all practical purposes.

 
The following users thanked this post: langwadt, MK14

Online PA0PBZ

  • Super Contributor
  • ***
  • Posts: 5134
  • Country: nl
Re: DaveClone
« Reply #5 on: May 07, 2023, 03:22:05 pm »
It's missing a bit of variation in tone and speed but yeah, it's Dave.
Keyboard error: Press F1 to continue.
 
The following users thanked this post: MK14, PartialDischarge

Online ebastler

  • Super Contributor
  • ***
  • Posts: 6524
  • Country: de
Re: DaveClone
« Reply #6 on: May 07, 2023, 04:09:59 pm »
I would also say it's close enough. The difference from the "real deal" might stem from the text style as much as from the voice itself.

(The sample text is probably from a textbook or such, very matter-of-fact and not "no script, no fear, all opinion". ;) Could you let the AI voice read a short piece of transcript from a typical EEVblog or EEVblab video?)
 
The following users thanked this post: MK14

Online MK14

  • Super Contributor
  • ***
  • Posts: 4544
  • Country: gb
Re: DaveClone
« Reply #7 on: May 07, 2023, 04:17:17 pm »
I would also say it's close enough. The difference from the "real deal" might stem from the text style as much as from the voice itself.

(The sample text is probably from a textbook or such, very matter-of-fact and not "no script, no fear, all opinion". ;) Could you let the AI voice read a short piece of transcript from a typical EEVblog or EEVblab video?)

That's a very good point, and idea.  With interesting results, if carried out.
 

Online Kim Christensen

  • Super Contributor
  • ***
  • Posts: 1328
  • Country: ca
Re: DaveClone
« Reply #8 on: May 07, 2023, 07:05:33 pm »
It does sound a bit like Dave... needs some more phrases like "Flappin' in the breeze", "brilliant", "there you go", etc for more realism.  :D


 

Offline gamalot

  • Super Contributor
  • ***
  • Posts: 1306
  • Country: au
  • Correct my English
    • Youtube
Re: DaveClone
« Reply #9 on: May 07, 2023, 07:52:01 pm »
For me, a non-native English speaker, the voice is very close to Dave's videos, but the accent and tone of the speech are a little different; it sounds too formal or serious.

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14506
  • Country: fr
Re: DaveClone
« Reply #10 on: May 07, 2023, 08:30:00 pm »
It's a lot more lifeless than the real deal, but the voice tone sure sounds very similar.
 

Offline Zero999

  • Super Contributor
  • ***
  • Posts: 19544
  • Country: gb
  • 0999
Re: DaveClone
« Reply #11 on: May 07, 2023, 08:52:47 pm »
I say it doesn't sound like Dave at all. The accent is much weaker. It sounds more like standard British English, than Dave's Australian accent.
 
The following users thanked this post: EEVblog, thm_w

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1217
  • Country: pl
Re: DaveClone
« Reply #12 on: May 07, 2023, 09:27:52 pm »
I can’t tell the difference in the voice itself. But the way of speaking is wrong. Hearing this, I imagine Dave being coerced into giving a lecture to an empty room, while he himself is falling asleep. Give the real Dave a white, empty sheet of paper to describe and he will sound more enthusiastic.

I am also used to hearing Dave at 1.5×. This sounds sooo sloooooow at the default speed. ;)
« Last Edit: May 07, 2023, 09:34:31 pm by golden_labels »
People imagine AI as T1000. What we got so far is glorified T9.
 
The following users thanked this post: MK14

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
Re: DaveClone
« Reply #13 on: May 08, 2023, 12:47:10 am »
My understanding (from memory, and I could be wrong) is that people have difficulty properly judging their own real voice from recordings, because, unlike when other people speak, you hear your own voice more directly, partly via vibrations conducted through your own head.
I.e. you are NOT (contrary to a person's own opinion) necessarily the best independent judge of how real it sounds.

I have literally thousands of hours of training hearing my own voice through speakers exactly as you and everyone else has. Remember, when I edit my videos I'm listening through speakers, I'm not hearing my voice internally.
 
The following users thanked this post: MK14

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
Re: DaveClone
« Reply #14 on: May 08, 2023, 12:47:52 am »
I say it doesn't sound like Dave at all. The accent is much weaker. It sounds more like standard British English, than Dave's Australian accent.

That's exactly how I hear it; it's clearly British.
 
The following users thanked this post: MK14

Online Kim Christensen

  • Super Contributor
  • ***
  • Posts: 1328
  • Country: ca
Re: DaveClone
« Reply #15 on: May 08, 2023, 12:59:20 am »
That's exactly how I hear it; it's clearly British.

To us poorly educated North Americans: Brit, Aussie, NZ, y'all sound the same!  :-DD
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
Re: DaveClone
« Reply #16 on: May 08, 2023, 01:03:41 am »
Ok, try this:
 

Online Kim Christensen

  • Super Contributor
  • ***
  • Posts: 1328
  • Country: ca
Re: DaveClone
« Reply #17 on: May 08, 2023, 02:10:26 am »
There is definitely a difference, but it's still an impressive imitation. Your real voice is more animated and dynamic and the accent is a bit different, but I'm sure some people could be fooled by the AI voice.
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
Re: DaveClone
« Reply #18 on: May 08, 2023, 02:51:29 am »
There is definitely a difference, but it's still an impressive imitation. Your real voice is more animated and dynamic and the accent is a bit different, but I'm sure some people could be fooled by the AI voice.

Maybe some, but it's so not an Aussie accent; it's not even close in that regard.
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
Re: DaveClone
« Reply #19 on: May 08, 2023, 02:54:36 am »
I wonder if there's any public figure or private person whose sense of identity can survive the coming wave of too-cheap-to-meter targeted advertising and entertainment.

No person with anything more than a few minutes of public audio of their voice out there is safe.
Wait until I try the higher end ones that build word lists and take days/weeks to process and build a voice profile, I'm betting it'll be chalk and cheese against this simplistic one.
This one literally took zero processing time. I uploaded the source files and that voice you heard was generated a minute later.
 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1217
  • Country: pl
Re: DaveClone
« Reply #20 on: May 08, 2023, 02:56:23 am »
Dave: in this video having to take a breath makes it sound like a human.

karpouzi9: for the particular situation you mentioned there is a working, well tested and widely deployed solution. Created over 40 years ago. It’s called a public-key signature.

In the general case, I see it from a different perspective. Finally the majority can no longer turn a blind eye to problems that have been present for a long time, but were either understood by, or affecting, only minorities or people deemed not worth caring about.
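
As a rough illustration of the public-key signature idea mentioned above, here is a minimal sketch in Python using only the standard library. It implements a Lamport one-time signature (a hash-based scheme from 1979). That choice is an assumption made so the example is self-contained; it is not necessarily the scheme the post had in mind, and real deployments would normally use RSA, ECDSA or Ed25519 from a vetted library. The function names (generate_keypair, sign, verify) are hypothetical, and each key pair here must only ever sign one message.

```python
import hashlib
import secrets

HASH_BITS = 256  # SHA-256 digest length in bits


def _h(data: bytes) -> bytes:
    """SHA-256 of data."""
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Return (private_key, public_key). Use each pair for ONE message only."""
    # Private key: two random 32-byte secrets per digest bit (one for 0, one for 1).
    private_key = [
        (secrets.token_bytes(32), secrets.token_bytes(32))
        for _ in range(HASH_BITS)
    ]
    # Public key: the hashes of those secrets, safe to publish.
    public_key = [(_h(zero), _h(one)) for zero, one in private_key]
    return private_key, public_key


def _digest_bits(message: bytes):
    """Bits of SHA-256(message), most significant bit of each byte first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(HASH_BITS)]


def sign(private_key, message: bytes):
    """Reveal, for every digest bit, the secret matching that bit's value."""
    bits = _digest_bits(message)
    return [private_key[i][bits[i]] for i in range(HASH_BITS)]


def verify(public_key, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the published value."""
    if len(signature) != HASH_BITS:
        return False
    bits = _digest_bits(message)
    return all(
        _h(signature[i]) == public_key[i][bits[i]] for i in range(HASH_BITS)
    )


if __name__ == "__main__":
    priv, pub = generate_keypair()
    statement = b"This recording really is from me."
    sig = sign(priv, statement)
    print("genuine statement verifies:", verify(pub, statement, sig))
    print("altered statement verifies:", verify(pub, b"Send money now.", sig))
```

The point of the sketch: anyone holding the published public key can check that a statement came from the key holder, no matter how convincing a cloned voice sounds, while a forger without the private key cannot produce a valid signature for a different message.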
 

Offline Circlotron

  • Super Contributor
  • ***
  • Posts: 3184
  • Country: au
Re: DaveClone
« Reply #21 on: May 08, 2023, 03:47:31 am »
No person with anything more than a few minutes of public audio of their voice out there is safe.
Pity Bob Hawke isn't still Australian Prime Minister then. Comedians and others had a great time with his voice back in the day.

https://youtu.be/xz9BmAXW_Gs
 

Offline David Aurora

  • Frequent Contributor
  • **
  • Posts: 422
  • Country: au
Re: DaveClone
« Reply #22 on: May 08, 2023, 03:58:10 am »
Yeah, nah. I'll give it points for timbre: for a split second I was impressed on a word or two (an FFT analysis of your voice vs the fake voice could be pretty interesting), but the accent difference was hilarious.

Still, this time next year it'll probably be perfect
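
The FFT comparison suggested above could be sketched roughly as follows, in plain Python. Two synthetic harmonic tones stand in for the real and cloned recordings (an assumption: no actual audio is loaded here), and the helper names and the cosine-similarity measure are illustrative choices, not an established voice-comparison method.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(samples):
    """Magnitudes of bins 0 .. N/2 - 1 (the non-redundant half for real input)."""
    return [abs(x) for x in fft(samples)[: len(samples) // 2]]


def cosine_similarity(a, b):
    """1.0 for identical spectral shapes, lower as the shapes diverge."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def harmonic_tone(fundamental_hz, harmonic_weights, sample_rate=1024, n=1024):
    """Synthetic stand-in for a voice: a weighted sum of harmonics."""
    return [
        sum(
            w * math.sin(2 * math.pi * fundamental_hz * (h + 1) * t / sample_rate)
            for h, w in enumerate(harmonic_weights)
        )
        for t in range(n)
    ]


if __name__ == "__main__":
    # Same pitch, different harmonic balance: a crude model of "similar
    # pitch, different timbre". The weights are made-up values.
    real = magnitude_spectrum(harmonic_tone(100, [1.0, 0.5, 0.25]))
    clone = magnitude_spectrum(harmonic_tone(100, [1.0, 0.1, 0.6]))
    print("spectral similarity:", cosine_similarity(real, clone))
```

With real recordings you would read the samples from the audio files, window them, and average the spectra over many short frames before comparing; a single long-term spectrum like this says something about timbre but nothing about accent or delivery.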
 

Offline EEVblog (Topic starter)

  • Administrator
  • *****
  • Posts: 37764
  • Country: au
    • EEVblog
Re: DaveClone
« Reply #23 on: May 08, 2023, 04:07:29 am »
Pity Bob Hawke isn't still Australian Prime Minister then. Comedians and others had a great time with his voice back in the day.

Totally iconic!
 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1217
  • Country: pl
Re: DaveClone
« Reply #24 on: May 08, 2023, 05:07:45 am »
It’s worth remembering that a person’s voice should not be considered secret from a security perspective. I am mentioning this because the general population believes information not offered to a wider public audience is secret. Just like other similar instances, one’s voice is both processed by many parties and easily obtainable in active attacks (a phone call, for example).

Regarding fakes, a recommended movie: “The Congress” (2013).
« Last Edit: May 08, 2023, 05:14:19 am by golden_labels »
 

