Predicting wrong speaker many times #27

hudsantos · 2017-04-19T08:08:25Z

Hi Yuxin Wu,

I've cloned your nice solution for some testing, but after training, it is not predicting as expected.
I am not using the GUI, only ./speaker-recognition.py.
Training completes and every prediction attempt returns a result, but the result often points to a person who is not the one speaking.
Here is an explanation of how I am handling it:

I've created a training/ directory with the following content:

training/
├── fegens
│   ├── Fegens2.wav
│   └── Fegens.wav
├── hudson
│   ├── hudson2.wav
│   ├── hudson3.wav
│   └── Hudson.wav
├── jan
│   └── Jan.wav
├── paulo
│   └── Paulo.wav
└── pedreau
    ├── Pedreau2.wav
    ├── pedreau3.wav
    └── Pedreau.wav

5 directories, 10 files

Then I ran the enroll task as you've documented:

$ ./speaker-recognition.py -t enroll -i "./training/*" -m model.out

Then I've tried predictions this way:

$ ./speaker-recognition.py -t predict -i "already_trained_file.wav" -m model.out

If, and only if, the input file is (a) exactly one of the training files or (b) a re-recording of one with the same content and the same or a similar duration, then it predicts very well, as expected. Even if I play those same recordings aloud, record them again with a smartphone, and predict on the new recording, as if those people were talking into a microphone, the software can say who is talking! It's amazing! Very nice!

But if I play only five to seven seconds of a random voice sample, even from a voice the software should know, the error rate is too high. The predictions seem random.
Here is the duration of my voice files:

Duration 5.120 seconds
Duration 3.560 seconds
Duration 40.320 seconds
Duration 28.860 seconds
Duration 20.480 seconds
Duration 13.200 seconds
Duration 14.160 seconds
Duration 19.320 seconds
Duration 15.880 seconds
Duration 23.360 seconds

Note: I am working at home in quiet conditions, with no significant noise affecting the audio.

I've also tried duplicating those same voice samples under different filenames within each person's directory (just for testing, since I don't think it makes much difference for machine learning), keeping 5 directories but now with 70 files. I trained a new model, model_2, and predicted with it, and the error rate is still too high, as if the predictions were random.

I would like to let you know this is very nice software, with very nice documentation PDFs. Congratulations!
If you can, it would be great to get some help on what you think is going on. Am I using the right methods to train it? Is the directory structure fine? Are the quantities I've used fine? Do you have any other suggestions?

Thank you very much! Greetings from Brazil.


ppwwyyxx · 2017-04-20T08:43:56Z

Please put data of the same person into same directory.
And you'll need more data in your case.
Usually at least one minute of constant talking per person is needed, but the more the better.
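
A minimal sketch for checking that guideline against a training directory laid out as above: it sums the duration of the WAV files under each speaker directory and flags anyone below a threshold. This is a standalone Python 3 helper, not part of speaker-recognition.py; the function names and the 60-second default are assumptions made for illustration, and the standard `wave` module only reads uncompressed PCM WAV files.

```python
import wave
from collections import defaultdict
from pathlib import Path


def seconds_per_speaker(training_dir):
    """Return {speaker_dir_name: total seconds of WAV audio in that directory}."""
    totals = defaultdict(float)
    # Expects training_dir/<speaker>/<file>.wav, as in the tree in the first post.
    # Note: the "*.wav" pattern may be case-sensitive depending on the platform.
    for wav_path in sorted(Path(training_dir).glob("*/*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            duration = wav_file.getnframes() / wav_file.getframerate()
        totals[wav_path.parent.name] += duration
    return dict(totals)


def report(training_dir, minimum_seconds=60.0):
    """Print one line per speaker; minimum_seconds is an assumed threshold."""
    for speaker, seconds in sorted(seconds_per_speaker(training_dir).items()):
        status = "ok" if seconds >= minimum_seconds else "needs more audio"
        print(f"{speaker}: {seconds:.1f} s ({status})")
```

Calling `report("training")` prints the total per speaker. With the ten durations listed in the first post, the whole set adds up to about 184 seconds across five speakers, so some speakers necessarily fall under one minute.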

hudsantos · 2017-04-20T13:58:06Z

OK, thanks! About "put data of the same person into same directory": I've already done this.
But regarding "more data", I'll do that. I'll enroll with more constant talking.

Thanks!

hudsantos · 2017-04-24T07:11:25Z

predict/Fegens2.wav -> fegens___________________[OK!]
predict/Fegens.wav -> fegens____________________[OK!]
predict/fozy.wav -> tairon______________________[OK!]
predict/hudson2.wav -> pedreau__________________[ERROR!]
predict/hudson3.wav -> pedreau__________________[ERROR!]
predict/Hudson.wav -> pedreau___________________[ERROR!]
predict/Jan.wav -> jan__________________________[OK!]
predict/mpl_Fegens2.wav -> tairon_______________[ERROR!]
predict/mpl_Fegens.wav -> tairon________________[ERROR!]
predict/mpl_hudson2.wav -> tairon_______________[ERROR!]
predict/mpl_hudson3.wav -> tairon_______________[ERROR!]
predict/mpl_Hudson.wav -> tairon________________[ERROR!]
predict/mpl_Jan.wav -> tairon___________________[ERROR!]
predict/mpl_Paulo.wav -> tairon_________________[ERROR!]
predict/mpl_Pedreau2.wav -> tairon______________[ERROR!]
predict/mpl_pedreau3.wav -> tairon______________[ERROR!]
predict/mpl_Pedreau.wav -> tairon_______________[ERROR!]
predict/Paulo.wav -> paulo______________________[OK!]
predict/Pedreau2.wav -> pedreau_________________[OK!]
predict/pedreau3.wav -> pedreau_________________[OK!]
predict/Pedreau.wav -> pedreau__________________[OK!]
predict/sim_eu_mesmo.wav -> hudson______________[OK!]
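
A minimal sketch for tallying a listing like the one above, split by whether the filename starts with `mpl_`. The regular expression assumes exactly the line format shown (label padded with underscores, then `[OK!]` or `[ERROR!]`); the function name and the grouping are illustrative choices, and the thread does not say what the `mpl_` prefix denotes.

```python
import re
from collections import Counter

# Matches lines such as: predict/Jan.wav -> jan__________________________[OK!]
RESULT_LINE = re.compile(
    r"^predict/(?P<file>\S+) -> (?P<label>[^_\s]+)_+\[(?P<status>OK|ERROR)!\]$"
)


def tally(lines):
    """Count OK/ERROR results, grouped by filename prefix ("mpl_" or "other")."""
    counts = Counter()
    for line in lines:
        match = RESULT_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a result line
        group = "mpl_" if match["file"].startswith("mpl_") else "other"
        counts[(group, match["status"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "predict/Jan.wav -> jan__________________________[OK!]",
        "predict/Hudson.wav -> pedreau___________________[ERROR!]",
        "predict/mpl_Jan.wav -> tairon___________________[ERROR!]",
    ]
    for (group, status), count in sorted(tally(sample).items()):
        print(group, status, count)
```

Counted by hand over the full listing above, 9 of the 22 predictions are marked OK. All 10 `mpl_` files are labeled `tairon` and marked ERROR, while 9 of the other 12 files are marked OK.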

hudsantos · 2017-04-24T07:17:21Z

I wrote the OK and ERROR labels myself, based on the results the prediction returned. I am very excited about this project! I have no doubt I'm going to find out what I am doing wrong, and I'll post it here. All ideas are welcome.

richardm47 · 2017-09-14T11:29:30Z

Hi @hudsantos , has the prediction improved for you ?

hudsantos · 2017-09-14T15:46:35Z

Nope, prediction has not improved in my case.
I am still following this project along with this thread to see if someone can discover why it is not working, because I couldn't.

skulai · 2017-10-29T07:45:55Z

@hudsantos I am trying to run the project, I followed the instructions in the docker file. All the dependencies are installed. Then I execute
$ docker run speaker-recognition
/usr/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
usage: speaker-recognition.py [-h] -t TASK -i INPUT -m MODEL
speaker-recognition.py: error: argument -t/--task is required

Could you please help me out? I cannot get this project to start.

soswow · 2019-06-30T02:16:06Z

@hudsantos some years have passed. I wonder if you figured anything out?

hudsantos closed this as completed Apr 21, 2017

hudsantos reopened this Apr 24, 2017
