Speech.jl? #3

jfsantos · 2014-10-27T13:23:09Z

I would like to start drafting a new package for speech signal processing, focused mainly on speech feature extraction (MFCCs, LPCs, fundamental frequency, etc.). @davidavdav has done a lot of work on MFCCs in MFCC.jl but, as of the last time we talked, it needed some updates. @davidavdav, would you mind chiming in with your comments and suggestions? Thanks!

davidavdav · 2014-10-27T14:03:00Z

MFCC is now in METADATA. Apart from the default parameter settings, which should mimic the HTK defaults (we could use some testing there), it has some other parameter sets (:rasta should mimic the default parameters of the rastamat package).

It also has various forms of feature normalization (mean/variance: znorm(), and short-time Gaussianization: warp()), derivatives (delta()) and shifted-delta-cepstra (SDCs, used in language recognition).
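To make the two simplest operations above concrete, here is a minimal sketch of mean/variance normalization and regression-based deltas. It is written in plain Python for illustration only and is not the MFCC.jl code: the function names mirror the ones mentioned above, but the signatures, the `width` parameter, and the edge handling (repeating the first and last frame) are assumptions.

```python
from statistics import mean, pstdev


def znorm(frames):
    """Scale each feature dimension to zero mean and unit variance over time.

    `frames` is a list of equal-length feature vectors, one per time frame.
    """
    columns = list(zip(*frames))  # one tuple per feature dimension
    mus = [mean(c) for c in columns]
    # A constant feature has zero deviation; fall back to 1.0 to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(frame, mus, sigmas)] for frame in frames]


def delta(frames, width=2):
    """Regression-based time derivative over +/- `width` frames.

    Frames beyond either end are replaced by the first or last frame (an assumption).
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for k in range(1, width + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out
```

On a feature that rises by one unit per frame, `delta` returns 1.0 away from the edges, which is a quick sanity check for any implementation of this formula.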

We could use some additional code to compute PLP (perceptual linear prediction) coefficients and to do RASTA processing. Others might be interested in LPC estimation, pitch extraction, etc. For recognition this is not too useful, but for (re)synthesis it may be.

I have higher-level code in https://github.com/davidavdav/Feacalc.jl.git which can read .wav files, has some trivial energy-based speech activity detection, and has save/load routines in HDF5 (for compatibility with non-Julia software).
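For readers unfamiliar with the term, energy-based speech activity detection can be sketched roughly as follows. This is a generic illustration in plain Python, not the Feacalc.jl implementation: the frame length, hop size, and the rule "keep frames within `margin_db` of the loudest frame" are all assumptions chosen for the example.

```python
import math


def frame_energies_db(samples, frame_len, hop):
    """Return the mean power of each frame in dB."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        # The small floor keeps log10 defined for all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def speech_activity(samples, frame_len=400, hop=160, margin_db=30.0):
    """Flag a frame as speech if its energy is within `margin_db` of the loudest frame."""
    energies = frame_energies_db(samples, frame_len, hop)
    if not energies:
        return []
    threshold = max(energies) - margin_db
    return [e > threshold for e in energies]
```

With 16 kHz audio the defaults correspond to 25 ms frames every 10 ms. A detector this simple will fail on noisy recordings, which is presumably why it is described as trivial.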

jfsantos · 2014-10-27T23:25:08Z

That's great! So since MFCC is in METADATA, maybe we should aim for a higher degree of specialization in speech-related packages rather than having a single mega-package.

davidavdav · 2018-01-19T14:25:01Z

I suppose it would be better if MFCC were moved into JuliaDSP, for tighter integration with DSP and longer-term maintenance. What do you think about this, and how would we proceed?

ssfrr · 2018-01-29T16:10:08Z

I think moving it into JuliaDSP seems pretty reasonable. MFCCs are pretty important (even outside of speech, e.g. in music processing), so it's nice to have that stuff in an org rather than a personal repo.

ssfrr · 2018-01-29T16:16:39Z

If other folks are on board, I think the process is roughly:

1. Transfer ownership of the repo to the org through the repo settings.
2. Update the repo URL in METADATA (GitHub will redirect the original URL appropriately, but it's nice to have the right one there).

Incidentally, JuliaAudio could also be a reasonable org for this to live in, though that family of packages is somewhat more opinionated w.r.t. samplerate-aware buffer and stream types, so the package might need some minor refactoring to interoperate nicely with them.

davidavdav · 2018-01-29T23:17:17Z

Could you add me to JuliaDSP then? I tried to transfer ownership, but that didn't work as I was not allowed.

ssfrr · 2018-01-30T01:24:57Z

I'm actually not a JuliaDSP member either, so I can't add you.

davidavdav · 2018-01-31T09:12:59Z

All right, MFCC.jl is now part of JuliaDSP.
