This repository has been archived by the owner on Feb 6, 2024. It is now read-only.

Speech.jl? #3

Open
jfsantos opened this issue Oct 27, 2014 · 8 comments

Comments

@jfsantos
Member

I would like to start drafting a new package for speech signal processing, focused mainly on speech feature extraction (MFCCs, LPCs, fundamental frequency, etc.). @davidavdav has done a lot of work on MFCCs in MFCC.jl but, as of the last time we talked, it needed some updates. @davidavdav, would you mind chiming in with your comments and suggestions? Thanks!

@davidavdav

MFCC.jl is now in METADATA. Apart from default parameter settings that should mimic the HTK defaults (we could use some testing there), it has some other parameter sets (:rasta should mimic the default parameters of the rastamat package).

It also has various forms of feature normalization (mean/variance: znorm(), and short-time Gaussianization: warp()), derivatives (delta()), and shifted delta cepstra (SDCs, used in language recognition).
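As a rough illustration of what mean/variance normalization and delta features compute, here is a minimal pure-Python sketch. It is not MFCC.jl's implementation: the function names only echo znorm() and delta() from the comment above, and the signatures, the `width` parameter, the HTK-style regression formula, and the edge handling (repeating the first and last frame) are all assumptions made for this example.

```python
from statistics import fmean, pstdev


def znorm(frames):
    """Scale each feature dimension (column) to zero mean and unit variance."""
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in frames]


def delta(frames, width=2):
    """Regression-based time derivative over +/- `width` frames.

    Assumed formula: d[t] = sum_k k * (x[t+k] - x[t-k]) / (2 * sum_k k^2),
    with out-of-range frames replaced by the nearest edge frame.
    """
    n = len(frames)
    ndims = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, width + 1))
    result = []
    for t in range(n):
        row = []
        for d in range(ndims):
            acc = 0.0
            for k in range(1, width + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        result.append(row)
    return result


# Toy input: 5 frames, 2 feature dimensions, each a linear ramp over time.
frames = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, 9.0]]
normalized = znorm(frames)
derivatives = delta(frames)
```

For a linear ramp, the delta of an interior frame recovers the slope of each dimension, which makes a convenient sanity check for this kind of routine.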

We could use some additional code to compute PLP (perceptual linear prediction) coefficients and to do RASTA processing. Others might be interested in LPC estimation, pitch extraction, etc. For recognition these are not too useful, but for (re)synthesis they may be.

I have higher-level code in https://github.com/davidavdav/Feacalc.jl.git which can read .wav files, has some trivial energy-based speech activity detection, and provides save/load routines in HDF5 (for compatibility with non-Julia software).

@jfsantos
Member Author

That's great! So since MFCC is in METADATA, maybe we should aim for a higher degree of specialization in speech-related packages rather than having a single mega-package.

@davidavdav

I suppose it would be better if MFCC were moved into JuliaDSP, for tighter integration with DSP and longer-term maintenance. What do you guys think about this, and how would we proceed?

@ssfrr

ssfrr commented Jan 29, 2018

I think moving it into JuliaDSP seems pretty reasonable. MFCCs are pretty important (even outside of speech, e.g. music processing), so it's nice to have that stuff in an org rather than a personal repo.

@ssfrr

ssfrr commented Jan 29, 2018

If other folks are on board, I think the process is roughly:

  1. transfer ownership of the repo to the org through the repo settings
  2. update the repo URL in METADATA (GitHub will redirect the original URL appropriately, but it's nice to have the right one there)

Incidentally, JuliaAudio could also be a reasonable org for this to live in, though that family of packages is somewhat more opinionated w.r.t. samplerate-aware buffer and stream types, so the package might need some minor refactoring to interoperate nicely with them.

@davidavdav

Could you add me to JuliaDSP then? I tried to transfer ownership, but that didn't work because I was not allowed to.

@ssfrr

ssfrr commented Jan 30, 2018

I'm actually not a JuliaDSP member either, so I can't add you.

@davidavdav

All right, MFCC.jl is now part of JuliaDSP.

3 participants