Add ECAPA2 to VoxCeleb #3039
TParcollet left a comment
Hey! Thank you very much for this recipe! I won't be able to try it myself because we do not have the VoxCeleb data. Before we find someone to try it, could you please address the comments?
    num_workers: !ref <num_workers>

    # Functions
    use_tacotron2_mel_spec: True
Why use this? Is there a reason for not using standard Mel features?
There was a problem hiding this comment.
No, I just followed the same setting used in this recipe: https://github.com/speechbrain/speechbrain/blob/develop/recipes/VoxCeleb/SpeakerRec/hparams/train_ecapa_tdnn_mel_spec.yaml
But I can change it to standard Mels.
There was a problem hiding this comment.
Oh, this is new, interesting, then ok. Could you try with the standard ones just to see the end result maybe?
    @@ -0,0 +1,97 @@
    # ################################
    # Model: Speaker Verification Baseline for ECAPA2
    # Acknowledgment: The source code is derived from the Kiwano toolkit.
Add author name for tracking.
|-----------------|------------|------|-----|
| Xvector + PLDA | VoxCeleb 1,2 | 3.23% | https://www.dropbox.com/sh/ab1ma1lnmskedo8/AADsmgOLPdEjSF6wV3KyhNG1a?dl=0 |
| ECAPA-TDNN | VoxCeleb 1,2 | 0.80% | https://www.dropbox.com/sh/ab1ma1lnmskedo8/AADsmgOLPdEjSF6wV3KyhNG1a?dl=0 |
| ECAPA2 | VoxCeleb 1,2 | 0.60% | https://drive.google.com/drive/folders/1cpU5qpCVM30Ip8I85EPM33lsUYPa6S7q?usp=sharing |
@Adel-Moumen, can we please upload this to Dropbox?
    with torch.no_grad():
        feats = params["compute_features"](wavs)
        if (
            "use_tacotron2_mel_spec" in params
Yes, this is a bit confusing. See the question above: I'd prefer it if we could use standard Mel features. Is there a real difference?
    class SubCenterClassifier(nn.Module):
        """Sub-Center ArcFace Classifier.
Docstring isn't explicit enough. I don't know what this is.
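For context on what such a docstring could spell out: in a sub-center ArcFace classifier, each class owns several weight vectors ("sub-centers") instead of one, and the class score is the largest cosine similarity between the L2-normalized embedding and any of that class's sub-centers. An angular margin is then usually applied to the target-class score before the softmax. The snippet below is a minimal, dependency-free sketch of the scoring step only; it is not the PR's implementation, and the names `l2_normalize` and `subcenter_cosine_logits` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def subcenter_cosine_logits(embedding, centers):
    """Return one cosine score per class (hypothetical helper).

    centers[c][k] is the k-th sub-center (a weight vector) of class c.
    Each class score is the maximum cosine similarity over its sub-centers,
    so a sample only needs to be close to one of them.
    """
    emb = l2_normalize(embedding)
    logits = []
    for class_centers in centers:
        sims = [
            sum(e * w for e, w in zip(emb, l2_normalize(center)))
            for center in class_centers
        ]
        logits.append(max(sims))
    return logits


# Two classes with two sub-centers each, in a 2-D embedding space.
centers = [
    [[1.0, 0.0], [0.0, 1.0]],    # class 0
    [[-1.0, 0.0], [0.0, -1.0]],  # class 1
]
logits = subcenter_cosine_logits([0.0, 2.0], centers)
```

Here the embedding points along the second sub-center of class 0, so class 0 receives the highest score even though its first sub-center is orthogonal to the embedding. A docstring for the real module would additionally state the expected tensor shapes and the number of sub-centers per class.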
    class ECAPA2Res2NetConv1d(nn.Module):
        """Res2Net convolutional block for 1D features."""

    class ECAPA2TDNNBlock(nn.Module):
        """TDNN block for ECAPA2."""

    class ECAPA2DenseBlock(nn.Module):
        """Dense convolutional block for ECAPA2."""

    class ECAPA2AttentiveStatPoolingBlock(nn.Module):
        """Attentive Statistics Pooling for ECAPA2."""

    class JeffreysLoss(nn.Module):
        """Computes the Jeffreys Loss, a combination of Cross Entropy, Label Smoothing,
Can we get a unit test for this new loss please?
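As a rough idea of what such a test could check, the sketch below uses a pure-Python stand-in rather than the PR's `JeffreysLoss`, whose constructor and signature are not shown in this thread. It assumes the loss has the form cross entropy + a · KL(u‖p) + b · KL(p‖u), where u is the uniform distribution and p the softmax output; this form is an assumption based on the truncated docstring, and `jeffreys_style_loss`, `coeff_smooth`, and `coeff_conf` are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def jeffreys_style_loss(logits, target, coeff_smooth=0.0, coeff_conf=0.0):
    """Hypothetical stand-in: CE + a * KL(u || p) + b * KL(p || u).

    u is the uniform distribution and p = softmax(logits). KL(u || p) acts
    like label smoothing and KL(p || u) like a confidence penalty; with
    a == b their sum is a times the Jeffreys divergence between u and p.
    """
    probs = softmax(logits)
    uniform = 1.0 / len(probs)
    cross_entropy = -math.log(probs[target])
    kl_u_p = sum(uniform * math.log(uniform / p) for p in probs)
    kl_p_u = sum(p * math.log(p / uniform) for p in probs)
    return cross_entropy + coeff_smooth * kl_u_p + coeff_conf * kl_p_u


def test_reduces_to_cross_entropy():
    # With both coefficients at zero, only the cross-entropy term remains.
    logits = [2.0, 0.5, -1.0]
    expected = -math.log(softmax(logits)[0])
    assert math.isclose(jeffreys_style_loss(logits, 0), expected)


def test_regularizer_vanishes_for_uniform_prediction():
    # Equal logits give p == u, so both KL terms are zero.
    loss = jeffreys_style_loss([0.0, 0.0, 0.0], 1, 0.1, 0.1)
    assert math.isclose(loss, math.log(3.0))


def test_regularizer_is_non_negative():
    # KL divergences are non-negative, so regularization cannot lower the loss.
    logits = [3.0, -2.0, 0.0]
    plain = jeffreys_style_loss(logits, 0)
    regularized = jeffreys_style_loss(logits, 0, 0.1, 0.1)
    assert regularized >= plain
```

The same three properties (reduction to plain cross entropy, zero regularizer for a uniform prediction, non-negative regularizer) could be asserted against the actual module with tensors, adapting the call to whatever arguments it really takes.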
What does this PR do?
This PR implements the ECAPA2 model architecture and its corresponding training recipe for VoxCeleb.
Key Additions:
- speechbrain/lobes/models/ECAPA2.py: Implementation of the ECAPA2 architecture and SubCenterClassifier.
- speechbrain/nnet/losses.py: Added JeffreysLoss for embedding regularization.
- recipes/VoxCeleb/SpeakerRec/: train_ecapa2.yaml and verification_ecapa2.yaml.
- train_speaker_embeddings.py and speaker_verification_cosine.py, to support the new model and pipeline requirements.
Testing & Validation:
- tests/recipes/VoxCeleb.csv.
- pytest tests, to ensure existing functionality remains intact.
- pre-commit run -a, to verify strict code formatting and linting.
Performance:
Trained on VoxCeleb 1 + VoxCeleb 2:
Trained on VoxCeleb 2 only (tested without s-norm):
Fixes N/A
Breaking changes: None. Backward compatibility is maintained for existing VoxCeleb scripts.