ValueError on w2v-bert-v2[train] #1

allandclive · 2024-04-03T10:03:48Z

On Google Colab:
ValueError: Label values must be <= vocab_size: 29

allandclive · 2024-04-03T10:06:46Z

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in <cell line: 1>()
----> 1 TRAINER.train() # resume_from_checkpoint=True # only if resume

9 frames
/usr/local/lib/python3.10/dist-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py in forward(self, input_features, attention_mask, output_attentions, output_hidden_states, return_dict, labels)
   1245         if labels is not None:
   1246             if labels.max() >= self.config.vocab_size:
-> 1247                 raise ValueError(f"Label values must be <= vocab_size: {self.config.vocab_size}")
   1248
   1249         # retrieve loss input_lengths from attention_mask

ValueError: Label values must be <= vocab_size: 29
```

phineas-pta · 2024-04-03T12:39:06Z

It seems like you tried to fine-tune for a Ugandan language.

In that case I think you should follow the official guide: https://huggingface.co/blog/fine-tune-w2v2-bert

My use case (Vietnamese) is a bit more mainstream, so my script is simplified a lot compared to the official guide.
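
For context, the check in the traceback fails when the largest label id is greater than or equal to `config.vocab_size`, so the error usually means the tokenizer vocabulary has more entries than the 29 the model was configured with. Here is a rough sketch of that condition in plain Python, assuming a character-level vocabulary like the one built in the official guide. The transcripts and the helper names (`build_char_vocab`, `encode`, `labels_fit`) are made up for illustration and are not part of my script or of transformers:

```python
def build_char_vocab(transcripts):
    """Map each distinct character to an integer id, then append special tokens."""
    # Use "|" as the word delimiter instead of a plain space (common CTC convention).
    text = "".join(transcripts).replace(" ", "|")
    vocab = {ch: idx for idx, ch in enumerate(sorted(set(text)))}
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn a transcript into label ids, falling back to [UNK] for unseen characters."""
    return [vocab.get(ch, vocab["[UNK]"]) for ch in text.replace(" ", "|")]


def labels_fit(label_ids, vocab_size):
    """Mirror the condition from the traceback: every id must be < vocab_size."""
    return max(label_ids) < vocab_size


# Hypothetical transcripts, only to exercise the helpers.
transcripts = ["oli otya", "webale nnyo"]
vocab = build_char_vocab(transcripts)
labels = encode(transcripts[1], vocab)

print("vocab size:", len(vocab))
print("fits own vocab:", labels_fit(labels, len(vocab)))
print("id 29 fits vocab_size=29:", labels_fit([29], 29))
```

If the check fails for your own tokenizer, the model head is probably smaller than the vocabulary. As far as I remember, the official guide avoids this by passing `vocab_size=len(processor.tokenizer)` when loading the model with `Wav2Vec2BertForCTC.from_pretrained`.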
