Neural Machine Translation
Models
| Model | Lang |
|---|---|
| | EN |
| | EN |
| | RU |
| | RU |
Datasets
-
  - Lang: EN
  - Rows: 2630
  - Preprocess:
    1. Select the `test` split.
    2. Rename column `en` to `source`.
    3. Rename column `fr` to `target`.
    4. Delete duplicates in dataset.
    5. Reset indexes.
-
  - Lang: EN
  - Rows: 700
  - Preprocess:
    1. Select the `test` split.
    2. Rename column `en` to `source`.
    3. Rename column `de` to `target`.
    4. Delete duplicates in dataset.
    5. Add prefix `Translate from English to German: ` for each `source` row.
    6. Reset indexes.
- shreevigneshs/iwslt-2023-en-ru-train-val-split-0.2
  - Lang: RU
  - Rows: 600
  - Preprocess:
    1. Select the `if_test` split.
    2. Drop columns `ru_annotated`, `styles`.
    3. Rename column `ru` to `source`.
    4. Rename column `en` to `target`.
    5. Reset indexes.
- nuvocare/Ted2020_en_es_fr_de_it_ca_pl_ru_nl
  - Lang: RU
  - Rows: 7210
  - Preprocess:
    1. Select the `test` split.
    2. Drop columns `de`, `en`, `fr`, `it`, `nl`, `pl`.
    3. Rename column `ru` to `source`.
    4. Rename column `es` to `target`.
    5. Delete empty rows in dataset.
    6. Delete duplicates in dataset.
    7. Reset indexes.
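
The preprocessing steps listed for the datasets follow one pattern, so they can be sketched as a single pandas helper. This is a minimal sketch, not the project's actual code: the function name `preprocess` and its parameters are hypothetical, and it assumes the chosen split has already been loaded into a `DataFrame` (for example with `datasets.load_dataset(..., split="test").to_pandas()`).

```python
import pandas as pd


def preprocess(
    df: pd.DataFrame,
    source_col: str,
    target_col: str,
    drop_cols: tuple[str, ...] = (),
    drop_empty: bool = False,
    dedupe: bool = True,
    prefix: str = "",
) -> pd.DataFrame:
    """Hypothetical helper that applies the listed steps to one selected split."""
    # Drop unused columns, then rename the language columns.
    df = df.drop(columns=list(drop_cols))
    df = df.rename(columns={source_col: "source", target_col: "target"})
    if drop_empty:
        # Treat missing values and blank strings as empty rows.
        for col in ("source", "target"):
            df = df[df[col].notna() & (df[col].astype(str).str.strip() != "")]
    if dedupe:
        df = df.drop_duplicates()
    if prefix:
        # Prepend the task prefix to every source row.
        df = df.assign(source=prefix + df["source"])
    # Reset indexes so rows are numbered 0..n-1 again.
    return df.reset_index(drop=True)


# Toy data for illustration only (not taken from any dataset above).
toy = pd.DataFrame(
    {"en": ["Hello", "Hello", " "], "de": ["Hallo", "Hallo", "Leer"]}
)
clean = preprocess(
    toy, "en", "de", drop_empty=True,
    prefix="Translate from English to German: ",
)
```

On the toy frame, the duplicate row and the row with a blank source are removed, leaving a single prefixed example. The flags let each dataset apply only the steps listed for it, for instance `dedupe=False` where no duplicate removal is listed.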
Supervised Fine-Tuning (SFT) Parameters
Note
Set the parameter `target_modules=["q_proj", "k_proj"]` for the `Helsinki-NLP/opus-mt-en-fr`, `Helsinki-NLP/opus-mt-ru-en`, and `Helsinki-NLP/opus-mt-ru-es` models.
Note
Set the parameters `target_modules=["q", "k", "v"]`, `rank=24`, `alpha=36` for the `t5-small` model as SFT parameters.
Note
Set the parameter `fine_tuning_steps=100` for the `Helsinki-NLP/opus-mt-ru-es` model as an SFT parameter.
Note
Set the parameters `fine_tuning_steps=60`, `rank=16`, `alpha=24` for the `Helsinki-NLP/opus-mt-en-fr` model as SFT parameters.
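
The per-model settings in the notes can be gathered into one lookup table, which makes it easier to see which model overrides which parameter. The sketch below is only an illustration: the names `SFT_OVERRIDES`, `sft_params`, and `HYPOTHETICAL_DEFAULTS` are invented here, and the default values are placeholders that do not come from this document. The parameter names `rank`, `alpha`, and `target_modules` resemble LoRA hyperparameters, but how they are passed to the training code is not stated here.

```python
# Per-model overrides, copied from the notes in this section.
SFT_OVERRIDES = {
    "Helsinki-NLP/opus-mt-en-fr": {
        "target_modules": ["q_proj", "k_proj"],
        "fine_tuning_steps": 60,
        "rank": 16,
        "alpha": 24,
    },
    "Helsinki-NLP/opus-mt-ru-en": {
        "target_modules": ["q_proj", "k_proj"],
    },
    "Helsinki-NLP/opus-mt-ru-es": {
        "target_modules": ["q_proj", "k_proj"],
        "fine_tuning_steps": 100,
    },
    "t5-small": {
        "target_modules": ["q", "k", "v"],
        "rank": 24,
        "alpha": 36,
    },
}

# Placeholder defaults: hypothetical values, NOT taken from this document.
HYPOTHETICAL_DEFAULTS = {"fine_tuning_steps": 50, "rank": 8, "alpha": 8}


def sft_params(model_name: str, defaults: dict) -> dict:
    """Return a copy of `defaults` updated with the overrides for `model_name`."""
    return {**defaults, **SFT_OVERRIDES.get(model_name, {})}


t5_params = sft_params("t5-small", HYPOTHETICAL_DEFAULTS)
```

Here `t5_params` takes `rank`, `alpha`, and `target_modules` from the note for `t5-small` and keeps the placeholder `fine_tuning_steps`, since no note overrides it for that model.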
Metrics
BLEU
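
To make the metric concrete, here is a simplified corpus-level BLEU in plain Python: clipped n-gram precisions up to 4-grams, combined by a geometric mean and multiplied by a brevity penalty. It is a sketch under stated assumptions (whitespace tokenization, one reference per hypothesis, no smoothing) and is not necessarily how the scores for these models were computed; an established implementation such as sacreBLEU would normally be preferred for reported results. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: list[str], references: list[str], max_n: int = 4) -> float:
    """Simplified BLEU: whitespace tokens, single reference, no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts, ref_counts = ngrams(hyp_tok, n), ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Penalize hypotheses that are shorter than their references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


reference = ["the cat sat on the mat"]
perfect = corpus_bleu(["the cat sat on the mat"], reference)  # exact match: 1.0
partial = corpus_bleu(["the cat sat on the mat today"], reference)
```

An exact match scores 1.0, while the hypothesis with one extra word scores lower because every n-gram precision drops below 1.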