Neural Machine Translation
Models
| Model | Lang |
|---|---|
| | EN |
| | EN |
| | RU |
| | RU |
Datasets
-
  - Lang: EN
  - Rows: 2630
  - Preprocess:
    - Rename column `en` to `source`.
    - Rename column `fr` to `target`.
    - Delete duplicates in dataset.
    - Reset indexes.
-
  - Lang: EN
  - Rows: 700
  - Preprocess:
    - Rename column `en` to `source`.
    - Rename column `de` to `target`.
    - Delete duplicates in dataset.
    - Add prefix `Translate from English to German:` for each `source` row.
    - Reset indexes.
- shreevigneshs/iwslt-2023-en-ru-train-val-split-0.2
  - Lang: RU
  - Rows: 600
  - Preprocess:
    - Drop columns `ru_annotated`, `styles`.
    - Rename column `ru` to `source`.
    - Rename column `en` to `target`.
    - Reset indexes.
- nuvocare/Ted2020_en_es_fr_de_it_ca_pl_ru_nl
  - Lang: RU
  - Rows: 7210
  - Preprocess:
    - Drop columns `de`, `en`, `fr`, `it`, `nl`, `pl`.
    - Rename column `ru` to `source`.
    - Rename column `es` to `target`.
    - Delete empty rows in dataset.
    - Delete duplicates in dataset.
    - Reset indexes.
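
The preprocessing steps listed for the datasets above can be sketched with pandas. This is a minimal illustration, not the project's actual code: the `preprocess` helper, its parameter names, the toy rows, and the trailing space after the prefix are all assumptions made for the example.

```python
from typing import Optional, Sequence

import pandas as pd


def preprocess(
    df: pd.DataFrame,
    source_col: str,
    target_col: str,
    drop_cols: Sequence[str] = (),
    prefix: Optional[str] = None,
    drop_empty: bool = False,
) -> pd.DataFrame:
    """Apply the listed steps in order and return a new DataFrame."""
    # Drop unused columns, then rename the language columns.
    df = df.drop(columns=list(drop_cols))
    df = df.rename(columns={source_col: "source", target_col: "target"})
    if drop_empty:
        # Assumption: "empty" means a missing or whitespace-only cell.
        is_blank = (
            df["source"].fillna("").str.strip().eq("")
            | df["target"].fillna("").str.strip().eq("")
        )
        df = df[~is_blank]
    df = df.drop_duplicates()
    if prefix is not None:
        # Prepend the task prefix to every source row.
        df = df.assign(source=prefix + df["source"])
    return df.reset_index(drop=True)


# Toy rows for illustration only; they are not taken from the real datasets.
raw = pd.DataFrame(
    {
        "en": ["Hello.", "Thank you.", "Hello."],
        "de": ["Hallo.", "Danke.", "Hallo."],
    }
)
clean = preprocess(
    raw,
    source_col="en",
    target_col="de",
    prefix="Translate from English to German: ",  # trailing space is assumed
)
print(clean)
```

For the RU to ES dataset the same helper would be called with `drop_cols=["de", "en", "fr", "it", "nl", "pl"]` and `drop_empty=True`, and without a prefix.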
Supervised Fine-Tuning (SFT) Parameters
Note
Set `target_modules=["k_proj", "v_proj", "q_proj", "out_proj"]` for the Helsinki-NLP/opus-mt-en-fr, Helsinki-NLP/opus-mt-ru-en, and Helsinki-NLP/opus-mt-ru-es models.
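
The `target_modules` name matches the argument of the same name on `LoraConfig` in the PEFT library, so the sketch below assumes LoRA adapters are being trained; this document does not state that explicitly. The values for `r`, `lora_alpha`, and `lora_dropout` are placeholders chosen for illustration, not the project's settings.

```python
from peft import LoraConfig, TaskType

# Assumption: fine-tuning uses PEFT LoRA adapters. Only target_modules comes
# from the note above; r, lora_alpha and lora_dropout are placeholder values.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    # Attention projection layers as named in the Marian (opus-mt) models.
    target_modules=["k_proj", "v_proj", "q_proj", "out_proj"],
)
```

Other architectures name their attention projections differently, which is presumably why the note is limited to the Helsinki-NLP models.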
Note
Set `learning_rate=1e-4` as an SFT parameter for the Helsinki-NLP/opus-mt-ru-es model.
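
One way to keep these model-specific settings in one place is a small override table merged into the shared SFT parameters. The sketch below is hypothetical: `MODEL_OVERRIDES`, `sft_params`, and the default `learning_rate` of `5e-5` are illustrative names and values, and only the overrides themselves come from the notes.

```python
HELSINKI_TARGET_MODULES = ["k_proj", "v_proj", "q_proj", "out_proj"]

# Only the values stated in the notes are recorded here.
MODEL_OVERRIDES = {
    "Helsinki-NLP/opus-mt-en-fr": {"target_modules": HELSINKI_TARGET_MODULES},
    "Helsinki-NLP/opus-mt-ru-en": {"target_modules": HELSINKI_TARGET_MODULES},
    "Helsinki-NLP/opus-mt-ru-es": {
        "target_modules": HELSINKI_TARGET_MODULES,
        "learning_rate": 1e-4,
    },
}


def sft_params(model_name: str, defaults: dict) -> dict:
    """Return the defaults with any model-specific overrides applied."""
    return {**defaults, **MODEL_OVERRIDES.get(model_name, {})}


# The default learning rate below is a placeholder, not the project's value.
params = sft_params("Helsinki-NLP/opus-mt-ru-es", {"learning_rate": 5e-5})
print(params)
```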
Metrics
BLEU
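
To make the metric concrete, here is a minimal corpus-level BLEU sketch in plain Python. It assumes whitespace tokenization, one reference per hypothesis, n-grams up to order 4, and no smoothing. The `corpus_bleu` function is written for illustration and is not necessarily the implementation used to score these models; established tools such as sacreBLEU apply their own tokenization, so their scores can differ.

```python
import math
from collections import Counter
from typing import List, Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU with one reference per hypothesis, scaled to 0-100."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts, ref_counts = ngrams(hyp_tok, n), ngrams(ref_tok, n)
            # Clipped counts: an n-gram is credited at most as often as it
            # appears in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing: any zero precision gives a score of 0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for hypotheses shorter than their references.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


score = corpus_bleu(
    ["the cat sat on the mat today"],
    ["the cat sat on the mat"],
)
print(round(score, 2))
```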