Summarization
Models
| Model | Lang |
|---|---|
| mrm8488/bert-mini2bert-mini-finetuned-cnn_daily_mail-summarization | EN |
| | EN |
| mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization | EN |
| | RU |
| | RU |
| | RU |
Datasets
-
Lang: EN
Rows: 973
Preprocess:
Rename column `report` to `source`.
Rename column `summary` to `target`.
Reset indexes.
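
The rename-and-reset steps above recur for every dataset in this list, so they can be sketched once as a small helper. This is a minimal sketch assuming the data is held in a pandas `DataFrame`; the helper name `rename_to_source_target` and the toy rows are hypothetical and only serve as illustration.

```python
import pandas as pd


def rename_to_source_target(df, source_col, target_col):
    """Rename the given columns to `source`/`target` and reset the index."""
    renamed = df.rename(columns={source_col: "source", target_col: "target"})
    # drop=True discards the old index instead of keeping it as a column.
    return renamed.reset_index(drop=True)


# Toy stand-in for a dataset with `report` and `summary` columns (made-up rows).
raw = pd.DataFrame(
    {
        "report": ["Full text of the first report.", "Full text of the second report."],
        "summary": ["First summary.", "Second summary."],
    },
    index=[10, 20],
)

clean = rename_to_source_target(raw, "report", "summary")
print(clean)
```

The same helper would apply to the other datasets by passing their column names, for example `article` and `abstract`.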
-
Lang: EN
Rows: 11490
Preprocess:
Select `1.0.0` subset.
Drop columns `id`.
Rename column `article` to `source`.
Rename column `highlights` to `target`.
Delete duplicates in dataset.
Remove substring `(CNN)` for each `source` row.
Reset indexes.
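
The preprocessing for this dataset has more steps than the others, so a sketch of the sequence may help. It assumes a pandas `DataFrame` with `id`, `article` and `highlights` columns. The function name `preprocess_news` and the sample rows are hypothetical, and the subset selection is left out because it happens when the dataset is loaded.

```python
import pandas as pd


def preprocess_news(df):
    """Apply the listed steps in order to a frame with id/article/highlights."""
    df = df.drop(columns=["id"])
    df = df.rename(columns={"article": "source", "highlights": "target"})
    # Rows that are identical in both remaining columns are removed.
    df = df.drop_duplicates()
    # regex=False treats "(CNN)" as a literal substring, not a regex group.
    df = df.assign(source=df["source"].str.replace("(CNN)", "", regex=False))
    return df.reset_index(drop=True)


# Made-up rows: the first two differ only in `id`, so one is dropped.
raw = pd.DataFrame(
    {
        "id": ["a", "b", "c"],
        "article": [
            "(CNN)A storm hit the coast.",
            "(CNN)A storm hit the coast.",
            "Markets rose on Friday.",
        ],
        "highlights": ["Storm hits coast.", "Storm hits coast.", "Markets up."],
    }
)

clean = preprocess_news(raw)
print(clean)
```

Dropping `id` before deduplicating matters here: with the `id` column still present, the two identical articles would not count as duplicates.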
tomasg25/scientific_lay_summarisation
Lang: EN
Rows: 1376
Preprocess:
Select `plos` subset.
Drop columns `section_headings`, `keywords`, `title`, `year`.
Rename column `article` to `source`.
Rename column `summary` to `target`.
Reset indexes.
-
Lang: EN
Rows: 6658
Preprocess:
Rename column `article` to `source`.
Rename column `abstract` to `target`.
Reset indexes.
-
Lang: RU
Rows: 6793
Preprocess:
Drop columns `title`, `date`, `url`.
Rename column `text` to `source`.
Rename column `summary` to `target`.
Reset indexes.
-
Lang: RU
Rows: 30454
Preprocess:
Select `train` split.
Drop columns `title`, `date`, `url`.
Rename column `article_content` to `source`.
Rename column `summary` to `target`.
Reset indexes.
-
Lang: RU
Rows: 7609
Preprocess:
Rename column `info` to `source`.
Rename column `summary` to `target`.
Reset indexes.
-
Lang: RU
Rows: 95
Preprocess:
Select `train` split.
Rename column `Reviews` to `source`.
Rename column `Summary` to `target`.
Reset indexes.
Metrics
BLEU
ROUGE
Note
Use the `rougeL` metric and set the `seed=77` parameter
when loading the ROUGE metric.
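
To make concrete what the `rougeL` score measures, here is a from-scratch sketch of the ROUGE-L F-measure, based on the longest common subsequence (LCS) of tokens. It is an illustration only, not the implementation behind the ROUGE metric referenced above: it assumes lowercase whitespace tokenization, applies no stemming, and does not involve the `seed` parameter. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, prediction):
    """ROUGE-L F-measure with simple lowercase whitespace tokenization."""
    ref = reference.lower().split()
    pred = prediction.lower().split()
    if not ref or not pred:
        return 0.0
    lcs = lcs_length(ref, pred)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("the cat sat on the mat", "the cat on the mat")
print(round(score, 3))
```

In the example, the prediction is a subsequence of the reference, so precision is 1 and recall is 5/6.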