Generation

Models

Model	Lang	Task
timpal0l/mdeberta-v3-base-squad2	EN/RU	CLOSED QA
VMware/electra-small-mrqa	EN	CLOSED QA
EleutherAI/pythia-160m-deduped	EN	OPEN QA
JackFram/llama-68m	EN	OPEN QA
EleutherAI/gpt-neo-125m	EN	OPEN QA

Datasets CLOSED QA

starmpcc/Asclepius-Synthetic-Clinical-Notes
1. Lang: EN
2. Rows: 20038
3. Preprocess:
  1. Choose task Question Answering.
  2. Choose columns note, question and answer.
  3. Rename column note to context.
  4. Rename column answer to target.
  5. Reset indexes.
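
The steps above can be sketched with pandas on a toy frame. This is only an illustration: the `task` column name, the extra `patient_id` column and all values are assumptions made for the example, and loading the real dataset is left out.

```python
import pandas as pd

# Toy stand-in for the raw dataset (assumed layout, invented values).
raw = pd.DataFrame(
    {
        "patient_id": [1, 2, 3],
        "note": ["Note A", "Note B", "Note C"],
        "question": ["Q1?", "Q2?", "Q3?"],
        "answer": ["A1", "A2", "A3"],
        "task": ["Question Answering", "Summarization", "Question Answering"],
    }
)

closed_qa = (
    raw[raw["task"] == "Question Answering"]  # step 1: keep one task
    .loc[:, ["note", "question", "answer"]]  # step 2: keep three columns
    .rename(columns={"note": "context", "answer": "target"})  # steps 3 and 4
    .reset_index(drop=True)  # step 5: indexes start from 0 again
)
```

Passing `drop=True` to `reset_index` discards the old index instead of keeping it as an extra column.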
lionelchg/dolly_closed_qa
1. Lang: EN
2. Rows: 1773
3. Preprocess:
  1. Choose columns instruction, context and response.
  2. Rename column instruction to question.
  3. Rename column response to target.
  4. Reset indexes.
HuggingFaceH4/no_robots
1. Lang: EN
2. Rows: 260
3. Preprocess:
  1. Select train_sft split.
  2. Choose category Closed QA.
  3. Choose columns prompt, messages.
  4. Rename column prompt to question.
  5. Reset indexes.
  6. Process column messages with raw text into two columns: context and answer.
sberquad
1. Lang: RU
2. Rows: 5040
3. Preprocess:
  1. Select validation split.
  2. Choose columns question, context, answers.
  3. Rename column answers to target.
  4. Process column target with raw text so that only the answer remains in this column.
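
A minimal sketch of the last step, assuming each raw value is a SQuAD-style dictionary with `text` and `answer_start` lists. The toy rows are invented and written in English for brevity, while the real data is in Russian.

```python
import pandas as pd

# Toy stand-in for the validation split (assumed layout, invented values).
raw = pd.DataFrame(
    {
        "id": [10, 11],
        "context": ["Moscow is the capital.", "The Volga flows into the Caspian Sea."],
        "question": ["What is the capital?", "Where does the Volga flow?"],
        "answers": [
            {"text": ["Moscow"], "answer_start": [0]},
            {"text": ["the Caspian Sea"], "answer_start": [21]},
        ],
    }
)

squad_like = raw.loc[:, ["question", "context", "answers"]].rename(
    columns={"answers": "target"}
)
# Keep only the first answer string from every dictionary.
squad_like = squad_like.assign(
    target=squad_like["target"].map(lambda answers: answers["text"][0])
)
```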
RussianNLP/wikiomnia
1. Lang: RU
2. Rows: 173000
3. Preprocess:
  1. Select train split and wikiomnia_ruGPT3_filtered subset.
  2. Drop NaN.
  3. Drop duplicates.
  4. Reset indexes.
  5. Choose columns question, summary, answer.
  6. Rename columns summary to context and answer to target.
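
The cleaning steps 2 to 6 might look as follows in pandas. The toy frame and the extra `title` column are assumptions for illustration; selecting the split and subset (step 1) is omitted.

```python
import pandas as pd

# Toy stand-in with one duplicated row and one row with a missing question.
raw = pd.DataFrame(
    {
        "title": ["T1", "T1", "T2", "T3"],
        "summary": ["S1", "S1", "S2", "S3"],
        "question": ["Q1", "Q1", None, "Q3"],
        "answer": ["A1", "A1", "A2", "A3"],
    }
)

cleaned = (
    raw.dropna()  # step 2: remove rows with missing values
    .drop_duplicates()  # step 3: remove exact duplicate rows
    .reset_index(drop=True)  # step 4
    .loc[:, ["question", "summary", "answer"]]  # step 5
    .rename(columns={"summary": "context", "answer": "target"})  # step 6
)
```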

Datasets OPEN QA

truthful_qa
1. Lang: EN
2. Rows: 817
3. Preprocess:
  1. Drop columns type, category, correct_answers, incorrect_answers, source.
  2. Rename column best_answer to target.
jtatman/databricks-dolly-8k-qa-open-close
1. Lang: EN
2. Rows: 7706
3. Preprocess:
  1. Filter dataset rows by category == open_qa.
  2. Drop columns context, category, __index_level_0__.
  3. Rename column instruction to question.
  4. Rename column response to target.
tatsu-lab/alpaca
1. Lang: EN
2. Rows: 52002
3. Preprocess:
  1. Drop columns input, text.
  2. Rename column instruction to question.
  3. Rename column output to target.
lionelchg/dolly_open_qa
1. Lang: EN
2. Rows: 188
3. Preprocess:
  1. Drop columns context, category, text.
  2. Rename column instruction to question.
  3. Rename column response to target.

Inferring batch

Implementing the method stubs.labs.lab_7_llm.main.LLMPipeline._infer_batch() for the closed question-answering task has its specifics:

Transpose sample_batch before passing it to the tokenizer, so that it becomes a sequence of tuples, each holding two strings: a question and a context.

The model prediction consists of two tensors containing the start and end scores, respectively.

Only the ids between the start and end positions of the answer have to be decoded and passed on.

To get the ids, iterate through the input_ids field of the tokenized batch.
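
The logic above can be illustrated without a model, using plain lists in place of tensors. The helper names `transpose_batch` and `answer_token_ids` are hypothetical, the scores and ids are invented, and treating the end position as inclusive is an assumption rather than something stated here.

```python
def transpose_batch(sample_batch):
    """Turn (questions, contexts) columns into (question, context) pairs."""
    return list(zip(*sample_batch))


def argmax(scores):
    """Return the index of the highest score in a flat list."""
    return max(range(len(scores)), key=scores.__getitem__)


def answer_token_ids(input_ids, start_scores, end_scores):
    """Slice the answer ids out of every sample of the tokenized batch."""
    spans = []
    for ids, starts, ends in zip(input_ids, start_scores, end_scores):
        start, end = argmax(starts), argmax(ends)
        # Assumption: the end position is inclusive; start > end gives [].
        spans.append(ids[start:end + 1])
    return spans


sample_batch = [
    ("Who wrote it?", "Where is it?"),  # questions
    ("Ann wrote it.", "It is in Rome."),  # contexts
]
pairs = transpose_batch(sample_batch)

# Invented values standing in for the tokenizer output and model scores.
input_ids = [[101, 7, 8, 9, 102], [101, 4, 5, 6, 102]]
start_scores = [[0.1, 2.0, 0.3, 0.1, 0.0], [0.0, 0.1, 0.2, 3.0, 0.1]]
end_scores = [[0.1, 0.2, 1.5, 0.1, 0.0], [0.0, 0.1, 0.2, 2.5, 0.1]]
spans = answer_token_ids(input_ids, start_scores, end_scores)
```

In a real pipeline each span would then be turned back into text with the tokenizer's `decode` method, and the scores would come from the model output as tensors rather than lists.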

Metrics

Open QA
- BLEU
- ROUGE
Closed QA
- squad

Note

To calculate the squad metric, you need to convert the data into a special structure. You can find this structure in the metrics directory of this repository.
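
As a rough illustration, the sketch below builds the layout used by the Hugging Face `evaluate` implementation of the squad metric. The helper name `to_squad_format` is hypothetical, the ids are generated from row positions, and the `answer_start` value is a placeholder; the structure described in the metrics directory may differ and takes precedence.

```python
def to_squad_format(predictions, targets):
    """Pair plain strings into squad-style predictions and references."""
    squad_predictions = []
    squad_references = []
    for idx, (prediction, target) in enumerate(zip(predictions, targets)):
        sample_id = str(idx)  # assumption: the row position serves as the id
        squad_predictions.append(
            {"id": sample_id, "prediction_text": prediction}
        )
        squad_references.append(
            {
                "id": sample_id,
                # answer_start is a placeholder, not a real character offset.
                "answers": {"text": [target], "answer_start": [0]},
            }
        )
    return squad_predictions, squad_references


squad_predictions, squad_references = to_squad_format(
    ["Moscow", "in 1976"], ["Moscow", "1976"]
)
```

Both lists share the same ids, which is how each prediction is matched with its reference.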