lab_7_llm package

Submodules

Neural machine translation module.

class lab_7_llm.main.LLMPipeline(model_name: str, dataset: TaskDataset, max_length: int, batch_size: int, device: str)

Bases: AbstractLLMPipeline

A class that initializes a model, analyzes its properties, and runs inference with it.

__init__(model_name: str, dataset: TaskDataset, max_length: int, batch_size: int, device: str) None

Initialize an instance of LLMPipeline.

Parameters:
  • model_name (str) – The name of the pre-trained model

  • dataset (TaskDataset) – The dataset used

  • max_length (int) – The maximum length of generated sequence

  • batch_size (int) – The size of the batch inside DataLoader

  • device (str) – The device for inference

_infer_batch(sample_batch: Sequence[tuple[str, ...]]) list[str]

Run model inference on a single batch.

Parameters:

sample_batch (Sequence[tuple[str, ...]]) – Batch of samples to run inference on

Returns:

Model predictions as strings

Return type:

list[str]

analyze_model() dict

Analyze the model and compute its properties.

Returns:

Properties of a model

Return type:

dict

infer_dataset(**kwargs: Any) Any

Run model inference on the whole dataset.

Returns:

Data with predictions.

Return type:

pandas.DataFrame
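
The relationship between infer_dataset and _infer_batch can be illustrated with a minimal sketch. It assumes the dataset is split into consecutive batches of at most batch_size samples and that each batch goes to a batch-level predictor. The names iter_batches and fake_infer_batch and the sample data are hypothetical stand-ins, not the lab's implementation, which according to the constructor description relies on a DataLoader.

```python
from collections.abc import Iterator, Sequence


def iter_batches(
    samples: Sequence[tuple[str, ...]], batch_size: int
) -> Iterator[Sequence[tuple[str, ...]]]:
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def fake_infer_batch(sample_batch: Sequence[tuple[str, ...]]) -> list[str]:
    """Hypothetical stand-in for a model call: upper-case the first field."""
    return [sample[0].upper() for sample in sample_batch]


# Hypothetical single-field samples, shaped like TaskDataset items.
samples = [("hello",), ("world",), ("again",)]

# Collect per-batch predictions into one flat list, preserving sample order.
predictions: list[str] = []
for batch in iter_batches(samples, batch_size=2):
    predictions.extend(fake_infer_batch(batch))
```

Because the batches are processed in order, the predictions line up with the original samples, so they could be attached to the source data as a new column.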

infer_sample(**kwargs: Any) Any

Run model inference on a single sample.

Parameters:

sample (tuple[str, ...]) – The sample to run inference on

Returns:

A prediction

Return type:

str | None

class lab_7_llm.main.RawDataImporter(hf_name: str | None)

Bases: AbstractRawDataImporter

A class that imports the HuggingFace dataset.

obtain(**kwargs: Any) Any

Download a dataset.

class lab_7_llm.main.RawDataPreprocessor(raw_data: DataFrame)

Bases: AbstractRawDataPreprocessor

A class that analyzes and preprocesses a dataset.

analyze() dict

Analyze a dataset.

Returns:

Dataset key properties

Return type:

dict

transform(**kwargs: Any) Any

Apply preprocessing transformations to the raw dataset.

class lab_7_llm.main.TaskDataset(data: DataFrame)

Bases: Dataset

A class that converts pd.DataFrame to Dataset and works with it.

__getitem__(index: int) tuple[str, ...]

Retrieve an item from the dataset by index.

Parameters:

index (int) – Index of sample in dataset

Returns:

The item at the given index

Return type:

tuple[str, ...]

__init__(data: DataFrame) None

Initialize an instance of TaskDataset.

Parameters:

data (pandas.DataFrame) – Original data

__len__() int

Return the number of items in the dataset.

Returns:

The number of items in the dataset

Return type:

int

property data: DataFrame

Property with access to preprocessed DataFrame.

Returns:

Preprocessed DataFrame

Return type:

pandas.DataFrame
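
The TaskDataset interface documented above (__len__, __getitem__, and the data property) can be sketched without external dependencies. The class below is a hypothetical stand-in: it stores rows as a list of tuples rather than a pandas.DataFrame and does not inherit from the Dataset base class, so it only illustrates the shape of the protocol.

```python
class RowDataset:
    """Hypothetical stand-in for TaskDataset backed by a list of tuples."""

    def __init__(self, rows: list[tuple[str, ...]]) -> None:
        self._rows = rows

    def __len__(self) -> int:
        """Return the number of items in the dataset."""
        return len(self._rows)

    def __getitem__(self, index: int) -> tuple[str, ...]:
        """Return the item at the given index as a tuple of strings."""
        return self._rows[index]

    @property
    def data(self) -> list[tuple[str, ...]]:
        """Expose the underlying rows (a DataFrame in the documented class)."""
        return self._rows


# Hypothetical single-field samples.
dataset = RowDataset([("Guten Morgen",), ("Danke",)])
first_item = dataset[0]
size = len(dataset)
```

Implementing __len__ and an integer-indexed __getitem__ is what lets batching utilities iterate over such an object sample by sample.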

class lab_7_llm.main.TaskEvaluator(data_path: Path, metrics: Iterable[Metrics])

Bases: AbstractTaskEvaluator

A class that evaluates prediction quality using the specified metrics.

__init__(data_path: Path, metrics: Iterable[Metrics]) None

Initialize an instance of TaskEvaluator.

Parameters:
  • data_path (pathlib.Path) – Path to predictions

  • metrics (Iterable[Metrics]) – List of metrics to check

run(**kwargs: Any) Any

Evaluate the predictions against the references using the specified metrics.

Returns:

A dictionary containing information about the calculated metric

Return type:

dict | None
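
A minimal sketch of the kind of dictionary run might return, assuming it maps each metric name to a score. The exact_match metric, the evaluate helper, and the example strings are hypothetical and chosen only for simplicity. The Metrics members actually used by the lab are not listed in this reference, and reading predictions from data_path is omitted because the file format is not documented here.

```python
from collections.abc import Callable

# A metric takes predictions and references and returns a single score.
MetricFn = Callable[[list[str], list[str]], float]


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Toy metric: share of predictions identical to their reference."""
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


def evaluate(
    predictions: list[str],
    references: list[str],
    metrics: dict[str, MetricFn],
) -> dict[str, float]:
    """Compute every requested metric and collect the scores by name."""
    return {name: fn(predictions, references) for name, fn in metrics.items()}


report = evaluate(
    predictions=["a cat", "a dog"],
    references=["a cat", "the dog"],
    metrics={"exact_match": exact_match},
)
```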

Web service for model inference.

class lab_7_llm.service.Query(question: str)

Model defining an incoming inference request.

question: str
async lab_7_llm.service.infer(query: Query) dict

Main endpoint for model call.

Parameters:

query (Query) – Query from client

Returns:

Content with predictions.

Return type:

dict
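
Since Query has a single question field, a client request body can be sketched as a JSON object with that one key. QuerySketch and build_request_body below are hypothetical illustrations. The endpoint path and the keys of the returned dictionary are not documented in this reference, so they are left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class QuerySketch:
    """Hypothetical stand-in mirroring the documented Query model."""

    question: str


def build_request_body(question: str) -> str:
    """Serialize a query into the JSON body a client would send."""
    return json.dumps(asdict(QuerySketch(question=question)))


body = build_request_body("Guten Morgen")
```

A client would send this string as the body of a request to the inference endpoint, with the content type set to application/json.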

lab_7_llm.service.init_application() tuple[FastAPI, LLMPipeline]

Initialize core application.

Run: uvicorn reference_service.server:app --reload

Returns:

Instance of server and pipeline

Return type:

tuple[fastapi.FastAPI, LLMPipeline]

async lab_7_llm.service.root(request: Request) TemplateResponse

Root endpoint with server-side rendering.

Parameters:

request (Request) – The incoming HTTP request

Returns:

Rendered index.html template

Return type:

TemplateResponse