lab_7 package

Submodules

Neural machine translation module.

class lab_7_llm.main.LLMPipeline(model_name: str, dataset: TaskDataset, max_length: int, batch_size: int, device: str)

Bases: AbstractLLMPipeline

A class that initializes a model, analyzes its properties, and runs inference with it.

__init__(model_name: str, dataset: TaskDataset, max_length: int, batch_size: int, device: str) → None

Initialize an instance of LLMPipeline.

Parameters:

model_name (str) – The name of the pre-trained model
dataset (TaskDataset) – The dataset used
max_length (int) – The maximum length of generated sequence
batch_size (int) – The size of the batch inside DataLoader
device (str) – The device for inference

_infer_batch(sample_batch: Sequence[tuple[str, ...]]) → list[str]

Run model inference on a single batch.

Parameters: sample_batch (Sequence[tuple[str, ...]]) – Batch to run the model on
Returns: Model predictions as strings
Return type: list[str]
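
The batch type Sequence[tuple[str, ...]] can be read as a sequence of per-sample tuples of strings. The following is a minimal, dependency-free sketch of grouping samples into batches of at most batch_size items. The helper iter_batches and the sample sentences are hypothetical; the actual pipeline relies on a DataLoader, whose collation may arrange a batch differently.

```python
from collections.abc import Iterator, Sequence


def iter_batches(
    samples: Sequence[tuple[str, ...]], batch_size: int
) -> Iterator[list[tuple[str, ...]]]:
    """Yield consecutive slices of at most batch_size samples (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


# Illustrative data: each sample is a one-element tuple holding the source text.
samples = [("Hello.",), ("How are you?",), ("Good night.",)]
batches = list(iter_batches(samples, batch_size=2))
```

With three samples and batch_size=2, the last batch holds only the remaining sample, so the batch handler should not assume a fixed batch length.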

analyze_model() → dict

Analyze the model and compute its properties.

Returns: Properties of the model
Return type: dict

infer_dataset(**kwargs: Any) → Any

Run model inference on the whole dataset.

Returns: Data with predictions
Return type: pandas.DataFrame

infer_sample(**kwargs: Any) → Any

Run model inference on a single sample.

Parameters: sample (tuple[str, ...]) – The sample to run the model on
Returns: A prediction
Return type: str | None

class lab_7_llm.main.RawDataImporter(hf_name: str | None)

Bases: AbstractRawDataImporter

A class that imports the HuggingFace dataset.

obtain(**kwargs: Any) → Any

Download a dataset.

class lab_7_llm.main.RawDataPreprocessor(raw_data: DataFrame)

Bases: AbstractRawDataPreprocessor

A class that analyzes and preprocesses a dataset.

analyze() → dict

Analyze a dataset.

Returns: Dataset key properties
Return type: dict
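
The reference does not list which properties analyze() reports. As a rough illustration only, the sketch below computes a few plausible ones over plain dictionaries instead of a DataFrame. The helper analyze_rows, the "source" column, and every key in the returned dictionary are assumptions, not part of the documented API.

```python
from typing import Optional


def analyze_rows(rows: list[dict[str, Optional[str]]], column: str) -> dict[str, int]:
    """Compute a few summary properties for one text column (hypothetical helper)."""
    texts = [row.get(column) for row in rows]
    # Treat both None and "" as empty values.
    non_empty = [text for text in texts if text]
    lengths = [len(text) for text in non_empty]
    return {
        "n_samples": len(rows),
        "n_empty": len(texts) - len(non_empty),
        "n_duplicates": len(non_empty) - len(set(non_empty)),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }


# Illustrative rows with one duplicate and one missing value.
rows = [
    {"source": "Hello."},
    {"source": "Hello."},
    {"source": None},
    {"source": "Good night."},
]
properties = analyze_rows(rows, "source")
```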

transform(**kwargs: Any) → Any

Apply preprocessing transformations to the raw dataset.

class lab_7_llm.main.TaskDataset(data: DataFrame)

Bases: Dataset

A class that converts pd.DataFrame to Dataset and works with it.

__getitem__(index: int) → tuple[str, ...]

Retrieve an item from the dataset by index.

Parameters: index (int) – Index of the sample in the dataset
Returns: The item at the given index
Return type: tuple[str, ...]

__init__(data: DataFrame) → None

Initialize an instance of TaskDataset.

Parameters: data (pandas.DataFrame) – Original data

__len__() → int

Return the number of items in the dataset.

Returns: The number of items in the dataset
Return type: int

property data: DataFrame

Property with access to preprocessed DataFrame.

Returns: Preprocessed DataFrame
Return type: pandas.DataFrame
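
The __len__ and __getitem__ methods above follow the usual map-style dataset protocol. A dependency-free sketch of that protocol might look as follows; ListBackedDataset is a hypothetical stand-in that stores rows in a list rather than a DataFrame, and the two-column rows are only an assumption about what a sample contains.

```python
from collections.abc import Sequence


class ListBackedDataset:
    """Hypothetical stand-in for TaskDataset backed by a list instead of a DataFrame."""

    def __init__(self, rows: Sequence[Sequence[str]]) -> None:
        # Store every row as a tuple so items match the tuple[str, ...] return type.
        self._rows = [tuple(row) for row in rows]

    def __len__(self) -> int:
        return len(self._rows)

    def __getitem__(self, index: int) -> tuple[str, ...]:
        return self._rows[index]


dataset = ListBackedDataset([["Hello.", "Hallo."], ["Thanks.", "Danke."]])
first = dataset[0]
size = len(dataset)
```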

class lab_7_llm.main.TaskEvaluator(data_path: Path, metrics: Iterable[Metrics])

Bases: AbstractTaskEvaluator

A class that compares prediction quality using the specified metric.

__init__(data_path: Path, metrics: Iterable[Metrics]) → None

Initialize an instance of TaskEvaluator.

Parameters:

data_path (pathlib.Path) – Path to predictions
metrics (Iterable[Metrics]) – List of metrics to check

run(**kwargs: Any) → Any

Evaluate the predictions against the references using the specified metric.

Returns: A dictionary containing information about the calculated metric
Return type: dict | None
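
To make the shape of the result concrete, the sketch below scores a predictions file and returns a dictionary keyed by metric name. It is an illustration under several assumptions: the file is a CSV with "target" and "predictions" columns, and exact match stands in for whatever metrics the real evaluator is configured with. None of these names come from the reference above.

```python
import csv
import tempfile
from pathlib import Path


def exact_match(data_path: Path) -> dict[str, float]:
    """Return the share of rows whose prediction equals the reference (stand-in metric)."""
    with data_path.open(newline="", encoding="utf-8") as file:
        rows = list(csv.DictReader(file))
    if not rows:
        return {"exact_match": 0.0}
    hits = sum(row["target"] == row["predictions"] for row in rows)
    return {"exact_match": hits / len(rows)}


# Write a tiny illustrative predictions file and score it.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "predictions.csv"
    path.write_text(
        "target,predictions\nHallo.,Hallo.\nDanke.,Danke sehr.\n", encoding="utf-8"
    )
    result = exact_match(path)
```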

Web service for model inference.

class lab_7_llm.service.Query(question: str)

Model defining an incoming request to the model.

question: str

async lab_7_llm.service.infer(query: Query) → dict

Main endpoint for model call.

Parameters: query (Query) – Query from the client
Returns: Content with predictions
Return type: dict
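
Since Query has a single question field, a client request body would presumably be a JSON object with that key. The sketch below builds such a request with the standard library only. The host, port, and the /infer path are assumptions, as the reference does not state the route; the request is constructed but not sent, so the snippet runs without a live server.

```python
import json
from urllib.request import Request

# Assumptions: the service listens on localhost:8000 and exposes the endpoint at /infer.
url = "http://localhost:8000/infer"

# The body mirrors the single documented field of Query.
payload = json.dumps({"question": "How are you?"}).encode("utf-8")

request = Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# To actually send it against a running server:
# from urllib.request import urlopen
# with urlopen(request) as response:
#     print(json.load(response))
```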

lab_7_llm.service.init_application() → tuple[FastAPI, LLMPipeline]

Initialize core application.

Run: uvicorn reference_service.server:app --reload

Returns: Instance of server and pipeline
Return type: tuple[fastapi.FastAPI, LLMPipeline]

async lab_7_llm.service.root(request: Request) → TemplateResponse

Root endpoint with server-side rendering.

Parameters: request (Request) – Incoming request
Returns: Template with index.html
Return type: TemplateResponse