core_utils package
Submodules
Module with description of abstract LLM pipeline.
- class core_utils.llm.llm_pipeline.AbstractLLMPipeline(model_name: str, dataset: Dataset, max_length: int, batch_size: int, device: str = 'cpu')
Bases:
ABC
Abstract LLM Pipeline.
- __init__(model_name: str, dataset: Dataset, max_length: int, batch_size: int, device: str = 'cpu') None
Initialize an instance of AbstractLLMPipeline.
- Parameters:
model_name (str) – The name of the pre-trained model.
dataset (torch.utils.data.dataset.Dataset) – The dataset used.
max_length (int) – The maximum length of the generated sequence.
batch_size (int) – The size of the batch inside DataLoader.
device (str) – The device for inference.
- _abc_impl = <_abc._abc_data object>
- _model: HFModelLike | None
The underlying model, or None if no model is set.
- abstract analyze_model() dict
Analyze the model and compute its properties.
- Returns:
Properties of a model
- Return type:
dict
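A minimal sketch of how a concrete pipeline might implement the abstract analyze_model() method. To stay runnable without core_utils or torch installed, it defines a local stand-in base class (PipelineBase) that only mirrors the documented constructor signature; DemoPipeline and the property names in the returned dict are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any


class PipelineBase(ABC):
    """Local stand-in mirroring the documented constructor (not the real class)."""

    def __init__(self, model_name: str, dataset: Any, max_length: int,
                 batch_size: int, device: str = "cpu") -> None:
        self._model_name = model_name
        self._dataset = dataset
        self._max_length = max_length
        self._batch_size = batch_size
        self._device = device
        self._model = None  # documented type: HFModelLike | None

    @abstractmethod
    def analyze_model(self) -> dict:
        """Return properties of the model."""


class DemoPipeline(PipelineBase):
    """Hypothetical subclass; a real one would inspect the loaded model."""

    def analyze_model(self) -> dict:
        # Property names are assumptions chosen only for illustration.
        return {
            "model_name": self._model_name,
            "max_length": self._max_length,
            "device": self._device,
        }


pipeline = DemoPipeline("demo-model", dataset=[], max_length=120, batch_size=4)
properties = pipeline.analyze_model()
```

Because analyze_model() is abstract, instantiating the base class directly raises TypeError; only subclasses that implement it can be created.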
- class core_utils.llm.llm_pipeline.HFModelLike(*args, **kwargs)
Bases:
Protocol
Protocol definition of HF models.
- __init__(*args, **kwargs)
- _abc_impl = <_abc._abc_data object>
- _is_protocol = True
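The listing above does not show which members HFModelLike requires, so the following is only a sketch of how a typing.Protocol enables structural typing in general. ModelLike, its generate method, EchoModel, and NotAModel are all hypothetical names introduced for illustration.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ModelLike(Protocol):
    """Hypothetical protocol; not the real HFModelLike definition."""

    def generate(self, *args: Any, **kwargs: Any) -> Any: ...


class EchoModel:
    # No inheritance from ModelLike: having a matching method is enough.
    def generate(self, *args: Any, **kwargs: Any) -> Any:
        return args


class NotAModel:
    pass


# runtime_checkable allows isinstance() to check for the method's presence.
is_model = isinstance(EchoModel(), ModelLike)
is_not_model = isinstance(NotAModel(), ModelLike)
```

A protocol lets type checkers accept any object with the right shape, which is useful when the concrete model classes come from a third-party library.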
Metrics enum.
- class core_utils.llm.metrics.Metrics(value)
Bases:
Enum
Metrics enum.
- ACCURACY = 'accuracy'
- BLEU = 'bleu'
- F1 = 'f1'
- PRECISION = 'precision'
- RECALL = 'recall'
- ROUGE = 'rouge'
- SQUAD = 'squad'
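A short sketch of how the enum values above can be used to turn plain strings (for example, from a settings file) into Metrics members. The enum is redefined locally with the documented members so that the snippet runs without core_utils installed; the requested list is a made-up input.

```python
from enum import Enum


class Metrics(Enum):
    """Local copy of the documented members, so the sketch runs standalone."""

    ACCURACY = "accuracy"
    BLEU = "bleu"
    F1 = "f1"
    PRECISION = "precision"
    RECALL = "recall"
    ROUGE = "rouge"
    SQUAD = "squad"


# Hypothetical input: metric names given as plain strings.
requested = ["bleu", "rouge"]

# Calling the enum with a value looks up the matching member.
metrics = [Metrics(name) for name in requested]

# .value converts a member back to its string form.
names = [metric.value for metric in metrics]
```

Looking up a string that is not one of the listed values raises ValueError, which gives early feedback on misspelled metric names.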
Module with description of abstract data importer.
- class core_utils.llm.raw_data_importer.AbstractRawDataImporter(hf_name: str | None)
Bases:
ABC
Abstract Raw Data Importer.
- __init__(hf_name: str | None) None
Initialize an instance of AbstractRawDataImporter.
- Parameters:
hf_name (str | None) – Name of the Hugging Face dataset
- _abc_impl = <_abc._abc_data object>
- property raw_data: DataFrame | None
Property for original dataset in a table format.
- Returns:
A dataset in a table format
- Return type:
pandas.DataFrame | None
Module with description of abstract raw data preprocessor.
- class core_utils.llm.raw_data_preprocessor.AbstractRawDataPreprocessor(raw_data: DataFrame)
Bases:
ABC
Abstract Raw Data Preprocessor.
- __init__(raw_data: DataFrame) None
Initialize an instance of AbstractRawDataPreprocessor.
- Parameters:
raw_data (pandas.DataFrame) – Original dataset in a table format
- _abc_impl = <_abc._abc_data object>
- property data: DataFrame | None
Property for preprocessed dataset.
- Returns:
Preprocessed dataset in a table format
- Return type:
pandas.DataFrame | None
- class core_utils.llm.raw_data_preprocessor.ColumnNames(value)
Bases:
Enum
Column names for preprocessed DataFrame.
- CONTEXT = 'context'
- HYPOTHESIS = 'hypothesis'
- PREDICTION = 'predictions'
- PREMISE = 'premise'
- QUESTION = 'question'
- SOURCE = 'source'
- TARGET = 'target'
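One plausible use of ColumnNames is to build a rename mapping from dataset-specific column names to the standard ones. The sketch below assumes hypothetical raw names ("article", "summary") and uses plain dicts instead of a DataFrame so that it runs with the standard library only; the enum is a local copy of the documented members.

```python
from enum import Enum


class ColumnNames(Enum):
    """Local copy of the documented members, so the sketch runs standalone."""

    CONTEXT = "context"
    HYPOTHESIS = "hypothesis"
    PREDICTION = "predictions"
    PREMISE = "premise"
    QUESTION = "question"
    SOURCE = "source"
    TARGET = "target"


# Hypothetical raw column names mapped to the standard ones.
RENAME = {
    "article": ColumnNames.SOURCE.value,
    "summary": ColumnNames.TARGET.value,
}


def rename_keys(record: dict) -> dict:
    """Rename known keys and keep all other keys unchanged."""
    return {RENAME.get(key, key): value for key, value in record.items()}


raw_records = [{"article": "Long text", "summary": "Short text", "id": 1}]
clean_records = [rename_keys(record) for record in raw_records]
```

The same mapping could be passed to pandas as DataFrame.rename(columns=RENAME). Note that PREDICTION has the plural value 'predictions', so referring to columns through .value avoids mismatches between the member name and the actual column name.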
Module with description of abstract task evaluator.
- class core_utils.llm.task_evaluator.AbstractTaskEvaluator(metrics: Iterable[Metrics])
Bases:
ABC
Abstract Task Evaluator.
- __init__(metrics: Iterable[Metrics]) None
Initialize an instance of AbstractTaskEvaluator.
- Parameters:
metrics (Iterable[Metrics]) – Iterable of metrics to check
- _abc_impl = <_abc._abc_data object>
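A minimal sketch of an evaluator built on the documented constructor. The listing above shows no evaluation method, so the run method, DemoEvaluator, and the stand-in EvaluatorBase are hypothetical; the Metrics enum is a reduced local copy, and accuracy is computed in plain Python purely for illustration.

```python
from abc import ABC
from enum import Enum
from typing import Iterable


class Metrics(Enum):
    """Subset of the documented Metrics members, copied locally."""

    ACCURACY = "accuracy"
    F1 = "f1"


class EvaluatorBase(ABC):
    """Local stand-in mirroring the documented constructor (not the real class)."""

    def __init__(self, metrics: Iterable[Metrics]) -> None:
        # Store as a tuple so a one-shot iterable can be reused safely.
        self._metrics = tuple(metrics)


class DemoEvaluator(EvaluatorBase):
    def run(self, predictions: list, targets: list) -> dict:
        """Hypothetical method: compute each supported metric."""
        scores = {}
        for metric in self._metrics:
            if metric is Metrics.ACCURACY:
                correct = sum(p == t for p, t in zip(predictions, targets))
                scores[metric.value] = correct / len(targets)
        return scores


evaluator = DemoEvaluator([Metrics.ACCURACY])
scores = evaluator.run([1, 0, 1, 1], [1, 0, 0, 1])
```

Keying the result by metric.value keeps the output serializable and aligned with the string names listed in the Metrics enum.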