Readability

class hinteval.cores.evaluation_metrics.readability.TraditionalIndexes(method: Literal['flesch_kincaid_reading_ease', 'gunning_fog_index', 'smog_index', 'coleman_liau_index', 'automated_readability_index'] = 'flesch_kincaid_reading_ease', spacy_pipeline: Literal['en_core_web_sm', 'en_core_web_lg', 'en_core_web_md', 'en_core_web_trf'] = 'en_core_web_sm', checkpoint: bool = False, checkpoint_step: int = 1, enable_tqdm=False)

Class for evaluating readability of Question or Hint using traditional readability indexes [15].

spacy_pipeline

The spaCy pipeline to use for tokenization.

Type: str

checkpoint

Whether checkpointing is enabled.

Type: bool

checkpoint_step

Step interval for checkpointing.

Type: int

enable_tqdm

Whether the tqdm progress bar is enabled.

Type: bool
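
The checkpoint and checkpoint_step attributes describe periodic saving of partial results. The sketch below illustrates that general idea only. It is not HintEval's implementation, and evaluate_with_checkpoints, score_fn, and the JSON file layout are hypothetical names chosen for this illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def evaluate_with_checkpoints(items: List[str],
                              score_fn: Callable[[str], float],
                              checkpoint_path: Path,
                              checkpoint_step: int = 1) -> List[float]:
    """Score items in order, saving partial results every `checkpoint_step` items."""
    # Resume from an earlier run if a checkpoint file already exists.
    scores: List[float] = (json.loads(checkpoint_path.read_text())
                           if checkpoint_path.exists() else [])
    for index in range(len(scores), len(items)):
        scores.append(score_fn(items[index]))
        if (index + 1) % checkpoint_step == 0:
            checkpoint_path.write_text(json.dumps(scores))
    # Final save so the file always reflects the completed run.
    checkpoint_path.write_text(json.dumps(scores))
    return scores


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scores.json"
    first_run = evaluate_with_checkpoints(["a", "bb"], lambda s: float(len(s)),
                                          path, checkpoint_step=2)
    # A second call with one more item reuses the saved scores and scores only the new item.
    second_run = evaluate_with_checkpoints(["a", "bb", "ccc"], lambda s: float(len(s)),
                                           path, checkpoint_step=2)
    print(first_run, second_run)
```

A larger checkpoint_step writes to disk less often, at the cost of losing more work if a run is interrupted.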

See also

MachineLearningBased: Class for evaluating readability of Question or Hint using machine learning models such as XGBoost and Random Forest.
NeuralNetworkBased: Class for evaluating readability of Question or Hint using contextual embeddings such as BERT and RoBERTa models.
LlmBased: Class for evaluating readability of Question or Hint using large language models.

evaluate(sentences: List[Question | Hint], **kwargs) → List[float]

Evaluates the readability of the given Question or Hint using the specified method [17].

Parameters:

sentences (List[Union[Question, Hint]]) – List of sentences to evaluate.
**kwargs – Additional keyword arguments.

Returns:

List of readability scores for each sentence.

Return type:

List[float]

Notes

This function stores the scores as Metric objects within the metrics attribute of the Question or Hint, with names based on the method, such as “readability-flesch_kincaid_reading_ease-sm”.
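
The example name suggests the pattern readability-<method>-<suffix>, where the suffix looks like the last segment of the spaCy pipeline name (en_core_web_sm gives sm). The helper below is a sketch under that assumption, inferred from the single documented name. traditional_metric_key is a hypothetical function, not part of HintEval.

```python
def traditional_metric_key(method: str, spacy_pipeline: str = "en_core_web_sm") -> str:
    """Build the metric name under the assumed 'readability-<method>-<suffix>' pattern."""
    # Assumption: the suffix is the last underscore-separated part of the pipeline name.
    suffix = spacy_pipeline.rsplit("_", 1)[-1]
    return f"readability-{method}-{suffix}"


key = traditional_metric_key("flesch_kincaid_reading_ease")
print(key)
```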

Examples

>>> from hinteval.cores import Question, Hint
>>> from hinteval.evaluation.readability import TraditionalIndexes
>>>
>>> traditional_indexes = TraditionalIndexes(method='flesch_kincaid_reading_ease')
>>> sentence_1 = Question('What is the capital of Austria?')
>>> sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
>>> sentences = [sentence_1, sentence_2]
>>> results = traditional_indexes.evaluate(sentences)
>>> print(results)
# [87.945, 69.994]
>>> metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences for metric_key, metric_value in
...        sent.metrics.items()]
>>> print(metrics)
# ['readability-flesch_kincaid_reading_ease-sm: 87.945', 'readability-flesch_kincaid_reading_ease-sm: 69.994']
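
For orientation, the textbook Flesch Reading Ease formula is 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words). The sketch below applies it to counts supplied by the caller. Whether HintEval computes the index exactly this way is an assumption. With 6 words, 1 sentence, and 8 syllables the formula gives 87.945, and with 13 words, 1 sentence, and 19 syllables it gives 69.994, which agrees with the scores printed above. The syllable counts are back-solved from those scores rather than taken from the library. The difficulty bands are the commonly cited ones for this index, and flesch_reading_ease and describe_score are hypothetical helper names.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Textbook Flesch Reading Ease from raw counts (higher means easier)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def describe_score(score: float) -> str:
    """Map a score to a commonly cited difficulty band."""
    bands = [(90, "very easy"), (80, "easy"), (70, "fairly easy"),
             (60, "standard"), (50, "fairly difficult"), (30, "difficult")]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very difficult"


question_score = flesch_reading_ease(words=6, sentences=1, syllables=8)
hint_score = flesch_reading_ease(words=13, sentences=1, syllables=19)
print(round(question_score, 3), describe_score(question_score))
print(round(hint_score, 3), describe_score(hint_score))
```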

See also

MachineLearningBased: Class for evaluating readability of Question or Hint using machine learning models such as XGBoost and Random Forest.
NeuralNetworkBased: Class for evaluating readability of Question or Hint using contextual embeddings such as BERT and RoBERTa models.
LlmBased: Class for evaluating readability of Question or Hint using large language models.

release_memory()

Releases the memory used by the class instance.

This method deletes the instance of the class and triggers garbage collection to free up memory.

Examples

>>> from hinteval.evaluation.readability import TraditionalIndexes
>>>
>>> traditional_indexes = TraditionalIndexes(spacy_pipeline='en_core_web_sm')
>>> traditional_indexes.release_memory()
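
Dropping the last reference to an object and then collecting garbage is the general pattern the description refers to. The sketch below shows that pattern with a stand-in class. HeavyEvaluator is hypothetical and does not reflect how HintEval implements release_memory().

```python
import gc
import weakref


class HeavyEvaluator:
    """Hypothetical stand-in for an evaluator that holds a large model in memory."""

    def __init__(self) -> None:
        self.model = bytearray(10_000_000)  # roughly 10 MB placeholder payload


evaluator = HeavyEvaluator()
probe = weakref.ref(evaluator)  # observes the object without keeping it alive

del evaluator   # drop the last strong reference
gc.collect()    # also reclaim anything held only by reference cycles

print(probe() is None)
```

The weak reference is only there to make the effect observable. Memory is reclaimed once no strong references to the instance remain.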

class hinteval.cores.evaluation_metrics.readability.MachineLearningBased(method: Literal['xgboost', 'random_forest'] = 'xgboost', spacy_pipeline: Literal['en_core_web_sm', 'en_core_web_lg', 'en_core_web_md', 'en_core_web_trf'] = 'en_core_web_sm', checkpoint: bool = False, checkpoint_step: int = 1, force_download=False, enable_tqdm=False)

Class for evaluating readability of Question or Hint using machine learning methods such as XGBoost and Random-Forest models [18].

checkpoint

Whether checkpointing is enabled.

Type: bool

checkpoint_step

Step interval for checkpointing.

Type: int

enable_tqdm

Whether the tqdm progress bar is enabled.

Type: bool

See also

TraditionalIndexes: Class for evaluating readability of Question or Hint using traditional readability indexes.
NeuralNetworkBased: Class for evaluating readability of Question or Hint using contextual embeddings such as BERT and RoBERTa models.
LlmBased: Class for evaluating readability of Question or Hint using large language models.

evaluate(sentences: List[Question | Hint], **kwargs) → List[float]

Evaluates the readability of the given Question or Hint using the specified machine learning method [20].

Parameters:

sentences (List[Union[Question, Hint]]) – List of sentences to evaluate.
**kwargs – Additional keyword arguments.

Returns:

List of readability scores for each sentence.

Return type:

List[float]

Notes

This function stores the scores as Metric objects within the metrics attribute of the Question or Hint, with names based on the method, such as “readability-ml-xgboost-sm”.

Examples

>>> from hinteval.cores import Question, Hint
>>> from hinteval.evaluation.readability import MachineLearningBased
>>>
>>> machine_learning = MachineLearningBased(method='xgboost')
>>> sentence_1 = Question('What is the capital of Austria?')
>>> sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
>>> sentences = [sentence_1, sentence_2]
>>> results = machine_learning.evaluate(sentences)
>>> print(results)
# [0, 0]
>>> classes = [sent.metrics['readability-ml-xgboost-sm'].metadata['description'] for sent in sentences]
>>> print(classes)
# ['beginner', 'beginner']
>>> metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences for metric_key, metric_value in
...        sent.metrics.items()]
>>> print(metrics)
# ['readability-ml-xgboost-sm: 0', 'readability-ml-xgboost-sm: 0']

See also

TraditionalIndexes: Class for evaluating readability of Question or Hint using traditional readability indexes.
NeuralNetworkBased: Class for evaluating readability of Question or Hint using contextual embeddings such as BERT and RoBERTa models.
LlmBased: Class for evaluating readability of Question or Hint using large language models.

release_memory()

Releases the memory used by the class instance.

This method deletes the instance of the class and triggers garbage collection to free up memory.

Examples

>>> from hinteval.evaluation.readability import MachineLearningBased
>>>
>>> machine_learning = MachineLearningBased(spacy_pipeline='en_core_web_sm')
>>> machine_learning.release_memory()

class hinteval.cores.evaluation_metrics.readability.NeuralNetworkBased(model_name: Literal['bert-base', 'roberta-large'] = 'bert-base', batch_size: int = 256, checkpoint: bool = False, checkpoint_step: int = 1, force_download=False, enable_tqdm=False)

Class for evaluating readability of Question or Hint using neural network models such as BERT and RoBERTa [21].

checkpoint

Whether checkpointing is enabled.

Type: bool

checkpoint_step

Step interval for checkpointing.

Type: int

enable_tqdm

Whether the tqdm progress bar is enabled.

Type: bool

See also

TraditionalIndexes: Class for evaluating readability of Question or Hint using traditional readability indexes.
MachineLearningBased: Class for evaluating readability of Question or Hint using machine learning models such as XGBoost and Random Forest.
LlmBased: Class for evaluating readability of Question or Hint using large language models.

evaluate(sentences: List[Question | Hint], **kwargs) → List[float]

Evaluates the readability of the given Question or Hint using the specified neural network model [23].

Parameters:

sentences (List[Union[Question, Hint]]) – List of sentences to evaluate.
**kwargs – Additional keyword arguments.

Returns:

List of readability scores for each sentence.

Return type:

List[float]

Notes

This function stores the scores as Metric objects within the metrics attribute of the Question or Hint, with names based on the model, such as “readability-nn-bert-base”.

Examples

>>> from hinteval.cores import Question, Hint
>>> from hinteval.evaluation.readability import NeuralNetworkBased
>>>
>>> neural_network = NeuralNetworkBased(model_name='bert-base')
>>> sentence_1 = Question('What is the capital of Austria?')
>>> sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
>>> sentences = [sentence_1, sentence_2]
>>> results = neural_network.evaluate(sentences)
>>> print(results)
# [0, 0]
>>> classes = [sent.metrics['readability-nn-bert-base'].metadata['description'] for sent in sentences]
>>> print(classes)
# ['beginner', 'beginner']
>>> metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences for metric_key, metric_value in
...        sent.metrics.items()]
>>> print(metrics)
# ['readability-nn-bert-base: 0', 'readability-nn-bert-base: 0']

See also

TraditionalIndexes: Class for evaluating readability of Question or Hint using traditional readability indexes.
MachineLearningBased: Class for evaluating readability of Question or Hint using machine learning models such as XGBoost and Random Forest.
LlmBased: Class for evaluating readability of Question or Hint using large language models.

release_memory()

Releases the memory used by the class instance.

This method deletes the instance of the class and triggers garbage collection to free up memory.

Examples

>>> from hinteval.evaluation.readability import NeuralNetworkBased
>>>
>>> neural_network = NeuralNetworkBased(model_name='bert-base')
>>> neural_network.release_memory()

class hinteval.cores.evaluation_metrics.readability.LlmBased(model_name: str, api_key: str = None, base_url: str = 'https://api.together.xyz/v1', temperature: float = 0.7, top_p: float = 1.0, max_tokens: int = 512, batch_size: int = 10, checkpoint: bool = False, checkpoint_step: int = 1, enable_tqdm=False)

Class for evaluating readability of Question or Hint using large language models [24].

checkpoint

Whether checkpointing is enabled.

Type: bool

checkpoint_step

Step interval for checkpointing.

Type: int

enable_tqdm

Whether the tqdm progress bar is enabled.

Type: bool

See also

TraditionalIndexes: Class for evaluating readability of Question or Hint using traditional readability indexes.
MachineLearningBased: Class for evaluating readability of Question or Hint using machine learning models such as XGBoost and Random Forest.
NeuralNetworkBased: Class for evaluating readability of Question or Hint using contextual embeddings such as BERT and RoBERTa models.

evaluate(sentences: List[Question | Hint], **kwargs) → List[float]

Evaluates the readability of the given Question or Hint using the specified large language model [26].

Parameters:

sentences (List[Union[Question, Hint]]) – List of sentences to evaluate.
**kwargs – Additional keyword arguments.

Returns:

List of readability scores for each sentence.

Return type:

List[float]

Notes

This function stores the scores as Metric objects within the metrics attribute of the Question or Hint, with names based on the model, such as “readability-llm-meta-llama_Meta-Llama-3.1-70B-Instruct-Turbo”.
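
In the example name, the slash in the model identifier meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo appears as an underscore. The helper below is a sketch that reproduces this one documented name. That every model name is transformed the same way is an assumption, and llm_metric_key is a hypothetical function, not part of HintEval.

```python
def llm_metric_key(model_name: str) -> str:
    """Build the metric name under the assumed 'readability-llm-<model>' pattern."""
    # Assumption: '/' in the model identifier is replaced with '_'.
    return f"readability-llm-{model_name.replace('/', '_')}"


key = llm_metric_key("meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo")
print(key)
```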

Examples

>>> from hinteval.cores import Question, Hint
>>> from hinteval.evaluation.readability import LlmBased
>>>
>>> llm = LlmBased(model_name='meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
...          api_key='your_api_key', base_url='base_url', batch_size=2,
...          enable_tqdm=True)
>>> sentence_1 = Question('What is the capital of Austria?')
>>> sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
>>> sentences = [sentence_1, sentence_2]
>>> results = llm.evaluate(sentences)
>>> print(results)
# [0, 0]
>>> classes = [sent.metrics['readability-llm-meta-llama_Meta-Llama-3.1-70B-Instruct-Turbo'].metadata['description'] for sent in sentences]
>>> print(classes)
# ['beginner', 'beginner']
>>> metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences for metric_key, metric_value in
...           sent.metrics.items()]
>>> print(metrics)
# ['readability-llm-meta-llama_Meta-Llama-3.1-70B-Instruct-Turbo: 0', 'readability-llm-meta-llama_Meta-Llama-3.1-70B-Instruct-Turbo: 0']
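
The example above passes the API key as a literal. One alternative is to read it from the environment and build the constructor arguments in one place. This is a sketch: the variable name HINTEVAL_API_KEY and the helper llm_kwargs_from_env are hypothetical, while the keyword names and the default base URL come from the class signature.

```python
import os
from typing import Any, Dict


def llm_kwargs_from_env(model_name: str,
                        env_var: str = "HINTEVAL_API_KEY") -> Dict[str, Any]:
    """Collect LlmBased keyword arguments, reading the API key from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before evaluating.")
    return {
        "model_name": model_name,
        "api_key": api_key,
        "base_url": "https://api.together.xyz/v1",  # default from the class signature
        "batch_size": 2,
        "enable_tqdm": True,
    }


# Placeholder value for this demo only; a real key would be set outside the script.
os.environ.setdefault("HINTEVAL_API_KEY", "your_api_key")
kwargs = llm_kwargs_from_env("meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo")
print(sorted(kwargs))
```

The resulting dictionary could then be unpacked into the constructor as LlmBased(**kwargs).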

See also

TraditionalIndexes: Class for evaluating readability of Question or Hint using traditional readability indexes.
MachineLearningBased: Class for evaluating readability of Question or Hint using machine learning models such as XGBoost and Random Forest.
NeuralNetworkBased: Class for evaluating readability of Question or Hint using contextual embeddings such as BERT and RoBERTa models.

release_memory()

Releases the memory used by the class instance.

This method deletes the instance of the class and triggers garbage collection to free up memory.

Examples

>>> from hinteval.evaluation.readability import LlmBased
>>>
>>> llm = LlmBased(model_name='meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
...          api_key='your_api_key', base_url='base_url')
>>> llm.release_memory()