Familiarity
The Familiarity metric measures how common or well known the information in hints, questions, or answers is. It assesses whether a general audience is likely to understand the content without specialized knowledge, which makes the hints easier to grasp.
The HintEval framework provides two methods for computing the Familiarity metric.
Note
The evaluate function takes a list of Hint, Question, or Answer objects as input. Each object holds the text to be evaluated for familiarity.
Word-Frequency
The Word-Frequency method evaluates the familiarity of a text by how often its words occur in a large corpus. It uses the C4 corpus as the reference for word frequencies, which indicates how commonly each word is used in everyday language.
Word-Frequency is available in two main variants:
With Stop-Words: Considers all words in the text, including common stop-words like “the,” “is,” and “and.”
Without Stop-Words: Excludes stop-words to focus on the more meaningful terms that are likely to indicate familiarity.
Example
# Import the text containers and the Word-Frequency evaluator.
from hinteval.cores import Question, Hint
from hinteval.evaluation.familiarity import WordFrequency
# Use the variant that keeps stop-words.
word_frequency = WordFrequency(method='include_stop_words')
sentence_1 = Question('What is the capital of Austria?')
sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
sentences = [sentence_1, sentence_2]
# evaluate returns one familiarity score per input object.
results = word_frequency.evaluate(sentences)
print(results)
# [1.0, 1.0]
# Each score is also stored on its object under the metrics attribute.
metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences
           for metric_key, metric_value in sent.metrics.items()]
print(metrics)
# ['familiarity-freq-include_stop_words-sm: 1.0', 'familiarity-freq-include_stop_words-sm: 1.0']
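To make the difference between the two variants concrete, the following is a minimal sketch of a word-frequency score with and without stop-words. It is not HintEval's implementation: the frequency table, the stop-word set, the floor value for unknown words, and the function names tokenize and mean_log_frequency are all hypothetical, and the real method draws its frequencies from the C4 corpus.

```python
import math
import re

# Hypothetical per-million word frequencies (illustrative values only;
# a real table would be derived from a large corpus such as C4).
TOY_FREQ_PER_MILLION = {
    "the": 50000.0, "is": 10000.0, "of": 30000.0, "what": 2000.0,
    "capital": 120.0, "city": 300.0, "austria": 15.0,
}

# Hypothetical stop-word list for this sketch.
STOP_WORDS = {"the", "is", "of", "what", "and"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mean_log_frequency(text, include_stop_words=True):
    """Average log10 frequency of the words in text (higher = more common)."""
    words = tokenize(text)
    if not include_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    if not words:
        return 0.0
    # Unknown words get a small floor frequency so that log10 is defined.
    logs = [math.log10(TOY_FREQ_PER_MILLION.get(w, 0.01)) for w in words]
    return sum(logs) / len(logs)


question = "What is the capital of Austria?"
with_stops = mean_log_frequency(question, include_stop_words=True)
without_stops = mean_log_frequency(question, include_stop_words=False)
print(with_stops, without_stops)
```

In this toy setting the score with stop-words is higher, because very frequent function words pull the average up. Excluding them leaves only the content words, which is why that variant focuses on the terms that carry the meaning.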
Wikipedia
The Wikipedia method evaluates familiarity by analyzing the popularity of the entities mentioned in the text. It does this by looking up the corresponding Wikipedia pages for each entity and using the number of views of the page as a measure of familiarity. This method helps determine how well-known the people, places, or concepts in the text are to the general public. For more information, refer to the 📝original paper.
Example
# Import the text containers and the Wikipedia evaluator.
from hinteval.cores import Question, Hint
from hinteval.evaluation.familiarity import Wikipedia
# en_core_web_trf is spaCy's transformer-based English pipeline.
wikipedia = Wikipedia(spacy_pipeline='en_core_web_trf')
sentence_1 = Question('What is the capital of Austria?')
sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
sentences = [sentence_1, sentence_2]
# evaluate returns one familiarity score per input object.
results = wikipedia.evaluate(sentences)
print(results)
# [1.0, 1.0]
# Each score is also stored on its object under the metrics attribute.
metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences
           for metric_key, metric_value in sent.metrics.items()]
print(metrics)
# ['familiarity-wikipedia-trf: 1.0', 'familiarity-wikipedia-trf: 1.0']
# Each detected entity carries its monthly Wikipedia page views in its metadata.
entities = [f'{entity.entity}: {entity.metadata["wiki_views_per_month"]}' for sent in sentences
            for entity in sent.entities]
print(entities)
# ['austria: 248144', 'mozart: 233219', 'beethoven: 224128', 'austria: 248144']
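The per-entity view counts printed above can be combined into a single popularity figure per sentence. The sketch below averages them, which is only one plausible aggregation and an assumption on our part. How HintEval turns page views into the final score is not described here, and the helper name mean_monthly_views is hypothetical.

```python
from statistics import mean

# Monthly page views copied from the example output above.
VIEWS_PER_MONTH = {"austria": 248144, "mozart": 233219, "beethoven": 224128}


def mean_monthly_views(entities, views=VIEWS_PER_MONTH):
    """Average monthly views over the entities that have a known count."""
    known = [views[e] for e in entities if e in views]
    # Sentences without any known entity get 0.0 in this sketch (an assumption).
    return mean(known) if known else 0.0


question_entities = ["austria"]
hint_entities = ["mozart", "beethoven", "austria"]

print(mean_monthly_views(question_entities))
print(mean_monthly_views(hint_entities))
```

Averaging keeps sentences with many entities comparable to sentences with few. Other choices, such as taking the minimum to capture the most obscure entity, would be equally easy to substitute.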
Comparison
The table below compares the two methods. Each column is explained after the table.

| Method | Preferred Device | Cost-Effectiveness | Accuracy | Execution Speed |
|---|---|---|---|---|
| Word-Frequency | CPU | Very High | Low | Very Fast |
| Wikipedia | CPU | High | High | Slow |

Preferred Device: Indicates whether the method works best on CPU or GPU.
Cost-Effectiveness: Indicates how economical the method is in terms of the computational resources it needs; a higher rating means it is cheaper to run.
Accuracy: Reflects how accurate the method is in assessing familiarity.
Execution Speed: How quickly the method executes (e.g., Fast, Moderate, Slow).