Evaluates the familiarity of the given Question, Hint, or Answer using word frequency analysis on Common Crawl.
Parameters:
sentences (List[Union[Question, Hint, Answer]]) – List of sentences to evaluate.
**kwargs – Additional keyword arguments.
Returns:
List of familiarity scores for each sentence.
Return type:
List[float]
Notes
This function stores the scores as Metric objects within the metrics attribute of the Question, Hint, or Answer, with names based on the method, such as “familiarity-freq-include_stop_words-sm”.
Examples
>>> from hinteval.cores import Question, Hint
>>> from hinteval.evaluation.familiarity import WordFrequency
>>>
>>> word_frequency = WordFrequency(method='include_stop_words')
>>> sentence_1 = Question('What is the capital of Austria?')
>>> sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
>>> sentences = [sentence_1, sentence_2]
>>> results = word_frequency.evaluate(sentences)
>>> print(results)
# [1.0, 1.0]
>>> metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences for metric_key, metric_value in
...            sent.metrics.items()]
>>> print(metrics)
# ['familiarity-freq-include_stop_words-sm: 1.0', 'familiarity-freq-include_stop_words-sm: 1.0']
Evaluates the familiarity of the given Question, Hint, or Answer using the number of views of corresponding Wikipedia pages [35].
Parameters:
sentences (List[Union[Question, Hint, Answer]]) – List of sentences to evaluate.
**kwargs – Additional keyword arguments.
Returns:
List of familiarity scores for each sentence.
Return type:
List[float]
Notes
This function stores the scores as Metric objects within the metrics attribute of the Question, Hint, or Answer, with names based on the method, such as “familiarity-wikipedia-sm”.
This function also stores the number of views for each entity as Entity objects within the entities attribute.
Examples
>>> from hinteval.cores import Question, Hint
>>> from hinteval.evaluation.familiarity import Wikipedia
>>>
>>> wikipedia = Wikipedia(spacy_pipeline='en_core_web_trf')
>>> sentence_1 = Question('What is the capital of Austria?')
>>> sentence_2 = Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')
>>> sentences = [sentence_1, sentence_2]
>>> results = wikipedia.evaluate(sentences)
>>> print(results)
# [1.0, 1.0]
>>> metrics = [f'{metric_key}: {metric_value.value}' for sent in sentences for metric_key, metric_value in
...            sent.metrics.items()]
>>> print(metrics)
# ['familiarity-wikipedia-trf: 1.0', 'familiarity-wikipedia-trf: 1.0']
>>> entities = [f'{entity.entity}: {entity.metadata["wiki_views_per_month"]}' for sent in sentences for entity in
...             sent.entities]
>>> print(entities)
# ['austria: 248144', 'mozart: 233219', 'beethoven: 224128', 'austria: 248144']
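Because both evaluators store their scores in the metrics attribute under names beginning with "familiarity-", the scores can be collected by filtering on that prefix. The following is a minimal sketch and does not use hinteval's own classes: Metric and Sentence are hypothetical stand-ins that mirror only the metrics mapping and the value attribute shown in the examples above, and familiarity_scores is an illustrative helper that is not part of the library.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Metric:
    """Hypothetical stand-in: holds a metric name and its value."""
    name: str
    value: float


@dataclass
class Sentence:
    """Hypothetical stand-in for a Question, Hint, or Answer."""
    text: str
    metrics: Dict[str, Metric] = field(default_factory=dict)


def familiarity_scores(sentence: Sentence, prefix: str = "familiarity-") -> Dict[str, float]:
    """Return {metric name: value} for metrics whose name starts with prefix."""
    return {
        key: metric.value
        for key, metric in sentence.metrics.items()
        if key.startswith(prefix)
    }


# Populate the stand-in by hand, using the metric names from the examples above.
hint = Sentence("This city, once home to Mozart and Beethoven, is the capital of Austria.")
for name, value in [
    ("familiarity-freq-include_stop_words-sm", 1.0),
    ("familiarity-wikipedia-trf", 1.0),
    ("some-other-metric", 0.5),  # assumed unrelated metric, filtered out below
]:
    hint.metrics[name] = Metric(name, value)

print(familiarity_scores(hint))
```

With real hinteval objects, the same dictionary comprehension should work on sent.metrics, assuming it behaves like the mapping iterated over in the examples.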