This repository was archived by the owner on Oct 4, 2022. It is now read-only.

Refactor TopicDistribution assessment to include morphology #1755

@nataliashitova

Description

We need to refactor the topicDistribution assessment to the following schema:

For every sentence

  • if keywordLength < 4,
    • GOOD if all content words from the keyphrase are in the sentence,
    • OK if some but not all are,
    • BAD if none;
  • if keywordLength >= 4,
    • GOOD if all content words from the keyphrase are in the sentence, or at least 3 content words from the keyphrase are in the sentence and the rest are in the neighbour sentences,
    • OK if only some content words from the keyphrase are found in the sentence but not all,
    • BAD if none.
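
The per-sentence rules above could be sketched roughly as follows. This is only an illustration, not the intended implementation: `score_sentence` is a hypothetical name, sentences are assumed to be pre-tokenized lists of lowercase words, matching is by exact word (no morphology yet), keywordLength is assumed to be the number of distinct content words, and "neighbour sentences" is assumed to mean the immediately preceding and following sentence.

```python
def score_sentence(keyphrase_words, sentences, index):
    """Rate one sentence as "GOOD", "OK" or "BAD" against the keyphrase.

    Assumptions (not stated in the issue): exact word matching, distinct
    content words only, neighbours = sentences at index - 1 and index + 1.
    """
    keyphrase = set(keyphrase_words)
    found = keyphrase & set(sentences[index])

    if not found:
        return "BAD"
    if found == keyphrase:
        return "GOOD"

    # Long keyphrases: 3+ words in the sentence, the rest in the neighbours.
    if len(keyphrase) >= 4 and len(found) >= 3:
        neighbours = set()
        if index > 0:
            neighbours |= set(sentences[index - 1])
        if index + 1 < len(sentences):
            neighbours |= set(sentences[index + 1])
        if keyphrase - found <= neighbours:
            return "GOOD"

    # Some, but not all, content words were found.
    return "OK"
```

For a four-word keyphrase, a sentence holding three of the words is rated GOOD only when the fourth word appears in an adjacent sentence; otherwise it falls back to OK.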

Step function

Start with a window covering the first third of the text (based on the total number of sentences) and calculate the average score over all sentences in that window. Shift the window down by one sentence and calculate the average again. Continue until the window reaches the end of the text.
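
A minimal sketch of this step function, assuming the per-sentence ratings have already been converted to numbers (the numeric scale is not specified here) and assuming the window size is the sentence count divided by three, rounded down, with a minimum of one sentence. `window_averages` is a hypothetical name.

```python
def window_averages(scores):
    """Return the average score of each sliding window over the text.

    Assumption: window size = len(scores) // 3, but at least 1 sentence.
    """
    total = len(scores)
    if total == 0:
        return []

    size = max(1, total // 3)
    # One window per starting sentence, until the window hits the end.
    return [
        sum(scores[start:start + size]) / size
        for start in range(total - size + 1)
    ]
```

With six sentences the window spans two sentences, so the text yields five steps.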

Final outcome

Compute the average over all steps.

  • GOOD if >=6,
  • OK if between 3 and 6,
  • BAD otherwise.

Additionally provide markings for sets of sentences that have a low average.
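
The final outcome and the markings could look something like the sketch below. Several things here are assumptions rather than decisions: "between 3 and 6" is read as 3 <= average < 6, the threshold for a "low" window is left as a parameter because it is not defined above, and `final_rating` and `low_windows` are hypothetical names.

```python
def final_rating(step_averages):
    """Classify the text from the average over all steps.

    Assumption: "between 3 and 6" means 3 <= average < 6.
    """
    if not step_averages:
        raise ValueError("at least one step average is required")

    overall = sum(step_averages) / len(step_averages)
    if overall >= 6:
        return "GOOD"
    if overall >= 3:
        return "OK"
    return "BAD"


def low_windows(step_averages, size, threshold):
    """Return (first, last) sentence indices of windows averaging below threshold.

    Assumption: `threshold` is chosen by the caller; `size` is the window
    size used to produce `step_averages`.
    """
    return [
        (start, start + size - 1)
        for start, average in enumerate(step_averages)
        if average < threshold
    ]
```

The index pairs returned by `low_windows` are inclusive sentence positions, which a marker could then use to highlight the corresponding sentences.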

Metadata

Labels

morpho-syno: Issue that is related to providing morphological analysis for keywords and synonyms.
