Masterthesis (applied NLP): Automatic Detection of Intrinsic and Extrinsic Errors in One Sentence Summaries

Post by tsteuer »

Hi,

The Knowledge Media group at KOM is looking for a motivated master's student for the following master's thesis (see the PDF for contact details).

Motivation

One-sentence summaries condense a paragraph of text to its gist. Given a paragraph about Bohr’s atomic model, a one-sentence summary might be:

Bohr’s model describes the atom as a system consisting of a small, dense nucleus surrounded by orbiting electrons.

Research in psychology suggests that such summaries encourage readers to reflect on the text, increasing their comprehension. Yet generating such summaries automatically is hard and error-prone.

Hence, the advertised thesis aims to investigate the factual correctness and faithfulness of abstractive summaries. Given a generated one-sentence summary, your task is to predict the probability that the summary is correct, either given only the information in the text or given the text together with additional background knowledge.
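
To make the input and output of this task concrete, here is a minimal sketch of a naive lexical baseline. It is only an illustration, not the method the thesis is expected to develop: the function names (`tokenize`, `support_score`, `unsupported_tokens`) are hypothetical, and the score is a crude token-overlap ratio in [0, 1], not a calibrated probability. A real approach would more likely build on NLI or QA models.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unsupported_tokens(source, summary):
    """Return the summary tokens that never occur in the source text.

    Assumption: a token missing from the source is a candidate for
    content that is not supported by the text. This ignores paraphrases
    and word order, so it is a rough signal only.
    """
    source_vocab = set(tokenize(source))
    return [tok for tok in tokenize(summary) if tok not in source_vocab]


def support_score(source, summary):
    """Fraction of summary tokens that also occur in the source (0.0 to 1.0)."""
    summary_tokens = tokenize(summary)
    if not summary_tokens:
        return 0.0  # an empty summary has nothing that could be supported
    missing = unsupported_tokens(source, summary)
    return 1.0 - len(missing) / len(summary_tokens)


if __name__ == "__main__":
    source = (
        "Bohr's model describes the atom as a small, dense nucleus "
        "surrounded by orbiting electrons."
    )
    summary = "The atom has a nucleus discovered in 1913."
    print(support_score(source, summary))
    print(unsupported_tokens(source, summary))
```

A baseline like this cannot tell a faithful paraphrase from an error, nor can it catch a summary that reuses the source's words to say something false, which is why the thesis asks for dedicated detection models.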

Task
  • In-depth review of the related work on abstractive summarization errors and error detection
  • Design of different error detection algorithms / models for intrinsic and extrinsic errors
  • Implementation of the algorithms
  • Automatic evaluation of the algorithms (on the XSum Hallucination Annotations)
  • Written thesis

Requirements
  • Interest in educational technologies
  • Experience with statistical NLP models (e.g., NLI or QA models)
  • Basic Linux experience

Initial Literature

Maynez, J., Narayan, S., Bohnet, B., & McDonald, R. (2020). On Faithfulness and Factuality in Abstractive Summarization. arXiv preprint arXiv:2005.00661.
