Here you will find links to datasets from various LLiS projects.



Dataset contents: Human gaze data recorded during interrupted, self-paced reading of real-world English text (5247 tokens), plus pre- and post-test scores

Number of participants: 50

Contact: Francesca Zermiani

The data is only to be used for non-commercial scientific purposes. If you use this dataset in a scientific publication, please cite the following paper:

Francesca Zermiani, Prajit Dhar, Ekta Sood, Fabian Koegel, Andreas Bulling, and Maria Wirzberger. 2024. InteRead: An Eye Tracking Dataset of Interrupted Reading. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 9154–9169, Torino, Italy. ELRA and ICCL.

Dataset license agreement

This dataset, including all files it contains, is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0). By using this dataset, you agree to the license terms. The main license terms are:
  • Attribution: You must give appropriate credit to the original creators of the dataset.
  • Non-Commercial: You may not use the dataset for commercial purposes.
  • Share Alike: If you remix, transform, or build upon the dataset, you must distribute your contributions under the same license as the original.

The full dataset can be downloaded here


Maria Wirzberger

Jun.-Prof. Dr. rer. nat.

Professor for Teaching and Learning with Intelligent Systems | Spokesperson of the Stuttgart Research Focus IRIS | Co-Director of the AI Software Academy
