‘Achievement’ in Historic Textbooks – DiaCollo for GEI Digital

A GEI seed funds project exploring experimental corpus creation and analysis of historic collections.

This project, titled ‘The concept of “achievement” in historic German-language textbooks – a quantitative and qualitative analysis using digital tools’, is supported by the seed funding programme ‘GEI innovation 2020’. The concept of ‘achievement’ has acquired a key role in modern ideas of society and the individual, as well as within education. The project examines methods for creating and analysing digital corpora, and uses methods from the digital humanities to explore how concepts of ‘achievement’ formed and spread in German-language textbooks published before 1920.

The project will investigate the corpus management environment dstar/D* and the tools it provides. These tools, including DiaCollo, enable simple and complex searches, frequency analysis and the extraction of typical collocations for a selected search term (such as ‘achievement’). The results can be selected and visualised interactively within a freely definable time period and can therefore point to changes in language use and semantics over time.
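
The idea behind diachronic collocation analysis can be illustrated with a minimal sketch. It is not DiaCollo's implementation: the function name, the whitespace tokenisation, the window size and the toy sentences are all assumptions made for illustration, and it reports raw co-occurrence counts rather than the association scores a real collocation tool would compute.

```python
from collections import Counter, defaultdict


def diachronic_collocates(docs, target, window=2, slice_years=10):
    """Count words that co-occur with `target`, grouped by time slice.

    docs: iterable of (year, text) pairs. Tokenisation is a plain
    lowercase whitespace split (an assumption, not DiaCollo's pipeline).
    """
    slices = defaultdict(Counter)
    for year, text in docs:
        tokens = text.lower().split()
        start = year - year % slice_years  # first year of the slice, e.g. 1873 -> 1870
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            # Up to `window` tokens on each side of the target, excluding the target itself.
            context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            slices[start].update(context)
    return dict(slices)


# Hypothetical toy data, not taken from the GEI Digital corpus.
toy_docs = [
    (1812, "die leistung der maschine ist gross"),
    (1815, "die leistung der arbeit"),
    (1901, "die leistung des schülers wird benotet"),
]
result = diachronic_collocates(toy_docs, "leistung")
```

Comparing the counters of different slices (here the 1810s and the 1900s) shows which neighbouring words appear or disappear over time, which is the kind of shift such an analysis is meant to surface.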

To take full advantage of these analytical tools, the texts to be analysed should be of high quality and extensively pre-processed. The digital full texts available as files in GEI Digital were, however, captured automatically and are not error-free, owing to historical fonts, foreign words, tables and similar features. The project team will therefore examine and document the ‘effectiveness’ of diachronic collocation analysis with tools such as DiaCollo when applied to data from the digital textbook library. The working hypothesis is that a basic and purely mechanical pre-processing method would be sufficient to perform analyses using DiaCollo, which would at least support the generation of research questions in the field of historical research. Such a minimal solution, even with a high error rate, would be preferable to dispensing with the analysis technique altogether.
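
What a ‘purely mechanical’ pre-processing step might look like can be sketched as follows. This is an illustration only, not the pipeline used in the project: the function name and the three rules (rejoining words split at line breaks, mapping the long s to a round s, collapsing whitespace) are assumptions.

```python
import re


def basic_cleanup(ocr_text):
    """Apply a few rule-based repairs to automatically captured text."""
    # Rejoin words hyphenated across a line break: "Lei-\nstung" -> "Leistung".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", ocr_text)
    # Map the long s found in historical fonts to a round s.
    text = text.replace("ſ", "s")
    # Collapse runs of whitespace, including remaining line breaks, to one space.
    return re.sub(r"\s+", " ", text).strip()


cleaned = basic_cleanup("Die Lei-\nstung  der\nKlaſſe")
```

Rules of this kind are cheap to apply but blunt. The de-hyphenation rule, for instance, would also merge a genuine hyphenated compound that happens to break at a line end, which is the sort of error rate the working hypothesis accepts.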

  • Aims

    The project investigates whether digital analysis tools offer functions that are suitable for supporting (historic) educational media research and considers whether they could help to expand the research questions asked. It has three goals:

    1. Corpus creation: Data from the GEI is compiled using tools from the Research Centre Language at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW). This enables computer-aided analysis and visualisation of the digitised full texts, so far as the data quality allows.
    2. Developing a research question in the field of cultural studies and the history of concepts: The project team will examine the frequency, distribution and context of the term ‘achievement’, as well as other relevant linguistic phenomena, within the digital textbook corpus and reference corpora. When did the term ‘achievement’ first appear in historical textbooks, and in the context of which discourses was it used?
    3. Usability: Teaching materials will be created so that potential new users without expertise in computational linguistics can subsequently use the corpus and the analytical tools. To provide a brief overview of the structure and features of the data collection, the project will reuse visualisations of the metadata and filter functions that were first developed in 2017 by the GEI and the Urban Complexity Lab at the Fachhochschule Potsdam.

  • Methodology

    Towards the end of 2020 the metadata and automatically generated full texts available at that time were converted to TEI and then, in cooperation with the Research Centre Language at the BBAW, processed, annotated and indexed using the NLP tools developed for historic texts. This project corpus for GEI Digital 2020 was then made available for analysis using a dstar corpus management environment set up for this purpose and the DiaCollo tool.

    The data was also made available through the Research Centre Language at the BBAW for use by the linguistics and German studies communities. It forms part of the historic corpora of the DWDS and can be used comparatively or together with other collections of historic source materials covering the years 1465–1969, either through the DWDS web interface or through a dstar instance.


  • Results

    The tools for analysing the ‘GEI Digital 2020’ corpus are available for use at diacollo.gei.de, as are the visualisations of the metadata, a tutorial, related publications and advice on further resources.


  • Publications (selection)
    • Nieländer, Maret (2022): „DiaCollo für GEI-Digital. Computerlinguistische Werkzeuge für die Analyse von mehr als 5000 historischen deutschsprachigen Schulbüchern“, DOI: 10.35468/5952-03, in: Oberdorf, Andreas (ed.): Digital Turn und Historische Bildungsforschung. Bestandsaufnahme – Forschungsperspektiven, Bad Heilbrunn: Julius Klinkhardt 2022, pp. 33–48. DOI: 10.35468/5952
    • Nieländer, Maret; Scheel, Christian; Jurish, Bryan (2022): „DiaCollo für GEI-Digital – Ein experimentelles Projekt zur weiteren Erschließung digitalisierter historischer Schulbuchbestände“. Poster presentation at the 8th conference of the association „Digital Humanities im deutschsprachigen Raum“ – DHd 2022 Kulturen des digitalen Gedächtnisses, 07.–11.02.2022. Poster: DOI 10.5281/zenodo.6322544, Abstract: DOI 10.5281/zenodo.6328118
    • Nieländer, Maret; Jurish, Bryan (2021): „D* für Anfänger:innen: Ein Tutorial. Einfache und komplexe Suchanfragen, Frequenzanalysen und diachrone Kollokationsanalysen in der D*-Korpusmanagement-Umgebung“. urn:nbn:de:0220-2021-0088.
    • Nieländer, Maret (2021): „Die Vermessung des Schulbuchs. Computerlinguistische Zugänge zum Begriff der ‚Leistung‘ in der historischen Schulbuchsammlung GEI Digital“. Presentation at the Georg Eckert Institute’s annual conference ‘What are we doing in schools? Achievement and performance in education and educational media’ (#Leistung), 02.–03.09.2021, Braunschweig. PDF, Video (YouTube)
