Text Technologies for Historical Disciplines, Scholarly Editions (TEI-XML)

Basics

Conférence universitaire de Suisse occidentale (CUSO)

05.05.2023

UniL, salle 315.1 (Amphipôle Lausanne)

Repository: https://github.com/DominicWeber/CUSO_UNIL

Creative Commons License



Peter Dängeli, University of Bern (DH, Data Science Lab)

peter.daengeli@unibe.ch

Schedule


Digital Scholarly Editions (DSE)


Editing sources

        +---------------+ +---------------+ +---------------+ +---------------+
         \               \ \               \ \               \ \               \
          \               \ \               \ \   Editorial   \ \               \
           \  Selection    \ \  Preparation  \ \    core       \ \  Publication  \
           /               / /               / /  business     / /               /
          /               / /               / /               / /               /
         /               / /               / /               / /               /
        +---------------+ +---------------+ +---------------+ +---------------+

Any of these steps may be influenced by the use of digital approaches.

What about the process as a whole? Does it become more cyclical or pluggable?


How digital do we want it?

To what extent are digital editions re-mediated print editions?

            the print mind  <---------------------------->  the digital mind

How close do we want to stay to the typographical setting and the reading conventions of printed editions?

Do we create editions only for human eyes?

Does a digital edition even need a (graphical) interface?

“The print mind is (unsurprisingly) tenacious.” Toma Tasovac, recent talk at the Austrian Centre for DH, https://youtu.be/2tpFr8VL6PM (04.05.2023)


The Text Encoding Initiative (TEI)

History of the TEI

1987 Start of a standardisation effort (by ACH / ACL / ALLC)
│  13.11.1987: TEI PCP1 "CLOSING STATEMENT OF THE VASSAR PLANNING CONFERENCE"
│
└── 1990 ❰ P1 ❱  
  │   07.1990: first public proposal "Guidelines for the Encoding and Interchange 
  │            of Machine-Readable Texts" (Eds. C. M. Sperberg-McQueen, L. Burnard)
  │
  └── 1992 ❰ P2 ❱  Development (four Working Committees and various WGs)       
    │
    └── 1994 and 1999 ❰ P3 ❱ "Guidelines for Electronic Text Encoding and Interchange"
      │                      439 elements, 1292 pages; considered achievement of the goals of 1987
      │
      └── 1999 / 2000 TEI Consortium (TEIC)
        │
        └── 2002 ❰ P4 ❱ 441 elements; SGML, XML (new); retaining backwards compatibility
          │ initial goal: to rectify mistakes; outcome: identification of new potentials
          │
          └── 2007 ❰ P5 ❱ reorganisation and revision; no (full) compatibility with P4
                          ongoing development, most recently 04.04.2023: TEI P5 4.6.0
  TEI proposals
    └── 1990 ❰ P1 ❱           DOI: 10.5281/zenodo.3459203
      └── 1992 ❰ P2 ❱         DOI: 10.5281/zenodo.3459221     
        └── 1994 ❰ P3 ❱       DOI: 10.5281/zenodo.3549598
          └── 2002 ❰ P4 ❱     DOI: 10.5281/zenodo.3549616
            └── 2007 ❰ P5 ❱   DOI: 10.5281/zenodo.5347789

Editing TEI files


Presentation, some approaches

Approaches: (La)TeX

LaTeX


Approaches: In-browser transformation


Approaches: Pipeline-based


Approaches: ODD-based

The model underlying the TEI Guidelines is itself specified using mechanisms that the Guidelines define. Specifically, it is based on the “One Document Does it all” (ODD) approach devised by the TEI core developers.

At the core of ODD lies the TEI Processing Model, a declarative way to formalise input-output relations. For instance, the TEI Guidelines are generated using this mechanism (as PDF files, HTML pages and schema files for validation purposes).

In recent years, the ODD/Processing Model approach was enhanced to facilitate the transformation of TEI XML documents to HTML renderings.
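
To make the idea of a declarative input-output mapping more tangible, here is a minimal sketch in Python. It is not the ODD syntax and not how TEI Publisher is implemented; the rule table `RULES`, the function `render` and the sample fragment are all assumptions made up for illustration. The point is only that the rules are data, while the transformation code stays generic.

```python
import xml.etree.ElementTree as ET
from html import escape

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical rule table: TEI element name -> (HTML tag, CSS class).
# It loosely mirrors the idea of assigning an output behaviour to each
# element; it is NOT the actual ODD / Processing Model notation.
RULES = {
    "head": ("h2", "tei-head"),
    "p": ("p", "tei-p"),
    "hi": ("em", "tei-hi"),
    "persName": ("span", "tei-persName"),
}


def render(node):
    """Recursively turn a TEI element into an HTML string using RULES."""
    local_name = node.tag.split("}")[-1]  # strip the "{namespace}" prefix
    inner = escape(node.text or "")
    for child in node:
        inner += render(child)
        inner += escape(child.tail or "")
    if local_name not in RULES:
        return inner  # no rule: emit the content without a wrapper
    tag, css_class = RULES[local_name]
    return f'<{tag} class="{css_class}">{inner}</{tag}>'


# Made-up TEI fragment, used only to exercise the rules above.
SAMPLE = (
    f'<div xmlns="{TEI_NS}"><head>Letter</head>'
    "<p>Dear <persName>William</persName>, <hi>thanks</hi>.</p></div>"
)

html_output = render(ET.fromstring(SAMPLE))
print(html_output)
```

Changing the presentation then means editing `RULES`, not `render`. This is roughly the separation that a declarative approach aims for.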

Among the driving forces behind this development are the developers of the TEI Publisher, the publishing tool with the tagline “The Instant Publishing Toolbox”.

Background information:


TEI Publisher


TEI Publisher, Demonstration

TEI Publisher samples

Currently, 21 samples are available at https://teipublisher.com/exist/apps/tei-publisher/index.html?query=&collection=test, at different levels of sophistication.

This database-like view also illustrates a central feature of the tool, specifically the faceted search on the left. Many editions use a view like this to access edited documents.
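
As a rough illustration of what a faceted search does, the following Python sketch counts facet values over a small set of records and narrows the set by a selected facet. The records, the field names (`language`, `genre`) and the helper functions are hypothetical. TEI Publisher's own search is implemented differently and is not shown here.

```python
from collections import Counter

# Hypothetical metadata records (assumed for illustration only).
DOCS = [
    {"id": "doc-1", "language": "es", "genre": "letter"},
    {"id": "doc-2", "language": "en", "genre": "letter"},
    {"id": "doc-3", "language": "de", "genre": "treatise"},
    {"id": "doc-4", "language": "uk", "genre": "prose"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs)


def apply_facets(docs, **selected):
    """Keep only the documents that match every selected facet value."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in selected.items())
    ]


# Selecting the "letter" facet narrows the list; the remaining facets
# are then recounted over the narrowed list.
letters = apply_facets(DOCS, genre="letter")
print(facet_counts(DOCS, "genre"))
print(facet_counts(letters, "language"))
```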

Hernán CORTÉS to Ioannes DANTISCUS, Madrid [1529]-09-11


Synoptic view of a letter in Spanish and Polish with annotated entities. Interactive highlighting of corresponding sentences.

Two modes are available (through a checkbox in the navigation bar): a normalised view and a plain reading view.

https://teipublisher.com/exist/apps/tei-publisher/test/cortes_to_dantiscus.xml?view=page&p_norm=on&p_highlight=off&odd=dantiscus&root=2.5.2.2.6.3.7.3.5&selectors=%5Bobject+Object%5D%2C%5Bobject+Object%5D
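
The view state is carried in the query string of the link above. The following sketch takes that URL apart with Python's standard library. Reading `p_norm` as the switch behind the normalised view and `odd` as the name of the ODD in use is an assumption based on the parameter names, not on the documentation.

```python
from urllib.parse import urlsplit, parse_qs

# The sample URL from above, split across lines for readability.
URL = (
    "https://teipublisher.com/exist/apps/tei-publisher/test/cortes_to_dantiscus.xml"
    "?view=page&p_norm=on&p_highlight=off&odd=dantiscus"
    "&root=2.5.2.2.6.3.7.3.5"
    "&selectors=%5Bobject+Object%5D%2C%5Bobject+Object%5D"
)

parts = urlsplit(URL)

# parse_qs returns a list per key; each key occurs once here,
# so the first value is enough.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

for key, value in params.items():
    print(f"{key:12} {value}")
```

Once decoded, the `selectors` value reads `[object Object],[object Object]`. This looks like a JavaScript object that was serialised by accident rather than a meaningful setting.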


Letter #6 from Robert Graves to William Graves (at Oundle School) November 15, 1957


Transcription of a letter (including two postscripts) with interactive map view as well as content information on referenced persons, locations, etc.

https://teipublisher.com/exist/apps/tei-publisher/test/graves6.xml?view=div&odd=graves


Das italienische Madrigal


Short snippet showcasing the inclusion of a MEI encoding (Music Encoding Initiative) in a TEI encoding.

The excerpted sheet music is available in audio form.

https://teipublisher.com/exist/apps/tei-publisher/test/pb-mei-app.xml
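
Mixing two vocabularies in one file relies on XML namespaces. The sketch below shows how a processor can tell TEI elements from MEI elements. The fragment is made up and not copied from the sample; in particular, embedding MEI inside `notatedMusic` is an assumption about one possible encoding pattern, not a description of how the sample is encoded.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"
MEI_NS = "http://www.music-encoding.org/ns/mei"

# Hypothetical fragment: a TEI paragraph containing embedded MEI markup.
SAMPLE = f"""
<p xmlns="{TEI_NS}" xmlns:mei="{MEI_NS}">
  An example follows:
  <notatedMusic>
    <mei:mei>
      <mei:music><mei:body><mei:mdiv/></mei:body></mei:music>
    </mei:mei>
  </notatedMusic>
</p>
""".strip()


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}")[0] if tag.startswith("{") else ""


root = ET.fromstring(SAMPLE)

# Count elements per namespace and collect the embedded MEI root elements.
counts = Counter(namespace_of(el) for el in root.iter())
mei_roots = [el for el in root.iter() if el.tag == f"{{{MEI_NS}}}mei"]

print(counts[TEI_NS], "TEI elements,", counts[MEI_NS], "MEI elements")
```

A renderer could hand each element in `mei_roots` to a music engraving component and process everything else as ordinary TEI.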


Cossacks by Marko Vovchok / Maria Vilinska


Bare text presentation in the Ukrainian variant of the Cyrillic script: https://teipublisher.com/exist/apps/tei-publisher/test/UKR18583_VovchokKozachka.xml


teipublisher.info samples

The following examples were chosen to demonstrate non-Western scripts and right-to-left (RTL) languages. Thanks go to their lead developer Gil Shalit, who has a lot of experience with TEI Publisher and is happy to answer questions or offer consulting (via https://www.dh-dev.com/#Contact, https://twitter.com/DH_Development, or https://twitter.com/GilShalit).

Dybbuk


Der-Dybbuk starts from three versions of a Yiddish play from the beginning of the 20th century and will assist in the synthesis of a new version which will be performed on stage.

Workflow:

Examples:

TraveLab


Trilingual edition of the Journey of Benjamin of Tudela, a 12th-century Jewish traveller.

Workflow:

Examples:


TEI Publisher, Demonstration: How does it work?

Three elements are at play:

From these the TEI Publisher generates the various views including functionalities such as linking, tooltips, highlighting, etc.


Source: https://github.com/eeditiones/workshop/blob/master/e-editiones-workshop-20200608.pdf


TEI Publisher, Demonstration: Let’s try

Relevant chapter in the TEI Publisher Documentation: https://teipublisher.com/exist/apps/tei-publisher/doc/documentation.xml?odd=docbook&view=div#odd-customization


TEI Publisher, Pros and Cons

1 Generally, at least two independent and interoperable implementations are desirable for sound specification work. To quote a prominent example: “In general, an Internet Standard is a specification that […] has multiple, independent, and interoperable implementations with substantial operational experience” (The Internet Standards Process – Revision 3, https://datatracker.ietf.org/doc/html/rfc2026).


Part II

Hands on: TEI Publisher


Part III 

Synthesis, discussion