History – TEI: Text Encoding Initiative

The TEI was established in 1987 to develop, maintain, and promulgate hardware- and software-independent methods for encoding humanities data in electronic form. For over three decades the TEI has been extraordinarily successful at achieving its objective and it is now widely used by scholarly projects and libraries around the world.

Although a comprehensive history of the TEI has not yet been written, all known documentary resources about the TEI are stored in the Archive. If you (or others you know) have electronic copies of any original TEI documents not available here, please get in touch.

The archive of the TEI-L discussion list is a rich resource for historical information, as is the archive of the now-defunct TEI-TECH mailing list, which can be downloaded in its entirety.

Origins of the TEI

When the Text Encoding Initiative (TEI) was originally established, scholarly projects and libraries attempting to take advantage of digital technology seemed to be faced with an overwhelming obstacle to creating sustainable and shareable archives and tools: the proliferating systems for representing textual material. These systems seemed almost always to be incompatible, often poorly designed, and multiplying at nearly the same rapid rate as the electronic text projects themselves. This situation was inhibiting the development of the full potential of computers to support humanistic inquiry by erecting barriers to access, creating new problems for preservation, making the sharing of data (and theories) difficult, and making the development of common tools impractical.

Part of the problem was simply a lack of opportunity for sustained communication and coordination, but there were more systemic forces at work as well. Longevity and re-usability were clearly not high on the priority lists of software vendors and electronic publishers, and proprietary formats were often part of a business strategy that might benefit a particular company, but at the expense of the broader scholarly and cultural community. At the end of the eighties there was a real concern that the entrepreneurial forces which (then as now) drive information technology forward would impede such integration by the proliferation of mutually incompatible technical standards.

In November 1987 a meeting at Vassar College was convened to address these problems. Sponsored by the Association for Computers and the Humanities and funded by the National Endowment for the Humanities, it brought together a diverse group of scholars from many different disciplines, representing leading professional societies, libraries, archives, and projects in a number of countries in Europe, North America, and Asia. At this meeting the intellectual foundation for the Text Encoding Initiative was articulated. The organization of the actual work of developing the TEI Guidelines was then undertaken by the three TEI sponsoring organizations: the Association for Computers and the Humanities, the Association for Literary and Linguistic Computing, and the Association for Computational Linguistics. A Steering Committee was organized from representatives of the sponsoring organizations, and an Advisory Board of delegates from various professional societies was formed. To lead the actual work two editors were chosen and four working committees appointed. By the end of 1989 well over 50 scholars were already directly involved and the size of the effort was growing rapidly.

The initial phase resulted in the release of the first draft (known as “P1”) of the Guidelines in June 1990. A second phase, involving an additional 15 working groups making revisions and extensions, immediately began and released its results throughout 1990–1993. Then, after another round of revisions, extensions, and supplements, the first official version of the Guidelines (P3) was released in May 1994. Early on in this process a number of leading humanities textbase projects adopted the Guidelines as their encoding scheme while the drafts were still a rapidly changing moving target, identifying problems and needs and contributing proposed solutions. In addition, workshops and seminars were conducted to introduce the wider community to the Guidelines and ensure a steady source of experience to support continuing development. As more scholars became acquainted with the Guidelines, comments, corrections, and requests for extensions arrived from around the world. In the end there were nearly 200 scholars from many disciplines, professions, and countries in the core group that was developing the TEI Guidelines.

The TEI Consortium

In January of 1999, the University of Virginia and the University of Bergen (Norway) presented a proposal to the TEI Executive Committee for the creation of an international membership organization, to be known as the TEI Consortium, which would maintain, continue developing, and promote the TEI. This proposal was accepted by the TEI Executive Committee, and shortly thereafter, Virginia and Bergen added two other host institutions with longstanding ties to the TEI: Brown University and Oxford University.

This group then formulated an Agreement to Establish a Consortium for the Maintenance of the Text Encoding Initiative. On the basis of this agreement, a transition group, comprising representatives from the three original sponsoring organizations of the TEI (as custodians of rights in the TEI) and from the incoming Host Organizations, set about the job of drafting and incorporating the TEI Consortium during 2000.

Incorporation was completed during December of 2000, and the first Board members took office during January of 2001.

The goal of establishing the TEI Consortium was to maintain a permanent home for the TEI as a democratically constituted, academically and economically independent, self-sustaining, non-profit organization. In addition, the TEI Consortium was intended to foster a broad-based user community with sustained involvement in the future development and widespread use of the TEI Guidelines. In both of these goals the creation of the Consortium has proven a positive step. Inasmuch as the original goal of the TEI was to promote collaborative research on electronic texts, by making the encoding system no longer an obstacle to such work, the Consortium’s efforts are similarly directed towards making the TEI encoding system as effective a tool for creating, archiving, and sharing textual data as possible. For its members, the TEI Consortium provides valuable services to assist them in the creation and use of digital resources, and to help them stay abreast of rapidly changing technologies and practices.

Following the establishment of the TEI Consortium, a critical priority was the release of an XML version of the TEI Guidelines, updating P3 to enable users to work with the emerging XML toolset. The P4 version of the Guidelines was published in June 2002. It was essentially an XML version of P3, making no substantive changes to the constraints expressed in the schemas apart from those necessitated by the shift to XML, and changing only corrigible errors identified in the prose of the P3 Guidelines. However, given that P3 had by this time been in steady use since 1994, it was clear that a substantial revision of its content was necessary, and work began immediately on the P5 version of the Guidelines. This was planned as a thorough overhaul, involving a public call for features and new development in a set of crucial areas including character encoding, graphics, manuscript description, standoff markup, and the language in which the TEI Guidelines themselves are written. The P5 version of the Guidelines was released in November 2007.

Impact of the TEI

The impact of the TEI on digital scholarship has been enormous. Today, the TEI is internationally recognized as a critically important tool, both for the long-term preservation of electronic data, and as a means of supporting effective usage of such data in many subject areas. It is the encoding scheme of choice for the production of critical and scholarly editions of literary texts, for scholarly reference works and large linguistic corpora, and for the management and production of detailed metadata associated with electronic text and cultural heritage collections of many types. See the projects page for a list of currently active TEI projects, ranging from small research applications to major encoding ventures.

The TEI’s recommendations have been endorsed by many organizations, including the US National Endowment for the Humanities, the UK’s Arts and Humanities Research Board, the Modern Language Association, the European Union’s Expert Advisory Group for Language Engineering Standards, and many other agencies around the world that fund or promote digital library and electronic text projects. Recognizing its importance in the emerging digital library community, the Digital Library Federation produced guidelines for best practice in applying the TEI metadata recommendations for interoperability with other standards.

The success of the TEI has also gone a long way toward ensuring that our cultural heritage will be brought forward into the emerging networked world and made broadly available to students, scholars, and the wider public.