Search the Guidelines

This page provides the entirety of the guidelines, allowing you to perform simple string searches. Use your browser's "Find (on this page)" function (Ctrl + F) to locate the term or string of characters you're looking for.
 
 
 

1. Introduction

1.1 Using The Walt Whitman Archive Encoding Guidelines
1.2 Some Basic Vocabulary
1.3 The Purpose of Encoding
1.4 Getting Started

1.1 Using The Walt Whitman Archive Encoding Guidelines

These guidelines define current transcription and encoding practices of the Whitman editors and staff as we make Whitman's writings available on our website, The Walt Whitman Archive. While for the most part our practices have stabilized, discussion is ongoing, and our practices continue to evolve. The guidelines were last updated October 2005.

Two major sections make up the core of the guidelines: "Global" and "Local." Global describes those aspects of encoding that are consistent from document to document. Local addresses aspects that vary from document to document. We have designed the guidelines to be read in order, so we recommend that you first read the section on the global encoding before moving on to the local. In addition, there are four supplementary sections: this introduction, a reference section, a downloadable template with the basic tagging filled in, and a single-file version of the guidelines that allows basic searching.

If you have questions or comments, please email Kenneth Price or Brett Barney.


1.2 Some Basic Vocabulary

The primary audience for these guidelines is Walt Whitman Archive staff, and staff members come to the project with varying degrees of familiarity with humanities computing. The following explanations of selected basic terms are intended to help those with limited experience better understand the guidelines:

  • TEI (Text Encoding Initiative): As described on the TEI Consortium homepage, "The TEI is an international and interdisciplinary standard that helps libraries, museums, publishers, and individual scholars represent all kinds of literary and linguistic texts for online research and teaching, using an encoding scheme that is maximally expressive and minimally obsolescent." In other words, the TEI is a standard for making transcriptions of complicated texts (including handwritten manuscripts) readable by computers.
  • Markup and Encoding: Generally, these terms refer to the "tags" that we include in our transcriptions to mark textual features in a way that allows them to be processed by a computer.
  • Tag: The string of characters surrounded by "<" and ">". For example: <add>. Tags typically come in pairs, an "opening" one to mark the beginning and a "closing" one to mark the end of a section of the transcription. The example above is an opening tag. A closing tag includes a slash after the "<" to distinguish it: </add>. A pair of tags describes all of the transcription that they enclose, so if you wanted to note that the word "crusty" was added to a text, you would tag it like this: <add>crusty</add>.
  • Element: This is the core part of a tag—the first string of characters after the "<" in the open tag. For example, in <add type="insertion">, "add" is the element.
  • Attribute: This is a secondary part of a tag that creates a category for further describing the element. It appears after the element name in the opening tag. An attribute must be followed by a value (see next). An element may have more than one attribute, each separated from the element name and from other attribute/value combinations by a single space.
  • Value: This is a word or short phrase that classifies the element in terms of a particular attribute. It is contained within quotation marks and preceded by an equal sign. For example, in the following add tag, "type" and "place" are attributes and "unmarked" and "supralinear" are values: <add type="unmarked" place="supralinear">.
  • DTD (Document Type Definition): This is a file that functions as our rulebook. Validating (or "parsing") our markup against the DTD is one tool we use to see whether our files are properly encoded. Though our DTD implements the TEI standard, the Whitman Archive DTD is unique, customized for our specific project needs.
  • Nesting: This refers to the practice of enclosing pairs of tags within other pairs of tags, a basic principle of properly structuring XML documents. For example, the tag pair <TEI.2> and </TEI.2> contains almost all of the other markup within a document (see the Annotated Template for a visualization of this). It is important to nest tags properly, so that a particular element does not overlap other elements. The following is an example of improperly nested markup: <l><add></l></add>. Compare this properly nested example: <l><add></add></l>. At any given point in a document several tags may be "open," but the tag most recently opened must always close before earlier tags close.
  • Stylesheet: An XSLT (Extensible Stylesheet Language Transformations) stylesheet is a file that transforms the encoded manuscript into HTML (HyperText Markup Language) for use on the web. In other words, stylesheets are what we use to make the various XML-encoded documents display in consistent and attractive ways on our site.
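
As a quick illustration of the vocabulary above, the following fragment labels each part of a tag. It is only a sketch: the words inside the line are placeholder text rather than a transcription of any manuscript, and the comments are added here for explanation.

```xml
<!-- Placeholder text only; not a transcription of an actual manuscript. -->
<!-- "l" and "add" are elements. -->
<!-- "type" and "place" are attributes; "unmarked" and "supralinear" are values. -->
<!-- The add pair opens and closes inside the l pair, so the nesting is proper. -->
<l>An old and <add type="unmarked" place="supralinear">crusty</add> sailor</l>
```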

1.3 The Purpose of Encoding

The Walt Whitman Archive is developing in many areas simultaneously. Currently, work is underway on the image gallery, on the interviews, on the presentation of the various editions of Leaves of Grass, on the poems published in periodicals, and on Whitman's disciples—that group of close friends and associates who sometimes collaborated with him, who promoted his work, and who shaped his legacy. However, since 2000 the main effort of the Whitman Archive has been directed toward editing and providing access to all of Whitman's poetry manuscripts, highly revealing documents that have never before been systematically collected, transcribed, and published. Scattered in over sixty institutions worldwide and often difficult to decipher, Whitman's manuscripts remain little known despite their importance for students and scholars.

The XML encoding of manuscripts has presented us with numerous challenges, in part because the TEI was initially designed with a focus more on books than on manuscripts. Our tagging of manuscripts has in many cases required us to develop extensions. The modifications to TEI conform to the guidelines provided in Chapter 29, "Modifying and Customizing the TEI DTD," in the Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange (Oxford: TEI Consortium, 2002).

XML encoding is not mechanical but interpretive. Sophisticated users of the Whitman Archive may wish to understand our tagging from the inside, as it were, thereby better grasping the query potential of our Archive. We wish to be overt about what the Whitman Archive has chosen, thus far, to encode regarding Whitman poetry manuscripts. The thus far in that sentence points to a key aspect of encoding: it is a process that can go through multiple passes and layerings. Generally speaking, our approach in tagging is to encode at a non-controversial structural level. Thus we tag titles, lines, and the like. This tagging will enable searches on discrete parts of documents (e.g., sections, lines, clusters, titles) within individual poems, across printed poems, across all poems, etc. One could also do, say, thematic tagging, but we have avoided that because we have felt that it would lead to endless internal debates on the project and might lead to a too-coercive editorial presence or at least to an end product that was too bound and too limited by the perspectives and the historical moment of the current creators of this site.

There are other features that it would not be controversial to tag, that would be useful information to have accessible, but that we have nonetheless chosen not to tag. For example, we have not recorded paper types, nor have we noted ink and pencil colors. We can certainly imagine scholars who could make brilliant use of this information if it were systematically recorded across all of Whitman's available documents. Still, a project such as the Whitman Archive constantly faces practical questions about what to prioritize. The magnitude of the entire undertaking is so vast that we know that we can at best hope to achieve a first pass through the material. Whitman himself sometimes thought that he left his writings for "poets to come" who would justify him and make clear his significance. Something analogous is at work in our hope that we can produce through the Whitman Archive not a monumental product but instead a monumental process that can be continued, corrected, and otherwise improved by future scholars. Other scholars with special interests in particular aspects of textuality could take our initial tagging and add additional layers that would enable various types of analyses.

Assuming continuing cooperation from libraries, we hope to make it possible for any interested person to look at images of all the known manuscripts; to search, in complicated ways, the text of those manuscripts and the rest of Whitman's work; to find out quickly where the physical documents are held; and to begin to make sense of a vast collection of important documents.

Like other electronic editing projects, we find we are limited in models to follow. Since we are doing new things, we cannot simply adopt tried-and-true methods but must to some extent invent our own way. The work of individual encoders is invaluable to the project, as it translates into practice the theoretical conclusions that we've come to since beginning work on the Whitman Archive in 1995. Encoding manuscripts and other documents is some of the most complex and valuable work we have underway.


1.4 Getting Started

Before you begin to encode your first manuscript, you'll need to get an assignment from one of the project editors. This process is managed by the Manuscript Tracking Database (password required). Here, you will get the image(s) of the manuscript and the unique id that you will use while encoding. After you locate an assigned document in the database, you must mark it "Checked Out."

Often, it will be useful for you to consult the notes accompanying the manuscript images, notes that were made by the institution or individual as they created the digital images. These notes describe the relationship of the individual images to other images (for example, recto/verso relationships) and, where applicable, provide the folder name and other important bibliographical information. To look at these notes (which, due to the complicated history of the project, vary in form and completeness), please go to the appropriate directory within the Image Warehouse (password required). In most cases, notes are included with the raw images.

Individual encoders use various software to create encoded documents, though NoteTab has been popular and is recommended. It is also important to work at a machine with imaging software that allows you to "zoom in" on complicated portions of the manuscript or do other manipulations that enable easier and more accurate transcription.

Once you have finished transcribing and encoding and have validated the file against the DTD, you must return to the Manuscript Tracking Database to "Check In" your document. This procedure uploads the file so that others may double-check it and publish it on the site. After you have uploaded the file, please type "Yes" in the "Final Transcription" box on the database upload screen.


2. Global

Encoding Common to Every Document


2.1 Header
2.2 Unique Identifiers
2.3 Basic Document Structure
2.4 Titles and Naming
2.5 References to External Files (Page Breaks and Entity Declarations)

[Note: Above the Header of each document are the XML Declaration, Document Type Declaration, External Entity Declarations, and the open tag of the "root element," <TEI.2>, which contains all other elements. Please go to the Annotated Template or to section 2.5 to read more about how to insert these.]
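
The pieces named in this note fit together roughly as follows. This is a minimal sketch rather than the template itself: the document type declaration and entity declaration are copied from the example in section 2.2, the XML declaration is the generic XML 1.0 form (the template may specify more, such as an encoding), any attributes the template places on the root element are left out, and the comments stand in for omitted content.

```xml
<?xml version="1.0"?>
<!-- Document type declaration; external entity declarations go between the square brackets -->
<!DOCTYPE TEI.2 PUBLIC "-//UVA::IATH//DTD whitman.dtd (Whitman Archive)//EN" "whitman.dtd" [
<!ENTITY uva.00023.001 SYSTEM "uva.00023.001.jpg" NDATA jpeg>
]>
<!-- Open tag of the root element, which contains all other elements -->
<TEI.2>
<teiHeader>
<!-- header content, described in section 2.1 -->
</teiHeader>
<text type="manuscript">
<!-- transcription, described in section 2.3 -->
</text>
</TEI.2>
```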

2.1 The Header

Every XML document we create has a "header," a kind of electronic title page that carries essential information about who is responsible for creating and publishing the document and about the source of the text we are marking up. The header is analogous to a book's first few pages, which inform you of the author, publisher, copyright date, terms of publication, etc.

Since much of the information in the header is the same for all of the XML documents we create, we recommend that you use the template to simplify your encoding of it.

Below, you will find descriptions of the main parts of the header, and you can click here to consult an annotated version of the template.

The <teiHeader> has three principal components:

  • <fileDesc> contains a full bibliographic description of an electronic file
  • <profileDesc> provides a detailed description of non-bibliographic aspects of a text, specifically the situation in which it was produced, the participants, and their setting
  • <revisionDesc> summarizes the revision history for a file

These elements are arranged within the <teiHeader> in this order, so the overall structure of <teiHeader> is this:

<teiHeader>
<fileDesc></fileDesc>
<profileDesc></profileDesc>
<revisionDesc></revisionDesc>
</teiHeader>

File description <fileDesc>

This should contain the following components:

  • Title statement <titleStmt> includes 1) the title given to the electronic work (which here always includes the subtitle provided by us: "a machine readable transcription"); 2) the author; 3) the editors; 4) information about others responsible for aspects of the electronic text; and 5) the name of the sponsors and funders. An example in which the original document bears a title given by Whitman:

<titleStmt>
<title level="m" type="main">Song of Myself</title>
<title level="m" type="sub">a machine readable transcription</title>
<author>Walt Whitman</author>
<editor>Ed Folsom</editor>
<editor>Kenneth M. Price</editor>
<respStmt>
<resp>Transcription and encoding</resp>
<name>The Walt Whitman Archive Staff</name>
</respStmt>
<sponsor>The Institute for Advanced Technology in the Humanities</sponsor>
<sponsor>University of Iowa</sponsor>
<sponsor>University of Nebraska-Lincoln</sponsor>
<funder>The National Endowment for the Humanities</funder>
<funder>The United States Department of Education</funder>
</titleStmt>

An example for a manuscript that lacks an authorial title (to read the guidelines for assigning titles, see section 2.4, "Titles and Naming"):

<titleStmt>
<title level="m" type="main" rend="bracketed">I see who you are</title>
<title level="m" type="sub">a machine readable transcription</title>
. . .
</titleStmt>
etc.

Note that the title element includes a rend attribute that indicates it has been supplied by us and should therefore be displayed with brackets.

<editionStmt>
<edition>
<date>2005</date>
</edition>
</editionStmt>

<publicationStmt>
<idno>uva.00023</idno>
<distributor>The Walt Whitman Archive</distributor>
<address>
<addrLine>The Institute for Advanced Technology in the Humanities</addrLine>
<addrLine>Alderman Library</addrLine>
<addrLine>University of Virginia</addrLine>
<addrLine>P.O. Box 400115</addrLine>
<addrLine>Charlottesville, VA 22904-4115</addrLine>
<addrLine>whitman@jefferson.village.virginia.edu</addrLine>
</address>
<availability>
Copyright &#169; 2005 by Ed Folsom and Kenneth M. Price, all rights reserved. Items in the Archive may be shared in accordance with the Fair Use provisions of U.S. copyright law. Redistribution or republication on other terms, in any medium, requires express written consent from the editors and advance notification of the publisher, The Institute for Advanced Technology in the Humanities. Permission to reproduce the graphic images in this archive has been granted by the owners of the originals for this publication only.
</availability>
</publicationStmt>

<sourceDesc>
<bibl>
<author>Walt Whitman</author>
<title>Calamus Leaves</title>
<orgName>Yale Collection of American Literature, Beinecke Rare Book and Manuscript Library</orgName>
<note type="project">Transcribed from our own digital image of original manuscript.</note>
</bibl>
</sourceDesc>

Note on <orgName>: The institution that holds the manuscript should be cited as listed in the Preferred Citation table in the References section of the Encoding Guidelines.

Notes on description of source: This information is about the copy text, and the <title> here (as opposed to the one in titleStmt) should be given exactly as it appears in the records of the institutional repository, no matter how imprecise or wrong-headed their conventions may seem. Many times, the most specific title for the material will be that given to the folder used to store it, since few archives assign a title to each individual item; often, therefore, the <title> given in the <sourceDesc> will be a folder label.

At present, we almost always work from our own digital images, but we have also worked from Joel Myerson's facsimile reproductions of Whitman manuscripts (published in Joel Myerson, The Walt Whitman Archive: A Facsimile of the Poet's Manuscripts, New York: Garland, 1993); from the Primary Source Media Whitman CD (Major Authors on CD-ROM: Walt Whitman, Eds. Ed Folsom and Kenneth M. Price, Woodbridge, CT: Primary Source Media, 1997); or from the original manuscripts themselves. Whatever the case, specific information about the image(s) and/or text(s) you rely on should be given in a <note>. If you consult more than one source, list each, separated by semicolons. (Please note that when citing Myerson, the volume #, part #, and page # change from manuscript to manuscript.)

By the way, we say "our own digital image" rather than, say, the Whitman Archive's digital image so as to draw a clear distinction with Myerson's volumes, also called—somewhat confusingly—the Whitman Archive.

Profile description <profileDesc>

In the <profileDesc> is a list of all hands other than Whitman's that the markup declares as being in any way responsible, typically as the value of a "resp" (or "responsibility") attribute in a note, unclear, or gap element.

For example, if you are transcribing a Whitman manuscript that has a note by Fredson Bowers written physically on it, the header must have a <profileDesc> that reads:

<profileDesc>
<handList>
<hand scribe="Fredson Bowers" id="fb"/>
</handList>
</profileDesc>

(For more on this topic and how to encode non-Whitman writing on manuscripts, see section 3.11, "Writing in Others' Hands".)

You also need to include a <handList> in the <profileDesc> if your markup includes any <unclear> or <gap> elements, which require a "resp" attribute. For example, if Andy Jewell is encoding a manuscript with an unclear word and inserts this markup:

<unclear reason="cut away" cert="60%" resp="awj">herbage</unclear>

the document's <teiHeader> will need to include this <profileDesc>:

<profileDesc>
<handList>
<hand scribe="Andrew Jewell" id="awj"/>
</handList>
</profileDesc>
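
If both situations apply to the same manuscript, the two hands would presumably be listed together. The following is a sketch under that assumption (a note by Fredson Bowers written on the manuscript, plus an unclear reading supplied by Andrew Jewell); it simply combines the two examples above into one <handList>.

```xml
<profileDesc>
<handList>
<!-- hand responsible for a note written on the manuscript -->
<hand scribe="Fredson Bowers" id="fb"/>
<!-- encoder responsible for an unclear reading (resp="awj") -->
<hand scribe="Andrew Jewell" id="awj"/>
</handList>
</profileDesc>
```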

Revision description <revisionDesc>:

The revisionDesc element is used to summarize the changes that have been made to the file. It contains date, respStmt, name, and item elements to specify the date, responsible individuals, and changes. IMPORTANT: TEI allows only one <item> per <change>. If changes are performed at the same time, insert additional changes within the same <item> and use semicolons. If multiple changes are performed at different times, add another <change> at the top, so that changes are listed in reverse chronological order (most recent change first). To describe the tasks in our routine workflow, choose from the following terms for the content of <item>:

If the task is something other than these, any descriptive phrase can be used. Example:

<revisionDesc>
<change>
<date>2002-10-30</date>
<respStmt>
<name>Brett Barney</name>
</respStmt>
<item>Converted to camel case</item>
</change>
<change>
<date>2002-09-14</date>
<respStmt>
<name>Kenneth M. Price</name>
</respStmt>
<item>Edited</item>
</change>
<change>
<date>2002-09-07</date>
<respStmt>
<name>Andrew Jewell</name>
</respStmt>
<item>Checked; revised</item>
</change>
<change>
<date>2000-08-22</date>
<respStmt>
<name>Matt Miller</name>
</respStmt>
<item>Transcribed; encoded</item>
</change>
</revisionDesc>

2.2 Unique Identifiers


Description

Unique identifiers are one-of-a-kind names assigned to each electronic text we create. That is, every poem, collection of poems and work (for an explanation of "work" vs. "document" click here) must have a unique ID.

Creating and assigning IDs

For manuscripts, IDs are made up of a 3-character repository code plus a 5-digit number (assigned in ascending order), with the two fields separated by a dot.
Examples:
loc.00158 (a manuscript at the Library of Congress)
uva.00001 (a manuscript at University of Virginia)

Printed texts are all assigned the 3-letter prefix "ppp."

ID database

We use a database to track the unique identifiers and our workflow as we transcribe, encode, and upload manuscripts. This database can be accessed here.

Placement of IDs

The unique identifier appears in two places in the TEI header:

Transcription file names

To name the file when you save it, simply add the file extension ".xml" to the ID. Example:
uva.00023.xml

Image file names

Each page image of a document is also given an ID, created by adding a three-digit suffix to the document ID. For example:
loc.00158.002 (Page 2 of a manuscript)

These page image IDs are inserted as the value of the corresp attribute of the appropriate page break elements (<pb/>), and an entity declaration for each one must be inserted between the square brackets in the document type declaration. Example:

. . .
<!DOCTYPE TEI.2 PUBLIC "-//UVA::IATH//DTD whitman.dtd (Whitman Archive)//EN" "whitman.dtd" [

<!ENTITY uva.00023.001 SYSTEM "uva.00023.001.jpg" NDATA jpeg>
<!ENTITY uva.00023.002 SYSTEM "uva.00023.002.jpg" NDATA jpeg>

]>
. . .
<pb corresp="uva.00023.001" />
. . .
<pb corresp="uva.00023.002" />

2.3 Basic Document Structure

Within the <text> of each encoded document is a structured description of the content of the item being encoded. This page describes the basic elements of this structural tagging.


Basic Elements for Marking Structure

The following elements are used to describe the structure of Whitman's poetic works:

A sample structure might look like this:

<!-- markup is simplified -->
<div1 type="poem notes">
<lg1 type="poem">
<head type="main-authorial" rend="underline"></head>
<l></l>
<l>
<seg></seg>
<seg></seg>
</l>
</lg1>
<lg1 type="poem">
<head type="main-derived"></head>
<lg2 type="linegroup">
<l>
<seg></seg>
<seg></seg>
<seg></seg>
</l>
<l></l>
</lg2>
<lg2 type="linegroup">
<l></l>
<l></l>
</lg2>
</lg1>
<p></p>
</div1>

Manuscript Genres

To figure out how to tag a particular manuscript, first look closely at its structure, and decide which of the following four categories it falls into. A guide on how to deal with each type follows.

VERSE ONLY

PROSE ONLY

Prose should be divided into <p>s. No <div> is required in a prose-only document unless the prose is divided into separate intellectual units. For example, a manuscript requires <div1 type="section"> if it begins with two paragraphs about democracy, then has a clear break (e.g., a sub-heading, a horizontal line, or white space) followed by three paragraphs about the sound of the fishmonger yelling on the street. In such a case, the discrete groups of paragraphs should be marked with <div1>s. Except on title pages, line breaks <lb/> are not encoded. Also note that <lg>s are only used to mark up poetry, never prose.

<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="section">
<p></p>
<p></p>
</div1>
<div1 type="section">
. . .
</div1>
</body>
</text>

MIXED GENRE

Many manuscripts contain single intellectual units which are a mixture of poetry and prose. (For an example, see the manuscript "Ashes of Roses," here.) "Mixed genre," for our purposes, does NOT just mean a manuscript leaf with poetry and prose on it (for example, a poetic draft on the recto and prose on the verso). Rather, "mixed genre" signifies writing that is thematically unified, apparently part of a single draft, but made up of a mix of prose and verse, as when Whitman composes an early draft that combines trial poetic lines with prose notes or lists. For a mixed-genre manuscript, use a <div1> with "poem notes" as the value of the "type" attribute, like this:

<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="poem notes">
etc.

TITLE PAGE

Some manuscripts have only titles, with no content to follow those titles, or are pages with several trial titles that Whitman never used (for an example, click here). For these unusual manuscripts, we have a different <div1> type, "title notes."

<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="title notes">
etc.

To read about the unique markup used in Title Page manuscripts, go here

Headings

A <head> tag, used to mark titles, will be used for both indexing and display. Please go here to read more about this titling procedure. Note that <head> can be on any structure; <div#> and <lg#> will be most common.

2.4 Titles and Naming


[If you are interested in reading about the markup used in Title Page manuscripts, go here]

Each poetry manuscript transcription will have three different kinds of titles. These titles may be identical; they may be different.


Naming poetry manuscripts


We have developed a simple set of rules for giving names to Whitman's manuscript poems. Note that this naming is IN ADDITION TO the assignment of a unique identifier. The rules are listed here in the order of priority:

3. Local


Encoding that Varies from Document to Document


3.1 Spacing
3.2 Deletions
3.3 Additions
3.4 Additions and Deletions in Combination
3.5 Illegible or Missing Text
3.6 Hyphenation and Non-Standard Spelling
3.7 Unusual Characters and Marks
3.8 Graphically Distinctive Text
3.9 Signatures and Dates
3.10 Other Writing in Whitman's Hand
3.11 Writing in Others' Hands
3.12 Cutting and Pasting
3.13 Page Breaks
3.14 Encoding Corrected Proofs
3.15 Encoding Prose
3.16 Encoding Lists
3.17 Manuscripts That Are Neither Poetry Nor Prose
3.18 Enigmas
3.19 Work Relationships and Date Information

3.1 Spacing

Recent work with stylesheets has taught us that paying attention to and regularizing the encoding of white space is important as we prepare manuscripts for display on the site. The most important guideline is simply to be conscious of spacing as you transcribe and encode, but here are a few more specific rules to follow:

  • Be sure to put a space between words. Remember that, after processing, the markup will be invisible, so your transcription needs to include the spaces that separate words even when the words are separated in the XML document by one or more tags.
  • Avoid spaces before closing <add> or <del> tags. Since Whitman's revisions typically did not involve the addition or deletion of white space after the last word of a phrase, make sure you insert the space outside the closing <add> or <del>. A properly spaced transcription looks like this:
    <add type="unmarked" place="supralinear">Song</add> of Myself.
  • Within <app> structures, insert the spaces between the closing <add> or <del> tag and the closing <rdg> tag. All characters must be contained within the "reading" or <rdg>, so spaces outside of <rdg> will be ignored. A properly spaced <app> structure should look like this:
    Song of <app>
    <rdg varSeq="1">
    <del type="overstrike">You</del> </rdg>
    <rdg varSeq="2">
    <add type="unmarked" place="supralinear">Myself</add> </rdg>
    </app>
  • Spaces before closing <l> or <seg> tags are unnecessary and should be eliminated.

  • Spaces before and after the em dash (&#8212;) should be eliminated.

  • Do not insert unnecessary spaces. Often, encoders have inserted spaces that are not part of the transcription (for example, to make the tagging more human-readable). You can use as many returns as you wish to make the markup easier to read, but please do not use the space bar.
  • [To learn how to encode Whitman's use of intentional space within lines, go here]
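
The rules on spaces around closing tags and around em dashes can be seen in the following sketch. The words are placeholder text, not a transcription of an actual manuscript, and the comments are added here for explanation.

```xml
<!-- Avoid: the space sits inside the element, before the closing add tag -->
<l>the <add type="unmarked" place="supralinear">good </add>gray poet</l>

<!-- Preferred: the space that separates the words follows the closing add tag -->
<l>the <add type="unmarked" place="supralinear">good</add> gray poet</l>

<!-- Preferred: no spaces on either side of an em dash -->
<l>placeholder&#8212;text</l>
```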


3.2 Deletions


What to Mark

Use <del> to mark a letter, word or passage that has been deleted by any method. Use common sense when marking deletions; if an entire line has been crossed out, for example, but the horizontal line does not physically intersect with a comma that follows the passage, you should still assume that the comma is intended to be included in the deletion. In cases of doubt, please consult Ed Folsom or Kenneth Price for his reading of the passage.

Attributes

type is the only required attribute for the del element. Possible values are:

  • overstrike: A line or lines are drawn through rejected letters, words, or passages. This is by far the most common method of deletion in Whitman manuscripts.
  • erasure: Whitman has erased part of the text.
  • hashmark: A vertical or diagonal line or lines marks through a large chunk of text (often the whole manuscript page).
  • pasteover: Whitman has deleted text by pasting another piece of paper on top of it.
  • overwrite: Letters or words are marked for deletion by being written over with other letters or words.
NOTE: Each of these types of deletion can occur in combination with additions; "overwrite" does by definition, and "pasteover" almost always does. For information about marking combinations of additions and deletions, see section 3.4, "Additions and Deletions in Combination."

Whitman's overstrikes are usually emphatic and easily recognizable, but occasionally one sees a mark that may be either an overstrike OR a stray pen mark. In these cases, first check with Kenneth Price or Ed Folsom, and then use the optional cert attribute to indicate your degree of certainty that the passage has been deleted. For example, on the Duke manuscript "I see who you are," a few lines from the bottom, the word "editor" appears to be struck through. This might be tagged as follows:

<del type="overstrike" cert="80%">editor</del>

Deletion of Longer Passages

Occasionally you will encounter a long passage that has been deleted. For these passages, use <del> unless doing so would create a nesting problem. For example, consider again the manuscript in the example above. Here, the long vertical strike is marked simply by enclosing the entire poem with a del element thus:

. . .
<body>
<pb/>
<del type="hashmark">
<lg1 type="poem">
. . .
</lg1>
</del>
</body>
. . .

Imagine, however, a different scenario, one in which a deleted passage consists of the last two words of one line segment <seg> and all of the next line segment. Because using <seg><del></seg><seg></seg></del> would violate the nesting rule, using paired <del> </del> tags to mark this kind of deletion is unacceptable. Instead, you must use the elements delSpan and anchor, the first to mark the beginning of the deleted passage and the second to mark the end. As with <del>, the type attribute is required for <delSpan>. The to attribute is also required, since it provides a "pointer" to the anchor. The value of the to attribute must be the same as the value of the required id attribute on the corresponding anchor element. To create these, use "d" (for a deletion) plus the next available number ("d1" for the first <delSpan>, "d2" for the second, and so on). Please note: if the manuscript calls for multiple <delSpan> elements, you will need to use distinct identifiers for each anchor. In other words, you may only use "d1" or "d2" once in a document; subsequent identifiers will have to be "d3," "d4," etc.

Consider as an example the Duke manuscript "To be at all". If we ignore other complexities for the moment, the multi-segment deletion in the line that begins "One no more than" should be marked as follows:

. . .
<seg>and out of me <delSpan type="overstrike" to="d1"/>of me more bliss</seg>
<seg>than I thought the spheres</seg>
<seg>could carry.</seg>
</l>
<anchor id="d1"/>
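
The identifier rule can be illustrated with a minimal sketch of two spanning deletions in one document. The line text here is hypothetical and stands in for real manuscript readings; the only point is that each to value is paired with its own anchor id and that no identifier is reused.

```xml
<!-- Hypothetical text; each delSpan points to its own anchor. -->
<l><seg>first line begins <delSpan type="overstrike" to="d1"/>and ends here</seg>
<seg>with a deleted segment</seg><anchor id="d1"/></l>
<l><seg>second line begins <delSpan type="overstrike" to="d2"/>and ends here</seg>
<seg>with another deleted segment</seg><anchor id="d2"/></l>
```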

3.3 Additions


What to Mark

Use <add> to mark any part of the text whose placement, ink, etc. clearly indicate that it was added to the manuscript after the surrounding text was written.

Attributes

Two attributes are required on the <add> element: type and place.

For type, the possible values are:

  • insertion: marked by a caret (which often looks like an "x").
  • unmarked: added without caret or other mark.
  • overwrite: written over earlier text.
  • pasteon: written on a piece of paper that is glued to paper with earlier text.

For place, the possible values are:

  • supralinear: above the line.

  • inline: in space available on the same line as earlier text.

  • infralinear: below the line.

  • over: over the earlier letter, word, or phrase.

  • margintop: in top margin.
  • marginbot: in bottom margin.

  • marginleft: in left margin.

  • marginright: in right margin.

  • interlinear: between lines.

Addition of Longer Passages

The rules for marking long additions are similar to those for marking long deletions. Use <add> unless doing so would create a nesting problem, and use the elements addSpan and anchor to mark the beginning and end of an added passage that doesn't nest within other elements. <addSpan> has three required attributes: to, type, and place. Available values for type and place are the same as those listed above for <add>. The value of the to attribute must be the same as the value of the required id attribute on the corresponding anchor element. To create these, use "a" (for "addition") plus the next available number ("a1" for the first <addSpan>, "a2" for the second, and so on).
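
As a rough sketch of the pattern, an addition that starts partway through one line segment and runs to the end of the next might be tagged as follows. The words are hypothetical, and the type and place values are chosen only for illustration; what matters is that the to value on <addSpan> matches the id on the anchor.

```xml
<!-- Hypothetical text; the addition begins mid-segment and ends with the second segment. -->
<l><seg>first words of the line <addSpan type="insertion" place="supralinear" to="a1"/>added words</seg>
<seg>more added words</seg><anchor id="a1"/>
<seg>rest of the line</seg></l>
```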

Transpositions Noted by Arrows or Asterisks

Some manuscripts have brackets, arrows, and/or a series of asterisks to indicate Whitman's desire to move a line or lines to a different place in the poem. To encode this phenomenon, we use the <transpose> element. In this example, Whitman has bracketed one line and indicated with an asterisk in the margin that the line should be moved down. The encoding for this section follows, with the tagging most pertinent to the transposition in bold.

<note type="authorial" place="marginleft">&#45;down</note>
<transpose rend="bracketed" anchored="yes" target="t1">
<l><seg>Of the native scorn of grossness</seg>
<seg>and gain there, (O it lurks</seg>
<seg>in me night and day—What</seg>
<seg>is gain, after all, to savage-</seg>
<seg>ness and freedom?)</seg></l>
</transpose>
<l><seg>Of immense spiritual <app><rdg varSeq="1"><del type="overstrike">things</del></rdg>
<rdg varSeq="2"><add type="unmarked" place="supralinear"> results</add></rdg></app>, future years,</seg>
<seg>inland, spread there each side of</seg>
<seg>the Anahuacs,</seg></l>
<l><seg>Of these Leaves established there, and</seg>
<seg>well understood there.&#8212;</seg></l>
<anchor id="t1"/>
<milestone unit="undeclared" rend="horbar"/>
<note type="authorial" place="marginleft">take down&#45;</note>

Explanation:
  1. We use <note> to transcribe any marginal characters that indicate the transposition.
  2. When Whitman brackets the part to be moved, we add rend="bracketed" to the <transpose> tag.
  3. The anchored attribute is used to note whether or not the manuscript clearly indicates where the line(s) are to be moved. (We know of at least one example where this is not known.) This attribute has one of two possible values: yes or no.
  4. If the value of the anchored attribute is yes, a target attribute is required; it points to an <anchor> that is placed at the target—the point in the manuscript to which the part is to be moved. If the value of anchored is no, the target attribute is not required.
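
For the less common case in point 3, where the manuscript does not show where the text should go, a minimal sketch might look like this. The line is hypothetical; because anchored is "no", no target attribute and no corresponding <anchor> are supplied.

```xml
<!-- Hypothetical line; the manuscript gives no clear destination for the bracketed text. -->
<transpose rend="bracketed" anchored="no">
<l><seg>A bracketed line whose new position</seg>
<seg>is not indicated</seg></l>
</transpose>
```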


3.4 Additions and Deletions in Combination


Substitutions

Very frequently, Whitman's additions are not merely appended to earlier text but are substituted for earlier text. It is our policy to link the deleted and added portions by marking each as a reading <rdg> within an <app> element. ("App" is short for "apparatus entry." For information about the use of this element in other contexts, see chapter 19 of the TEI Guidelines.) Each <app> will contain at least two <rdg>s, and may contain up to five. The <app> element requires no attributes; the rdg element requires the varSeq ("variant sequence") attribute, the value of which is a single-digit number that indicates the relative order in which the present reading is presumed to have been written. For this example line segment, the markup would look like this:

<l><seg>Old Asia's
<app>
<rdg varSeq="1"><del type="overstrike">self</del></rdg>
<rdg varSeq="2"><add place="supralinear" type="unmarked"> there </add></rdg>
</app>
with venerable</seg>
. . .

For obvious reasons, usually the first reading will contain a deletion and the second reading will contain an addition. This is not always true, however. You may come across instances where multiple variants are left undeleted. In this case, the first <rdg> will contain no other elements, just the transcribed word(s).

Or you may find that a second reading was added but subsequently rejected in favor of the first. In this case, the second <rdg> will contain both an <add> and a <del>. (See "Nesting <add> and <del>" below for an explanation of how to arrange these.)

If the state of the manuscript makes it difficult to determine with certainty the order of the readings, a resp attribute is available for the <rdg> element; the value of this attribute can be used to identify (by initials) the person responsible for asserting an order of readings.
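
The two exceptions described above can be sketched as follows. The words "bright" and "clear" are hypothetical, as are the type and place values on <add>; the second case follows the nesting rule explained under "Nesting <add> and <del>" below.

```xml
<!-- Case 1 (hypothetical words): both readings are left standing. -->
<app>
<rdg varSeq="1">bright</rdg>
<rdg varSeq="2"><add type="unmarked" place="supralinear">clear</add></rdg>
</app>

<!-- Case 2: the second reading was added, then struck out in favor of the first. -->
<app>
<rdg varSeq="1">bright</rdg>
<rdg varSeq="2"><del type="overstrike"><add type="unmarked" place="supralinear">clear</add></del></rdg>
</app>
```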

Overwriting

You will sometimes encounter a substitution in which a letter or word has been overwritten with another letter or word. In such cases, the value of the place attribute on the add element is "over"; the value of the type attribute for both the <add> and <del> elements is "overwrite." For example, consider this manuscript excerpt, in which Whitman has changed the "e" from lower- to upper-case. The markup for this word is:

<app>
  <rdg varSeq="1">
    <del type="overwrite">e</del>
  </rdg>
  <rdg varSeq="2">
    <add type="overwrite" place="over">E</add>
  </rdg>
</app>ach

Pasting

Fairly frequently, Whitman made substitutions by pasting one page or scrap over another. Treat such cases as you would other substitutions, by using the app, rdg, del, and add elements. For the type attribute on <del>, use the value "pasteover." The type attribute of the <add> should be given the value "pasteon," and the place attribute should be given the value "over." For an example, look at this manuscript leaf.

. . .
<app><rdg varSeq="1"><del type="pasteover">
<l><seg>And that night O you happy </seg>
<seg>waters, I heard you beating</seg>
<seg>the shores – But my heart</seg>
<seg>beat happier than you – for</seg>
<seg>he I love is returned and </seg>
<seg>sleeping by my side,</seg></l>
<l><seg>And that night in the stillness</seg>
<seg>his face was inclined toward</seg>
<seg> me while the moon's clear</seg>
<seg>beams shone,</seg></l>
<l><seg>And his arm lay lightly over my</seg>
<seg>breast – And that night I</seg>
<seg> was happy.</seg></l>
</del></rdg>
<rdg varSeq="2"><add type="pasteon" place="over">
<l><seg>And that night, while all</seg>
<seg>was still, I heard the</seg>
<seg>waters roll slowly continually</seg>
<seg>up the shores</seg></l>
<l><seg>I heard the hissing rustle of</seg>
<seg>the liquid and sands, as directed</seg>
<seg>to me, whispering, to congratulate</seg>
<seg>me, – For the friend I love lay</seg>
<seg>sleeping by my side,</seg></l>
<l><seg>In the stillness his face was in-</seg>
<seg>clined towards me, while the</seg>
<seg>moon's clear beams shone,</seg></l>
<l><seg>And his arm lay lightly over my</seg>
<seg>breast – And that night I was happy.</seg></l>
</add></rdg></app>
. . .

Nesting <add> and <del>

Of course, not all combinations of <add> and <del> are substitutions. Consider this manuscript excerpt. To indicate that the addition was deleted, the add element should be nested within the del element:

from
<del type="overstrike">
  <add type="unmarked" place="supralinear">this</add>
</del>
base

There are a number of other ways in which Whitman combined additions and deletions—probably too many to cover each one separately here. You should be able to handle almost all situations you encounter by applying these principles and rules:

  1. Nesting operates on a radial principle, working from the center out. For additions and deletions, this means that when the boundaries of a deletion and an addition are the same, the <add> should be nested within the <del>. This indicates 1) that the material was added; and 2) that the addition itself was deleted. If, however, only part of an addition has been deleted, the <del> will, of course, be nested inside the <add>.
  2. <add>s and <del>s, in various combinations, can be nested within one another, with no theoretical limit to the "depth" of that nesting. So it's entirely possible to have, for example, a <del> within an <add> within an <add> within a <del>.
  3. <app>s, however, should never be nested inside other <app>s, even though you will occasionally encounter situations which seem to call for such markup. Because <app>s within <app>s would create difficulties for computer processing, our project policy is to mark only the "highest level" substitution as such and to mark interior substitutions only with <add> and <del>, as appropriate.
  4. All <add>s and <del>s must nest properly within any other elements that are present. In particular, you should be careful not to straddle line or line segment boundaries.

To understand how to approach a complicated series of additions and deletions, take a look at this manuscript line segment. Note that it shows two substitutions: the word "for" replaces "to," and the phrase "time's hourly ceaseless" replaces "the varied." The fact that "hourly" and "ceaseless" have then been deleted adds a further complication. This is how the segment should be encoded:

<seg>No more
  <app>
    <rdg varSeq="1">
      <del type="overstrike">to</del>
    </rdg>
    <rdg varSeq="2">
      <add type="unmarked" place="supralinear"> for</add>
    </rdg>
  </app>
him
  <app>
    <rdg varSeq="1">
      <del type="overstrike">the varied,</del>
    </rdg>
    <rdg varSeq="2">
      <add type="insertion" place="supralinear">
time's
        <del type="overstrike">hourly ceaseless</del>
      </add>
    </rdg>
  </app>
mightiest,</seg>

3.5 Illegible or Missing Text

Often while encoding, we find words or marks that we cannot decipher, or we postulate readings that we do not feel completely confident about. Since we have decided to encode all of Whitman's text, no matter how indecipherable, we have tags to help us record illegible or missing text.

  • <gap>: for completely unreadable text
  • <supplied>: for text that is currently unreadable, but that has been supplied by another source
  • <unclear>: for text illegible enough to render your transcription questionable

<gap>: This element is used when text is absolutely unreadable, when, for example, it has been torn or cut away, obscured by deletion, or is simply illegible. Each <gap> needs a reason attribute, and you have the choice of three values: "cut away," "deletion, illegible," or "illegible." Note: gap is an empty element (i.e., it does not require a close tag).

  1. <gap reason="cut away"/>: When a page has been torn or cut, leaving only tantalizing stubs of the letters you want to transcribe, as in this example, use this tag at the point in the transcription where the words would appear.
  2. <gap reason="deletion, illegible"/>: When deleted words are illegible (typically because of Whitman's overstrike), insert a <gap> tag in place of the unreadable words. For an example of this sort of circumstance, see this example.
  3. <gap reason="illegible"/>: When characters on the page are not deleted, but are simply impossible to make sense of, as in this example, where the characters preceding the question marks cannot be resolved, use the "illegible" value for the reason attribute.
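
A minimal sketch of how these tags sit in a transcription, using a hypothetical line: one word is lost to an overstrike, and the end of the line has been cut away. Each empty <gap> simply stands where the unreadable words would appear. Whether any surrounding markup is also wanted is not covered by this sketch.

```xml
<!-- Hypothetical line; each gap stands in place of the unreadable words. -->
<l><seg>the <gap reason="deletion, illegible"/> river</seg>
<seg>runs toward the <gap reason="cut away"/></seg></l>
```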

<supplied>: Sometimes a secondary source can supply a reliable transcription of text that is at present illegible, as for example when a transcription was done by an institution or editor before damage occurred. The supplied element is used in such situations. Enclose that part of the text that has been supplied in the supplied element, and use the reason attribute values listed for <gap> above to state the cause of the loss of text. Also insert a source attribute with a value that notes your source for the supplied text. For example, this excerpt is from our transcription of "Ashes of Roses":

<p>
Are we to have a National Hy
<supplied reason="cut away" source="Library of Congress transcription">
mn by
  <orig reg="Centennial">Cen-</orig>
</supplied>
tennial time?
</p>

<unclear>: When you believe you have an accurate reading of a difficult-to-read passage, but you are not completely confident, mark the questionable reading with the unclear element. Use the reason attribute to state the cause of the uncertainty in transcription, selecting from the values described above under <gap>. Use the cert (certainty) attribute to indicate the degree of confidence in the transcription. Its value will be a numeric percentage (e.g., "95%"). Also include a resp (responsibility) attribute to indicate your responsibility for the postulated reading, and as its value use your initials.

For example, if Andy Jewell is encoding a manuscript with an unclear deleted word that he thinks might be "herbage," he inserts this markup:

<unclear reason="deletion, illegible" cert="70%" resp="awj">herbage</unclear>

Remember: when the value of a "resp" attribute indicates a hand other than Whitman's, a note must be included in the <profileDesc> within the header. See "Resp Values" in section 3.11 for how to do this.


3.6 Hyphenation and Non-Standard Spelling

Segments with end-hyphenation

Because we wish both to record the lineation of the copy text and to enable searches for words that are broken by end-line hyphenation, we use the <orig> tag with the reg attribute to record the original and regularized readings. Tag such instances in the following way:

<l>
<seg> . . . and the dying <orig reg="emerging">emerg-</orig></seg>
<seg>ing from gates,</seg>
</l>

Non-Standard Spelling

The sic element is used to represent a mistake by the author. The required attribute corr provides a correction. These corrections will enable searches to use standardized spelling and not require the searcher to know, for example, that Whitman misspelled "Buildings" as "Buldings" in this manuscript. This word should be marked up in this way:

<sic corr="Buildings">Buldings</sic>

Sometimes what you might think of as a spelling error would more accurately be termed an alternate spelling. For words that are spelled in idiosyncratic—though not exactly incorrect—ways, use the orig tag and its required reg attribute. This element and attribute pair works in the same way as sic and corr; that is, Whitman's spelling is transcribed and the standardized spelling is recorded as the value of the reg attribute. As an example, look at Whitman's spelling of "Shakespeare" in this image. Since this spelling of Shakespeare's name is one he himself used (and he never, as far as we know, used "Shakespeare"), it should be encoded as follows:

<orig reg="Shakespeare">Shakspere</orig>


3.7 Unusual Characters and Marks

XML supports only the ASCII character set, which roughly corresponds with the set of characters on a standard keyboard. Not all of the characters you might encounter in a Whitman manuscript are part of the ASCII character set, so to represent one of these unsupported characters you will need to use the appropriate Unicode number—a string of numerals that begins with an ampersand and pound sign (&#) and ends with a semicolon (;).

The table below lists the Unicode numbers we are using on the project. It is important to use the numbers for the listed characters, even when it might be possible to key them in (as with the ampersand, for example) or to use a close approximation (e.g., two hyphens to represent an em-dash). For characters not listed, Unicode numbers are NOT necessary.

For the characters in the left-hand column to display correctly, you must have a Unicode font installed on your computer. If you see boxes for some or all of the entries there, you can try downloading and using Bitstream Cyberbase.

Character | Function in Whitman | Unicode Number
= | Proofreader's mark for hyphen. WW sometimes uses "=" for compound words ("down=balls") and words split between two lines ("some=thing"). PLEASE NOTE that &#8209; is used only when Whitman uses "="; if he uses the standard hyphen ("-"), just key it in. | &#8209;
— | Longer dash, e.g., "Not these—O none of these more". PLEASE NOTE that there should be no spaces before or after the dash, regardless of how the spacing appears on the page. | &#8212;
& | Indicates "and" | &#38;
* | An asterisk | &#42;
© | Copyright symbol | &#169;
✓ | Checkmark | &#10003;
½ | Used often in Bowers's system of page numbering | &#189;
¾ | Used to indicate the fraction, occasionally on manuscripts | &#190;
¶ | Indicates beginning of new paragraph or a new line of poetry | &#182;
ñ | Spanish-language character, n with tilde | &#241;
ó | An "o" with an acute accent mark (to capitalize, change to &#211;) | &#243;
é | An "e" with an acute accent mark (to capitalize, change to &#201;) | &#233;
è | An "e" with a grave accent mark (to capitalize, change to &#200;) | &#232;
☞ | A right-pointing finger | &#9758;
☜ | A left-pointing finger | &#9756;
☝ | An up-pointing finger | &#9757;
☟ | A down-pointing finger | &#9759;
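
To show how the numbers look in a transcription, here is a short sketch. The first two lines reuse phrases quoted in the entries above; the third phrase is hypothetical, and the <l> wrappers are included only so that the fragment is complete.

```xml
<!-- The dash is keyed as &#8212; with no spaces on either side. -->
<l>Not these&#8212;O none of these more</l>
<!-- Whitman's "=" hyphen is keyed as &#8209;. -->
<l>down&#8209;balls</l>
<!-- Hypothetical phrase: the ampersand is keyed as &#38; even though it is on the keyboard. -->
<l>night &#38; day</l>
```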

3.8 Graphically Distinctive Text

Underlined words: Underlined words require the "rend" attribute with the "underline" value. The "rend" attribute is global (can be used on any element), but typically you will use it with a <head> element, the <signed> element, or, if the underlined words are in the middle of a line, the <hi> element. However, if an entire line or linegroup is underlined, the attribute can be used on <l> or <lg>. For example, in this manuscript the underlined words in the first line would be encoded like this: <hi rend="underline">the necessity of</hi>

Dotted-underlined words: Occasionally, you may encounter a manuscript with a word that has been deleted and underlined with a series of dots. This is a printer's mark for "I don't want to delete this word after all; please leave it in," or, "stet." To handle these instances, which are rare, we surround the dotted-underlined word and the <del> with the <restore> element and use the "rend" attribute. For example, in this manuscript the dotted-underlined words are encoded like this:

<app>
<rdg varSeq="1">
<del type="overstrike">baleful</del>
</rdg>
<rdg varSeq="2">
<add place="supralinear" type="unmarked">
<restore rend="dotted"><del type="overstrike">mortal</del></restore>
</add>
</rdg>
</app> coals,

Line indentation: Though in most cases Whitman begins lines at the left edge of the writing space, he sometimes uses line indentation in distinctive ways, as in this copy of "O Captain! My Captain!" To encode this indentation, we add the "rend" attribute to the <l> element. The value of "rend" is "indented" plus a number that indicates the relative length of the indentation. For the shortest indentation, we use "indented1"; for the longest, "indented4."
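
A minimal sketch of staggered indentation, using hypothetical lines in place of a real transcription. The numbers are relative, so the encoder has to judge each line against the others on the same page.

```xml
<!-- Hypothetical lines; a line with no rend attribute begins at the left edge. -->
<l>A line that begins at the left edge</l>
<l rend="indented1">A line set in slightly</l>
<l rend="indented2">A line set in further</l>
```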

Reference Chart for Use of the "Rend" Attribute

Value of 'rend' attribute | Function in Whitman
underline | indicates underscored text
circled | used when text, typically within a note in the margin, is surrounded by a circular line in order to separate it from other text
bracketed | used in <title> within the <titleStmt> to distinguish derived titles
italic | used only in transcriptions of printed material or in project notes to mark titles of books
dotted | used with <restore>
indented1, indented2, indented3, indented4 | added to <l> when Whitman uses staggered indentation at the beginnings of lines; the numbers indicate relative amount of indentation (1=shortest, 4=longest)
horbar-full, horbar-short-right, horbar-short-left, horbar-short-center | used in <milestone> to indicate various positions and lengths of horizontal separators

3.9 Signatures and Dates

  • Bylines

    Bylines that immediately follow the title should be encoded using the byline element. Insert it after the head element and before the first <l>. The byline shown here is encoded as follows:

    . . .
    <head type="main-authorial" rend="underline">Up, lurid stars!</head>
    <byline><hi rend="underline">By Walt Whitman</hi></byline>
    <l>Up, lurid stars! martial constellation!</l>
    . . .

  • Signatures at the Bottom

    A signature that comes after the last line should be marked with the signed element. In order to properly unite the signature with the poem being signed, <signed> is included within <closer>. The <closer> ought to close before <lg1> closes.

    The following example is based on this manuscript.

    . . .
    <seg><add type="insertion" place="supralinear">scented</add> roses blooming.</seg>
    </l>
    <closer>
    <signed rend="underline">Walt Whitman</signed>
    </closer>
    </lg1>
    </body>
    </text>
    </TEI.2>

  • Dated Manuscripts

    A date which Whitman has written on a manuscript in order to note the composition date or occasion date (as when he writes on Washington's birthday or on the death of General Sheridan) is encoded within the <dateline> tag. <dateline> can occur wherever Whitman has written the date, typically either after the <head> or within the <closer> after the signature.

    Each time you use a <dateline> tag, you also need to use the <date> tag with the value attribute. The <date> element should contain only the date. The value of the value attribute is the normalized date, in the form YYYY-MM-DD. If, as in the example below, you need to encode a date range, the value attribute will have both dates, separated by a slash.

    The following example is based on this manuscript.

    . . .
    <lg1 type="poem">
    <head type="main-authorial" rend="underline">The Sobbing of the Bells&#8212;</head>
    <dateline>(Midnight <date value="1881-09-19/1881-09-20">Sept: 19-20 1881</date>)&#8212;</dateline>
    <milestone unit="undeclared" rend="horbar"/>
    <l><seg>The sobbing of the bells, the sudden death&#8209;news</seg>
    <seg>everywhere</seg></l>
    . . .
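
For comparison with the date range above, a single date might be sketched like this. The wording and the date are hypothetical and are meant only to show the YYYY-MM-DD form of the value attribute.

```xml
<!-- Hypothetical dateline; the value attribute holds the normalized date. -->
<dateline><date value="1888-02-22">Feb: 22 1888</date></dateline>
```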

When is a date a <dateline> and when is it a <note>?: We use the <dateline> element to note a date that is to be published with the poem and is part of the poem's meaning. As mentioned above, Whitman most often inserted these sorts of dates under the head or under his signature. On other manuscripts, Whitman has written a date that is not a <dateline> but instead is to be treated as a <note> on the page. Most of the time, these <note>s will be distinguished by their placement on the manuscript page (in a margin, a corner, or otherwise beyond the layout of the poem proper) and by their ambiguous relationship to the lines of poetry.


3.10 Other Writing in Whitman's Hand


Notes

You will sometimes come across writing on the manuscript page that is not part of the text of the poetry manuscript proper, but instead a note of some sort about it. For example, this note follows a poem, at the bottom of the page. This sort of material is encoded using the <note> element. The <note> element takes two required attributes: type and place. To distinguish the writing as Whitman's, use the value "authorial" for the type attribute. For the place attribute choose from the following values: "margintop," "marginbot," "marginleft," "marginright," "inline," "supralinear," or "interlinear."

The example should be marked up as follows:

<note type="authorial" place="marginbot">
  <app>
    <rdg varSeq="1">
      <del type="overstrike">sent to</del>
    </rdg>
    <rdg varSeq="2">
      <add type="unmarked" place="supralinear">pub in</add>
    </rdg>
  </app>
Herald early in Feb. '88
</note>

Sometimes, Whitman will visually separate his notes from the rest of the text by drawing a boundary line, as in this example. When this happens, you need to add the rend="circled" attribute and value to <note>. (The value circled is used even though Whitman's boundary line often does not make a proper geometric "circle"). The encoding for the example would read:

<note type="authorial" place="margintop" rend="circled">follow copy strictly</note>

Note that <p> is only used within <note> when there are multiple paragraphs within the note.

Reverse-Side Notes

Occasionally, Whitman will write a prose note about the poem on the reverse-side of the manuscript leaf, such as a note to the printer or a comment on the poem's placement in a larger work. These notes, though they are on the reverse side, are encoded basically the same way as the notes described above are encoded, with a few minor adjustments:

  • The <note> tag is inserted after the <pb> tag that identifies the verso (i.e., the one with "id='leaf01v'")
  • The value of the place attribute is "inline"
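
Put together, the two adjustments might look like the sketch below. The note text is hypothetical, and the id value simply reuses the "leaf01v" pattern mentioned in the first bullet.

```xml
<!-- Hypothetical note text; the note follows the page break for the verso. -->
<pb id="leaf01v"/>
<note type="authorial" place="inline">A hypothetical instruction to the printer</note>
```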

Unrelated Reverse-Side Writing

If reverse-side writing is in Whitman's hand, we encode it, regardless of its content. You will need to create a new <head> and, therefore, a second <title> in the <titleStmt> to deal with the separate intellectual unit on the reverse-side. It should be encoded to the same level as any other Whitman manuscript.

Miscellaneous Writing

The note element is also used to mark other writing on the page that, while not strictly a note, is not part of the text. Examples include: page numbers, addition or subtraction problems, and question marks.

Whitman's use of proofreading/typesetting marks is a special case. To encode a manuscript that has such marks, you should first decide whether the marks are being used to indicate a) changes to the base text, or b) emphasis of a typographic feature.

It is our current policy to encode only marks of the first type. Examples include the caret ( ^ ) to indicate an addition, and the paragraph mark (¶) to indicate a new paragraph (in prose) or line (in verse). For instructions on encoding the caret, see 3.3 Additions. The paragraph mark should be encoded as a named character entity—&para;.

At present we have chosen not to encode marks of the second type, though we may in a future stage of the project return and add representations of them to our markup. Examples of marks that you should not encode include triple-underlining to emphasize capitalization and horizontal curved brackets used to indicate lack of spacing between parts of hyphenated words.

Horizontal Lines

Whitman fairly often draws a line to signal the beginning or the end of a unit of text. These lines range from full page width to small bars at the left, right, or in the center. You can see an example of a small center line at the bottom of this manuscript and an example of a small line at the left at the top of this one. We take these lines to indicate some kind of division, though we make no claims about the sort of unit(s) they define. They are encoded using the empty milestone element, with "undeclared" as the value of the unit attribute and the following as possible values of the rend attribute:

  • horbar-full
  • horbar-short-right
  • horbar-short-left
  • horbar-short-center

The first example above should be encoded as follows:

<!-- markup here is simplified -->
<seg>We never separate again.&#8212;</seg></l>
<milestone unit="undeclared" rend="horbar-short-center"/>
</lg1>

Brackets

Whitman sometimes used brackets to group lines or other bits of text, as in this example. These should be indicated by using the span element. Like <addSpan> and <delSpan>, <span> is an empty element (i.e., consists of a single open tag), but it works in a slightly different way. In addition to the to attribute, from and value are also required. Assign the span an id (use the formula s+number) and give the "from" the same id. (This, in essence, means "start from here.") The value of the value attribute will always be "bracketed." And, as with addSpan and delSpan, you must always mark the end of the bracketed section with an empty anchor element bearing an id attribute whose value corresponds with the value of the <span>'s to attribute. This is the markup for the example above:

<span value="bracketed" id="s1" from="s1" to="s2"/>
<l>I rate myself high&#8212;I receive no small sums;</l>
<l>I must have my full price&#8212;whoever enjoys me.</l>
<anchor id="s2"/>

Page Numbers

Whitman's use of page numbers—combined with the history of manuscript dispersal—means we are left with both ambiguous and reliable page numbering on Whitman manuscripts. By "ambiguous," we mean manuscripts with a number, like "43," written in the top corner but no corresponding "42" or "44"; by "reliable," we mean multiple-leaf manuscripts with an ordered numbering of each leaf (ordered numbering does not mean an uninterrupted sequence that begins with "1"; instead, it means any discernable numbering system that reliably determines the leaf order).

We handle these two types of page numbers in different ways. For ambiguous numbers, we use <note>. For reliable page numbering, we add an attribute to the Page Break, or <pb> element. Specifically, we add an "n" attribute with a value that corresponds to the number written on the page. So, if a three-leaf manuscript is numbered "2," "3," "4," then the <pb>s would have n="2", n="3", and n="4".
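
The two treatments can be sketched side by side. In the first fragment the leaf contents are omitted and any other attributes a <pb> would normally carry are left out for brevity. In the second, the type and place values assume that the stray number is in Whitman's hand and sits in the top margin; adjust them to what the manuscript shows.

```xml
<!-- Reliable numbering: a three-leaf manuscript numbered 2, 3, 4. -->
<pb n="2"/>
<!-- transcription of first leaf -->
<pb n="3"/>
<!-- transcription of second leaf -->
<pb n="4"/>
<!-- transcription of third leaf -->

<!-- Ambiguous numbering: an isolated "43" in a top corner, recorded as a note. -->
<note type="authorial" place="margintop">43</note>
```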

Section Numbers

Sometimes you will encounter a manuscript with distinctly numbered sections, as in this example. These sections are different from linegroups, as they are typically numbered or otherwise clearly marked, and they often contain multiple linegroups. To handle section numbers, add a <head> tag immediately after the <lg> tag to note the "head" of that section. Here's how you would encode the example manuscript:

<!-- this markup is simplified -->
...
<lg1 type="poem">
<head ... >
<lg2 type="section">
<head type="main-authorial">1</head>
<lg3 type="linegroup">
<l>Come, said the Muse,</l>
....
</lg3>
<lg3 type="linegroup">
...
</lg3>
<lg3 type="linegroup">
</lg3>
</lg2>
<lg2 type="section">
<head type="main-authorial">2</head>
...
</lg2>
...
</lg1>

Intentional Inline Spaces

Whitman will occasionally leave a blank space within a line of poetry, apparently making room for the perfect word that he has yet to discover. To encode these spaces, insert a <space> element with two attributes, dim (dimension), the value of which will almost always be "horizontal"; and extent, the value of which is expressed as a number of letters, determined by the size of the letters surrounding the space on the manuscript. The encoding for this manuscript would look like this:

action, <space dim="horizontal" extent="9 letters"/>in husky

In other cases, Whitman will leave a blank line that indicates the intentional blank space (apparently in addition to his major poetic innovations, Whitman also developed the Mad Lib). To encode this phenomenon, use the same strategy as above, but add a "rend" attribute with the value "underline." Therefore, this example would be tagged like this:

. . . .
<seg>As in Visions of <space dim="horizontal" extent="7 letters" rend="underline"/> at</seg>
<seg>night&#8212;</seg>


3.11 Writing in Others' Hands


Writing on the Front of the Leaf

Use the note element to mark letters, words, etc. that someone other than Whitman has written on the manuscript page you are encoding. As explained in section 3.10 above, this element requires both type and place attributes. For the value of type use "editorial." The possible values of the place attribute are the same as for other notes: "margintop," "marginbot," "marginleft," "marginright," or "interlinear." In addition, non-authorial notes should be given a resp attribute, with a two- or three-letter code to identify the responsible hand. Hands to which codes have been assigned are listed in the following table. Should you encounter a hand not accounted for, please email Brett Barney and Andy Jewell.

Name              primary location of mss                     code
Bowers, Fredson   UVa                                         fb
Traubel, Horace   Library of Congress, Feinberg Collection    ht
unknown           n/a                                         unk

This note, written by Traubel in the bottom right corner of a poem manuscript, is encoded as follows:

<note type="editorial" resp="ht" place="marginright">
For Francis Howard Williams May 1896 Traubel
</note>

Resp Values

Any time the resp attribute is used (whether on <note> or on <unclear>, etc.) in the body of a document, you must also include in the header some information that explains each of the values you've used. To do this, add a profileDesc (profile description) element directly before the <revisionDesc> (revision description). Inside the <profileDesc>, create a pair of handList tags and one hand element for each different value given to a resp attribute in the file. hand has two required attributes: scribe and id. The value of scribe is the full name of the person being identified as responsible for the written note. The value of id corresponds with the value assigned to the resp attribute in the body. The following excerpt is taken from the file that contains the note discussed above:

<profileDesc>
<handList>
<hand scribe="Horace Traubel" id="ht"/>
</handList>
</profileDesc>
<revisionDesc>
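
When a file records more than one non-authorial hand, the <handList> presumably grows by one <hand> per resp value. The following is a sketch rather than an excerpt from an actual file: it assumes a hypothetical manuscript annotated by both Traubel and Bowers, using the codes from the table above, and the note text is a placeholder.

```xml
<!-- In the header: one <hand> for each resp value used in the body -->
<profileDesc>
<handList>
<hand scribe="Horace Traubel" id="ht"/>
<hand scribe="Fredson Bowers" id="fb"/>
</handList>
</profileDesc>
. . .
<!-- In the body: each resp value matches an id declared above -->
<note type="editorial" resp="fb" place="margintop">[hypothetical note text]</note>
```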

Writing on the Back of the Leaf

Most cases of reverse-side writing in others' hands arise from Whitman's re-use of paper. Common examples include envelopes, fan letters, and government stationery. For such situations, we use the note element with the value "project" on the type attribute to describe (not transcribe) the non-authorial writing. Unlike "authorial" and "editorial" <note>s—which should be transcribed as close as possible to their position in the manuscript—these notes must be placed within the <teiHeader>. To do this,

  1. insert a <notesStmt> immediately before the opening source description tag (<sourceDesc>);
  2. inside the <notesStmt> create a <note> with no place attribute, the value "project" on the type attribute, and the additional attribute target;
  3. find the page break that indicates the beginning of the reverse side;
  4. copy the value of the page break's id attribute as the target attribute on the project note;
  5. write a short description of the reverse-side writing.

Note: You need not use <p>s to enclose these project notes, unless you want to write more than one paragraph of description. Here's how the envelope example above should be encoded:

<teiHeader>
<fileDesc>
. . .
<notesStmt>
<note type="project" target="leaf01v">Verso of manuscript leaf is addressed to Walt Whitman, Camden, New Jersey, postmarked September 25, 1890.</note>
</notesStmt>
<sourceDesc>
<bibl>
<author>Walt Whitman</author>
. . .
</teiHeader>
. . .
<pb corresp="loc.00046.001" id="leaf01v" type="verso"/>
. . .

Other Cases


Materials which accompany manuscripts (notes, transcriptions)

For now, we have decided not to transcribe or encode any of the items not written by Whitman—for example, notes about the text or transcriptions—that are sometimes stored with manuscripts. Eventually, these may be encoded as separate documents and linked to the relevant manuscript, but this work will happen at a later time.

Pasted clippings

See section 3.12 below for a discussion of how to encode clippings that Whitman has incorporated into his own manuscripts.


3.12 Cutting and Pasting

We distinguish among three sorts of pasting in Whitman's manuscripts. Please look over the description of each and decide which describes the instance you're encoding.
  1. One page pastes over another, deleting old material and adding new. (See 3.4, "Additions and Deletions in Combination.")
  2. Paper has been pasted together to provide more writing space. (See below.)
  3. Whitman pastes a clipping onto his manuscript. (See below.)

Pasting that Extends the Writing Area

To represent the seam of the two pages that have been joined, use the (empty) milestone element. Include the unit attribute with the value "glued." This example shows a manuscript that calls for such markup.
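
A minimal sketch of how the seam might be marked, assuming a hypothetical manuscript in which a second piece of paper has been glued below the first. The line content is a placeholder; only the <milestone> element and its unit value come from the description above.

```xml
<l>[last line written on the upper piece of paper]</l>
<!-- seam where the two pieces of paper have been joined -->
<milestone unit="glued"/>
<l>[first line written on the lower piece of paper]</l>
```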

Clippings Pasted to the Manuscript

Sometimes Whitman pastes others' material to his manuscripts. For example, in "Ashes of Roses," Whitman has pasted a newspaper clipping in the lower left-hand corner of the first leaf. In these cases, use the <add> element and its available "type" attribute (if a manuscript's peculiarity requires it, as when pasted-on material begins in the middle of an <l> and crosses <l> boundaries, you may also use <addSpan>). In this particular case, the author of the newspaper clipping is "unknown."

<add place="marginbot" hand="unk" type="pasteon">
<p>Are we to have a National Hy<supplied source="Library of Congress transcription">
mn by <orig reg="Centennial">Cen-</orig></supplied> tennial time?</p>
</add>

Please remember that hands other than Whitman's must be declared in the <profileDesc>.


3.13 Page Breaks

Page Breaks (<pb>) are inserted in the encoding whenever you begin the transcription of a new page (including the first one). Use <pb> tags in every document, even one that is only a single page long. <pb> is an empty tag, which means that you never need to "close" <pb>, but just insert a "/" at the end of the tag.

Each <pb> tag has three required attributes, "corresp," "id," and "type". The "corresp" attribute indicates the file that contains the page image, so you'll need to assign, as its value, the unique id with a three-digit suffix that indicates the page number (.001, .002, etc)—the image files will be given this name before they are mounted on our site. The "id" attribute identifies the page by "leaf" (or piece of paper) number and side—"r" for "recto" (front) or "v" for "verso" (back). The "type" attribute classifies the page as either "recto" or "verso." The "id" value must always end in either "r" or "v"—even if there is only one image. When there is only one image, the "id" value will almost always be "leaf01r."

Here is an example of what the <pb> tag looks like:

<pb corresp="loc.00008.001" id="leaf01r" type="recto"/>

The first <pb> tag goes after the <body> tag and before the first <div> or <lg>. If there are multiple pages, i.e., more than one corresponding image, simply insert a <pb> at each place in the encoding that corresponds to the beginning of a new page. Often, these will occur at the close of one linegroup (</lg1>) and before the opening of another (<lg1>). Or, commonly, you will need to include a <pb> to indicate untranscribed verso material; this should be done after the <lg> or <div> closes but before the <body> tag closes.

How to Handle Unusual Document Order: In some instances, Whitman has written a single poem on the rectos of several leaves that also have poetic lines on the verso that are not part of the same poem. In this case, you must encode in a way that preserves the intellectual unity of the poem on the rectos. To do that, you will have to break the typical order of <pb> "id" values. That is, instead of "leaf01r," then "leaf01v," "leaf02r," "leaf02v," etc., encode the pages in an order that preserves the integrity of each poem. For example, if you have a manuscript with a poem written across the rectos of three leaves and other poetic lines written on the versos of leaves 1 and 3, the <pb> tags will have id attributes ordered like this: leaf01r, leaf02r, leaf03r, leaf01v, leaf03v. It is done this way to ensure that the material on the rectos of leaves 1-3 is all contained within the same <lg1 type="poem">.
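
The ordering described above might be sketched as follows. This is an illustration only: the "loc.00000" identifier and the three-digit suffixes on corresp are placeholders (the actual suffix depends on how the page images are numbered), and the handling of the verso lines after the poem closes is an assumption.

```xml
<body>
<pb corresp="loc.00000.001" id="leaf01r" type="recto"/>
<lg1 type="poem">
<l>. . .</l>
<pb corresp="loc.00000.003" id="leaf02r" type="recto"/>
<l>. . .</l>
<pb corresp="loc.00000.005" id="leaf03r" type="recto"/>
<l>. . .</l>
</lg1>
<!-- versos follow only after the poem on the rectos has closed -->
<pb corresp="loc.00000.002" id="leaf01v" type="verso"/>
. . .
<pb corresp="loc.00000.006" id="leaf03v" type="verso"/>
. . .
</body>
```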

Remember: For every <pb> you insert, you need to insert an Entity Declaration in the header.


3.14 Encoding Corrected Proofs

Many manuscripts in various collections combine printed text and handwritten correction, as in this example. We have developed an encoding procedure for these manuscripts that distinguishes between the two types of text (printed and handwritten).

  1. All "mixed media" mss of the kind described are assigned "prepub-proof" or "postpub-proof" as the value of the type attribute in the <text> element. Pre-publication proofs are the most common: typically, they are detached sheets of paper with a typescript rendering of a poem. Post-publication proofs are what we are calling Whitman's hand-revised copies of published works. So, for example, if a copy of the 1876 edition of Leaves served, after being annotated, as the printer's copy for the 1881-1882 edition, we would call that a post-pub proof. Note that we are using the word "proof," in a way that is broader than is usual in the publishing world, to describe the proof-like functioning of a document.
  2. The base text for these manuscripts is assumed to be printed, so we explicitly declare the medium of only the handwritten <add>s, <del>s, and <note>s.
  3. We use the hand attribute on <add>, <del>, and <note> to record handwritten bits.
  4. The handwritten "hand" is declared in the <teiHeader>'s <profileDesc> in the same way that we're already doing it for non-Whitman writing on mss. (E.g., when we have <note type="editorial" resp="ht" place="marginright"> for one of Traubel's ms. annotations, the <profileDesc> has <hand scribe="Horace Traubel" id="ht">). But instead of putting the elaborated description in the scribe attribute, we put it in an "ink" attribute.
  5. Consonant with our practice of letting pass distinctions between colors of ink or between ink and pencil, all handwritten bits will share the same value for the hand attribute.
  6. Whitman often changes the proof inline and also adds a marginal note, as when he adds a comma inline and puts "<," in the margin. In these cases, do not double-encode his corrections. Marking the addition inline is sufficient.
  7. When Whitman notes the insertion of space with a "#", use <add> with the <space> element, noting whether or not it is vertical or horizontal space (in proofs it will most often be vertical), and noting the approximate size of the space, using "lines" as the measuring unit (relative to surrounding line-heights). An example would be:
    <space dim="vertical" extent="1 line"/>
  8. When Whitman uses a curled line to correct inverted letters or words, use the <transpose> element to note his re-ordering. For example, a transposition marked in a corrected proof would be encoded this way:
    b<anchor id="t1"/>a<transpose hand="h1" anchored="yes" target="t1">e</transpose>rd

Whitman will often use specialized proofreading marks to note common changes, as when he uses a triple-underline to note his desire to capitalize an uncapitalized word. In these cases, you encode using an <app> structure and "edit" as the value of the type attribute on <add> and <del>. For example, a triple-underlined lowercase "l" would be encoded like this:

<app><rdg varSeq="1"><del type="edit">l</del></rdg><rdg varSeq="2"><add type="edit">L</add></rdg></app>


We use a controlled vocabulary for both the value of the hand attribute on <add>, <del>, and <note>, and for the ink attribute on <hand>: ink="handwritten"; hand="h1". An example follows.

<profileDesc>
<handList>
<hand id="h1" ink="handwritten"/>
</handList>
</profileDesc>
. . .
<text type="prepub-proof">
<body>
. . .
<add type="insertion" place="supralinear" hand="h1"> . . .</add>
. . .
</body>
. . .

3.15 Encoding Prose

Even though we are focusing on Whitman's poetry, the manuscripts will sometimes contain prose that you will need to represent. You might encounter prose in Whitman's poetry manuscripts in a few different ways:

  1. Prose notes about the verse on the same leaf:
    Sometimes Whitman will have lines of prose on the same leaf that he has used for poetic composition. This is usually deemed a mixed genre manuscript and requires the "poem notes" value for the <div1> type attribute. Occasionally, though, the prose will be a note about the poem. In that instance, consult the section in the guidelines about authorial notes.
  2. Prose with imagery or language that was later incorporated into a poem, or prose notes about an idea for a poem:
    For prose writings that relate to poetic work, you will simply encode the manuscript as a prose only document.
  3. Prose unrelated to the poetic lines on the leaf:
    Prose unrelated to the verse on the leaf, which almost always is on the verso of the manuscript, is discussed in the Unrelated Reverse-Side Writing section of the guidelines.

Transcribed prose is always encoded in the same way: the text is surrounded by <p> tags and all line breaks and line segments are ignored.
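
As a sketch of what this means in practice, suppose (hypothetically) that a sentence of prose occupies two written lines on the leaf. The wording below is a placeholder rather than a transcription of any manuscript.

```xml
<!-- The break between the two written lines is not recorded, and no <seg> elements are used -->
<p>[words on the first written line] [words on the second written line]</p>
```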


3.16 Encoding Lists

Whitman sometimes made lists of words or phrases that led, in one way or another, to his poems; for example, this list has been traced to Whitman's poem "When Lilacs Last in the Dooryard Bloom'd."

To encode these lists, we use the <list> element, which contains the <item> element. The structure is pretty straightforward: at the beginning of the list, open the <list> element; then, if appropriate, insert a <head> tag; finally, encode each item on the list with an <item> tag, which is nested within <list>. The following sample encoding is based upon the manuscript example above:

. . .
<list>
<item>sorrow (saxon)</item>
<item>grieve</item>
<item>sad</item>
<item>mourn (sax)</item>
. . .
</list>


3.17 Manuscripts That Are Neither Poetry Nor Prose




Encoding Lines that are Neither Verse nor Prose

Occasionally, you will run across a manuscript with lines that appear to have all the layout qualities of verse (hanging indentation, initial capitalization, etc.) but otherwise seem like prose notes about an idea for a poem (for an example, click here). Rather than imprecisely calling such manuscripts either "prose" or "verse," we have developed a method that acknowledges the indeterminacy of the genre. Specifically, we use the "anonymous block," or <ab> element in place of either <l> or <p> and mark line segments. Also, the <div1> type is the same we use for "mixed genre" manuscripts, "poem notes." Here's an example of how the encoding would look (click here to look at the manuscript):

<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="poem notes">
<head type="main-authorial">incidents for (Soldier in the Ranks)</head>
<milestone unit="undeclared" rend="horbar"/>
<ab>
<seg>describe a group of men coming off the</seg>
<seg>field after a heavy battle, the grime,</seg>
[. . .]
</ab>

**IMPORTANT: We don't want to use this tagging indiscriminately; it should only be used when the editors cannot decide if material is prose or poetry. Therefore, if you come across a manuscript that you think fits the description above, please consult with either Ken or Ed before you use the <ab> markup.

Encoding Title Pages

If you encounter a manuscript that has only titles on it, like this one, please use this variation on the <ab> tagging described above (note the different <div1> type and attributes for the <ab> elements):

<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="title notes">
<ab type="title" rend="underline">American to Old-World Bards</ab>
<ab type="subtitle" rend="underline">A reminiscence from reading Walter Scott</ab>
</div1>

**NOTE: Only use "rend='underline'" if the title is indeed underlined as it is in this example. Otherwise, omit that attribute.


3.18 Enigmas


What not to Encode

Although we are attempting to accurately encode all of the important aspects of Whitman's texts, absolute comprehensiveness is not our goal. Specifically, we have decided, for now at least, to ignore

  • ink blots, smudges, and stray pen marks;
  • pin holes;
  • embossing;
  • variations in ink, pencil, or paper;
  • distinctions between single and multiple overstrikes

Things not Covered by the Guidelines

The encoding practices articulated in these guidelines have evolved over the last several years, and that evolution has mainly been driven by the needs of encoders, editors, programmers, and consultants as we have worked to create and deliver electronic transcriptions of individual manuscripts. When we first began, every manuscript presented a host of new challenges that had to be addressed before the encoding could be completed. Naturally, the pace of change has slowed as the number of encoded manuscripts has grown. Even so, new puzzles continue to present themselves occasionally, and you should not be unduly disturbed if you find yourself in an encoding dilemma for which the guidelines seem to give no guidance. In such cases, you should write or call Brett Barney or Andy Jewell as soon as possible and explain as clearly as you can the nature of the difficulty. If the problem is indeed new, you may be asked to draft a summary of the issues to be sent to the members of the listserv.

Dealing with Uncertainty

Encoding manuscript materials is difficult for a number of reasons, and you will no doubt sometimes feel confused or indecisive. If the difficulty is one of reading or interpreting, consult with Ed Folsom or Kenneth Price. If, instead, the problem has to do with markup, consult with Brett Barney or Andy Jewell.

If the problem is relatively minor, in that it doesn't prevent your continuing to work on other parts or aspects of the manuscript, you might decide to handle the problem in the best way you know how and leave yourself and the editors a detailed comment—in the file itself—about the problem and how you've handled it. Then you and they can return to it later. Do this by writing the comment and then surrounding it with these characters:

<!--                  -->
This sequence of characters signals the computer's SGML processor to ignore everything that comes between the first and last marks.

Alternatively, you can write your comment and wrap it in a <what> element. This is an element we have borrowed from the Brown Women Writers Project, and it can be used in essentially the same way as the "commenting out" convention explained above.
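
Either convention might look something like the following sketch. The wording of both comments is invented for illustration, and where a <what> element may appear is governed by the project DTD, so its placement here is an assumption. Note also that a comment of the first kind cannot itself contain two hyphens in a row, a restriction of the comment syntax rather than of these guidelines.

```xml
<!-- Encoder's query: unsure whether the second word in this line is "soul" or "soil"; transcribed as "soul" for now -->
<l>. . .</l>

<what>Encoder's query: cannot tell whether this deletion was made before or after the addition above it.</what>
```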



3.19 Work Relationships and Date Information

*this encoding is typically inserted by upper-level staff people and editors*

As part of the header, we encode the relationship of the individual manuscript to a "work" (or "works"). As opposed to a document, which is a particular instantiation of a poem or book, etc., a "work" is the abstract idea of a poem or book, etc. We name the work according to the last instance published in Whitman's lifetime. For example, the work "Song of Myself" refers not to any particular manuscript or printed version of that poem, but to all of the versions collectively. Individual documents of that work include: the poem printed in the "deathbed edition," titled "Song of Myself"; the first, untitled version of the poem in the 1855 edition of Leaves of Grass; manuscript drafts of lines included in the poem; and notebooks that contain ideas and trial phrases that contributed to the composition of the poem.

We encode this work relationship at the beginning of the transcription file, immediately before the <teiHeader> element. The following example is taken from the transcription of this manuscript:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE TEI.2 PUBLIC "-//UVA::IATH//DTD whitman.dtd (Whitman Archive)//EN" "whitman.dtd" [
<!ENTITY loc.00213.001 SYSTEM "loc.00213.001.jpg" NDATA jpeg>
<!ENTITY xxx.00358 SYSTEM "xxx.00358.xml" NDATA xml>]>
<TEI.2 id="loc.00213" type="doc">

<relations>
  <work entity="xxx.00358" cert="high">
    <p>This manuscript is a draft of "Life and Death," which was published first in the New York
      <hi rend="italic">Herald</hi>, <date value="1888-05-23">May 23, 1888</date>.
    </p>
  </work>
</relations>

<teiHeader>
. . .

To encode the work relationships, we must first look up the work ID. A table of work IDs can be found here in the reference section of the Encoding Guidelines. The ID, which is a string of characters beginning "xxx" and ending with a five-digit number, corresponds to a work file which will contain prose descriptions of the compositional history of the work as well as connect the transcription files with other elements of the Whitman Archive. The ID is inserted in two places: in the entity declaration and as the value of the "entity" attribute in the "work" element.

In addition to this ID, project editors also assign either "high" or "low" as the value of "cert" to describe their confidence in connecting the individual manuscript to the work. For a manuscript with lines that are identical or very close to a published poem, the certainty will be "high"; notes that describe an idea in a way that bears a general resemblance to a published poem will get a "low" certainty.

The final part of the <relations> section is a brief prose description of the publication history of the work. This editorial note will be displayed along with the transcription on the site. If there is something distinctive and noteworthy about the manuscript, the editor may also insert a project note within the <notesStmt>.

A new <work> element, with a prose description, is used for every work related to the document. All dates within the prose description are tagged with a <date> element with a "value" attribute that records the date in the form YYYY-MM-DD (or, if appropriate, just YYYY). Any titles that are normally italicized need to be marked with <hi rend="italic">.
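
For a document related to more than one work, the <relations> section might be sketched as follows. The work IDs, titles, and dates here are placeholders, not real entries from the table of work IDs, and each ID would presumably also need its own entity declaration.

```xml
<relations>
  <!-- one <work> element for each related work -->
  <work entity="xxx.00001" cert="high">
    <p>[Prose description of the first work, published in
      <hi rend="italic">[Title]</hi>, <date value="YYYY-MM-DD">[Month Day, Year]</date>.]
    </p>
  </work>
  <work entity="xxx.00002" cert="low">
    <p>[Prose description of the second work, published in <date value="YYYY">[Year]</date>.]</p>
  </work>
</relations>
```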

Dating the Manuscript

We have recently begun inserting information in the transcription files that will allow us both to sort manuscripts by composition date and to provide users with a note about that date. Here is an example, taken from the same manuscript transcription as the sample above:

. . .
<notesStmt>
  <note type="project" target="dat1">This manuscript was probably composed in the spring of
    <date value="1888">1888</date> shortly before it was published.
  </note>
</notesStmt>
<sourceDesc>
  <bibl>
    <author>Walt Whitman</author>
    <title>Life and Death</title>
    <date certainty="high" value="1888" id="dat1">1888.</date>
    <orgName>The Charles E. Feinberg Collection of the Papers of Walt Whitman, 1839-1919,
      Library of Congress, Washington, D.C.
    </orgName>
    <note type="project">Transcribed from our own digital image of original manuscript.</note>
  </bibl>
. . .

There are two major steps to inserting the dating information. The first is the insertion of the <date> element within <bibl>. Within this element we insert either the year of composition ("1888"), the year plus qualifying language ("About 1888"), a date range ("1867-1888"), or whatever brief description of the date is appropriate. (We have decided to only list the year in this space, even when month and day information is available.) The <date> element takes three required attributes: "certainty," "value," and "id." The value of "certainty" can be low (when we only have a conjectural date to offer), high (when we are fairly certain about the composition date), or absolute (when Whitman dated the manuscript or there is some other very conclusive evidence of its composition date). The value of "value" is a regularized date, written as "YYYY" or, in the case of a range, "YYYY/YYYY" (the first year represents the earliest year in the range; the second year represents the latest year). The value of "id" is always "dat1."
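
As an illustration of the range form, a manuscript dated to the span mentioned above might be recorded along these lines. The choice of "low" certainty is an assumption made for the sake of the example.

```xml
<!-- value gives the earliest and latest years, separated by a slash -->
<date certainty="low" value="1867/1888" id="dat1">1867-1888</date>
```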

The second step is to insert a prose note within the <notesStmt> that describes the reasoning behind our dating of the manuscript. This <note> takes two attributes: "type='project'" and "target='dat1'" (required). The "target" attribute allows us to associate the dating information with the <date> in the <sourceDesc>. The prose note will display below the transcription, giving users fuller information about when Whitman wrote the manuscript.

The Template

The template is a downloadable file to help you complete the consistent components of the encoded poetry manuscript, primarily the header. Though no template can exist that will provide all the necessary elements, we hope this file will help with important and complex parts of the encoding.

Using the Template



After you download the template file and open it in NoteTab or other software, you will see a document with tags, many of which require you to insert content. To help you visually identify the areas that require information, and to help explain the information that is needed, a convention called "commenting out" is used. To "comment" something out means to make the computer ignore it; in other words, it is a way to insert words that are to be read only by humans. A "commented out" section is contained within this structure: <!--   -->. Whenever that string of characters surrounds text, what it contains is not machine-readable. For example, if you are looking at an SGML file and you see "<!-- Insert your name here -->", you know that the words are meant for others on the project, not the computer.

Though the template is a helpful tool, avoid getting too dependent on it. The varying nature of Whitman's manuscripts means that no template will accommodate all the possible distinctions. You must consult the rest of the guidelines to make sure you encode all the necessary individual attributes of the specific manuscript you are working on.

To help you make sense of the template, we have created an Annotated Template as part of the Encoding Guidelines. This should help you both to understand why you are doing what you are doing and how to "fill in" the correct content in the template.

Staying Current



Though we don't anticipate needing to update the template too often, as discussions about best practices continue, we will undoubtedly alter our procedures in some fashion. To make sure that you are following the most up-to-date procedures, please ensure you have the most up-to-date template.

The date at the bottom of the page will let you know when the template was updated last, and we will send an email to the Whitman Archive listserv to alert everyone to the existence of the update.

Right-Click Here to
Download the Template (template.xml)


Click Here to
go to the Annotated Template


The template was last updated July 8, 2004.

References


Whitman Project Tag Library
      Tables showing all of the elements available, with summaries of where they may occur and their available attributes and values
 
"unScripting Whitman"
      Stacey Provan's guide to Whitman's handwriting
 
Library Codes
      Table showing the locations of Whitman repositories and the three-letter prefixes to be used for unique ID's and filenames
 
Preferred Citations
      Table showing the citation to be used for each repository (in <sourceDesc> and when creating EAD documents)
 
Unusual Characters and Marks
      Table showing the Unicode numbers for some characters that you can't find on your keyboard
 
Work I.D.s
      Alphabetical listing of Works that have been identified so far, with their corresponding ID's (for use in <relations>)
 
Links
      Links to the TEI Guidelines (online) and the TEI Consortium's homepage