About the Archive

Methodology and Standards

Editorial Policy Statement and Procedures

The Walt Whitman Archive is guided by pragmatic principles in its attempt to achieve two goals: (1) to edit the vast textual corpus Whitman produced and (2) to provide access to a wide range of related materials that shed light on his writings, including photos, reviews, translations, criticism of his writings and accounts of his life, finding guides to his manuscripts, and bibliographies of criticism. We try to achieve the best possible results within the constraints we face (technical, legal, financial); hence the pragmatism mentioned above.

Whitman's writings survive in many forms, including manuscript scraps, notebooks, periodical printings, and books. He worked primarily in print and manuscript forms, and we are, for better and worse, changing media when we represent his vast oeuvre using digital storage and transmission. We recognize that the Archive cannot serve all purposes, nor can all editorial goals be pursued. We explain our practices to assist scholars in making informed judgments about their own use of the Archive. Text encoding, though crucial to our editorial practice, is excluded from this overview. See "Encoding Guidelines" for a detailed description of our text encoding practices.

The Whitman Archive's approach to editing is to establish documentary texts rather than to reconstruct what are assumed to be authorially intended texts. We are concerned with historical and social aspects of the texts: how the texts made their way into the world and the multiple agents that brought them into being. The Whitman Archive confines any speculation about Whitman's intentions to editorial notes.

I. Purposes

While our work to date has emphasized Whitman's poetry manuscripts and the books he saw through the press, the Archive has expanded to include other primary materials including his correspondence, notebooks, and periodical publications. The Archive also publishes interviews, translations, reviews, criticism, bibliographies, and biographies. Throughout the Archive we strive to provide convenient access to accurate material.

For Whitman's poetry manuscripts, periodical printings of his poetry, and original printed volumes, we publish authoritative electronic versions (facsimile page images and transcriptions) with an emphasis on fidelity to original documents. For other materials, we do not provide facsimile page images of original documents but instead offer searchable electronic transcriptions.

II. Process

A. Poetry Manuscripts

We provide both high-quality facsimile images and transcriptions that emphasize the semantic content of Whitman's poetry manuscripts. Every step of the process—from deciding which pieces of paper together form the manuscript to transcribing handwritten marks into typographic letter forms—involves judgment. In general, intellectual continuity weighs more heavily in these judgments than does physical similarity or proximity, though we do not wholly discard such physical evidence as paper and the ordering of materials in repositories. We display text without correction or regularization (e.g., end-of-line hyphens and obvious misspellings remain); to aid searches, regularized forms are encoded but suppressed in the default display.
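The split between displayed and searchable forms can be illustrated with a minimal sketch. It assumes the pairing is recorded with the standard TEI elements <choice>, <orig>, and <reg>; the Archive's actual markup is described in its encoding guidelines and may differ. The sample line and the render helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. <choice>, <orig>, and <reg> are standard TEI elements,
# but their use here is an assumption, not the Archive's documented markup.
SAMPLE = "<l>A <choice><orig>recieved</orig><reg>received</reg></choice> word</l>"

def render(element, prefer):
    """Flatten an element to text, keeping only the `prefer` child of each <choice>."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == "choice":
            chosen = child.find(prefer)
            if chosen is not None:
                parts.append(render(chosen, prefer))
        else:
            parts.append(render(child, prefer))
        # Text that follows the child element belongs to the parent's content.
        parts.append(child.tail or "")
    return "".join(parts)

line = ET.fromstring(SAMPLE)
display_text = render(line, "orig")  # uncorrected form shown by default
search_text = render(line, "reg")    # regularized form a search index could use
```

Here display_text keeps the misspelling as it stands in the document, while search_text carries the regularized spelling, so a query for the standard form can still find the line.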

Our practice for transcribing prose commentary written on a Whitman poetry manuscript differs according to the perceived relationship of the prose and poetry. If the prose is by Whitman or intercedes within the text, it is transcribed. Otherwise, it is noted but not transcribed (for example, Whitman drafted many poems on the backs of envelopes addressed to him by autograph seekers; in these cases, we note this information without fully transcribing the address and whatever other non-authorial writing might be present). When we render our transcriptions on the web, transcribed non-authorial prose appears verbatim in the notes. Our description of non-authorial prose that we have not transcribed also appears in the notes.

We first obtain a high-resolution digital image (a TIFF scanned at 600 DPI or photographed at 3008 × 2000 pixels) and enter it into a tracking database. This high-quality image is the basis for the transcription. To prepare it for web presentation, the TIFF image is cropped and then used to derive three JPEG images of different sizes to accommodate various users' needs.
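As a rough illustration of deriving three sizes from one master image, the sketch below computes derivative dimensions that preserve the aspect ratio. The target widths and names are hypothetical, since the Archive does not state its JPEG sizes here, and the input uses the uncropped photographic dimensions only as an example.

```python
# Hypothetical target widths in pixels; the actual JPEG sizes are not stated in this policy.
DERIVATIVE_WIDTHS = {"thumbnail": 150, "medium": 800, "large": 2000}

def derivative_sizes(width, height, targets=DERIVATIVE_WIDTHS):
    """Return {name: (w, h)} scaled to each target width, keeping the aspect ratio."""
    sizes = {}
    for name, target_width in targets.items():
        # Never upscale: a derivative is at most as wide as the master image.
        new_width = min(target_width, width)
        new_height = max(1, round(height * new_width / width))
        sizes[name] = (new_width, new_height)
    return sizes

sizes = derivative_sizes(3008, 2000)
```

An image library would then resample the cropped TIFF to each computed size and save it as a JPEG; that step is omitted so the sketch stays free of third-party dependencies.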

Transcription and encoding constitute the second step, which is completed by two Whitman Archive staff members. A staff member transcribes the text and validates the markup against the Archive's encoding standard. As the transcription is prepared, it is added to the tracking database, and subsequent editorial interaction with the manuscript image and transcription takes place through the tracking database files. A more senior staff member or editor then reviews, corrects, and verifies the encoding and transcription by proofreading against the image. This reviewer flags textual cruxes, uncertain encodings, and similar problems for further review.

Then, in consultation with a senior staff member, one of the general editors of the Archive performs a complete review of each poetry manuscript. This review includes proofreading the transcription and, when possible, resolving cruxes and uncertainties. The general editor works with staff members to date each manuscript and determine its relationship to other manuscripts and published writings. This information is included as metadata, and the updated file is then approved for publication.

Next, a senior editor publishes the poetry manuscript on the test server, reviews the manuscript's digital publication form, and, if necessary, modifies stylesheets (XSLT) for optimal presentation. The manuscript is then made available on the public site.

Following initial publication, one of the general editors again reviews and proofreads the public version. If revisions are deemed necessary, the public version is altered and updated.

B. Poetry Editions

The electronic versions of the print editions of Whitman's poetry have been prepared in collaboration with several institutions. Our aim has always been to reproduce the original printed volumes accurately.

As part of Major Authors on CD-ROM: Walt Whitman, edited by Ed Folsom and Kenneth M. Price, Primary Source Media (PSM) transcribed all six American editions of Leaves of Grass, plus a seventh text, the so-called deathbed edition. Their transcriptions were entered into a proprietary Borland database. At the request of the Whitman Archive, the Electronic Text Center at the University of Virginia stripped out PSM's proprietary encoding and replaced it with Text Encoding Initiative (TEI)-conformant Standard Generalized Markup Language (SGML) encoding. Later, the staff of the Whitman Archive converted the SGML files into eXtensible Markup Language (XML). These converted files were proofread against the high-quality scans of original print editions of Leaves of Grass that we received from special collections departments at the University of Virginia and the University of Iowa. We then published the corrected transcriptions, along with the page images.

C. Periodicals

Our process for editing Whitman's poems published in periodicals is similar to that for poetry manuscripts. The tracking database is used to manage images and coordinate transcription. The practice for periodical poems differs from that for manuscripts in the following particulars. Initial transcriptions may be based on a print original, a high-resolution digital image, or a microfilm facsimile. A staff member transcribes, encodes, proofreads, and supplies metadata for the text. A senior editor proofreads, reviews textual cruxes, and writes headnotes to individual periodicals. One of the general editors proofreads and critiques all of the work after it has been completed. The remaining publishing process matches that used for the poetry manuscripts.

For periodical printings, Archive staff have not attempted to replicate in our transcriptions the display of the newspapers or periodicals. Thus, centered titles and right-aligned bylines, notes, and other prose information have all been left-aligned for presentation on the Archive. To the extent possible, we have preserved the formatting of the poems, including indentation of poetic lines; line breaks in poetry are always encoded and represented. Users interested in the way typeface, ornamentation, and other aspects of layout may have affected the meaning of Whitman's periodical poems should consult the page images we supply.

Brief references to a poem within the same issue of a periodical, such as an editorial note that draws attention to a poem by Whitman, are not transcribed.

D. Translations

The translations section of the Archive features full-length translations of Whitman as well as many versions of Whitman's poem "Poets to Come." For both subsections of translations, we designate a lead editor, who establishes the goals and scope of the undertaking in consultation with the general editors.

For translations, we provide page images whenever possible along with transcriptions. In the transcriptions we do not attempt to capture the so-called bibliographic codes—the appearance of margins, fonts, and ornaments in the original printed documents. Most other features of the printed page are preserved: capitalization, hyphenation, punctuation (for French translations, this sometimes means preserving a space between a word and punctuation), and page breaks. Note that in the case of translations of "Poets to Come," page breaks are recorded in the XML/TEI file, but they are not displayed as such in the HTML display. Our electronic transcriptions preserve typographical errors present in the original; to aid in searching, corrected forms are also included in the encoding. When transliteration of non-Roman characters is necessary, as in describing the Russian editions available on the Archive, we follow the Library of Congress ALA-LC Romanization Tables.

E. Correspondence

The Whitman Archive's correspondence project, which presents Whitman's outgoing and incoming letters and letters of the Whitman family, brings together previously edited print material and freshly edited material that has never appeared in print. Archive staff have transcribed letters from digital scans of the original manuscripts and microfilm reproductions or from previously edited print volumes of correspondence. The source text for every transcription is identified for users.

Those letters for which the Archive has digital images have been freshly transcribed and edited, often for the first time. For now, we follow the practices of other editors of correspondence by remaining as unobtrusive as possible and presenting an inclusive text that represents, as nearly as possible, a clean reading version of the letter. We have not recorded deletions, noted authors' insertions, or attempted to duplicate the appearance of the original holographs. We have also omitted metacommentary in the form of cues such as "(over)" that were relevant to the reader of the original letter as a physical object but are more distracting than helpful in an electronic environment. We have standardized the placement of salutations, signatures, and postscripts. In addition, we are in the process of transcribing and encoding letterhead for these documents. These decisions have been made on a pragmatic basis and to create consistency among the materials presented. As we secure more digital images of original letters, and as we have time, we will update our XML files and encode all deletions and insertions. In the future, Archive users will be able to choose between two ways of viewing the correspondence: as clean reading versions or as diplomatic transcriptions.

In instances where the Archive presents transcriptions derived from earlier print volumes, both the source text and the repository of the original manuscript are identified. Letter transcriptions derived from earlier print volumes reproduce the text of the letter as found in the source text. Although the Archive has drawn on print volumes in letter transcriptions, we neither claim nor intend to offer digital editions of these print volumes. Instead, we present the letters in a digital format, and the individual letter constitutes the root unit of every file.

Editorial content, including footnotes, is a combination of new information composed by Archive staff and existing material from print editions. We do not reproduce the print volume's editorial notes in cases where we have not been granted permission from the copyright holder. Material reproduced from print editions does not necessarily recreate the text or print environment of the original. Archive staff have attempted to avoid redundancies, such as identification of repositories in footnotes, and have therefore deleted some notes. In other instances, where new information has become available since the publication of the print editions, footnotes have been added or revised. Editorial material that appears prior to the text of a letter in a print volume has been moved to the first footnote following the letter transcription. In these instances, the language of the original editorial material has been silently revised (e.g., we have changed "the following letter" to "the above letter") and footnotes in leading material have been transcribed as parenthetical citations. Given these changes, users should not assume that footnote numbers, as they appear on the Whitman Archive, correspond with the numbers as they appear in the print volumes.

F. Scribal Documents

All documents are transcribed and encoded by staff of the Whitman Archive and are then checked by two or more members of the project staff. Prior to being published on the Archive, all letters receive one or more rounds of additional review by senior Archive staff and project directors. Final rounds of checking assess the transcription and encoding as well as the HTML display.

Unlike documents treated for other parts of the Whitman Archive, which are more richly encoded, almost all of the text of the scribal documents has been encoded within anonymous block (<ab>) tags. This decision was made to expedite the encoding process, in order to make this vast trove of material available to users. At a later stage, we may enrich the encoding to mark place names or other named entities or to identify various structural features of the texts. With the exception of wholly deleted documents, we have not recorded deletions, nor have we marked insertions for special treatment.
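To show what such light encoding amounts to, here is a minimal sketch that wraps each transcribed block of text in an <ab> element and nothing more. The enclosing <body> element, the function name, and the sample text are assumptions for illustration; they are not taken from the Archive's files.

```python
import xml.etree.ElementTree as ET

def encode_as_blocks(blocks):
    """Wrap each transcribed block of text in a TEI <ab> (anonymous block) element."""
    body = ET.Element("body")  # assumed container; the real file structure is not given here
    for text in blocks:
        block = ET.SubElement(body, "ab")
        block.text = text  # no further markup for names, deletions, or insertions
    return ET.tostring(body, encoding="unicode")

xml = encode_as_blocks(["First block of a document.", "Second block."])
```

Because <ab> carries no semantic claim about what kind of unit it holds, richer markup such as place names can be layered inside these blocks later without restructuring the file.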

The documents are presented as transcriptions and facsimile page images. The process and specifications for obtaining page images and for presenting them on the Archive follow those for the poetry manuscripts. Although some documents are on loose leaves, most of the documents have been photographed or scanned from large, bound letter books that often include several letters on a single page. Currently, we make available only cropped images of individual documents. In the future, we may also provide full page images for greater context.

In the HTML display of the documents, we have not attempted to duplicate the appearance of the original holographs. For example, all text is left-justified, regardless of how it appears on the manuscript page. In addition, we display a short horizontal line to separate the text of the body of the document and the text of marginal annotations by Whitman and others. We have omitted metacommentary in the form of cues such as "(over)" and catchwords that were relevant to the reader of the original document as a physical object but are more distracting than helpful in an electronic environment. We display text without correction or regularization, but errors and idiosyncratic, antiquated, or other variant spellings are marked in the encoding.

F. Reviews

When editing reviews we record what appeared in the original source document. Any deviations from that original source—the insertion of an obviously omitted word or the alteration of spelling, for example—are marked by brackets. In addition, transcriptions of reviews include all authorial or editorial footnotes that appeared in the original document. Because we have not represented page breaks in our transcription and encoding of the reviews, authorial and editorial notes in the original appear at the end of the transcription, rather than at page breaks. In addition, footnotes in the original are often indicated by Arabic numerals. To distinguish these notes from editorial notes added by the Archive, we have changed the superscript numerals to asterisks (first footnote), daggers (second footnote), and double-daggers (third footnote). Similarly, in instances where the original review uses the same symbol to mark two or more footnotes appearing on different pages, we have replaced succeeding symbols of the same type with daggers (second footnote) and double-daggers (third footnote).
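The substitution of symbols for numerals described above can be sketched as a simple lookup. The function name is hypothetical, and because the policy names only three symbols, the sketch treats a fourth footnote as undefined rather than guessing at a rule.

```python
SYMBOLS = ["*", "†", "‡"]  # asterisk, dagger, double dagger

def footnote_symbol(position):
    """Map a footnote's 1-based position in a review to its display symbol."""
    if not 1 <= position <= len(SYMBOLS):
        # The policy names only three symbols; anything beyond that is undefined here.
        raise ValueError(f"no symbol defined for footnote {position}")
    return SYMBOLS[position - 1]

markers = [footnote_symbol(n) for n in (1, 2, 3)]
```

Keeping the original notes on symbols leaves Arabic numerals free for the Archive's own editorial notes, so the two kinds of note cannot be confused.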

For the reviews, Archive staff have not attempted to replicate with our transcriptions the display or typographic features—typeface, ornamentation, and other aspects of layout—of the newspapers or periodicals. To the extent possible, we have preserved the formatting of the poems in the reviews, including indentation of poetic lines; line breaks in poetry are always encoded and represented.

G. Horace Traubel's With Walt Whitman in Camden

This section prioritizes access to the printed versions of Horace Traubel's record of his conversations with Whitman, titled With Walt Whitman in Camden and issued in nine volumes between 1906 and 1996 by various publishing houses. This collection is a major source for Whitman scholars, but because it is not strictly Whitman-authored, the Archive presents an accurate and complete transcription of the printed volumes rather than an extensively tagged version. Deeper encoding and integration with other resources of the Archive may be pursued in the future.

For this section of the Archive, we provide a digital facsimile of all pages containing an image (for example, a picture of one of Whitman's associates) but not of pages containing only text. Our policy is to record the printed page accurately. Capitalization and punctuation are preserved; quotations have been tagged with speakers' names to facilitate searching. The transcription and encoding processes are followed by silent proofreading of the transcription against the original source document by editorial assistants and another round of silent proofing of the file as displayed by the stylesheet, performed by the editor. Typographical errors deemed obvious are encoded as alternate forms for searching, but the original printed form is displayed.

H. Selected Criticism

This section prioritizes access to current scholarly work on Whitman and makes available selected current articles, monographs, and essay collections. In all cases, the Archive has received permission from the rights holders to publish an electronic edition of the work. This section also makes available some out-of-copyright commentary as time allows.

For criticism, the Archive privileges the content of the work and does not attempt to present design aspects of the original (font, spacing, ornaments, etc.). In addition, Archive staff have regularized the encoding and display of tables of contents (e.g. "Chapter 1. Historical Background," rather than "1 Historical Background"). Page breaks are not encoded, and page images are not provided. Obvious textual errors in the original are corrected in the electronic edition, and Archive staff track these changes and publish them in "Changes to Criticism Texts in the Electronic Editions." A link to this document appears at the end of every critical text that contains Archive corrections.

When possible, Archive staff members work from an electronic text (.txt) file of the original article or book, either provided by the author or publisher or created from a PDF. Staff members also consult PDF page images of the print work, or the actual book or journal. A staff member adds the TEI markup to the text files. Where text files are not available, an Archive staff member first transcribes the critical text from the original and then encodes the document, again using page images or hard copy as a point of reference. Another Archive staff member then copyedits the text against the original, and a test version of the document is made available on the site. A senior member of the staff looks for textual and display inconsistencies and modifies the encoding or stylesheet (XSLT) as necessary. A general editor of the Archive then reviews the test version and either suggests changes or recommends publication on the Archive. The latter two steps are repeated until the general editor gives approval for publication.



Distributed under a Creative Commons License. Ed Folsom & Kenneth M. Price, editors.