History of Information Retrieval

The papyrus scroll used by the ancient Greeks and Romans was not the most efficient way of storing written information or of retrieving it. Yet, as Greek and Roman scholars began to write large works that were compilations of data of various sorts, they found it useful to devise various means of organizing the material to make locating particular passages easier for the reader. Here are a few examples of what they did.

Tables of contents

Pliny the Elder (died 79 A.D.) wrote a massive work called The Natural History in 37 books. It was a kind of encyclopedia that comprised information on a wide range of subjects. To make it a bit more user-friendly, he devoted the entire first book of the work to a gigantic table of contents in which he lists, book by book, the various subjects discussed. To the list of items for each book he even appended a list of the Greek and Roman authors used in compiling the information for that book. He indicates at the very end of his preface to the entire work that this practice was first employed in Latin literature by Valerius Soranus, who lived during the last part of the second century B.C. and the first part of the first century B.C. Pliny’s statement that Soranus was the first in Latin literature to do this suggests that it had already been practiced by Greek writers.


Alphabetization

One method of information organization which we take for granted nowadays, namely alphabetization, was probably first devised by Greek scholars of the third century B.C. at the library of Alexandria in Egypt in order to help them organize the growing numbers of Greek literary works. If I recall correctly, the subject of alphabetization and its use in classical antiquity was treated years ago in a little monograph by Lloyd Daly.

Hierarchies of information

A few other ancient works arranged their material under headings in order to make the writing more user-friendly and easier to consult.

Valerius Maximus wrote a collection of memorable deeds and sayings ca. 30 A.D. The work is divided into nine books, and each book is subdivided into chapters. Each chapter has its own heading, and the entries within that chapter are anecdotes taken from ancient literature and history which illustrate that theme.

Sextus Julius Frontinus, a Roman senator of the late first century A.D. and early second century A.D., wrote a work on military stratagems in four books. Each book concerns itself with a specific area of warfare. Each book is then subdivided into chapters that each address one specific aspect of the book’s major theme. Each chapter has a heading to guide the reader, and the chapter itself consists of brief extracts taken from historical works that illustrate the practical application of the topic.

Finally, Aulus Gellius wrote a work entitled The Attic Nights ca. 160 A.D. in 20 books. The work is a crazy-quilt assortment of items on Greek and Roman history, philosophy, grammar, rhetoric, and antiquarian material in general. Since the work was composed in no real order, but simply as the various topics occurred to the author, each chapter of every book concerns an isolated subject, and this subject is clearly spelled out in a title heading that stands at the beginning of the chapter. A reader could therefore skim through a book and locate the subject by glancing over the titles of the chapters.

A brief but good discussion of the problems of ancient scholarship posed by the use of the papyrus scroll can be found on pp. 101-116 of Varro the Scholar, by Jens Erik Skydsgaard, published in 1968 in the series Analecta Romana Instituti Danici.

Indexes in history

(from Hans Wellisch’s Indexing from A to Z, H.W. Wilson Co., 1991)

Book indexes. Members of the societies of indexers may well take pride in the fact that this sense of index is indeed the oldest among the figurative or applied senses of the word, and that this specific usage (like the word itself) goes back to ancient Rome. There, when used in relation to literary works, the term index was used for the little slip attached to papyrus scrolls on which the title of the work (and sometimes also the name of the author) was written so that each scroll on the shelves could be easily identified without having to pull them out for inspection. “… ut [librarioli] sumant membranulam, ex qua indices fiant, quos vos Graeci … sillybus appellatis” (so that [the copyists] may take some bits of parchment to make title slips from them, which you Greeks call sillybus) (Cicero, Atticus, 4.4a.1). From this developed the usage of index for the title of books: “Sunt duo libelli diverso titulo, alteri ‘gladius’, alteri ‘pugio’ index erat” (There are two books with different titles, one called “The sword”, the other having the title “The dagger”) (Suetonius, Caligula, 49.3). Those two books, by the way, were what we would call today “hit lists” of people whom Caligula wished to have assassinated shortly before that same fate befell him. At about the same time, in the first century A.D., the meaning of the word was extended from “title” to a table of contents or a list of chapters (sometimes with a brief abstract of their contents) and hence to a bibliographical list or catalog…

However, indexes in the modern sense, giving exact locations of names and subjects in a book, were not compiled in antiquity, and only very few seem to have been made before the age of printing. There are several reasons for this. First, as long as books were written in the form of scrolls, there were neither page nor leaf numbers nor line counts (as we have them now for classical texts). Also, even had there been such numerical indicators, it would have been impractical to append an index giving exact references, because in order for a reader to consult the index, the scroll would have to be unrolled to the very end and then to be rolled back to the relevant page. (Whoever has had to read a book available only on microfilm, the modern successor of the papyrus scroll, will have experienced how difficult and inconvenient it is to go from the index to the text.) Second, even though popular works were written in many copies (sometimes up to several hundreds), no two of them would be exactly the same, so that an index could at best have been made to chapters or paragraphs, but not to exact pages. Yet such a division of texts was rarely done (the one we have now for classical texts is mostly the work of medieval and Renaissance scholars). Only the invention of printing around 1450 made it possible to produce identical copies of books in large numbers, so that soon afterwards the first indexes began to be compiled, especially those to books of reference, such as herbals. (pages 164-166)

Index entries were not always alphabetized by considering every letter in a word from beginning to end, as people are wont to do today. Most early indexes were arranged only by the first letter of the first word, the rest being left in no particular order at all. Gradually, alphabetization advanced to an arrangement by the first syllable, that is, the first two or three letters, the rest of an entry still being left unordered. Only very few indexes compiled in the 16th and early 17th centuries had fully alphabetized entries, but by the 18th century full alphabetization became the rule… (p. 136)
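
The stages of alphabetization described in the passage above can be sketched in a few lines of Python. This is only an illustration and is not taken from Wellisch: the plant names are made up, the helper `order_by_prefix` is hypothetical, and treating the "first syllable" as the first three letters is a simplifying assumption.

```python
# Hypothetical index entries, in the order they happen to occur in a text.
entries = ["rosemary", "mint", "rue", "mallow", "rose", "madder"]


def order_by_prefix(items, n):
    """Order entries by their first n letters only (case-insensitive).

    Python's sort is stable, so entries that share a prefix keep their
    original relative order, i.e. "the rest being left in no particular
    order" beyond the order in which they were written down.
    """
    return sorted(items, key=lambda s: s.lower()[:n])


# Stage 1: early indexes, arranged by first letter only.
first_letter = order_by_prefix(entries, 1)

# Stage 2: arranged by the first two or three letters (assumed: three).
first_three = order_by_prefix(entries, 3)

# Stage 3: full alphabetization, every letter considered.
full = sorted(entries, key=str.lower)

print(first_letter)
print(first_three)
print(full)
```

With these sample entries, the first-letter ordering merely gathers the m-words and r-words together, the three-letter ordering still leaves "rosemary" ahead of "rose", and only the full ordering puts every entry where a modern reader would look for it.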

(For more information on the subject of indexes, please see Professor Wellisch’s Indexing from A to Z, which contains an account of an indexer being punished by having his ears lopped off, a history of narrative indexing, an essay on the zen of indexing, and much more. Please, if you quote from this page, CREDIT THE AUTHOR. Thanks.)

Indexes go back well before the 17th century. Gerard’s Herbal from the 1590s had several fascinating indexes, according to Hilary Calvert. Barbara Cohen writes that the alphabetical listing in the earliest ones only went as far as the first letter of the entry… no one thought at first to index each entry in either letter-by-letter or word-by-word order. Maja-Lisa writes that Peter Heylyn’s 1652 Cosmographie in Four Bookes includes a series of tables at the end. They are alphabetical indexes, and he prefaces them with “Short Tables may not seeme proportionable to so long a Work, especially in an Age wherein there are so many that pretend to learning, who study more the Index then they do the Book.”

