All of human knowledge in ten categories
March 26th, 2014 at 4:25 pm (Library School)
How can you organize all of human knowledge? Or at least the parts that people put into books, movies, CDs, ebooks, and other media?
I find the subject classification systems used by library catalogers fascinating from this perspective. What a daunting challenge, to come up with an ontology that is sufficiently comprehensive yet not overwhelming, and that everyone else will agree with. The Dewey Decimal Classification (DDC) system was invented by Melvil Dewey in 1873 and it is *still* in use by libraries (albeit with updates and modifications). It remains in use despite general recognition that it is exceedingly Eurocentric and exhibits other biases. It now has the weight of history behind it, changing your subject classification scheme is a huge endeavor, and no one else has come up with something better.
Or have they? In 1897, Herbert Putnam came up with a different ontology (LCC, Library of Congress Classification). LCC is maintained by the Library of Congress; DDC is now owned by OCLC, although its editorial office has long been housed at the Library of Congress. Public and school libraries mainly use DDC, while academic and government libraries mainly use LCC. Why?
From “What’s so great about the Dewey Decimal System?”:
The organization of the LC was primarily focused on the needs of Congress, and secondarily towards other government departments, agencies, scholars, etc. So more space is allowed for history (classes C to L) than for science/technology (Q to V). More important, the focus on the needs of Congress means the LCC pays less attention to non-Western literature, and has no classifications for fiction or poetry.
[…]
DDC uses fewer categories and sub-classifications and is consistent across disciplines, while LCC is more highly subdivided with no consistency between disciplines. It’s understandable, therefore, that DDC has proven more useful to libraries catering to a wide range of needs such as public libraries and schools, while LCC is more widely used in libraries focused more on technical areas like colleges, universities, and government.
Turns out that they’re both Eurocentric (or even America-centric) and infused with biases about the relative importance of different topics. For example, let’s look at the top-level division of the DDC. As a decimal system, it has ten categories available at each level. If you were to divide all of human knowledge into ten categories, what would you choose?
Here’s what Dewey did:
000 Computer science, information & general works
100 Philosophy & psychology
200 Religion
300 Social sciences
400 Language
500 Science
600 Technology
700 Arts & recreation
800 Literature
900 History & geography
Or actually, that’s what his system has evolved into. Obviously Dewey had no concept of “computer science.” In fact, 000 feels more like a “Misc” category. What is CS? The classification’s editors must have thought it didn’t quite fit under 500 (Science) or 600 (Technology). You can browse more here: Dewey Decimal classes.
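Because the system is decimal, the first of the three leading digits is enough to place any Dewey number in one of the ten top-level classes. Here is a minimal sketch in Python. The function name `top_level_class` and the padding rule are my own assumptions for illustration, and the labels are simply copied from the list above.

```python
# Top-level DDC classes, keyed by the hundreds digit.
# Labels are copied from the list in this post.
DDC_CLASSES = {
    0: "Computer science, information & general works",
    1: "Philosophy & psychology",
    2: "Religion",
    3: "Social sciences",
    4: "Language",
    5: "Science",
    6: "Technology",
    7: "Arts & recreation",
    8: "Literature",
    9: "History & geography",
}


def top_level_class(dewey_number: str) -> str:
    """Return the top-level class for a Dewey number such as '599.75'.

    Hypothetical helper: it assumes the input is a plain number with an
    optional decimal part and no cutter or author suffix.
    """
    # Keep the part before the decimal point and pad it to three digits,
    # so that "6.3" is treated the same as "006.3".
    whole = dewey_number.strip().split(".")[0].zfill(3)
    if len(whole) != 3 or not whole.isdigit():
        raise ValueError(f"not a Dewey number: {dewey_number!r}")
    return DDC_CLASSES[int(whole[0])]


print(top_level_class("599.75"))
print(top_level_class("006.3"))
```

The same trick works one level down: the second digit picks one of ten divisions inside the class, and the third picks a section inside that division.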
I’m wondering what a content-based analysis (e.g., clustering) of a large collection of books would produce. How would such a hierarchy differ from Dewey’s or Putnam’s? Google, tell us!
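To make the idea concrete, here is a toy sketch of what I mean by clustering, using only the Python standard library. Everything in it is an assumption for illustration: the four "books" are made-up one-line descriptions rather than real texts, the features are raw word counts, and the method is a simple agglomerative scheme that repeatedly merges the two clusters whose summed word counts have the highest cosine similarity. A real analysis would need far better text features and far more data.

```python
import math
from collections import Counter
from itertools import combinations

# Made-up descriptions standing in for full book texts (illustration only).
BOOKS = {
    "astronomy primer": "stars planets galaxies orbit",
    "night sky guide": "stars planets telescope sky",
    "bread book": "flour yeast bread oven",
    "pastry handbook": "flour butter pastry oven",
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(books: dict, k: int) -> list:
    """Merge the two most similar clusters until only k remain."""
    # Each cluster is a pair: (list of titles, summed word counts).
    clusters = [([title], Counter(text.lower().split()))
                for title, text in books.items()]
    while len(clusters) > k:
        # Find the pair of clusters with the highest similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cosine(clusters[p[0]][1], clusters[p[1]][1]),
        )
        merged = (clusters[i][0] + clusters[j][0],
                  clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return [sorted(titles) for titles, _ in clusters]


for group in cluster(BOOKS, k=2):
    print(group)
```

Recording the order of the merges, instead of stopping at `k` groups, would give a tree, which is the kind of data-driven hierarchy that could be compared against Dewey’s or Putnam’s.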