All of human knowledge in ten categories

How can you organize all of human knowledge? Or at least the parts that people put into books, movies, CDs, ebooks, and other media?

I find the subject classification systems used by library catalogers fascinating from this perspective. What a daunting challenge, to come up with an ontology that is sufficiently comprehensive without being overwhelming, and that everyone else will also agree with. The Dewey Decimal Classification (DDC) system was invented by Melvil Dewey in 1873 and it is *still* in use by libraries (albeit with updates and modifications). It is still used despite general recognition that it is exceedingly Eurocentric and exhibits other biases. But it now has the weight of history behind it, changing your subject classification scheme is a huge endeavor, and no one else has come up with something better.

Or have they? In 1897, Herbert Putnam came up with a different ontology (LCC, Library of Congress Classification). Both systems are now maintained by the Library of Congress. Public and school libraries mainly use DDC, while academic and government libraries use LCC. Why?

From What’s so great about the Dewey Decimal System?:

The organization of the LC was primarily focused on the needs of Congress, and secondarily towards other government departments, agencies, scholars, etc. So more space is allowed for history (classes C to L) than for science/technology (Q to V). More important, the focus on the needs of Congress means the LCC pays less attention to non-Western literature, and has no classifications for fiction or poetry.
[…]
DDC uses fewer categories and sub-classifications and is consistent across disciplines, while LCC is more highly subdivided with no consistency between disciplines. It’s understandable, therefore, that DDC has proven more useful to libraries catering to a wide range of needs such as public libraries and schools, while LCC is more widely used in libraries focused more on technical areas like colleges, universities, and government.

Turns out that they’re both Eurocentric (or even America-centric) and infused with biases about the relative importance of different topics. For example, let’s look at the top-level division of the DDC. As a decimal system, it has ten categories available at each level. If you were to divide all of human knowledge into ten categories, what would you choose?

Here’s what Dewey did:

000 Computer science, information & general works
100 Philosophy & psychology
200 Religion
300 Social sciences
400 Language
500 Science
600 Technology
700 Arts & recreation
800 Literature
900 History & geography

Or actually, that’s what his system has evolved to now. Obviously Dewey had no concept of “computer science.” In fact, 000 feels more like a “Misc” category. What is CS? The Library of Congress must have thought it didn’t quite fit under 500 (Science) or 600 (Technology). You can browse more here: Dewey Decimal classes.
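To make the "ten categories at each level" idea concrete, here is a minimal Python sketch of how a Dewey number nests inside its top-level class. The function names (`ddc_path`, `main_class`) are my own, made up for illustration, and the sketch assumes a number written with a three-digit whole part, like 523.4. It only knows the ten top-level labels listed above.

```python
# The ten top-level DDC classes, keyed by the first digit of the number.
DDC_MAIN_CLASSES = {
    "0": "Computer science, information & general works",
    "1": "Philosophy & psychology",
    "2": "Religion",
    "3": "Social sciences",
    "4": "Language",
    "5": "Science",
    "6": "Technology",
    "7": "Arts & recreation",
    "8": "Literature",
    "9": "History & geography",
}


def ddc_path(number):
    """Return the hundreds, tens, and ones levels for a number like '523.4'."""
    whole = number.split(".")[0]
    if len(whole) != 3 or not whole.isdigit():
        raise ValueError("expected a three-digit whole part, e.g. '523.4'")
    # Each level keeps one more digit and pads the rest with zeros.
    return [whole[0] + "00", whole[:2] + "0", whole]


def main_class(number):
    """Look up the top-level class label from the first digit."""
    return DDC_MAIN_CLASSES[ddc_path(number)[0][0]]


path = ddc_path("523.4")
label = main_class("523.4")
```

Every extra digit picks one of ten children of the level above it, which is exactly why the top-level choice of ten carries so much weight.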

I’m wondering what a content-based analysis (e.g., clustering) of a large collection of books would create. How would such a hierarchy differ from Dewey’s or Putnam’s? Google, tell us!
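For the curious, here is a toy sketch of what I mean by clustering, using only the Python standard library. Everything in it is an assumption for illustration: the four "books" are invented one-line descriptions, the word-count vectors and cosine similarity are just one simple choice among many, and a real analysis of a large collection would need far better text features and a scalable algorithm.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, k):
    """Average-linkage agglomerative clustering: merge the two most
    similar clusters repeatedly until only k clusters remain."""
    vectors = {title: vectorize(text) for title, text in docs.items()}
    clusters = [[title] for title in docs]

    def similarity(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > k:
        index_pairs = [(i, j) for i in range(len(clusters))
                       for j in range(i + 1, len(clusters))]
        i, j = max(index_pairs,
                   key=lambda p: similarity(clusters[p[0]], clusters[p[1]]))
        clusters[i] += clusters.pop(j)
    return clusters


# Hypothetical mini-collection: titles mapped to made-up descriptions.
books = {
    "Stars": "stars planets orbit telescope",
    "Moons": "moons planets orbit ice",
    "Bread": "flour yeast oven bread",
    "Cakes": "flour sugar oven cake",
}
groups = cluster(books, k=2)
```

On this tiny example the astronomy books and the baking books end up in separate groups. Recording the order of the merges, rather than stopping at k, would give a full hierarchy to compare against Dewey's.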

The bikini bridge and other social objects

In The Participatory Museum, Nina Simon discusses the “social object,” which seems to be a term coined in 2005 to mean “conversation piece.” (I prefer “conversation piece”, because to me “social object” sounds like an object that is social rather than something that has a social function.) These are items that spark conversation, like dogs or babies or a bizarre hat. They provide easy entry points to human interaction that may be less threatening than directly initiating a conversation.

They may also be curious or controversial sculptures, websites, or memes — things (not necessarily physical) that get crowds of people talking. A recent example I encountered is the bikini bridge meme.

In this case, the meme was deliberately fabricated by 4chan, but once they got the ball rolling, the word quickly spread throughout the internet. Arguably, the social object here was the hashtag: #BikiniBridge2014.

Simon lists four ways that objects can be social: make a personal connection (e.g., an Erector set invites someone to relate a story about *their* first set), impose physically (e.g., a car crash nearby), provoke a response (e.g., graffiti on a wall), or create interactions (e.g., a football). The bikini bridge is definitely provocative (responses range from people who think it’s sexy to people who think the idea is yet another way to objectify women), and for many, also personal (e.g., those who posted a selfie to share their own bikini bridge with the world).

At JPL, we make use of social objects to connect with people outside the lab. Speakers often bring a life-size replica of one of the Mars Science Laboratory’s wheels to let people experience for themselves how big they are and examine the design up close.

I can think of several social objects that inspired me to interact with others just in the past week:

  • a purple origami necklace in the shape of a rocket
  • a USB flash drive shaped like a storm trooper
  • a curiously shaped iPhone case that turned out to be created by a 3D printer

… and the entire poster session at the Lunar and Planetary Science Conference (LPSC), filled with more than 600 posters, was a smorgasbord of social objects, deliberately created to invite interaction!

Perhaps research posters could borrow ideas from Simon’s suggestions about how to make museum/display objects more social:

  1. Ask visitors questions: The goal is for the visitor to personally engage with the exhibit (poster). Perhaps questions like “when did you first see a solar eclipse?” I’ve yet to see an interactive poster that allowed you to post or write in contributions as a visitor, but it might be fun to experiment with!
  2. Provide live interpretation: This is already a built-in feature of poster sessions. When the presenter is present, that is.
  3. Make it provocative: Everyone loves a controversy!
  4. Offer visitors ways to share: Create your own hashtag? Microblogging was rampant at LPSC. More pedestrian: hand out business cards or printouts of the poster.

What’s your favorite social object?

Cataloging on the edge

The first major assignment for my Cataloging class was to round up 20 books and create catalog entries for them. Any books, so long as no more than three were “literature.” After getting stuck for a while on trying to decide what exactly “literature” was, I settled on my books (mostly non-fiction, which apparently was the goal), and dove in.

This was hard.

This was hard because there are no good resources out there (that I know of, or that my class knows of) for exactly how to “catalog a book.” This astonished me, since a system that allows many, many people to contribute data is exquisitely vulnerable to any inconsistencies in how those records are created. Surely there are standard rules for what information to include, where to find it, and how to express it?

Kind of.

The current cataloging ruleset, RDA (Resource Description and Access), sets forth guidelines about what kind of content should go into a bibliographic record, but not how to format it. RDA seeks to implement FRBR (Functional Requirements for Bibliographic Records), which is a statement of cataloging philosophy and what user needs are out there. FRBR also contains an entity-relationship diagram that traces out how works, creators, and subjects are (or should be?) connected. FRBR is silent on how to actually create a record, though.

Further, no real system out there actually implements FRBR yet, and even RDA only spells out a partial path to it (parts of RDA are not yet defined, like what kind of relationships between subjects should be captured).

In the meantime, real systems use something called MARC (MAchine-Readable Cataloging) to encode bibliographic records. So that’s what we used to catalog our 20 books. MARC provides some guidelines about formatting (e.g., when to end a field with a period and what field separators to use) but is silent on other aspects like capitalization and bigger questions like where to get the required information from. For example, how do you go about extracting the publication date from a book? How should you express the author’s name?
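To give a flavor of what a record looks like, here is a small Python sketch that renders three fields in the common "tag, then `$`-prefixed subfields" text display. The book, author, and publisher are entirely made up, and `format_field` is a hypothetical helper of mine, not part of any cataloging tool. The tags are real MARC 21 ones (100 for the main personal-name entry, 245 for the title statement, 260 for publication information), but I have left out the indicators and everything else a full record would carry.

```python
def format_field(tag, subfields):
    """Render one field as 'TAG $a value $b value ...' for reading by eye."""
    body = " ".join(f"${code} {value}" for code, value in subfields)
    return f"{tag} {body}"


# A made-up book. Note that the punctuation (" :", " /", the final period)
# is typed into the subfield values by the cataloger; nothing adds it for you.
record = [
    ("100", [("a", "Doe, Jane.")]),
    ("245", [("a", "An example title :"),
             ("b", "a made-up subtitle /"),
             ("c", "Jane Doe.")]),
    ("260", [("a", "Pasadena :"),
             ("b", "Example Press,"),
             ("c", "2014.")]),
]

lines = [format_field(tag, subfields) for tag, subfields in record]
print("\n".join(lines))
```

Even in this toy version you can see where the judgment calls hide: the sketch happily accepts any capitalization, any date, and any form of the author's name that I choose to type in.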

Here’s where the assignment gets pedagogically interesting, for two reasons.

First, we were operating at the “pleasantly frustrating” level. James Paul Gee listed this as an effective learning principle in his guide to “Good Video Games and Good Learning.” He suggested that a good learning challenge stays within, but at the edge of, the student’s “regime of competence.” We weren’t just executing a set of well understood rules; instead, there was a lot of ambiguity and nuance, and each question pushed us to dig deeper.

Second, we were working with books in the wild. I gather that most cataloging professors assign their students the same set of books to practice cataloging on. The real answer is known, any questions or gotchas have already been anticipated, and the result is a controlled, sandbox experience.

My professor instead flung the doors wide open and let us each pick our own 20 books, without any sense of what would turn out to be easy or hard to catalog. The result was a chaotic, challenging, and ultimately far more educational experience.

This approach only worked because we had a discussion forum and a professor who monitored it assiduously. Students plastered the forum with questions. “What if the book is a translation?” “What if the pages aren’t numbered?” “What if there are multiple publishers?” Our professor responded quickly to every question, and over time I realized that I was quite possibly learning more from the forum than from my own small set of 20 books. With 88 students, we had something like 1700 books being cataloged (some were duplicates), and the array of issues that came up was dazzling. It was great to have the practice of actually creating my own records (and hunting down resources to allow me to deal with my books’ issues), but it was also fantastic to get to eavesdrop on my classmates’ questions and learn vicariously through them.

In that assignment, we only had to create fields for each book’s title, publisher, publication date, etc. The next assignment had us add the authorized form of the author’s name, and we are just about to revisit our records again to add appropriate subject headings. Each iteration makes our records richer and increases our understanding of the cataloging process. And I have to applaud Prof. Mary Bolin for structuring the process in such an interesting and valuable way.

Plate tectonics on Europa

This just in: Europa, the moon of many mysteries, has been declared to have the icy equivalent of plate tectonics.

It’s been known for decades that Europa’s icy surface moves around, because we see bands that cut across pre-existing features. These are dubbed “spreading” or “dilational” bands because if you roll back time by removing them, the earlier features line up in their (presumed) prior orientation like puzzle pieces. For example:


(Image from Prockter et al., 2002, JGR, 107(E5), doi:10.1029/2000JE001458)

But if there are areas where the ice is spreading apart, then one of two things must be true:

  1. There are other areas that consume ice so the total surface area is constant.
  2. Europa is expanding.

On the Earth, new crust is created at the mid-ocean ridges. It moves outward and then is consumed at subduction zones (like the Cascadia subduction zone off the Pacific Northwest). But until now, no one had seen anything like that on Europa.

At this year’s Lunar and Planetary Science Conference, Simon Kattenhorn and Louise Prockter presented the first evidence for a subduction zone on Europa. It’s curiously curved, and they’ve only found one so far, but it could provide the missing part of the story of how Europa’s surface changes over time. There are some remaining details to work out (ice can’t subduct exactly the way crust does on Earth), but it’s certainly intriguing!

You can read Kattenhorn and Prockter’s two-page abstract here: Subduction on Europa: The case for plate tectonics in the ice shell. For a very nice, accessible discussion of the context and importance of this work, I recommend Emily Lakdawalla’s discussion (and images).

Participatory exhibits

We’re now reading The Participatory Museum (by Nina Simon) for my class on Maker Spaces. This book (freely available! and you’re encouraged to read participatively, too!) advocates for innovation in creating truly valuable participatory experiences for museum visitors. That means going beyond a comment card or a build-your-own-X to first ask questions like:

  1. What would the visitor personally gain by participating? (personal)
  2. What would the visitor gain by having other visitors participate? (social)
  3. What would the museum gain by having visitors participate? Does it align with the museum’s goals or is it just entertainment? (institutional)

In retrospect these seem obvious. But how often are they employed?

Developing a participatory experience, Simon argues, “doesn’t require flashy theaters or blockbuster exhibits. It requires institutions that have genuine respect for and interest in the experiences, stories, and abilities of visitors.”

I was struck by this comment. I think it often happens that interactive exhibits are viewed as eye candy or entertainment, something to draw people in but perhaps not as serious or contentful as a static, traditional display. This view equates interaction with condescension, e.g., “You’re not smart enough or serious enough to focus on the real stuff, so we’ll entertain you instead.” This statement turns that around by instead equating interaction with respect, e.g., “We want to get your input because it will enrich what is already here.”

I also appreciated Simon’s discussion of how not everyone wants to participate by being a content creator (creating a video or a poster or an essay or…). She identifies five ways people can participate: creating, critiquing, collecting, joining, and spectating. She is also quick to assert that there is no moral hierarchy in these different modes of participation. Some people are driven to create, while others prefer to spectate. Your desired role likely changes depending on the topic and venue. And that’s okay.

She notes that content creators are in the minority, for a variety of reasons that you can probably guess off the top of your head. But she makes a strong case for the importance of other kinds of contributions. You may personally have benefited from movie, restaurant, or book ratings by previous consumers; from connecting with old friends on Facebook even if they do not post daily status updates; or from having an audience for your blog, whether or not the audience leaves comments.

There is some pushback against the transformation of the traditional static, hushed environment of the museum into something more participatory. Dobrzynski argues that museums will lose their current identity and that participatory experience will change “who goes to museums and for what”. Simon addressed this point as well: participatory experiences may only appeal to some, but traditional museum experiences are similarly focused on only a sub-population. The most successful museums will integrate elements of both.

So how do you create a meaningful participatory experience? Simon suggests that constraints are your friend: they lower the barrier to entry. Nothing is more daunting than being asked to write a story on a blank sheet of paper. But anyone can do Mad Libs (and it’s fun!). The motivational effect of a constrained art form has been celebrated from the haiku to the sonnet, and it applies here too. Finally, Simon emphasizes the importance of giving participants feedback. How will their contributions be used? Displayed? Shared? Can you send an email when their work goes “live”?

I’ve mentioned the Idea Box before as an example of participatory creation, but it bears another mention here.

I’m finding this book to be engaging and exciting. It makes me want to go out there and start participating… maybe by creating my own participatory event. Like a soldering workshop for Kids Building Things. :)
