## Cracking the ISBN code

February 11th, 2011 at 10:53 pm (Books)

There are all sorts of ways to look up a book in the library: by author, by title, by subject, by call number… But today, a patron approached me with only the book’s ISBN and asked me to locate it for her. Fortunately, the catalog system does permit searches on ISBNs (and even better, the computer had a numeric keypad to make entering it easy), and I quickly found the book, got the call number, walked the patron through the shelves, and whipped out the book, as if by magic. :) As I was leaving, she asked, “What is this thing, this ISBN?” and I realized that I didn’t actually know anything about ISBNs as a concept, not even what the acronym stood for.

An ISBN is an International Standard Book Number. The original (non-international) Standard Book Number, created in 1966 for booksellers, used 9 digits. Its derivative, the ISBN, has grown from 10 to 13 digits, as ISO, the international body that governs such things, noticed that more and more books were being created.

Even more interesting, the 13-digit ISBN has structure. It consists of:

- a GS1 prefix (978 means book publishing)
- a group identifier, which indicates the country, region, or language area of origin
- a publisher code (one of 628,000)
- an item number
- a checksum or check digit

Take for example the mouth-watering book Geographies of Mars. The ISBN found on its back cover is 978-0-226-47078-8. We know therefore that this is a book (978), from an English-speaking country (0), by publisher 226 (University of Chicago Press), and it is item 47078. Nine digits sit between the prefix and the check digit, so with 1 digit for the group and 3 for the publisher code, the item number gets the remaining 5.
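As a rough illustration of that structure, here is a small Python sketch that splits a hyphenated ISBN-13 into its five parts. The function name `split_isbn13` and the field names are my own inventions for this example, and the sketch assumes the hyphens are already present: the group, publisher, and item fields vary in length, so an unhyphenated number can't be split this way without the official range tables.

```python
def split_isbn13(isbn: str) -> dict:
    """Split a hyphenated ISBN-13 into its five labeled parts.

    Assumes the ISBN is written with hyphens between the parts,
    as it is on the back cover in the example above.
    """
    parts = isbn.strip().split("-")
    if len(parts) != 5:
        raise ValueError("expected five hyphen-separated parts")
    prefix, group, publisher, item, check = parts
    # All parts together must add up to exactly 13 digits.
    joined = "".join(parts)
    if len(joined) != 13 or not (joined.isascii() and joined.isdigit()):
        raise ValueError("an ISBN-13 must contain exactly 13 digits")
    return {
        "prefix": prefix,        # GS1 prefix, e.g. 978
        "group": group,          # group identifier
        "publisher": publisher,  # publisher code
        "item": item,            # item number
        "check": check,          # check digit
    }


fields = split_isbn13("978-0-226-47078-8")
print(fields["publisher"], fields["item"])
```

For the example book, this picks out 226 as the publisher code and 47078 as the item number.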

The checksum, ah, now that’s fun. Checksums are commonly used to detect whether or not something was lost in transmission. A checksum value is computed over a block of data before it is sent, and then the receiver can compute the same value on what was actually received and see if it matches. If not, it requests that the data be re-sent. (Note that the error could have occurred in the data or in the checksum itself, which is also transmitted; either way, it’s safest to retransmit the data.)

Here, the checksum (digit) is computed over the first 12 digits in the number. This allows bar code scanners to confirm that they’ve correctly scanned the number, or to rescan it if not. (Or a manual entry system to alert the user if they type in an invalid ISBN.) The algorithm for the ISBN checksum reads like an exercise for an introductory CS course (oh wait, someone already thought of that, and someone else, and someone else…):

Each digit, from left to right, is alternately multiplied by 1 or 3, then those products are summed modulo 10 to give a value ranging from 0 to 9. Subtracted from 10, that leaves a result from 1 to 10. A zero (0) replaces a ten (10), so, in all cases, a single check digit results. [from Wikipedia]

So for my example book, we have 9 + 7×3 + 8 + 0×3 + 2 + 2×3 + 6 + 4×3 + 7 + 0×3 + 7 + 8×3 = 102. Then 102 mod 10 is 2, which we subtract from 10 to get 8. And sure enough, the final digit in the ISBN is… 8! Checksum complete!
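Since this really does read like an intro CS exercise, here is one way the calculation above might look in Python. This is just a sketch of the 1-and-3 weighting rule as quoted; the function names `isbn13_check_digit` and `is_valid_isbn13` are made up for this example.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... from left to right.
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(first12)
    )
    # Subtract the remainder from 10; the outer % 10 turns a 10 into 0.
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check a 13-digit ISBN (hyphens and spaces allowed) against its check digit."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


print(isbn13_check_digit("978022647078"))    # 8
print(is_valid_isbn13("978-0-226-47078-8"))  # True
print(is_valid_isbn13("978-0-226-47079-8"))  # False: one digit mistyped
```

The last line shows the point of the exercise: change a single digit and the sum no longer matches the check digit, so a scanner (or a typo-prone human) finds out right away.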

And now I’ll be able to answer that question about ISBNs promptly, should it ever come up again. Next up: memorizing the Dewey Decimal System.

## Daniel said,

February 17, 2011 at 2:06 pm

(Learned something new!) Wonderful. I always wondered what the ISBN codes meant. I’ll try it out on some books of mine.

## Jon S said,

February 21, 2011 at 10:26 pm

(Learned something new!) Holy cow!

I have to deal with ISBNs all the time when I edit Wikipedia (where you have to meticulously cite sources…). But it never occurred to me that there was order to the ISBN number!