The Digital Commonplace Book

Commonplace Book from the 17th Century

UPDATE: The first version of my digital commonplace book is this blog on

I just happened, in the last few days, on the concept of the commonplace book. It has been a revelation, and a project shall commence henceforth, inspired by this wondrous idea.

But I am ahead of myself.

A commonplace book is a notebook in which to keep interesting intellectual and information snippets, often quotes, excerpts from books, poetry, etc. It is a place for centralizing all the disparate information that one comes across in the course of life, for future reference. It is a little like a diary, but not really time ordered. Some people kept indexes by title or author or subject.

Now this made a lot of sense in a world where information was trapped in paper (or people’s heads!). It’s a response to general information overload, combined with our poor ability to recall facts, and the general paucity of information technology. You can’t carry a copy of every book you’ve ever read with you, and even if you could, it’s unsearchable. You need a summary, notes, boiled down to a manageable size. The early moderns probably found it quite an improvement on oral rote learning when they could afford it (paper was very expensive).

Now, our intellectual world is completely different. Paper costs next to nothing, and you can take all the notes you like. Moreover, we can store functionally unlimited amounts of text in machines.

Also, we have most of the world’s knowledge increasingly at our fingertips. Why hoard knowledge when you can just go pull it out of the ether at a whim?

And yet, I’ve been bothered for years by the feeling that something is missing in this information-rich environment. I have books, and other printed paper, which sit on shelves and which I can reference if I think to look. I can write notes in notebooks (something I used to do extensively), but they soon become too cumbersome to find anything in. I forget what I’ve written, so I will never go and look, so I may as well never have written it.

On the other hand, Google will hand up general search results to me, but these are unknown quantities, like a crowd of strangers. That’s great when I’m researching something that I don’t know, and haven’t researched before. But, if I’m trying to recall and/or work into an area that I have covered before, one I know passingly well, I often don’t want a crowd of strangers, I want to recall work that I’ve seen before.

It’s that moment, when someone remarks on subject X, or I am writing about subject Y, and I think, hey, I’ve seen something relevant to this, it’s –hazy muddled recollection–, hang on it was an article, when did I see it? Some time of searching later, maybe I find it, usually I don’t. Failure is more likely the more time has passed since I originally saw/read the item.

There is so much to know. I have adopted the practice in recent years of consuming as much information as possible. I subscribe to many blogs and newsfeeds. I have many social network connections. I follow links in texts and read background material; I trace out the network of informational connections. When I have a conversation and it becomes clear that my fellow conversers and I are ignorant of something important, I look it up online.

What I decided to do was to take in more information than I could handle usefully, and damn the torpedoes.

And it turns out there is a limit to what you can manage. I’m beyond it. I notice symptoms like my desire to post links to interesting articles out to others as soon as possible after I read something, because I know the information will fade from my mind; in fact, I’ll forget the existence of the article. I often notice that people remind me of things I’ve said or believed in the past, and I’ve completely lost all recollection of them. It’s like an intellectual Groundhog Day. Maybe pumping too much too fast into one’s mind can cause it to split at the seams, and then one can only try to pump faster than the system leaks?

There’s a reason people call information flow “the firehose”.

Really, the amount of information that smashes me in the face will just increase, while, if anything, my ability to handle it as an au naturel human being will continue to decrease. So, either I continue with Groundhog Day (which becomes groundhog 20 hours, groundhog 16 hours, groundhog half day, etc). Or, I limit my intake of knowledge, which is really to doom myself to fail in the task of understanding my universe. Or, I use technology and augment.

Let’s augment.

The Problem

My real problem here is that there are some separate realms where knowledge lives, and I’m having an integration issue.

The realms are these:

Internal knowledge: Stuff stored in the grey matter. Limited (I think maybe fixed) size, lossy. Quick to access. Poor resolution. Has links into External Knowledge That I’ve Seen Before.

External Knowledge That I’ve Seen Before: Books, articles, papers that I’ve read, conversations I’ve had, diagrams I’ve seen. Much of this I’ve at least partially forgotten. Even if I have hardly remembered it, it has shaped my thinking. My internal knowledge contains references into this set of knowledge, and I am inconsistent/incoherent where they no longer point to anything.

External knowledge that I’ve never seen before: Future offerings of the great Goog. It’s an ongoing quest to explore more of this undiscovered country. No links between this and the Internal Knowledge.

(nb: there is some cool Latin terminology for knowledge stored inside the head vs outside, but I can’t remember where I saw it; I tried googling it and failed. I read it in some book by Carl Sagan. This is *exactly* the problem I’m talking about)

The thing I’m interested in is External Knowledge That I’ve Seen Before. The crucial thing that separates it from the other external knowledge is the links between it and the Internal Knowledge. I’m not getting an upgrade of the Internal Knowledge hardware any time soon, so I need to maximise its effectiveness. I think having it filled with half-baked bits and pieces linking out into nothingness is a complete waste; it’s like a piece of software that keeps throwing access violations.

This crystallised concept is what I was looking for when I wrote the post Infinite Bookshelf. Progress!

So how does this all relate back to the commonplace book?

Well, the commonplace book was also about this. You wrote this book to create solid links between Internal Knowledge and External Knowledge That I’ve Seen Before! So it isn’t just a book, it’s outsourcing part of your cognition. That’s exactly what I need to do.

However, I have different issues and opportunities. The issues are that the scale of the external knowledge spheres is much larger. Also, there’s a flow rate which changes what I can deal with; just like you can’t use a household water filter on a town’s water supply, you can’t use a paper book to outsource your Internal-External knowledge links. On the other hand, the opportunities are that there’s much more knowledge, and so much more to gain. Also, computers are far more sophisticated than books, and the networked mass of them even more so, so I can do something critical with a much larger collection of knowledge links: I can make it searchable. Crucially, for it to scale, it shouldn’t be designed to be browsed, like a book, but to be searched.

I need to invent something: the 21st-century digital commonplace book. It is a system that can manage internal-external knowledge links. It’s outsourced cognition.

The digital commonplace book

Firstly, how would I use it? Well, this breaks into two pieces, Storage and Retrieval.

Storage


– When I’m reading an online article, I should be able to put it into the commonplace book. This might be just via URL (but that may then need to draw in the article at that URL, for indexing purposes). Or, it might just be snippets from something I’m reading online, which could be cut and pasted in, along with URLs and any other useful metadata (Author? …)

– When I’m reading a paper book, I might find snippets that need to go in. I should be able to scan & OCR that text (or type it in in the extreme case, but bleh!). Then add some metadata (crucially, details of the book! ISBN would go a long way…)

– If I’m reading something large online, I should optionally be able to put the whole thing into the commonplace book, where it gets broken up and indexed.

– Storage needs to be quick and painless. The more work required, the less likely I am to do it. eg: with a book snippet, a handheld device with a hi-res camera and wireless net connection would be ideal; take a photo of the text, send that to the commonplace book, which OCRs it and stores the image, the OCR’d text, plus any hand-entered metadata. eg2: with web-based information, a browser plugin could be grabbing and entering anything I read, with little or no intervention from me.

– Integrating storage with posting on social networks would be ideal. Sometimes things going into the book can also be social network posts. Sometimes social network posts should also go into the book.
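
To make the storage side a little more concrete, here is a minimal sketch of what one stored item might look like. The `Entry` class, its field names, and the two helper functions are all hypothetical, made up for illustration; they only mirror the bullets above (a snippet, a URL, some metadata such as author or ISBN).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entry:
    """One item in the commonplace book (hypothetical record shape)."""
    text: str                                      # the snippet, pasted or OCR'd
    source_url: str = ""                           # where it came from, if online
    metadata: dict = field(default_factory=dict)   # e.g. author, ISBN
    tags: list = field(default_factory=list)       # optional categories
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def from_web_snippet(text, url, author=None):
    """Build an entry from text cut and pasted out of an online article."""
    metadata = {"author": author} if author else {}
    return Entry(text=text.strip(), source_url=url, metadata=metadata)


def from_book_snippet(ocr_text, isbn, page=None):
    """Build an entry from scanned and OCR'd text out of a paper book."""
    metadata = {"isbn": isbn}
    if page is not None:
        metadata["page"] = page
    return Entry(text=ocr_text.strip(), metadata=metadata)
```

Keeping the record this small is deliberate: the fewer fields that must be filled in by hand, the more likely the storage step actually happens.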


Retrieval

– Basic retrieval should be by search. Straightforward Google-style search. It should only search my commonplace book, not other things (nothing from External knowledge that I’ve never seen before).

– Search retrieval ideally is at a paragraph level. A list of relevant paragraphs is retrieved, which then link back to the whole article in question. Searching directly on entire articles/pieces of information would limit the size of things that could go into the commonplace book; if I add an entire book, and a search just gives me a reference to the book, I still have too much work to do in browsing the book looking for the actual references.

– Categorization would be useful. Particularly, restricting searches to chosen categories might narrow things down when the book gets large. Text-based tags would be good enough. Note though that the use of categories/tags imposes a burden of assigning them on the Storage step.

– A browser plugin that continuously fed text from the screen into the search, and showed strong matches in some kind of sidebar, would be a pretty strong external associative memory function. Nice to have.

– Date-based criteria would be useful.
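
The paragraph-level search and tag filtering described above can be sketched with a tiny in-memory inverted index. This is only an illustration under simplifying assumptions: the `ParagraphIndex` class and its method names are hypothetical, paragraphs are assumed to be separated by blank lines, and matching is plain "every query word appears", with none of the ranking a real search engine would do.

```python
import re
from collections import defaultdict


def split_paragraphs(text):
    """Split a document on blank lines, dropping empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def tokenize(text):
    """Lowercase the text and pull out simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class ParagraphIndex:
    def __init__(self):
        self.paragraphs = []              # (source, paragraph text, tags)
        self.postings = defaultdict(set)  # word -> set of paragraph ids

    def add_document(self, source, text, tags=()):
        """Index every paragraph of a document separately."""
        for para in split_paragraphs(text):
            pid = len(self.paragraphs)
            self.paragraphs.append((source, para, frozenset(tags)))
            for word in set(tokenize(para)):
                self.postings[word].add(pid)

    def search(self, query, tag=None):
        """Return (source, paragraph) pairs containing every query word."""
        words = tokenize(query)
        if not words:
            return []
        ids = set.intersection(*(self.postings.get(w, set()) for w in words))
        results = []
        for pid in sorted(ids):
            source, para, tags = self.paragraphs[pid]
            if tag is None or tag in tags:
                results.append((source, para))
        return results
```

Each hit carries its source along with it, so a result is a single relevant paragraph plus a pointer back to the whole article, rather than a reference to an entire book that still has to be browsed.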

Implementing a first Digital Commonplace Book

I realize I’ve already had a bash at this. This blog, in fact, was a first, less clear attempt at the concept. However, it isn’t a commonplace book, because blog posts (in this blog) are too much about a serious piece of writing, and not about just throwing interesting tidbits into the pot.

Also, my social network posting (currently at record levels I think, apologies to my network) has been an attempt to remember (and perhaps to load up my social network with my ideas, so later on they might be thrown back at me). I sometimes post a link on Facebook when I intend to read it later in the day. But in the end the stream of Facebook posts sort of goes into a mostly non-useful swill bucket. Facebook is such a crappy walled garden, gah.

A blog is close to what I’m looking for.

I noticed a wonderful thing the other day. When I posted on my blog, then immediately searched on Google, there was the blog post! It’s Google’s new realtime web functionality.

Combined with the fact that you can restrict a Google search to a single blog, this turns a blog into a wonderful free full-text database with absolutely brilliant text-search functionality (it’s Google search!). Provided, that is, that you are ok with the contents of said database being freely accessible. Which I am.

That’s the heart of a digital commonplace book. If I were to use a blog, I’d need to abuse the concept slightly; given a text to go into the “blog”, I’d want to break it into paragraphs, add each paragraph as a post of its own, all with links back to the original (which also needs to be added I guess). Some fun and games with categories might be useful so you can tell paragraph chunks from original texts.
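
As a rough sketch of that "abuse", the function below turns one text into a list of would-be posts: the original, plus one post per paragraph that points back to its source. It deliberately stops short of talking to any blogging service. The `build_posts` name, the dictionary keys, and the "original"/"chunk" labels are all hypothetical, and the link goes to the source URL rather than to the original's blog post, since that address is only known once the original has been published.

```python
import re


def build_posts(title, source_url, text):
    """Turn one text into an 'original' post plus one post per paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # The full text goes in once, labelled so it can be told apart from chunks.
    posts = [{
        "title": title,
        "body": text.strip(),
        "labels": ["original"],
        "source_url": source_url,
    }]
    # Each paragraph becomes its own small post with a link back.
    for number, paragraph in enumerate(paragraphs, start=1):
        posts.append({
            "title": f"{title} (paragraph {number} of {len(paragraphs)})",
            "body": f"{paragraph}\n\nFrom: {source_url}",
            "labels": ["chunk"],
            "source_url": source_url,
        })
    return posts
```

The labels play the role of the "fun and games with categories": a search restricted to chunks returns paragraphs, while the originals stay available for reading in full.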

The storage functions would require custom apps. Google App Engine would help here, as would some Firefox plugin development. I might also want a server somewhere (Amazon? Or just under my desk?) taking scanned text, OCRing it, and submitting the results to the blog.

Basic retrieval can be based on a simple search box on the blog, which can just be added in as a widget. And, to do more sophisticated things, custom code can talk to Google to get search results, then massage them into the shape desired.

A blog with a solid API seems to be the thing. Blogger would be a good bet, and as all the rest of the tools come from Google, it might integrate better too.

I’ll need to check if there are limits on the size of a Blogger blog, or posting limits.
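
The simplest form of "search only this blog" is Google's `site:` operator in an ordinary search URL. The sketch below only builds such a URL; it does not fetch or parse results. The function name is made up for illustration, and `example.blogspot.com` in the usage note is a placeholder host, not the real blog address.

```python
from urllib.parse import urlencode


def blog_search_url(blog_host, query):
    """Build a Google search URL restricted to one site via the site: operator."""
    return "https://www.google.com/search?" + urlencode(
        {"q": f"site:{blog_host} {query}"}
    )
```

For example, `blog_search_url("example.blogspot.com", "commonplace book")` produces a link that could sit behind a search box or a bookmark.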

Well, I’m out of steam for this post. Great idea though, I think, and I can implement it incrementally.


4 thoughts on “The Digital Commonplace Book”

  1. Kat says:

    I struggle with the same challenges, being a relentless infovore. I keep a personal and a public wiki, but both suffer from lack of maintenance. I love the flexibility of organization it gives, but I need to figure out how to make the process a little more automatic so I’ll actually *use* it…

    1. emlyn says:

      I have a lot of digital detritus too 🙂

      I’m finding that the more I use computers et al for my information consumption, the more I maintain things. eg: I keep a reading list, more or less complete for a while now, in a Google document (there’s a link on this blog to it). It’s staying up to date because I read as far as possible on a little netbook, so the act of reading and of updating the record uses the same device.

      Something I didn’t mention in the post was that one thing I’m assuming is ever more convenient access to digital information, particularly smartphones and/or tablets. I think if I’m reading on a touchscreen tablet, say, and can press a couple of buttons to inject that reading material into my commonplace book, it’ll actually happen. It needs to be integrated smoothly into the reading experience, which is a software development challenge for me 🙂

  2. Scott says:

    I just found this post and two days ago discovered the concept of the commonplace book. When I did I searched for more information and discovered that the English physician John Locke invented an indexing system for commonplace books that was widely publicized and adopted.

    There were several methods of writing them with various advantages and drawbacks. One of the problems that most other systems had was wasted paper in a book or inadequate allocation of space beforehand. His system all but solved these problems.

    But his system also helped to create associations among the various pieces of information in the book by grouping similar ideas together. Computers don’t suffer the constraints of space paper books do, but we can still profit from the grouping of similar ideas together.

    In Locke’s system, the index for the book was on the last two facing pages of the book. Each entry in the book was given a heading, usually in Latin as was common at the time, and then he used a simple device to categorize the entry based on its constituent characters. Each category consisted of a two-character code. The first character was derived from the first character of the entry heading. The second character was assigned by finding the first vowel after the first character in the heading. So, if an entry were titled “Class Warfare,” the classification code would be ca.
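
The two-character rule described above is small enough to sketch in a few lines. This is an illustration of the rule as stated in this comment, not a reconstruction of Locke's full method; the function name is made up, and the fallback for a heading with no later vowel is an assumption.

```python
def locke_code(heading):
    """Return the two-character class code for an entry heading."""
    letters = [c for c in heading.lower() if c.isalpha()]
    if not letters:
        raise ValueError("heading contains no letters")
    first = letters[0]
    # Look for the first vowel after the first character.
    for c in letters[1:]:
        if c in "aeiou":
            return first + c
    # Assumption: with no later vowel, fall back to the first letter alone.
    return first
```

With this, `locke_code("Class Warfare")` gives `ca`, matching the example above.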

    When an unused class code was being employed for the first time, he would turn to an unused set of blank facing pages and write the code at the upper left and then write his entry. Afterward, he would turn to his index and write a page number next to the ca classification code to indicate that entries matching that code could be found on that page number.

    In this way, he was able to quickly find any entry for any topic. I have adapted this idea to the computer with the use of plain text files and a bash shell script. In my implementation, the commonplace book is stored in a Dropbox folder. Each classification code is stored as a subfolder. A single text file with a .txt extension is placed within this subfolder when creating an entry matching that code for the first time.

    The files themselves blend the conventions of Microsoft Notepad .LOG files with markdown syntax. Instead of Notepad, I use Vim and programmed it with behavior that mimics Notepad when it loads a .LOG file; it automatically appends the time and date to each entry, which is given a markdown-formatted header.

    When I am done creating my entry, my bash program re-indexes my book, creating a top-level text file called index.txt containing the names of all used categories together with their accumulated entry headers.
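
The re-indexing step can be sketched as follows. The actual implementation described here is a bash script that is not shown, so this is a Python stand-in under stated assumptions: each class code is a subfolder holding `.txt` files, entry headers are lines starting with `#`, and the layout of `index.txt` (code on one line, indented headers beneath it) is a guess. The `reindex` name is hypothetical.

```python
from pathlib import Path


def reindex(book_dir):
    """Rebuild index.txt from the markdown headers in each code subfolder."""
    book = Path(book_dir)
    lines = []
    for folder in sorted(p for p in book.iterdir() if p.is_dir()):
        headers = []
        for entry_file in sorted(folder.glob("*.txt")):
            for line in entry_file.read_text(encoding="utf-8").splitlines():
                # Assumption: every entry header is a markdown '#' line.
                if line.startswith("#"):
                    headers.append(line.lstrip("#").strip())
        if headers:
            lines.append(folder.name)
            lines.extend(f"    {h}" for h in headers)
    index_path = book / "index.txt"
    index_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return index_path
```

Because the index is rebuilt from scratch on every run, it cannot drift out of step with the entry files, which suits a system edited from several devices.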

    On my phone or tablet I can quickly open the index file to see what entries are in my book and almost as quickly open the file containing the entries I want to review.

    Several articles I came across suggested using Microsoft Word or OpenOffice/LibreOffice. I resisted doing this. Commonplace books have been a source of scholarly study, offering insight into the mind of the writer that their writings for publication did not. Paper and ink work just as well today as they did in the seventeenth century, but the word processing file formats popular even just twenty years ago cannot be read today by most software. Plain text files from as long ago as the 1960s, by contrast, are just as readable today using any viewing or editing program as they were the day they were written.

    I store my plain text in UTF-8 encoded files. My system is deliberately simple and flexible. Any program that can edit text can integrate with my system.

    The benefits of this practice should accumulate over time, but I am already gaining better knowledge and understanding from it. For my first book I chose the topic of politics and am reading extensively on-line about the various candidates and issues building to the 2016 election.
