UPDATE: The first version of my digital commonplace book is this blog on blogger.com: http://edcpb1.blogspot.com
I just happened, in the last few days, on the concept of the commonplace book. It has been a revelation, and a project shall commence henceforth, inspired by this wondrous idea.
But I am getting ahead of myself.
A commonplace book is a notebook for keeping interesting snippets of ideas and information: often quotes, excerpts from books, poetry, etc. It is a place for centralizing all the disparate information that one comes across in the course of life, for future reference. It is a little like a diary, but not really time-ordered. Some people kept indexes by title or author or subject.
Now this made a lot of sense in a world where information was trapped in paper (or people’s heads!). It’s a response to general information overload, combined with our poor ability to recall facts, and the general paucity of information technology. You can’t carry a copy of every book you’ve ever read with you, and even if you could, it’s unsearchable. You need a summary, notes, boiled down to a manageable size. The early moderns probably found it quite an improvement on oral rote learning when they could afford it (paper was very expensive).
Now, our intellectual world is completely different. Paper costs next to nothing, and you can take all the notes you like. Moreover, we can store functionally unlimited amounts of text in machines.
Also, we have most of the world’s knowledge increasingly at our fingertips. Why hoard knowledge when you can just go pull it out of the ether at a whim?
And yet, I’ve been bothered for years by the feeling that something is missing in this information-rich environment. I have books, and other printed paper, which sit on shelves and which I can reference if I think to look. I can write notes in notebooks (something I used to do extensively), but they soon become too cumbersome to find anything in. I forget what I’ve written, so I will never go and look, so I may as well never have written it.
On the other hand, Google will hand up general search results to me, but these are unknown quantities, like a crowd of strangers. That’s great when I’m researching something that I don’t know, and haven’t researched before. But if I’m trying to recall and/or work into an area that I have covered before, one I know passingly well, I often don’t want a crowd of strangers; I want to recall work that I’ve seen before.
It’s that moment, when someone remarks on subject X, or I am writing about subject Y, and I think, hey, I’ve seen something relevant to this, it’s –hazy muddled recollection–, hang on it was an article, when did I see it? Some time of searching later, maybe I find it, usually I don’t. Failure is more likely the more time has passed since I originally saw/read the item.
There is so much to know. I have adopted the practice in recent years of consuming as much information as possible. I subscribe to many blogs and newsfeeds. I have many social network connections. I follow links in texts and read background material; I trace out the network of informational connections. When I have a conversation and it becomes clear that my fellow conversers and I are ignorant of something important, I look it up online.
What I decided to do was to take in more information than I could handle usefully, and damn the torpedoes.
And it turns out there is a limit to what you can manage. I’m beyond it. I notice symptoms, like my desire to post links to interesting articles out to others as soon as possible after I read something, because I know the information will fade from my mind; in fact, I’ll forget the existence of the article. People often remind me of things I’ve said or believed in the past, and I’ve completely lost all recollection of them. It’s like an intellectual Groundhog Day. Maybe pumping too much too fast into one’s mind can cause it to split at the seams, and then one can only try to pump faster than the system leaks?
There’s a reason people call information flow “the firehose”.
Really, the amount of information that smashes me in the face will just increase, while, if anything, my ability to handle it as an au naturel human being will continue to decrease. So, either I continue with Groundhog Day (which becomes groundhog 20 hours, groundhog 16 hours, groundhog half day, etc). Or, I limit my intake of knowledge, which is really to doom myself to fail in the task of understanding my universe. Or, I use technology and augment.
My real problem here is that there are several separate realms where knowledge lives, and I’m having an integration issue.
The realms are these:
Internal knowledge: Stuff stored in the grey matter. Limited (I think maybe fixed) size, lossy. Quick to access. Poor resolution. Has links into External Knowledge That I’ve Seen Before.
External Knowledge That I’ve Seen Before: Books, articles, papers that I’ve read, conversations I’ve had, diagrams I’ve seen. Much of this I’ve at least partially forgotten. Even if I have hardly remembered it, it has shaped my thinking. My internal knowledge contains references into this set of knowledge, and I am inconsistent/incoherent where they no longer point to anything.
External Knowledge That I’ve Never Seen Before: Future offerings of the great Goog. It’s an ongoing quest to explore more of this undiscovered country. No links between this and the Internal Knowledge.
(nb: there is some cool Latin terminology for knowledge stored inside the head vs outside, but I can’t remember where I saw it; I tried googling it and failed. I read it in some book by Carl Sagan. This is *exactly* the problem I’m talking about)
The thing I’m interested in is External Knowledge That I’ve Seen Before. The crucial thing that separates it from the other external knowledge is the links between it and the Internal Knowledge. I’m not getting an upgrade of the Internal Knowledge hardware any time soon, so I need to maximise its effectiveness. I think having it filled with half-baked bits and pieces linking out into nothingness is a complete waste; it’s like a piece of software that keeps throwing access violations.
This crystallised concept is what I was looking for when I wrote the post Infinite Bookshelf. Progress!
So how does this all relate back to the commonplace book?
Well, the commonplace book was also about this. You wrote this book to create solid links between Internal Knowledge and External Knowledge That I’ve Seen Before! So it isn’t just a book, it’s outsourcing part of your cognition. That’s exactly what I need to do.
However, I have different issues and opportunities. The issues are that the scale of the external knowledge spheres is much larger. Also, there’s a flow rate which changes what I can deal with; just like you can’t use a household water filter on a town’s water supply, you can’t use a paper book to outsource your Internal-External knowledge links. On the other hand, the opportunities are that there’s much more knowledge, and so much more to gain. Also, computers are far more sophisticated than books, the networked mass of them even more so, so I can do something critical with a much larger collection of knowledge links: I can make it searchable. Crucially, for it to scale, it shouldn’t be designed to be browsed, like a book, but to be searched.
I need to invent something: the 21st-century digital commonplace book. It is a system that can manage internal-external knowledge links. It’s outsourced cognition.
The digital commonplace book
Firstly, how would I use it? Well, this breaks into two pieces, Storage and Retrieval.
– When I’m reading an online article, I should be able to put it into the commonplace book. This might be just via URL (but that may then need to draw in the article at that URL, for indexing purposes). Or, it might just be snippets from something I’m reading online, which could be cut and pasted in, along with URLs and any other useful metadata (Author? …)
– When I’m reading a paper book, I might find snippets that need to go in. I should be able to scan & OCR that text (or type it in in the extreme case, but bleh!). Then add some metadata (crucially, details of the book! ISBN would go a long way…)
– If I’m reading something large online, I should optionally be able to put the whole thing into the commonplace book, where it gets broken up and indexed.
– Storage needs to be quick and painless. The more work required, the less likely I am to do it. e.g. with a book snippet, a handheld device with a hi-res camera and wireless net connection would be ideal; take a photo of the text, send that to the commonplace book, which OCRs it and stores the image, the OCR’d text, plus any hand-entered metadata. e.g. 2: with web-based information, a browser plugin could be grabbing and entering anything I read, with little or no intervention from me.
– Integrating storage with posting on social networks would be ideal. Sometimes things going into the book can also be social network posts. Sometimes social network posts should also go into the book.
– Basic retrieval should be by search. Straightforward Google-style search. It should only search my commonplace book, not other things (nothing from External Knowledge That I’ve Never Seen Before).
– Search retrieval ideally is at a paragraph level. A list of relevant paragraphs is retrieved, which then link back to the whole article in question. Searching directly on entire articles/pieces of information would limit the size of things that could go into the commonplace book; if I add an entire book, and a search just gives me a reference to the book, I still have too much work to do in browsing the book looking for the actual references.
– Categorization would be useful. Particularly, restricting searches to chosen categories might narrow things down when the book gets large. Text-based tags would be good enough. Note though that the use of categories/tags imposes a burden of assigning them on the Storage step.
– A browser plugin that continuously fed text from the screen into the search, and showed strong matches in some kind of sidebar, would be a pretty strong external associative memory function. Nice to have.
– Date-based search criteria would be useful.
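To make the wishlist above a bit more concrete, here is a minimal Python sketch of the storage and retrieval halves. It is only an illustration under assumptions: the names `CommonplaceBook`, `Paragraph`, `store` and `search` are made up for this example, everything lives in memory, "paragraphs" are simply blank-line-separated chunks, and the search is plain word matching rather than anything Google-like.

```python
from dataclasses import dataclass
from datetime import date
import re


@dataclass
class Paragraph:
    text: str
    source: str       # URL or book reference (e.g. an ISBN) of the whole item
    tags: frozenset   # free-text tags assigned at storage time
    added: date       # when the item went into the book


def words(text):
    """Lowercase word tokens, used for both indexing and querying."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


class CommonplaceBook:
    """Hypothetical in-memory store; a real one would persist somewhere."""

    def __init__(self):
        self.paragraphs = []

    def store(self, text, source, tags=(), added=None):
        """Break an item into paragraphs and keep each with its metadata."""
        added = added or date.today()
        for chunk in re.split(r"\n\s*\n", text):
            chunk = chunk.strip()
            if chunk:
                self.paragraphs.append(
                    Paragraph(chunk, source, frozenset(tags), added)
                )

    def search(self, query, tag=None, since=None):
        """Return stored paragraphs containing every query word, in storage order."""
        wanted = words(query)
        hits = []
        for p in self.paragraphs:
            if tag is not None and tag not in p.tags:
                continue  # restrict to a chosen category
            if since is not None and p.added < since:
                continue  # date-based criterion
            if wanted <= words(p.text):
                hits.append(p)
        return hits
```

Each hit carries its `source`, so a result list of paragraphs can link back to the whole article or book, which is the point of searching at paragraph level.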
Implementing a first Digital Commonplace Book
I realize I’ve already had a bash at this. This blog, in fact, was a first, less clear attempt at the concept. However, it isn’t a commonplace book, because blog posts (in this blog) are too much about a serious piece of writing, and not about just throwing interesting tidbits into the pot.
Also, my social network posting (currently at record levels I think, apologies to my network) has been an attempt to remember (and perhaps to load up my social network with my ideas, so later on they might be thrown back at me). I sometimes post a link on Facebook when I intend to read it later in the day. But in the end the stream of Facebook posts sort of goes into a mostly non-useful swill bucket. Facebook is such a crappy walled garden, gah.
A blog is close to what I’m looking for.
I noticed a wonderful thing the other day. When I posted on my blog, then immediately searched on Google, there was the blog post! It’s Google’s new realtime web functionality.
Combined with the fact that you can restrict a Google search to a single blog, this turns a blog into a wonderful free full-text database with absolutely brilliant text-search functionality (it’s Google search!). Provided, that is, that you are ok with the contents of said database being freely accessible. Which I am.
That’s the heart of a digital commonplace book. If I were to use a blog, I’d need to abuse the concept slightly; given a text to go into the “blog”, I’d want to break it into paragraphs, add each paragraph as a post of its own, all with links back to the original (which also needs to be added I guess). Some fun and games with categories might be useful so you can tell paragraph chunks from original texts.
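The paragraph-splitting abuse of a blog described above could be sketched roughly like this in Python. It is a sketch under assumptions, not a real Blogger client: `make_posts` is a hypothetical helper, the posts are plain dictionaries, the labels `original` and `paragraph` are invented category names, and `original_url` is assumed to be already known (either the source article’s URL or the URL of the post holding the full text).

```python
import re


def make_posts(title, text, original_url):
    """Turn one text into an 'original' post plus one post per paragraph.

    Every paragraph post ends with a link back to the original, and the
    labels let you tell paragraph chunks from original texts.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    # The full text goes in as its own post.
    posts = [{"title": title, "body": text, "labels": ["original"]}]

    # Each paragraph becomes a small post pointing back at the original.
    for n, para in enumerate(paragraphs, start=1):
        posts.append({
            "title": f"{title} (paragraph {n})",
            "body": f"{para}\n\nFrom: {original_url}",
            "labels": ["paragraph"],
        })
    return posts
```

Submitting the resulting posts to the blog is left out on purpose; that part depends on whichever blog API ends up being used.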
The storage functions would require custom apps. Google App Engine would help here, as would some Firefox plugin development. I might also want a server somewhere (Amazon? Or just under my desk?) taking scanned text, OCRing it, and submitting the results to the blog.
Basic retrieval can be based on a simple search box on the blog, which can just be added as a widget. And, to do more sophisticated things, custom code can talk to Google to get search results, then massage them into the shape desired.
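As a small illustration of restricting a search to one blog, the Python sketch below builds an ordinary Google search URL using the `site:` operator. The helper name `blog_search_url` is made up, and this only constructs a link to open in a browser; it does not fetch or parse results, which would need a proper search API.

```python
from urllib.parse import urlencode


def blog_search_url(blog_host, query):
    """Build a Google search URL limited to a single blog via `site:`."""
    # urlencode takes care of escaping spaces and the colon in "site:".
    return "https://www.google.com/search?" + urlencode(
        {"q": f"site:{blog_host} {query}"}
    )


# Example: search only the commonplace book blog.
print(blog_search_url("edcpb1.blogspot.com", "commonplace book"))
```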
A blog with a solid API seems to be the thing. Blogger would be a good bet, and as all the rest of the tools come from Google, it might integrate better too.
I’ll need to check if there are limits on the size of a Blogger blog, or posting limits.
Well, I’m out of steam for this post. Great idea though, I think, and I can implement it incrementally.