Deep Learning 2.0

The net is often criticized for enabling shallow learning, while leaving deep learning largely untouched. It’s easier than ever to pick up simple facts and trivia, recipe-style techniques, little snippets of information. But if you want to learn the equivalent of an undergraduate chemistry course, it’s just as much of a slog as it ever was.

There are people who do deep learning differently, though. Autodidacts have the skill of sifting through structured texts and courses, in areas that are supposed to require long, slow, deep learning, and “picking out the eyes”: grabbing just the knowledge they need. The problem is that doing this takes some really unusual skills and abilities, because the materials actively get in the way of this approach.

There is no reason we can’t make this easier using the ‘net. There are masses of deep learning content available for free online; it’s just impenetrable, to a greater or lesser extent. If it were structured appropriately for self-learners, we might get a modest renaissance going!

I propose a simple structure for representing the essence of deep, structured learning materials, in a way that makes them usable on an ad-hoc basis.

There are three fundamental entities in the structure.

– Know
– Learn
– Person

A “Know” is a description of an atom of knowledge/skill/understanding. It could describe anything. For example:

Know Name: WordPress-Blog-Posting
Know Description: Understanding of and ability to post to a WordPress Blog

A “Learn” is an atom of learning. It describes how to acquire one or more Knows.

Learn Name: Tutorial on How to post to WordPress
Learn Description: Refer to the tutorial at http://someaddress.com . Run through the whole thing, and there are some self test questions at the end if you want to make sure you understand.
Approx Duration: 20 min
Depends on these Knows: WordPress-Blog-Signup, Basic-Browser-Use
Delivers these Knows: WordPress-Blog-Posting

A “Person” is a description of a person in the context of the system. In particular, the system tracks a list of Knows that the person says they have, and possibly a list of Learns that the person has undertaken.
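
To make the three entities concrete, here is a minimal sketch of how they might be represented. It is only one possible shape, not a specification: the field names (`approx_minutes`, `depends_on`, `delivers`, `learns_done`), the `complete` helper and the user "alice" are all assumptions made up for illustration. Knows are referred to by name, tag style.

```python
from dataclasses import dataclass, field


@dataclass
class Know:
    """An atom of knowledge, skill, or understanding."""
    name: str
    description: str = ""


@dataclass
class Learn:
    """An atom of learning: how to acquire one or more Knows."""
    name: str
    description: str = ""
    approx_minutes: int = 0
    depends_on: set = field(default_factory=set)  # names of Knows needed first
    delivers: set = field(default_factory=set)    # names of Knows gained


@dataclass
class Person:
    """A user, the Knows they say they have, and the Learns they have done."""
    name: str
    knows: set = field(default_factory=set)
    learns_done: list = field(default_factory=list)

    def complete(self, learn: Learn) -> None:
        """Record a finished Learn and credit the Knows it delivers."""
        self.learns_done.append(learn.name)
        self.knows |= learn.delivers


posting = Know(
    "WordPress-Blog-Posting",
    "Understanding of and ability to post to a WordPress Blog",
)
tutorial = Learn(
    name="Tutorial on How to post to WordPress",
    approx_minutes=20,
    depends_on={"WordPress-Blog-Signup", "Basic-Browser-Use"},
    delivers={"WordPress-Blog-Posting"},
)

# "alice" is a hypothetical user who already has both prerequisite Knows.
alice = Person("alice", knows={"WordPress-Blog-Signup", "Basic-Browser-Use"})
if tutorial.depends_on <= alice.knows:  # all prerequisites satisfied?
    alice.complete(tutorial)
```

Nothing here checks whether alice really knows anything; like the system described above, it just records what she says she has.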

Given this simple three element system, all kinds of use cases are possible.

– Acquire “Know X”
The Person wants to acquire Know X. The system traverses the graph of dependencies between Know X and the list of Knows the person has already acquired. It can then recommend a path from where they are to the acquisition of Know X, as an ordered list of Learns, just as Google Maps can give you directions from point A to point B. There can be a set of alternative recommendations.

Then the user just visits each Learn when ready, follows the instructions, and works until they feel they have acquired the Knows.
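
The traversal for this use case could be sketched roughly as follows. This is an assumption about how one might do it, not a settled design: it works backwards from the target Know, picks the cheapest option by total minutes, and gives up on circular dependencies. The "Sign up for WordPress" and "Video course on WordPress posting" Learns and their durations are invented for the example.

```python
from typing import NamedTuple


class Learn(NamedTuple):
    name: str
    minutes: int
    depends_on: frozenset = frozenset()
    delivers: frozenset = frozenset()


def plan(target, have, learns, visiting=frozenset()):
    """Return an ordered list of Learn names leading to `target`, or None.

    `have` is the set of Knows already held; `learns` maps name -> Learn.
    """
    if target in have:
        return []
    if target in visiting:  # circular dependency: abandon this branch
        return None
    best = None
    for learn in learns.values():
        if target not in learn.delivers:
            continue
        known, path = set(have), []
        for dep in sorted(learn.depends_on):
            sub = plan(dep, known, learns, visiting | {target})
            if sub is None:
                break  # this Learn has a prerequisite we cannot reach
            path += sub
            for name in sub:
                known |= learns[name].delivers
        else:  # every prerequisite was reachable
            path.append(learn.name)
            cost = sum(learns[n].minutes for n in path)
            if best is None or cost < best[0]:
                best = (cost, path)
    return None if best is None else best[1]


learns = {l.name: l for l in [
    Learn("Sign up for WordPress", 10,
          frozenset({"Basic-Browser-Use"}),
          frozenset({"WordPress-Blog-Signup"})),
    Learn("Tutorial on How to post to WordPress", 20,
          frozenset({"WordPress-Blog-Signup", "Basic-Browser-Use"}),
          frozenset({"WordPress-Blog-Posting"})),
    Learn("Video course on WordPress posting", 90,
          frozenset({"Basic-Browser-Use"}),
          frozenset({"WordPress-Blog-Posting"})),
]}

route = plan("WordPress-Blog-Posting", {"Basic-Browser-Use"}, learns)
print(route)
# ['Sign up for WordPress', 'Tutorial on How to post to WordPress']
```

Returning the runners-up as well as the cheapest route would give the set of alternative recommendations.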

– Add to the system
People can simply add Learns and Knows as they see fit. Know and Learn names are tag-like; you can define them as you wish, and people can attach their Learns to your Knows if they like.
Maintaining existing Knows and Learns could be Wikipedia-like: anyone can edit, but there’s a system of admins to sort out problems. Or they could all be owned by their creators, and if you don’t like what a person has created, you can create rival Knows and Learns that you think better reflect reality.
The dependency system (Learn X requires Know Y and gives you Know Z) might also be vote/like based; anyone can propose new dependencies, and give a thumbs up/thumbs down on existing ones, so over time you get a good relative sense of which dependencies are truly important and which aren’t.
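
One way the thumbs up/thumbs down idea might look in code, as a rough sketch only. The class name, the one-vote-per-person rule, the `min_score` threshold and the "HTML-Basics" Know are all assumptions for illustration; a real system would probably want something smarter than a raw net score.

```python
from collections import defaultdict


class DependencyVotes:
    """Tracks votes on proposed "Learn X requires Know Y" dependencies."""

    def __init__(self):
        # (learn, know) -> {voter: +1 or -1}
        self._votes = defaultdict(dict)

    def vote(self, learn, know, voter, up):
        """Cast or change a vote; each voter counts once per dependency."""
        self._votes[(learn, know)][voter] = 1 if up else -1

    def score(self, learn, know):
        """Net score: thumbs up minus thumbs down."""
        return sum(self._votes.get((learn, know), {}).values())

    def accepted(self, learn, min_score=1):
        """Dependencies of `learn` whose net score reaches `min_score`."""
        return sorted(
            know
            for (name, know) in self._votes
            if name == learn and self.score(name, know) >= min_score
        )


votes = DependencyVotes()
tutorial = "Tutorial on How to post to WordPress"

for voter in ("ann", "bob", "cat"):  # hypothetical voters
    votes.vote(tutorial, "WordPress-Blog-Signup", voter, up=True)

votes.vote(tutorial, "HTML-Basics", "ann", up=True)
votes.vote(tutorial, "HTML-Basics", "bob", up=False)
votes.vote(tutorial, "HTML-Basics", "cat", up=False)

print(votes.accepted(tutorial))  # ['WordPress-Blog-Signup']
```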

– Prove your knowledge
A social relationship between Persons, or just public profiles where desired, could let others see the “Knows” you have.
As an extension, a Know could be owned by someone who reserves the right to allocate the Know to other people. This means they take on the responsibility of verifying that any given person knows their Know, which might be test-based, or based on records of use of a third party system, or even based on undertaking a university or other private, commercial course. These kinds of third party verified Knows could become resume items in the real world. Of course, this could hamper autodidacticism, but then anyone can create alternative, unverified routes around such Knows, and/or set up an open, free verification system. A simple peer review approach could also be built into this system itself as another extension. Note also that this is all secondary; access to structured learning tools themselves is the primary thing.

This all needs more thought. Firstly, it needs to be a self-managing, “crowdsourced” system. Success would rely, I think, on getting a system of reputation exactly right, so that reputation-based incentives drive engagement.

Knows and Learns need to be the smallest possible atoms. If you see a Know or a Learn and think “That’s too complex, I could break that into parts and associate them with dependencies” then that should be done. Approx Duration should ideally be on the order of a few days at the most; hours would be better.

There probably also needs to be a “combine” mechanism, maybe more like “subsume”: take Know X, and subsume it into Know Y (change all dependencies on X to be on Y, maybe add in some of the info from X to Y, then delete X).
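
A subsume operation along those lines might look something like this sketch. The dictionary layout, the duplicate Know "WP-Posting", and the "Adding images to posts" Learn are assumptions invented for the example.

```python
def subsume(x, y, knows, learns):
    """Fold Know x into Know y.

    knows:  dict of Know name -> description
    learns: dict of Learn name -> {"depends_on": set, "delivers": set}
    """
    # Repoint every reference to x so that it refers to y instead.
    for learn in learns.values():
        for key in ("depends_on", "delivers"):
            if x in learn[key]:
                learn[key] = (learn[key] - {x}) | {y}
    # Carry some of x's info over to y, then delete x.
    knows[y] = f"{knows[y]} (also: {knows.pop(x)})"


knows = {
    "WordPress-Blog-Posting":
        "Understanding of and ability to post to a WordPress Blog",
    "WP-Posting": "Can publish a post in WordPress",  # hypothetical duplicate
}
learns = {
    "Tutorial on How to post to WordPress": {
        "depends_on": {"WordPress-Blog-Signup", "Basic-Browser-Use"},
        "delivers": {"WP-Posting"},
    },
    "Adding images to posts": {  # hypothetical Learn
        "depends_on": {"WP-Posting"},
        "delivers": {"WordPress-Images"},
    },
}

subsume("WP-Posting", "WordPress-Blog-Posting", knows, learns)
```

One wrinkle this sketch ignores: if a Learn depended on X and delivered Y, it ends up depending on the very Know it delivers, so a real version would need to catch that.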

Some aggregating mechanism might be required, but then simply making aggregate Knows might suffice. If we just add “Equivalent To” to Knows, and allow a list of Knows (or a list of lists of Knows, so we can have alternatives), then the Know “Chemistry-Major” could be created, with a short description, and an “Equivalent To” list specifying all the component Knows. This could be multiple levels deep.
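
As a sketch of how “Equivalent To” could be checked, assuming it is stored as a list of alternative lists of Knows: a person has an aggregate Know if they hold it directly, or hold every component of at least one alternative, recursively. Every Know name below other than “Chemistry-Major” is made up for the example.

```python
def has_know(name, have, equivalent_to, seen=frozenset()):
    """True if `name` is held directly or via one of its "Equivalent To" lists.

    have:          set of Know names the person holds
    equivalent_to: dict of Know name -> list of alternative lists of Knows
    """
    if name in have:
        return True
    if name in seen:  # circular definition: treat as not satisfied
        return False
    return any(
        alternative and all(
            has_know(part, have, equivalent_to, seen | {name})
            for part in alternative
        )
        for alternative in equivalent_to.get(name, [])
    )


equivalent_to = {
    "Chemistry-Major": [["Chemistry-Year-1", "Chemistry-Year-2"]],
    # Two alternative ways of covering first year (hypothetical).
    "Chemistry-Year-1": [
        ["General-Chemistry", "Lab-Safety"],
        ["Intro-Chem-Online-Course"],
    ],
}

have = {"Intro-Chem-Online-Course", "Chemistry-Year-2"}
print(has_know("Chemistry-Major", have, equivalent_to))   # True
print(has_know("Chemistry-Major", {"General-Chemistry"}, equivalent_to))  # False
```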

If this is successful, we should see people going out and “picking the eyes” out of the free online materials: MIT OpenCourseWare, sites like Academic Earth, free online textbooks, how-to sites, YouTube instructional videos, informative blogs, you name it, creating interlinked Knows and Learns.

The first version should be stupidly simple: a page for Knows, a page for Learns, some basic editing facilities, some basic searching/nav facilities. Tracking what Knows you’ve acquired probably isn’t necessary first up. Early on, editing facilities may matter more than facilities for use, and they should assume a very small, intimate group of editors that communicates and self-polices. As it grows, the system should change to match changing dynamics, but up front, simple unrestricted editing might be best.

That’s all I can think of for now. Comments?

The Codescape

There’s this incredible place where I like to spend a lot of my time. Everyone I know is near it, closer every day, but mostly they don’t come in.

When I was a kid, it barely existed, except in corporates and universities, but it expanded slowly. There wasn’t much you could do, even after it began to really explode through the 90s. But lately it’s become somewhere new, somewhere much bigger, somewhere much more interesting.

It’s a place I call the Codescape, and it’s becoming the platform on which the whole world runs.

The Codescape is simply the space of all computer programs (code) spanning the world. The internet is implemented in it, but it is not the ‘net. “The Cloud” is one of the more interesting pieces of it, but it is not the cloud. It exists in every general purpose machine, as soon as anyone tries to make it run code. Some of it is in your computer, some is in your phone, there’s even a little bit in your car. There might be a tiny pocket in your pacemaker.

In fact it’s something that many of us grew accustomed to thinking of as a lot of isolated little pocket worlds: the place inside one machine or the place inside one network. It’s related to the computer scientist’s platonic space of pure code-as-mathematics, but it is really the gritty, logical-physical half-world of the running program instances, and the sharp-edged, capricious, often noxious rules that real running environments bring. It is the space of endless edge cases, failures, unforeseen and unforeseeable interactions between your own code and dimly perceived layers built by others.

The platonic vision of the code is a trick, an illusion. We like to fool ourselves into thinking that we can create software like one might do maths, in a tower of the mind, all axioms and formal system rules known and accounted for, and the program created inside those constraints like a beautiful fractal, eternal in its elegance and parsimony. Less a construct than a discovery.

The platonic code feels like a clean creation in the world of vision and symbols. Code is something you can see, after all, expressed as a form of writing. If you spend long enough away from the machines, you can think this is the real thing, mistake the map for the territory.

But the real Codescape isn’t amenable to this at all. It is a dark place and a silent place. You know you are in the Codescape because your primary sensory modalities are touch, smell, and frankly, raw instinct.

It is an environment composed of APIs, system layers, protocols and, ultimately, raw bytes. It is an environment where the code vibrates in time with the thrumming of the hardware. You feel through this environment, trying to understand the shapes, reach perfectly into rough, edged crenelations, looking for that sensation of lock, the successful grasp. Always, though, you are ready for the feeling of an unexpected sharp edge, a hot surface, the smell of something turned bad, the tingle of your spidey sense.

It is a place that you can’t physically be in, but you can project yourself into. The lines of code are like tendrils, or tentacles, or maybe like a trail of ants reaching out from the nest. That painstaking projection, and the mapping of monkey senses and instincts to new purposes, turns most people off, but I think those of us most comfortable with it find the physical world similar. Possibly less abstractable, and so more alien. Certainly dumber.

Oddly enough, we don’t talk about the Codescape much. It isn’t because we don’t want to, but because largely we cannot. We who travel freely between worlds often can’t express it, because it is a place of system and not of narrative.

During periods of hype (mostly about the internet), a lot of bad novels and terrible movies get written about it (while missing it entirely), with gee-whiz 3D graphics and faux h4XX0r jargon. Sometimes some of us are even fooled by this, and so we pay unfortunate obeisance to notions like “virtual reality” and “cyberspace”, and construct things like 3D corporate meeting places, or Second Life, or World of Warcraft. Those are bona fide places, good for the illiterate, and a pleasant place to unwind for people of the code. They even contain little pockets of bona fide codescape inside themselves: proper, first-class codescape, because all of the codescape is as real as the rest. But there is something garish, gauche about these 3D worlds, like the shopping mall inside an airport, divorced from the country in which it physically exists.

The main codescape now, as it exists in 2010, is like the mother of all MMOs. Many, many of us, those who can walk it (how many? hundreds of thousands?) play together in the untamed, expanding chaos of a world tied together by software and networks. Each of us plays for our own reasons; some for profit, some for potential power, some for attention, and many of us, increasingly, for individual autonomy and personal expression.

It’s a weird place. It’s never really been cool (although it’s come close at times), because the kinds of people who decide on what’s cool can’t even see it. These days the cool kids (like Wired, or Make Magazine, or BoingBoing) like open hardware, or physical making. But everything interesting is being enabled by software, more and more and more software, and so becomes at heart a projection out of the Codescape.

Douglas Rushkoff’s recent book, “Program or Be Programmed”, talks about how we are now living in this world where what I call the Codescape is shaping the lives of everyone, and where we are divided into the code-literate and not. His book is mostly dreary complaining that it’s all too hard and the ‘net should be more like it was in the 90s (joining an increasing chorus of 90s technorati who are finding themselves unable to keep up), but that first sentiment is absolutely spot on. If you can code, then, if you so choose, you can feel your way through the Codescape, explore the shifting landscape, and maybe carve out part of it in the shape of your own imaginings. Otherwise, you get internet-as-shiny-shopping-mall, a landscape of opaque gadgets, endless ads, monthly fees, and the faint suspicion that you are being constantly conned by Fagin-esque gangs.

I contend that if you care about personal autonomy, about freedom, in the 21st century, then you really should try to be part of this world. Perhaps for the first time, the potential for individuals is rivalling that of corporate entities. There is cheap and free server time on offer, high-level environments into which you can project your codebase. The protocols are open, the documentation (sometimes just code itself) is free and freely available. Even the very best programming tools are free. If you can acquire the skills and the motivation, you can walk the Codescape with nothing more than an internet connection, a $100 Chinese netbook, and your own wits. There is no barrier to entry, other than your ability to twist your mind into the shape that the proper incantations demand.

Everything has a programmable API, which you can access and play with and create with if you are prepared to make the effort. At your fingertips are the knowledge and information resources of the world, plus the social interactions of 2 billion humans and counting, plus a growing resource of inputs and outputs in the physical world with which you can see and act.

It’s a new frontier, expanding faster than we can explore and settle it. It’s going to be unrecognisable in 2020, and again in 2030, and who knows what after that. But the milestones are boring. The fun is in living it. The first challenge is just to try.
