The thorny issue of archival content on the intranet



70,000 pages and no one noticed

Yes, it’s true. Those are almost verbatim the words uttered by an intranet manager I once met. I’ve used them many times in conversations with clients about slimming down their intranets.

Now, you’re thinking, I’d love to delete a lot of old content on my intranet, but I just can’t. I know that Chris told me to blow it up, but I’ve got some pretty big reasons why I can’t. Like regulatory requirements. Or records management. Or just plain old paranoia.

And now that disk space is so cheap, there’s really no cost to simply keeping a copy of everything my organization has published to the intranet for the past 15 years. Besides, someone might need it one day. And then it will be here for them.


Of course, we know that there are issues with that. While disk space is cheap, the cognitive load experienced by intranet users is high. Ever tried finding a particular document or collection of documents amongst 100,000 others on an intranet with a sub-par search engine, questionable information design, and content of widely varying reliability? And our time is precious and expensive. Managing and maintaining 100,000 pages, and counting, can cost a lot.

A noble goal for many of our customers is a smaller, more relevant intranet. People like James Robertson have been calling for this for years. But it too is hard to do.

What help is there?

Recognizing the characteristics of your content is a great first step. This classic from Paul Chin in the Intranet Journal from 2004 is one of the few articles I’ve ever found that tackles the time-based dimensions of your content. What’s your content’s lifespan?

Once you recognize the short-term / long-term orientation of your content, how do you design your site for it? One of the best metaphors that comes to my mind is the IA community’s adaptation of Stewart Brand’s notion of scaffolding in his book How Buildings Learn. Peter Merholz and Jesse James Garrett blogged about this in 2002. I still think it’s a powerful concept. And it reminds me a great deal of how we deal with each other through physical space (which I’ve blogged about before via Edward T Hall’s notion of proxemics).

What’s the “stuff” of your intranet? The “skin” or the “structure” — how do you assemble your content based on its temporality or permanence?

And finally, what patterns can we use from the wiki body of knowledge to help reinforce editorial activities that will keep the intranet a cleaner and tidier place?

Stewart Mader’s WikiPatterns site has a great example with Built-in obsolescence. If you know your content has a shelf life, why don’t you design for the future audience with that in mind now? It’s like a content time capsule.

Mike Briggs of Sun had this important point in his post on stale content: keep the authors tied to their content as much as possible. The publish-and-forget anti-pattern of intranet publishing and the “orphaned content” anti-pattern are both harder to fall into if you keep the connection alive between content and author. You created this page, so it’s your responsibility to keep tabs on it, remove it when you see fit, or pass ownership on to someone who will. That’s a design pattern that we baked into ThoughtFarmer from the start: there is no anonymous page ownership. Page and author are always coupled together.
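
For the technically inclined, here is a rough sketch of how those two patterns, built-in obsolescence and no anonymous ownership, might look as a data model. It is only an illustration, not how ThoughtFarmer or any other product actually works: the `Page` class, the `new_page` and `due_for_review` helpers, and the 180-day default shelf life are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, List, Optional

# Hypothetical default shelf life; an assumption for this sketch only.
DEFAULT_SHELF_LIFE = timedelta(days=180)


@dataclass
class Page:
    title: str
    owner: str                      # a page can never be anonymous
    created: date
    expires: Optional[date] = None  # None means "never expires"

    def __post_init__(self) -> None:
        if not self.owner.strip():
            raise ValueError("every page needs an owner")

    def hand_over(self, new_owner: str) -> None:
        """Pass ownership to another named person; never to nobody."""
        if not new_owner.strip():
            raise ValueError("ownership can only pass to a named person")
        self.owner = new_owner


def new_page(title: str, owner: str, created: date,
             shelf_life: Optional[timedelta] = DEFAULT_SHELF_LIFE) -> Page:
    """Create a page whose expiry date is decided at publishing time."""
    expires = created + shelf_life if shelf_life is not None else None
    return Page(title, owner, created, expires)


def due_for_review(pages: Iterable[Page], today: date) -> Dict[str, List[str]]:
    """Group the titles of pages past their shelf life by owner."""
    due: Dict[str, List[str]] = {}
    for page in pages:
        if page.expires is not None and page.expires <= today:
            due.setdefault(page.owner, []).append(page.title)
    return due
```

Because the owner is required and the expiry date is set when the page is created, a periodic call to `due_for_review` always has someone to ask: keep it, refresh it, or let it go?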

Archiving is tricky. How have you dealt with it on your intranet? Have you ever performed a giant web harvest snapshot and backup of your intranet, like the US government did with its federal sites in the past few years?

Has anyone ever come asking for one of those 70,000 deleted pages? What would your intranet look like if you could time travel to the future?


Join The Discussion

  1. EphraimJF

    You make a lot of great points here, Gordon, and James Robertson has definitely helped me understand the value of trim, high-quality intranet content.

    The ThoughtFarmer feature that requires every page to have an owner is crucial and I find it very helpful.

    But, without content expiration functionality, it’s still possible for a ThoughtFarmer intranet to suffer the content dump syndrome. If a user posts a poorly-named file on an infrequently visited page, it’s easy for her to forget about it. Unless someone finds and comments on that obscure posting, it’ll sit useless and unseen for a long time. If nobody comments on or looks at a page, it doesn’t matter so much that the page has a clearly stated page owner, especially since there’s no place for an individual user to see a list of all the content she owns throughout the site.

    I think ThoughtFarmer software could benefit from some more functionality that bakes in protection from the content dump syndrome. Expiration dates on new posts that default to six months out… Auto notifications when a page hasn’t been viewed for a certain number of months… When a page is about to be deleted due to its content expiration date, the system could send auto-reminders that give the user the opportunity to extend another six months or set a specific future content expiration date. Maybe even include a “never expires” option.

    You could put content expiration under the “More options” area of page editing so someone would have to go out of her way to change from the default…

    Just some ideas. I haven’t tried to brainstorm the practical drawbacks of doing this, but it seems like it could work.

  2. Gord

    Great suggestions, Ephraim. And noted! Gardening and archival tools are perhaps not the most glamorous of features, but they are crucial to ongoing success and to keeping your intranet relevant and meaningful (and findable).
