I recently had a conversation with John Lawlor, one of the speakers for the upcoming Jupiter Weblog event, who originally brought the issue of Web log page ranking to my attention. One point he made was especially interesting: Web logs are not about getting a specific OUT OF CONTEXT domain to the top of Google. Rather, he said, page ranking driven by Web logs has the greatest value for indexing content in a particular context and creating a greater likelihood of contextual discovery.
I think this is an extremely important concept that few people have grasped so far. A search phrase such as ‘software innovator’ has a specific context: it is made up of common words that are likely to be typed into the title of a Web log entry, and not likely to be used as the title (or keywords) of a Web site domain. Furthermore, this particular phrase is also likely to be typed into a search query. You wouldn’t type just ‘software’ or just ‘innovator’; generally speaking, you’d qualify the type of innovator, otherwise the results would be close to meaningless.
In this specific example, Google returns 145,000 items, of which F. Andy Seidl (my MyST partner) is #3. I’d also like to point out that this hit is based on a specific content item (a discrete item in a channel). On this site, that item is embedded as content in a larger domain, http://faseidl.com/. This further demonstrates the ability to use MySmartChannels clouds around a domain, which act as a little storm of mini-hubs pointing to the greater source of content. Indeed, this is what companies seek in terms of visibility, and what users seek in terms of discoverability. However, is it fair, or even proper, that Andy's resume is among the most relevant pages for the general phrase 'software innovator'? I suspect it is if you are looking for F. Andy Seidl, or you happen to be F. Andy Seidl.
Google's recent announcement that the company will also offer a service for searching Web logs raises some thorny challenges for its engineers. How do you know whether specific content items are part of a Web log or simply content items from a content management system? If linking heuristics are an important factor for gauging content value, how can they be totally ignored in an attempt to index only primary content? And what is 'primary content' anyway? I've observed that primary sources of content often originate in Web logs. This will be an interesting challenge for Google.
I certainly agree that something needs to be done; I've lost lots of time following links to dead-end stories or blog items that offer no significant contribution other than a brief mention and a further hop to the 'primary' source. But swinging the pendulum to the far side of the problem [such as removing blog content from Google's primary index] is just as bad as what's happening today in Google. In my view, Google simply needs to balance its heuristics and not focus so much attention on the blizzard of links that Web logs tend to create.
From the outset, we've regarded MySmartChannels as a model for creating discrete objects of information with ample semantic tagging possibilities that will make it easier to find them later. MySmartChannels also provides personal publishing capabilities for information workers. As it happens, our platform can be used for many content and knowledge-related processes.
Conclusions –
- Web logs are the perfect vehicle for creating easily discovered contextual content.
- Web logs will undoubtedly be used in conjunction with marketing objectives, but they should be used in the context of creating a cloud of customer-facing dialog in specific support of customer needs.
- The value of a cloud of customer-facing dialogs built upon a well-designed content architecture grows geometrically: each new Web log entry continues to transform the cloud into the ‘perfect linking storm’. (For example, the value of n linked items grows roughly as n squared under Metcalfe's Law, and faster still, as 2 to the power n, under Reed's Law; see http://c2.com/cgi/wiki?ReedsLaw)
- Driving traffic to your site is best achieved through indirect measures based on contextual [verb and adjective] tokens instead of the usual noun keywords commonly thought to be the right descriptive approach for Web sites.
- Google has always had a tiger by the tail, but now they have a tiger in each hand. Rating and ranking the value of the public Internet is a tall order, because it's now a very public place. Web logs (and personal publishing in general) add one additional twist that Google's current heuristics fail to adequately factor in.
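
To put rough numbers on the 'geometric' growth mentioned in the conclusions, here is a minimal Python sketch comparing two common network-value models. It is only an illustration under stated assumptions: the function names are hypothetical, "value" is simply a count of possible connections, and nothing here describes how Google or MySmartChannels actually scores links. Metcalfe's Law counts possible pairwise links among n items (proportional to n squared), while Reed's Law counts possible subgroups of two or more items (proportional to 2 to the power n).

```python
def metcalfe_value(n: int) -> int:
    """Possible pairwise links among n items: n * (n - 1) / 2, i.e. on the order of n squared."""
    return n * (n - 1) // 2


def reed_value(n: int) -> int:
    """Possible subgroups of two or more items: 2**n minus the n singletons and the empty set."""
    return 2 ** n - n - 1


# Hypothetical cloud sizes (number of linked Web log entries), chosen only for illustration.
for n in (2, 5, 10, 20):
    print(f"n={n:>2}  pairwise links={metcalfe_value(n):>4}  possible subgroups={reed_value(n):>8}")
```

Under either model, each additional entry adds more potential connections than the one before it, which is the sense in which the cloud compounds rather than grows linearly.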