ncyoung.com
I can think of my posts about computer books as being in programming->books, or as being in reviews->books, or as being in programming and in reviews and in books.
In the first two cases I have to choose between one category or another. In the third case I can have as many categories as I want but no hierarchy. This last most accurately characterizes the current "tags" fad sweeping around (see flickr, del.icio.us et al.).
I think the best answer is a mix, where I can review a perl book and put that review in reviews->books and in programming->perl. What a coincidence! That's how my weblog works. Hmmm.
So what I can't do now, which I think I should build, is a way to look at category intersections in a way that's similar to subcategories. In other words, the perl subcategory is a way of filtering the entries in programming. If I look only at posts in programming->perl that are also in reviews->books then I am in effect looking at the subcategory programming->perl->reviews->books.
To flatten out: if you have some things tagged as photos and books and some things tagged as books and reviews, then if you click into books you see reviews and photos as subcategories. I'm not sure how much I would like this. I think I'm going to try it.
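Here's a rough sketch of what I mean, in Python. The post names and category paths are made up for illustration; the only idea being shown is that every category that co-occurs with the one you clicked into can be listed as if it were a subcategory, and that clicking it shows the intersection.

```python
from collections import Counter

# Hypothetical posts: each one lists every category path it is filed under.
POSTS = {
    "perl-book-review": {"reviews->books", "programming->perl"},
    "python-book-review": {"reviews->books", "programming->python"},
    "perl-one-liners": {"programming->perl"},
    "novel-review": {"reviews->books"},
}


def posts_in(category, posts=POSTS):
    """Return the names of the posts filed under `category`."""
    return {name for name, cats in posts.items() if category in cats}


def virtual_subcategories(category, posts=POSTS):
    """Count the other categories that co-occur with `category`.

    Each co-occurring category behaves like a subcategory: clicking it
    narrows the current listing to the intersection of the two.
    """
    counts = Counter()
    for cats in posts.values():
        if category in cats:
            counts.update(cats - {category})
    return counts


def intersection(first, second, posts=POSTS):
    """Posts filed under both categories, i.e. the filtered view."""
    return posts_in(first, posts) & posts_in(second, posts)
```

With this data, clicking into reviews->books would offer programming->perl and programming->python as pseudo-subcategories, and picking the first one leaves just the perl book review.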
I do think that having a hierarchy of categories instead of a totally flat tag space is helpful but I want to articulate exactly when and why.
At the risk of sounding like a pundit, microformats probably hold a substantial part of the future of the semantic web.
The basic premise is this: simple things that are going to live on web pages anyway don't need their own XML tag vocabulary. Instead, use XHTML and indicate semantic structure using simple conventions (specific values in class attributes, for example). The combination lets you create semantic elements with minimal markup that also display on a web page in a sensible and predictable way.
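To make that concrete, here's a small Python sketch that pulls semantic fields back out of ordinary markup by looking at class attributes. The class names ("review", "item", "rating") are invented for this example and are not taken from any published microformat, and the parser assumes well-formed XHTML where every element is closed.

```python
from html.parser import HTMLParser

# Hypothetical markup: the class names below are made up for this sketch.
SNIPPET = """
<div class="review">
  <span class="item">Programming Perl</span>
  <span class="rating">4</span>
</div>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text inside elements whose class is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []   # one entry per open element: a field name or None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in self.wanted), None)
        self.stack.append(match)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        field = self.stack[-1] if self.stack else None
        if field and data.strip():
            self.fields[field] = self.fields.get(field, "") + data.strip()


def extract(html, wanted=("item", "rating")):
    """Return a dict mapping each wanted class name to the text it wraps."""
    parser = ClassTextExtractor(wanted)
    parser.feed(html)
    return parser.fields
```

The same snippet still renders as a normal bit of a web page, which is the whole appeal: the markup is both the display and the data.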
Here's the wiki main page with a list of microformats
I was reading del.icio.us documentation which insists that the service is all about "social bookmarking". And from the perspective of browsing the del.icio.us site to find bookmarks that are tagged in specific ways, that's true.
But from the perspective that the bookmarks are associated with semantic information about the page they bookmark, and given that there is a bookmarklet interface that allows you to see that semantic information when you are on the page in question, they start to look truly identical to annotations. (and if del.icio.us doesn't serve their bookmarks as annotations soon then shame on them)
Topic maps are used to model metadata. They support making statements about a resource (as identified by a URL) and modeling relationships between resources.
They also allow you to make statements about the statements and about the relationships.
To my eye the XML format for topic maps is a lot easier to understand and write than RDF statements. Querying and browsing tools for that XML format are starting to appear, which makes choosing topic maps over home-grown formats attractive.
The tao of topic maps is a good intro; the topic maps article at xml.com is not quite as good, but it's shorter and has XML samples in it.
Someone has made a topic map representing the diary of Samuel Pepys (view of the Samuel Pepys topic). Ontopia has a topic map browser with a demo map of opera (as in singing not browsing) related information.
I found documentation of a topic map query language but no tool to test it out with unless I want to write java classes.
Collaborative filtering is like amazon's "people who bought xxx also bought yyy", but it gets better: "people who bought w, x and y (as you did) also bought z". And that's great, unless you were just shopping for your grandmother...
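The "people who bought w, x and y also bought z" rule can be sketched in a few lines of Python. The shoppers and items here are invented, and this is only the most naive version of the idea (exact basket containment and raw counts), not a claim about how amazon actually does it.

```python
from collections import Counter

# Hypothetical purchase histories, keyed by shopper.
PURCHASES = {
    "ann": {"w", "x", "y", "z"},
    "bob": {"w", "x", "y", "z"},
    "cat": {"w", "x", "q"},
    "dan": {"p", "q"},
}


def also_bought(basket, purchases=PURCHASES):
    """Rank items bought by shoppers whose history contains all of `basket`."""
    basket = set(basket)
    counts = Counter()
    for items in purchases.values():
        if basket <= items:                 # this shopper bought everything you did
            counts.update(items - basket)   # so count whatever else they bought
    return counts.most_common()
```

The grandmother problem shows up immediately: anything in the basket counts as "yours", whether or not it reflects your own taste.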
Actually collaborative filtering has a lot of thought and history behind it. I can see a benefit to looking at reviews of an item as meta-data. Then it could be related to how well you trust the person who published the review.
no more blockbusters
more links
An interesting idea would be to use the context in which others are looking at the page you are currently looking at to inform your context... Kind of like "people who are looking at the page you are looking at often came here from x and went to y"... maybe not too useful. But maybe "people who looked at the last 10 pages you looked at also looked at: z"
some inspiration for the above
That seems like it would have a lot of noisy input, i.e. pages you visited recently that were irrelevant to your current purpose. The answer to cutting out this noise often takes the form of more user involvement. Yet less user involvement seems to be the key to good collaborative filtering.
Case in point: you can educate amazon's recommendations, but who would bother? Either it gives you good recommendations based on your past purchases or it isn't much good.
As long as I'm rambling, aren't paths (in real life) collaborative filtering? You're following cues left by other human beings about which way to go. I've even heard of planners who build the buildings and plant lawns around them but create no sidewalks. Where the lawn gets worn away, pathways are later paved.
bring that idea online
I want to be able to see my favorite weblogs as annotations as I surf the web.
As proof of concept, I made a gateway for my own website (back when it was a b2 weblog). Well, now I've updated the gateway to work with my new website.
The annotea server is at:
http://ncyoung.com/annoteaServer/CMS.php
Here's an old post that talks about annotations.
Here's where I talk about the weblog to annotation gateway. Then I was speculating on the limitations of linkback and trackback and how annotations excel. Guess I got kind of excited about it.
And the old b2 annotea gateway
The main problem is the lack of annotea clients. And a post about what's available.
This (often quoted) Disenchanted article talks about linking back to referers as a way to build a smarter web.
But the web of linkback links has some real logical traps that greatly dilute the information these links can carry about relevance.
Linkback links are not smart, they're democratic. At worst, they may show which page fooled the most people into clicking on the link to your site. At best, they'll still highlight older and more well travelled links over newer and possibly more relevant ones.
Linkback links also assume that relevance is two-way which is certainly untrue in some cases. For example, links to reference material may often be more relevant than the backlink they generate.
The article suggests that there should be a simple way for web surfers to specify a list of relevant resources for use by other visitors to the same page.
I submit that by publishing weblog entries as annotations, one can create such a tool (see my weblog as annotea annotations). I can post weblog entries that talk about related websites, and any time someone visits any of the sites, my weblog entry will appear as an annotation. Surfers have the option to subscribe to my weblog as well as (hopefully in the future) the weblog of anyone else whose opinion they value.
Well even though I didn't really have time, I created an annotea gateway to my weblog. This means that if you have an annotation client (see this post) you can see my weblog entries as annotations when you visit a website I've written about.
The server is at:
http://ncyoung.com/annoteaServer/annoteaTools.php
Update: I'm working a little more on this, making sure that all the annotation fields I can fill out properly are done. I'm still learning about the format and resolving issues with the annozilla client.
Update: I made a new gateway for my new weblog
I have an rss feed of my weblog. So people can merge my stream of postings with other RSS feeds, apply filters if so desired, and come up with a tailored news source. Why not the same thing for annotations?
If my weblog software implemented the query and download interactions from the annotea protocol, you could put me into your annotation client's list of annotation servers. Then you'd see weblog entries as annotations whenever you visited a URL I had written about in my weblog.
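The lookup such a gateway needs is simple: index every entry by the URLs it mentions, then answer "which entries talk about this URL?". The Python below is only a sketch of that lookup. It does not implement the annotea protocol itself, and the entries, titles and URLs are placeholders.

```python
from collections import defaultdict
import re

# Hypothetical weblog entries; titles, bodies and URLs are placeholders.
ENTRIES = [
    {"title": "Checking out annotea",
     "body": "See http://example.org/annotea for details."},
    {"title": "More on annotations",
     "body": "http://example.org/annotea and http://example.com/amaya"},
    {"title": "XSLT notes", "body": "No links here."},
]

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def build_index(entries):
    """Map each URL mentioned in an entry body to the titles that mention it."""
    index = defaultdict(list)
    for entry in entries:
        # Strip trailing punctuation first, then de-duplicate within the entry.
        urls = {u.rstrip(".,;)") for u in URL_PATTERN.findall(entry["body"])}
        for url in sorted(urls):
            index[url].append(entry["title"])
    return index


def annotations_for(url, index):
    """Titles of the entries that would show up as annotations on `url`."""
    return index.get(url, [])
```

A real gateway would wrap each matching entry in whatever the annotation client expects, but the query side boils down to this kind of index.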
I would love to be able to see weblog entries from my personalized list of weblogs as I surf the URLs they apply to.
Reading about the semantic web from shallow to deep:
Intro: the first half is a good basic overview, the second half surprisingly esoteric.
In breadth: describes the pieces (URIs, documents, RDF) and how they fit together.
More detail on rdf, n3, schemas, etc.
More overview and a survey of currently functioning applications.
W3C's RDF page
Checking out annotea some more.
Update: Amaya has a nice annotea client. You can annotate the document or a selection. You can specify more than one annotation server.
Mozilla has a plug in for annotea, annozilla. Doesn't seem to let you specify WHERE on the page the annotation should go, and doesn't understand the xpointers amaya uses to specify the text to which the annotation applies.
snufkin for IE also does annotations. And other stuff. Amazing. Wish I had more time.
I'm actually really interested in the annotation thing. You could tie it together with FOAF info, so that whenever you go to a web page, you can see what people whose opinion you value thought of it.
This would fix what sucks about weblogs, which is that you can only look at my stuff in context of my weblog. There's no good way to look at all weblog entries commenting on a given URL.
Linkback begins to address this in a different way, in that each page has links back to things people said about it. Annotations have more semantic value than referer links, and they don't rely on being implemented in the page itself (annotation is done using the browser's plug-in and a server database). For better or worse, third voice's hype about "democratizing the web" also applies: page authors have no way to stop people from commenting on their page.
I was just thinking about the much hyped third voice the other night. Third voice let you make notes on any page on the web, and other users of the software could see your notes and you in turn could see theirs.
I haven't heard about them for a while, and I vaguely remembered some kind of controversy (or maybe a security hole?) about it.
It turns out that thirdvoice died the same way that most other dot coms did. They ran out of money for their free service.
But I thought the concept could lend itself to an open protocol with a distributed architecture. So I checked into it, and there's some interesting annotation software out there.
Annotation Engine does annotations with a perl script on the server. The script acts as a proxy, adding the comments into the HTTP stream as it relays the requested URL. Here is this post as seen through Annotation Engine. (no style sheets, unfortunately)
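I haven't read Annotation Engine's source, so the following is only a guess at the general shape of the rewriting step, sketched in Python rather than perl. The function name and the "annotations" class are my own inventions. The proxy would fetch the page, run something like this over it, and relay the result.

```python
from html import escape


def inject_annotations(page_html, notes):
    """Return page_html with `notes` added as a list just before </body>."""
    if not notes:
        return page_html
    items = "".join(f"<li>{escape(note)}</li>" for note in notes)
    block = f'<ul class="annotations">{items}</ul>'
    # Look for the closing body tag case-insensitively; append if it is missing.
    position = page_html.lower().rfind("</body>")
    if position == -1:
        return page_html + block
    return page_html[:position] + block + page_html[position:]
```

Escaping the notes matters here, since anyone can write an annotation and the proxy is serving their text inside somebody else's page.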
I loaded my own page in there and I realized that with my RSS feed, there were at least three processes (on three servers) involved in generating/filtering the page I was viewing.
Then I found annotea, built into the amaya browser. My idea that the wholly owned server that thirdVoice used could be distributed is realized here, as are lots of good ideas placing annotations in the context of the semantic web. I may have to devote a future posting to this.
http://xanadu.com.au/ may allow annotating, but Xanadu seems to be aimed solidly at the next decade, oops, while having died in the last.
crit.org does annotations and backlinks, all server side. You can run their server too. Doesn't work on my page, I don't know why.
It occurs to me that I could run a server side annotation package as a supplement to my weblog. The entry point would be a list of pages that DO HAVE annotations.
XSLT
08/14/2002
Just finished my first XSLT transformation. It was a simple but complete exercise, perfect for a first project, and it's great because it's going to save my client a huge amount of time.
The problem was a classic conversion puzzle: take the XML formatted output from one program (in this case examview) and convert it into a proprietary text based import format supported by another software package (in this case pageout)
As an XSL student I learned:
- That I didn't need the foreach and case structures that I started out with
- That XSL text functions suck, so as long as your XML input is well designed and the text within XML tags stays the same, you're fine, but if you need search and replace, get ready to go back to the dark ages: e.g. concat(substring-before(string,findText),concat(replText,substring-after(string,findText)))
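Note that the concat(...) expression in the list above only replaces the first match. To replace every match you have to recurse on whatever comes after it. Here's that logic sketched in Python, with helper functions standing in for the XPath substring-before and substring-after functions; the function names are mine.

```python
def substring_before(text, find):
    """Like XPath substring-before: text before the first match, else ''."""
    index = text.find(find)
    return text[:index] if index != -1 else ""


def substring_after(text, find):
    """Like XPath substring-after: text after the first match, else ''."""
    index = text.find(find)
    return text[index + len(find):] if index != -1 else ""


def replace_all(text, find, repl):
    """Replace every occurrence of `find` by recursing on the remainder."""
    if not find or find not in text:
        return text   # base case: nothing left to replace
    return (substring_before(text, find) + repl
            + replace_all(substring_after(text, find), find, repl))
```

Because the recursion only ever looks at the text after the match, a replacement string that itself contains the search text doesn't cause an endless loop.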
A good XSL tutorial from zvon.org:
http://www.zvon.org/xxl/XSLTutorial/Books/Output/contents.html
A global find and replace function implemented as a recursive xsl:template with parameters.
http://www.xml.com/pub/a/2002/06/05/transforming.html
A very useful Perl wrapper that I used to make a Perl script that runs my transformations on all XML files in a folder and subfolders.
http://www.cpan.org/authors/id/M/MU/MULL/XML-XSLT-Wrapper-0.32.readme
A functional free XML editor that lets you do transforms with the open XML and XSL documents, letting you see instant results. Did I mention free?
http://www.simx.com/pub/Xtrans/
update: If you want to use the XSLT-Wrapper module mentioned above, you'll need XML::LibXML and XML::LibXSLT. If you're working with windows and installing with ppm, the activestate ppm repository does not have those two, but they can be had from: http://theoryx5.uwinnipeg.ca/cgi-bin/ppmserver?urn:/PPMServer
Friend of a friend (FOAF) is an RDF document vocabulary that allows for description of oneself and relationships. It's a bit like a cross between RSS and a directory (think active directory, edirectory, LDAP).
A superficial look at FOAF reveals some appealing ideas, much like my first superficial look at RSS. In retrospect, my first look at RSS missed all the incredible appeal and power that exploded out of RSS as enough feeds became available for aggregation to make sense.
So now I look at FOAF with that experience and wonder if a similar explosion could someday occur.
Edd Dumbill's developerWorks article is a good intro, with links for further reading at the bottom.
Update: Here's a link to the photo demo, in which you can find a photo of someone, then jump to other photos of the people who are with them in that photo.
http://swordfish.rdfweb.org/discovery/2001/08/codepict/
I have b2 set up to trigger a web service (actually 2 services, weblogs.com and b2) whenever I update my weblog.
This adds my site to those lists of recently updated weblogs. The lists are in turn published as RSS feeds (actually, weblogs.com has a feed - cafelog has some non-RSS XML). People subscribe to the feeds, and my weblog shows up as a link on their site (until it falls off the bottom of the recently updated stack).
You can usually see some random weblogs in my list of referers, and often my link is no longer there by the time I follow the referral back to check it out.
I saw this referrer list on another website (designflea). When a visitor follows a link to your page, a script notes where they came from and adds that address to the list you in turn see on the page.
I thought this was a cool idea just because it makes the page seem smart and it's a nice way of saying "thank you" to the people who link to your site.
But I was thinking about it recently, and I realized that it's much more than this. If a group of pages all have a list of referers on them, and the links in the list are weighted by the frequency with which they are used, then pages that surfers jumped back and forth between will show up higher on each other's lists.
Once a link bridge gets added between two pages, it should eventually get voted up or down in the list of referring links depending on how relevant it is. So it may be that a web comes into being, created by the people who surf the various web sites.
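The weighting part is easy to sketch. This Python class is a hypothetical stand-in for the referrer script: it counts how often each referring address sends a visitor to a page and lists the busiest ones first. The class and method names are mine, and ties are broken alphabetically just to keep the ordering stable.

```python
from collections import Counter, defaultdict


class ReferrerLog:
    """Count referrers per page and rank them by how often they are used."""

    def __init__(self):
        self.hits = defaultdict(Counter)   # page -> Counter of referrer URLs

    def record(self, page, referrer):
        """Note one visit to `page` that arrived from `referrer`."""
        if referrer and referrer != page:   # skip blank and self referrals
            self.hits[page][referrer] += 1

    def top_referrers(self, page, limit=10):
        """Most-used referrers first; ties are broken alphabetically."""
        counts = self.hits.get(page, Counter())
        ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
        return ranked[:limit]
```

Every click is a vote, so a link that people actually follow drifts up the list while one that nobody uses sinks off the bottom.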
Update:
7/11/02: saw this article about the same thing:
backlinks at O'Reilly
and again (claiming to have invented the whole idea) at disenchanted
I lurk on the XML-RPC mailing list because it's more fun than a barrel of monkeys. XML-RPC is ruled by Dave Winer, whose crotchety and authoritarian leadership turns out to be completely justified and even possibly quite effective.
Anyway, lots of XML-RPC developers are also SOAP developers, which mutes what might otherwise be a competitive group dynamic.
Then one day the REST folks came along, uniting both SOAP and XML-RPC pundits using the time honored method of saying "You both SUCK!!!"
So what is REST anyway??
(those wondering what XML-RPC and SOAP are might be ready to stop reading by now. If you're curious, you can start with an XML-RPC how-to)
[ahem] So what is REST anyway?
Well, it's a way of structuring application services such that they are called via HTTP and passed parameters via URIs. RPC via HTTP, in contrast, commonly posts XML documents, both to pass parameters and to specify the method called.
What this means for the client API is that a REST application has one URI for each possible combination of method and parameters, while RPC via HTTP has one URI period, and methods and parameters are determined within the http request to that URI.
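Here's the contrast as I understand it, sketched in Python using the standard library's xmlrpc.client to build the RPC body. The host, the paths and the "getPost" method are placeholders, and nothing is sent over the network; the functions only build the requests.

```python
from urllib.parse import urlencode
import xmlrpc.client

# Hypothetical service: host, paths and the "getPost" method are placeholders.
RPC_ENDPOINT = "http://example.com/RPC2"
REST_BASE = "http://example.com/getPost"


def rest_request(params):
    """REST style: the method and its parameters are spelled out in the URI."""
    query = urlencode(sorted(params.items()))
    return "GET", f"{REST_BASE}?{query}", None


def rpc_request(method, *args):
    """RPC style: always the same URI; method and args travel in the body."""
    body = xmlrpc.client.dumps(args, methodname=method)
    return "POST", RPC_ENDPOINT, body
```

Asking for post 42 the first way gives a GET to http://example.com/getPost?id=42, while the second way always POSTs to the one endpoint and you have to open the XML to find out what was asked for.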
Implementing RPC is blessedly simple, with the structure for encoding methods and parameters well defined and widely implemented. It's also simple to set up a server at a single URI.
As far as I can tell, there is no agreed upon structure for REST applications, and thus no standardized implementations for REST call and response. Given this, even inventing an acronym for it may have been premature??? It's an interesting train of thought though, and I'll be the first to admit that I'm not fully current on it.
Let me know what you think!
Update (7/22/02)
This article does a better job of talking about it than I do...