Semantic Web

Google acquires Metaweb / @fbase

Open Web Vancouver, Open Restaurants, Open Data

Open Web Technology Conference in Vancouver, June 11th and 12th 2009

I got notice this weekend that my talk submission to Open Web Vancouver 2009 got accepted.

Open Web Vancouver runs June 11th and 12th at the brand new Vancouver Convention Center, and registration is now open.

Here's the blurb from the registration page:

This year's conference promises to be at least as exciting as last year's. So far we have an exciting speaker roster confirmed, including keynote sessions with Rickard Falkvinge, leader of Sweden's Pirate Party, and Angela 'webchick' Byron, Drupal 7 co-maintainer and Lullabot. Other confirmed speakers include:

(BTW Open Web Van peeps - I'm tagging this #owv09, please post this somewhere as the "official" tag)

I posted my talk submission to the Open Restaurants wiki. What's Open Restaurants? Well, it's the namespace for that Semantic Web Community Barn Raising that I blogged about a while back. We had our breakfast meeting, 8 people came, and we've got a great cross section of Drupal devs, Freebase / semweb schema nerds, open data policy enthusiasts, and NLP experts that want to use Twitter posts to create Skynet :P

Semantic Web Community Barn Raising in Vancouver

Or, how I got a taste of Linked Data and wanted more.

I've been noodling about with a variety of semantic web / open data concepts for some time. Most recently, I spoke at Drupalcon DC 2009 on Practical Semantic Web (and why you should care). There's a video of the presentation embedded in that last link, and the presentation is available on SlideShare.

I think I can explain things even more simply: we can place a simple piece of data on a web page - like an address, or marking something as a person. From those source pieces of data, we can link it to ever more pieces of information. Except, instead of just "bare links", these links carry meaning, creating a richer web of content. In my presentation, I used the phrase "RDF is food for robots" - a way to share more info about the visible, human readable information that is already there.
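
To make "links that carry meaning" a bit more concrete, here is a minimal sketch using plain Python tuples rather than a real RDF library. The people and example.org URLs are made up for illustration; only the FOAF namespace and its `name` and `knows` terms are real.

```python
# Each statement is a (subject, predicate, object) triple. The predicate is
# the "meaning" that a bare hyperlink lacks. People and URLs are hypothetical.
FOAF = "http://xmlns.com/foaf/0.1/"

triples = [
    ("http://example.org/people/alice", FOAF + "name", "Alice Example"),
    ("http://example.org/people/alice", FOAF + "knows", "http://example.org/people/bob"),
    ("http://example.org/people/bob", FOAF + "name", "Bob Example"),
]


def objects(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def names_of_friends(person):
    """Follow foaf:knows links, then look up each friend's foaf:name."""
    names = []
    for friend in objects(person, FOAF + "knows"):
        names.extend(objects(friend, FOAF + "name"))
    return names
```

A robot that only sees a bare link from Alice's page to Bob's page learns nothing about the relationship; with the `knows` predicate it can answer "who does Alice know, and what are they called?" by walking the triples.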

From my point of view, that's already cool enough. But it's still not very *practical*. Why should you care, today? That leads me to open data, in general. The Dev Seed guys in DC have a great example of this in their StumbleSafely site -- it mashes up bars in Washington DC with crime statistics, cross-referenced by time of day (day, evening, and night).

StumbleSafely.com screenshot

We don't have anything like this in Vancouver. What we do have is some interesting data that is beginning to be gathered in Freebase, and more specifically in the Vancouver-specific corner of it, known as vanbase. Looking at the StumbleSafely application, and thinking about some data that I hate to see duplicated over and over again, I came up with the idea of restaurants, and extra information about them.

Social Graph applications: why not for every community website?

One of the outcomes from my trip to Victoria last week is some thinking about the social graph. More specifically, you may recall that I've been using Flock. As it turns out, I recently upgraded my laptop to Leopard, and made Flock my main browser. This has given me increased exposure to their "people bar" -- a side bar that supports a variety of big community websites, like Facebook, Flickr, Twitter, and so on.

As I've been using this feature, and seeing the way that Flock "detects" features of different websites, I started thinking about how every community website could enable this functionality. Right now, the Flock team has to pre-integrate with the specific website's API to enable this functionality. But, just as they "detect" the presence of site-specific search engines, there is no reason that a site couldn't expose a link header that indicates the presence of a social graph.

I know what you're thinking: "But Boris, how many people use Flock? Isn't this just browser specific functionality?" Well, no. First of all, Google has a Social Graph API that already crawls for this kind of data -- it looks at FOAF and XFN. Secondly, I got to thinking about all these site-specific applications -- like Twhirl, which was bought by Seesmic. If we had some basic standards about this stuff, it would be simple to have one app that let us monitor / notify / update any of these systems.

Yes, there will ALWAYS be websites that have more complex APIs with more features -- features that are only accessible by implementing *their* API. But for thousands of other community websites, built in Drupal, WordPress, Joomla, or what have you, you suddenly have the same rich access to applications as the big guys. How many websites would encourage their users to install Flock or Twhirl if it supported *their* website?

Oh, and I'm completely skipping the linked data / RDF / Semantic Web factor of having community websites expose some part of their social graph, or at least make it available for querying by people that have the right credentials.

OK, so how does this look to the end user? I'll use Flock as an example, since I've got agreement in principle from them that they'll work with me on this, including help in defining some of the formats.

  1. User surfs to community website where they already have an account (for simplicity's sake, we'll pretend a session is still open)
  2. Flock detects a website that has a social graph available because of a header link that looks something like this: <link rel="socialgraph" href="/user/4426/socialgraph.rdf" type="application/rdf+xml" /> (Note the user ID in there, because the user already has a current session open)
  3. Flock does its fun in-browser slide-down that says something like "This website supports a people bar. Would you like to add it?"
  4. If the user clicks on "yes", then Flock initiates an OAuth request to be allowed to a) fetch the current user's social graph file and b) take actions on behalf of the user such as setting their status or sending a message/poking/whatever another user on the site
  5. The user acknowledges the OAuth request and clicks some allow buttons
  6. Voila! A fantastic site specific "people bar" right in your Flock browser
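
The discovery part of the flow (step 2) can be sketched in a few lines of Python. This is only an illustration under assumptions: `rel="socialgraph"` is the proposed value from the list above, not an existing standard, and the class and function names here are made up. The OAuth steps are left out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SocialGraphLinkFinder(HTMLParser):
    """Collect href values of <link rel="socialgraph"> elements (proposed rel value)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values
        rels = (attr.get("rel") or "").lower().split()
        if "socialgraph" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover_social_graph(page_url, html):
    """Return absolute URLs of any social graph files advertised by the page."""
    finder = SocialGraphLinkFinder()
    finder.feed(html)
    return [urljoin(page_url, href) for href in finder.hrefs]
```

A browser or desktop notifier would run something like this against the page the user is on (or the URL they typed in), and only offer the "add a people bar" prompt when the returned list is non-empty.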

So, that was a VERY Flock-specific flow, but as I mentioned with Twhirl up above, there's absolutely no reason that you couldn't do the same thing with those types of people notifiers in your desktop apps / widgets / etc. -- just start by typing in the URL of the website, and the app would go and discover the social graph link and/or initiate an OAuth request to authenticate. All of a sudden you're directly monitoring the different community websites you're a part of. Bonus points to websites that expose the social graph as an XMPP Pub Sub endpoint so these apps don't need to poll constantly.

Now, I know the first thing we're going to have to do is fight a religious war over the format of the socialgraph file. I'm going to suggest some minimal FOAF format, since I'm a born again RDF fan. I don't want to go spraying email addresses all over the place, so perhaps either local unique user GUIDs or OpenID could be used as identifiers for each person. We actually don't need full "person" information -- a username, avatar, status message, and date stamp of last activity (for sorting) should be the minimal set. Even the status message could be optional for smaller, less complex sites, so almost anyone could support this out of the box: just show everyone on the site (yes, that's right... ignore any sort of "friend" connection) sorted by last active -- which could be a post / comment, or (again, simple support by many sites...) just the date stamp of last login.

I'd like to think that the choice of OAuth as credentials for accessing this info isn't controversial at all. Feel free to layer OpenID in here somehow, but for the action-at-a-distance on which cool functionality can be built, this kind of token system looks to be ideal.

What next? Well, surprise, surprise, I'm going to take a crack at getting this implemented in Drupal. Raincity Studios is already working on the OAuth module, which would be one of the main pre-requisites. Once the format of the social graph file is defined (calling Joshua, Arto, and maybe RalphM...), building the next piece shouldn't be too hard.

Ideally, something like the Gnomepal Drupal distribution would ship with this out of the box (for the really ambitious, Drupal 7 core!). And other systems like Marc Canter's People Aggregator could easily expose this social graph info as well.

I'm excited at the continuing growth of every website as a dynamic web application, and also of the exposure of data and APIs by this web of sites. This feels like the right path we're travelling on to get everything a little bit more interconnected.
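
To show how small the minimal set described above really is, here is a rough Python sketch of one record per person and the "sorted by last active" people bar. The field and function names are hypothetical; the actual file format (FOAF or otherwise) is still undecided, so this only models the data, not its serialization.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Person:
    identifier: str                # local GUID or OpenID URL, never an email address
    username: str
    avatar_url: str
    last_active: datetime          # last post / comment, or simply last login
    status: Optional[str] = None   # optional, so simpler sites can skip it


def people_bar(people):
    """Everyone on the site, most recently active first (no friend filtering)."""
    return sorted(people, key=lambda person: person.last_active, reverse=True)
```

A site that only tracks last login can still fill in every required field, which is the point of keeping the set this small.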

Testing Purple

It's been quite some time since I wrote about purple numbers -- fragment identifiers for bits of text within HTML documents. Or, Granular Addressability in HTML documents as E.E. Kim describes in 'An Introduction to Purple'.

Back then, I wasn't a fan of these anchor links, as anchor links aren't first class web citizens -- where links are currency. I'm still not a fan, but maybe the anchor links are just there to make it easy to grab pieces of this content.

A recent reference by Les Orchard (oh, look, OPML anchor tags!) to purple-include, which enables transclusion (aka including content from elsewhere, directly inline, rather than copy/paste) got the brain cells tickling tonight. So I built a Purple module for Drupal. Which, in reality, just includes the purple-include.js and a little bit of CSS to make purple links show up. I am trying to include some of Simon Willison's plinks cleverness, but not sure if I'm going to get that working.

Transclusion feels very SemWebby. No, I'm not going to use the dreaded Web 3.0 label (but do go read the Business 2.0 article if you want a backgrounder, via Nova Spivack, of course). Ahem. Back to the point.

From Facebook apps to photos stored on Flickr, we want to have all our "stuff" just magically collected together wherever we happen to be, whatever network we happen to be interacting with. Aggregation, sucking this content in, pushing it over there -- all just temporary ways of flowing content around. Ways that arguably duplicate content and spew extraneous permalinks around. I just want my pictures right here, or I just want to link deep into someone else's posting and pull in a piece of text. And I want the "other end" to know about that inclusion, a gentle ping, yeah, kind of a trackback. That's the Semantic Web to me: where every plain old HTML file is dynamic and intelligent and knows about the links and people that are incoming and outgoing.

OK, now to Purple. Here's an example that includes a file I have on my server, shamelessly copied from the purple-include examples page. First the code:

<hx:include src="/sites/bmannconsulting.com/files/purple_include.html#xpath!//p"></hx:include>

And now the transcluded bit:

Update: Kevin reports that the transclusion doesn't work in Safari. Stupid client side technologies :P

Note: I don't *actually* know XPath. But if you open the file directly, you'll see it just grabbed the paragraph. I'm assured you can do more complicated things than that :P
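
For the curious, here is a rough Python sketch of what a `src` like the one above asks for: split off the `xpath!` fragment, then pull the matching elements out of the file. This is not how purple-include itself is implemented (that is client-side JavaScript). The function names are made up, the sample markup is invented, and it assumes the target file is well-formed XML, because Python's ElementTree only supports a limited XPath subset.

```python
import xml.etree.ElementTree as ET


def split_include_src(src):
    """Split 'file.html#xpath!//p' into ('file.html', '//p')."""
    url, _, fragment = src.partition("#")
    scheme, _, expression = fragment.partition("!")
    if scheme != "xpath":
        raise ValueError("only xpath! fragments are handled in this sketch")
    return url, expression


def select_text(document, expression):
    """Return the text of each element matching a '//tag' style expression."""
    # ElementTree rejects absolute paths on an element, so '//p' becomes './/p'.
    if expression.startswith("//"):
        expression = "." + expression
    root = ET.fromstring(document)
    return ["".join(element.itertext()) for element in root.findall(expression)]
```

With `//p`, every paragraph in the file is selected; narrower expressions (a specific id or position) are where the "more complicated things" would come in.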

To Do:

  • Find out where purple-proxy is hiding / what to do with it so I can transclude from elsewhere
  • autogenerate plinks / pilcrows / numbers, perhaps via a Drupal filter
  • make a real module available somewhere, maybe also tracking down jluster's old purple numbers code

Experimenting with Freebase

No, it's not a drug, it's the Semantic Web :P

Freebase is sort of like a structured wiki, in that anyone can add content to the system. What's different is that you can also add and define your own "Types", which are collections of metadata. User generated content is quite common, and Freebase actually sucks in a ton of information from Wikipedia, which gives it a huge base of content to start with, but this concept of being able to add/edit/organize higher level structures and metadata is new (tagging aside...).

I had great fun fleshing out entries for Drupal and the Drupal Association, figuring out Company, Software, and People types in the process.

OK, OK, I admit -- filling in the Beer entries for Hacker-Pschorr Weisse and the Hacker-Pschorr Brewery was actually more fun :P