RDF

Open Web Vancouver, Open Restaurants, Open Data

Open Web Technology Conference in Vancouver, June 11th and 12th 2009

I got notice this weekend that my talk submission to Open Web Vancouver 2009 got accepted.

Open Web Vancouver runs June 11th and 12th at the brand new Vancouver Convention Center, and registration is now open.

Here's the blurb from the registration page:

This year's conference promises to be at least as exciting as last year's. So far we have an exciting speaker roster confirmed, including keynote sessions with Rickard Falkvinge, leader of Sweden's Pirate Party, and Angela 'webchick' Byron, Drupal 7 co-maintainer and Lullabot. Other confirmed speakers include:

(BTW Open Web Van peeps - I'm tagging this #owv09, please post this somewhere as the "official" tag)

I posted my talk submission to the Open Restaurants wiki. What's Open Restaurants? Well, it's the namespace for that Semantic Web Community Barn Raising that I blogged about a while back. We had our breakfast meeting, 8 people came, and we've got a great cross section of Drupal devs, Freebase / semweb schema nerds, open data policy enthusiasts, and NLP experts that want to use Twitter posts to create Skynet :P

Semantic Web Community Barn Raising in Vancouver

Or, how I got a taste of Linked Data and wanted more.

I've been noodling about with a variety of semantic web / open data concepts for some time. Most recently, I spoke at Drupalcon DC 2009 on Practical Semantic Web (and why you should care). There's a video of the presentation embedded in that last link, and the presentation is available on SlideShare.

I think I can explain things even more simply: we can place a simple piece of data on a web page - like an address, or marking something as a person. From those source pieces of data, we can link it to ever more pieces of information. Except, instead of just "bare links", these links carry meaning, creating a richer web of content. In my presentation, I used the phrase "RDF is food for robots" - a way to share more info about the visible, human readable information that is already there.
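To make "links that carry meaning" a little more concrete, here is a minimal sketch in plain Python. The example.com URLs and the person "Alice" are made up for illustration; the `foaf:` and `rdfs:` names are written as simple strings rather than resolved by a real RDF library.

```python
# Each statement is a (subject, predicate, object) triple.
# The predicate is what gives the link its meaning.
# All example.com URIs below are hypothetical.
TRIPLES = [
    ("http://example.com/people/alice", "rdf:type", "foaf:Person"),
    ("http://example.com/people/alice", "foaf:name", "Alice"),
    ("http://example.com/people/alice", "foaf:based_near",
     "http://example.com/places/vancouver"),
    ("http://example.com/places/vancouver", "rdfs:label", "Vancouver"),
]


def describe(subject, triples):
    """Collect everything stated about one subject as a predicate -> object map."""
    return {p: o for s, p, o in triples if s == subject}


alice = describe("http://example.com/people/alice", TRIPLES)
# Follow the meaningful link from the person to the place she is based near.
place = describe(alice["foaf:based_near"], TRIPLES)
print(alice["foaf:name"], "is based near", place["rdfs:label"])
```

A robot reading the page does not have to guess what the link to the Vancouver page means: the predicate says it is where the person is based.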

From my point of view, that's already cool enough. But it's still not very *practical*. Why should you care, today? That leads me to open data, in general. The Dev Seed guys in DC have a great example of this in their StumbleSafely site -- it mashes up bars in Washington DC with crime statistics, cross-referenced by time of day (day, evening, and night).

StumbleSafely.com screenshot

We don't have anything like this in Vancouver. What we do have is some interesting data that is beginning to be gathered in Freebase, and more specifically in the Vancouver-specific corner of it, known as vanbase. Looking at the StumbleSafely application, and thinking about some data that I hate to see duplicated over and over again, I came up with the idea of restaurants, and extra information about them.

Social Graph applications: why not for every community website?

One of the outcomes from my trip to Victoria last week is some thinking about the social graph. More specifically, you may recall that I've been using Flock. As it turns out, I recently upgraded my laptop to Leopard, and made Flock my main browser. This has given me increased exposure to their "people bar" -- a side bar that supports a variety of big community websites, like Facebook, Flickr, Twitter, and so on.

As I've been using this feature, and seeing the way that Flock "detects" features of different websites, I started thinking about how every community website could enable this functionality. Right now, the Flock team has to pre-integrate with the specific website's API to enable this functionality. But, just as they "detect" the presence of site-specific search engines, there is no reason that a site couldn't expose a link header that indicates the presence of a social graph.

I know what you're thinking: "But Boris, how many people use Flock? Isn't this just browser specific functionality?" Well, no. First of all, Google has a Social Graph API whose crawler is already looking at FOAF and XFN. Secondly, I got to thinking about all these site-specific applications -- like Twhirl, which was bought by Seesmic. So, if we had some basic standards about this stuff, it would be simple to have one app that let us monitor / notify / update any of these systems.

Yes, there will ALWAYS be websites that have more complex APIs with more features -- that are only accessible by implementing *their* API to talk to them. But for thousands of other community websites, built in Drupal, WordPress, Joomla, or what have you -- you suddenly have the same rich access to applications as the big guys. How many websites would encourage their users to install Flock or Twhirl if it supported *their* website?

Oh, and I'm completely skipping the linked data / RDF / Semantic Web factor of having community websites expose some part of their social graph, or at least make it available for querying by people that have the right credentials.

OK, so how does this look to the end user? I'll use Flock as an example, since I've got agreement in principle from them that they'll work with me on this, including help in defining some of the formats.

  1. User surfs to community website where they already have an account (for simplicity's sake, we'll pretend a session is still open)
  2. Flock detects a website that has a social graph available because of a header link that looks something like this: <link rel="socialgraph" href="/user/4426/socialgraph.rdf" type="application/rdf+xml" /> (Note the user ID in there, because the user already has a current session open)
  3. Flock does its fun in-browser slide-down that says something like "This website supports a people bar. Would you like to add it?"
  4. If the user clicks on "yes", then Flock initiates an OAuth request to be allowed to a) fetch the current user's social graph file and b) take actions on behalf of the user such as setting their status or sending a message/poking/whatever another user on the site
  5. The user acknowledges the OAuth request and clicks some allow buttons
  6. Voila! A fantastic site specific "people bar" right in your Flock browser
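The discovery step in that flow (step 2) can be sketched in a few lines of Python. This is only an illustration under assumptions: `rel="socialgraph"` is the proposed, not yet standardized, link relation from the example above, and the class name, function name, and example.com URL are made up here. It uses only the standard library and does not cover the OAuth steps.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SocialGraphLinkFinder(HTMLParser):
    """Collects href values of <link> tags whose rel includes "socialgraph"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if "socialgraph" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover_social_graph(page_url, html):
    """Return absolute URLs of any social graph files advertised by the page."""
    finder = SocialGraphLinkFinder()
    finder.feed(html)
    return [urljoin(page_url, href) for href in finder.hrefs]


# Hypothetical page, using the link tag from step 2.
PAGE = (
    '<html><head><link rel="socialgraph" '
    'href="/user/4426/socialgraph.rdf" type="application/rdf+xml" />'
    "</head><body>Hello</body></html>"
)
print(discover_social_graph("http://example.com/community", PAGE))
```

A desktop notifier could run the same function after the user types in a site URL, then start the OAuth request only if the returned list is not empty.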

So, that was a VERY Flock specific flow, but as I mentioned with Twhirl up above, there is absolutely no reason that you couldn't do the same thing with those types of people notifiers in your desktop apps / widgets / etc. -- just start by typing in the URL of the website, the app would go and discover the social graph link and/or initiate an OAuth request to authenticate, and all of a sudden you're directly monitoring the different community websites you're a part of. Bonus points to websites that expose the social graph as an XMPP Pub Sub endpoint so these apps don't need to poll constantly.

Now, I know the first thing we're going to have to do is fight a religious war over the format of the socialgraph file. I'm going to suggest some minimal FOAF format, since I'm a born again RDF fan. I don't want to go spraying email addresses all over the place, so perhaps either local unique user GUIDs or OpenID could be used as identifiers for each person. We actually don't need full "person" information -- a username, avatar, status message, and date stamp for last activity sorting should be the minimal set. Even the status message could be optional for smaller, less complex sites so almost anyone could support this out of the box: just show everyone on the site (yes, that's right...ignore any sort of "friend" connection) sorted by last active -- which could be a post / comment, or (again, simple support by many sites...) just the date stamp of the last login.

I'd like to think that the choice of OAuth as credentials for accessing this info isn't controversial at all. Feel free to layer OpenID in here somehow, but for the action-at-a-distance on which cool functionality can be built, this kind of a token system looks to be ideal.

What next? Well, surprise, surprise, I'm going to take a crack at getting this implemented in Drupal. Raincity Studios is already working on the OAuth module, which would be one of the main pre-requisites. Once the format of the social graph file is defined (calling Joshua, Arto, and maybe RalphM...), building the next piece shouldn't be too hard.

Ideally, something like the Gnomepal Drupal distribution would ship with this out of the box (for the really ambitious, Drupal 7 core!). And other systems like Marc Canter's People Aggregator could easily expose this social graph info as well.

I'm excited at the continuing growth of every website as a dynamic web application, and also of the exposure of data and APIs by this web of sites. This feels like the right path we're travelling on to get everything a little bit more interconnected.
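As a rough sketch of the minimal set described above (identifier, username, avatar, optional status message, last activity), here is what a client might hold in memory after parsing a social graph file. The field names, the `Person` class, and the sample data are assumptions for illustration; the actual file format is still undefined.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Person:
    identifier: str                # local GUID or OpenID URL, never an email address
    username: str
    avatar_url: str
    last_active: datetime          # last post / comment, or simply last login
    status: Optional[str] = None   # optional, so simpler sites can skip it


def people_bar(people: List[Person]) -> List[Person]:
    """Order everyone on the site by most recent activity, ignoring friend links."""
    return sorted(people, key=lambda p: p.last_active, reverse=True)


# Hypothetical sample data.
PEOPLE = [
    Person("urn:uuid:0001", "alice", "http://example.com/a.png",
           datetime(2008, 3, 1, 9, 0), status="Writing docs"),
    Person("http://bob.example.org/", "bob", "http://example.com/b.png",
           datetime(2008, 3, 2, 18, 30)),
    Person("urn:uuid:0003", "carol", "http://example.com/c.png",
           datetime(2008, 2, 27, 12, 0), status="At lunch"),
]

for person in people_bar(PEOPLE):
    print(person.username, person.last_active.isoformat(), person.status or "")
```

Keeping the status optional and sorting only on the activity date stamp means a site that tracks nothing more than last login could still feed a people bar.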

I think I might become an RDF fan boy (again)

I'm currently in Stuttgart, Germany doing some Drupal client work in a gathering that we've come to call "Geek Week". We sit down and look at internal requirements and do 3/6/12 month planning, matched up with the state of the Drupal universe. But more on that later, probably over at RCS (the "In Drupal We Trust" t-shirts were popular).

One of the "geeks" attending here is Arto Bendiken. Check out his projects page for an example of some of the stuff he's worked on. For Drupal folks, that would be timeline - AJAX widget for visualizing temporal information, boost - static page caching for Drupal, drush - command line shell for Drupal, trace - easy debugging for Drupal, exhibit - rich visualization and faceted browsing. Yes, that is impressive :P

So, Arto and I got to talking about RDF (rdfabout.com is a good primer site), and how it's the new black. I admit that I've felt that XML vs. RDF is (almost?) a religious war. It seems to me that pointy haired bosses (PHBs) have memorized that RDF == slow and complex, and XML == fast and ubiquitous. Since in selling concepts I often interact with PHBs, RDF has felt like an uphill battle, especially as RSS/Atom grow more and more widespread.

But.

Idea Fragments: Face to Face with Ton Zijlstra, Part 2

See also: Feed Reading via People: Face to Face with Ton Zijlstra, Part 1

I wanted to share with Ton some of the information that came up during DrupalCon.

  • NINA -- a bundle of Semantic Web (yes, big "S" -- it is an RDF store) tools being built on top of Drupal. Don't forget the Relationship module -- it's still around and still very interesting. RDF for a long time has had the perception of complexity. Secondly, it has had real performance issues as tuples can't natively be represented in a relational database. Both of these are chicken-and-egg issues, I believe. If there were more running code in an easily accessible platform that automatically created and worked with RDF metadata...well, we might see some interesting things :P
  • Mixel and Ton should meet/talk more. Mixel's KNOSOS (Knowledge Sharing over Social Software) project has built some very interesting tools on top of Drupal and is continuing to evolve; we're all looking forward to seeing more on his tag visualization.

Ton uses Qumana to easily cross-post to multiple locations. I'm unhappy with this as a solution, as context is actually lost. That is, there should be one canonical/permanent/referenceable URI for each blog post...or at least a GUID. Atom does this well; we just need it to be better automated. To be fair, Ton does customize and/or intro each duplicated cross post, adding local context. Ideally, there is a field in the feed specs that could be used for this purpose without changing the GUID -- I may want to see his local context hints, but I don't want to see the same article three times from him. I know I feel the same pressure/need -- this collection of posts should be at Bryght as well as B. Mann Consulting because we haven't yet enabled automatic aggregation of targeted posts.
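On the reader's side, a stable GUID is what makes it possible to collapse cross-posts. A minimal sketch, assuming each feed entry has already been parsed into a dict with hypothetical `guid`, `source`, and `intro` keys (these names are made up, not taken from any feed spec):

```python
def dedupe_by_guid(entries):
    """Keep one entry per GUID, but collect each site's local intro text."""
    merged = {}
    for entry in entries:
        guid = entry["guid"]
        if guid not in merged:
            # The first copy seen becomes the article the reader is shown.
            merged[guid] = {"guid": guid, "title": entry["title"], "contexts": []}
        if entry.get("intro"):
            merged[guid]["contexts"].append((entry["source"], entry["intro"]))
    # Dicts preserve insertion order, so articles stay in first-seen order.
    return list(merged.values())


# Hypothetical cross-posted entries sharing one GUID.
ENTRIES = [
    {"guid": "tag:example.com,2006:post-1", "title": "Idea Fragments",
     "source": "blog-a", "intro": "For the local crowd:"},
    {"guid": "tag:example.com,2006:post-1", "title": "Idea Fragments",
     "source": "blog-b", "intro": "Cross-posted from my other blog:"},
    {"guid": "tag:example.com,2006:post-2", "title": "Another Post",
     "source": "blog-a", "intro": ""},
]

for article in dedupe_by_guid(ENTRIES):
    print(article["title"], len(article["contexts"]))
```

The article shows up once, and the per-site intros are kept alongside it as the local context hints.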