February 17th, 2005


FOAF Update Patch

Since I now have someone to pass the torch on to for the time being, I took the time to put together a decent patch for LiveJournal's FOAF output. There are a number of distinct aspects to improving that output, and this patch includes all of them. This entry is an annotated version of the patch, explaining what I'm changing and why.

The patch can be found at http://crschmidt.net/lj/foaf.patch.txt .

LJ::load_user_props($u, qw{
- aolim icq yahoo jabber msn url urlname external_foaf_url
+ aolim icq yahoo jabber msn url urlname external_foaf_url state city country journaltitle

Here, you can see that I am adding four user props for loading: state, city, country, and journaltitle. These will all be used, if available, later in the FOAF file.

- $ret .= " xml:lang=\"en\"\n";

Pulling out the xml:lang="en", because it really isn't serving the purpose it should. Although there is *some* text in the document which is in the English language, much of it is going to be nicknames or other fields which are not language specific, and tagging these as English turned out to be a bad idea. In addition, sha1sums were also being tagged as English, which means that in some cases they aren't seen as "equal" to the same sha1sum in another file, because the language tags differ.
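To illustrate the sha1sum problem: in RDF, a literal with a language tag is a different term from a plain literal with the same characters. Here is a minimal sketch that models a literal as a (value, language) pair. The literal_eq helper and the sample hash value are made up for illustration; they are not part of LiveJournal or of any RDF library.

```perl
use strict;
use warnings;

# Hypothetical model: an RDF literal is a pair of (lexical value, language tag).
# A plain literal has an undefined language tag.
sub literal_eq {
    my ($x, $y) = @_;
    my ($xv, $xl) = @$x;
    my ($yv, $yl) = @$y;
    return 0 unless $xv eq $yv;
    # Both untagged: same term. One tagged and one not: different terms.
    return 1 if !defined $xl && !defined $yl;
    return 0 if !defined $xl || !defined $yl;
    # Language tags compare case-insensitively.
    return lc($xl) eq lc($yl) ? 1 : 0;
}

my $sum    = 'da39a3ee5e6b4b0d3255bfef95601890afd80709';   # sample value only
my $tagged = [$sum, 'en'];     # what an inherited xml:lang="en" produces
my $plain  = [$sum, undef];    # what a file without xml:lang produces

print literal_eq($tagged, $plain) ? "same\n" : "different\n";
```

A consumer trying to merge people on a shared sha1sum would treat these two values as unrelated, which is the failure described above.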

The next chunk looks a bit funky, but I'm not about to hack diff to make it look better, so I'll just explain: the birthdate and Group/Person code is moving down a bit. The first part of the file will now be a "PersonalProfileDocument". This lets spiders work out which "person" the file is mainly about, without needing LiveJournal specific knowledge. To that end, I've also added an rdf:nodeID='a', to link from the profile document to the <foaf:Person> who is the maker/center of the document. The reason that the nickname and birthdate are moving down is that they were previously displayed even if the user had an external FOAF URL, which is not acceptable: with an external FOAF URL, as much data as possible should be left out.
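As a rough sketch of the shape this gives the top of the file: the profile_header helper below is hypothetical, and the choice of foaf:maker plus foaf:primaryTopic is my assumption about how "maker/center" maps onto FOAF terms, so the exact elements in the patch may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the opening of a FOAF file for one user.
# $nick is assumed to be XML-safe already.
sub profile_header {
    my ($nick) = @_;
    my $ret = '';
    # The document describes itself and points at its central person
    # through a blank node labelled "a".
    $ret .= "  <foaf:PersonalProfileDocument rdf:about=\"\">\n";
    $ret .= "    <foaf:maker rdf:nodeID=\"a\"/>\n";
    $ret .= "    <foaf:primaryTopic rdf:nodeID=\"a\"/>\n";
    $ret .= "  </foaf:PersonalProfileDocument>\n";
    # The person carries the same node ID, so the two descriptions join up.
    $ret .= "  <foaf:Person rdf:nodeID=\"a\">\n";
    $ret .= "    <foaf:nick>$nick</foaf:nick>\n";
    $ret .= "  </foaf:Person>\n";
    return $ret;
}

print profile_header('exampleuser');
```

A spider only has to follow the node ID from the PersonalProfileDocument to find the person the file is about.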

The next chunk:

- # include a user's journal page and web site info
- $ret .= " <foaf:weblog rdf:resource=\"" . LJ::journal_base($u) . "/\"/>\n";
+ # include a user's journal page, name/nick and web site info
+ $ret .= " <foaf:name>" . LJ::exml($u->{name}) . "</foaf:name>\n" if $u->{name};
+ $ret .= " <foaf:nick>$u->{user}</foaf:nick>\n";
+ $ret .= " <foaf:logo rdf:resource='$LJ::USERPIC_ROOT/$u->{defaultpicid}/$u->{userid}' />\n" if ($u->{defaultpicid});
+ $ret .= " <foaf:weblog rdf:resource='" . LJ::journal_base($u) . "/'";
+ $ret .= " dc:title=\"" . LJ::exml($u->{journaltitle}) . "\"" if $u->{journaltitle};
+ $ret .= "/>\n";

is probably the most controversial change of the bunch. There are three things happening here:
  1. foaf:name is added, set to the "Name" field stored for the user
  2. default userpic is added, as a foaf:logo. There was a *lot* of discussion about this in the foaf community, and it was decided that this was the closest thing that there was to a foaf-specific tag.
  3. journaltitle, if it is available, is added to the weblog node.

The next two chunks are well laid out and self explanatory. The first reformats the foaf:dateOfBirth (which has been redefined and no longer means what LiveJournal uses it for) into a bio:birth Event with a date. The second includes the location information as a vcard:ADR, the most accurate way of describing said information at the moment.

If anyone has any thoughts on things I've missed, or would like me to explain my reasoning more for a specific choice, please feel free to ask, and I will do my best to answer.

http://crschmidt.net/lj/foaf.patched.xml is an example of the output from the new code, although it is not fully representative, as I have neither a birthdate nor location set.

Edited: Mutiny, on #foaf on irc.freenode.net, mentioned that I wasn't properly escaping the $u->{name} field. Escaping isn't needed for the nick ($u->{user}), since it can't have any goofy characters, but name certainly can.
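For anyone curious what the escaping looks like in practice, here is a small sketch. exml_sketch is a stand-in I wrote for illustration, assuming LJ::exml escapes the usual XML special characters; it is not LiveJournal's implementation, and the user hash is sample data.

```perl
use strict;
use warnings;

# Stand-in for LJ::exml, assumed here to escape XML special characters.
# Illustrative only; not LiveJournal's implementation.
sub exml_sketch {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;      # ampersand first, so later entities stay intact
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/'/&apos;/g;
    return $s;
}

# Sample data: a display name full of goofy characters.
my $u = { user => 'exampleuser', name => 'Tom & "Jerry" <tj>' };

my $ret = '';
# Function calls are not interpolated inside double quotes, so the
# escaped value is joined in with the concatenation operator.
$ret .= "  <foaf:name>" . exml_sketch($u->{name}) . "</foaf:name>\n" if $u->{name};
$ret .= "  <foaf:nick>$u->{user}</foaf:nick>\n";

print $ret;
```

Without the escaping step, the raw ampersand and angle brackets in the name would leave the FOAF file ill-formed for any XML parser.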