Sunday, January 22, 2006

A Different Implementation of Relation-Chaining

OK, I wasn't really happy with the implemenation the relation-chaining thing I did last week, so I rewrote it. This won't make a lot of sense if you haven't read last week's installment, so go read it if you haven't yet...

This time I have the forward-linking operator sort of equivalent to the python "." operator. If romulus.father == mars, then romulus >> father == rset([mars]); in other words, the set containing mars. rset is a class derived from set. It's convenient to work with sets, because in the case of the reverse-linking operator, there could be more than one result, and I wanted some symmetry between the two cases.

To do the reverse linking, I use gc.get_objects(), which is python's garbage collector's master list of all the objects in the system. I guess that's not very efficient for a large program. In a large program I'd recommend creating a "universe" dictionary, put the objects you really care about it in it, then search that instead of get_objects().

A nice result of using sets is shown in the brother() function; the set operator "-" means to remove items from a set, so (x >> "father" << "father") - x means: find the father of x, then find the people x is a father to, then subtract x from that set, leaving only the siblings.

The only reason I create my own set class, or my own relational_object class, is to get the syntax of << and >> to work. Really, this is a demonstration of a feature I think could be central to a language, and in that hypothetical language you'd build this feature in so it could be used on any objects.

Well, here's the code. I promise I'll move on to something different next week! I just had to get this out of my system.

#!/usr/local/bin/python2.4
from gc import get_objects

class relational_object(object):
def __repr__(self):
return ""

def get_conv_gen(self, att):
for other in get_objects():
try:
if other.__dict__[att] == self:
yield other
except Exception, e: pass

def get_conv_any(self, att): return self.get_conv_gen(att).next()
def get_conv_all(self, att): return rset([ x for x in self.get_conv_gen(att)])

def __rshift__(self, att): return rset([self.__dict__[att]])
def __lshift__(self, att): return rset(self.get_conv_all(att))

class rset(set):
def __rshift__(self, att):
result = rset()
for x in self:
result = result | rset([x.__dict__[att]])
return result
def __lshift__(self, att):
result = rset()
for x in self:
result = result | x.get_conv_all(att)
return result

class person(relational_object):
def __init__(self, theName):
self.name = theName
def __repr__(self): return "<>"

mars = person("mars")
romulus = person("romulus")
remus = person("remus")
assert mars == mars
assert mars != remus

romulus.father = mars
remus.father = mars

assert mars << "father" == rset([romulus, remus])

Saturday, January 14, 2006

Relation-chaining gimmick in Python

This is just a piece of a whole idea, but I thought you might find it amusing.

There is an RDF notation called N3 that has an interesting syntax for quickly describing a chain of relations linking one node to another.

Here are some examples from the N3 paper:

:joe!fam:mother!loc:office!loc:zip   Joe's mother's office's zipcode
:joe!fam:mother^fam:mother Anyone whose mother is Joe's mother.
:joe ("Colon Joe" -- sounds like a brand of medicinal coffee, doesn't it?) is an RDF node. The other strings with colons in them, fam:mother, loc:office, and loc:zip, represent binary relations. I think the part before the colon tells you the namespace it's from, but ignore that for now.

The ! is like the "." operator in most object-oriented languages. :joe!fam:mother means "Joe's Mother", and you might say this as joe.mother in some language like Python or C.

The ^ is the converse of !; so if joe!mother == sue, then sue^mother == joe. You can think of !mother as meaning "mother" and ^mother as meaning "child". ^employer means employee, ^owner means possession. "hello, world"!length == 12, so 12^length == "hello, world" (among many other strings).

My mind was brought back to this when I was working on a project with a bunch of related objects in Python, and I was looking for a clean way of expressing complicated queries among them. In my day job I do a lot of ad-hoc SQL querying, and while I don't love SQL, it is nice to be able to fluently, readably, ask some complex questions of a bunch of relations. I'd read that list comprehensions were more or less isomorphic to SQL, but I found when playing with them that my queries were inefficient and not very readable; they were no better than just writing explicit code to traverse the object tree and collect the information I wanted.

I'm not sure if these N3-style paths will be useful in practice, yet, but it was worth playing around with them. The biggest hurdle with them, as I see it, is that there's no easy way when presented with 12^length to know what string is wanted, so you have to have a well-defined universe of objects to work with, and be willing to accept multiple results from every query.

So anyway, here's some code to give you an idea what I've come up with so far:

import pprint

class relation:
def __init__(self):
self.forward = dict()
self.backward = dict()

def add(self, arg1, arg2):
if not self.forward.has_key(arg1):
self.forward[arg1] = dict()
if not (arg2 in self.forward[arg1]):
self.forward[arg1][arg2] = True
if not self.backward.has_key(arg2):
self.backward[arg2] = dict()
if not (arg1 in self.backward[arg2]):
self.backward[arg2][arg1] = True

def __call__(self, arg1, arg2):
try:
return self.forward[arg1][arg2]
except Exception, e:
return false

def get_all_keys(self, hash, key):
try:
return hash[key].keys()
except Exception, e:
return []

def __rlshift__(self, other):
result = []
for item in other:
result = result + self.get_all_keys(self.forward, item)
return result

def __rrshift__(self, other):
result = []
for item in other:
result = result + self.get_all_keys(self.backward, item)
return result

father = relation()
phone = relation()

class frank: pass
class jeff: pass
class chris: pass
class andy: pass

father.add(frank, jeff)
father.add(jeff, chris)
father.add(jeff, andy)
phone.add(andy, "101-555-1249")
assert [chris] >> father == [jeff]
assert [jeff] << father == [chris, andy]
assert [chris] >> father == [jeff]
assert [jeff] << father == [chris, andy]
assert [andy] << phone == ["101-555-1249"]
assert [chris] << phone == []
assert [andy, chris] << phone == ["101-555-1249"]
assert (([chris] >> father) << father) << phone == ["101-555-1249"]
assert [chris] >> father << father << phone == ["101-555-1249"]

Some things to note about this implementation:
  • I used classes like "jeff" and "chris" for nodes. You could use any Python objects; the "class X: pass" syntax is just a cheesy quick way to create a dummy object to play around with. In Python a class is a kind of object.
  • I used >> and <<>
  • nstead of the RDF "node", I use a list with one or more python objects in it, in this case strings. I'm using a list as kind of a half-assed monad. Since a >> or <<>
  • I define a relation object which holds all its pairs of related things in hashes. It's probably not very efficient. I originally thought I'd just use the members of objects as the relations, so that the natural joe.mother would be the basis for the relation joe >> mother. But that didn't fit in very well with the list thing. If Heather has two mommies, you can have:

    mother("Ann", "Heather")
    mother("Patricia", "Heather")
    assert ["Heather"] >> mother == ["Ann", "Patricia"]
    as opposed to

    heather.mother = ann
    heather.mother = patricia
    where Patricia just overwrites Ann. This makes Ann sad.

    Maybe there's a better way around this.

  • I'm not sure I've got the precedence the same as N3 has it.
  • It should be sets, not lists. ["Ann", "Patricia"] should be considered the same result as ["Patricia", "Ann"].
Next time I'll try to apply this to a more interesting problem to see if it holds up.

Friday, June 11, 2004

Paul Graham also has some pie-in-the-sky thoughts about computer languages; see his articles, Hundred-year language and his work on a new dialect of Lisp, Arc. Arc is a little more down-to-earth than the Tunes HLL, and he appears to have done some work on it.

The Tunes project has some requirements for an unnamed "High Level Language" for their project, which has a lot of good ideas init. As far as I can tell looking at the web page, it's kind of vaporware at this point. And maybe doomed to remain vaporware, since the requirements seem so abstract and idealistic. But anyway, some of their ideas have crossed my mind too, and they might at least server as a thoughtful critique of how current programming langauges feel inadequate to people.

Thursday, May 20, 2004

Two good articles by Cory Doctorow (Metacrap) and Clay Shirky (The Semantic Web, Syllogism, and Worldview), expressing skepticism about the Semantic Web.


Shirky says most human thought isn't deduction from syllogisms, it's much more fuzzy and heuristic than that. Doctorow points out lots of good reasons why metadata will be crap in practice.


These ideas relate to better programming languages too. There's an impedance mismatch between human concepts and the way computer software systems tend to classify things. We just don't think as rigidly about things as computers do. We get frustrated with how software will go mindlessly clanking down the wrong path (for example happily cranking out copies of a virus and emailing them to all your friends), but when we try to build smarts into it, we're frustrated by just how dumb those smarts turn out to be.


Human computer interfaces, from GUIs to languages to development and testing environments, need to help our squishy minds navigate the bizarre tinkertoy syllogismatrons we're forced to build. While we frantically look for ways to teach software to think more like we do.

Sunday, April 25, 2004

I just stumbled on a weblog about programming languages and their design, called Lambda the Ultimate. Similar to what I want to do here, although I want to focus more specifically on languages that bridge the gap in some interesting way between natural human languages and traditional computer languages.

Thursday, April 22, 2004

Apocalypse 12 is out.

In case you live in a cave, the Apocalypses are Larry Wall's occasional manifestos (manifesti?) about how some aspect of Perl 6 will work. I read them with great interest because I like how Wall thinks, and I'm an avid user of Perl, and Perl 6 has some great ideas in it. I like hyperoperators and the new regular expression syntax. That being said, I'm getting kind of worried that Perl 6 is going to be too crufty. Look at what you can do with subroutine signatures. It seems unnecessarily complex and cryptic.

Of course I thought that back when I was learning Old Perl, but now I know it well and can use pretty efficiently, so I'll withhold judgment on Perl 6 until I get a chance to use it.

Monday, April 19, 2004

What this blog is about

The last post was mostly a test. What I want to do here is review computer languages that bridge the gap between computer language and human language in some interesting way.

Lojban is an attempt to make a human language somewhat more approachable by computers -- it has a yacc-parseable grammar that is syntactically unambiguous. In practice so far it has been used for communication between humans, not to specify or constrain a computer's behavior in any way. But it's a good example of an effort to think about some of the differences between the two kinds of language and find halfway points.

I do think that someday computers will be able to cope with human language. For programming I think we'll always need something a little more precise (like the language used in human legal contracts, maybe) to avoid some ambiguity.
But that day could be a ways off; why not meet the machine halfway for now? High-level languages have come a long way from raw machine code in doing this; I'm interested in seeing in what other ways computer languages can be more human without requiring outright artificial intelligence in the computer that uses them.