All Possible Worlds

Let's try to keep things focused here

Introducing Pacer

with 3 comments

Pacer is a Ruby implementation of the Gremlin graph traversal language. It’s built on JRuby and uses the same very cool graph processing stack that Gremlin uses to do high speed pure Java graph processing, but with lots of extra Ruby goodness baked in.

This post is meant as a high level overview of the various features of Pacer and how it is both similar to and different from Gremlin.

Similarity to Gremlin

Pacer does traversals in exactly the same way as Gremlin. It uses the same Pipes framework built on the same Blueprints API. Features developed for one are able to be easily implemented in the other. The pipe transformation paths feature I initially implemented for Pacer is a good example of that. Gremlin adopted that feature the same day that my ‘paths’ branch was merged into the main Pipes repository.

To illustrate the similarity, here are some examples of both simple and complex traversals, first using Gremlin, then using Pacer:

Open a graphML file:

# Gremlin:
$_g := tg:open()
g:load('data/graph-example-1.xml')
# Pacer:
graph = Pacer.tg 'data/graph-example-1.xml'

Traverse to all vertices that have been ‘favorited':

# Gremlin:
./outE[@label='favorite']/inV
# Pacer:
graph.out_e(:favorite).in_v

Complex Example:

Let’s skip straight to a much more complex example borrowed from one of Marko’s Gremlin presentations here (slide 114). Assuming I am represented by vertex(1), this traversal recommends new things for me to like based on shared interest. It returns the most commonly liked things from among others who also like what I do, excluding the things that I already like.

# Gremlin:
$_ := g:id-v(1)
$m := g:map()
(./outE[@label=‘favorite’]/inV)[g:assign(‘$x’)]/inE[@label=‘favorite’]/outV/outE[@label=‘favorite’]/inV/.[g:except(.,$x)][g:op-value(‘+’,$m,.,1)]
g:sort($m,‘value’,true)
# Pacer:
vertex = graph.vertex(1)
m = vertex.out_e(:favorite).in_v.as(:x).in_e(:favorite).out_v. out_e(:favorite).in_v.except(:x).group_count { |v| v }
m.sort_by { |vertex, count| -count }

I think it should be clear that both Pacer and Gremlin have virtually identical traversal capabilities and even quite similar syntax. They are both so descriptive that it is actually easier to define a traversal than to explain it clearly in plain english.

Differences

Despite all of the similarities, there are still a number of differences between the two projects. When I first discovered Gremlin I was blown away, but I quickly missed having access to my familiar Ruby toolkit. What started as an experiment in reimplementing Gremlin in Ruby quickly became extremely useful to me in my daily work and turned into what I think is a very cool project with a life and direction of its own.

Let’s get two very fundamental differences out of the way first. First, Pacer uses Ruby as it’s language, employing method chaining to define traversal routes while Gremlin defines its own using a combination of xpath-like traversal definitions with its own procedural syntax for other functions. Secondly, Pacer does not have the concept of a current vertex or a current graph; the starting point of any traversal is explicit and there are no special variables like Gremlin’s $_g and $_.

Lazy Routes, Block Filters, Useful Tools, Branching and Merging, Manual Transaction Control, Contextual Steps

That’s quite a subtitle! Consider this a placeholder for a few shorter upcoming posts detailing each of these features.

Features Not Implemented Yet (coming soon)

Pacer’s pretty new and there are a number of things that I simply haven’t had a chance to implement yet. You can look forward to a steady flow of new features, including:

Lookaheads

Gremlin has a really cool feature that I actually didn’t initially understand and so implemented incorrectly. That’ll be corrected very soon, and it’ll make certain types of traversals quite a bit faster than they currently are in Pacer. Here’s an example in Gremlin, followed by the way I plan to do it in Pacer, of a traversal producing only people who like graphs:

# Gremlin:
$_g/V/outE[@label="likes"]/inV[@name="graph"]/../..
# Pacer with the new lookahead method:
graph.v.lookahead { |v| v.out_e(:likes).in_v(:name => 'graph') }

# Pacer with the current method:
graph.v.filter { |v| v.out_e(:likes).in_v(:name => 'graph').any? }

Faster Conditionals

Pacer’s a little weak on conditional statements right now. It’s good at testing equality. Anything else and you’re on your own! Actually you can do anything you want if you use a block filter, but I think it’s possible to do most of what you need more efficiently, so that’s what I plan to do. Here’s the plan:

# Pacer with arbitrary boolean expressions:
graph.v({ :lat.gt => 43, :lon.lte => 79 }.and({ :name.match => /pangloss/i }.or( :name => 'eliza' )))

# Pacer with the current method:
graph.v.filter { |v| v[:lat] > 43 and v[:lon] <= 79 and ( v[:name] =~ /pangloss/i or v[:name] == 'eliza' ) }

Once I’ve put together some benchmarks, I’ll decide whether this actually makes it into the project.

Conclusion

I hope this post has piqued your curiosity at least. Pacer‘s a very cool project made possible by the amazing work of TinkerPop. In my next posts I’m going to start showing you the features of Pacer that make it really shine!

Written by pangloss

December 19, 2010 at 10:06 pm

Posted in Pacer

Tagged with , , ,

Follow

Get every new post delivered to your Inbox.