Talking To The Graph (“Obligatory Neo4j” Part 3)

This is my third post on the popular open-source graph database Neo4j, and I’ll be taking a look at APIs.  To see the full series, you can check out the Neo4j tag on this blog.

First of all, a quick reminder of our “set text”, the Graph Databases e-book from O’Reilly and Neo Technology.  Free, concise, full of useful info – what’s not to like?  Download it here.

To recap then… we’ve had a look at Neo4j and the Cypher query language.  We like what we see.  Expressive language, great possibilities for data modelling and a browser interface that draws us pretty pictures on demand:

[Images: Neo4j browser screenshot; Neo4j shopping data model]

Awesome!  So all we need to do now is install Neo4j on a server somewhere and give our business users a big text file full of common Cypher queries.  That can’t possibly go wrong!

Sorry, what’s that?  We need to integrate the database with existing in-house systems built in some kind of statically typed, object-oriented language like Java or C#?  And the web team are saying they’ll need to connect from their back-end servers using something lightweight and dynamic like Ruby or Python.  Suddenly things are looking slightly more complicated…

Thankfully, this is not a problem as Neo4j offers a variety of ways to connect from other languages and systems.  The main route in is via the REST API, which (as you would expect) can be called from any language that can piece together the requisite lump of JSON and fire it across an HTTP connection.

There are at least a couple of different ways to structure REST calls, but the best approach is typically to send actual Cypher queries as part of the JSON payload.  This is because Cypher and a declarative approach to data interaction are powerful, well-supported and seen as a big part of the roadmap for future Neo4j development.  Queries can be parameterised for better performance, and can be submitted either via the standard Cypher endpoint or via the Transactional endpoint, which allows an RDBMS-like atomic transaction to be scoped across multiple consecutive HTTP requests.
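To make that a bit more concrete, here is a rough Python sketch of what sending a parameterised Cypher query over REST might look like.  It assumes a Neo4j 2.x-era server on localhost:7474 with no authentication, and uses the transactional endpoint’s “begin and commit in one request” form (db/data/transaction/commit).  The helper names build_cypher_request and run_cypher are my own inventions for illustration, not part of any library, and later Neo4j versions changed both the endpoints and the parameter syntax.

```python
import json
import urllib.request

# Assumed base URL for a local Neo4j 2.x server; adjust for your install.
BASE_URL = "http://localhost:7474/db/data"


def build_cypher_request(statement, parameters=None, base_url=BASE_URL):
    """Return (url, body_bytes, headers) for a one-shot transactional request.

    Hypothetical helper: it only assembles the JSON, it does not send it.
    """
    payload = {
        "statements": [
            {"statement": statement, "parameters": parameters or {}}
        ]
    }
    url = base_url + "/transaction/commit"
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return url, json.dumps(payload).encode("utf-8"), headers


def run_cypher(statement, parameters=None, base_url=BASE_URL):
    """POST the statement to the server and return the decoded JSON reply."""
    url, body, headers = build_cypher_request(statement, parameters, base_url)
    request = urllib.request.Request(
        url, data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building a request does not need a running server.
# {name} is the Neo4j 2.x parameter placeholder syntax.
EXAMPLE_URL, EXAMPLE_BODY, EXAMPLE_HEADERS = build_cypher_request(
    "MATCH (n {name: {name}}) RETURN n", {"name": "Widget A"}
)
```

Calling run_cypher with the same arguments would actually send the query, provided a server is listening.  Keeping the values in the parameters map rather than pasting them into the query string is what lets the server reuse the query plan.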

Naturally, users of various languages have decided that they want to consolidate their use of the REST API into libraries, and many of these are freely available, with links collected on the Neo4j website.  A lot of common languages and frameworks – Ruby/Rails, Python/Django, .NET, PHP – get at least a couple of options each, and there’s even one for Haskell if you’re feeling particularly masochistic.



I was actually planning to look quickly at options for a couple of different languages that I’m somewhat familiar with – a .NET one using C#, and one of the two available for Python.  The idea was that I could do different graph models which would have some relevance to the way the languages are perceived – something “Enterprisey” for .NET, and for Python probably some terrible joke about Spam.

It turned out though that once I got started coding in C#, I spent longer than I expected messing around with associating nodes with classes and trying to work out how I might use it in a real-world situation.  Always the way!  So the Python will have to wait, and in the meantime, courtesy of my new GitHub account, here is some code for an extremely basic implementation of an “Enterprise Widget Manager”, persisted in Neo4j.

So first of all, the library itself.  Simply because it was nearer the top of the list, I chose Neo4jClient over CypherNet for my first attempt.  Neo4jClient is the product of Tatham Oddie and Romiko Derbynew of .NET consultancy Readify.  It installs straightforwardly via NuGet (always a bonus) and should work with any other language that can run on the .NET CLR – it has certainly been tried out with F#.

Overall, it worked out pretty well for me.  I managed to come up with some code which – while not especially elegant or sophisticated in terms of my own contribution – did run the Cypher queries I wanted against my install of Neo4j on localhost:7474.  The basic idea was to create a simple parts diagram (components used in multiple assemblies) – the sort of thing which, when scaled up, probably does lend itself better to graph modelling than to a relational database.

[Image: Neo4j WidgetManager graph]

The program runs as a simple console app, creating the graph pictured above (obviously I had to connect through the browser to get the visual version), then querying for and displaying a list of components used to make both “Widgets”:
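For anyone who wants the gist without opening the C# project, the “used in both Widgets” question can be sketched roughly as follows.  To be clear, this is not the code from my repository: the Component and Widget labels, the USED_IN relationship and all of the part names are made up for illustration.  The Cypher string shows one plausible way to phrase the query (Neo4j 2.x parameter syntax), and the plain Python underneath checks the same logic against a tiny in-memory version of a parts graph.

```python
# One plausible Cypher phrasing; labels, relationship type and property
# names are hypothetical, not taken from the actual project.
SHARED_COMPONENTS_QUERY = """
MATCH (a:Widget {name: {first}})<-[:USED_IN*]-(c:Component),
      (c)-[:USED_IN*]->(b:Widget {name: {second}})
RETURN DISTINCT c.name AS component
"""

# Toy parts graph: each key is "used in" every item in its list.
USED_IN = {
    "Bolt": ["Frame", "Motor"],
    "Spring": ["Frame"],
    "Gear": ["Motor"],
    "Panel": ["Widget B"],
    "Frame": ["Widget A", "Widget B"],  # assembly used in both widgets
    "Motor": ["Widget A"],              # assembly used in one widget
}

# Leaf parts only; Frame and Motor are assemblies, not components.
COMPONENTS = {"Bolt", "Spring", "Gear", "Panel"}


def reachable(used_in, start):
    """Everything that 'start' ends up in, following USED_IN transitively."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in used_in.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def shared_components(used_in, components, first, second):
    """Components that end up in both 'first' and 'second', sorted by name."""
    wanted = {first, second}
    return sorted(c for c in components if wanted <= reachable(used_in, c))


print(shared_components(USED_IN, COMPONENTS, "Widget A", "Widget B"))
```

The Python version has to spell out the traversal by hand, which is exactly the work that the variable-length pattern in the Cypher query hands over to the database.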

[Image: Neo4j Widget Manager console (CMD) output]

I found the documentation pretty helpful for getting up and running fast – it makes it very clear that you should read everything through properly before getting into any serious coding, but it also gave me enough examples and hints to make a quick start on trying it out.  I’m sure a lot of the other third-party connection options are also really good, and I may have a look at them in future.

Going Deeper

Some of you who are really observant may have noticed that, apart from one throwaway reference, I haven’t mentioned the possibility of connecting from Java.  This isn’t because it isn’t possible – quite the opposite.  Neo4j is implemented in Java, and in fact was originally designed as an embedded database for applications running on the JVM.  So if you want to get closer to the underlying database engine – whether for performance reasons or to tweak some functional aspect of the system that Cypher and the default REST API don’t give you access to – you’ll need to be comfortable working with the Java language, or at least something like Clojure, which compiles to JVM bytecode and can communicate directly with Neo4j in embedded mode.

There are three additional Neo4j APIs that Java provides access to:

  • The Traversal API
  • The Core API
  • The Kernel API

Moving down that list, each API gets further from the expressive, declarative modelling approach exemplified by Cypher, but in return allows you to work closer to the metal and permits a greater degree of fine-tuning and performance tweaking.

You can use these APIs when running the database in embedded mode, or there’s also the option to write custom “server extensions” to the REST API, using the Java APIs to redefine behaviour in response to specific REST calls.

The final thing you can do with Java is hack the code base of Neo4j itself.  It’s open source and Neo’s own Max De Marzi provides a great example of how to take advantage of that here.

Now With GitHub + Pointers

I’ve just set up an account on GitHub – not much to see on there other than one very basic C source file, from one of the exercises in Chapter 1 of Kernighan & Ritchie – but watch this space.

Why GitHub?

Pretty normal reasons I guess:

  • The number of (strictly non-work-related) code snippets I’ve been emailing back and forth between home and work was getting a bit silly.
  • If I ever do put anything decent on there it acts as a portfolio of sorts, and it’s nice to be able to discuss code by linking to fully versioned examples.
  • I use TFS at work and it seems like a good idea to see what else is out there.  I’m by no means a Microsoft hater but it’s not hard to suppose they don’t get everything right.

Why C/C++?

That’s a slightly more difficult one, or at least more a matter of personal preference.  I of course have a ton of cool languages and technologies I want to get to grips with, so with the modern trend being towards higher-level, more expressive idioms, why am I starting to learn a language where you don’t even get a boolean data type out of the box?

Well, to keep it short and sweet, I really like to understand how things work in detail.  I have to admit to being somewhat influenced by Joel Spolsky’s vintage posts on leaky abstractions and the perils of JavaSchools – macho nerd-elitism aside, a vast amount of the software we use today is in some way built on top of C/C++.  Relational databases, operating systems, frameworks and VMs, IDEs – all the things that are really fundamental to what we do as programmers tend to need at least some access to the low-level power and control that direct allocation of bytes provides.

Just using these tools day-to-day, I’m sure I can benefit from understanding their implementation a little better, and even in today’s declarative, cheap-hardware world, I suspect there are times when JIT-compiled or interpreted languages just aren’t going to cut it.

And even if there aren’t, it’s a good intellectual challenge, which is kind of what we’re all after anyway, isn’t it?