Decentralization is a wildly popular area of research lately. Blockchain technology is seen as a revolutionary development, as are tools for building distributed systems. However, building decentralized systems with a low barrier to entry is really, really hard. Plenty of work has been done on service discovery, though none of it yet seems satisfactory enough to become as pervasive for interacting with services as HTTP has.

If you remember your OSI model, the order of the world is:

  1. Physical
  2. Data Link
  3. Network
  4. Transport
  5. Session
  6. Presentation
  7. Application

RPC systems can be seen as living at the session layer, but they have been largely ignored over the last few years, as development has focused predominantly on HTTP-based technology. A highly functional session layer should provide a thorough abstraction of the underlying transport layer.

Take D-Bus as an example. D-Bus connects a number of services in a Linux desktop session together. Each desktop session has a corresponding D-Bus "session" running that provides high-performance RPC and session discovery between applications. Using a service on D-Bus is fairly trivial:

  1. Connect to the bus through an opaque bus handle
  2. Ask the bus to locate your service by name, which provides an opaque service handle on success
  3. Use the service handle to call methods and manipulate data

Interacting with a web service is equally trivial:

  1. Query DNS to find the hostname, returning an address on success
  2. Open an HTTP+SSL connection to the address, presumably on port 443
  3. Submit a request that adheres to a service-specific format found in external documentation

However, the difficulty comes from exposing these services.

To expose a session-wide service on D-Bus:

  1. Connect to the bus through an opaque bus handle
  2. Respond to method calls through your opaque bus handle
  3. OPTIONAL: Ask the bus to give your handle a bus-wide name

Compare this with the steps needed to expose an internet-wide web service:

  1. Register a domain name
  2. Determine a publicly routable IP address for your service and point the domain name at it
  3. Install an SSL certificate for whichever web server you choose
  4. Grab exclusive use of port 443 on the IP address
  5. Tell people about your service's DNS name and what formats to send over HTTP

D-Bus provides a very minimal number of moving parts for the user to manipulate:

  • Picking between the well-known Session Bus, well-known System Bus, and out-of-band Private buses
  • Service names
  • Method names
  • Bus policy

HTTP provides a few more knobs:

  • SSL verification
  • Trusting SSL chains
  • Domain names
  • Firewall configurations
  • Content negotiation
  • Representation formats of requests and resources
  • A suite of authentication mechanisms
  • Port numbers
  • IP addresses

All of those things are baked into the HTTP protocol. If you want to use HTTP, you need to learn something about all of those components.

D-Bus avoids this by living at the session layer. Under the hood it interacts with Unix domain sockets and a centralized bus daemon. To the presentation layer it exposes an external interface composed of well-known bus names and mechanisms to associate well-known service names with endpoints inside applications.

An Internet Friendly Session Layer

For some time, I've been quietly developing what I feel is a solid approach to providing a session layer interface for The Real World. In this Real World, multiple disjoint systems interact over a number of different channels, the networks used are terrible and complicated, and you need to know a little bit of everything to reliably keep an HTTP service available. The software is based on the basic design of D-Bus but with a focus on the internet instead of a local desktop session. The working title of this technology is Graviton, and here's some C code that exposes a simple ping service on the network:

#include <graviton/server/quickserver.h>
#include <glib.h>

// Callback that responds to ping() calls
static GVariant *
cb_ping(GravitonService *control, GHashTable *args, GError **error, gpointer user_data)
{
  g_print ("Responding to ping request\n");
  graviton_service_emit_event (control, "handled-ping", NULL);
  return g_variant_new_string ("pong");
}

int main(int argc, char** argv)
{
  GravitonQuickserver *server = NULL;

  // Construct an endpoint container on the bus
  server = graviton_quickserver_new ();
  // Expose a service on the bus
  graviton_quickserver_add_method (server,
                                   "net:phrobo:graviton:examples:ping",
                                   "ping",
                                   cb_ping,
                                   NULL,
                                   NULL);

  // Print the opaque handles others can use to connect to it
  const gchar *cloud_id = graviton_server_get_cloud_id (server);
  const gchar *node_id = graviton_server_get_node_id (server);

  g_print ("Echo server running at %s:%s\n", cloud_id, node_id);

  // Start the event loop
  graviton_quickserver_run (server);

  return 0;
}

The above code:

  1. Constructs an opaque handle for an endpoint container on the bus
  2. Exposes the net:phrobo:graviton:examples:ping service on the bus, which has the ping() method
  3. Waits to handle method calls

Behind the scenes, the Graviton framework:

  1. Asks a set of publishing backends to publish endpoint connection details through whatever means available.
  2. Handles incoming RPC calls from backends and sends results back

The magic happens in each of the publishing backends. Right now, there are three publishing backends available: DNS-SD, Spitzer, and D-Bus. Each of these backends starts a JSON-RPC HTTP server that listens on 0.0.0.0 and a random port. The D-Bus backend publishes connection details as a D-Bus service with a single 'port' property and a similar bus name. DNS-SD publishes the connection information under a similar service name with a single 'port' TXT record. Spitzer is a tiny Flask server running on gdns.phrobo.net that updates DNS records.

The corresponding client code is similar and even smaller:

#include <graviton/client/cloud.h>
#include <graviton/client/service-interface.h>
#include <unistd.h>

// Callback that is called for each provider of a service as they are
// discovered
static void
cb_services (GravitonCloud *cloud, GravitonServiceEvent event, GravitonServiceInterface *iface, gpointer user_data)
{
  GError *error = NULL;
  GravitonNode *node;
  GMainLoop *loop = (GMainLoop*)user_data;//FIXME: proper cast
  switch (event) {
    // An endpoint was enumerated by a backend
    case GRAVITON_SERVICE_NEW:
      node = graviton_service_interface_get_node (iface);
      g_print ("Calling ping on %s\n", graviton_node_get_id (node, &error));
      graviton_service_interface_call_noref (iface, "ping", &error, NULL);
      break;
    // All the discovery backends have enumerated the currently known
    // endpoints. More might come later.
    case GRAVITON_SERVICE_ALL_FOR_NOW:
      // Stop the main loop so that main() can return normally
      g_main_loop_quit (loop);
      break;
  }
}

int main(int argc, char **argv)
{
  GMainLoop *loop = g_main_loop_new (NULL, 0);
  GravitonCloud *cloud = graviton_cloud_new_default_cloud ();

  graviton_cloud_browse_services (cloud, "net:phrobo:graviton:examples:ping", cb_services, loop);

  g_main_loop_run (loop);

  return 0;
}

Corresponding to each of the available publishing backends is a discovery backend. In this case, there are currently two: D-Bus and DNS-SD. The corresponding Spitzer discovery is a work in progress.

Additionally, there is a mesh network discovery backend in development. Using the mesh network backend, one can easily interact with other members of a Graviton network without having to know how to punch through NAT, HTTPS proxying, or any other hard problems. This backend uses the existing HTTP endpoints to proxy RPC calls.

If multiple transport backends report an ability to connect to a particular service, the framework will choose the one with the best connection, for example a LAN TCP socket instead of a route through a remote server. Should a node become unreachable on one transport, another might be used instead. Atomicity is not guaranteed because of the Byzantine generals problem, though using multipath routing to generate a consensual view of a resource's state is not out of the question.

The Elephant in the cloud, er, room: ZeroMQ

In describing this experiment to my peers, the number one question that comes up is "why not use ZeroMQ?". This is a valid point, though Graviton and ZeroMQ aim to solve slightly different problems. In fact, it is possible to run Graviton on top of ZeroMQ.

Two subjects that ZeroMQ does not address are transport agnosticism and built-in authentication. The ZeroMQ Authentication Protocol (ZAP) exists for building authentication on top of ZeroMQ, but I don't feel that it is the same as including authentication in the system.

When establishing a ZeroMQ socket, one supplies the transport protocol to use through a connection string such as tcp://*:5670. Once that socket is created, it is only possible to use TCP, which brings along its own set of questions:

  • How to discover endpoints across NAT?
  • How to communicate through NAT?
  • Encryption?

Most ZeroMQ developers recommend using ZeroMQ for the LAN with other code that can bridge network segments together using out of band communications. It works, but ZeroMQ wasn't designed to work across the vastness of the Internet. Furthermore, one must wrap ZeroMQ in another translation layer to provide end-to-end encryption.

Authentication within a ZeroMQ system is another component that must be added on. It is a common practice to not include an authentication layer when used within a datacenter's LAN, but again that does not work for the greater Internet.

Graviton hides those knobs from users and expects transport backends to implement a common authentication API and to encrypt data as appropriate. Work is ongoing to support the use of public key encryption as authentication and encryption for the JSON-RPC interface.

The Privacy Things

Authentication and encryption are not currently implemented, but a protocol for both is planned using public key encryption. In Graviton, networks are organized by two identifiers: the node and the network. Each is currently represented by a UUID, though this may change to a raw public key in base58, much like most modern crypto network systems.

Membership within a network can be established by responding to a challenge with a signature based on the shared private network key combined with a single node's private node key. Together, these two signatures can uniquely identify network membership and node identity. A rogue node that has acquired the shared private key can be blacklisted based on its identity key, giving the network operator time to rekey nodes.

To accomplish this challenge-response mechanism, inter-node communications will be encrypted to the destination node's public key and signed with the network's shared private key. Since network keys are shared among a potentially vast number of nodes, it is important that a mechanism for frequent network key rotation is implemented while still being able to maintain a network's original identifier as the session identifier.

Authorization mechanisms can be built on top of this, as every transport backend must support authentication via these two keys and provide them to the upper layers in the OSI stack as attributes of the opaque handles. One can see how separate node/group/unknown identities are analogous to the existing user/group/world permissions in Unix.

In a more abstract sense, holding a copy of the network key grants a node the permission to publish services on a network, while the node key grants other nodes the ability to precisely know who they're dealing with. One can also go a step further and sign another network's key with your own to act as a way to establish trusted routes between networks, further giving the network agency to circumvent other network disruptions.

The Mesh Network Things

There is currently no implementation of a mesh routing protocol in any of the backends. One is planned for the HTTP-based endpoints, using the CONNECT method present in HTTP and passing in the node+network identifiers.

Where to get it

If you think this is a great idea with a future, you should check out some of the sources on GitHub:

These GitHub repositories are mirrors of the self-hosted repositories on git.phrobo.net. If you find this useful, please let me know how you're using it so I can make sure I'm on the right track. Check out the contact page on my site for details.

Future Work

As mentioned above, it is a near-term goal to have Graviton running on top of ZeroMQ for LAN interaction with even less overhead along with public key encryption and authentication for JSON-RPC transports.

Previously, Graviton contained an API for streaming data chunks around but it quickly became overcomplicated and is being shelved to have the first stable release of Graviton tagged and distributed to packagers.

Language bindings for Node.js, Rust, and Python are also in the works by way of GObject Introspection.

Using Namecoin as a publishing and discovery backend is also a possibility, as is implementing a consensus algorithm for the automatic reissuance of network keys should another node be detected as compromised.