May 7, 2016

Tunneling HTTP Connections via Socket.io Streams

So you’ve got a service internal to your network (or maybe even listening on localhost!) and you want to expose it on the public net, huh?

Oh sure, you could open a port on your router. You could use something like localtunnel or Pagekite. But what fun is that? Instead, I’m going to show you how to roll your own solution using just a few npm modules. That being said, I did take quite a bit of inspiration from the localtunnel project, so major kudos to the devs.


Motivation

So why do this? Recently I was working on a project (wikipi.io) that consists of booting Raspberry Pis up from behind NAT routers and having those Pis get a public URL automatically. The Pis run a copy of Dokuwiki and Nginx and I wanted to have each instance of Dokuwiki exposed on a public subdomain (i.e. https://mywiki.wikipi.io).

After a few minutes of googling, I found the localtunnel project. It seemed to be exactly what I needed, and it was written in the language (NodeJS) I was already using. Unfortunately, there were a few implementation aspects that didn’t quite sit well with me.

How localtunnel works

In a nutshell, localtunnel consists of two components: a server and a client. The server uses express to listen for incoming requests. These requests can come either from a localtunnel client that wishes to expose its local port to the public net or from an end client wishing to connect to an already established service at a subdomain.

When the localtunnel client component starts, it makes a request to https://thelocaltunnelserver/desiredsubdomain. The express server checks to see if a localtunnel client has already claimed that subdomain. If not, the localtunnel server fires up a new TCP server on a randomly generated port greater than 1023 (non-privileged).

The express server then returns this randomly generated port to the localtunnel client and gives the client 5 seconds to connect. If the localtunnel client does not establish a connection to the TCP port within 5 seconds, the server is closed and the localtunnel client will have to reconnect to try again.
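
To make that handshake concrete, here’s a rough sketch of the idea (this is not localtunnel’s actual code; error handling and connection bookkeeping are omitted):

var net = require('net');
var express = require('express');
var app = express();

app.get('/:subdomain', function (req, res) {
  var connected = false;

  // spin up a throwaway TCP server for this client
  var tcpServer = net.createServer(function (socket) {
    connected = true;
    // the localtunnel client has dialed back in; this socket is now
    // held open and used to proxy end client requests
  });

  // listening on port 0 asks the OS for a random non-privileged port
  tcpServer.listen(0, function () {
    res.json({ port: tcpServer.address().port });

    // give the client 5 seconds to connect before tearing the server down
    setTimeout(function () {
      if (!connected) tcpServer.close();
    }, 5000);
  });
});

app.listen(8080);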

If the localtunnel client is able to connect to the localtunnel server’s randomly generated TCP port, by default it opens 10 TCP sockets to the server. These connections are held open, even if no data is being transferred. The localtunnel client then waits for requests to come in over any of these 10 TCP sockets. When a request comes in, it is piped to a nodejs HTTP client that connects to localhost for the desired service.

In order to expose the localtunnel client’s local service to the web, the localtunnel server waits for requests to come in on the subdomain chosen by the localtunnel client. If it matches the subdomain of a currently connected client, the localtunnel server proxies the request to one (or more) of the 10 TCP sockets being held open by the localtunnel client.

How we can improve from here

Excellent, so that’s localtunnel in a nutshell. It was written during a hackathon and wasn’t designed for production use. No worries. But what can we improve or even simplify?

Connection Snatching

The initial phase of the connection seems solid. A localtunnel client connects and provides the subdomain it desires (typically over HTTPS). But then the localtunnel server creates a new TCP server on a random port. What if an attacker were port scanning the server, or simply attempting to connect to every port all the time? The attacker could snatch a connection to the newly opened TCP server and start serving whatever they wanted on the subdomain the localtunnel client was trying to use. Yes, there’s a limited window of 5 seconds. But this is still an attack vector.

TCP in the clear

Another concern is the non-TLS TCP socket that the localtunnel server is opening. Since the initial handshake happens over HTTPS, we’d want everything between the localtunnel client and server to be encrypted as well.

So many connections!

Another inefficiency of localtunnel is that every connected localtunnel client holds open 10 TCP connections on the server, even when no data is being transferred. Behind certain firewalls and NATs, I was also seeing these connections being rejected or even just dropped. Of course, if we were using TLS, most firewalls would be more hesitant to block the traffic as it couldn’t be inspected. We are also rather fixed in the number of requests that can be handled simultaneously. What if we need to serve more than 10 requests at a time?

Solutions, give them to me!

So how can we solve these issues? And what if we could also simplify the code in the process? Enter socket.io + the socket.io-stream module.

For the remainder of this post, let’s call this new implementation ‘socketTunnel’.

The Connection

So we’ve got an express server on the backend. That’s good, we’ll keep that. But now the express server will only handle requests from end clients (a user’s browser for instance) instead of incoming connections from socketTunnel clients.

On the backend, we’ll now use socket.io to listen on a different port than the express server. We’ll also install the socket.io-stream npm module, which allows us to create binary streams between the socket.io server and socket.io clients. You’ll see why this is awesome soon enough.
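
As a rough sketch, the server bootstrap might look something like this (the port numbers and variable names here are my own placeholders, not the final implementation):

var express = require('express');
// socket.io-stream gives us binary streams over socket.io
var ss = require('socket.io-stream');

// express handles requests from end clients (browsers)
var app = express();
app.listen(3000);

// socket.io listens on a separate port for socketTunnel clients
var io = require('socket.io')(3001);

// associative array: subdomain -> connected socket.io client
var socketsBySubdomain = {};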

So let’s bind to the socket.io event ‘proxySubdomain’ on the backend. When a socketTunnel client wants to make a localhost service accessible from the public net, it will emit this ‘proxySubdomain’ event to the socketTunnel server. If you need to perform authentication of a socketTunnel client, this would be the place to do it. Provided everything checks out (the subdomain isn’t already taken and any optional auth passes), we’ll add the socket.io client object to an associative array, keyed by the requested subdomain.
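
Building on the setup sketch above, a minimal ‘proxySubdomain’ handler could look roughly like this (the acknowledgement callback shape is my own assumption, not necessarily how the final code does it):

io.on('connection', function (socket) {
  socket.on('proxySubdomain', function (requestedSubdomain, callback) {
    // optional: authenticate the socketTunnel client here

    if (socketsBySubdomain[requestedSubdomain]) {
      return callback('subdomain already in use');
    }

    // key the connected client by the subdomain it claimed
    socketsBySubdomain[requestedSubdomain] = socket;
    callback(null);

    // free the subdomain when the client goes away
    socket.on('disconnect', function () {
      delete socketsBySubdomain[requestedSubdomain];
    });
  });
});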

TLS, my friend

So we’ve got 2 different ports exposed from our socketTunnel server. One for express (end clients) and one for socket.io (socketTunnel clients). Using nginx as a reverse proxy in front of these two ports, we can expose both of them via the same port (443) with TLS enabled. If a request comes in over a subdomain, it’s routed to the express server. If a request comes in via the primary domain, it’s routed to the socket.io port. Bam! We’ve got full encryption everywhere.
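
A minimal nginx sketch of that routing might look like this (server names, certificate paths, and upstream ports are placeholders matching the setup sketch above):

# end clients: any subdomain -> express
server {
    listen 443 ssl;
    server_name *.sockettunnelserver.com;
    # ssl_certificate / ssl_certificate_key omitted for brevity

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
    }
}

# socketTunnel clients: primary domain -> socket.io
server {
    listen 443 ssl;
    server_name sockettunnelserver.com;

    location / {
        proxy_pass http://127.0.0.1:3001;
        # required for websocket upgrades
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}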

Bouncing requests

Excellent. Good. Outstanding. So how do we actually route incoming end client requests to the client sitting behind the NAT router? Since we’re using socket.io now, we get keep-alive for free. We don’t need to open a TCP port for every socketTunnel client that wants to make itself available. AND we can also authenticate a socketTunnel client via the socket.io ‘proxySubdomain’ event.

Here’s where it gets cool. The socketTunnel client also has the socket.io-stream module installed. The socketTunnel client binds to the event ‘incomingClient’. When the socketTunnel server receives a request for a subdomain, it looks up the matching socket.io client in the associative array. If it exists, it emits the ‘incomingClient’ event to the socketTunnel client with a randomly generated GUID. The server then binds a socket.io ‘once’ event, using the GUID as the socket.io event name. We will expect the socketTunnel client to return a stream using this event name.

var ss = require('socket.io-stream');
var socketTunnelClient = socketsBySubdomain[requestedSubdomain];
var requestGUID = GEN_GUID();

socketTunnelClient.emit('incomingClient', requestGUID);

ss(socketTunnelClient).once(requestGUID, function (stream) {
  // uses https://github.com/substack/bouncy to bounce the end client
  bounce(stream);
});

The socketTunnel client receives the ‘incomingClient’ event with the GUID, opens a nodejs TCP connection (via the net module) to the webserver listening on localhost, and then creates a new socket.io-stream piped bidirectionally with that connection. Using socket.io-stream, this stream is then emitted back to the socketTunnel server via the one-time GUID event. As seen in the code above, this stream is then piped bidirectionally with the end client’s request via bouncy.

var net = require('net');
var ss = require('socket.io-stream');

socket.on('incomingClient', function (requestGUID) {
  var client = net.connect(80, 'localhost', function () {
    var s = ss.createStream();
    s.pipe(client).pipe(s);

    s.on('end', function () {
      client.destroy();
    });

    ss(socket).emit(requestGUID, s);
  });
});

What Just Happened

Whew! So what did we just do here? Let’s recap.

  1. We accepted a request from an end client over HTTPS (https://example.sockettunnelserver.com)
  2. Inspected the subdomain to identify a connected socketTunnel client (we made the assumption here that it exists and is connected)
  3. Sent a one time request to the socketTunnel client via socket.io (running over TLS) to ask for a stream to be sent back to the server
  4. From the client: we created a nodejs http client aimed at localhost port 80 (the service we are exposing to the public net)
  5. Created a socket.io-stream bidirectionally piped with the localhost net client
  6. Sent the socket.io-stream back to the socketTunnel server via the expected GUID event name
  7. Bound the end client’s request with the stream that was just returned from the socketTunnel client
  8. Allowed data to flow freely between the user’s browser and the stream aimed at localhost:80 on the client
  9. Terminated the HTTP client after the stream ended
  10. Repeated the above steps for as many HTTPS connections as the end client’s browser requested

Outcome

Although there’s a lot going on here, the resulting code is quite simple. A few other benefits of the method described above:

  1. All links are encrypted with TLS
  2. No holding open unused TCP connections (just a single socket.io connection per socketTunnel client required)
  3. An attacker is unable to snatch newly opened ports
  4. Fewer open ports exposed from the server (ideally just 443, but you’ll probably want 80 open to force a redirect to 443).
  5. We can serve more than 10 TCP connections simultaneously
  6. Authentication can be implemented to restrict use of subdomains and access to the socketTunnel server

And with that, this post has become long enough. Thanks for reading. I’m currently wrapping up my implementation of the server/client components described above and will be looking into open sourcing the code in the future. Until then, I’d love to hear your feedback and suggestions in the comments.

UPDATE: Reference implementation now available at https://github.com/ericbarch/socket-tunnel. Dig in!

-Eric
