Building Cross-Platform Offline-First Apps with Bluetooth Low Energy
17 May 2024
This is the transcript and slides from a presentation I gave at /dev/world in Melbourne on May 9, 2024, under the somewhat less descriptive title "Bluetooth Low Energy On All The Things". If you like to watch things in video format then AUC has the original on their YouTube channel.
Hello everybody. Yes, my name is Tom and today we're going to look at Bluetooth Low Energy and how you can use it for peer-to-peer communication in your apps. Now, Apple has an API for Bluetooth Low Energy and that's called Core Bluetooth but for this presentation I'll be using the acronym "BLE".
This talk will be in four main parts. First we'll talk about the concept of offline-first apps, why they're cool, and why BLE could be a really compelling technology choice for you. After that I'll briefly introduce the main concepts of BLE, things like advertisements and how connections work.
Then with that basic understanding in place I'm going to do a deep dive into six specific areas where things get kind of tricky, things which you won't discover reading Apple's documentation. For each one I'll explain what the issues are and give you some advice about what to do.
Towards the end I'll touch on some non-Apple BLE implementations and how you might want to structure your code to work on multiple systems, drawing on the experience we have providing BLE support in the Ditto SDK.
Let's get started.
When I say the word "Bluetooth" your mind probably jumps to things like AirTags, game controllers, headphones—you know, usually some sort of gadget. Today I'm talking about a very different situation.
We develop apps. These apps run on smartphones, maybe laptops, maybe desktops. Now I don't know each of you personally but you probably have an architecture like this. You've got a client that logs on to some sort of internet service, and then they exchange data back and forth with the service. This is what basically every popular mobile app does today.
The trouble is, the server is not always online.
There's a lot of buzz now about so-called "offline-first" apps. What this means is that even if the user is offline they can continue to work and use the app, as much as this makes sense, and then later when internet connectivity is restored their work and their activity will sync back up to the server.
This is cool, but we can do better than that.
What if our clients could continue to collaborate and share data while they can't reach the server?
What if just one of them had an internet connection and acted as a gateway to propagate updates in both directions?
What if users never even noticed that something happened?
Now that's offline-first.
Resilience against outages is one common reason why you might want to introduce peer-to-peer communication to your app. There are two other common ones. Even if the connection to the server is reliable, user-to-user latency might be lower if the devices can talk to each other directly. Also, for some apps the user experience can benefit if the app is aware of who or what is physically nearby. So if you want to take your great app to the next level then these are all pretty good ways to do it[1].
If we want to make this happen, though, clearly we need some sort of local communication system. These devices need to be able to discover each other. They need to prove to each other that they're trustworthy. They need to be able to transfer arbitrary app data. They have to handle multiple connections. And they have to do all this without getting in the user's way.
I'm here today to tell you that Bluetooth Low Energy is a gift from the technology industry to you. Yes, you, personally. It's a well-established tech that you can deploy today to do a lot of good things.
Bluetooth gives you a lot of range. Modern iPhones can do over 100 metres in the open which is 3–4 times the distance of AWDL, which is the peer-to-peer WiFi technology that powers AirDrop. You can create large meshes through having multiple connections, easily meshes of 50+ devices in total. You can get tens of KB/s of real world throughput.
The best thing is when you launch the app for the first time you get one prompt: do you want to allow Bluetooth? You say yes; you're now free to scan for and connect to as many other devices as you want. There's no ongoing interaction required there.
If you turn on the background modes your your app is able to persist in the background and wake up to discover peers and connect to them and then exchange data, even while the phone's locked in your pocket. And it's an industry standard, as I alluded to before. There are equivalent APIs available on Windows, Android and Linux and they're all compatible with each other so you can build your app in a way that all of these things can talk to each other.
So it's a really good deal but there are two downsides. One is it's kind of slow compared with, you know, the internet. The other is that actually using it is kind of quirky and that's the point of this presentation. I want you to leave this session with a basic understanding of how BLE works, how it fits into a peer-to-peer application, and equip you with various practical tips that you could use to maximize your success.
Let's briefly describe how BLE works at a high level. BLE devices operate in one of two roles. These are called "central" and "peripheral". Very loosely speaking, a central is a client and a peripheral is a server. The peripheral continuously broadcasts advertisements on its radio. For an iPhone this is in the order of 20 advertisements per second. The central scans for these and then if the central identifies a peripheral that it thinks is interesting, it can try to initiate a connection. Conceptually peripherals ignore each other and centrals ignore each other. All the action happens between a peripheral and a central.
In traditional BLE applications the gadgets are the peripherals that advertise, and the app on your phone is the central that scans for them. Now, our scenario is a bit different. Our devices are equal peers. If you have two iPhones side-by-side, which one's going to be the central? Which one's going to be the peripheral?
The answer is "both". On all of the platforms that I've mentioned so far, devices are able to exist in both roles simultaneously. What's more, one central is able to handle multiple concurrent connections to peripherals, and one peripheral is able to handle multiple incoming connections from centrals. This means it's possible for your devices to form a relatively dense mesh with redundant paths.
Next let's look at what happens when you actually establish a connection. BLE is heavily tied up in a protocol that's called GATT. The peripheral has a database of fields called "characteristics" that contain some short values. These characteristics are organised into groups called "services" and each of these services and characteristics are identified by some UUID.
The thing that makes this particularly tricky is that the characteristics may be very small, maybe as low as 20 bytes depending on the device support that you have. That's not much data. What do you do if you want to transfer a message to another device that is longer than 20 bytes? Well, basically you have to chunk it up into little pieces and write to the same characteristic over and over and reassemble it at the other end. This is horrible, right?
So GATT is great if you're connecting to a thermometer and trying to read the current temperature. That's what it's there for. But if we have two iPhones talking to each other it would be much more convenient for us if we had something resembling a TCP socket.
I have good news: that exists.
This diagram is a massive simplification but basically GATT sits on top of a protocol called L2CAP which provides reliable data streams over Bluetooth, very much looking like TCP. As a central you can ask Core Bluetooth to open one of these L2CAP channels and then you're able to read and write blobs of bytes directly into that channel and bypass GATT entirely. So if you can get away with it my suggestion is to ignore GATT entirely. Just use L2CAP channels. They're wonderful.
So why wouldn't you be able to get away with it? Well, traditionally Apple and Google like to drag their feet when it comes to exposing the wonderful features of BLE to developers. GATT's been there from the very beginning but when it came to L2CAP, those APIs were only added in iOS 11 and Android 10 so if you use them it's only going to be supported on relatively new devices.
Now, this is an Apple conference and you're probably all thinking, "Tom, iOS 11 was a very long time ago." You're right, actually, and in truth if you're targeting only iOS and Mac for your app then probably you're good to go. You probably don't need to worry about GATT at all. However, at Ditto we still support back to Android 6 so to support those devices on Bluetooth we maintain compatibility with GATT communication even though it's slower and more complicated.
So let's recap. We have some phones. They're both advertising as a peripheral and scanning as a central, making connections to each other and as a result they form a nice interconnected mesh.
You now understand the basics of BLE. Well done.
I mentioned earlier that I chose six quirky areas that I was going to dive into. I've labelled those here. Some of them are going to be about advertising: what you put in your advertisements, how you track devices based on their advertisements. I'm also going to dive into the connections and how you secure them, how we set them up, and a little bit about how you might choose which peers to connect to based on the advertisements you hear.
So, quirk number one. First let's discuss the contents of our BLE advertisements. Core Bluetooth only gives you a little bit of control over your advertisement content so you need to use it well. The thing you need to realise is that in general it's fast and cheap to receive and process advertisements but it's slow and expensive to initiate a connection to another device, relatively speaking. Therefore you want to be pretty sure that a certain peer is actually interesting and someone you want to talk to before you go to the trouble of establishing a connection to them.
There are two pieces of data you can use to make this easier or get the result you want. The most important thing is a service UUID. For our purposes you can think of this as an application identifier. This depends a little bit on your app: if all of your app users could potentially be connecting to each other because they're all interested in each other, then that means that they should all advertise and scan for the same service UUID. If, however, your app is the kind of app where you have different customers with isolated groups of devices and those isolated groups never have any business talking to each other then you would prefer that they each use a different UUID.
So how do you pick a UUID? Well, you just generate a random one actually. If you look into it you'll find that there's this whole scheme where paid-up members of the Bluetooth Special Interest Group can, you know, allocate short UUIDs to themselves for special nefarious purposes[2] but you don't need to worry about that. That's not an issue.
The other type of advertisement content that Core Bluetooth will let you configure is the field called the "local name" which lets you provide some sort of string unique to your device. This typically contains your device name—if you use a Bluetooth scanner app this will show you the name of your phone. Core Bluetooth doesn't give you direct control over the advertisement content but in my experience a safe maximum length for this is about 23 bytes[3].
Here's where things get weird, though: if you port what I've just described here directly to Android, it also has an API to include the device name in the advertisement alongside the service UUID you specified[4]. But if you try to change the local device name on Android, it does it. It has system-wide effects.
This is a real photo from when I was working on this at Ditto. After configuring Bluetooth advertisements in an unrelated app my phone suddenly had a different name in Spotify. This is actually how I became aware of the problem. So that's not real good. So what's the solution?
There's another type of field that can exist in a Bluetooth LE advertisement and that's called "manufacturer data". On iOS you're able to read the manufacturer data from an advertisement that you have received from the radio but you're unable to set it or configure it in your own advertisements that you transmit. Android's a bit more flexible: you can do both.
So here's something you can do. iOS will write its unique device identifier into the local name field since that's what Core Bluetooth will let you do. Android and probably other platforms will use the manufacturer data instead. Then on all platforms when you're receiving advertisements you simply check both fields to see whether it contains the identifier you're looking for. So now you have a strategy that will work on all platforms.
For the next quirky area let's talk about a little bit about identifying and tracking devices. When you're using Core Bluetooth to scan for advertisements or you're receiving incoming connections or something like that, it will give you an identifier—on Core Bluetooth it's a UUID—so that you can tell whether two events that occurred at different times relate to the same remote device or not.
The trouble is this identifier is derived from the MAC address that this device is transmitting and for privacy reasons phones will use random MAC addresses and they will also rotate them at runtime so you're not really sure whether it's the same device or not. In fact, in one Android phone I tested it was super aggressive. It would actually change MAC address every single time it established a new BLE connection.
For most applications where you're trying to reach out to peers and form a mesh you really want to know two things: which distinct remote peers exist, and if someone does decide to rotate their MAC address you'd want to be able to tell whether it's the same person who was there before, or whether this is a newcomer who needs to be integrated into the mesh. This is kind of basic information you need to create a good algorithm.
The simplest solution for this is to incorporate some unique value into your local name or manufacturer data. It's best if this is something random and chosen by your app rather than a human name or else you won't be able to tell the difference between two devices called "iPhone". This problem has gotten worse now that Apple doesn't let you access the real device name any more. So that that's really helpful for tracking purposes.
While you're doing all this, keep in mind that everything we we broadcast and everything we receive from Bluetooth advertisements is completely untrustworthy. It's completely unauthenticated. It's all public, so don't put private information in there. If some sort of nasty imposter or attacker comes along they could, you know, duplicate someone else's advertisements. They could have a device that pretends to be 1,000 different MAC addresses all claiming that they are running your app, something like that. So you have to treat this as an adversarial environment and most importantly you need to make sure you don't transmit any sensitive or private data until you have fully established a secure connection with another device.
…which brings us to our next little topic: authentication and encryption. So Bluetooth has its own security mechanisms. You might be familiar with them. They usually involve pairing—we don't want pairing. We want our devices to be able to just go and find someone and connect to them and establish a secure link without having to, you know, type in numbers or something rubbish like that, that users don't want to do.
The solution then is to configure BLE to operate in an insecure mode. We use insecure L2CAP channels which are unauthenticated. We use insecure characteristics if we're using GATT. Then we provide our own security layer on top at the application level.
Now, I said before that an L2CAP Channel looks a lot like a TCP connection. This really works in our favour because we're talking about phones, not tiny microcontrollers. We can use industry-standard technology to provide authentication and encryption just like we would for a TCP connection. I would recommend you look into one of two options.
Mutual TLS is a variant of TLS. That's the regular encryption that you use to securely visit websites in your browser. The difference here is that both sides of the connection have to present a valid certificate and they mutually authenticate each other. The idea here is that your backend would issue a valid certificate to every device that's running your app and they all trust a common CA. If you want to you can include information in that certificate about who the user is, what they should have access to, stuff like that, but keep in mind that the certificate contents are effectively public.
Mutual TLS is a great choice if you have a central registry of accounts. When users log in you can dish them out a certificate and then they have it ready to go for all peer-to-peer access.
The other protocol you might find interesting is Noise Protocol. This is much simpler than TLS and it's famously used as the cryptography backend for WireGuard, the VPN. Noise itself doesn't have any built-in notion of certificates, only static keys. If your app doesn't have a central source of accounts—if you just always form trust relationships on an ad-hoc basis like scanning each other's QR codes or something like that—then Noise might be a more natural fit for what you're trying to do than TLS.
In short, the bad news is that BLE is an untrusted network. The good news is that our phones are so powerful now that we can just throw state-of-the-art industry-standard solutions at it and push nice secure tunnels right through the middle of that untrusted environment.
All right, next. Number four. Let's talk about what you need to do to open an L2CAP channel. If you're acting as a peripheral, which you will remember is kind of like a server, you can open a listener that will receive incoming channel open requests in much the same way as TCP. Also like TCP, the client needs to specify a number which is kind of like a port number but in BLE that port number is called a "PSM". Again it's a 16-bit integer. The trouble is that when you start an L2CAP listener on iOS or Android the PSM is always dynamically allocated so you have to transmit it to the central in some way before they connect. Well, I say it's dynamically allocated. For some reason it's always 192 but we don't know that so we have to take it and communicate it to the central[5].
There are two suggestions I have here about how you actually provide that to the central. I said before that at Ditto we provide backwards compatibility with GATT, so we provide an upgrade path to L2CAP, and this is a key part of how that works. If you're a peripheral that is capable of L2CAP then the service contains an extra characteristic that contains the PSM value. So when a central connects it tries to read that PSM value from that characteristic. If it gets one then it connects and forgets about GATT and now it's operating on L2CAP, but if that characteristic is not present then it assumes this peripheral doesn't know how to do it. It doesn't even look for it if the central itself is not capable, and then it stays in GATT.
The alternative, if you're lucky enough to only have to deal with modern devices, is that you can potentially put the PSM directly in your advertisement. Then you don't even have to go looking for a characteristic and stuff around with that.
Cool. Next let's have a chat about who you decide to connect to if you've got various devices that you're discovering. The first thing to know is that as a peripheral you have no control over anything. Centrals can connect to you; they can disconnect from you. You just have to put up with it. So if you're trying to achieve a particular result, if you're trying to achieve a particular mesh configuration, you have to come up with a policy and rules that apply to the centrals and who they decide to connect to.
This is a huge topic but for today I want to draw your attention to two basic things.
If you walk your phone towards another person at a distance you'll probably start hearing sporadic advertisements from them long before you're actually close enough to establish a reliable connection. This isn't very good use of resources, trying to connect to somebody who's unreachable, so a better approach is to listen to their advertisements over time and if the rate at which you're receiving advertisements from the same peer exceeds a certain threshold then say, "ah yes, this peer is probably relatively solid; I should actually take the time to connect."
The trouble is that by default many Bluetooth APIs, including Core Bluetooth, will not tell you about every single advertisement that a given peer does. They'll just say, "oh, I found a peer; here it is" and just sort of wait for you to do something. That's not helpful to us, so you actually have to opt into an "allow duplicates" setting, and then it will actually tell you about them all, acknowledging that this will use a little bit more battery power because your delegate's being invoked for every single one of these things.
In Ditto we do all this in Objective C—that's why my examples are in Objective C. I hope everyone loves Objective C, right[6]?
Right. Then the question is "which peer should I actually connect to?" and "how do I form a good quality mesh?" which is again a large topic so let's focus on one subset of that problem.
If you have two iPhones, they're both a central, they're both a peripheral, it's pretty easy for them to end up in a situation where they both connect to each other, and that's a waste. You've just used one of your connection slots for something that doesn't give you any benefit. You were already talking to each other.
One simple way you can do that is by including some information in the local name field. I already talked about putting a random identifier in there. You can use some some sort of tiebreaker, like say, "Okay whoever has the lowest random number is going to be the central in this relationship and whoever has the highest number is going to be the peripheral." Ditto uses a more complex stateful algorithm than that but you can sort of see how by creating these kind of policies and rules you can cause the mesh to emerge in the shape that you want, by everybody following good local rules.
Finally, in terms of quirks, let's talk about background modes. This is really exciting because you can share data with other people when you aren't even actively using the app. The first thing you need to know is that you need to turn on the background modes. This is in the capability settings of Xcode. There are no extra permissions required—I guess it just comes up in App Store review whether you actually need this or not.
The main problem is that there are no guarantees whatsoever. Like everything with background modes, iOS will do exactly what it wants and your app just has to put up with it, so don't expect this to work on demand basically.
There are also some restrictions that are put on the the Bluetooth communications while your app is in the background. There's some power saving measures. If you're advertising you don't get 20 per second anymore. It's much, much slower. And if you're scanning it doesn't operate at 100% duty cycle. It only scans for little snippets of time. The upshot of this is that it doesn't use nearly as much power but it might take much longer for devices to discover each other. If this is kind of worse if one of them's backgrounded or locked, the problem is much worse if you've got two devices which are backgrounded or locked because it just takes much longer for those random periods to overlap where one of them will actually detect the other.
There are a couple more curve balls. Your advertisements that you're transmitting as a peripheral no longer contain the local name identifier. You only have the service UUID. So if you want to connect to and exchange data with a device that's in this state you kind of just have to see them and be like, "you don't have an identifier but I'm going to connect to you blindly and see what happens," because you can still go through your TLS negotiation and find out who they really were.
The other problem is that if you're the central and you're scanning, even if you did that code snippet I pointed to before to request every single advertisement, iOS says, "I'm not doing that; you're in the background; you get one." That will probably mess up the algorithm you have to measure the advertisements over time so that means you now have to track whether you're in the background and do something slightly different.
As you can see it's a bit of a hassle because of these extra restrictions but the user experience can be really cool and magical if, you know, you don't even have the internet and devices are just sending data to each other. And because of the way iOS works it wakes up your whole app so you can have this funny thing where your app wakes up to do Bluetooth and then it starts doing peer-to-peer WiFi at the same time because it happens to be running, and iOS for whatever reason doesn't have really fine-grained limits over what your app is able to do once it has been woken up in the background, so that might work in your favour too.
So those are all the nitty-gritty tips that I wanted to discuss today. Before I wrap up I want to touch on how we've handled the different cross-platform implementations of BLE in the Ditto SDK.
We need to support a lot of SDK targets. We have Android, Windows, Mac, we've got WASM, stuff like that, and to make that manageable we push as much logic as we possibly can into the common core code which is written in Rust, and then we compile that for each platform that we use.
For Bluetooth we use platform specific drivers, so for Apple that's written in Objective C; for Android that's in Kotlin; on Windows we use the windows-rs crate to access the windows API; and on Linux we go through DBus to talk to bluez, which is the most common way of controlling the Bluetooth adapter on a Linux system. This way we get access to BLE functions on all platforms.
While all of these are quite different in how you actually use them, at the end of the day they're all doing kind of the same thing. You can create an abstract generic interface across all of them. If you think about a central, it has to do certain things: it has to start scanning for a certain UUID; it has to stop scanning; it has to try to initiate a connection attempt; things like that.
So what we do is we have a layer here where each of these four things implements a common interface and then at build time depending on which OS we're targeting for this particular build, one of those solutions is paired up with the Rust core. Then the Rust core doesn't really care which one of these it's talking to. There are functions in one direction and events in the other. If we have logic about which is the best peer to connect to and stuff like that, we write it once in Rust, and then these BLE implementations are as narrow as possible—the actual connect logic and the advertising logic and things like that.
If you're trying to do BLE across multiple platforms, even if it's just two, I'd strongly recommend you use some sort of approach to consolidate code in a way that's possible to share across both, otherwise you'll end up with quite a lot of duplication.
All right, very good. You now have pretty much everything you need to start writing peer-to-peer apps in BLE. This is a photo of my colleague, Ben. He's setting up 100 phones for integration testing. I hope you have as much fun as Ben working on your peer-to-peer apps! Thank you.
- In hindsight I would have liked to give more concrete examples where this is really useful: flight attendants, field workers, hospitality, things like that. ↩︎
- The real reason is that if you have exact control over the advertisements you're transmitting, these shorter IDs allow you to use more of the limited space for other purposes. Apple gives us so little control it's not worth worrying about. ↩︎
- Technically the field can be longer than this but last time I investigated this, iOS wanted to insert another field alongside the local name which reduced the budget. ↩︎
- I avoided mentioning the difference between advertisements and scan responses, partly because this is an introductory talk, partly because Apple doesn't give you any control over this anyway. On other platforms like Android you will most likely want to configure a regular advertisement with the Service UUID and a scan response with the manufacturer data. ↩︎
- Obviously it's not always 192 if you have multiple listeners but I wanted to make the joke because it's mostly true. ↩︎
- Objective C vs Swift is one of those topics that people at Apple-themed conferences like to have mock flamewars about. It was mentioned in talks prior to this one. ↩︎