So Streamplace contains an embedded ATProto Personal Data Server. I get more questions about this than anything else, so let's go into why we did that and how it's implemented. I'll also talk about how the general pattern of a "Static PDS" shows promise, even if I wouldn't recommend most people implement things exactly how Streamplace did it.
Why?
> "Why not just use a PDS?" —@bnewbold.net
> "i thought it was the most insane idea when i first heard about it lmao" —@natalie.sh
The immediate impetus for the PDS was our need to host the `place.stream` lexicon definitions somewhere. I'd done a one-time export of our lexicons from their home in the GitHub repository to the @stream.place repository but they were out-of-date; we really needed them to be auto-updating.
The easy way to do that would have been some sort of CI script that pushed the lexicons automatically. Whenever we release a new version, go through the lexicons and make sure the `com.atproto.lexicon.schema` records in the PDS are up-to-date. If you're working on an atproto app that's not Streamplace, this is probably what you should do. It's easier to maintain and has fewer corner cases to worry about. But we already wanted Streamplace to be a PDS for other reasons.
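As a rough idea of what such a CI step might compute, here is a minimal sketch in Go. It assumes the lexicons on disk and the records already in the repo have both been loaded into maps keyed by NSID (which doubles as the record key); `planSync` and the `com.example.*` NSIDs are hypothetical, and comparing JSON bodies as raw strings is a simplification.

```go
package main

import (
	"fmt"
	"sort"
)

// lexiconCollection is the collection that holds published lexicon schemas.
const lexiconCollection = "com.atproto.lexicon.schema"

// planSync compares lexicons on disk (local) with records already in the
// repo (remote). Both maps are keyed by NSID and hold the JSON body.
// It returns the NSIDs to create or update, and the NSIDs to delete.
func planSync(local, remote map[string]string) (put, del []string) {
	for nsid, body := range local {
		if old, ok := remote[nsid]; !ok || old != body {
			put = append(put, nsid)
		}
	}
	for nsid := range remote {
		if _, ok := local[nsid]; !ok {
			del = append(del, nsid)
		}
	}
	// Sort so the plan is stable from run to run.
	sort.Strings(put)
	sort.Strings(del)
	return put, del
}

func main() {
	// Hypothetical lexicons, for illustration only.
	local := map[string]string{
		"com.example.alpha": `{"lexicon":1,"id":"com.example.alpha"}`,
		"com.example.beta":  `{"lexicon":1,"id":"com.example.beta","description":"new"}`,
	}
	remote := map[string]string{
		"com.example.alpha": `{"lexicon":1,"id":"com.example.alpha"}`,
		"com.example.beta":  `{"lexicon":1,"id":"com.example.beta"}`,
		"com.example.gamma": `{"lexicon":1,"id":"com.example.gamma"}`,
	}
	put, del := planSync(local, remote)
	for _, nsid := range put {
		fmt.Printf("put    %s/%s\n", lexiconCollection, nsid)
	}
	for _, nsid := range del {
		fmt.Printf("delete %s/%s\n", lexiconCollection, nsid)
	}
}
```

A real script would then turn each entry in the plan into an authenticated write against the PDS; that part is left out here.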
Streamplace has been designed from the ground up to be a fully-functional, "batteries included" replacement for Twitch and YouTube. Having the ability to be the canonical store of a user's content is consistent with this philosophy. This doesn't mean we wouldn't want to support external PDSes; that's important too! But it's important for the project that we be a one-stop shop for decentralized video publishing, and that includes the PDS. After you finish setting up your Streamplace node, you shouldn't need an account somewhere else to get started.
There are also some practical concerns. We don't presently publish all of the one-second segments comprising a user's livestream to their PDS, even though that would be the most "protocol-native" way to operate. One reason for this is that users don't necessarily expect their livestreams to be immediately archived and publicly available; we need to build user configuration for this kind of thing before we start publishing VODs. But also, sending one segment per second would run into issues with Bluesky rate limiting; the Bluesky "mushroom" PDS fleet isn't designed to handle that kind of traffic. A Streamplace-embedded PDS allows us to experiment with livestream archival without knocking over any Bluesky servers.
So we want an embedded PDS. Let's implement one!
How?
The first thing you need to self-host an ATProto repository is a decentralized identity: a `did:plc` or a `did:web`. A `did:web` is a really great fit for our use case; we're establishing the canonical home for the `place.stream` lexicons, so hosting them at `did:web:stream.place` makes intuitive sense. In fact, we were already hosting a DID doc at https://stream.place/.well-known/did.json to back the Streamplace Bluesky feeds that show you posts of all live users. So our first step was to teach the node to generate a signing key and list it as a `verificationMethod` in the DID document.
[ PDS document snipped because it broke mobile rendering 😭 ]
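For a sense of the general shape of a `did:web` document that advertises a signing key and a PDS endpoint, here is a minimal Go sketch. It is not the actual stream.place document: it assumes a P-256 key (Streamplace may well use a different curve), and `buildDIDDocument`, `newSigningKey`, and the small base58 helper are hypothetical names written for this example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"encoding/json"
	"fmt"
	"math/big"
)

const b58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// base58btc encodes bytes using the Bitcoin alphabet (no multibase prefix).
func base58btc(in []byte) string {
	n := new(big.Int).SetBytes(in)
	radix, mod := big.NewInt(58), new(big.Int)
	var out []byte
	for n.Sign() > 0 {
		n.DivMod(n, radix, mod)
		out = append(out, b58Alphabet[mod.Int64()])
	}
	for _, b := range in { // each leading zero byte becomes a '1'
		if b != 0 {
			break
		}
		out = append(out, '1')
	}
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

// multikeyP256 returns 'z' + base58btc(p256-pub multicodec prefix || compressed point).
func multikeyP256(pub *ecdsa.PublicKey) string {
	compressed := elliptic.MarshalCompressed(elliptic.P256(), pub.X, pub.Y)
	return "z" + base58btc(append([]byte{0x80, 0x24}, compressed...))
}

type VerificationMethod struct {
	ID                 string `json:"id"`
	Type               string `json:"type"`
	Controller         string `json:"controller"`
	PublicKeyMultibase string `json:"publicKeyMultibase"`
}

type Service struct {
	ID              string `json:"id"`
	Type            string `json:"type"`
	ServiceEndpoint string `json:"serviceEndpoint"`
}

type DIDDocument struct {
	Context            []string             `json:"@context"`
	ID                 string               `json:"id"`
	VerificationMethod []VerificationMethod `json:"verificationMethod"`
	Service            []Service            `json:"service"`
}

// newSigningKey generates a fresh P-256 key (an assumption for this sketch).
func newSigningKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

// buildDIDDocument assembles a did:web document for the given hostname.
func buildDIDDocument(host string, pub *ecdsa.PublicKey) DIDDocument {
	did := "did:web:" + host
	return DIDDocument{
		Context: []string{"https://www.w3.org/ns/did/v1", "https://w3id.org/security/multikey/v1"},
		ID:      did,
		VerificationMethod: []VerificationMethod{{
			ID: did + "#atproto", Type: "Multikey", Controller: did,
			PublicKeyMultibase: multikeyP256(pub),
		}},
		Service: []Service{{
			ID: "#atproto_pds", Type: "AtprotoPersonalDataServer",
			ServiceEndpoint: "https://" + host,
		}},
	}
}

func main() {
	key, err := newSigningKey()
	if err != nil {
		panic(err)
	}
	out, _ := json.MarshalIndent(buildDIDDocument("stream.place", &key.PublicKey), "", "  ")
	fmt.Println(string(out))
}
```

The node would serve the resulting JSON at `/.well-known/did.json` and keep the private key around for signing commits.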
With that done, we need to create a new ATProto repository internal to the app. Thankfully, we get a lot of help with this from Bluesky's "indigo" repo containing their Go code; the `repo` package already contains a complete implementation of the repository structure, block store, and signing process. So on first boot, Streamplace creates a new repo and populates it with all of its lexicons as `com.atproto.lexicon.schema` records. Repo created!
Now we need to implement the smallest possible subset of XRPC endpoints to expose our PDS to the world. Thankfully, because all of our content comes from within the node, we can skip dozens of methods related to authentication, upload, and record creation; the data already exists internally and only needs to be served. We eventually landed on this set of methods:
com.atproto.repo.describeRepo
com.atproto.repo.getRecord
com.atproto.repo.listRecords
com.atproto.server.describeServer
com.atproto.sync.getRecord
com.atproto.sync.listRepos
com.atproto.sync.subscribeRepos
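To illustrate how little code the simple JSON endpoints need, here is a sketch of two of them using only the Go standard library. It is not Streamplace's code: `staticPDS`, `newStaticPDS`, `put`, and `get` are hypothetical names, records live in an in-memory map, and the response omits the record `cid` that a real implementation would include.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type recordKey struct{ Collection, RKey string }

// staticPDS serves one repo's records read-only from memory.
type staticPDS struct {
	DID     string
	Records map[recordKey]json.RawMessage
}

func newStaticPDS(did string) *staticPDS {
	return &staticPDS{DID: did, Records: map[recordKey]json.RawMessage{}}
}

// put stores a record body that is assumed to already be valid JSON.
func (p *staticPDS) put(collection, rkey, jsonValue string) {
	p.Records[recordKey{collection, rkey}] = json.RawMessage(jsonValue)
}

func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(v)
}

// handler wires up two read-only XRPC query methods.
func (p *staticPDS) handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/xrpc/com.atproto.server.describeServer", func(w http.ResponseWriter, r *http.Request) {
		writeJSON(w, http.StatusOK, map[string]any{
			"did":                  p.DID,
			"availableUserDomains": []string{}, // no signups on a read-only server
		})
	})
	mux.HandleFunc("/xrpc/com.atproto.repo.getRecord", func(w http.ResponseWriter, r *http.Request) {
		q := r.URL.Query()
		coll, rkey := q.Get("collection"), q.Get("rkey")
		rec, ok := p.Records[recordKey{coll, rkey}]
		if q.Get("repo") != p.DID || !ok {
			writeJSON(w, http.StatusBadRequest, map[string]string{
				"error": "RecordNotFound", "message": "no such record",
			})
			return
		}
		writeJSON(w, http.StatusOK, map[string]any{
			"uri":   "at://" + p.DID + "/" + coll + "/" + rkey,
			"value": rec,
		})
	})
	return mux
}

// get runs one GET request against the handler without opening a socket.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	pds := newStaticPDS("did:web:example.com") // hypothetical DID
	pds.put("com.atproto.lexicon.schema", "com.example.alpha", `{"lexicon":1}`)
	code, body := get(pds.handler(),
		"/xrpc/com.atproto.repo.getRecord?repo=did:web:example.com"+
			"&collection=com.atproto.lexicon.schema&rkey=com.example.alpha")
	fmt.Println(code, body)
}
```

The remaining `repo` and `sync` query methods follow the same pattern; `com.atproto.sync.getRecord` and `com.atproto.sync.subscribeRepos` are the ones that need real repository machinery behind them.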
How did I arrive at that list? Guesswork, mostly. I started out by implementing just the endpoints necessary to make the PDS node and repository show up correctly on pdsls.dev. Most of them were easy to implement, as they just return various JSON collections of repos and records, but `com.atproto.sync.getRecord` requires you to return a proof of the record in the repo, which required digging a bit more into the MST structure that comprises an atproto repository.
I then booted up an atproto relay locally (also from indigo) and made a `com.atproto.sync.requestCrawl` request to ask it to start to crawl my repository. To make this work, I needed to support the `com.atproto.sync.subscribeRepos` endpoint, colloquially the "firehose".
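The crawl request itself is a small JSON POST. The sketch below only builds the request so it can be inspected; `newRequestCrawl` is a hypothetical helper, and the relay address and PDS hostname in `main` are placeholders for whatever a local setup uses.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newRequestCrawl builds a com.atproto.sync.requestCrawl call asking the
// relay at relayURL to start crawling the PDS reachable at pdsHostname.
func newRequestCrawl(relayURL, pdsHostname string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"hostname": pdsHostname})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost,
		relayURL+"/xrpc/com.atproto.sync.requestCrawl", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder addresses; substitute the local relay and PDS hostnames.
	req, err := newRequestCrawl("http://localhost:2470", "pds.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

Once the relay accepts the request, it connects back to the PDS's `com.atproto.sync.subscribeRepos` endpoint, which is why that endpoint has to exist first.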
Supporting the firehose was the trickiest part of the project, because the convention there is that you don't just push out all of the data in the repository every time the relay connects; after the initial sync you're expected to return a cursor along with the data. When the relay connects again, it provides that cursor, and you're expected to return only the data that's changed since it last asked.
To support this, I had to go back to the initial repository-creation code and have it keep track of what had changed in the database on boot. When there are changes to the lexicons, it creates a new signed commit structure and saves that into the Streamplace database. This way, whenever a relay connects, we can replay just the commits that have occurred since the last cursor. With a few tweaks to support the metadata necessary for the inductive firehose, we have a working firehose.
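The cursor bookkeeping can be sketched as an append-only log with sequence numbers. This is an in-memory stand-in for the database table, not Streamplace's schema; `commitLog`, `Append`, and `Since` are hypothetical names, and the commit payloads are treated as opaque bytes.

```go
package main

import (
	"fmt"
	"sync"
)

// event is one stored firehose message: a sequence number plus the
// already-encoded signed commit.
type event struct {
	Seq    int64
	Commit []byte
}

// commitLog is an append-only list of commits ordered by sequence number.
type commitLog struct {
	mu     sync.Mutex
	events []event
}

// Append stores a commit and returns the sequence number assigned to it.
// Sequence numbers start at 1 and only ever increase.
func (l *commitLog) Append(commit []byte) int64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	seq := int64(len(l.events)) + 1
	l.events = append(l.events, event{Seq: seq, Commit: commit})
	return seq
}

// Since returns every event with a sequence number greater than cursor.
// A cursor of 0 replays the whole log, which covers the initial sync.
func (l *commitLog) Since(cursor int64) []event {
	l.mu.Lock()
	defer l.mu.Unlock()
	var out []event
	for _, e := range l.events {
		if e.Seq > cursor {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	var log commitLog
	log.Append([]byte("commit: initial lexicons"))
	log.Append([]byte("commit: lexicon update"))
	last := log.Append([]byte("commit: another update"))

	// A relay that last saw seq 1 reconnects and gets only what it missed.
	for _, e := range log.Since(1) {
		fmt.Println(e.Seq, string(e.Commit))
	}
	fmt.Println("caught up:", len(log.Since(last)) == 0)
}
```

The websocket handler would call `Since` with the cursor the relay supplies, write each event out in order, and then wait for new appends.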
And it's live; check out the stream.place PDS!
Toward a Static PDS
While we went to more work than necessary to implement a lexicon repository, I think there's some promise in the general pattern here — call it a "Static PDS" to make an analogy with static website generators. Not every ATProto use case justifies a full PDS with OAuth and full CRUD capability. Sometimes you just want to put a few records up on the internet. Doing this with a single server or worker that provides both the `did:web` identity and a single embedded ATProto repo could provide a relatively easy and low-effort means of publishing to the network. What would it take to get to that point?
First, we'd need a coherent spec for what XRPC methods are necessary to constitute a compliant "read-only" PDS. The list above is enough to get the server working with pdsls and Bluesky's relay, but other applications might mandate other use cases. A spec and a set of conformance tests for a PDS from the protocol developers would go a long way toward enabling this use case.
Secondly, we should explore the idea of an "offline repository." Streamplace's PDS implementation possesses the repo's signing keys, but for the "static" use case you might want to, say, write a new blog post and sign it locally without exposing those signing keys to a hosting provider. This should be totally possible! You'd have to do a fair amount of pre-generation, producing MST proofs for every possible record that could be requested by `com.atproto.sync.getRecord`. And you'd need to precompile a list of signed commits for returning to the firehose. But good tooling could abstract away all of that complexity for a user, and you could imagine automating all of that generation to make it no more complex than a git commit.
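The local signing step might look roughly like the sketch below. It assumes a P-256 key and treats the encoded unsigned commit as opaque bytes, so the DAG-CBOR encoding and MST construction that real tooling needs are not shown; `signCommit`, `verifyCommit`, and `newKey` are hypothetical helpers. The signature is emitted as 64 raw bytes with a low-S value, which is the form atproto expects.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
	"math/big"
)

// newKey generates a P-256 signing key that never leaves the local machine.
func newKey() *ecdsa.PrivateKey {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	return key
}

// signCommit hashes the encoded unsigned commit with SHA-256 and returns a
// 64-byte r||s signature, normalized so that s is in the lower half of the
// curve order.
func signCommit(key *ecdsa.PrivateKey, unsignedCommit []byte) ([]byte, error) {
	digest := sha256.Sum256(unsignedCommit)
	r, s, err := ecdsa.Sign(rand.Reader, key, digest[:])
	if err != nil {
		return nil, err
	}
	n := key.Curve.Params().N
	if s.Cmp(new(big.Int).Rsh(n, 1)) > 0 {
		s = new(big.Int).Sub(n, s) // (r, n-s) is an equally valid signature
	}
	sig := make([]byte, 64)
	r.FillBytes(sig[:32])
	s.FillBytes(sig[32:])
	return sig, nil
}

// verifyCommit checks a 64-byte r||s signature against the public key.
// It does not itself reject high-S signatures.
func verifyCommit(pub *ecdsa.PublicKey, unsignedCommit, sig []byte) bool {
	if len(sig) != 64 {
		return false
	}
	digest := sha256.Sum256(unsignedCommit)
	r := new(big.Int).SetBytes(sig[:32])
	s := new(big.Int).SetBytes(sig[32:])
	return ecdsa.Verify(pub, digest[:], r, s)
}

func main() {
	key := newKey()
	commit := []byte("placeholder for an encoded unsigned commit")
	sig, err := signCommit(key, commit)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(sig), verifyCommit(&key.PublicKey, commit, sig))
}
```

Only the signed output would be uploaded; the hosting provider sees the public key in the DID document and nothing else.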
Finally, you'd need an appropriate hosting provider. You couldn't get away with a truly static set of files, as you need to support a websocket. But that websocket doesn't need to actually do very much; it returns all of the static commits since the cursor and then sits there until a new version of the repo gets deployed. With the proliferation of serverless workers and whatnot, it's easy to imagine a variety of providers that could host a static ATProto repo cheaply and performantly.
Thanks for reading! Streamplace is building decentralized livestreaming for decentralized social networks — if this kind of thing is exciting to you, check our Jobs page or join the Discord!