So, I've been thinking about this, and have some ideas.
In what I've got so far, the fundamental objects are JSON objects; the Java wrappers just provide convenience functions for accessing them. That isn't a requirement of the system, but it could be made one. Either way, the Data Store interface could be extended with two requirements: an export function that returns a JSON object representing the totality of information needed to persist the store's state, and an import function that replaces the current state with the state represented by the JSON argument passed in. With those in place, the system as a whole could provide import and export functionality by calling each Data Store in turn and aggregating the results into a larger construct.
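A minimal sketch of what that extension might look like, with JSON objects modeled as plain Map<String, Object> for brevity (the DataStore, exportState, importState, and SystemState names here are all hypothetical, not existing code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: JSON objects are modeled as Map<String, Object>; a real
// implementation would use an actual JSON library.
interface DataStore {
    String name();
    Map<String, Object> exportState();            // everything needed to persist this store
    void importState(Map<String, Object> state);  // replace current state wholesale
}

class SystemState {
    // Export: call each store and aggregate into one larger construct,
    // keyed by store name.
    static Map<String, Object> exportAll(Iterable<DataStore> stores) {
        Map<String, Object> whole = new LinkedHashMap<>();
        for (DataStore s : stores) {
            whole.put(s.name(), s.exportState());
        }
        return whole;
    }

    // Import: hand each store its slice of the aggregate.
    @SuppressWarnings("unchecked")
    static void importAll(Iterable<DataStore> stores, Map<String, Object> whole) {
        for (DataStore s : stores) {
            s.importState((Map<String, Object>) whole.get(s.name()));
        }
    }
}
```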
Given two JSON structures representing two states of the system, you can define a method that computes the differences between them. Giving each state a unique ID then allows a checkpoint system to be built.
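A minimal, single-level diff might look like this; the "set"/"unset" delta shape is a hypothetical choice of mine, and a real implementation would recurse into nested objects:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// A delta records added/changed values under "set" and removed keys
// under "unset".
class Delta {
    static Map<String, Object> diff(Map<String, Object> from, Map<String, Object> to) {
        Map<String, Object> set = new HashMap<>();
        Set<String> unset = new HashSet<>();
        for (Map.Entry<String, Object> e : to.entrySet()) {
            if (!Objects.equals(from.get(e.getKey()), e.getValue())) {
                set.put(e.getKey(), e.getValue());  // added or changed
            }
        }
        for (String k : from.keySet()) {
            if (!to.containsKey(k)) unset.add(k);   // removed
        }
        Map<String, Object> delta = new HashMap<>();
        delta.put("set", set);
        delta.put("unset", unset);
        return delta;
    }
}
```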
When a deployment is started, it is initialized from a saved checkpoint state (or a null, default checkpoint, if it's the first time). When a checkpoint is subsequently requested, the system compares the current state to that initial checkpoint and saves a new checkpoint consisting of the initial checkpoint plus all the deltas between it and the current state. This new checkpoint replaces the initial checkpoint as the reference point in the running system.
Essentially, as each new checkpoint is requested, another "layer" is added to the file. Playing through the layers applies each block of changes to the base data to arrive at the current state. (If you are familiar with Docker images, you can see where this idea came from.) With that chronicle of history, it is trivial to "back out" any changes in reverse chronological order. Backing out individual changes out of chronological order is also possible, but risks leaving the system in an undefined state, so it should only be done with caution.
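Playing through the layers could be sketched like this, assuming each layer is a delta with a "set" map of added/changed values and an "unset" set of removed keys (a shape I'm assuming for illustration); backing out the most recent change is just replaying one layer fewer:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Replay a checkpoint file: start from the base state and apply each
// layer's block of changes in order.
class Replay {
    @SuppressWarnings("unchecked")
    static Map<String, Object> replay(Map<String, Object> base, List<Map<String, Object>> layers) {
        Map<String, Object> state = new HashMap<>(base);
        for (Map<String, Object> layer : layers) {
            state.putAll((Map<String, Object>) layer.get("set"));   // writes
            for (String k : (Set<String>) layer.get("unset")) {
                state.remove(k);                                    // removals
            }
        }
        return state;
    }
}
```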
That gives you a method of dealing with simple, linear changes. To support parallel work, you need trees. Consider a staging server that represents some work people are collaborating on. At the start of a task, you take a checkpoint from that server and use it as the basis for development on a local server. When you have completed your work, you checkpoint your local server and begin a merge process back to staging. First, a new checkpoint is made on the staging server, incorporating any changes made since you took your own checkpoint from it. That checkpoint is compared to the checkpoint of your local server. At some point the two history trees intersect, since you started from some previous version of that server. Any changes made to the staging server since your branch started need to be inserted into your own branch. Conflicts can be warned about, and breakages may occur, so everything should be re-tested at that point. If everything is accepted, a new local checkpoint is made. This one will differ from the staging server only in simple appended layers, which can just be appended onto the staging server to incorporate the changes. Local and staging are now running with the same state, and the new features have been incorporated.
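The conflict-warning step of that merge could be sketched as follows, again assuming each layer is a delta with a "set" map and an "unset" key set (a hypothetical shape): because both histories share a common ancestor checkpoint, any key touched by both the staging-side layers and the local layers since that point is a potential conflict.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conflict detection during the merge: intersect the key sets touched by
// each side's layers since the common ancestor checkpoint.
class Merge {
    @SuppressWarnings("unchecked")
    static Set<String> touchedKeys(List<Map<String, Object>> layers) {
        Set<String> keys = new HashSet<>();
        for (Map<String, Object> layer : layers) {
            keys.addAll(((Map<String, Object>) layer.get("set")).keySet());
            keys.addAll((Set<String>) layer.get("unset"));
        }
        return keys;
    }

    static Set<String> conflicts(List<Map<String, Object>> stagingSince,
                                 List<Map<String, Object>> localSince) {
        Set<String> both = touchedKeys(stagingSince);
        both.retainAll(touchedKeys(localSince));
        return both;   // keys both sides changed; warn and re-test these
    }
}
```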
This is not unlike how Git works. I'm trying to think of how to make Git the backbone of all of this. But I've never been happy with Git's merging capabilities. So I'm unsure.
Progression from a staging server to a production server would work in, pretty much, the same way.
This covers isolated work by people running individual servers for development. There may also be a way to encapsulate truly collaborative work. Any change that someone makes is, effectively, an update to a property on an object somewhere. If it is also tracked who made which updates, it should be possible to make a checkpoint from a given position that includes only a specific person's (or group's) changes. That would allow for selectively backing out changes from individuals, or creating something equivalent to Git patches based on an individual's work, even if that work was done on a collaborative server. This would need some overhead to track, but if the values are retained in the checkpoint file and flushed from the live representation after a checkpoint, it should be manageable.
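If each layer also carried a record of who made it (an "author" field alongside the delta, a hypothetical addition), selecting one person's work or backing it out becomes a simple filter over the layer list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Per-author filtering: keep == true extracts one person's layers (a
// Git-patch-like bundle); keep == false replays everything except them,
// i.e. backs that person's changes out.
class ByAuthor {
    static List<Map<String, Object>> select(List<Map<String, Object>> layers,
                                            String author, boolean keep) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> layer : layers) {
            if (author.equals(layer.get("author")) == keep) out.add(layer);
        }
        return out;
    }
}
```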
A maintenance concern is that each layer in a checkpoint file adds more processing: to compute the final state, you need to start with the initial state and apply each change in succession, which will eventually become cumbersome. Docker alleviates this with the ability to "squash" an image, combining multiple historical layers into one. That technique would work here as well.
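Squashing could look like the following sketch, again assuming each layer is a delta with a "set" map of written values and an "unset" set of removed keys (a hypothetical shape): a run of consecutive layers folds into one equivalent layer, so replay cost stays bounded as history grows.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Squash {
    @SuppressWarnings("unchecked")
    static Map<String, Object> squash(List<Map<String, Object>> layers) {
        Map<String, Object> set = new HashMap<>();
        Set<String> unset = new HashSet<>();
        for (Map<String, Object> layer : layers) {
            // This layer's writes; a later write cancels an earlier removal.
            for (Map.Entry<String, Object> e : ((Map<String, Object>) layer.get("set")).entrySet()) {
                set.put(e.getKey(), e.getValue());
                unset.remove(e.getKey());
            }
            // This layer's removals; a later removal cancels an earlier write.
            for (String k : (Set<String>) layer.get("unset")) {
                set.remove(k);
                unset.add(k);
            }
        }
        Map<String, Object> out = new HashMap<>();
        out.put("set", set);
        out.put("unset", unset);
        return out;
    }
}
```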
A practical concern is that some changes are highly relevant (code for a verb on a base object), while others are much less so (changes to the containment hierarchy from someone walking around). Less relevant changes will cause a lot of thrash in the layers if they can't be detected as unimportant and excluded.
This system should work well with a robust testing environment. Given a set of unit tests that validate functionality against a known state, a change can be approved if those tests run successfully against the new state. If automated appropriately, they could be used in an integration pipeline to approve incoming changes to a base stream. If logs are kept of user interactions on the production server, those could also form the basis of further tests. For example, before rolling out an update, those interactions could be played into a test server running the new code to ensure that either the same results occur, or any differences correspond to known work.
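The record-and-replay idea might be sketched like this; everything here is hypothetical, with a plain function standing in for the server and logged interactions reduced to input/output pairs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Replay logged production interactions against a candidate build: feed
// each recorded input to the new code and flag any interaction whose
// result differs from what production produced. Divergences are then
// reviewed -- either a regression, or an expected change from known work.
class ReplayTest {
    record Interaction(String input, String recordedOutput) {}

    static List<Interaction> divergences(List<Interaction> log,
                                         Function<String, String> candidate) {
        List<Interaction> diffs = new ArrayList<>();
        for (Interaction i : log) {
            if (!i.recordedOutput().equals(candidate.apply(i.input()))) diffs.add(i);
        }
        return diffs;
    }
}
```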
Anyway, let me know if that addresses some of the concerns you have raised about what is lacking in current collaborative MUD development and if such a system would be worth doing.