Darwino for Domino: Replication and Data Format

Wed May 11 14:35:50 EDT 2016

Tags: darwino

Apr 14 2016 - An Overview of Darwino for Domino Types
May 11 2016 - Darwino for Domino: Replication and Data Format
May 16 2016 - Darwino for Domino: Domino-side Configuration
Jun 01 2016 - Darwino for Domino: Conceptual Overlap and Distinctions

One of the key points of interest in Darwino for Domino developers is its two-way replication. Darwino's replication system was built in such a way that, in addition to its own internal needs, you can also write a replicator to connect to an entirely-unrelated system, as long as that replicator translates the foreign data to and from JSON documents. Domino is a perfect case for this, since the data model is already very similar, and its replicator ships with Darwino and has been a focus of my attention for a while.

Behind the scenes, this replication takes advantage of a lot of the same kinds of things that Domino's replication always has: UNIDs, sequence IDs, original-vs.-in-file modification/creation, and deletion stubs. These details are transparent to the developer: the Darwino adapter knows how to fetch the appropriate data from the NSF and to convince Domino that the Darwino DB is (almost) like a remote Domino server, at least as far as the stored data is concerned.

The primary differences show up in the way data is formatted for storage. Darwino uses JSON for its document format, which has a couple key advantages and disadvantages compared to NSF's "bag of items" approach. The best approach is probably to provide an example and highlight the pertinent differences. Say you have a Domino document that (conceptually) looks like this:

FirstName
	Type: TEXT
	Value: Foo
LastName
	Type: TEXT
	Value: Fooson
Username
	Type: TEXT
	Flags: NAMES READWRITERS
	Value: CN=Joe Schmoe/O=SomeOrg
Birthday
	Type: TIME
	Value: 1970/2/1
Vacations
	Type: TIME_RANGE
	Value: 2016/1/1-2016/1/5, 2016/3/3-2016/3/5
IsAdmin
	Type: TEXT
	Value: Y

On the Darwino side, that would potentially look like this:

{
	"_writers": {
		"username": ["cn=Joe Schmoe,o=SomeOrg"]
	}
	"firstname": "Foo",
	"lastname": "Fooson",
	"birthday": "1970-02-01",
	"vacations": [
		"2016-01-01/2016-01-05",
		"2016-03-03/2016-03-05"
	]
	"admin": true
}

There are certainly a few things to take note of here. First and foremost is the structure of the authors field. Because JSON doesn't have field metadata, the way Darwino does its readers/writers security is by using specially-named and -structured properties within the JSON, and so the converter moves all readers and writers fields into that. This has internal-implementation reasons, but I think it's also conceptually preferable to, say, having a multi-level object within each field to declare its flags separate from the value. The name also happens to be stored in LDAP style, because Darwino is more at home with standard LDAP conventions for that sort of thing.

Another thing to note is the format of the date fields. Since JSON doesn't have a real date/time type of its own, these values are converted according to ISO 8601 and stored as strings. That means that your Darwino application will need to know that those string values represent dates, but they're reasonable to deal with.

The multi-value date field leads to another important aspect: arrays. In Domino, most items are conceptually presented as arrays, regardless of whether they contain single or multiple values, leading to code that requires either explicitly asking for the first element or jumping through hoops to deal with single or multiple values. Since that's a drag to worry about with JSON, the default behavior for single-value items transferred to Darwino is to store them as "bare" values. When configuring the translation (which will be fodder for a future post), you are able to specify that you want a field to be always stored as an array, which will allow the Darwino-side code to be simpler.

The last field shows off an outright advantage of the JSON format: boolean storage. When defining the conversion, you can specify a field as boolean and provide what the true/false values will be, and they will be sent over to JSON as true and false explicitly. That's not a night-and-day change, but it is a nice help.

Finally, there's the matter of rich text, which is unsurprisingly a nontrivial problem. This is handled in an XPage-alike way: MIME rich text is transferred with only minor adjustments, while Composite Data is converted to HTML and cleaned up a bit before transfer. Darwino supports the concept of attachments natively, and so Domino attachments are brought over with a naming prefix to match the field they're attached to, plus a delimiter to indicate whether they're normal attachments or embedded images. The way it is presented on the user end is dependent on the application, but Darwino has some routines to translate storage-safe inline image refs to app-relative URLs.

Later, I will go into the process of how the Domino-Darwino adapters are set up. The short of it is that you create scripts that can run the gamut from just telling the server where to find the NSF to customizing the translation of each field encountered. This allows you to either transfer the data back and forth in the default "best approximation" approach or use the opportunity to enforce a bit of a schema on old data.

New Comment

Name: Body: