Showing posts for tag "darwino"

Replicating Domino to Azure Data Lake With Darwino

Mon May 10 10:06:09 EDT 2021

Tags: darwino

Though Darwino is an app-dev platform, one of its main ongoing uses has been reporting on Domino data. By default, when replicating with Domino, Darwino uses its own table layout in its backing SQL database, making use of the various vendors' native JSON capabilities to store documents in a form analogous to Domino's. Once it's there, even if you don't actually build any apps on top of it, it's immediately useful for querying at scale with various reporting tools: BIRT, Power BI, Crystal Reports, what have you. Add in some views and, as needed, extra indexes, and you have an extraordinarily speedy way to report on the data.

Generic Replication

But one of the neat other uses comes in with that "by default" above. Darwino's replication engine is designed to be thoroughly generic, and that's how it replicates with Domino at all: since an adapter only needs to implement a handful of classes to expose the data in a Darwino-friendly way, the source or target's actual storage mechanism doesn't matter overmuch. Darwino's own storage is just "first among equals" as far as replication is concerned, and the protocol it uses to replicate with Domino is the same as it uses to replicate among multiple Darwino app servers.

From time to time, we get a call to make use of this adaptability to target a different backend. In this case, a customer wanted to be able to push data to Azure Data Lake, which is a large-scale filesystem-like storage service Microsoft offers. The idea is that you get your gobs of data into Data Lake one way or another, and then they have a suite of tools to let you report on and analyze it to your heart's content. It's the sort of thing that lets businesspeople make charts to point to during meetings, so that's nice.

This customer had already been using Azure's SQL Server services for "normal" Darwino replication from Domino, but wanted to avoid doing something like having a script to transform the SQL data into Data-Lake-friendly formats after the fact. So that's where the custom adapter came in.

The Implementation

Since the requirement here is just going one way from Domino to Data Lake, that took a bit of the work off our plate. It wouldn't be particularly conceptually onerous to write a mechanism to go the other way - mostly, it'd be finding an efficient way to identify changed documents - but the "loose" filesystem concept of Data Lake would make identifying changes and dealing with arbitrary user-modified data weird.

The only real requirements for a Darwino replication target are that you can represent the data in JSON in some way and that you are able to identify changes for delta replication. That latter one is technically a soft requirement, since an adapter could in theory re-replicate the entire thing every time, but it's worlds better to be able to do continuous small replications rather than nightly or weekly data dumps. In Darwino's own storage, this is handled by indexed columns to find changes, while with Domino it uses normal NSFSearch-type capabilities to find modifications since a specific date.
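As a sketch, the shape of a one-way target boils down to something like this - with the caveat that these are hypothetical names for illustration, not the actual Darwino adapter interfaces:

import java.io.IOException;
import java.time.Instant;

// Hypothetical interface; the real adapter SPI is more involved
public interface OneWayReplicationTarget {
	// Store the JSON form of a new or changed document, keyed by UNID
	void saveDocument(String unid, String json) throws IOException;
	// Process a deletion stub from the source
	void deleteDocument(String unid) throws IOException;
	// The last successful replication time, which is what enables delta replication
	Instant getLastReplicated() throws IOException;
	// Record a successful replication pass for next time
	void setLastReplicated(Instant lastReplicated) throws IOException;
}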

Data Lake is a little "metadata light" in this way, since it represents itself primarily as a filesystem, but the lack of need to replicate changes back meant I didn't have to worry about searching around for an efficient call. I settled on a basic layout:

Data Lake layout

Within the directory for NSFs, there are a few entities:

  • darwino.json keeps track of the last time the target was replicated to, so I can pick up on that for delta replication in the future
  • docs houses the documents themselves, named like "(unid).json" and containing the converted JSON content of the Domino documents
  • attachments has subfolders named for the UNIDs of the documents referenced, containing the attachments and embedded images, with names prefixed by the fields they're from
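Concretely, for a single NSF, that works out to a tree along these lines (the file names here are illustrative):

someapp.nsf/
	darwino.json
	docs/
		0A1B2C3D4E5F60718293A4B5C6D7E8F9.json
	attachments/
		0A1B2C3D4E5F60718293A4B5C6D7E8F9/
			Body-report.pdf
			Body-image001.png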

Back on the Domino side, I can set this up in the Sync Admin database the same way I do for traditional Darwino targets, where the Data Lake extension is picked up:

Data Lake in Sync Admin

Once it's set up, I can turn on the replication schedule and let it do its thing in the background and Data Lake will stay in step with the NSFs.

Conclusion

Now, immediately, this is really of interest just to our client who wanted to do it, but I felt like it was a neat-enough trick to warrant an overview. It's also satisfying seeing the layers working together: the side that reads the Domino data needed no changes to work with this entirely-new target, and similarly the core replication engine needed no tweaks even though it's pointing at a fully-custom destination.

XPages on Android

Mon Apr 15 16:30:50 EDT 2019

  1. Letting Madness Take Hold: XPages Outside Domino
  2. XPages on Android
  3. Concept Proven: A Complex Production XPages App Running Outside Domino

Around the start of the year, I had a bit of a dalliance with the idea of running XPages outside Domino. The upshot of that project is that it is indeed possible to do so, but there'd be some work to do to make it practical.

This month, we revisited the idea with a healthy dose of Darwino to provide some undergirding technology, with the goal of being able to run a plain old XPages app on mobile devices backed by Darwino DBs and replicating back to Domino. There was a lot of fiddling involved, but it works:

Android is the natural first target, but iOS is about 70% there, with the scaffolding loading up to the point of presenting a page, but currently with a bit of trouble when it comes to resolving data classes and executing renderers.

What Specifically Is Going On?

We set out a few required parameters to call it a successful proof-of-concept:

  • It has to use the actual XPages framework - that is to say, the jar files shipped with Notes/Domino
  • The XPages themselves have to be shared among Domino and the Darwino app, in the form of their Java "intermediate" source (compiling from .xsp source is possible but is a whole other thing)
  • It has to use xp:dominoView and xp:dominoDocument data sources
  • It has to load the Extension Library
  • It has to use ancillary elements from the app: managed beans, themes, CSS, images, SSJS libraries

This checks all of those boxes, and it's pretty satisfying to see in action.

The Stumbling Blocks

As I discussed in my post about the original project, there are aspects of XPages that are meant to make this sort of thing possible, with the core parts abstracting out different platforms and runtime environments. Over the years, though, assumptions about Domino and OSGi crept in, with newer additions and the Extension Library taking a bit less care to be environment-neutral.

Moreover, XPages is not open source and I don't have any particular special access to it, making this whole thing essentially, in the video-game sense, hard mode. There are a handful of classes that needed to be outright swapped out, like the annoyingly-final NotesContext class, but there was much less of that than I'd thought.

After that, getting the data to point to Darwino was pleasantly straightforward. Other than a handful of areas where the NAPI comes in, the Domino data sources largely adhere to the rules of the lotus.domino interfaces and don't make assumptions about the implementing classes (this is essentially how ODA shims in as well).

What This Could Be Useful For

With the right fleshing out, this could be a real way to run existing XPages apps on mobile devices. That could be pretty useful on its own, but I don't think that carrying forward XPages apps as-is is the right idea. The side effect of this, though, is that you have a functioning XPages app in a normal old .war file project, structured with Maven or Gradle, and ready to be molded into newer frameworks using whatever tooling you'd like. No Designer, no OSGi, no Servlet 2.4/2.5, just a clean basis running your existing logic and ready to be improved.

If the mountain of existing XPages code is going to have a future, I think it should be something like this.

Implementing Cluster Replication From Domino to Darwino

Tue Mar 19 11:24:31 EDT 2019

Tags: darwino

Since its inception, Darwino has had two-way replication between it and Domino, and it's evolved over the years in fidelity and configurability. Recently, I was able to check an item off the to-do list that I've wanted for a while: "cluster-style" replication from Domino, where a document change immediately kicks off replication to Darwino.

Component #1: Extension Manager

Fortunately, the implementation is straightforward in concept: the Extension Manager has been in there since 4.0 and provides exactly the hooks one would need to implement this. The trouble with the Extension Manager, though, is that you can only subscribe to events from within an ExtMgr addin, and that means native code. You can't just hook into it from, say, an OSGi plugin.

My first thought was to use DOTS, which created this exact sort of bridge years ago. However, it's critically limited: the EM ferrying was never really fleshed out that much, nor will it ever be, since it's not a supported project. It was kind-of-sort-of supported in the 9.x era for "social" purposes, but those days are behind us. Moreover, its separate OSGi environment wouldn't suit Darwino's needs particularly well.

The DOTS dynamic library, though, could still be potentially useful. Nathan Freeman came across this a couple of years ago: due to the way the DOTS dylib ferries the events from the ExtMgr world to DOTS, the channel is actually consumable by anything, not just DOTS specifically. My initial implementation did exactly this: it fed from the fountain of messages produced by the DOTS dylib for its own ends.

However, the core trouble still remains that DOTS isn't supported as such, and it has too many moving parts for us to want to take on as a dependency. Moreover, the "siphoning" only works if there's only one subscriber listening - on a server that also runs ODA (which Darwino does not use), you have the two consumers contending for messages, which is a recipe for missed events.

So I decided to write a custom-made ExtMgr addin, which would have the advantages of being much smaller and easier to maintain, feeding a different queue, and being a fun learning opportunity for me. The last part isn't as important to Darwino-the-product per se, but I always like when it lines up like that.

Component #2: Message Queues

The way DOTS and this new addin do their things is to use Message Queues, another technology presumably-not-coincidentally added to Domino in R4. The way these work is that you create a named queue (DOTS's, for example, is named MQ$DOTS) and then feed it strings, which are then consumed by anything running in the same Notes/Domino environment by requesting the queue of the same name and waiting for messages. It's pretty simple both in theory and in execution, with the minor problem that the API isn't officially available from Java.

Fortunately for ODA's (and presumably DOTS's) use, though it's not part of the official API, there is a lotus.notes.internal.MessageQueue class that is a (shockingly-thin) wrapper around the MQ* functions in the C API. It's functional, and the thinness of the wrapper means that the method parameters, though unnamed in the bytecode as usual, are clear matches for the equivalent C parameters.

I initially started using this, but ended up writing a nicer wrapper in Darwino's NAPI that implements BlockingQueue, making consumption in Java much clearer.

Component #3: The Listener

The final piece was the most comfortable, since it's entirely back in the warm embrace of Java: I wrote a class to listen for events coming through this queue, extract the database name, check for replicators configured for that NSF, and immediately kick any applicable ones off.
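In rough sketch form, that listener is just a loop over the queue - here with hypothetical class names standing in for the Darwino internals:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical names throughout; the real NAPI wrapper and replicator API differ
public class QueueListener implements Runnable {
	private final BlockingQueue<String> queue; // wraps the native message queue
	
	public QueueListener(BlockingQueue<String> queue) {
		this.queue = queue;
	}
	
	@Override
	public void run() {
		try {
			while (!Thread.currentThread().isInterrupted()) {
				// Each message names an NSF that changed on this server
				String dbPath = queue.poll(5, TimeUnit.SECONDS);
				if (dbPath != null) {
					// Immediately kick off any replicators configured for that NSF
					for (Replicator rep : Replicators.findFor(dbPath)) {
						rep.replicateNow();
					}
				}
			}
		} catch (InterruptedException e) {
			Thread.currentThread().interrupt();
		}
	}
}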

The End Result

Since the overhead of the work in a small replication is so minor, the end result is effectively the same as cluster replication between Domino servers: the change/creation/deletion is propagated over within a couple milliseconds (for small documents, due to, you know, physics). It's pretty satisfying to see in action.

Pub/Sub

If you've been following HCL's announcements lately and feel like this sounds very similar to the pub/sub support they've slated for V11, you're right. My guess is that they wanted to get Elastic Search working with low latency and (kindly and wisely) decided to turn the work required into a nicer interface for the same EM events we're using. It's a good feature and, assuming it's consumable from local Java, would have made my work here easier, but I didn't want to wait, and we target a couple Domino releases back anyway.

Reforming the Blog in Darwino, Part 4

Fri Jul 20 18:59:51 EDT 2018

Tags: java darwino
  1. Reforming the Blog in Darwino, Part 1
  2. Cramming Rails Into A Maven Tree
  3. Reforming the Blog in Darwino, Part 2
  4. Reforming the Blog in Darwino, Part 3
  5. Reforming the Blog in Darwino, Part 4

Last time, I went over my switch in tack for how I'm making the new version of my blog, and my overall focus on picking an interesting stack of JEE technologies. In this post, I'm going to start diving into the implementation of the UI, though I think that it will make sense to dedicate two posts to it.

The biggest decision I made with the UI side of this app is that I didn't want to make a client-side JS app. There's a reason they're so ascendant, and I find development with React or Stencil pretty enjoyable, but I wanted to go a different route here for a few reasons:

  • For a blog, a CSJS app is wildly overkill, and, in fact, would require extra work to fulfill one of the basic requirements of a blog, which is being web-crawler friendly.
  • I want to see how svelte I can make the client payload.
  • Skipping a JS framework (and a CSS one) is a great way to brush up on what plain HTML and CSS are capable of nowadays.
  • Unlike a typical Darwino app, my only target is a full-on Java web server, so I'm not held back on the Java side by the capabilities, say, of Dalvik on Android 4.
  • Part of me misses the simplicity of my early PHP days, albeit not the language.

The Java Side

I decided to go with the MVC 1.0 draft spec because it lets me write extremely focused code. Here is the controller for the home page:

package controller;

import javax.inject.Inject;
import javax.mvc.Models;
import javax.mvc.annotation.Controller;
import javax.ws.rs.GET;
import javax.ws.rs.Path;

import model.PostRepository;

@Path("/")
@Controller
public class HomeController {
	@Inject
	Models models;
	
	@Inject
	PostRepository posts;
	
	@GET
	public String get() {
		models.put("posts", posts.homeList());
		
		return "home.jsp";
	}
}

Naturally, there's a lot of magic going on behind the scenes - tons of heavy lifting by JAX-RS, MVC, CDI, JNoSQL, and Darwino - but that's the point. All the other components are off doing their jobs in their areas, while the code that provides the UI doesn't have to care about the specifics.

Things can get more complicated on the pages that actually have some functionality to them, but the code remains pleasantly focused. Take the handler for deleting posts:

@DELETE
@Path("{postId}")
@RolesAllowed("admin")
public String delete(@PathParam("postId") String postId) {
	Post post = posts.findByPostId(postId).orElseThrow(() -> new IllegalArgumentException("Unable to find post matching ID " + postId));
	posts.deleteById(post.getId());
	return "redirect:posts";
}

This adds another level of magic in the form of javax.annotation.security.RolesAllowed, but it's more of the good kind: even with no knowledge of the underlying frameworks, it's pretty clear what every bit of code is doing here. It's a refreshing bit of that Rails simplicity, just more compile-time-safe and much uglier.

Even beyond the minimal code is the cleanliness that this brings to the structure of the application: other than the img, css, and js paths, all of the routing within the application is handled by JAX-RS and MVC. It's not beholden to the folder structure in the project or to a Domino-style implicit app router.
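Enabling that routing is itself tiny: a standard JAX-RS application class is all it takes, something like this minimal sketch:

import javax.ws.rs.ApplicationPath;
import javax.ws.rs.core.Application;

// An empty Application subclass is enough: the container scans for
// @Path/@Controller classes and handles all routing beneath "/"
@ApplicationPath("/")
public class AppConfig extends Application {
}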

JSP

JSP has been the prototypical Java HTML language for about 20 years, and it's had a rough upbringing. The early versions committed the PHP/XPages sin of encouraging you to put business logic right on the page, and it even still has the typical Java problem that it's tricky to find advice about using it that uses technologies added since 2005.

Still, when used properly, it can be a nice, clean templating language. Again from the main home page:

<%@page contentType="text/html" pageEncoding="UTF-8"%>
<%@taglib prefix="t" tagdir="/WEB-INF/tags" %>
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
<t:layout>
	<c:forEach items="${posts}" var="post">
		<t:post value="${post}"/>
	</c:forEach>
</t:layout>

For an XPages developer, this is extremely comfortable. It's also very refreshingly elemental: there's no server-side persistence of the page, so everything is "load-time bound" and, with just HTML tags and core JSTL tags, nothing ends up on the page that you don't explicitly put there.

Ozark, the MVC implementation, also supports using JSF "Facelets" for the view portion, but JSP suits the task just fine.

HTML + CSS

It'd been far too long since I last really sat down and looked at what baseline HTML and CSS are like - in particular, I'd watched the rise of CSS Flexbox and Grid from afar, and never gave them a shot. Using components that generate their own HTML and pre-existing CSS frameworks to target with class names is all well and good, but it does leave you a bit disconnected from the fundamentals.

So, for this iteration, I tossed aside the very-nice Bootstrap framework I've been using, dusted off one of my old hand-built ones, and got to translating it into CSS Grid. This cut down on the page size enormously: I had already eschewed Dojo by not using XPages, but this now also meant that I could ditch the core bootstrap.css, jQuery, and any jQuery plugins.

Beyond CSS Grid, have you seen how nice HTML forms are nowadays? Just looking at this post reveals how much is built in in the way of validation and different input types, even before you write a line of JavaScript.

Turbolinks

Having such a trimmed-down UI means that pages already load extremely quickly, but I figured this was also a perfect chance to try out a bit of clever tech from the team at Basecamp: Turbolinks. Turbolinks is a JS file that you bring into your app which then transparently takes over your in-app links to minimize the amount of rendering you have to do. Since the surrounding boilerplate of the app usually doesn't change between requests, it can figure out the "diff" between old and new and just replace the body. It's essentially like partial refreshes without the server knowing anything about it.
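Wiring it up is about as low-ceremony as it gets: the shared layout just pulls in the script (from wherever the app serves it; the path here is illustrative) and Turbolinks takes care of the rest:

<!-- Turbolinks initializes itself when loaded via a standalone script tag -->
<script src="js/turbolinks.js" defer></script>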

It's still technically inefficient to have the server render and transfer surrounding page elements that are just going to be discarded anyway. But, on the other hand, skipping that means that I don't have to write JavaScript handlers myself, use a full CSJS app framework, or keep state on the server side. The server just keeps doing what it does with a fully context-less request and the browser sorts it out. Basecamp's programmers are masters at the targeted deployment of kludges for maximum benefit.


In the next (final?) post in the series, I'll finish up with my "low-JS" experience and other lessons learned from this project.

Another Project: XPages Jakarta EE Support

Sun Jun 03 16:40:11 EDT 2018

In my dealings with JNoSQL recently, I’ve been delving more into the world of modern Jakarta EE/Java EE/J2EE development, particularly the magic land of CDI.

The JEE stack tends to be organized as a collection of specs and implementations, many of which are really independent of each other and the underlying platform, making them pretty portable onto any reasonably-recent JVM. Now that Domino is actually on a reasonably-recent JVM, that makes it a workable target! So I decided to create a side project to bring some of JEE to XPages.

XPages has always been “sort of Java EE” - you don’t really have the full stack, and it’s far behind on the components that it does have, but a lot of the concepts are there. Of particular interest are managed beans and expression language.

CDI and Managed Beans

The XPages stack contains what amounts to a primordial version of CDI. Since the release of XPages, JSF improved on the original faces-config.xml declaration method to add annotation-based declarations, and then CDI is something of a codification and expansion of that into the full Java world.

My project uses the Weld reference implementation of CDI to create a CDI context for each XPages app that opts in, allowing it to use annotations on classes to declare beans and properties:

@ApplicationScoped
@Named // or @Named("applicationGuy")
public class ApplicationGuy {
    public String getFoo() {
        return "hello";
    }
}

These can then be used like normal managed beans in an XPage:

<xp:text value="#{applicationGuy.foo}"/>

The project’s README contains some further examples.

I went with the Java SE implementation of Weld instead of the pre-built servlet or OSGi packages since those are a little too smart for this use: they pick up on the fact that they’re in a JSF environment, but expect newer versions of the servlet spec and JSF.
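The SE bootstrap itself is pleasantly small. Conceptually - eliding the per-app lifecycle management the project actually does - it's along these lines:

import org.jboss.weld.environment.se.Weld;
import org.jboss.weld.environment.se.WeldContainer;

// Spin up a standalone CDI container and resolve a bean from it
try (WeldContainer container = new Weld().initialize()) {
	ApplicationGuy guy = container.select(ApplicationGuy.class).get();
	System.out.println(guy.getFoo());
}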

Expression Language

Since its original release, EL went through a similar standardization process as CDI and is now at version 3.0, distinct from both JSP and JSF. As anyone who has tried to call a method on a bean in EL has found out, the XPages EL implementation lags pretty far behind, at the JSF 1.0/1.1 level. Since that time, it has sprouted parameters and “projection” and is essentially a tiny scripting language now.
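For a taste of what that means, EL 3.0 will happily evaluate expressions like this one (the bean name here is arbitrary):

${postList.stream().filter(p -> p.published).count()}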

My project uses GlassFish’s EL implementation to outright replace the stock EL interpreter for apps making use of it. I added some affordances to IBM’s customized data support, so it’s intended as a drop-in replacement:

<xp:text value="${dataObjectExample.calculateFoo('some arg')}"/>

<xp:text value="#{el:requestGuy.hello()}"/> 

Note the “el:” prefix in the runtime-bound expression: that’s to get around Designer’s validation of runtime EL expressions.

So… Why?

That’s a good question! The first two reasons are “because it’s fun” and “to learn more about JEE”, but there’s also practical value for this sort of thing.

XPages is moribund, and that leaves Domino developers with a few options:

  • Go back to LotusScript. The iPad Notes client makes this a terrifyingly-practical option, but it’s soul death.
  • Go to JavaScript (or another platform). This is another route HCL is pushing, and it’s entirely valid: Node is a great platform with excellent support and momentum.
  • Go to modern Java.

For anyone who has invested a lot of time and brainpower in XPages over the years, that last one is particularly appealing, and projects like this can help you get there. If you have a large XPages code base, as I do with one of my clients, it makes a lot more sense to work on that in such a way that it gradually becomes less XPage-dependent while avoiding the trap of a full rewrite in another language.

Many of us have already done something of this sort: JAX-RS is another JEE standard, and the Wink implementation in the Extension Library, though also aging, accomplishes this same sort of task. Especially if your services don’t reference Wink explicitly and write just to the spec, they are very portable.

That portability - of code and skillset - is critical. Say you have a class like this:

import java.util.List;
import java.util.stream.Collectors;

import javax.inject.Inject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/issues")
public class IssuesResource {
    @Inject IssueRepository issueRepository;

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response get(@QueryParam("category") String category) {
        List<Issue> results = issueRepository.find(category).stream()
            .map(this::doSomething)
            .collect(Collectors.toList());
        return toResponse(results);
    }

    // ...
}

Which Java platform is that targeting? What’s the data storage mechanism? Who cares? This class certainly doesn’t. That could just as easily be Domino reading from an NSF or (as is actually the case in the example’s source) Tomcat with Darwino.

What’s Next?

Truthfully, maybe not much. Though JEE contains a whole raft of technologies, these two were the ones that scratched my immediate itch. We’ll see, though - the skill portability of erstwhile XPages developers is critically important, and I think that this is another one of the paths that can get us where we need to go.

Lessons From Writing a JNoSQL Driver

Sat Dec 30 11:12:24 EST 2017

The other day, I decided to start up a side project to write an app for my Stars Without Number game in Darwino. Like back when I wrote a forum/raiding app for my WoW guild, I like to use this kind of opportunity to try new technologies and flesh out my skills in existing ones.

One such tech I've had my eye on for a bit is JNoSQL, which is a framework for integrating with NoSQL databases in Java. It's along the lines of Hibernate OGM, but intended to avoid the pitfalls of the relational/NoSQL mismatch that came with trying to adapt JPA directly to NoSQL databases. JNoSQL promised to be much easier to implement for a new database, so I decided to give it a shot.

JNoSQL

JNoSQL is split into two paired components, cleverly named Diana (the driver side) and Artemis (the model/integration side). The task of writing a driver for a new database is pretty well-contained: pick the database type(s) you want to implement (out of key/value, column, document, and graph) and implement about half a dozen interfaces. This is in stark contrast to when I took a swing at writing a Hibernate OGM driver, where the task was significantly more daunting. The final result is only ten Java files, with a chunk of them being utility classes for code organization.

It's a young project - young enough that the best version to run right now is 0.0.4-SNAPSHOT - but it functions well and it's been taken under the wing of the Eclipse Foundation, which builds some confidence.

Implementation

Though the task was small, there were still a couple initial hurdles to getting going.

To begin with, I decided to start with the Couchbase driver - this certainly made the overall task easier, since Couchbase's semantics are very similar to Darwino's, but it also meant that I had to be wary of which parts of the codebase were really about implementing a Diana driver and which were Couchbase-isms. Fortunately, this was much easier than the equivalent work when I adapted the CouchDB Hibernate OGM driver, which was a sprawling codebase by comparison.

More significantly, though, it's always tough coming in to modify a codebase written by a single person or small team and learning as you go. The structure of the code is clean, but not quite my normal style (in part because Domino kept me from diving into Java 8 streams for so long), and I also had to ramp up quickly on the internal concepts of Diana. Fortunately, this was mostly easy, since the document-DB driver scaffolding is purpose-built, the hooks are straightforward and the query semantics were extremely easy to adapt. The largest impediment was getting used to the use of the term "Document", which internally refers to a key/value pair, while "DocumentEntity" is closer to the expected meaning.
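In code, the distinction looks something like this - going from my memory of the 0.0.x-era API, so treat the specifics loosely:

import org.jnosql.diana.api.document.Document;
import org.jnosql.diana.api.document.DocumentEntity;

// A "Document" here is a single key/value pair...
Document firstName = Document.of("firstname", "Foo");
// ...while a "DocumentEntity" is the whole named record
DocumentEntity entity = DocumentEntity.of("Person");
entity.add(firstName);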

Like the core implementation, the test suite I adapted from Couchbase was also pleasantly svelte, covering the bases without being an onerous nightmare to convert. Indeed, most of the code I added to it was the Darwino app scaffolding just for the test runtime.

Putting It Into Practice

Once the driver was written, I was hit by a bit of a personal curveball when I went to implement some actual data models. The model side, Artemis, is heavily wrapped together with CDI, which is a Java EE thing that, as I gather, handles managed beans, separation of implementation, and variable injection. This is a pretty normal thing for Java EE developers, but XPages's "don't call it Java EE" environment didn't introduce me to this aspect. As such, the fact that the documentation just kind of casually tossed around CDI terms and annotations threw me for a bit of a loop trying to determine what was required and what was just an idiom.

I eventually determined that I could use the reference implementation, Weld, without necessarily going whole-hog on Java-EE-everything. I'm a bit wary of what this bodes for whether I'll be able to use JNoSQL in Darwino on mobile devices, but I'll cross that bridge when I come to it. Once I got a bit of a handle on what Weld is and how to use it in unit tests (hint: make sure you have beans.xml files!), I was able to start writing my model objects and testing them.
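For reference, the marker file goes in META-INF on the test classpath (WEB-INF for a web app), and even a minimal one like this is enough for Weld to pick up the archive:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://xmlns.jcp.org/xml/ns/javaee" bean-discovery-mode="all"/>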

Doing It Again

The fact that the bulk of my implementation work ended up being on the app side with CDI goes to show that the Diana driver model really shines. It got me thinking about how difficult it would be in the future to, say, write a driver for Domino. There'd be some hurdles - Domino's lack of nested objects and antiquated querying mechanisms would need replacing - but the core task wouldn't be too bad. I don't know if I'd have a need for it, but it's nice to keep in mind as a potential future small project.

All in all, I'm optimistic about the use of this. I'd love for Darwino to integrate as smoothly as possible into whatever standard environments it can, and this is one more step in that direction. I'll know as my side app takes shape how much this ingrains itself into my actual work.

Reforming the Blog in Darwino, Part 1

Thu Sep 15 15:38:12 EDT 2016

Tags: darwino
  1. Reforming the Blog in Darwino, Part 1
  2. Cramming Rails Into A Maven Tree
  3. Reforming the Blog in Darwino, Part 2
  4. Reforming the Blog in Darwino, Part 3
  5. Reforming the Blog in Darwino, Part 4

This continues to be a very interesting time for Domino developers, with the consternation of MWLUG giving way to IBM's recent announcement about their plans for Domino. Like everyone, I have my feelings about the matter, but the upshot is that moving-forward tone still stands. With that in mind, let's get down to business, shall we?

I'm going to kick off my long-term blog series of moving my blog itself over to a Darwino+JEE application. I say "long-term" because it's a pure side project, and I still have a lot of decisions yet to make with it - for one, I haven't even decided on which toolkit I'll be using for the UI. However, since this will have the additional effect of being a demo for Darwino, I want to lay the groundwork early. And hey, that's one of the benefits - since the data will keep replicating, I can take whatever time I need on the new form while the old one chugs along.*

As I have time to work on it, I'll go over the steps I've taken and put the current state up on GitHub. These descriptions are meant to be somewhere in between an overview and a tutorial: I'll cover the specific steps I took, but I'm leaving out a lot of background info for now (like installing a PostgreSQL database, for example).

To start out with, I'll create the basic projects. A prototypical Darwino application consists of a two-tiered structure of Maven modules: a root container module and then several related projects. In my case, the tree looks like this:
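frostillicus-blog (root container module; the name here is approximated)
	frostillicus-blog-shared
	frostillicus-blog-webui
	frostillicus-blog-j2ee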

Since my plans are simple, this only has a few of the potential modules that could be created, but retains the separation-of-concerns structure. The purposes of these projects are, in order of importance:

frostillicus-blog-shared
This project holds the database definition and any business logic that should be shared across each UI for the application. Most of my Java code will go here (unless I pick a Java-heavy UI toolkit, I guess). Any model objects, servlet definitions, scheduled tasks, and so forth will go here.
frostillicus-blog-webui
This project holds the assets used by the web UI of the application, which could be shared by the JEE project and any "hybrid" web-based mobile UIs I make (which I won't for this). By default, this contains a skeletal Ionic-based UI.
frostillicus-blog-j2ee
This project holds the scaffolding for the JEE servlet app ("J2EE" still has a better ring to it than "JEE", though). Depending on where I go with the UI, this will either be a small shim just to get HTML served up or a larger server-side toolkit project.

In a different situation, there could be up to five additional modules: native and hybrid UIs for Android, the same pair for iOS, and an OSGi plugin shim for running on Domino. I may end up wanting to run this on my existing Domino server, but I don't have a need for offline mobile access just for my blog, so I'll definitely be skipping those. (Edit: I forgot one potential UI project: a SWT front-end for desktops)

For the most part, the default created classes do what I want them to do to get it started. There's really only one tweak to make: since I'll want to be able to search, I know up front that I want to enable FT searching in the Darwino DB. This sort of thing is done in the created AppDatabaseDef class. There's quite a bit that you can do there, but I'll mostly just uncomment the lines that enable full-text search in the loadDatabase method:

@Override
public _Database loadDatabase(String databaseName) throws JsonException {
	if(!StringUtil.equalsIgnoreCase(databaseName, DATABASE_NAME)) {
		return null;
	}
	_Database db = new _Database(DATABASE_NAME, "frostillic.us Blog", DATABASE_VERSION);

	db.setReplicationEnabled(true);
	
	db.setInstanceEnabled(false);
	
	{
		_Store _def = db.getStore(Database.STORE_DEFAULT);
		_def.setFtSearchEnabled(true);
		_FtSearch ft = (_FtSearch) _def.setFTSearch(new _FtSearch());
		ft.setFields("$"); //$NON-NLS-1$
	}

	return db;
}

That will ensure that, when the database is deployed (or, if I make changes later, upgraded), FT search is on. The line ft.setFields("$") uses a bit of JSONPath, basically saying "start at the root, and cover all fields". I'll probably come back to this class later to add some more optimizations, but that can wait until I'm sure how the structure of the app will take form.

The last step for now is to set up replication between Domino and this fledgling Darwino app. To do that, I'll set up an adapter definition in the Sync Admin database (the Domino-side application that manages Darwino replication):

The code on the page is the DSL used to define the mapping between Domino documents and Darwino's JSON docs. In this case, it's the code that is automatically generated by the "Generate From Database" tool, and everything after the first line isn't strictly necessary: without guidance, the replicator will try to translate the doc contents as best it can, and the data in this DB is pretty clean. It doesn't hurt to clamp it down a bit, though. I have it pointed to a non-replica copy of the DB for now, since I plan to do some destructive tinkering with the data when I actually make a UI that I don't want replicating back to "production" just yet, but I'll clear the Darwino side and replicate in the live data when I'm ready.
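Stripped down to only what's strictly necessary, such a script is nothing more than the pointer to the NSF (the path here is illustrative):

nsfName "blog.nsf"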

To enable replication in the Darwino app, I commented out the labeled block in the JEE project's web.xml and set a few properties in my Tomcat server's darwino.properties (to externalize sensitive information):

frostillicus_blog.sync-enabled=true
frostillicus_blog.sync-url=http://pelias-l.frostillic.us/darwino.sync

Once that was set, I launched the app using a Tomcat instance in Eclipse, and I could see it doing its thing:

Start deploying database frostillicus_blog, POSTGRESQL, 0/0
Finished deploying database frostillicus_blog, 0, 0/0
Start replication with server http://pelias-l.frostillic.us/darwino.sync
Started replication Pull frostillicus_blog, estimated entries: 5013 [September 15, 2016 2:03:37 PM EDT]
1 processed, 0% (total 5013, remaining time 1m55s, avg rate 1/s, instant rate 0/s)
664 processed, 13% (total 5013, remaining time 13s, avg rate 332/s, instant rate 332/s)
1313 processed, 26% (total 5013, remaining time 11s, avg rate 328/s, instant rate 324/s)
1941 processed, 38% (total 5013, remaining time 9s, avg rate 323/s, instant rate 314/s)
2518 processed, 50% (total 5013, remaining time 7s, avg rate 314/s, instant rate 288/s)
3086 processed, 61% (total 5013, remaining time 5s, avg rate 308/s, instant rate 284/s)
3597 processed, 71% (total 5013, remaining time 4s, avg rate 299/s, instant rate 255/s)
4069 processed, 81% (total 5013, remaining time 2s, avg rate 290/s, instant rate 236/s)
4489 processed, 89% (total 5013, remaining time 1s, avg rate 280/s, instant rate 210/s)
4929 processed, 98% (total 5013, remaining time 0, avg rate 273/s, instant rate 220/s)
5013 processed, 100% (total 5013, remaining time 0, avg rate 278/s, instant rate 84/s)
+++ Finished, 5013 processed (estimated 5013, time 18s, avg rate 278/s)

Once that was done, I went over to the default utilitarian web UI to make sure everything looked good, and it did:

(the advice about reverse proxies still stands, by the way)

Nothing app-specific in there yet (and I haven't hooked up authentication to Domino yet, so I don't have my Gravatar icon), but it shows that the data made the trip none the worse for wear, authors field and all.

That will do it for the initial phase. I plan to revisit this down the line, once I've made a decision on a UI framework and have the time to actually start implementing that. That will be an interesting one - there are strong reasons to make single-page applications in JavaScript, but my heart is still in server-side toolkits. That will be a choice for another day, though.


* If you'll forgive the blatant sales pitch.

Release Weekend: ODA and Darwino

Tue Aug 02 08:10:44 EDT 2016

Tags: oda darwino

This past weekend was a nice one for releases to a couple of the projects I work on: the OpenNTF Domino API and Darwino.

The ODA release is something of a "consolidation" release over 2.0.0: it fixes a few of the bugs that cropped up since then, adds some important lower-level tweaks, and brings the graph REST API into the mainline release. One note with the REST API is that, due to making use of a recently-added extension to the core code, it requires a recent release of the Extension Library, 9.0.1_17 or higher.

The Darwino release is something of a preparatory release, containing some significant improvements since the initial 1.0 earlier this year. A lot of the features will warrant a proper announcement post, but the ones I find most intriguing (or worked on the most) are significantly-improved Domino replication, a new scheduling framework, and some nice cloud-deployment improvements, such as Watson integration tools and proper support for Microsoft Azure. This also sets the ground for a number of features in the next major release, which exist in initial form now, but need some final polish.

Should I have time (it has continued to be a very busy summer), both of these will warrant some more discussion. But, in the meantime, give the new versions a shot!

Darwino for Domino: Domino-side Configuration

Mon May 16 10:51:47 EDT 2016

Tags: darwino
  1. An Overview of Darwino for Domino Types
  2. Darwino for Domino: Replication and Data Format
  3. Darwino for Domino: Domino-side Configuration
  4. Darwino for Domino: Conceptual Overlap and Distinctions

In my last post, I mentioned that a big part of the job of the Darwino-Domino replicator is converting the data one way or another to better suit the likely programming model Darwino-side or to clean up old data. The way this is done is via a configuration database on the Domino side (an XPages application), which allows you to specify Database Adapters that configure the translation. While it is possible to write these in Java, the primary way is to use a Groovy-based DSL script.

The simplest form of this script may be one line just defining where the NSF is:

nsfName "foo.nsf"

With that, the replicator will open foo.nsf and attempt to replicate all documents with "best fit" translations of each field it comes across. Things can get a little more complex, though:

form("SomeForm") {
	excludeField "foo"
	excludeFields "IgnoreMe"
	
	field "foo_(.*)", nameRegex: true
	arrayField "bar_(.*)", delimiter: "\$", zeroBased: true, initialIndexed: false, prefix: false, compact: true,
		userDataFormatName: "ODAChunk", nameRegex: true
	field ~"impliedfoo_(.*)"
}

form("ArrayTest") {
	arrayField "FirstName", type:TEXT, delimiter: "_"
	arrayField "LastName", type:TEXT, delimiter: "\$", zeroBased: true, initialIndexed: false, prefix: true, compact: true
	
	// Similar to the ODA style
	arrayField "UserData", type:USERDATA, delimiter: "\$", zeroBased: true, initialIndexed: false, prefix: false, compact: true,
		userDataFormatName: "ODAChunk",
		toDarwino: { value ->
			// The value will be an array of byte arrays, which should be strings
			println "testrep: called toDarwino!"
			value.collect { bytes ->
				new String((byte[])bytes, "UTF-8");
			}
		},
		toDomino: { value ->
			// We should provide an array of byte arrays
			println "testrep: called toDomino!"
			println "value is ${value}"
			def result = value.collect { string ->
				string.getBytes("UTF-8")
			}
			println "result is ${result}"
			return result
		}
}

form("RestrictedFields") {
	restrictToDefinedFields true
	field "KnownField"
	field "KnownField2"
}

The specifics of what's going on in this example (pulled from unit test data) aren't too important, but it demonstrates the customizability that an in-language DSL brings. Since the code is executed in Groovy, it has access to the full Java runtime (including, if you deliver it in a plugin, your own custom classes and dependencies), as well as Groovy's nice abilities like closures. In fact, a great many properties, like the converters above, can be specified as closures and executed in the context of each document as it's replicating, allowing for pretty fine-grained translation.

And personally, adapting Groovy into the process was an interesting exercise. Since Groovy was designed explicitly as a "scripting" variant of Java, the process of working it in to an existing Java code base is very smooth, and there aren't too many gotchas. I wrote some Java classes that provide the context for the root and individual "form" blocks, wired up the interpreter, and then they call each other seamlessly. Other languages could probably suit the job well too - JRuby, Rhino, etc. - but Groovy is mature, purpose-built, and largely syntax-compatible with Java itself, making it a very comfortable fit.

Darwino for Domino: Replication and Data Format

Wed May 11 14:35:50 EDT 2016

Tags: darwino
  1. An Overview of Darwino for Domino Types
  2. Darwino for Domino: Replication and Data Format
  3. Darwino for Domino: Domino-side Configuration
  4. Darwino for Domino: Conceptual Overlap and Distinctions

One of the key points of interest in Darwino for Domino developers is its two-way replication. Darwino's replication system was built in such a way that, in addition to its own internal needs, you can also write a replicator to connect to an entirely-unrelated system, as long as that replicator translates the foreign data to and from JSON documents. Domino is a perfect case for this, since the data model is already very similar, and its replicator ships with Darwino and has been a focus of my attention for a while.

Behind the scenes, this replication takes advantage of a lot of the same kinds of things that Domino's replication always has: UNIDs, sequence IDs, original-vs.-in-file modification/creation, and deletion stubs. These details are transparent to the developer: the Darwino adapter knows how to fetch the appropriate data from the NSF and to convince Domino that the Darwino DB is (almost) like a remote Domino server, at least as far as the stored data is concerned.

The primary differences show up in the way data is formatted for storage. Darwino uses JSON for its document format, which has a couple key advantages and disadvantages compared to NSF's "bag of items" approach. The best approach is probably to provide an example and highlight the pertinent differences. Say you have a Domino document that (conceptually) looks like this:

FirstName
	Type: TEXT
	Value: Foo
LastName
	Type: TEXT
	Value: Fooson
Username
	Type: TEXT
	Flags: NAMES READWRITERS
	Value: CN=Joe Schmoe/O=SomeOrg
Birthday
	Type: TIME
	Value: 1970/2/1
Vacations
	Type: TIME_RANGE
	Value: 2016/1/1-2016/1/5, 2016/3/3-2016/3/5
IsAdmin
	Type: TEXT
	Value: Y

On the Darwino side, that would potentially look like this:

{
	"_writers": {
		"username": ["cn=Joe Schmoe,o=SomeOrg"]
	},
	"firstname": "Foo",
	"lastname": "Fooson",
	"birthday": "1970-02-01",
	"vacations": [
		"2016-01-01/2016-01-05",
		"2016-03-03/2016-03-05"
	],
	"admin": true
}

There are certainly a few things to take note of here. First and foremost is the structure of the authors field. Because JSON doesn't have field metadata, the way Darwino does its readers/writers security is by using specially-named and -structured properties within the JSON, and so the converter moves all readers and writers fields into that. This has internal-implementation reasons, but I think it's also conceptually preferable to, say, having a multi-level object within each field to declare its flags separate from the value. The name also happens to be stored in LDAP style, because Darwino is more at home with standard LDAP conventions for that sort of thing.

Another thing to note is the format of the date fields. Since JSON doesn't have a real date/time type of its own, these values are converted according to ISO 8601 and stored as strings. That means that your Darwino application will need to know that those string values represent dates, but they're reasonable to deal with.

The multi-value date field leads to another important aspect: arrays. In Domino, most items are conceptually presented as arrays, regardless of whether they contain single or multiple values, leading to code that requires either explicitly asking for the first element or jumping through hoops to deal with single or multiple values. Since that's a drag to worry about with JSON, the default behavior for single-value items transferred to Darwino is to store them as "bare" values. When configuring the translation (which will be fodder for a future post), you are able to specify that you want a field to be always stored as an array, which will allow the Darwino-side code to be simpler.

The last field shows off an outright advantage of the JSON format: boolean storage. When defining the conversion, you can specify a field as boolean and provide what the true/false values will be, and they will be sent over to JSON as true and false explicitly. That's not a night-and-day change, but it is a nice help.
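In the adapter script, that sort of rule is a one-liner. Hypothetically - these parameter names are made up for illustration, not the real options - it would look something like:

form("Person") {
	// Map Domino's "Y"/"N" text values to a real JSON boolean
	field "IsAdmin", type:BOOLEAN, trueValue:"Y"
}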

Finally, there's the matter of rich text, which is unsurprisingly a nontrivial problem. This is handled in an XPage-alike way: MIME rich text is transferred with only minor adjustments, while Composite Data is converted to HTML and cleaned up a bit before transfer. Darwino supports the concept of attachments natively, and so Domino attachments are brought over with a naming prefix to match the field they're attached to, plus a delimiter to indicate whether they're normal attachments or embedded images. The way it is presented on the user end is dependent on the application, but Darwino has some routines to translate storage-safe inline image refs to app-relative URLs.

Later, I will go into the process of how the Domino-Darwino adapters are set up. The short of it is that you create scripts that can run the gamut from just telling the server where to find the NSF to customizing the translation of each field encountered. This allows you to either transfer the data back and forth in the default "best approximation" approach or use the opportunity to enforce a bit of a schema on old data.

An Overview of Darwino for Domino Types

Thu Apr 14 19:05:47 EDT 2016

Tags: darwino
  1. An Overview of Darwino for Domino Types
  2. Darwino for Domino: Replication and Data Format
  3. Darwino for Domino: Domino-side Configuration
  4. Darwino for Domino: Conceptual Overlap and Distinctions

So, Darwino! I've mentioned it quite a few times on Twitter and, particularly, in person, but I think it's high time I write some proper blog posts about it.

To start with, I'll cover what Darwino is. The short version is it's a Java-based development framework with a replicating document database. The interesting aspects go beyond that, though:

  • In addition to Java web servers, it targets mobile devices, both Android and, through RoboVM, iOS. Those devices store their own replicas of the databases for offline work in the same conceptual way as Notes, but with native (or hybrid web, if you're so inclined) mobile user interfaces.
  • The document database sits on top of SQL servers. Many modern SQL servers have native support for JSON data, and Darwino takes advantage of this to get document-DB flexibility with SQL features.
  • Business logic is shared between platforms. Because Java acts as a common language between each platform, and the document DB works the same way locally and remotely, the core business logic of the app can be identical across each targeted platform, with only the UI changing between them.
  • Along those lines, Darwino isn't prescriptive with the UI: it's not a front-end framework itself, instead providing the basis for using other front-end tools, such as Ionic, JSF, and Vaadin for web/hybrid UIs and the native OS toolkits on mobile.
  • The Darwino syncing protocol is designed to be adaptable to other services. This is immediately notable for Domino developers, but can also be (and has been, in some cases) adapted for arbitrary other back ends, like Connections social data or other databases.

How does this relate to Domino/XPages development? That depends on your desires, really.

In some ways, it doesn't. Darwino is its own platform, running on Java web servers like WebSphere and Tomcat, using independent SQL servers like DB2 and PostgreSQL. Darwino's replication between the server and mobile devices is similar to Domino's, but is its own thing. Similarly, the document model, though conceptually similar to Domino (including enhanced reader and author fields), is not NSF.

However, there are a number of reasons why it's of interest to a Domino developer, and the most immediate of those is its ability to do two-way replication with Domino databases. I'm a little biased on this point because of how much time I've spent working on it, but this replication is capable of some nifty tricks that make it adaptable, including transformation of the data, two-way maintenance of document time stamps, and so forth. With this syncing, it becomes very practical to extend your existing Domino app - be it a classic-style Notes/web app or an XPages one - with a Darwino-side UI that uses the same data, synced down to mobile devices for offline access. And this doesn't require migration; since the changes replicate back, the app can remain chugging away unchanged on the Domino side if desired. This also can be tremendously useful for reporting, by syncing the data over to a full SQL database that can be viewed and queried by normal tools.

And, really, it's also of interest to Domino developers personally by virtue of being a platform that has learned a lot of valuable lessons from Domino and extended them in new ways. Those years of accumulated document-database knowledge will carry over nicely, with some extra benefits if you're SQL-familiar too. Any Java knowledge will come in handy immediately, as Darwino is thoroughly Java-based on all target platforms. And, thanks to its pedigree, a lot of the platform support concepts are similar to aspects of XPages (the good parts). In general, the more XPages development you've done, the more it will benefit you (especially if you want to use JSF for the UI!). You could also, with some servlet-implementation limitations, run Darwino apps on Domino via OSGi, and I've been putting some side work into accessing Darwino databases from XPages directly - Darwino could make a solid basis for Domino-run apps.

So this turned into a bit of a sales pitch, but there's no getting around it - I find this thoroughly compelling and exciting. As I have time, I plan to expand on Darwino's various capabilities (along with the other various blog series I still plan to get to). For now, you can register for and download the Community Edition, read the documentation there, and/or track me down with any questions.

Connect 2016 and Darwino 1.0

Mon Feb 08 09:45:12 EST 2016

Last week was Connect 2016 and, while I don't have a full review of it, I felt that it was a pretty successful conference. The new venue was much less weird and more purpose-fitting than expected. Moreover, while the conference content wasn't bursting with announcements and in-depth technical dives like at something like WWDC, it did feel a bit more grounded and less marketing-hollow than the last two. So I'll call it a win.

On the OpenNTF front, the conference saw a bit more in the slow rollout of our improved processes and server infrastructure. Prominic has been graciously providing us with servers to run the Atlassian stack of development apps, and we're gradually putting these to use in building and tracking projects run through OpenNTF. Christian Güdemann talked about his company's move to more-structured development using tools like this, as well as an overview of where OpenNTF in particular is heading. Before too long, the OpenNTF Domino API will fall in line with this, switching to a Gitflow-style branch structure and a clean workflow between Stash, Bamboo, and Jira.

The conference also saw the 1.0 release of Darwino. If you're not familiar with it, Darwino is a platform for Java-based development that provides a document database (with improved takes on classic Domino features like hierarchies and reader/author security) and services that will run the same business logic on Java app servers, iOS, and Android. Of particular interest for Connect was the Domino connector, which does two-way replication with Domino databases. The free-for-non-commercial-use Community Edition is a good place to dive in, and we'll be putting out tutorial videos and posts in the weeks to come.

All in all, last week set the stage for the year to come, and I think it should be a very interesting year indeed.