Couchbase Followup and a Github Repository

Tue Jun 18 15:27:33 EDT 2013

Since my initial tinkering with Couchbase, I've been idly thinking about improvements to the data sources and potential uses I may have for it.

As for the latter, I think I may give a shot to implementing IKSG's new project-tracking system in it. I'll lose out on reader and author fields, but otherwise it'd be a pretty direct translation between NSF or Couchbase data stores.

For the former, I decided to drop my development code in a Github repository and start tracking my to-dos there:

https://github.com/jesse-gallagher/Couchbase-Data-for-XPages

The test database stored in that repository is the same as the one from my previous post with two main exceptions:

  1. I added a new parameter to XSPCouchbaseConnection to specify whether or not it should, when acting as a DataObject, treat the bucket as a document store. Since "documents" in Couchbase are just keys with JSON strings as their values, it's a tricky matter to detect intent... thus, I decided I may as well just make it a config option.
  2. I added a second example connection in faces-config.xml: clusterScope. Since the Java API automatically serializes and deserializes objects to store in the bucket, it makes for a perfect backing store for a clustered XPage app's persistent clusterScope.

I can't say how much time I'll have to dedicate to the improvements I want to make in the short term, but I think even the current form should be enough for me to build apps upon. I'm very interested to see if I can come up with tricks to use that are more difficult or are very inefficient in an NSF.

Couchbase Data Sources for XPages

Sun Jun 16 23:25:51 EDT 2013

The other day, I attended a Couchbase Developer Day in Philly. If you're not familiar with Couchbase, it's one of the young crop of NoSQL databases, and more specifically a document database and a conceptual cousin to Domino by way of Damien Katz. It's this similarity that makes it so interesting to me - Couchbase's still-living predecessor CouchDB started life as Notes built from the ground up for the web and the heritage is clear. Couchbase itself ends up being like if you melted down the cores of NSF/NIF and poured the result on top of a speedy key-value store.

That all makes Couchbase a perfect candidate for some XPage tinkering. Ignoring the foundational key-value aspects of Couchbase, the JSON document storage and views in 2.0 make for perfect analogues to the Domino data sources in XPages. To wit:

<xp:this.data>
    <cb:couchbaseView var="beersView" connectionName="couchbasePelias"
        designDoc="beer" viewName="by_name" />
    <cb:couchbaseDocument var="beer" connectionName="couchbasePelias"
        documentId="${param.documentId}"/>
</xp:this.data>

I've set up a demo app on one of my test servers (don't be surprised if that link is periodically unavailable). It demonstrates one of the views in the beer sample database that comes with Couchbase (it's like fakenames.nsf, but more intoxicating), with each row providing a link to edit the document in the DB, though editing is disabled for Anonymous users. The app also contains pages to show the applicable Java classes and support elements. The sources are nowhere near complete works - the view source lacks any knowledge of keys, grouping, etc., while the document source doesn't allow for the creation of new documents, nor is there any representation of non-document values.

Pretty much every concept you work with in Couchbase has a corresponding concept in Domino, but with a bit of slant, like a different dialect (for example, they call it a Royale with Cheese). A table may suit this explanation best:

 

Domino Couchbase Notes
Server Node + Cluster Couchbase has a strong focus on splitting up the job of housing databases into many servers for reliability and scalability reasons. Where Domino's clustering is RAID 1, Couchbase's is more like RAID 5.
Replication XDCR Unlike Domino's replication, Couchbase uses a set of rules to determine which of two conflicting documents lives and which dies unceremoniously.
Database Bucket The Couchbase name is a bit more evocative.
View Design Doc + View Unlike Domino1, Couchbase stores multiple views per design document, and it appears to be a sort of arbitrary art to decide how you want to split your views up among multiple design documents.

 

I'm tempted to find an excuse to build something using Couchbase as a back-end, just to see how its performance stacks up. It seems like my years of built-up NSF habits would serve me well, as many of the programming models and tradeoffs are similar, except that Couchbase views are probably consistently fast without having to cheat.

If I come back to the data sources, I'll probably flesh them out to support creating new documents, properly handle dynamically-computed properties, provide some control over view parameters (maybe including the geoboxing stuff built in), support Couchbase's document-save-conflict detection, and add a data source for basic key/value pairs. And if I do, that could provide a good excuse for me to package them up as a proper OSGi plugin, which I've so far avoided bothering with. We shall see.

1. This is a lie. The Couchbase way is actually very similar to Domino: Domino view design notes contain one or more collations, which correspond to the main index and any click-to-sort column directions. However, each of Couchbase's views inside its design document can produce wildly different data, not just resorts of the same selection, and so the fact that they're similar technically isn't very useful.

The "final" Keyword in Java

Sun Jun 16 14:45:24 EDT 2013

Yesterday, I suggested that everyone enable finalizing parameters in their code cleanups and, better, to do the same in their Java save actions; later, I set Eclipse loose on the org.openntf.domino API to implement it there. Adding final keywords to method parameters (and elsewhere where possible) is one of the habits I picked up from reading Effective Java a bit ago, but it's kind of a subtle thing and the benefits aren't immediately obvious.

If you're not at all familiar with the final keyword, it has a couple meanings in Java - in the context of variable and parameter declarations, it means that you can't reassign the value of the variable after it's been assigned the first time. So, for example, this is illegal and will generate a compile error:

final int i = 1;
i = 2;

In that example, it seems like a "why the heck would I do that?" sort of thing, but it's not too much of a stretch to picture a block of code involving something like this:

void doSomething(Document doc) {
	System.out.println("doc is " + doc.getUniversalID());
	
	// a bunch of code
	doc = database.getDocumentByID("FFFF0002");
	// a bunch more code
	
	System.out.println("doc is " + doc.getUniversalID());
}

That will WORK and it won't damage the original doc that was passed in, but it'll get confusing, particularly the more lines of code or eyeballs involved. Functioning or not, it's poor code style, and changing the method signature to doSomething(final Document doc) means that the compiler will enforce good style.

In addition to being good style, final has some technical benefits. First off, it might be faster. Secondly, if you ever use anonymous classes, final variables are a necessity for accessing information in the surrounding block. If you get into the habit of finalizing your variables, you'll already be ready by the time you have a need for anonymous classes.

final is a component of a larger and more important topic: immutability. I should probably write a lengthy post on immutability in the future, but for now there's an important thing to remember: final does not make your objects immutable. It only prevents assignment - it doesn't prevent calling methods on an object that change the object's value (in fact, Java lacks any language construct or cultural convention for immutability). So even if your Document variable is declared final, you (or any code you pass it to) can still call doc.replaceItemValue("Form", "Ha ha"); with impunity.

Development Improvements, June 2013

Tue Jun 04 13:05:38 EDT 2013

Tags: development

I'm always looking to improve my programming/development/administration methods, and in the past couple months I've made a few more changes. To my chagrin, these changes are mostly things that I should have been doing for a while now and I've been sloppy to do them differently, but hey - fixing a problem is still improvement!

Serious About Source Control

I've been using source control for a while now, and I hope everyone else has as well. However, I've been lax in a couple ways, which bit me a while ago. For example, I had a couple actively-used projects tied to source control in Designer - a few NSFs, a few NTFs. That was fine NORMALLY, but then this sequence of events happened:

  • I run Designer in a Parallels and have my repositories stored Mac-side, pointed to via a drive-letter mapping to my home folder
  • I Remote-Desktop'ed into my Windows VM from another machine
  • Remote Desktop remapped the drive letters to share local resources
  • Designer, which was running at the time, could no longer find the repository folders
  • Apparently, Designer/eGit/whatever decided that the rational thing to do was to "sync" the NSFs with the now-missing repositories... which is to say, delete all design elements from the NSFs

Imagine my surprise when a previously-fine DB I hadn't intentionally touched in a while was suddenly naked, consisting of a single view that somehow survived the purge. Fortunately, the nature of the problem meant that the design was still stored in the repository and thus recoverable, but that's still not a problem you want to have.

So I've been shifting my development process another few steps closer to the ideal. The way I see it, that ideal is:

  • Never open a production DB in Designer
  • Do all development in entirely-distinct NSFs created either from scratch (for a new project) or via the "Associate With New NSF" action from the source-controled on-disk project. These NSFs should be worked on by a single person and should either be in an individual-user dev folder on the server or, better, on a local development server
  • Create deployment "builds" as NTFs, either as design-only copies of a source-associated NSF or, again, from the "New NSF" action from an on-disk project (but it should then be disassociated immediately)
  • "Deploy" the "builds" using Replace Design on the production NSF/NTF from the Notes client or, better, Administrator
  • Ideally, the "build" NSFs should closely match a tag or branch in the source repository, but I can't be bothered to do that for my needs yet

That process goes the same for "classic" and XPages apps. Since source control became viable for Domino devs around the same time as XPages hit the scene, the two are often conceptually tied together, but in my experience the classic binary-DXL handling is plenty good enough to use it in projects without a line of Java or XSP.

Separate Application and Data

The standard way to develop a Notes app - classic or XPages - is to bundle the design and data together. This has historically had a lot of benefits and still makes sense in some cases (say, customizing individual mail files for different uses). However, particularly in the case of a standalone "app" (a help-desk application, a project-tracking system, etc.), this tight joining makes little sense and would be considered a bizarre aberration by anyone using most other systems.

Fortunately, as Domino developers, even if we split up the design and data, we still have it easier than most, since we don't have to worry about drivers and authentication credentials for the back-end DB (unless we want to). In the simplest case, which I've adopted, you have your "app" NSF and a "data" NSF next to it. The "app" NSF contains no forms or views, which the "data" NSF contains no XPages, Java, or other "executable" entities. I've derived a couple benefits from this already:

  • The app and data NSFs can have different ACLs
  • The data NSF can be set to disallow URL open. This means you don't have to worry about, say, clever users trying ?ReadViewEntries to peek into more than they should see
  • You can easily switch between "dev", "production", and "test" data NSFs, just like normal developers do. You don't have to worry about copying data around between different dev versions of your app
  • Multiple apps can point to the same data with impunity
  • The data NSF can exist on another server... or cluster of servers, which you could switch between programmatically
  • Data sharding becomes easier. Depending on the type of app, each user/project/department/whatever could have a different backing NSF fronted by the same single app

Admittedly, this makes some things moderately more complicated. Any Domino data sources will need to have a databaseName property specified, which will need to get its value from somewhere (worst case, you could hard-code it to start with, then "upgrade" to a managed bean). You also won't be able to use "app.nsf/view/key"-style URLs, since the app won't have any views to search by - you'll have to use query parameters or cobble together your own good-looking URLs with the assistance of Web Rules. You might not want to make your structure too document-focused anyway.

 

These changes have been bringing my development closer to classic desktop-app development, where these are less "techniques" and more requirements due to the nature of development, but they've proven to be valuable. Domino's ease of getting started and fast-and-loose development methods have distinct advantages, but it's always important to question which ones are better than other environments and which are traps.