How the ODP Compiler Works, Part 2

Mon Jul 01 11:36:57 EDT 2019

Tags: nsfodp xpages
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

In yesterday's post, I briefly touched on how the XPages runtime sees its environment by way of a FacesProject and related components. Today, I'd like to expand on that a bit, since it's useful to understand the various layers of what makes up an "XPages app" at compilation and runtimes.

Designer and Domino largely take two paths to try to arrive at the same location in how they view an NSF. The way Designer works is more complicated and opaque than Domino, with extra layers of VFS and an internal RPC mechanism(!) for editors, but there is at least some shared code from the XSP runtime. Beyond that, it does almost the same thing to determine the project's dependency classpath, while the internal NSF classpath is entirely distinct, using Eclipse's project structure to builds towards the different structure Domino will use.

Libraries

The notion of an XSP Library is one of the main parts of directly-shared code between the server and Designer. The way an XSP Library works is that you create a class that implements com.ibm.xsp.library.XspLibrary and then declare that as an IBM Commons extension contribution (more on that later) for the com.ibm.xsp.Library service type.

The fact that this is live code sitting in a plugin has a significant implication. Namely, anything that interprets it has to actually load the class and its dependencies. This is as opposed to just a static configuration file, which could be read without executing any custom code. For the server, the distinction doesn't matter too much, since you'll want to load all your class files anyway. For Designer, this is where we get the requirement to install libraries into Designer itself, rather than just adding plugins to the Target Platform. This is also an area that's a breeding ground for IDE bugs, since Designer needs the plugin available both internally and in the Target Platform, but they're not inherently tied together.

Though the XspLibrary implementation class is executable code, its main purpose is to point the runtime to various bits of static configuration information: the unique identifier for the library (e.g. com.ibm.xsp.extlibx.bazaar.library), lists of *.xsp-config and *-faces-config.xml files to define XSP and JSF contributions, and a list of other library IDs that this one depends on.

I believe that Designer and Domino use these bits of information slightly differently - I'm not sure that Domino cares too much about the *.xsp-config files, for example - but there's a lot of overlap here.

Configuration Files

The two main types of static configuration files used by libraries serve distinct purposes.

The *-faces-config.xml files (not required to be so named, but it's a good convention) are layered under the faces-config.xml file contained in your NSF. They define managed beans, converters, PhaseListeners, and other JSF-isms. These files come directly from the underlying JSF implementation and share the same syntax, at least until the JSF-1.2-era forking of XPages.

The *.xsp-config files look similar - they also use the <faces-config/> root element - but I believe that these are largely an XSP-specific detail. It looks like JSF 1.2 also uses the same <faces-config-extension/> tag, but to a different end - perhaps this evolution started the same way but then diverged there. In any event, these files are where Designer (and the XSP compilation process in general) looks for custom-defined components and their accessible properties. There's an interesting point to note there: though defined components are effectively beans with properties, Designer doesn't introspect the object to get its property names and types, but instead relies entirely on the definitions found in these files. It will still eventually use the component class when it goes to compile the translated XSP Java files, so they still need to be correct, but it's certainly a spot where it's easy to make a typo or mismatched property type.

I think that the latter files aren't used by the server, since their purpose is to provide the XSP source Java translator with mappings for components' XML elements to the Java classes. However, the core XPages runtime classes on the server still retain knowledge of this configuration, which is how the Bazaar and ODP Compiler do their thing. The com.ibm.xsp.registry package and sub-packages are filled with a mix of parser classes and in-memory representations, like com.ibm.xsp.registry.parse.ConfigParserImpl and com.ibm.xsp.registry.LibraryFragmentImpl.

Non-Library Contributions

Though not related to libraries, it's useful to know about a handful of XPages-specific class contributions that can come into play at runtime. These use the IBM Commons extension mechanism, like libraries themselves, but contribute to a good many different parts of the runtime and application flow. Some of these can be defined inside an NSF, while some are only recognized when defined in plugins - there's a good rundown of these on the ODA wiki. It's pretty rare to see these in the wild, but you may see an application here or there that uses these contributions, via in-NSF files like META-INF/services/com.ibm.xsp.core.events.ApplicationListener.

OSGi and Dependencies

In the early days, XPages was not OSGi-based. That came in in the 8.5.2 era (I believe - I wasn't aware enough in the 8.5.0/8.5.1 era to know the specifics) with the "extensibility API". For the most part, this lineage remains, and the XPages runtime itself isn't too dependent on OSGi, even when it comes to library contributions. Little bits have crept in here and there - the getPluginId() method in XspLibrary and the getOSGiBundle() method in ExtLibLoaderExtension, for example - but it's still largely incidental.

IBM Commons Extensions

If you've done both XPages plugin and Eclipse-the-IDE plugin development, you may have noticed that, while Eclipse plugins usually contribute to customized extension points with complicated schemas, XPages contributions all look like this:

	<extension point="com.ibm.commons.Extension">
		<service type="com.ibm.xsp.Library" class="com.example.SomeXPagesLibrary" />
	</extension>

There are still some places in Domino where you use different extension points, such as when you register a servlet with the Equinox OSGi runtime directly, but for the most part it's just this one point. This is because this extension point is designed to paper over the differences between OSGi extensions and the vanilla-Java-style ClassLoader#getResources mechanism. The type of the service you provide lines up with the META-INF/services/some.extension.type files you can use in your NSF and which still remain inside the embedded jars in the core XPages plugins.

The reason why this OSGi extension point exists is that OSGi intentionally creates separations between the individual plugins that make up your app runtime. In a "normal" web app, all of your dependency jars end up in the WEB-INF/lib sub-directory and are effectively all poured together to make a single class-loading environment. The ClassLoader#getResources route will look through all of the jars in the classpath for these META-INF/services files, but OSGi puts walls between them, and instead provides its own extension mechanism (among others, but this is the one Domino uses).

Dependency Resolution

Both Domino and Designer view the NSF like an OSGi plugin, but go about resolving the dependencies slightly differently. Fortunately, this is a case where the differences seldom crop up in practice - I've only seen some minor differences in how they honor the Export-Package directive in the bundle manifest and how fragment bundles are included.

When Designer is building an XPages app, it references the xsp.properties file to determine which XSP Libraries to include, and then uses their getPluginId() method to determine which OSGi bundle that matches up to (I think). It adds that plugin to the list of dependencies in plugin.xml and (since 9.0.1 FP10) META-INF/MANIFEST.MF. The Eclipse side of Designer then uses that to compose the Plug-in Dependencies list from those bundle IDs and any of their dependencies that are marked as re-exported. I think that Domino only cares about the generated plugin.xml/MANIFEST.MF files - I don't think that it does the resolution based on the library class, though I might be wrong about that.

ODPCompiler's Version

Currently, the ODP Compiler hews closer to the "Domino-style" route. For resolving the active class path, it trusts that the plugin.xml that exists in the ODP is correct and resolves dependencies from there. In the future, it may make sense to have the compiler generate the plugin.xml file itself, in which case it will also have to resolve the plugins based on the library classes. That wouldn't be too difficult, but for now it relies on the exported ODP.

Layer Cake

Looking at the whole XPages architecture, something that strikes me is how much it's simultaneously a giant stack of parts - config parsers, resolvers, runtime bootstrappers, and so forth - but also a pretty straightforward server-side web stack from a Java EE perspective. I've been diving deep into XPages in various ways for a long time now - building complex apps, writing library plugins, and even yanking the runtime out of Domino - yet writing the compiler led to this whole distinct set of capabilities. But a lot of this is essentially "just" ahead-of-time work, with Designer and the ODP Compiler's jobs being a lot of world-resolution followed by placing compiled pieces into the right places in the NSF.

By the time it gets to the NSF, it actually ends up as a pretty normal-style web app - XPages are just Java classes floating around, non-OSGi dependency jars are in WEB-INF/lib, and the faces-config.xml controls rendering in the same way as in JSF. A lot of that, though, will come up in later posts, where I go into the gotchas involved in taking these compilation results and other ODP resources and actually getting them into an NSF.

How the ODP Compiler Works, Part 1

Sun Jun 30 13:54:40 EDT 2019

Tags: nsfodp xpages
  1. Next Project: ODP Compiler
  2. NSF ODP Tooling 1.0
  3. NSF ODP Tooling Example Project
  4. NSF ODP Tooling 1.2
  5. How the ODP Compiler Works, Part 1
  6. How the ODP Compiler Works, Part 2
  7. How the ODP Compiler Works, Part 3
  8. How the ODP Compiler Works, Part 4
  9. How the ODP Compiler Works, Part 5
  10. How the ODP Compiler Works, Part 6
  11. How the ODP Compiler Works, Part 7

A year ago, I started a project to compile NSFs - and particularly large XPages projects - independently of Designer. Since the initial releases, the project has grown, gaining an ODP exporter, extra Eclipse UI integration, and the ability to run without installing components on a remote server. It's become an integral part of my workflow with several projects, where I include the NSFs as part of a large Maven build alongside OSGi plugins and a final distribution.

Building this tooling required learning a lot about the internals of XPages, the specifics of how various design elements are stored and handled in an NSF, and miscellaneous bits about Equinox and Maven. Since there's a good amount of arcane knowledge embedded in the project, I think it'll be helpful to take some time to dive deep into what's going on, starting with XPages.

XSP to Java to Bytecode

The first challenge for me to overcome was how to go from XPages XML source to the Java class files (like those seen in the "Local" source folder in Designer) and finally to compiled Java bytecode. Much like in Designer's process, the middle part is just incidental: only the XSP source and the bytecode are actually stored in the NSF.

The official XSP -> Java compiler exists only in Designer, and so one route would be to try to get those plugins working on Domino. I think that'd work, but it'd be a huge hassle and, fortunately, an unnecessary one. The XPages Bazaar project contains essentially a clean re-implementation of the glue code required to coax the runtime into emitting what I needed. I used the Bazaar as an incubator for the early versions of the compiler, and tweaked the core code with additions and fixes to work with various needs I ran into.

OSGi Bundles

On its own, the Bazaar did the job well of taking an XPage and compiling it with whatever the surrounding environment had. However, to compile a full XPages app as part of a Maven build, I'd need the ability to dynamically load XPages libraries and dependencies on the fly.

To do this, I added the option to include an update site directory, and then I have the compiler stream through all the plugins and initialize them. Fortunately, the OSGi environment makes this pretty easy. The BundleContext object that's available from each OSGi bundle has an installBundle method you can use by pointing at a bundle file URL, and then you can call start() on that result to actually initialize it. I had to do a little extra work to account for bundles that shouldn't be started (source-only bundles and "fragments", which are additions onto normal plugins) and the like, but it's not too complicated.

XSP Libraries

Just installing the OSGi plugins isn't enough to get the XPages runtime to know about any libraries that may be included, however. This is done by finding all of the library extension contributors, sorting them for compatibility, wrapping them in a com.ibm.xsp.library.LibraryWrapper for some reason, wrapping that in a com.ibm.xsp.registry.config.SimpleRegistryProvider, and then adding the results of that to an in-memory com.ibm.xsp.registry.FacesSharableRegistry instance.

This is one of those cases where the actual code involved in the end isn't terribly long, but the amount of delving into the framework to figure out the needed parts was immense. In essence, each XPages application is represented as a FacesProject implementation object, which retains a registry of the libraries it knows about. In a normal running server, this happens automatically during initialization: the runtime opens the NSF, figures out which libraries it needs, finds them from the ones it knows about, and constructs the app's contained world. For the compiler, I ended up having to do something of an ad-hoc version of this as it goes, finding all the parts that need to be kicked to notice the available libraries and have them around to map something like <abc:someCustomComponent/> to an instance of com.abc.xsp.CustomComponent.

Custom Controls

Custom controls are a similar story, but they use a specialized variant of the "real" library system called LibraryFragment. The way these work, before the actual XSP -> Java compilation step, the compiler reads their definitions and adds them to the in-memory FacesProject. It's important to define them all in this way before actually trying to interpret their source (or any XPages), because this allows for the interpreter to understand a reference to one CC from another. Fortunately, the process is separated enough that the definitions can all happen before any of the Java classes actually exist. Otherwise, I would have had to either try to make a dependency graph of which CC references which other, or just keep trying to compile them repeatedly until it got the right order by brute force. The latter process will actually show up in a future blog post.

Java Source Files

Compiling individual Java source files - both those that show up in the Code/Java part of an NSF (or a custom source folder) as well as the translated XSP source - is handled by the Bazaar, in a class called JavaSourceClassLoader. This class wraps around the in-memory Java compilation capabilities that were added to the JDK in Java 1.6. In particular, this class, paired with SourceFileManager, provides knowledge to resolve classes from OSGi bundles in the active environment, including specialized knowledge of dependency management based on re-exported dependencies, embedded jars, and other OSGi-isms that play a big part in XPages libraries.

In essence, these classes provide a similar compilation environment to what Designer does when it creates the Eclipse project for compiling an NSF. They build up an environment with knowledge of all the dependencies that, in Designer, show up in "Plug-in Dependencies", and then the compiler feeds it all of the Java source files in the project. With all of this environment provided, the underlying Java compiler is able to do its job of converting them to bytecode en masse, which is then passed back to the compiler for insertion into the NSF.

Other Big Components

In the next blogs posts in this series, I'll go over some of the other big hurdles I had to overcome to get everything working properly: namely, figuring out all of the specialized behavior necessary to flag imported/created design notes properly and then the arcane incantations necessary to get this OSGi environment working outside of the Domino server.

Seeing the Familiar in SwiftUI and Combine

Tue Jun 11 14:17:10 EDT 2019

Tags: ios

Apple announced quite a bit at WWDC last week, and some of the stealth favorites for programmers have turned out to be SwiftUI and Combine. My macOS and iOS development is incidental at most, but I like to pay attention to this stuff, and I think that these frameworks in particular are notable for how they reflect attributes of other languages and frameworks that I do use.

SwiftUI

Before beginning, I should point out that the best source of information about SwiftUI at the moment is Apple's WWDC video archive, in particular "Introducing SwiftUI" and "SwiftUI Essentials".

SwiftUI is generally summed up as a "declarative UI framework", with "declarative" here primarily setting it apart from imperatively-defined UI. Truth be told, a mix of both has been common in both Cocoa development and elsewhere for a very long time - the definition of UI layout in .nib files comes from the old NeXT days, for example. The things that set this apart are some technical improvements and a switch of focus from your program creating and managing a UI to instead your program defining what it wants the UI to be and letting the framework handle it.

Declaring the UI

The best way to think of the way this works is that, instead of actively calling methods and setting properties on a window, some buttons, etc. on the screen, what your code does in SwiftUI is instead create a work order for what it wants the current state of the UI to be and hands that to the framework to figure out what needs to happen to get there. That "what needs to happen" can range from initially creating the layout to figuring the right way to find the difference between the new states, removing old elements, adding new ones, and animating transitions between them.

The most immediate analogue for this in common use today is React, which has essentially the same model. In (presumably) both SwiftUI and React, your job as a programmer is to have a method on a component object that is called frequently to emit what it thinks its contents should be right now. So if you have, for example, an array of objects that's displayed as an unordered list, you'll have a render() method that looks like:

render() {
    return (
      <ul>
        {this.props.items.map(item => (
          <li key={item.id}>{item.text}</li>
        ))}
      </ul>
    );
  }

In SwiftUI, a cut-down version would look like:

var body: some View {
  List(model.items) {
    Text(item.text)
  }
}

And, as I mentioned, the core concepts aren't new. In core XPages, we'd write something very similar:

<xp:repeat value="#{items}" var="item">
	<xp:this.facets>
    	<xp:text xp:key="header" contentType="HTML" value="&lt;ul&gt;"/>
    	<xp:text xp:key="footer" contentType="HTML" value="&lt;/ul&gt;"/>
  </xp:this.facets>
  
  <li><xp:text value="#{item.text}"/></li>
</xp:repeat>

However, the last one differs in a number of ways, not the least of which is that the UI-generation part doesn't retain a concept of shifting state. You, the programmer, may know that the items list may grow or shrink by a value, but, in the XPages model, the browser just starts with one block of HTML and replaces it with another on a refresh. Internally on both server and browser, I'm sure there's some retained memory for efficiency's sake, but a removed list item won't, for example, know to gracefully fade out and let the remaining ones shift into its place.

SwiftUI drives this distinction home by preferring UI component definitions to be "structs". Java doesn't currently have an analogue to structs, but they come from C and they're effectively a "pure data" type. The distinction between classes and structs gets a little murky in Swift, but the core idea is that structs are meant to show pure data in a given state and are copied when passed around. That relates here because your SwiftUI component's job is not to be the UI, but rather to emit in-memory blueprints for what it wants given the current state of the data. When data changes, the framework asks the component for a new representation, compares the old one and new one in memory, and then makes changes to the UI representation as appropriate.

It's kind of an odd distinction to write out, but I found that there was a point when I was learning React when it "clicked" in my head.

Separation of UI Definition and Output

Beyond the simplicity of declaring the UI, one of the big things that the SwiftUI presentations drive home is how one UI definition can be used across all of Apple's platforms. A list is a list is a list, and the fact that it looks one way on a watch and another way on a TV isn't inherently important.

Seeing this made me feel pretty good, since it reminded me of a post a wrote years ago that dealt with the component/renderer split in XPages at a conceptual level. Though renderers in XPages don't go so far as spitting out an AppKit application at runtime, the concept of an abstractly-defined UI that is then rendered differently based on the target is very important.

What remains to be seen with SwiftUI is how clean this distinction can remain. In my XPage example, I used some of the ExtLib's semantic form components, but real XPage applications are almost always mucked up with inline CSS classes and styles, client and server JavaScript, meaningless HTML tags like <div>, and so forth. It's one thing to say "oh, this just means a checkbox on macOS and a toggle on iOS", but another to handle a complex, crafted user interface that may be radically different in different contexts.

DSLs

SwiftUI is written in the full Swift language, but it's best described as a domain-specific language written in Swift. The term "DSL" originally meant a whole language dedicated to a small task, but nowadays usually refers instead to a style of writing an API that, when paired with a general-purpose language, acts like it's a language dedicated to the task. This is usually a feature of "scripting"-type languages, where the syntax is clean and flexible enough to not interfere.

In the Java world, the go-to language for this has long been Groovy. Darwino uses Groovy for the database adapter DSL, and SmartNSF does as well for defining services. The most common use nowadays is probably Gradle, the Maven-competing build system. It uses Groovy's closures and parentheses-less method calls to make a build script that looks more like a configuration file than a script:

plugins {
    id 'java'
    id 'application'
}

repositories {
    jcenter() 
}

dependencies {
    implementation 'com.google.guava:guava:26.0-jre' 

    testImplementation 'junit:junit:4.12' 
}

mainClassName = 'demo.App' 

The key reason why these DSLs are useful (other than not having to write a new parser) is that you have the full abilities of the underlying language at your disposal. In SwiftUI, you can do your "hide-when" logic by using the normal old if structure, rather than having a specialized "when should this show up?" property. That also means that you can bring in whatever other logic you have, third-party libraries, and so forth, without having to have specialized support in the API.

Combine

Combine, in addition to having an ominous name, is a less-flashy addition than SwiftUI, but is nonetheless important and also shows the integration of growing themes in programming elsewhere, specifically reactive programming. It's one of those topics where you can very easily fall off a conceptual cliff, and I found even Apple's introductory session to make it sound more daunting than it is by focusing on the specific Swift protocols in practice.

I've found that the best way to learn something like this is to set aside the "asynchronous" aspect at first to focus on the "data flow" part. For this, we're in luck, since this is what Java 8 streams are. Streams are all about starting with some source of data - often just a List implementation, but it's intentionally arbitrary - and changing, filtering, sorting, and otherwise manipulating it to get a result. So a prototypical example from Java can be:

Stream.of(10, 1, -3, 5)
	.filter(i -> i > 0)                  // [10, 1, 5]
	.map(i -> i * 2)                     // [20, 2, 10]
	.sorted()                            // [2, 10, 20]
	.map(String::valueOf)                // ["2", "10", "20"]
	.collect(Collectors.joining(", "));  // "2, 10, 20"

Most of the time in practice, the actual implementation will pretty much match the sequential reading of this path and so will be roughly similar to if you wrote it out with for loops, if statements, and so forth. However, because your code is describing what you want done rather than how to do it, there's room here for short-circuits, multithreading, and optimizations for different collection types.

If you look at a snippet of an example from Combine, you can see near-identical syntax in Swift, with the same idiomatic indentation:

let p4 = p1
	.merge(with: p2)
	.append(5) // add 5 to the end of the sequence
	.allSatisfy { $0 >= 1 } // check if all values are bigger than 0
	.count() // how many values: 1

Importantly, the same style of syntax applies when the incoming data is asynchronous and when the final output is similarly out-of-band. This is what all of Apple's Combine examples start with, which is why I think it can seem a bit daunting. But I think the only real switch to make in your head is to slot in the term "Publisher" for the starting provider of your data (originally an array here, but it could be a keyboard or network resource) and "Subscriber" for the code that deals with it.

Admittedly, things can get more complicated there when you need to add in error handling, value binding, and other practical considerations, but the simplicity of the core concept remains.

Overall

For Domino needs specifically and web development generally, SwiftUI and Combine don't themselves matter much. Still, I think it's useful to take times like this to see larger trends and, when you can, bask in the satisfaction of having seen the concepts elsewhere.

My Slides From Engage 2019 - De04. Java With Domino After XPages

Mon May 20 09:24:03 EDT 2019

Tags: xpages engage

Engage 2019 has come and gone, and I had an excellent time. I also quite enjoyed presenting my "group therapy" session on some options that XPages developers have for the future. In a lot of ways, it was similar to my presentation at CollabSphere last year, mixed with the various new developments I've talked about on here since then:

Engage 2019 - De04. Java with Domino After XPages

Engage 2019

Mon May 06 11:10:41 EDT 2019

Tags: engage

Engage is just around the corner in Brussels, and I'll be joining everyone there, presentation in hand. Specifically:

Dev04. Java With Domino After XPages
Tuesday, May 14 at 16:00 in room E. Mahy

XPages guided Domino web development out of a world of archaic proprietary hacks and into the realm of Java server development and something approximating Java EE. Now that the XPages framework is moribund, the question is: what's next? If you've built up Java skills over the years, you have a direct path to use them in the new modern world, whether via OSGi plugins and new XPages extensions on Domino or standalone Java web projects. This session will discuss the lessons we learned from XPages, how they correspond to newer technologies, and how to bring your existing apps and plugins forward.

If you attended my session CollabSphere last year (by the way, you should go to CollabSphere this year too), you may note that the title is pretty similar to that one, albeit changing "In Domino" to "With Domino". This massive change in preposition reflects how things have only gotten more awkward for us in the intervening year.

So, if you're attending Engage, join me as we work through our sorrows together!

 

Java Grab Bag 2

Fri May 03 15:38:00 EDT 2019

Tags: java
  1. Java Hiccups
  2. Bitwise Operators
  3. Java Grab Bag 2
  4. Java Travelogue: The Care and Feeding of Locales
  5. More Notes on Filesystem and Charset Portability

Following in the vein of "Java Hiccups", I've had a couple things floating around my head lately that I think collectively make for a good post for Java developers, particularly those working in the Domino arena.

Without further ado:

Map#computeIfAbsent

This is a method that was added in Java 8 and, while it's not as big a deal as the addition of streams, it's one of my favorite additions and something I use very frequently. To give a point of reference, consider this common idiom from an imagined XPages app:

Map<String, Object> applicationScope = ExtLibUtil.getApplicationScope();
if(!applicationScope.containsKey("someVal")) {
  applicationScope.put("someVal", someExpensiveOperation());
}
String someVal = (String)applicationScope.get("someVal");

Essentially, using a Map as a cache for a complex computed value. Java 8 added the #computeIfAbsent method (alongside several similar ones) to do this in one go:

String someVal = (String)ExtLibUtil.getApplicationScope().computeIfAbsent("someVal", key -> someExpensiveOperation());

The second parameter is (usually) a lambda, like the ones used in streams, that takes the provided key as an argument and is only executed if the value does not already exist. Due to the way this was added, most implementations do pretty much the same thing as the first block of code, but you don't have to care about that. Your code gets a bit smaller, the intent is much clearer, and it's less prone to small bugs like changing the key and forgetting to change it in all three places.

Arrays Are Weird

Java's built-in array type is loosely based on C's, and that's reflected in the syntax:

int[] foo = new int[4];
foo[0] = 1;
foo[1] = 2;
foo[2] = 3;
foo[3] = 4;

Like C, they are zero-based, declared with the capacity and not the max index, and cannot be resized. Unlike C, arrays aren't just syntactical sugar on top of pointers, and this manifests immediately in bounds checking. Take this line:

foo[4] = 10;

In C, this will (famously) just write an integer 10 value into whatever memory happens to be just beyond the bounds of your array. In Java, you'll get an ArrayIndexOutOfBoundsException, saving you from the insidious bug. But, since Java arrays are (probably) implemented internally very similar to in C - they're likely contiguous blocks of memory sized to the type - they're still extremely efficient, and so they show up in a lot of speed-critical code.

As speedy and safe as they are, though, they're still pretty unfriendly. For starters, they can't be resized. When you do new int[4] (or use literal syntax like new int[] { 1, 2, 3, 4 }), you carve out that much memory and can't shrink or expand it in-place. You can change the values inside an array, just not its size. To "resize" efficiently, you have to make a new array and then use System.arraycopy to populate the new array with the contents of the old.

This is all why the List interface (with its predecessor class Vector) exists: they serve the same function of "ordered collection of stuff", but allow for dynamic resizing. Because these objects are usually "efficient enough" (ArrayList uses "true" arrays under the covers) while having numerous additional benefits, you should use them as your first go-to and only use arrays if you have a reason.

That's in part because the weirdness of arrays doesn't end with their inconvenience. Array types are actually implicitly-created classes, even when they contain primitive types. So:

int foo = 3; // primitive value
foo = null; // syntax error!
int[] bar = new int[] { 3 }; // Object
bar = null; // legal!

List<int> fooList = new ArrayList<>(); // syntax error - primitives can't be in generics List<int[]> barList = new ArrayList<>(); // legal!

Object.class == Object[].class; // false int[].class.isArray(); // not only legal, but true

Java provides two main utility classes for working with arrays: java.util.Arrays (extremely useful for Arrays.asList) and java.lang.reflect.Array (usually only useful in edge cases).

Java Has No Library Versioning System

If you've worked with Java in Designer or Eclipse, you've likely run across this preferences pane or its per-project version:

Eclipse Java compiler settings

These settings affect two things:

  • The syntax allowed in your source (e.g. new ArrayList<>() requires 1.7 or higher)
  • The class file format version (you can think of this like an NSF ODS). You've likely seen the latter in play by receiving an UnsupportedClassVersionError trying to run Java 7 or 8 code on a pre-9.0.1FP8 Domino server.

Conspicuously absent from this short list is anything to do with classes or methods added to the runtime in newer versions. For example, the String class gained a static method String.join to conveniently concatenate strings with a given delimiter. If you're targeting an older Java version but using a newer Java library (as Designer 9.0.1FP10 does with a default target of 1.5 and JVM of 1.8), you can write a line of code using that method without issue - the syntax doesn't require anything above 1.5, so all is clear as far as the compiler is concerned. But if you then try to run that code on an older JVM (such as an older Domino server), you'll get an exception at runtime, since the method doesn't exist.

Unfortunately, the only true answer to this is to tell your IDE about a JRE for each specific Java version you're targeting, something that doesn't happen by default, and which will gradually get more difficult as Java 6 becomes harder and harder to come across.

This is one of the things that OSGi aims to fix - you could, for example, have many versions of Guava installed, and you could declare that your plugin works specifically with version 18. Then, when loading, the runtime will either bind to a matching version or give you an error that no version could be resolved. No mystery involved. Unfortunately, OSGi is a niche thing losing ground, and the module system introduced in Java 9 consciously does not address this.

In a pinch, you can use the file tool on most Unix systems to check the version of an individual class file:

$ file Foo.class
Foo.class: compiled Java class data, version 52.0 (Java 1.8)

Licensing

A little while ago, Oracle raised a bit of a stink by declaring that, as of this year, commercial use of their Java runtimes would require paid licensing. Historically, you could get support for Java for money, and certain additional components had their own licensing requirements, but it was pretty normal otherwise to install Java from java.sun.com and not give it a second thought.

This naturally caused a few questions when it comes to Domino and other Java-incorporating IBM products, and IBM released a statement that basically amounted to "you don't have to worry about it". IBM has maintained their own variant of the JVM (called J9 for Smalltalk-related reasons, not to be confused with Java 9) and anyway has always had arrangements with Sun/Oracle such that IBM's customers don't have to worry about dealing with Oracle directly.

But what about using Java outside of a licensed product, such as if you just want to run Tomcat on some server? The short answer there is that you're still fine, but you just have to know a little about the difference between "a JDK" and "Oracle's JDK". Java and the surrounding JDK have been progressively open-sourced in fits and spurts over the years, and are now at a point where the project called OpenJDK is basically the real Java environment, and then Oracle's JDK is just one implementation of it. It's similar to Linux: the core parts are open-source, and many distributions are entirely free, but there also exist commercial variants for pay.

So, if you want to run a Java stack, you can do so without putting forth a single cent by using an OpenJDK build. Oracle, IBM, and others will still be happy to take your money if you want a commercially-supported Java environment, of course.

For a longer explanation, this blog post from @javachampions is pretty much the definitive word.

 

XPages on Android

Mon Apr 15 16:30:50 EDT 2019

  1. Letting Madness Take Hold: XPages Outside Domino
  2. XPages on Android
  3. Concept Proven: A Complex Production XPages App Running Outside Domino

Around the start of the year, I had a bit of a dalliance with the idea of running XPages outside Domino. The upshot of that project is that it is indeed possible to do so, but there'd be some work to do to make it practical.

This month, we revisited the idea with a healthy dose of Darwino to provide some undergirding technology, with the goal of being able to run a plain old XPages app on mobile devices backed by Darwino DBs and replicating back to Domino. There was a lot of fiddling involved, but it works:

Android is the natural first target, but iOS is about 70% there, with the scaffolding loading up to the point where it loads a page but currently with a bit of trouble when it comes to resolving data classes and executing renderers.

What Specifically Is Going On?

We set out a few required parameters to call it a successful proof-of-concept:

  • It has to use the actual XPages framework - that is to say, the jar files shipped with Notes/Domino
  • The XPages themselves have to be shared among Domino and the Darwino app, in the form of their Java "intermediate" source (compiling from .xsp source is possible but is a whole other thing)
  • It has to use xp:dominoView and xp:dominoDocument data sources
  • It has to load the Extension Library
  • It has to use ancillary elements from the app: managed beans, themes, CSS, images, SSJS libraries

This checks all of those boxes, and it's pretty satisfying to see in action.

The Stumbling Blocks

As I discussed in my post about the original project, there are aspects of XPages that are meant to make this sort of thing possible, with the core parts abstracting out different platforms and runtime environments. Over the years, though, assumptions about Domino and OSGi crept in, with newer additions and the Extension Library taking a bit less care to be environment-neutral.

Moreover, XPages is not open source and I don't have any particular special access to it, making this whole thing essentially, in the video-game sense, hard mode. There are a handful of classes that needed to be outright swapped out, like the annoyingly-final NotesContext class, but there was much less of that than I'd thought.

After that, getting the data to point to Darwino was pleasantly straightforward. Other than a handful of areas where the NAPI comes in, the Domino data sources largely adhere to the rules of the lotus.domino interfaces and don't make assumptions about the implementing classes (this is essentially how ODA shims in as well).

What This Could Be Useful For

With the right fleshing out, this could be a real way to run existing XPages apps on mobile devices. That could be pretty useful on its own, but I don't think that carrying forward XPages apps as-is is the right idea. The side effect of this, though, is that you have a functioning XPages app in a normal old .war file project, structured with Maven or Gradle, and ready to be molded into newer frameworks using whatever tooling you'd like. No Designer, no OSGi, no Servlet 2.4/2.5, just a clean basis running your existing logic and ready to be improved.

If the mountain of existing XPages code is going to have a future, I think it should be something like this.

Bitwise Operators

Sat Apr 06 16:11:00 EDT 2019

Tags: javaee
  1. Java Hiccups
  2. Bitwise Operators
  3. Java Grab Bag 2
  4. Java Travelogue: The Care and Feeding of Locales
  5. More Notes on Filesystem and Charset Portability

In the "Java Hiccups" post, I briefly mentioned this little tidbit:

(short)Integer.MAX_VALUE == -1   // true

This fact betrays a lot about how numbers are stored internally in most languages. Some of those details, like the fact that the specific method is called two's complement, are fully in the range of "computer science" stuff - but it can be handy to know about a couple ways you can put this sort of thing to use.

For our purposes, let's work with the short data type in Java, because it's painfully important to Notes. "Short" is called such because it's smaller than a standard integer type in a given language. In some languages, like C, the length of core data types like this is sometimes tied to the bitness of the processor, but in Java they're consistent across everything. Specifically, short in Java is 16 bits long. Let's take the number 21,038, which is represented in binary as:

0101 0010 0010 1110

That's two bytes, and each byte is broken up into two groups of four bits each because it turns out to be helpful visually and because each "nibble" there can be matched to a single hexadecimal character (522E in this case). This is the most common way you'll see binary written when you start getting into this sort of thing.

So I mentioned casting in the previous post, and what casting does with primitive integer types like this is to chop off the bits that don't fit. If you cast the above value into a byte, it'll chop off all but the ending 8 bits, giving you:

0010 1110

Which is 46. If you cast it to a larger type, such as int (32 bits in Java), it just adds some zeros to the front:

0000 0000 0000 0000 0101 0010 0010 1110

Which is still 21,038, but now it takes up twice as much memory.

For our uses, this stuff matters immediately in two ways: it represents the limits of some data storage and it allows us to manipulate numbers with bitwise operators.

Storage Limits

If you max out the bits in a 16-bit integer, you get 1111 1111 1111 1111, and this value is either -1 or 65,535 (64k) depending on whether or not the value is considered "signed" by the code using it. When it's "unsigned", a 16-bit integer can represent between 0 and 65,535; when it's "signed", its range is -32,768 to 32,767 (32k). Java only has "signed" types, which is why Short.MAX_VALUE is 32k and not 64k. C, on the other hand, lets you choose in your code which type you want to use.

The Notes C API uses a couple type names to represent consistent sizes across different platforms, and the one that's important here is WORD: a 16-bit unsigned integer. This is used in, for example, the function used to set the value of a text item in a note:

NSFItemSetText(
  NOTEHANDLE  hNote,
  const char far *ItemName,
  const char far *ItemText,
  WORD  TextLength)

That WORD at the end there is where you tell the API how much text you're providing, and is thus capped at 64k - and is why you can store 60k of text in a non-summary text field but not 70k. As to why summary data is limited to 32k and not 64k, I'm guessing that either there's a signed value in there somewhere or that last bit was needed for something else.

If you look over the list of Notes limits, you can see this reflected all over the place, along with a few 8-bit (0-255) limits for good measure.

Bitwise Operators (for real this time)

The term "bitwise" refers to a handful of operators that deal directly on the bits of a number. Java takes a cue from C here and provides &, |, ^, ~, <<, >>, and >>>. These all have their uses, primarily for really-low-level stuff and efficiency, but we mainly care about & and |. These two may have bitten you in the past because of their similarity to && and || - and in particular because Formula Language uses the single-character versions to mean what Java means by the double-character ones. They also kind of work the same way, but aren't really the same.

To see how they work, let's take our original starting number, 21,038, and pair it up with another value, 5,000:

0101 0010 0010 1110
0001 0011 1000 1000

Visualizing the numbers in binary and stacked like this comes in extremely handy when it comes to dealing with bitwise operators. We'll start with &, or "bitwise AND". What this operator does is take two numbers and return a number where the matching slots in each one both contain a 1. So, in this case, it results in this internal math:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0001 0010 0000 1000

Which corresponds to 4,616.

The | operator, or "bitwise OR", will provide a result where either of the original numbers has a 1 in the slot. In this case:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0101 0011 1010 1110

Or 21,422.

It's pretty rare that you want these operators for the actual numerical values, though. What they're pretty much always used for are "bit fields", an extremely-efficient way to store a set of related flags. Going back to the Notes API, the NSFItemAppend function (a generic way to add an item value) has a parameter WORD Flags, that matches up to a number of properties that you can assign to an item. In the C source, they're written as hexadecimal values, but I've added in the binary versions here:

#define	ITEM_SIGN          0x0001   // 0000 0000 0000 0001
#define	ITEM_SEAL          0x0002   // 0000 0000 0000 0010
#define	ITEM_SUMMARY       0x0004   // 0000 0000 0000 0100
#define	ITEM_READWRITERS   0x0020   // 0000 0000 0010 0000
#define	ITEM_NAMES         0x0040   // 0000 0000 0100 0000
#define	ITEM_PLACEHOLDER   0x0100   // 0000 0001 0000 0000
#define	ITEM_PROTECTED     0x0200   // 0000 0010 0000 0000
#define	ITEM_READERS       0x0400   // 0000 0100 0000 0000
#define ITEM_UNCHANGED     0x1000   // 0001 0000 0000 0000

So you can see that each flag there has a distinct slot filled in that each other doesn't (you can also see, like missing teeth, the bits that are probably obsolete or undocumented features). If you want to create an item that is a signed, sealed, summary, readers item, you'd do this:

WORD flags = ITEM_SIGN | ITEM_SEAL | ITEM_SUMMARY | ITEM_READERS

…which will result in this bit of internal math:

0000 0000 0000 0001
0000 0000 0000 0010
0000 0000 0000 0100
0000 0100 0000 0000
===================
0000 0100 0000 0111

This is extremely efficient, both because you can store a bunch of information in just 16 bits, but also because working with these bit fields is baked into processors at the lowest level. There's not much you can do on a computer that's faster, in fact.

When you have a value like this, you can query it for whether or not a given value is set by using the AND operator:

flags & ITEM_SUMMARY

This will result in:

0000 0000 0000 0100

…which is the same as the value of ITEM_SUMMARY itself, since that one slot is the only bit in common. Conversely, if you check:

flags & ITEM_PROTECTED

…you'll end up with zero, since the flag doesn't match.

Use in Java

Most of the time, you don't have to think too much about the bits that make up numbers in Java, but they creep up from time to time. Generally, the main situations where they arise are when you're dealing with a low-level API or with code written by someone who spent a lot of time with C and deeply internalized its efficiencies.

For an example of the former, take a look at the com.ibm.designer.domino.napi.NotesNoteItem class in IBM's NAPI. It contained a method called int getFlags(), which returns exactly the value we were talking about above. If you get your hands on a summary item this way, you can do this to tell if it's a summary item:

(item.getFlags() & 0x0004) != 0

Note the extra != 0. While the operation involved is the same as in C, C lets you treat any non-zero integer value as a boolean true, while Java forces you to perform an actual boolean operation.

For an example of the latter, turn your eyes to the DefaultColumnDef class used in xe:dynamicViewPanel generation and customization, a snippet of which is:

public static class DefaultColumnDef implements ColumnDef {
      public static final int FLAG_HIDDEN         = 0x000001;
      public static final int FLAG_LINK           = 0x000002;
      public static final int FLAG_ONCLICK        = 0x000004;

      public int flags;
  
      public boolean isHidden() {
          return (flags&FLAG_HIDDEN)!=0;
      }
      public boolean isLink() {
          return (flags&FLAG_LINK)!=0;
      }
      public boolean isOnClick() {
          return (flags&FLAG_ONCLICK)!=0;
      }
}

This could also be represented as a bunch of boolean properties on the object or, most idiomatically, an EnumSet, but using a bitmask is slightly more space- and CPU-cycle-efficient. Slightly. The difference isn't enough to even think about normally, but can become worth it in cases where the code is called so often that tiny efficiencies can make a difference.

This whole topic is the sort of thing that is normally so many layers of abstraction below where we work that it's basically invisible, but it's definitely good to at least know the basics for the times when it does come up.

Implementing Cluster Replication From Domino to Darwino

Tue Mar 19 11:24:31 EDT 2019

Tags: darwino

Since its inception, Darwino has had two-way replication between it and Domino, and it's evolved over the years in fidelity and configurability. Recently, I was able to check an item off the to-do list that I've wanted for a while: "cluster-style" replication from Domino, where a document change immediately kicks off replication to Darwino.

Component #1: Extension Manager

Fortunately, the implementation is straightforward in concept: the Extension Manager has been in there since 4.0 and provides exactly the hooks one would need to implement this. The trouble with the Extension Manager, though, is that you can only subscribe to events from within an ExtMgr addin, and that means native code. You can't just hook into it from, say, an OSGi plugin.

My first thought was to use DOTS, which created this exact sort of bridge years ago. However, it's critically limited: the EM ferrying was never really fleshed out that much, and nor will it ever be, since it's not a supported project. It was kind-of-sort-of supported in the 9.x era for "social" purposes, but those days are behind us. Moreover, its separate OSGi environment wouldn't suit Darwino's needs particularly well.

The DOTS dynamic library, though, could still be potentially useful. Nathan Freeman came across this a couple of years ago: due to the way the DOTS dylib ferries the events from the ExtMgr world to DOTS, the channel is actually consumable by anything, not just DOTS specifically. My initial implementation did exactly this: it fed from the fountain of messages produced by the DOTS dylib for its own ends.

However, the core trouble still remains that DOTS isn't supported as such, and it has too many moving parts for us to want to take on as a dependency. Moreover, the "siphoning" only works if there's only one subscriber listening - on a server that also runs ODA (which Darwino does not use), you have the two consumers contending for messages, which is a recipe for missed events.

So I decided to write a custom-made ExtMgr addin, which would have the advantages of being much smaller and easier to maintain, feeding a different queue, and being a fun learning opportunity for me. The last part isn't as important to Darwino-the-product per se, but I always like when it lines up like that.

Component #2: Message Queues

The way DOTS and this new addin do their things is to use Message Queues, another technology presumably-not-coincidentally added to Domino in R4. The way these work is that you create a named queue (DOTS's, for example, is named MQ$DOTS) and then feed it strings, which are then consumed by anything running in the same Notes/Domino environment by requesting the queue of the same name and waiting for messages. It's pretty simple both in theory and in execution, with the minor problem that the API isn't officially available from Java.

Fortunately for ODA's (and presumably DOTS's) use, though it's not part of the official API, there is a lotus.notes.internal.MessageQueue class that is a (shockingly-thin) wrapper around the MQ* functions in the C API. It's functional, and the thinness of the wrapper means that the method parameters, though unnamed in the bytecode as usual, are clear matches for the equivalent C parameters.

I initially started using this, but ended up writing a nicer wrapper in Darwino's NAPI that implements BlockingQueue, making consumption in Java much clearer.

Component #3: The Listener

The final piece was the most comfortable, since it's entirely back in the warm embrace of Java: I wrote a class to listen for events coming through this queue, extract the database name, check for replicators configured for that NSF, and immediately kick any applicable ones off.

The End Result

Since the overhead of the work in a small replication is so minor, the end result is effectively the same as cluster replication between Domino servers: the change/creation/deletion is propagated over within a couple milliseconds (for small documents, due to, you know, physics). It's pretty satisfying to see in action.

Pub/Sub

If you've been following HCL's announcements lately and feel like this sounds very similar to the pub/sub support they've slated for V11, you're right. My guess is that they wanted to get Elastic Search working with low latency and (kindly and wisely) decided to turn the work required into a nicer interface for the same EM events we're using. It's a good feature and, assuming it's consumable from local Java, would have made my work here easier, but I didn't want to wait, and we target a couple Domino releases back anyway.

Anatomy of a Clean Open-Source Project

Mon Mar 18 11:54:47 EDT 2019

Tags: open-source

Over the years, initially thanks to Peter Tanner's diligent work as the OpenNTF IP Manager and now my own occupation of that seat, I've learned to really appreciate the virtues of dotting your "i"s and crossing your "t"s when it comes to making an open-source project legally clean and clear.

It's definitely something I underrated early on, though - caring about the specific differences between licenses and, in particular, maintaining things like per-file license/copyright headers felt like annoying busywork. For a project that only you will ever use, it technically is, but the hope of open source is that you'll get other people using your work and, ideally, contributing back in turn, and that's when it's important to make sure you have everything sorted out.

Why Bother?

Well, for one, you or your users could theoretically be sued or otherwise legally entangled if you don't keep track of this stuff. Admittedly, it's fairly unlikely, but the consequence of, for example, unknowingly including GPL software in your proprietary product is potentially significant.

It's for that sort of reason that it's important to make sure everything is clean before some large corporations will risk even looking at your project. IBM is particularly good about this because they were significantly burned in the past, and came out of it with extremely-strict view, and that rubbed off on OpenNTF both culturally and with their gracious technical assistance along the way.

And, since large consumers require this sort of vetting, it's also important to know how to do it if you want to contribute code to an open-source organization like OpenNTF, Eclipse, or Apache.

It's also surprisingly satisfying once you get into the swing of it, I've found.

The Example Project

Since I've been spending a lot of time recently with the NSF ODP Tooling project, we'll look at its GitHub repository.

Common Files

There are a couple common features that tend to show up, and which both people and tooling (like GitHub's license identifier) look for:

  • The LICENSE file, which is the most critical. This contains the text of the license you're using, as well as one of the declarations of the copyright year (though, admittedly, it's easy to forget to include that part). This is what declares the effective license for the code in the repository that you own the copyright to, and should be included right from the start if possible.

  • The NOTICE file, which is vital if you're including any code from any sources not covered under the main copyright. This file should list all of the third-party code you have included in the repository, its license type, and, if possible, where to acquire it. If your project's distributable form includes additional third-party code not included in the repository (such as Maven or npm dependencies), these should be enumerated here as well

    • Writing this file has an important side effect in that it forces you to account for the licenses of your dependencies. More than once, I've run into a situation where I found that a common dependency had an incompatible license (such as the pure GPL). In some cases, this has meant abandoning the dependency outright, while in others it has meant finding a better-licensed alternative. Eclipse Orbit exists in large part for this purpose.
  • A legal directory containing any additional license/redistribution information not covered by the NOTICE. This can also sometimes take the form of files like NOTICE-Weld in the root of a project, and is useful for mass-including copyright/notice information from third parties in their original form.

In addition to including these files in the project repository, you should also make sure to include them in any binary distributions you make. In my projects, this takes the shape of inclusions in a Maven Assembly Plugin packaging file.

File Headers

I originally chafed against the idea of per-file copyright/license headers. They're not strictly necessary when the files are included in the original repository, they're redundant, you end up with massive commits touching hundreds of files just to change a year, and they can dwarf the size of the actual code they're copyrighting.

However, I've really come around to the practice of including them, and the main reason is that it makes the files easier for others to copy and use legally. It's one thing when someone finds their way to the root of your repository or distribution package, but it's another when they find an individual class by doing a web search or hitting F3 in Eclipse. In those cases, they can find their way up to the license (assuming your source package includes it), but it's much easier if it's just declared right up front.

It's also easier to clearly distinguish the third-party code you're including. When each file has its copyright information clearly noted, you can easily tell the difference between a sui-generis project file and an included third-party file without having to parse through the NOTICE every time.

And, fortunately, it doesn't have to be a huge hassle to maintain. In each of my Maven projects, I include a license plugin configuration to declare copyright information, any special data types, and which files to not include. Then, whenever I add new files or make a change after the turn of a year, I can run mvn license:format and it'll keep everything tidy for me.

pom.xml Configuration

Maven (and it's not alone in this) provides a lot of pom.xml-level elements to declare all sorts of metadata about your project, like its SCM repository, issue tracker, and, critically, license and developers. I like to declare the inception year, the license, and the <developers> block:

	<inceptionYear>2018</inceptionYear>

	<licenses>
		<license>
			<name>The Apache Software License, Version 2.0</name>
			<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
		</license>
	</licenses>

	<developers>
		<developer>
			<name>Jesse Gallagher</name>
			<email>jesse@frostillic.us</email>
		</developer>
	</developers>

I use that <inceptionYear/> value as part of the license-header file to keep track of the copyright range, at least when there's contiguous multi-year development.

OSGi Stuff

Since most of my projects are still OSGi, I've been aiming to improve my licensing setup there too. The main place where this comes into play is in feature.xml files, which have required elements to specify the copyright and license. It's not terribly unusual for these to end up as their "[Enter Copyright Description here.]" defaults, but it's important to fill these in. They're included in the "accept the licenses" dialog when installing into Eclipse/Designer, and are available in the "Installed software" descriptions in the UI.

But What License to Use to Begin With?

I'll finish off this post with what is actually the most important part of the process, but which can usually be answered simply. There are a lot of open-source licenses out there, and you could theoretically make up your own, but for our purposes the choice tends to follow some basic rules:

  • If you're contributing to an established OS organization, use theirs - for example, if you're contributing to Eclipse, use the EPL.

  • If you want your code to be mixed other OS projects and (potentially) proprietary ones, pick Apache or something like it.

    • At OpenNTF, we have a preference for Apache over other similar licenses, because it's well-established and makes copyright handling clearer than the equivalents, something that is critical for large companies. Let past lawyers do your legwork on this one.
  • If you want to require that users of your code keep the code open source, consider the GPL.

    • Be extremely wary of this, however: the GPL is intentionally "infectious" and limits how the code can be used. Various projects carve out little exceptions to the GPL to allow use in otherwise-non-GPL products, but it's still something of a minefield.
    • The GPL is one of the approved licenses for OpenNTF, but we kind of discourage it except in cases where a project is GPL because it's derived from previously-GPL'd code.
  • If you don't want to be bothered too much by copyright and just want the code out there, consider Public Domain. In practice, it's usually best for you to retain copyright, but explicitly declaring Public Domain is certainly an effective way of allowing any use.

For projects in our community, the quick answer is "use Apache". It's permissive, covers copyright, and is known and trusted by pretty much everyone.

More Work Than It's Worth?

Both the earlier parts of this post and Betteridge's Law contribute to making it clear that my answer is "no, it's not more work than it's worth", but I can certainly see why it'd feel that way. The first couple times I submitted projects to OpenNTF and got a "here's some stuff to fix" email from Peter Tanner, part of me definitely chafed at the whole thing. That can be particularly the case for Notes-based contributions - sometimes, you just want to plunk an NTF on the project page and be done with it, and Notes certainly doesn't have a "wrap this NSF copy in a ZIP with LICENSE and NOTICE files" checkbox.

However, as I learned more about the legal importance of having licenses correct and got more practice at doing this stuff from the start, I started to appreciate the whole process. It also turned out to be really helpful to sort this stuff out on smaller projects before working on larger ones, especially ones with established teams and procedures.

In all, it's worth it both to allow you to contribute to larger projects and, regardless of project size, it's worth it for anyone consuming your code.