Bitwise Operators

Apr 26, 2019 4:11 PM

Tags: javaee
  1. That Java Thing, Part 1: The Java Problem in the Community
  2. That Java Thing, Part 2: Intro to OSGi
  3. That Java Thing, Part 3: Eclipse Prep
  4. That Java Thing, Part 4: Creating the Plugin
  5. That Java Thing, Part 5: Expanding the Plugin
  6. That Java Thing, Part 6: Creating the Feature and Update Site
  7. That Java Thing, Part 7: Adding a Managed Bean to the Plugin
  8. That Java Thing, Part 8: Source Bundles
  9. That Java Thing, Part 9: Expanding the Plugin - Jars
  10. That Java Thing, Part 10: Expanding the Plugin - Serving Resources
  11. That Java Thing, Interlude: Effective Java
  12. That Java Thing, Part 11: Diagnostics
  13. That Java Thing, Part 12: Expanding the Plugin - JAX-RS
  14. That Java Thing, Part 13: Introduction to Maven
  15. That Java Thing, Part 14: Maven Environment Setup
  16. That Java Thing, Part 15: Converting the Projects
  17. That Java Thing, Part 16: Maven Fallout
  18. That Java Thing, Part 17: My Current XPages Plug-in Dev Environment
  19. Java Hiccups
  20. Bitwise Operators
  21. Java Grab Bag 2

In the "Java Hiccups" post, I briefly mentioned this little tidbit:

(short)Integer.MAX_VALUE == -1   // true

This fact betrays a lot about how numbers are stored internally in most languages. Some of those details, like the fact that the specific method is called two's complement, are fully in the range of "computer science" stuff - but it can be handy to know about a couple ways you can put this sort of thing to use.

For our purposes, let's work with the short data type in Java, because it's painfully important to Notes. "Short" is called such because it's smaller than a standard integer type in a given language. In some languages, like C, the length of core data types like this is sometimes tied to the bitness of the processor, but in Java they're consistent across everything. Specifically, short in Java is 16 bits long. Let's take the number 21,038, which is represented in binary as:

0101 0010 0010 1110

That's two bytes, and each byte is broken up into two groups of four bits each because it turns out to be helpful visually and because each "nibble" there can be matched to a single hexadecimal character (522E in this case). This is the most common way you'll see binary written when you start getting into this sort of thing.

So I mentioned casting in the previous post, and what casting does with primitive integer types like this is to chop off the bits that don't fit. If you cast the above value into a byte, it'll chop off all but the ending 8 bits, giving you:

0010 1110

Which is 46. If you cast it to a larger type, such as int (32 bits in Java), it just adds some zeros to the front:

0000 0000 0000 0000 0101 0010 0010 1110

Which is still 21,038, but now it takes up twice as much memory.

For our uses, this stuff matters immediately in two ways: it represents the limits of some data storage and it allows us to manipulate numbers with bitwise operators.

Storage Limits

If you max out the bits in a 16-bit integer, you get 1111 1111 1111 1111, and this value is either -1 or 65,535 (64k) depending on whether or not the value is considered "signed" by the code using it. When it's "unsigned", a 16-bit integer can represent between 0 and 65,535; when it's "signed", its range is -32,768 to 32,767 (32k). Java only has "signed" types, which is why Short.MAX_VALUE is 32k and not 64k. C, on the other hand, lets you choose in your code which type you want to use.

The Notes C API uses a couple type names to represent consistent sizes across different platforms, and the one that's important here is WORD: a 16-bit unsigned integer. This is used in, for example, the function used to set the value of a text item in a note:

NSFItemSetText(
  NOTEHANDLE  hNote,
  const char far *ItemName,
  const char far *ItemText,
  WORD  TextLength)

That WORD at the end there is where you tell the API how much text you're providing, and is thus capped at 64k - and is why you can store 60k of text in a non-summary text field but not 70k. As to why summary data is limited to 32k and not 64k, I'm guessing that either there's a signed value in there somewhere or that last bit was needed for something else.

If you look over the list of Notes limits, you can see this reflected all over the place, along with a few 8-bit (0-255) limits for good measure.

Bitwise Operators (for real this time)

The term "bitwise" refers to a handful of operators that deal directly on the bits of a number. Java takes a cue from C here and provides &, |, ^, ~, <<, >>, and >>>. These all have their uses, primarily for really-low-level stuff and efficiency, but we mainly care about & and |. These two may have bitten you in the past because of their similarity to && and || - and in particular because Formula Language uses the single-character versions to mean what Java means by the double-character ones. They also kind of work the same way, but aren't really the same.

To see how they work, let's take our original starting number, 21,038, and pair it up with another value, 5,000:

0101 0010 0010 1110
0001 0011 1000 1000

Visualizing the numbers in binary and stacked like this comes in extremely handy when it comes to dealing with bitwise operators. We'll start with &, or "bitwise AND". What this operator does is take two numbers and return a number where the matching slots in each one both contain a 1. So, in this case, it results in this internal math:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0001 0010 0000 1000

Which corresponds to 4,616.

The | operator, or "bitwise OR", will provide a result where either of the original numbers has a 1 in the slot. In this case:

0101 0010 0010 1110
0001 0011 1000 1000
===================
0101 0011 1010 1110

Or 21,422.

It's pretty rare that you want these operators for the actual numerical values, though. What they're pretty much always used for are "bit fields", an extremely-efficient way to store a set of related flags. Going back to the Notes API, the NSFItemAppend function (a generic way to add an item value) has a parameter WORD Flags, that matches up to a number of properties that you can assign to an item. In the C source, they're written as hexadecimal values, but I've added in the binary versions here:

#define	ITEM_SIGN          0x0001   // 0000 0000 0000 0001
#define	ITEM_SEAL          0x0002   // 0000 0000 0000 0010
#define	ITEM_SUMMARY       0x0004   // 0000 0000 0000 0100
#define	ITEM_READWRITERS   0x0020   // 0000 0000 0010 0000
#define	ITEM_NAMES         0x0040   // 0000 0000 0100 0000
#define	ITEM_PLACEHOLDER   0x0100   // 0000 0001 0000 0000
#define	ITEM_PROTECTED     0x0200   // 0000 0010 0000 0000
#define	ITEM_READERS       0x0400   // 0000 0100 0000 0000
#define ITEM_UNCHANGED     0x1000   // 0001 0000 0000 0000

So you can see that each flag there has a distinct slot filled in that each other doesn't (you can also see, like missing teeth, the bits that are probably obsolete or undocumented features). If you want to create an item that is a signed, sealed, summary, readers item, you'd do this:

WORD flags = ITEM_SIGN | ITEM_SEAL | ITEM_SUMMARY | ITEM_READERS

…which will result in this bit of internal math:

0000 0000 0000 0001
0000 0000 0000 0010
0000 0000 0000 0100
0000 0100 0000 0000
===================
0000 0100 0000 0111

This is extremely efficient, both because you can store a bunch of information in just 16 bits, but also because working with these bit fields is baked into processors at the lowest level. There's not much you can do on a computer that's faster, in fact.

When you have a value like this, you can query it for whether or not a given value is set by using the AND operator:

flags & ITEM_SUMMARY

This will result in:

0000 0000 0000 0100

…which is the same as the value of ITEM_SUMMARY itself, since that one slot is the only bit in common. Conversely, if you check:

flags & ITEM_PROTECTED

…you'll end up with zero, since the flag doesn't match.

Use in Java

Most of the time, you don't have to think too much about the bits that make up numbers in Java, but they creep up from time to time. Generally, the main situations where they arise are when you're dealing with a low-level API or with code written by someone who spent a lot of time with C and deeply internalized its efficiencies.

For an example of the former, take a look at the com.ibm.designer.domino.napi.NotesNoteItem class in IBM's NAPI. It contained a method called int getFlags(), which returns exactly the value we were talking about above. If you get your hands on a summary item this way, you can do this to tell if it's a summary item:

(item.getFlags() & 0x0004) != 0

Note the extra != 0. While the operation involved is the same as in C, C lets you treat any non-zero integer value as a boolean true, while Java forces you to perform an actual boolean operation.

For an example of the latter, turn your eyes to the DefaultColumnDef class used in xe:dynamicViewPanel generation and customization, a snippet of which is:

public static class DefaultColumnDef implements ColumnDef {
      public static final int FLAG_HIDDEN         = 0x000001;
      public static final int FLAG_LINK           = 0x000002;
      public static final int FLAG_ONCLICK        = 0x000004;

      public int flags;
  
      public boolean isHidden() {
          return (flags&FLAG_HIDDEN)!=0;
      }
      public boolean isLink() {
          return (flags&FLAG_LINK)!=0;
      }
      public boolean isOnClick() {
          return (flags&FLAG_ONCLICK)!=0;
      }
}

This could also be represented as a bunch of boolean properties on the object or, most idiomatically, an EnumSet, but using a bitmask is slightly more space- and CPU-cycle-efficient. Slightly. The difference isn't enough to even think about normally, but can become worth it in cases where the code is called so often that tiny efficiencies can make a difference.

This whole topic is the sort of thing that is normally so many layers of abstraction below where we work that it's basically invisible, but it's definitely good to at least know the basics for the times when it does come up.

XPages on Android

Apr 15, 2019 4:30 PM

  1. Letting Madness Take Hold: XPages Outside Domino
  2. XPages on Android
  3. Concept Proven: A Complex Production XPages App Running Outside Domino

Around the start of the year, I had a bit of a dalliance with the idea of running XPages outside Domino. The upshot of that project is that it is indeed possible to do so, but there'd be some work to do to make it practical.

This month, we revisited the idea with a healthy dose of Darwino to provide some undergirding technology, with the goal of being able to run a plain old XPages app on mobile devices backed by Darwino DBs and replicating back to Domino. There was a lot of fiddling involved, but it works:

Android is the natural first target, but iOS is about 70% there, with the scaffolding loading up to the point where it loads a page but currently with a bit of trouble when it comes to resolving data classes and executing renderers.

What Specifically Is Going On?

We set out a few required parameters to call it a successful proof-of-concept:

  • It has to use the actual XPages framework - that is to say, the jar files shipped with Notes/Domino
  • The XPages themselves have to be shared among Domino and the Darwino app, in the form of their Java "intermediate" source (compiling from .xsp source is possible but is a whole other thing)
  • It has to use xp:dominoView and xp:dominoDocument data sources
  • It has to load the Extension Library
  • It has to use ancillary elements from the app: managed beans, themes, CSS, images, SSJS libraries

This checks all of those boxes, and it's pretty satisfying to see in action.

The Stumbling Blocks

As I discussed in my post about the original project, there are aspects of XPages that are meant to make this sort of thing possible, with the core parts abstracting out different platforms and runtime environments. Over the years, though, assumptions about Domino and OSGi crept in, with newer additions and the Extension Library taking a bit less care to be environment-neutral.

Moreover, XPages is not open source and I don't have any particular special access to it, making this whole thing essentially, in the video-game sense, hard mode. There are a handful of classes that needed to be outright swapped out, like the annoyingly-final NotesContext class, but there was much less of that than I'd thought.

After that, getting the data to point to Darwino was pleasantly straightforward. Other than a handful of areas where the NAPI comes in, the Domino data sources largely adhere to the rules of the lotus.domino interfaces and don't make assumptions about the implementing classes (this is essentially how ODA shims in as well).

What This Could Be Useful For

With the right fleshing out, this could be a real way to run existing XPages apps on mobile devices. That could be pretty useful on its own, but I don't think that carrying forward XPages apps as-is is the right idea. The side effect of this, though, is that you have a functioning XPages app in a normal old .war file project, structured with Maven or Gradle, and ready to be molded into newer frameworks using whatever tooling you'd like. No Designer, no OSGi, no Servlet 2.4/2.5, just a clean basis running your existing logic and ready to be improved.

If the mountain of existing XPages code is going to have a future, I think it should be something like this.