New enhancements for Liberty InstantOn in 23.0.0.2-beta

Thomas Watson on Feb 10, 2023

announcements , beta , developer-experience , performance-enhancements ,

Post available in languages: 日本語 ,

This article was published when Liberty InstantOn was still beta. Liberty InstantOn moved out of beta as of the Liberty 23.0.0.6 release. For the latest information about Liberty InstantOn, see Faster startup for containerized applications with Open Liberty InstantOn in the Open Liberty docs.

The Open Liberty 22.0.0.11-beta introduced InstantOn, an exciting new feature that provides incredibly fast startup times for MicroProfile and Jakarta EE applications. Since this initial beta release, we’ve made a few changes and enhancements to make it easier to create and deploy applications with Liberty InstantOn.

This post uses the same project setup from our original InstantOn article, Liberty InstantOn startup for cloud native Java applications. If you would like to follow along with the examples here, read and follow the instructions in that post first.

The following sections describe the improvements to Liberty InstantOn that are included with the 23.0.0.2 beta release.

Removal of the features checkpoint phase

Initially, the InstantOn beta provided three phases during the startup process where a checkpoint could occur:

features - This is the earliest phase where a checkpoint can happen. The checkpoint occurs after all of the configured Open Liberty features are started, but before any processing occurs for the installed applications.
beforeAppStart - The checkpoint happens after processing the configured application metadata. If the application has any components that get run as part of the application starting, the checkpoint is taken before executing any code from the application.
afterAppStart - This is the last phase where a checkpoint can happen, so it has the potential to provide the fastest startup time when restoring the application instance. The checkpoint happens after all configured applications are reported as started. This phase happens before opening any ports for listening to incoming requests for the applications.

Of these three options, the features phase offers the least value in terms of improving an application startup time. Scenarios where an application deployment would want to use features over beforeAppStart are very limited and are not good use cases for Liberty InstantOn. For the 23.0.0.2-beta, the features checkpoint phase has been removed. This change leaves only two options to choose from for the checkpoint: beforeAppStart and afterAppStart.

Reduced set of required Linux capabilities to checkpoint and restore

For criu to take a checkpoint of and restore a process, the criu binary must be granted additional Linux capabilities. In particular, for the Open Liberty 22.0.0.11-beta, it needed to be granted checkpoint_restore, net_admin and sys_ptrace capabilities.

The Open Liberty 23.0.0.2-beta removes the need for the net_admin capability for both the checkpoint and restore of a process; also, the sys_ptrace capability is no longer necessary to restore the process when running the application with Liberty InstantOn. This change is an improvement: the net_admin and sys_ptrace capabilities grant a broad set of capabilities that might be unacceptable when deploying an application to the cloud. The 23.0.0.2-beta introduces a requirement for the setpcap capability; the use of this capability is described in the next section.

Simplified container image builds with InstantOn

The Liberty InstantOn beta image contains the prerequisites for building an application container image with a checkpoint server process. Applications can use the Liberty InstantOn beta image as a base to create their own application container image with a checkpoint process. For the 22.0.0.11-beta, this was a three step process:

Build the application container image.
Checkpoint the application by running the application container.
Commit the application container with the checkpoint process to a new container image.

For the 23.0.0.2-beta, this process is simplified for systems that are running on kernel versions with the clone3 system call and the checkpoint_restore capability. These kernel features, available in kernel version 5.9 or greater, make it possible to perform the process checkpoint during the container build. One additional Linux capability, called setpcap, must be granted to criu to allow the process checkpoint from a container build to be restored successfully. The setpcap grants criu the ability to drop capabilities for the restored process. This function is necessary in cases where the checkpoint process running during the build step has fewer Linux capabilities granted to it than the container does when running the application image. When criu detects this situation, it drops the Linux capabilities associated with restored process to match the reduced set that is used during the container build.

To do the process checkpoint during the container build step, the Dockerfile for the application must be updated with an additional RUN instruction at the end to perform the process checkpoint. Using the example application project from the original post, add RUN checkpoint.sh afterAppStart to the end of the Dockerfile:

Dockerfile

FROM icr.io/appcafe/open-liberty:beta
ARG VERSION=1.0
ARG REVISION=SNAPSHOT
LABEL \
  org.opencontainers.image.authors="Your Name" \
  org.opencontainers.image.vendor="IBM" \
  org.opencontainers.image.url="local" \
  org.opencontainers.image.source="https://github.com/OpenLiberty/guide-getting-started" \
  org.opencontainers.image.version="$VERSION" \
  org.opencontainers.image.revision="$REVISION" \
  vendor="Open Liberty" \
  name="system" \
  version="$VERSION-$REVISION" \
  summary="The system microservice from the Getting Started guide" \
  description="This image contains the system microservice running with the Open Liberty runtime."

COPY --chown=1001:0 src/main/liberty/config/ /config/
COPY --chown=1001:0 target/*.war /config/apps/

RUN configure.sh
RUN checkpoint.sh afterAppStart

This configuration adds the application process checkpoint as the last layer of the application container image. The checkpoint.sh script allows you to specify either afterAppStart or beforeAppStart to indicate what phase of the startup should perform the process checkpoint. With podman, you can use the --add-cap and --security-opt options to grant the container build the necessary capabilities to take a checkpoint during the container build step. This reduces the application container image build to the following step:

podman build with required capabilities added

podman build \
   -t dev.local/getting-started-instanton \
   --cap-add=CHECKPOINT_RESTORE \
   --cap-add=SYS_PTRACE\
   --cap-add=SETPCAP \
   --security-opt seccomp=unconfined .

Running the application with InstantOn

If the host OS has kernel version 5.9+, then the clone3 system call is used by criu. This removes the need to mount ns_last_pid. With the 23.0.0.2-beta, the getting-started-instanton container can be run with the following command:

podman run with limited capabilities added

podman run \
  --rm \
  --cap-add=CHECKPOINT_RESTORE \
  --cap-add=SETPCAP \
  -p 9080:9080 \
  getting-started-instanton

The 23.0.0.2-beta removes the requirement to add the sys_ptrace or net_admin when running the application container with Liberty InstantOn. Note that podman grants running containers the setpcap capability by default. So, you might be able to run the container without explicitly adding this capability with --cap-add.

What is next?

As you can see, we continue to refine the InstantOn beta to make it easier to consume. Stay tuned for more updates in coming beta releases, including how to deploy InstantOn to public clouds like AWS. If you have any requests or suggestions, we would love to hear from you!

See all blog posts