Faster startup for containerized applications with Open Liberty InstantOn

Open Liberty InstantOn provides fast startup times for MicroProfile and Jakarta EE applications. With InstantOn, your applications can start in milliseconds, without compromising on throughput, memory, development-production parity, or Java language features.

InstantOn is configured as part of the application image build. When you build an application image, InstantOn creates an extra image layer that contains a checkpoint of the Open Liberty application process. This checkpoint is a snapshot of the running application process that can be persisted and then quickly restored to bring the application process back into the state it was in when the checkpoint was taken. When the application image starts, the Open Liberty runtime recognizes the InstantOn layer and restores the application process from that checkpoint.

InstantOn uses the Checkpoint/Restore In Userspace (CRIU) feature of the Linux kernel to take a checkpoint of the JVM that can be restored later. Open Liberty and OpenJ9 contain hooks that participate in the checkpoint and restore process. These hooks allow the Open Liberty runtime to safely persist the checkpoint process in the container image. This preparation step ensures that the process can be restored from the same persistent state in the container image into multiple, possibly concurrent, instances of the application. For more information on the CRIU support in OpenJ9 and the IBM Semeru JVM, see the Java CRIU Support documentation.

You cannot use InstantOn outside of a container image build. An application container image provides a consistent environment, which is required to ensure a reliable restore of an Open Liberty application process. The InstantOn container layer is the last layer of the application container image. This configuration ensures that the resources in the underlying layers of the image do not change from the time the checkpoint is taken to the time the image starts with InstantOn.

The following sections describe the prerequisites and processes to build and run an InstantOn application image:

Runtime and host build system prerequisites
Building an InstantOn application image
Running and deploying an InstantOn application image
Open Liberty InstantOn supported features

Runtime and host build system prerequisites

InstantOn requires the Linux operating system with kernel version 5.9 or greater. It also requires a version of IBM Semeru Java, depending on the host machine architecture and Open Liberty version.

The following table defines the InstantOn-supported IBM Semeru Java levels and minimum Liberty version for each supported architecture. Currently, InstantOn is not supported on Java vendor implementations other than IBM Semeru.

Supported Java levels and Open Liberty versions for InstantOn
Architecture	Supported IBM Semeru Java levels	Minimum Open Liberty version
Linux X86_64 UBI (`amd64`)	Java SE 11.0.19+ Java SE 17.0.7+ Java SE 21.0.1+	23.0.0.6 and later
Linux on Power (`ppc64le`)	Java SE 21.0.1+	24.0.0.1 and later
Linux on IBM Z (`s390x`)	Java SE 21.0.0.1+	24.0.0.1 and later

Linux capability prerequisites for checkpoint and restore

To use InstantOn during an application container image build, the host machine must allow Podman or Docker to run privileged container builds. The use of privileged containers in the application container image build is required only when you build the container image with InstantOn. Running the resulting InstantOn application image does not require the use of privileged containers. CRIU supports checkpointing and restoring application processes by unprivileged (non-root) users.

To checkpoint and restore an application process, CRIU requires the following Linux capabilities:

CHECKPOINT_RESTORE - This capability was added in Linux 5.9 to separate checkpoint/restore functions from the overloaded SYS_ADMIN capability.
SETPCAP - This capability is required for the subsequent restore.
SYS_PTRACE - CRIU uses this powerful capability to capture and record the full process state. It is necessary only when CRIU checkpoints an application process. This capability is not required to restore an application process.

To use Docker, a version 23.0 or greater is required. Earlier versions of Docker did not support the CHECKPOINT_RESTORE Linux capability, which prevents InstantOn from doing a checkpoint or restore of a process in container.

When to make a checkpoint: beforeAppStart or afterAppStart

When an InstantOn checkpoint occurs, the Open Liberty runtime starts. During this startup, the runtime processes the configuration, loads all the enabled features, and starts processing the configured application. This startup sequence varies depending on the size and complexity of the application and the number of Open Liberty features it requires. For small simple applications, this process can take less than 1 second. For more typical applications, it can take several seconds. InstantOn takes a checkpoint after the Open Liberty runtime completes startup. Two options are available to determine whether the checkpoint occurs before or after the application itself starts. You can specify one of these options as part of the InstantOn RUN configuration:

afterAppStart - This option takes a checkpoint after the application starts. It provides the fastest startup, but might not be suitable for some applications.
beforeAppStart - This option takes a checkpoint before the application starts and before any application code runs. Use this option in cases where the application code is not compatible with the afterAppStart option.

Which of these options you choose depends on the code your application must run. Jakarta EE and MicroProfile applications might contain application code that runs as the application starts, such as the following examples:

A servlet that uses the loadOnStartup attribute
An EJB that uses the @Startup annotation
A CDI bean that uses @Observes @Initialized(ApplicationScoped.class) annotations

Sometimes, the application code that runs as the application starts might not be suited for an InstantOn checkpoint. For such cases, use the beforeAppStart option. For example, avoid the following application code scenarios before an InstantOn checkpoint. They are better suited for the beforeAppStart option:

Accessing a remote resource, such as a database. The correct data source is unlikely to be available to connect to during an application container build.
Creating a transaction. Currently, transactions are prohibited before an InstantOn checkpoint.
Reading configuration that is expected to change when the application is deployed, for example configuration from MicroProfile Config.

Using the beforeAppStart option in these cases ensures that the application code is run only after the InstantOn checkpoint process is restored. This option might result in slower restore times because it must run more code before the application is ready to service any incoming requests. If you determine that the application early start code is safe and acceptable for checkpoint, then use the afterAppStart checkpoint option. This option provides the fastest startup time when the application process is restored.

If an application has no code that is run as the application starts, then the beforeAppStart and afterAppStart checkpoints are equivalent. In these cases, both checkpoint options perform a checkpoint of the process before the configured ports are enabled for servicing requests. This sequence ensures that the transport protocols for the application are enabled only after the InstantOn checkpoint process is restored.

For more information about limitations with early startup code and possible workarounds, see InstantOn limitations and known issues.

Building an InstantOn application image

Two options are available to build an application container image that uses InstantOn:

Add a special RUN instruction at end of a Dockerfile or Containerfile that runs the checkpoint.sh script to take an application checkpoint at container image build time. This option requires you to use Podman.
Use a three-step process to build the application image, run the checkpoint, and commit the final result into an InstantOn application container image. This option allows you to use either Podman or Docker version 23.0 or later.

To run the checkpoint.sh script, you must use Podman to build the application container image. Currently, you cannot use Docker to build the InstantOn application container image because Docker does not provide a way to grant the container build the necessary Linux capabilities. To use Docker to build an InstantOn application container image, you must follow the three-step build process.

Building the InstantOn image with Podman and the checkpoint.sh script

You can use the checkpoint.sh script to take the application checkpoint by adding the RUN checkpoint.sh instruction to the end of your Dockerfile or Containerfile file. The checkpoint.sh`script must be the last `RUN instruction during your container image build. This configuration takes the application process checkpoint and stores the process data as the last layer of the application container image. This script requires you to use Podman rather than Docker because Docker cannot grant the necessary Linux capabilities.

The following image template example uses the kernel-slim-java17-openj9-ubi tag to build an image that uses the latest Open Liberty release with the IBM Semeru distribution of Java 17. This example uses the afterAppStart checkpoint option.

Dockerfile

FROM icr.io/appcafe/open-liberty:kernel-slim-java17-openj9-ubi

# Add a Liberty server configuration that includes all necessary features
COPY --chown=1001:0  server.xml /config/

# This script adds the requested XML snippets to enable Liberty features and grow the image to be fit-for-purpose.
# This option is available only in the 'kernel-slim' image type. The 'full' and 'beta' tags already include all features.
RUN features.sh

# Add interim fixes (optional)
COPY --chown=1001:0  interim-fixes /opt/ol/fixes/

# Add an application
COPY --chown=1001:0  Sample1.war /config/dropins/

# This script adds the requested server configuration, applies any interim fixes, and populates caches to optimize the runtime.
RUN configure.sh

# This script performs an InstantOn checkpoint of the application.
# The application can use beforeAppStart or afterAppStart to do the checkpoint.
# The default is beforeAppStart when not specified
RUN checkpoint.sh afterAppStart

Use the following Podman command to build the InstantOn application container image. To grant the necessary Linux capabilities to the container image build, run this command either as the root user or by using the sudo utility.

podman build \
   -t dev.local/liberty-app-instanton \
   --cap-add=CHECKPOINT_RESTORE \
   --cap-add=SYS_PTRACE\
   --cap-add=SETPCAP \
   --security-opt seccomp=unconfined .

The three --cap-add options grant the three Linux capabilities that CRIU requires to perform the application process checkpoint during the container image build. The --security-opt option grants access to all Linux system calls to the container image build.

Building the InstantOn image by using the three-step process with Docker or Podman

If you cannot use Podman to run the checkpoint.sh during the container image build, you can use the following three-step process to build the InstantOn application container image:

Build the application container image without the InstantOn layer.
Run the application container to take a checkpoint of the application in the running container.
Commit the stopped container with the checkpoint process data into an InstantOn application container image.

You can use these steps with either Podman and Docker to build an InstantOn application image. For Docker, version 23.0 or later is required. The following examples assume that you are using Docker to build an application image that is named liberty-app.

1. Build the application container image without the InstantOn layer

Set the image template (Dockerfile or Containerfile) similar to the following example. This example uses the kernel-slim-java17-openj9-ubi tag to build an image that uses the latest Open Liberty release with the IBM Semeru distribution of Java 17. This template does not run the checkpoint.sh script.

Dockerfile

FROM icr.io/appcafe/open-liberty:kernel-slim-java17-openj9-ubi

# Add a Liberty server configuration that includes all necessary features
COPY --chown=1001:0  server.xml /config/

# This script adds the requested XML snippets to enable Liberty features and grow the image to be fit-for-purpose.
# This option is available only in the 'kernel-slim' image type. The 'full' and 'beta' tags already include all features.
RUN features.sh

# Add interim fixes (optional)
COPY --chown=1001:0  interim-fixes /opt/ol/fixes/

# Add an application
COPY --chown=1001:0  Sample1.war /config/dropins/

# This script adds the requested server configuration, applies any interim fixes, and populates caches to optimize the runtime.
RUN configure.sh

To build the application container image with Docker, run the following command:

docker build -t liberty-app .

The resulting application container image, which is tagged liberty-app, does not contain the InstantOn checkpoint process layer.

2. Run the application container to take a checkpoint

Run the application container image to take a checkpoint of the application process within the running container. The following example uses the liberty-app application image to run the checkpoint of the application process with the afterAppStart option:

docker run \
  --name liberty-app-checkpoint-container \
  --privileged \
  --env WLP_CHECKPOINT=afterAppStart \
  liberty-app

This command runs the application within a container and takes an application process checkpoint. The --env option sets a WLP_CHECKPOINT environment variable to specify the checkpoint afterAppStart option. When the application process checkpoint completes, the liberty-app-checkpoint-container application container is stopped and exits.

3. Commit the stopped container with the checkpoint process data

The stopped liberty-app-checkpoint-container container from the previous step contains the data from the InstantOn checkpoint process. Lastly, take this checkpoint process data and commit it to an application container image layer by running the following commit commands:

docker commit liberty-app-checkpoint-container liberty-app-instanton
docker rm liberty-app-checkpoint-container

You now have two application images: liberty-app and liberty-app-instanton. Starting a container with the liberty-app-instanton container image shows a faster startup time than the original liberty-app image. The liberty-app-checkpoint-container stopped container is no longer needed and can safely be removed.

Running and deploying an InstantOn application image

Special considerations are required to run an InstantOn application image locally or when it is deployed to a public cloud. The following prerequisites are required to restore the InstantOn checkpoint process.

The host that is running the container image must use Linux kernel 5.9 or greater.
The CHECKPOINT_RESTORE and SETPCAP Linux capabilities must be granted to the running container.
The necessary system calls must be granted to the running container.
The host processor must be X86-64/AMD64. If you are running IBM Semeru Java version 21.0.1+, Linux on Power and Linux on Z (s390x) architectures are also supported.

Running an InstantOn application image locally

The following command runs the liberty-app-instanton InstantOn application image with Podman:

podman run \
  --rm \
  --cap-add=CHECKPOINT_RESTORE \
  --cap-add=SETPCAP \
  --security-opt seccomp=unconfined \
  -p 9080:9080 \
  liberty-app-instanton

The following command runs the liberty-app-instanton InstantOn application image with Docker:

docker run \
  --rm \
  --cap-add=CHECKPOINT_RESTORE \
  --cap-add=SETPCAP \
  --security-opt seccomp=unconfined \
  -p 9080:9080 \
  liberty-app-instanton

In both cases, the --cap-add option grants the CHECKPOINT_RESTORE and SETPCAP capabilities. The SYS_PTRACE capability is not required to run the InstantOn application container image.

Required Linux system calls

The --security-opt option grants the running container access to all Linux system calls. Depending on the defaults of the container engine, the --security-opt with the seccomp-unconfined setting might not be required. For CRIU to restore the InstantOn application process, the container must have access to clone3, ptrace, and other system calls. This requirement is true even though the elevated Linux capability of SYS_PTRACE is not required to restore the process. You can update the defaults of the container engine to include all the required system calls.

Alternatively, you can specify a file with the --security-opt seccomp option that specifies the policy for the container. Use the following command to specify a JSON policy file for seccomp:

podman run \
  --rm \
  --cap-add=CHECKPOINT_RESTORE \
  --cap-add=NET_ADMIN \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=criuRequiredSysCalls.json \
  -p 9080:9080 \
  liberty-app-instanton

The resulting criuRequiredSysCalls.json file grants access to all the Linux system calls that CRIU requires to restore an InstantOn application process.

Recovering from a failed InstantOn restore

If restoration of the InstantOn application process fails, Open Liberty starts the server without using the InstantOn checkpoint process. In such cases, the Open Liberty application starts as if no InstantOn checkpoint process layer exists, which takes longer than a successfully restored InstantOn process. You can disable this recovery launch from a failed InstantOn restore by setting the following environment variable:

CRIU_RESTORE_DISABLE_RECOVERY=true

After you build an InstantOn application container image, you can verify a successful restore by setting this environment variable to run locally. For example, you can run the following Podman command:

podman run \
  --rm \
  --cap-add=CHECKPOINT_RESTORE \
  --cap-add=SETPCAP \
  --security-opt seccomp=unconfined \
  --env CRIU_RESTORE_DISABLE_RECOVERY=true \
  -p 9080:9080 \
  liberty-app-instanton

To avoid cloud environments continuously trying to restart the failed start of an application container image, the default value of the CRIU_RESTORE_DISABLE_RECOVERY variable is false.

Deploying an InstantOn application to Kubernetes services

Currently, Open Liberty InstantOn is tested and supported on the following public cloud Kubernetes services:

Other public cloud Kubernetes services might also work if they have the prerequisites to allow the InstantOn application process to restore.

When you deploy to Kubernetes, the container must be granted the CHECKPOINT_RESTORE and the SETPCAP Linux capabilities to allow the InstantOn application process to restore. You can configure these capabilities in the deployment YAML file by specifying the following securityContext for the container:

        securityContext:
          allowPrivilegeEscalation: true
          privileged: false
          runAsNonRoot: true
          capabilities:
            add:
            - CHECKPOINT_RESTORE
            - SETPCAP
            drop:
            - ALL

Red Hat OpenShift security context constraints

To deploy applications to Red Hat OpenShift with InstantOn, you must specify a security context constraint (SCC) that, at a minimum, specifies a list of additional capabilities that are added to any pod. The following SSC yaml file example defines an SCC with the required capabilities by using the defaultAddCapabilities parameter:

defaultAddCapabilities:
- CHECKPOINT_RESTORE
- SETPCAP

The applications you deploy must be associated with an SCC that adds the required capabilities. For example, you might deploy an SCC called liberty-instanton-scc that adds the required capabilities. In the following example, the deployment yaml file specifies the serviceAccountName parameter to set the SCC name to liberty-instanton-scc:

  serviceAccountName: liberty-instanton-scc
  securityContext:
    allowPrivilegeEscalation: true
    privileged: false
    runAsNonRoot: true
    capabilities:
      add:
      - CHECKPOINT_RESTORE
      - SETPCAP
      drop:
      - ALL

For more information, see the Red Hat documentation for Managing security context constraints.

Open Liberty InstantOn supported features

InstantOn supports a subset of Open Liberty features. If a feature is enabled that InstantOn does not support, a failure occurs when you try to take a checkpoint of an application process. InstantOn supports the following Jakarta EE and MicroProfile convenience features and all the features that they enable:

Jakarta EE Web Profile versions 8.0 and later
MicroProfile versions 4.1 and later

You can individually enable the Open Liberty public features that are enabled by the Jakarta EE Web Profile and MicroProfile features, depending on the needs of your application. This option avoids enabling the complete set of features that are enabled by the convenience features. However, InstantOn currently does not support standalone MicroProfile features, which are MicroProfile features that are not enabled by any of the convenience features.

In addition to the features that are enabled in the MicroProfile and Jakarta convenience features, InstantOn also supports the following features:

For more information about limitations, see InstantOn limitations and known issues.