Faster startup for containerized applications with Open Liberty InstantOn
Open Liberty InstantOn provides fast startup times for MicroProfile and Jakarta EE applications. With InstantOn, your applications can start in milliseconds, without compromising on throughput, memory, development-production parity, or Java language features.
InstantOn uses the Checkpoint/Restore In Userspace (CRIU) feature of the Linux kernel to take a checkpoint of the JVM that can be restored later. With InstantOn, the application is developed as normal, and then InstantOn makes a checkpoint of the application process when the application container image is built. When the application is restored, it runs in the same JVM, which provides complete parity between development and production. Because the checkpoint process takes only seconds, your CI/CD process is barely affected.
When you build an application container image, InstantOn creates an extra image layer that contains a checkpoint of the Open Liberty application process. When the application image starts, the Open Liberty runtime recognizes the InstantOn layer and restores the application process to the checkpoint that was created during the application container image build.
Open Liberty and OpenJ9 contain hooks that participate in the checkpoint and restore process. These hooks allow the Open Liberty runtime to prepare the process for checkpoint such that persisting the checkpoint process in the container image is safe. This preparation step ensures that the process can be restored from the same persistent state in the container image into multiple, possibly concurrent, instances of the application. For more information on the CRIU support in OpenJ9 and the IBM Semeru JVM, see the Java CRIU Support documentation.
InstantOn cannot be used outside of a container image build. An application container image provides a consistent environment, which is required to ensure a reliable restore of an Open Liberty application process. The InstantOn container layer is the last layer of the application container image. This configuration ensures that the resources in the underlying layers of the image do not change from the time the checkpoint is taken to the time the image is started with InstantOn.
The following sections describe the prerequisites and processes to build and run an InstantOn application image:
When to make a checkpoint: beforeAppStart or afterAppStart
When an InstantOn checkpoint occurs, the Open Liberty runtime starts. During this startup, the runtime processes the configuration, loads all the enabled features, and starts processing the configured application. This startup sequence varies depending on the size and complexity of the application and the number of Open Liberty features it requires. For small simple applications, this process can take less than 1 second. For more typical applications, it can take several seconds. InstantOn takes a checkpoint after the Open Liberty runtime completes startup. Two options are available to determine whether the checkpoint occurs before or after the application itself starts, either of which can be specified as part of the InstantOn RUN
configuration:
afterAppStart
- Performs a checkpoint after the application is started. This option provides the fastest startup, but might not be suitable for some applications.beforeAppStart
- Performs a checkpoint before any application code runs. Use this option in cases where the application code is not compatible with theafterAppStart
option.
Which of these options you choose depends on the code your application must run. Jakarta EE and MicroProfile applications might contain application code that gets run as the application is started, such as the following examples:
A servlet that uses the
loadOnStartup
attributeAn EJB that uses the
@Startup
annotationA CDI bean that uses
@Observes @Initialized(ApplicationScoped.class)
annotations
Sometimes, the application code that runs as the application starts might not be suited for performing an InstantOn checkpoint. For such cases, use the beforeAppStart
option. For example, the following application code scenarios must be avoided before an InstantOn checkpoint, and are therefore better suited for the beforeAppStart
option:
Accessing a remote resource, such as a database. The correct data source is unlikely to be available to connect to during an application container build.
Creating a transaction. Currently, transactions are prohibited before an InstantOn checkpoint.
Reading configuration that is expected to change when the application is deployed, for example configuration from MicroProfile Config.
Using the beforeAppStart
option in these cases ensures that the application code is run only after the InstantOn checkpoint process is restored. This option might result in slower restore times because it must run more code before the application is ready to service any incoming requests. If the application early start code is determined to be safe and acceptable for checkpoint, then the afterAppStart
checkpoint option can be used. This option provides for the fastest startup time when the application process is restored.
If an application has no code that is run as the application is started, then the beforeAppStart
and afterAppStart
checkpoints are equivalent. In these cases, both checkpoint options perform a checkpoint of the process before the configured ports are enabled for servicing requests. This sequence ensures that the transport protocols for the application are enabled only after the InstantOn checkpoint process is restored.
For more information about limitations with early startup code and possible workarounds, see InstantOn limitations and known issues.
Runtime and host build system prerequisites
Starting with Open Liberty version 23.0.0.6, all X86-64/AMD64 UBI Open Liberty container images include the prerequisites for InstantOn to checkpoint and restore Open Liberty application processes. Open Liberty Ubuntu container images are not enabled for InstantOn.
Currently, InstantOn is supported with IBM Semeru Java version 11.0.19+ and IBM Semeru Java version 17.0.7+. InstantOn is expected to support new versions of IBM Semeru Java as they are released. Currently, InstantOn is not supported on other Java vendor implementations.
To use InstantOn during an application container image build, the host machine must satisfy the following prerequisites:
Uses the Linux Operating System with kernel version 5.9 or greater
Uses the X86-64/AMD64 processor architecture
Allows execution of privileged container builds by using Podman or Docker
The use of privileged containers in the application container image build is required only when you build the container image with InstantOn. Running the resulting InstantOn application image does not require the use of privileged containers.
To use Docker, a version 23.0 or greater is required. Earlier versions of Docker did not support the CHECKPOINT_RESTORE
Linux capability, which prevents InstantOn from doing a checkpoint or restore of a process in container.
Linux capability prerequisites for checkpoint and restore
Unprivileged (non-root) users are supported by CRIU for checkpointing and restoring application processes. To checkpoint and restore, CRIU requires different Linux capabilities.
To perform an application process checkpoint, CRIU requires the following Linux capabilities:
CHECKPOINT_RESTORE
- This capability was added in Linux 5.9 to separate checkpoint/restore functions from the overloaded SYS_ADMIN capability.SETPCAP
- Allows CRIU to drop capabilities from the restored application process. Because the CRIU binary is granted more capabilities, likeCHECKPOINT_RESTORE
, it needs these capabilities so it can drop such capabilities from the final process that CRIU restored.SYS_PTRACE
- A powerful capability needed by CRIU to be able to capture and record the full process state. This capability is necessary only when CRIU is checkpointing an application process. For the restore process, only theCHECKPOINT_RESTORE
andSETPCAP
capabilities are necessary.
Building an InstantOn application image
Two options are available to build an application container image that uses InstantOn:
Add a special
RUN
instruction at end of aDockerfile
orContainerfile
that runs the checkpoint.sh script to perform an application checkpoint at container image build time. This option requires you to use Podman.Use a three-step process to build the application image, run the checkpoint, and commit the final result into an InstantOn application container image. This option requires you to use either Podman or Docker version 23.0 or later.
To run the checkpoint.sh
script, you must use Podman to build the application container image. Currently, you cannot use Docker to build the InstantOn application container image because Docker does not provide a way to grant the container build the necessary Linux capabilities. To use Docker to build an InstantOn application container image, you must follow the three-step build process.
Building the InstantOn image with Podman and the checkpoint.sh script
You can use the checkpoint.sh
script to perform the application checkpoint by adding the RUN checkpoint.sh
instruction to the end of your Dockerfile
or Containerfile
file. The execution of the checkpoint.sh
must be the last RUN
instruction during your container image build. This configuration performs the application process checkpoint and stores the process data as the last layer of the application container image. Currently, this script requires you to use Podman rather than Docker because Docker cannot grant the necessary Linux capabilities.
The following image template example uses the kernel-slim-java17-openj9-ubi
tag to build an image that uses the latest Open Liberty release with the IBM Semeru distribution of Java 17. This example uses the afterAppStart
checkpoint option.
FROM icr.io/appcafe/open-liberty:kernel-slim-java17-openj9-ubi
# Add a Liberty server configuration that includes all necessary features
COPY --chown=1001:0 server.xml /config/
# This script adds the requested XML snippets to enable Liberty features and grow the image to be fit-for-purpose.
# This option is available only in the 'kernel-slim' image type. The 'full' and 'beta' tags already include all features.
RUN features.sh
# Add interim fixes (optional)
COPY --chown=1001:0 interim-fixes /opt/ol/fixes/
# Add an application
COPY --chown=1001:0 Sample1.war /config/dropins/
# This script adds the requested server configuration, applies any interim fixes, and populates caches to optimize the runtime.
RUN configure.sh
# This script performs an InstantOn checkpoint of the application.
# The application can use beforeAppStart or afterAppStart to do the checkpoint.
# The default is beforeAppStart when not specified
RUN checkpoint.sh afterAppStart
Use the following Podman command to build the InstantOn application container image. To grant the necessary Linux capabilities to the container image build, this command must be run either as the root
user or by using the sudo
utility.
podman build \
-t dev.local/liberty-app-instanton \
--cap-add=CHECKPOINT_RESTORE \
--cap-add=SYS_PTRACE\
--cap-add=SETPCAP \
--security-opt seccomp=unconfined .
The three --cap-add
options grant the three Linux capabilities that CRIU requires to perform the application process checkpoint during the container image build. The --security-opt
option grants access to all Linux system calls to the container image build.
Building the InstantOn image by using the three-step process with Docker or Podman
If you cannot use Podman to run the checkpoint.sh
during the container image build, you can use the following three-step process to build the InstantOn application container image:
Build the application container image without the InstantOn layer
Run the application container to perform a checkpoint of the application in the running container
Commit the stopped container with the checkpoint process data into an InstantOn application container image
You can use these steps with either Podman and Docker to build an InstantOn application image. For Docker, version 23.0 or later is required. The following examples assume that you are using Docker to build an application image that is named liberty-app
.
1. Build the application container image
Set the image template (Dockerfile
or Containerfile
) similar to the following example. This example uses the kernel-slim-java17-openj9-ubi
tag to build an image that uses the latest Open Liberty release with the IBM Semeru distribution of Java 17. This template does not run the checkpoint.sh
script.
FROM icr.io/appcafe/open-liberty:kernel-slim-java17-openj9-ubi
# Add a Liberty server configuration that includes all necessary features
COPY --chown=1001:0 server.xml /config/
# This script adds the requested XML snippets to enable Liberty features and grow the image to be fit-for-purpose.
# This option is available only in the 'kernel-slim' image type. The 'full' and 'beta' tags already include all features.
RUN features.sh
# Add interim fixes (optional)
COPY --chown=1001:0 interim-fixes /opt/ol/fixes/
# Add an application
COPY --chown=1001:0 Sample1.war /config/dropins/
# This script adds the requested server configuration, applies any interim fixes, and populates caches to optimize the runtime.
RUN configure.sh
To build the application container image with Docker, run the following command:
docker build -t liberty-app .
The resulting application container image, which is tagged liberty-app
, does not contain the InstantOn checkpoint process layer.
2. Run the application container to perform a checkpoint
Run the application container image to perform a checkpoint of the application process within the running container. The following example uses the liberty-app
application image to run the checkpoint of the application process with the afterAppStart
option:
docker run \
--name liberty-app-checkpoint-container \
--privileged \
--env WLP_CHECKPOINT=afterAppStart \
liberty-app
This command runs the application within a container and performs an application process checkpoint. The --env
option sets a WLP_CHECKPOINT
environment variable to specify the checkpoint afterAppStart
option. When the application process checkpoint completes, the liberty-app-checkpoint-container
application container is stopped and exits.
3. Commit the stopped container with the checkpoint process data
The stopped liberty-app-checkpoint-container
container from the previous step contains the data from the InstantOn checkpoint process. Lastly, take this checkpoint process data and commit it to an application container image layer by running the following commit commands:
docker commit liberty-app-checkpoint-container liberty-app-instanton
docker rm liberty-app-checkpoint-container
You now have two application images: liberty-app
and liberty-app-instanton
. Starting a container with the liberty-app-instanton
container image shows a faster startup time than the original liberty-app
image. The liberty-app-checkpoint-container
stopped container is no longer needed and can safely be removed.
Running and deploying an InstantOn application image
Special considerations are required to run an InstantOn application image locally or when it is deployed to a public cloud. The following prerequisites are required to restore the InstantOn checkpoint process.
The host that is running the container image must use Linux kernel 5.9 or greater
The Linux capabilities CHECKPOINT_RESTORE and SETPCAP must be granted to the running container
The necessary system calls must be granted to the running container
The host processor must be X86-64/AMD64
Running an InstantOn application image locally
If a host system is running the Linux kernel 5.9 or greater with the X86-64/AMD64 processor, you can run an InstantOn application image by using Podman or Docker locally. The following command runs the liberty-app-instanton
InstantOn application image with Podman:
podman run \
--rm \
--cap-add=CHECKPOINT_RESTORE \
--cap-add=SETPCAP \
--security-opt seccomp=unconfined \
-p 9080:9080 \
liberty-app-instanton
The following command runs the liberty-app-instanton
InstantOn application image with Docker:
docker run \
--rm \
--cap-add=CHECKPOINT_RESTORE \
--cap-add=SETPCAP \
--security-opt seccomp=unconfined \
-p 9080:9080 \
liberty-app-instanton
In both cases, the --cap-add
option grants the CHECKPOINT_RESTORE
and SETPCAP
capabilities. The SYS_PTRACE
capability is not required to run the InstantOn application container image.
Required Linux system calls
The --security-opt
option grants the running container access to all Linux system calls. Depending on the defaults of the container engine, the --security-opt
with the seccomp-unconfined
setting might not be required. For CRIU to restore the InstantOn application process, the container must have access to clone3
, ptrace
, and other system calls. This requirement is true even though the elevated Linux capability of SYS_PTRACE
is not required to restore the process. You can update the defaults of the container engine to include all the required system calls.
Alternatively, you can specify a file with the --security-opt seccomp
option that specifies the policy for the container. Use the following command to specify a JSON policy file for seccomp
:
podman run \
--rm \
--cap-add=CHECKPOINT_RESTORE \
--cap-add=NET_ADMIN \
--cap-add=SYS_PTRACE \
--security-opt seccomp=criuRequiredSysCalls.json \
-p 9080:9080 \
liberty-app-instanton
The resulting criuRequiredSysCalls.json file grants access to all the Linux system calls that are required by CRIU to restore an InstantOn application process.
Recovering from a failed InstantOn restore
If restoration of the InstantOn application process fails, Open Liberty starts the server without using the InstantOn checkpoint process. In such cases, the Open Liberty application starts as if no InstantOn checkpoint process layer exists, which takes longer than a successfully restored InstantOn process. This recovery launch from a failed InstantOn restore can be disabled by setting the following environment variable:
CRIU_RESTORE_DISABLE_RECOVERY=true
After you build an InstantOn application container image, you can verify a successful restore by setting this environment variable to run locally. For example, you can run the following Podman command:
podman run \
--rm \
--cap-add=CHECKPOINT_RESTORE \
--cap-add=SETPCAP \
--security-opt seccomp=unconfined \
--env CRIU_RESTORE_DISABLE_RECOVERY=true \
-p 9080:9080 \
liberty-app-instanton
To avoid cloud environments from continuously trying to restart the failed start of an application container image, the default value of the CRIU_RESTORE_DISABLE_RECOVERY
variable is false
.
Deploying an InstantOn application to Kubernetes services
Currently, Open Liberty InstantOn is tested and supported on the following public cloud Kubernetes services:
Other public cloud Kubernetes services might also work if they have the prerequisites to allow the InstantOn application process to restore.
When you deploy to Kubernetes, the container must be granted the CHECKPOINT_RESTORE
and the SETPCAP
Linux capabilities to allow the InstantOn application process to restore. You can configure these capabilities in the deployment YAML file by specifying the following securityContext
for the container:
securityContext:
allowPrivilegeEscalation: true
privileged: false
runAsNonRoot: true
capabilities:
add:
- CHECKPOINT_RESTORE
- SETPCAP
drop:
- ALL
Open Liberty InstantOn supported features
InstantOn supports a subset of Open Liberty features. If a feature is enabled that InstantOn does not support, a failure occurs when you try to perform a checkpoint of an application process. InstantOn supports the following Jakarta EE and MicroProfile convenience features:
You can individually enable the Open Liberty public features that ae enabled by the Jakarta EE Web Profile and MicroProfile features, depending on the needs of your application. This option avoids enabling the complete set of features that are enabled by the convenience features.
In addition to the features that are enabled in the convenience features, InstantOn also supports the following features:
For more information about limitations, see InstantOn limitations and known issues.