Journal consumes too much disk space and causes slow application performance

Symptoms

With the default FileJournal configuration, many Journal log-files accumulate over time, at a rate that depends on the activity on the repository. This can eventually exhaust disk space and degrade the performance of applications that use CRX.
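
To check whether an instance is affected, the size of the journal directory can be measured. The following is a minimal Python sketch, not an official CRX tool: the function name journal_usage is hypothetical, and the example path assumes the CRX 2.0 layout used elsewhere in this article (crx-quickstart/repository/shared/journal).

```python
from pathlib import Path
from typing import Tuple


def journal_usage(journal_dir: Path) -> Tuple[int, int]:
    """Return (number of files, total size in bytes) below journal_dir.

    Hypothetical helper for illustration; it only reads file metadata
    and never modifies the journal.
    """
    files = [p for p in journal_dir.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # Assumed path: adjust to the sharedPath configured in repository.xml.
    count, size = journal_usage(Path("crx-quickstart/repository/shared/journal"))
    print(f"{count} files, {size / (1024 * 1024):.1f} MB")
```

A steadily growing file count between runs suggests that log rotation is not bounded by the current configuration.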

Cause

The default configuration of the Journal places no limit on the number of rotated log-files.

Resolution

The Journal is configured in repository.xml; the recommended settings differ slightly depending on the setup. For details on the individual configuration parameters, please refer to the Journal configuration article.

NOTE: Please back up the instance before applying any changes!

CQ / CRX version    Location
5.2.x / 1.4.2       crx-quickstart/server/runtime/0/_crx/WEB-INF/repository.xml
5.3 / 2.0           crx-quickstart/repository/repository.xml

In addition to reconfiguring the FileJournal, the existing Journal needs to be cleaned up. Please note: in clustered CRX environments, it is strongly advised to let all cluster-nodes synchronize fully to the same revision as the master before applying the following changes:

  1. stop (all) CRX instances
  2. if you have a multi-node cluster, verify that all nodes are synchronized to the current revision:
    1. For each cluster node, verify that the revision number in crx-quickstart/repository/revision.log matches the one in crx-quickstart/repository/shared/journal/revision
    2. The revision numbers are stored in binary form, so compare the files with a diff command on Linux or with a binary-capable diff tool such as WinMerge on Windows
  3. delete ALL files below crx-quickstart/repository/shared/journal
  4. delete crx-quickstart/repository/revision.log
  5. apply Journal reconfiguration
  6. start (all) CRX instances
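
The verification and cleanup steps above can be sketched in Python. This is an illustrative sketch under stated assumptions, not a supported procedure: the function names revisions_match and clean_journal are hypothetical, the paths follow the CRX 2.0 layout named in the steps, and the script must only be run after a backup and while all CRX instances are stopped.

```python
import shutil
from pathlib import Path


def revisions_match(repo_home: Path) -> bool:
    """Byte-compare a node's revision.log with the shared journal revision file.

    Both files hold the revision number in binary form, so a plain
    byte comparison is sufficient (hypothetical helper).
    """
    local = repo_home / "revision.log"
    shared = repo_home / "shared" / "journal" / "revision"
    return local.read_bytes() == shared.read_bytes()


def clean_journal(repo_home: Path) -> None:
    """Delete everything below shared/journal, then delete revision.log.

    Destructive: assumes all CRX instances are stopped and a backup exists.
    """
    journal = repo_home / "shared" / "journal"
    for entry in journal.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    (repo_home / "revision.log").unlink(missing_ok=True)


if __name__ == "__main__":
    # Assumed repository home of one cluster node; repeat for each node.
    home = Path("crx-quickstart/repository")
    if revisions_match(home):
        clean_journal(home)
    else:
        print("Node is not synchronized; do not clean the journal yet.")
```

In a cluster, run the comparison for every node first and clean up only when all of them match.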


Non-clustered environment

In a non-clustered environment where CRX is running standalone, it is recommended to set the maximum size of a Journal log-file to 100 MB (104857600 bytes) and to limit the number of allowed files to 1. This is more than sufficient for such a setup.

<Journal class="com.day.crx.core.journal.FileJournal">
   <param name="sharedPath" value="${rep.home}/shared"/>
   <param name="maximumSize" value="104857600" />
   <param name="maximumFiles" value="1" />
</Journal>


Clustered environment

If CRX is running in a cluster with other CRX instances, it is recommended to adjust the number of rotated Journal log-files. The Journal keeps track of every save operation on the repository, and a slave cluster-node that has been offline uses it to catch up with the latest revision of the master cluster-node. Depending on how long a slave cluster-node may be offline and still be able to catch up, the following two parameters can be configured to match requirements:

  • maximumAge : maximum age of a Journal log-file before it gets removed
  • maximumFiles : maximum number of rotated Journal log-files to keep


<Journal class="com.day.crx.core.journal.FileJournal">
   <param name="sharedPath" value="${rep.home}/shared"/>
   <param name="maximumSize" value="104857600" />
   <param name="maximumAge" value="P1M" />
   <param name="maximumFiles" value="10" />
</Journal>
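
To confirm that the reconfiguration has been applied, the Journal parameters can be read back from repository.xml. The following Python sketch uses only the standard library; the function name journal_params is hypothetical, and it assumes the file contains a single Journal element with param children as shown in the snippets above.

```python
import xml.etree.ElementTree as ET


def journal_params(xml_text: str) -> dict:
    """Return the <param> name/value pairs of the first <Journal> element.

    Hypothetical helper; values are returned as strings, unconverted.
    """
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so a bare <Journal> snippet works too.
    journal = next(root.iter("Journal"), None)
    if journal is None:
        raise ValueError("no <Journal> element found")
    return {p.get("name"): p.get("value") for p in journal.findall("param")}


if __name__ == "__main__":
    # Assumed location for CQ 5.3 / CRX 2.0; see the location table above.
    with open("crx-quickstart/repository/repository.xml", encoding="utf-8") as f:
        for name, value in journal_params(f.read()).items():
            print(f"{name} = {value}")
```

For the clustered example above, the output would be expected to list sharedPath, maximumSize, maximumAge and maximumFiles with the configured values.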

Applies

CRX 1.4.2, CRX 2.0