Wednesday, August 17, 2016

CQ upgrade and challenges

Introduction

This paper describes our experiences and lessons learned from upgrading CQ 5.4 to AEM 6.1 for a leading telecommunication devices company. The way a CQ implementation was built, combined with the new features and issues of the upgraded version, can magnify problems many times over and make the upgrade complex and time consuming. With these lessons, you should be able to keep the situation under control and complete your upgrade successfully. This document assumes that the reader has basic to intermediate knowledge of CQ/AEM architecture and development, so it does not go deep into AEM architecture and design.

AEM 6.0 and earlier versions

CQ 5.6 and earlier versions run on the CRX2 repository. AEM 6.0 can optionally run on CRX2, but AEM 6.1 removes that option, so the repository has to be migrated to Apache Oak before upgrading to AEM 6.1.

Pre upgrade/Upgrade process

Clean up your source instance and purge workflows and audit trails. Reduce the repository footprint as much as possible. Run the repository checks and compaction tools, and watch for any errors during these runs.
We recommend opening a support ticket with Adobe for your upgrade project and keeping a communication channel open with them for any upgrade-related issues. Most issues are already known to Adobe support, and you can use that knowledge instead of reinventing the wheel.

Example
Even simple steps from the AEM upgrade documentation failed without logging proper errors. We needed workarounds from Adobe support in order to proceed, and in fact worked with them to put together a customized list of steps for our upgrade from CQ 5.4 to AEM 6.1.

Post upgrade

Out of the box features

Multi-Site Manager (MSM):

MSM is a basic AEM feature that you might assume to be backward compatible, but watch out for issues with rollout triggers and changes to rollout configurations.
Some features, such as rolling out a page under a given rollout configuration (for example, on modification or on activation), may not work after the upgrade.
Example: We had “same-name sibling” issues that prevented rollouts. We used a tool provided by Adobe support to remove the same-name siblings.

Workflows  

DAM Workflows

AEM 6.0 introduced Dynamic Media, which is disabled by default. Check whether you need the Dynamic Media features enabled. Some of the OOTB DAM workflows were modified for it and may cause issues depending on whether Dynamic Media is enabled or disabled.
Example
We had issues with the ‘Update Asset’ workflow model, which included a ‘Create PTIFF’ step that relies on a third-party tool to create PTIFF renditions for Dynamic Media. This step failed on every asset upload, and many workflows went into a ‘stale’ state. The cause was that the tool was not compatible with our OS version (Solaris 10), and we had to disable the step in the workflow model to resolve the problem. Adobe support later confirmed the issue and confirmed that this workaround was the correct solution.

Custom Workflow implementations

Each workflow process should obtain its own resource resolver with a new admin session and close it when done, because Apache Oak does not cope well with a single session being shared for multiple operations on the JCR.
Example
It is common to reuse the login session across workflow processes. With a shared session, operations on the JCR may queue up behind each other.
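
The per-process session pattern can be sketched in plain Java as below. This is only an illustration under stated assumptions: `FakeSession` and `executeStep` are hypothetical names, and `FakeSession` stands in for whatever resource resolver or JCR session your workflow process obtains from AEM. The point is the shape: open a fresh session inside each execution, use it, and always close it.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class PerProcessSessionSketch {

    /** Hypothetical stand-in for a resource resolver / JCR session. */
    static final class FakeSession implements AutoCloseable {
        private static final AtomicInteger OPEN = new AtomicInteger();
        private static final AtomicInteger CREATED = new AtomicInteger();

        final int id = CREATED.incrementAndGet();
        private boolean closed;

        FakeSession() {
            OPEN.incrementAndGet();
        }

        /** Placeholder for a repository write; a real session would modify and save here. */
        void write(String path) {
            if (closed) {
                throw new IllegalStateException("session " + id + " is already closed");
            }
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }

        /** Number of sessions currently open (useful for spotting leaks). */
        static int openCount() {
            return OPEN.get();
        }
    }

    /** One workflow step: opens its own session, uses it, and closes it on exit. */
    static int executeStep(String payloadPath) {
        try (FakeSession session = new FakeSession()) {
            session.write(payloadPath);
            return session.id;
        }
    }

    public static void main(String[] args) {
        int first = executeStep("/content/example/a");
        int second = executeStep("/content/example/b");
        System.out.println("distinct sessions: " + (first != second));
        System.out.println("open sessions after both steps: " + FakeSession.openCount());
    }
}
```

The try-with-resources block guarantees that the session is released even if the step throws, so no execution holds on to a session that another execution might also be using.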

Indexing and Search

Unlike CRX2, Apache Oak does not index all content by default, and it is the responsibility of developers to define indexes for the content they query. AEM 6.1 supports multiple search/indexing engines, such as Lucene and Solr, which can be configured according to the requirements of the project.

Queries related issues

Queries are evaluated much more strictly and may return different results after the upgrade. This is because the query engine compares the estimated cost of each available index and selects the one that uses the fewest resources (time and space).
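
As a rough illustration of cost-based selection, the sketch below picks the candidate with the lowest estimated cost from a list. It is a toy model, not Oak's actual implementation: the `Candidate` class, the index names, and the cost numbers are all made up for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class IndexCostSketch {

    /** Hypothetical index candidate with an estimated cost for one query. */
    static final class Candidate {
        final String name;
        final double estimatedCost;

        Candidate(String name, double estimatedCost) {
            this.name = name;
            this.estimatedCost = estimatedCost;
        }
    }

    /** Returns the candidate with the lowest estimated cost, or empty if there are none. */
    static Optional<Candidate> cheapest(List<Candidate> candidates) {
        return candidates.stream()
                .min(Comparator.comparingDouble((Candidate c) -> c.estimatedCost));
    }

    public static void main(String[] args) {
        // Illustrative numbers only; real estimates depend on the query and the content.
        List<Candidate> candidates = Arrays.asList(
                new Candidate("traversal", 100000.0),
                new Candidate("fulltext", 250.0),
                new Candidate("property", 12.0));
        System.out.println("selected: " + cheapest(candidates).get().name);
    }
}
```

Because the winner depends on which indexes exist and what they cover, the same query can be answered by a different index after the upgrade, which is one way results and ordering can change.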

OOTB Search implementation

Our implementation based on the CQ SimpleSearch API returned different or irrelevant results after the upgrade from CQ 5.4. Again, this may have several causes, such as the stricter query evaluation and the indexes involved in the search. We had to redo the whole implementation with custom queries ordered by JCR properties to get relevant results.
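
A custom query with explicit ordering might be assembled along the following lines. This is a minimal sketch, not the queries we actually shipped: the class and method names are hypothetical, and the root path, search term and `jcr:created` ordering property are assumptions chosen for illustration. It only builds a JCR-SQL2 statement as a string; executing it through the JCR query API is left out.

```java
final class SearchQuerySketch {

    /** Wraps a value in a JCR-SQL2 string literal, doubling any single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /**
     * Builds a full-text query for pages below rootPath, newest first
     * according to the given property on the page node.
     */
    static String pagesContaining(String rootPath, String term, String orderProperty) {
        return "SELECT * FROM [cq:Page] AS page"
                + " WHERE ISDESCENDANTNODE(page, " + quote(rootPath) + ")"
                + " AND CONTAINS(page.*, " + quote(term) + ")"
                + " ORDER BY page.[" + orderProperty + "] DESC";
    }

    public static void main(String[] args) {
        // Path, term and property are example values only.
        System.out.println(pagesContaining("/content/site", "router", "jcr:created"));
    }
}
```

Stating the ordering explicitly avoids relying on whatever relevance ranking the selected index provides. Note that `quote` only escapes the SQL literal; the full-text expression has its own syntax, so user input may need further sanitizing. The ordering property should also be covered by a suitable index to avoid slow queries.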


Customization

Overlaid components issue

Overlaid components may not work properly after the upgrade because the underlying foundation components have changed. Copy the latest foundation components and re-apply your customizations on top of them. While doing so, check for new features in the upgraded version that you can take advantage of.

CQ/AEM API changes

If you have custom code for workflows, services or UI-related functionality, check for API changes in the upgraded version and their impact on your code.
Example:
Deprecated or removed API calls should be replaced with newer ones. Changes in the JavaScript frameworks used for widget development (Coral UI or the Granite framework) may require changes to your custom JS code, such as custom xtypes.

 

Third party libraries and related issues

Third-party components or libraries that you use may themselves need to be upgraded after the AEM upgrade, because of changes in the AEM API.
Example
We were using a translation plugin called ‘Claytablet’. It had to be updated to a newer version that was compatible with the upgraded AEM version.
jQuery or Bootstrap versions may also need to be upgraded.

Operational

Operations such as Tar optimization/compaction and workflow/audit trail purges have default schedules in the upgraded AEM version, which may interfere with your work if you operate an offshore/onsite model around the clock. Consider moving the maintenance windows to weekend hours when neither the offshore nor the onsite team is working.

Health checks via dashboard

Use the health check dashboards to spot basic operational issues with the upgraded AEM instance and work towards resolving them.
Example:
We saw slow queries and slow page loading times, which we analyzed and fixed by introducing Oak indexes and optimizing code wherever applicable.

Security

The upgraded instance has strict ACL policies on /etc/ folders. If you rely on permissions on these folders, create ACL packages to reapply the permissions you require.

Security checklist

Use the security checklist in the AEM documentation and tick off as many items as possible. Some items need to be addressed case by case, with appropriate action taken according to your project requirements.
Example:
We had to defer some of the security and JCR repository hotfixes so that they would not interfere with our upgrade release.

New Features of AEM upgraded version

Review the latest features available in the upgraded version. Compare them with your customizations and decide whether the customizations are still required or whether the new features can replace them.

Analyze Touch UI vs Classic UI requirements

Decide whether you want to use Touch UI immediately after the upgrade or give the business more time to get there gradually. Converting your old components to be touch-enabled is not straightforward: it involves running dialog conversion tools and testing the components repeatedly to make sure nothing is broken.

Sightly (now HTL) development paradigm

Adobe introduced an HTML template language for component development, and it is recommended to keep your presentation logic and business logic separate. For good reasons, HTL is much more restrictive than the JSP or ECMAScript used before. You may therefore have to plan for the effort of moving towards this development paradigm.




References


AEM 6.1 official upgrade documentation - https://docs.adobe.com/docs/en/aem/6-1/deploy/upgrade.html