Excluding Ghost components from getting indexed
AEM version: 5.6.1
Component: Sites – MSM
Business Impact: End users receive wrong search results and can retrieve information that should be hidden from them.
- Set up a blueprint and a livecopy with some content in the MSM module.
- Delete any element (e.g. a text component) in the livecopy.
- Perform a full-text search for any text from the deleted text component, using the Query Builder API or the query tool in CRXDE Lite.
- The content of the deleted component is returned in the search results because a ghost node is still present, even though it is not expected to be found.
There are two solutions to this issue. Both are discussed below:
- Upgrade to AEM 6.0. This issue is fixed in the out-of-the-box (OOTB) AEM 6.0 implementation: when a livecopy component is deleted, the text it contained is no longer returned in the search results. The issue appears to be present only in AEM 5.6.1 and earlier versions.
- Modify the indexing configuration in the current AEM version. If an upgrade to AEM 6.0 is not possible, the index configuration needs to be modified so that the deleted component is not searchable.
The deleted livecopy component has a sling:resourceType property with the value wcm/msm/components/ghost.
This value can be used to create a rule in the custom indexing configuration file that excludes nodes whose sling:resourceType is wcm/msm/components/ghost from being indexed.
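As an illustration of what such a rule might look like, here is a minimal sketch in the Apache Jackrabbit indexing configuration format. It is not the attached file. The choice of nt:unstructured as the node type, and the reliance on a rule without property children to leave matching nodes unindexed, are assumptions that should be verified on a test instance.

```xml
<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://jackrabbit.apache.org/dtd/indexing-configuration-1.2.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
               xmlns:sling="http://sling.apache.org/jcr/sling/1.0">
  <!-- Assumption: ghost nodes are of type nt:unstructured (or a subtype). -->
  <!-- Assumption: a matching rule that lists no property elements
       indexes none of the properties of the matching node. -->
  <index-rule nodeType="nt:unstructured"
              condition="@sling:resourceType = 'wcm/msm/components/ghost'"/>
</configuration>
```

Both namespace prefixes used in the rule (nt and sling) have to be declared on the configuration element, as shown.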
To do that, create a custom indexing configuration file and perform the following steps:
- Copy the attached file to the following location: crx-quickstart\repository\workspaces\crx.default
- Modify workspace.xml to provide the path to the indexing configuration file, as below:
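A minimal sketch of such an entry, assuming the file copied in the previous step is named indexing_conf.xml (a hypothetical name; use the actual name of the attached file). The class attribute shown is the one typically found in a CRX 2.x workspace.xml and is also an assumption: keep whatever value and other parameters your existing SearchIndex element already has, and add only the indexingConfiguration parameter.

```xml
<SearchIndex class="com.day.crx.query.lucene.LuceneHandler">
  <!-- Existing parameters stay unchanged. -->
  <param name="path" value="${wsp.home}/index"/>
  <!-- Added: location of the custom indexing configuration file.
       ${wsp.home} resolves to the workspace directory (crx.default here).
       indexing_conf.xml is a hypothetical file name. -->
  <param name="indexingConfiguration" value="${wsp.home}/indexing_conf.xml"/>
</SearchIndex>
```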