Saving Legacy Data in AEM

AEM | April 11, 2017

Migrating content from one system to another, whether the target is AEM or another content management system (CMS), is challenging for many reasons. Just deciding what is and is not relevant is tricky and time-consuming. But with the right processes in place, including saving your legacy data, migration can be a smooth transition.

Categories of Data

When migrating data, it’s typical to see anywhere from 100 to 400 different fields, each with its own significance to the new system. All of this information, regardless of which XML is used, can be broken down into two categories:

  1. Author generated: content that is added or edited by the authors within a CMS
  2. System generated: content that is created by the CMS

Author Generated Content

During migration, all author generated content needs to be moved into the new system because it makes up the majority of the page content on a site. Some author generated content might need to go through a cleanup and/or transformation process before it is migrated to AEM. All of this content will need to be made accessible to authors, either as part of the page properties or as part of a template that was also migrated over.

Some examples of author generated content include:

  • Title
  • Keywords
  • Descriptions
  • Topics
  • Article date
  • Text area and any other relevant content

System Generated Content

System generated content is content that is added by the system, either as a result of an action taken by an author or due to the nature of the system in place.

Some examples of system generated content as a result of an author’s action are:

  • Template value, based on the template selected by the author
  • SystemPublishedDate, based on when the author published a specific page
  • Serverurl and Pageurl, based on where the page was authored

Some examples of system generated content that is created without an author’s actions include:

  • Metainfo, which is just an amalgamation of various content within a page and has no significance within the target system
  • Level, which identifies which level the page is on
  • Encrypted values within a field that have no significance in the target system
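
As a rough illustration of the two categories, the sketch below sorts the fields of one legacy record into author generated and system generated buckets. The field names are taken from the examples above; the record values, the ArticleDate spelling, and the categorize helper are hypothetical, and a real migration would build these lists from its own field inventory.

```python
# Hypothetical field inventory, based on the examples in this article.
AUTHOR_GENERATED = {"Title", "Keywords", "Descriptions", "Topics", "ArticleDate"}
SYSTEM_GENERATED = {"Template", "SystemPublishedDate", "Serverurl", "Pageurl",
                    "Metainfo", "Level"}


def categorize(fields):
    """Split one legacy record into author, system, and unknown fields."""
    buckets = {"author": {}, "system": {}, "unknown": {}}
    for name, value in fields.items():
        if name in AUTHOR_GENERATED:
            buckets["author"][name] = value
        elif name in SYSTEM_GENERATED:
            buckets["system"][name] = value
        else:
            # Anything not in the inventory is flagged for manual review.
            buckets["unknown"][name] = value
    return buckets


# Made-up sample record for illustration only.
legacy_page = {
    "Title": "Spring Sale",
    "Keywords": "sale, spring",
    "Template": "article",
    "Level": 3,
    "Checksum": "ab12",
}
buckets = categorize(legacy_page)
```

Keeping an explicit "unknown" bucket is one way to make sure no field is silently dropped before someone has decided which category it belongs to.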

Migration Takeaways

Always migrate fields from the legacy system into the JCR so that they are available if needed for future improvements or reporting. Ensure that all legacy reference fields start with a standard prefix like “old” or “legacy” (for example, oldPageUrl or oldTemplate). The screenshot of a sample JCR node shows all system generated values with a prefix of “old”.

JCR Screenshot
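
The prefix convention could be applied with a small helper like the one below. This is a minimal sketch, not an AEM API: legacy_name and to_legacy_properties are hypothetical names, the sample values are made up, and the resulting dictionary stands in for the properties that a migration script would eventually write to a JCR node.

```python
LEGACY_PREFIX = "old"  # the article also suggests "legacy" as an option


def legacy_name(field_name, prefix=LEGACY_PREFIX):
    """Return a prefixed property name, e.g. PageUrl -> oldPageUrl."""
    if not field_name:
        raise ValueError("field name must not be empty")
    # Capitalize only the first letter so the result reads as camelCase.
    return prefix + field_name[0].upper() + field_name[1:]


def to_legacy_properties(system_fields, prefix=LEGACY_PREFIX):
    """Rename every system generated field with the legacy prefix."""
    return {legacy_name(name, prefix): value
            for name, value in system_fields.items()}


# Made-up system generated values for illustration only.
system_fields = {"PageUrl": "/news/spring-sale", "template": "article"}
legacy_properties = to_legacy_properties(system_fields)
```

Using one helper for every field keeps the prefix consistent, which makes the legacy properties easy to find or exclude later.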

During migration, you should always clearly identify these as non-authorable fields, meaning they should not be available in any page content or page properties. Doing so prevents authors from accidentally changing them.
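
One way to check that legacy fields stay out of authorable areas is to filter them by prefix before building any list of editable properties. The sketch below assumes the “old” and “legacy” prefixes are always followed by a capital letter; the function names and sample node are hypothetical, and it works on a plain dictionary instead of a real JCR node.

```python
LEGACY_PREFIXES = ("old", "legacy")


def is_legacy_property(name):
    """True when a name looks like oldPageUrl or legacyTemplate."""
    for prefix in LEGACY_PREFIXES:
        rest = name[len(prefix):]
        # Require a capital letter after the prefix so that ordinary
        # names such as "older" are not treated as legacy fields.
        if name.startswith(prefix) and rest[:1].isupper():
            return True
    return False


def authorable_properties(node_properties):
    """Return only the properties that may be shown to authors."""
    return {name: value for name, value in node_properties.items()
            if not is_legacy_property(name)}


# Made-up node properties for illustration only.
node = {
    "jcr:title": "Spring Sale",
    "keywords": "sale, spring",
    "oldPageUrl": "/news/spring-sale",
    "legacyTemplate": "article",
}
visible = authorable_properties(node)
```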

In Conclusion

Storing legacy data is a solid contingency plan for helping to resolve bugs or issues that may arise after the migration process has been completed. You never know when or why you might need to reference old fields or data on the new site, and having the information organized and clearly labeled makes the process much easier.

Divya Kandikatti is a Lead Business System Analyst at iCiDIGITAL. She works with clients throughout their AEM journey, working closely with the business, dev and QA teams to ensure delivery of a quality solution. She has 7+ years of experience gathering requirements for web-based applications, APIs, and CMS implementations. In her free time, she enjoys movies, traveling, and trying out new things like kickboxing and acrylic painting.