Grabbit: A Content Sync Solution

Have you ever had trouble migrating large amounts of content from one AEM instance to another? Maybe you are upgrading to a new instance, or just trying to keep your various environments (Dev, QA, Prod, etc.) in sync. The traditional way of doing this, of course, is to use Content Packages (the .zip packages), or vlt rcp for large amounts of content. But these existing solutions have proven insufficient in terms of both time and space efficiency. Most of them use WebDAV with XML serialization and perform an HTTP handshake for every node, so any network latency is paid once per node and hurts content sync performance enormously.

This is where Grabbit comes in. Grabbit was developed as a content sync solution for one of our clients, with iCiDIGITAL members being the primary contributors. The initial releases of Grabbit proved extremely useful for the client, which led to Grabbit ultimately being open sourced.

How Does it Work?

Grabbit tackles the problem of content sync a bit differently. Instead of high-frequency HTTP handshakes between the source and destination (which vlt rcp, for example, typically uses), Grabbit creates a stream of data using Protocol Buffers. This avoids the latency issues mentioned above. A major feature that sets Grabbit apart is monitoring: the Grabbit user can monitor and validate which content sync jobs are currently running and what their status is.
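To make the latency argument concrete, here is a rough back-of-the-envelope sketch. It is not a measurement of Grabbit or vlt rcp; the function names and the numbers (node count, round-trip time, per-node transfer time) are assumptions chosen purely for illustration.

```python
def per_node_sync_seconds(node_count: int, round_trip_s: float, per_node_transfer_s: float) -> float:
    """Each node pays one full network round trip plus its own transfer time."""
    return node_count * (round_trip_s + per_node_transfer_s)


def streamed_sync_seconds(node_count: int, round_trip_s: float, per_node_transfer_s: float) -> float:
    """One request opens the stream, so the round trip is paid only once."""
    return round_trip_s + node_count * per_node_transfer_s


# Illustrative numbers only: 100,000 nodes, 50 ms round trip, 1 ms to transfer one node.
nodes, rtt, transfer = 100_000, 0.050, 0.001
per_node = per_node_sync_seconds(nodes, rtt, transfer)
streamed = streamed_sync_seconds(nodes, rtt, transfer)
print(f"per-node handshakes: {per_node:.0f} s, single stream: {streamed:.0f} s")
```

Under these assumed numbers the per-node approach spends roughly 85 minutes, almost all of it waiting on round trips, while the streamed approach takes well under two minutes. Real results depend on payload sizes, serialization cost, and the network in question.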

[Figure: Grabbit graph]

General Layout

There are two primary components to Grabbit: a Grabbit Client and a Grabbit Server, which run in the two AEM instances that you want to copy to and from, respectively. The Grabbit Client makes a request to the Grabbit Server to fetch content. The request is driven by a Grabbit configuration file, supplied by the user of the tool, which instructs the Grabbit Client to contact the Grabbit Server and fetch content from it. The configuration file can be supplied to the client in either JSON or YAML format, and it supports several configuration parameters. You can find a lot more detail on these here.
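As a rough sketch of what preparing such a configuration might look like, the snippet below builds one in Python and serializes it to JSON. The key names (`serverHost`, `serverPort`, `serverUsername`, `serverPassword`, `pathConfigurations`, `path`), the host name, and the content paths are assumptions for illustration; check the Grabbit documentation for the actual parameters before using anything like this.

```python
import json


def build_config(server_host: str, server_port: int, paths: list[str]) -> dict:
    """Assemble a client configuration; all key names here are hypothetical."""
    return {
        "serverHost": server_host,        # the AEM instance running the Grabbit Server
        "serverPort": server_port,
        "serverUsername": "<username>",   # placeholders, to be filled in by the user
        "serverPassword": "<password>",
        # One entry per content path to sync from the server.
        "pathConfigurations": [{"path": p} for p in paths],
    }


config = build_config("source.example.com", 4502, ["/content/siteA", "/content/siteB"])
config_json = json.dumps(config, indent=2)
print(config_json)
```

Generating the file from code rather than writing it by hand can be handy when the list of paths differs per environment, though a hand-written JSON or YAML file works just as well.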

More on Grabbit Monitoring

As briefly mentioned above, Grabbit allows you to monitor the content sync jobs. Under the covers, Grabbit uses Spring Batch and its querying features to monitor the status of a job. A Job in Grabbit is essentially a Spring Batch Job that syncs one path specified in the Grabbit configuration file, so there are as many Jobs as there are paths configured in that file. The monitoring API reports the Job status in JSON format, as below:

[Screenshot: Job status JSON returned by the monitoring API]

You can find more information about monitoring here.
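To illustrate how a script might consume a JSON status report with one entry per Job, here is a minimal sketch. The payload and its field names (`jobExecutionId`, `path`, `jcrNodesWritten`, `exitStatus`, `exitCode`, `running`) are hypothetical stand-ins, not the confirmed response format of the monitoring API.

```python
import json

# Hypothetical status payload: one entry per configured path (one Job each).
SAMPLE_STATUS = """
[
  {"jobExecutionId": 1, "path": "/content/siteA", "jcrNodesWritten": 1200,
   "exitStatus": {"exitCode": "COMPLETED", "running": false}},
  {"jobExecutionId": 2, "path": "/content/siteB", "jcrNodesWritten": 300,
   "exitStatus": {"exitCode": "UNKNOWN", "running": true}}
]
"""


def running_paths(status_json: str) -> list[str]:
    """Return the paths whose Jobs are still marked as running."""
    jobs = json.loads(status_json)
    return [job["path"] for job in jobs if job["exitStatus"]["running"]]


def total_nodes_written(status_json: str) -> int:
    """Sum the nodes written so far across all Jobs in the report."""
    return sum(job["jcrNodesWritten"] for job in json.loads(status_json))


print("still running:", running_paths(SAMPLE_STATUS))
print("nodes written:", total_nodes_written(SAMPLE_STATUS))
```

A polling loop built on helpers like these could wait until no Job is running before validating the synced content.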

Where to go Next?

Grabbit supports most of the newer versions of AEM and continues to be under active development. Its release details and artifacts are located on Bintray. Since Grabbit is open source, you can always submit issues, questions, or comments on the GitHub project. We hope you give Grabbit a shot!

 

Sagar Sane

Sagar Sane is a Technical Architect at iCiDIGITAL with over five years of professional web development experience. He has 4+ years of experience working with AEM / CQ, primarily focused on server-side development and integrations of AEM with other systems. He also works closely with the sales and architecture teams on various efforts, including project estimation and scoping, and designing and developing technical implementations for AEM projects. In his free time he enjoys music, cooking, and watching sports, especially cricket.