Saturday, August 17, 2013

Spring Batch High Availability


Here's a quick project I recently threw together to demonstrate how to set up a high-availability cluster for running Spring Batch jobs. HA here doesn't mean what you might typically think. In a batch environment, it means there is a configurable period of time after which a job that has 'stopped running' (that is, one that appears to be running although its underlying JVM has failed) is marked as 'failed' and restarted immediately on another node in the cluster. The design is as follows:

Master/Heartbeat Consumer/Launcher

Overview

- JVM that starts Jobs on itself or other nodes
- Heartbeat Consumer - runs a TCP socket server that listens for 'heartbeats' from other JVMs
- Responsible for failing a job over to another server should it fail on the node where it is running

Key Components

- MessageChannelJobLauncher - uses Spring Integration to serialize the Job's name and its parameters and send them out as a start request
- BatchHAService - service invoked periodically to check that registered executions are still running/stable; responsible for failing and restarting jobs that become 'ghosts'
- BatchHeartbeatConsumerService - responsible for consuming heartbeats and registering job execution ids
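
The 'ghost' check described above can be sketched in plain Java. This is only an illustration of the idea, not the project's code: the class name HeartbeatTracker, its method names, and the use of a millisecond timeout are all assumptions made for the example. The consumer side records the time of the last heartbeat per job execution id, and the periodic check returns the ids that have been silent for longer than the configured period so they can be failed and restarted elsewhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remembers when each job execution id was last seen
// in a heartbeat and reports the ones that have gone quiet for too long.
class HeartbeatTracker {
    private final Map<Long, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called by the heartbeat consumer for every execution id in a heartbeat.
    void recordHeartbeat(long jobExecutionId, long nowMillis) {
        lastSeenMillis.put(jobExecutionId, nowMillis);
    }

    // Called when a job finishes normally, so it is no longer monitored.
    void deregister(long jobExecutionId) {
        lastSeenMillis.remove(jobExecutionId);
    }

    // Called periodically: returns the executions whose last heartbeat is
    // older than the timeout and stops tracking them, so each ghost is
    // reported exactly once.
    List<Long> collectGhosts(long nowMillis) {
        List<Long> ghosts = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : lastSeenMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                ghosts.add(entry.getKey());
            }
        }
        ghosts.forEach(lastSeenMillis::remove);
        return ghosts;
    }
}
```

In a setup like the one described, the ids returned by collectGhosts would be the ones to mark as failed in the batch database before sending a new start request to another node.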

Client/Heartbeat Producer

Overview

- JVM that actually runs a Job
- produces Heartbeats - a list of registered/in-flight job execution ids
- uses a RemoteJobRegistry to persist job names in a shared batch database

Key Components

- JobExecutionRegisterListener - wraps any jobs and registers the job execution id in the current JVM
- JobExecutionRegisterListenerPostProcessor - responsible for applying the listener to all Jobs
- RemoteJobRegistry - wraps the 'JVM local' JobRegistry and persists the names of any Jobs on the local JVM in a shared database
- BatchHeartbeatClientService - responsible for accepting job execution registrations and publishing them as heartbeats
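
The client-side bookkeeping can be sketched along the same lines. Again, this is a minimal illustration under assumptions: InFlightExecutions and its methods are hypothetical names, not the project's classes. Spring Batch's JobExecutionListener interface does provide beforeJob and afterJob callbacks, so a listener wrapped around every job could delegate to something like this, and the heartbeat producer would publish the snapshot on each beat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch: holds the ids of job executions currently running
// in this JVM, which is the payload of each heartbeat.
class InFlightExecutions {
    // Sorted, thread-safe set, since jobs may start and stop on many threads.
    private final Set<Long> ids = new ConcurrentSkipListSet<>();

    // Would be called from a listener's beforeJob callback.
    void beforeJob(long jobExecutionId) {
        ids.add(jobExecutionId);
    }

    // Would be called from a listener's afterJob callback.
    void afterJob(long jobExecutionId) {
        ids.remove(jobExecutionId);
    }

    // Copy of the current ids in ascending order, safe to hand to a publisher.
    List<Long> snapshot() {
        return new ArrayList<>(ids);
    }
}
```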

The infrastructure pieces are:
- Job "start" requests are sent via JMS
- there is a shared Job Database (batch database) that includes a new table, Job_Entity, for persisting job names
- TCP server/client implemented in Spring Integration to publish the Job Execution Ids as part of the heartbeat
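
To make the heartbeat payload concrete, here is a small sketch of how a list of job execution ids could be turned into a single line of text for the TCP connection and parsed back on the other side. The wire format is not specified above, so the comma-separated encoding and the HeartbeatCodec name are purely assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: encodes a heartbeat as one comma-separated line of
// job execution ids, e.g. "3,7,42". The real format may well differ.
final class HeartbeatCodec {
    private HeartbeatCodec() {
    }

    // Producer side: turn the in-flight ids into a single line.
    static String encode(List<Long> jobExecutionIds) {
        return jobExecutionIds.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
    }

    // Consumer side: parse a received line back into ids.
    // An empty line means the sender has no jobs in flight.
    static List<Long> decode(String line) {
        List<Long> ids = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return ids;
        }
        for (String part : trimmed.split(",")) {
            ids.add(Long.parseLong(part.trim()));
        }
        return ids;
    }
}
```

A line-based format like this would sit naturally on top of a TCP connection that delimits messages by line endings, but any serialization would do as long as both sides agree on it.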

The code will be up on GitHub soon and I'll flesh it out properly, but it really is magic what Spring Integration can do.


1 comment:

  1. Do you have the code you mentioned in github? Could you please share your github repository details?

    -Lahiru
