Tuesday, March 21, 2017

Oracle BPM: Hiding Faults from BPM? Don't use Service Activity!

In this post I explain how you can hide faults from BPM by using (asynchronous) Send/Receive activities instead of (synchronous) Service activities.

When calling services from a BPM process, you should think about where you want faults to show up and where you want them handled. This is specifically of interest when you have some integration layer between your BPM processes and the external services they call, to abstract those services from the BPM process. Let's call this layer the Service Layer. I have seen such a layer in various forms: a Reusable Subprocess, a BPEL process in the same composite as the BPM process, a BPEL process in a separate composite, or a Mediator instead of BPEL. You may have such a layer to hide technical details from the business process, to cover some sort of custom exception handling, or to hide the message format of these external services from the BPM process (or a combination of all that). The latter might be because you don't have the luxury of doing message transformation in a service bus.

If the BPM process calls the Service Layer through a (synchronous) Service activity and that call fails, the main BPM instance will get into an errored state, and you will have to handle the error in the BPM process. This behavior might be exactly what you wanted to prevent with the Service Layer, for example because the Service call is in a parallel flow and you want to be sure that the fault does not impact processing of the other, parallel threads.

The following example shows what happens. It concerns a main BPM process that calls the synchronous ServicePS from the Service Layer, which in turn calls some other ServiceA that (finally) calls a FailingService that always fails. The example is a bit overcomplicated because I configured a fault policy in the synchronous services. You may be aware that I wrote another article explaining that this is not a good practice, but when creating this example I did not have that insight yet ;-) So bear with me and just ignore that these synchronous services are still in a "Running" state after they failed.

The following shows the synchronous BPEL of the ServicePS.

Because the whole chain of calls is synchronous from beginning to end, you will see that all synchronous services have the "Faulted" state. Because of the fault policy on the BPM process (the only one that makes sense in this case), the BPM instance is still running, but because the fault bubbled up to it, the BPM instance shows the error as well.
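The bubbling up described above can be mimicked outside of Oracle BPM. The following is only a minimal Python analogy, not Oracle BPM or BPEL code: plain function calls stand in for the synchronous chain, an exception stands in for the fault, and the names `ServiceFault`, `tracked` and `run_sync_chain` are made up for this illustration. It also leaves out the fault policies, so every component simply ends up "Faulted".

```python
class ServiceFault(Exception):
    """Stand-in for a fault raised by a service (hypothetical name)."""


def run_sync_chain():
    """Call the whole chain synchronously and return the state of every component."""
    states = {}

    def tracked(name, call):
        # Record the component as running, then let any fault pass through
        # to the caller after marking this component as faulted.
        states[name] = "Running"
        try:
            result = call()
        except ServiceFault:
            states[name] = "Faulted"
            raise
        states[name] = "Completed"
        return result

    def failing_service():
        raise ServiceFault("FailingService always fails")

    def service_a():
        return tracked("FailingService", failing_service)

    def service_ps():
        return tracked("ServiceA", service_a)

    def bpm_process():
        return tracked("ServicePS", service_ps)

    try:
        tracked("BPM process", bpm_process)
    except ServiceFault:
        pass  # the fault travelled all the way up to the BPM process
    return states


if __name__ == "__main__":
    for name, state in run_sync_chain().items():
        print(f"{name}: {state}")
```

Every caller waits for its callee, so a fault at the bottom of the chain has nowhere to go but up, all the way to the BPM process.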

Now let's refactor this to a solution where the Service Layer hides the fault from the BPM process. To do so, all calls from the BPM process to the Service Layer have to be asynchronous.

The following shows the asynchronous BPEL of ServiceAsyncPS_NP. 

Learning from my earlier mistake with the fault policy, this asynchronous service is now the only one in the chain with a fault policy. Because the FailingService failed, the (synchronous) ServiceA_NP failed as well. But because ServicePSAsync_PS is asynchronous, that is where the fault stopped.

The error can be recovered from there, and in the meantime the BPM process keeps running as if there is not a cloud in the sky.
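To illustrate why the asynchronous variant isolates the fault, here is a minimal Python analogy of the Send/Receive pattern. Again, this is not Oracle BPM or BPEL code: the class `AsyncServiceLayer`, its `send`, `work` and `recover` methods, and the request id and payload are all hypothetical, and a simple in-memory queue stands in for the asynchronous request and callback.

```python
from collections import deque


class ServiceFault(Exception):
    """Stand-in for a fault raised by an external service (hypothetical name)."""


def failing_service(payload):
    raise ServiceFault(f"FailingService cannot handle {payload!r}")


class AsyncServiceLayer:
    """Hypothetical asynchronous Service Layer with its own recoverable instances."""

    def __init__(self, backend):
        self.backend = backend
        self.inbox = deque()      # requests sent by the process
        self.callbacks = deque()  # responses waiting for the Receive
        self.recoverable = []     # faulted requests parked for recovery

    def send(self, request_id, payload):
        # One-way send: returns immediately and never raises a service fault.
        self.inbox.append((request_id, payload))

    def work(self):
        # Handle queued requests; a fault is parked here instead of re-raised.
        while self.inbox:
            request_id, payload = self.inbox.popleft()
            try:
                result = self.backend(payload)
            except ServiceFault as fault:
                self.recoverable.append((request_id, payload, str(fault)))
            else:
                self.callbacks.append((request_id, result))

    def recover(self, repaired_backend):
        # Retry the parked requests once the cause of the fault is fixed.
        self.backend = repaired_backend
        self.inbox.extend((rid, payload) for rid, payload, _ in self.recoverable)
        self.recoverable.clear()
        self.work()


def run_async_demo():
    layer = AsyncServiceLayer(failing_service)
    process_state = "Running"
    layer.send("req-1", "order 42")    # Send activity
    layer.work()                       # the Service Layer fails on its own
    state_after_fault = process_state  # the process never saw the fault
    parked = len(layer.recoverable)
    layer.recover(lambda payload: f"processed {payload}")
    _, result = layer.callbacks.popleft()  # Receive activity
    process_state = "Completed"
    return state_after_fault, parked, result, process_state


if __name__ == "__main__":
    print(run_async_demo())
```

The process only ever touches `send` and the callback queue, so the fault stays parked inside the Service Layer until someone recovers it.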

Because of the asynchronous nature of the Service Layer, this is not a decision you should take lightly. For example, stateful BPEL cannot be migrated, so any error in it cannot be fixed for running instances. It therefore might not be the silver bullet you were looking for.

1 comment:

Herberson Silva said...

My "silver bullet" for this case is the "Fault Management Framework" (https://docs.oracle.com/cd/E23943_01/dev.1111/e10224/bp_faults.htm#SOASE481).
That way there are no changes to the BPMN process.
I can still use the "service activity" and the Enterprise Manager to recover process instances in an error state.