Thursday, February 03, 2022

OIC: Parallel Gateway and Multi-Threading, the Work-Around

Since I started using BPMN I have twice run into situations where people confused the BPMN semantics of the Parallel and Inclusive Gateway with the runtime behavior of the engine. In this article I will explain the difference, and how to implement a kind of 'multi-threading' in an Oracle Integration (OIC) Structured Process.

In BPMN 2.0 the Parallel Gateway is a modeling concept, and the specification does not say how it should be implemented by the BPM engine vendor. It might surprise you that not only (current) Oracle Integration but also BPM process engines like Activiti and Camunda do not execute parallel activities at exactly the same time (in parallel threads). Instead, they typically execute each branch until it reaches a wait state (like a User or Receive activity) before the next one executes, and there are some good reasons for that.

First, let me explain what the Parallel Gateway (and the Inclusive Gateway, for that matter) means in BPMN 2.0. From the OCEB Certification Guide, Second Edition, paragraph 6.6.2 (OCEB 2™ is a professional certification program developed by the Object Management Group):

If the sequence flow is split by a parallel gateway, each outgoing sequence flow receives a token. Conditions in the sequence flows are not permitted. The parallel gateway waits for all tokens for synchronization. In this case, the number of tokens corresponds to the number of incoming sequence flows. It is not specified whether the activities A, B, and C shown in the example of Figure 6.39 are executed at the same time. It is ensured, however, that the flow at the parallel gateway is not continued until all three activities have been completed and the tokens arrived.

As you can see, as far as the Parallel Gateway is concerned, BPMN 2.0 does not imply multi-threading. As a matter of fact, BPMN 2.0 does not specify how vendors should implement their engine at all, other than that it should comply with the BPMN 2.0 semantics.

Now, I'm not from Oracle Product Development, but if I were in their shoes, my reason for not supporting multi-threading (at least not unlimited) in a cloud-native offering like Oracle Integration would be that you don't have any control over the number of threads the customers' applications might instantiate. Obviously that implies there can be surges in memory and CPU usage that might compromise the overall performance or even the stability of the environment, and with that the SLA you want to be able to maintain.

Besides performance-related arguments there are also some logical issues. All flows could potentially update the same entity at the merge and arrive there at the same time, which implies that some locking pattern is needed to prevent deadlocks at the merge, or (alternatively) the tool should let the developer decide and configure how the merge of data changes should take place. That in turn comes with performance or complexity challenges of its own.

Just to be clear: I'm not claiming there are no BPM engines that support multi-threaded Parallel Gateways, just none of the ones I know. By now I hope you understand why they might not, and if they do, there are probably consequences, as resources simply are not unlimited.

And still, you may have a use case where you need some sort of 'multi-threaded' parallel execution. For example, a Structured Process is started from a UI and the user expects a user task to be scheduled within a few seconds. Or there are 2 user tasks in a row, both assigned to the same user, who expects a seamless transition from one to the other (what was called a "sticky user" in Oracle BPM 10g). In between, multiple services need to be called that are not all that quick. When the Parallel Gateway and, by the way, also Integration do not support out-of-the-box multi-threading, how do you achieve that anyway?

(Spoiler alert!) The short answer is: by calling each synchronous service from its own asynchronous Service Handler, where each Service Handler does a synchronous service call, and having the individual flows of the Parallel Gateway call these Service Handlers.

To elaborate, let me start by explaining that the Service Handler is a pattern I use for calling a service (Integration) from a Structured Process of its own. I will explain it better some other time, but for now it is good enough to know that the Service Handler does nothing else than call one service and handle any fault it may raise. By doing so you prevent the technical complexity that might be involved in exception handling from being exposed in the main flow, and as a bonus a Service Handler with complex exception handling can easily be copied and reconfigured to call another service, so it is also a development productivity booster. I rest my case.
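To make the idea tangible, here is a minimal sketch of the Service Handler concept in plain Python. This is my own analogy, not OIC code; the names `flaky_service` and `service_handler` are made up for illustration. The point is that the handler wraps exactly one service call plus all its fault handling, so the calling "main flow" stays clean:

```python
import time

def flaky_service(payload):
    """Stand-in for a synchronous Integration call; may raise a fault."""
    if payload.get("fail"):
        raise ConnectionError("backend unavailable")
    return {"status": "OK", "echo": payload}

def service_handler(payload, retries=2, delay=0.1):
    """Call one service and absorb its faults (retry, then fallback).

    All exception-handling complexity lives here, not in the main flow.
    """
    for attempt in range(retries + 1):
        try:
            return flaky_service(payload)
        except ConnectionError:
            if attempt == retries:
                # Out of retries: return a fault result instead of raising
                return {"status": "FAULTED", "echo": payload}
            time.sleep(delay)

# The main flow only ever sees a clean, single call:
print(service_handler({"id": 1}))                # {'status': 'OK', 'echo': {'id': 1}}
print(service_handler({"id": 2, "fail": True}))  # {'status': 'FAULTED', 'echo': {'id': 2, 'fail': True}}
```

Because the fault handling is self-contained, swapping in a call to a different service means changing only the wrapped call, which mirrors the copy-and-reconfigure productivity gain described above.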

Normally a Service Handler is implemented as a Reusable Subprocess, but to achieve some sort of parallelism we will use a Structured Process with a Message Start and a Message End event, which I call "Process as-a Service", and which is called from the flows in the Parallel Gateway with a Send/Receive activity combination (as in the picture above).

The beauty of it is that it is a piece of cake to transform a Service Handler from a Reusable Subprocess into a Process as-a Service, and likewise, changing the call to it is also very easy.

Reusable Process versus "Process as-a Service"

Now the trick to achieving parallelism this way is that a Receive activity implies a wait state, which means the Receive itself happens in a new transaction. The process engine will first execute the Send activity, right after that schedule the corresponding Receive activity, and in the same thread move on to the next flow of the Parallel Gateway until all of them have reached their Receive activity. And then it is ready to start receiving responses.

So all Send activities are still done sequentially, but in the meantime the Service Handler of each one can start calling its synchronous service. And because that happens in a process instance of its own, they are done in parallel. Once a synchronous service call is done, the Service Handler does the callback to the Structured Process with the Parallel Gateway. As simple as that!
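A rough Python analogy of this mechanism (again my own sketch, not OIC internals): the main process fires all Sends sequentially, each Service Handler runs in an "instance" of its own (here: a thread), and the Receives then collect the callbacks from a queue in whatever order they arrive:

```python
import queue, threading, time

def service_handler(name, delay_s, callbacks):
    """Each handler does one synchronous call, then 'calls back'."""
    time.sleep(delay_s)                      # the slow synchronous service
    callbacks.put((name, f"result of {name}"))

callbacks = queue.Queue()
flows = {"A": 0.15, "B": 0.15, "C": 0.15, "D": 0.15}  # flow name -> service time

start = time.perf_counter()
for name, delay_s in flows.items():          # the sequential Send activities
    threading.Thread(target=service_handler,
                     args=(name, delay_s, callbacks)).start()

# The Receive activities: block until every callback has arrived
results = dict(callbacks.get() for _ in flows)
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed ~{elapsed:.2f}s, versus ~0.60s if called one after the other")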

You may now wonder if and how this solution would compromise the performance and stability of OIC, so what is the caveat? Under high load there will be a performance penalty for sure. The engine must instantiate and handle an extra process instance for every asynchronous Service Handler, and at some point that will start to impact overall performance. However, it should not impact stability. The reason is that Send and Receive activities are message-based, meaning the messages sent to and by the Service Handler are put on a queue from which the engine can pick them up as soon as it finds the time. That is how resource exhaustion is prevented.

The big question now is: when does this work-around start to be interesting to you?

I have done some performance testing and found that pushing logic to a Reusable Subprocess adds some 40 ms compared to executing the same logic in the main thread. For me this is small enough not to have to think about whether I should use a Service Handler or not, not even when performance is key. So I always do. However, when comparing a Process as-a Service to a Reusable Subprocess, the former adds more than 100 ms of overhead. So when there are not that many parallel flows, and when the services are quick enough, this overhead will not justify doing them asynchronously. However, there will be some turning point where synchronous handling is going to be outpaced in spite of the overhead.

There are 2 dimensions impacting that turning point. The first is the number of services to call. In the case of synchronous calls, the Parallel Gateway will never be faster than the sum of the processing times of the individual services. In the case of asynchronous calls, it will never be faster than the time needed to initiate all Send and Receive activities, plus the overhead of using Send/Receive, plus the time used by the slowest service. Which points to the second dimension: the processing time of the service calls. The more parallel flows involved, or the longer the slowest service takes, the more attractive it becomes to do them asynchronously.
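That turning point can be put into a back-of-envelope model. The sketch below is my own simplification; the 112 ms per-flow overhead constant is purely illustrative (chosen to be in line with the "more than 100 ms" figure above), so treat the absolute numbers as an example of the reasoning, not as measurements:

```python
def sync_time(delays_ms):
    """Sequential Parallel Gateway: never faster than the sum of all calls."""
    return sum(delays_ms)

def async_time(delays_ms, per_flow_overhead_ms=112):
    """Send/Receive per flow: a fixed cost per flow plus the slowest call."""
    return per_flow_overhead_ms * len(delays_ms) + max(delays_ms)

# Four equal flows: the break-even sits where sum == n * overhead + max
for t in (140, 150, 160):
    s, a = sync_time([t] * 4), async_time([t] * 4)
    winner = "async wins" if a < s else "sync wins"
    print(f"{t} ms each: sync={s} ms, async={a} ms -> {winner}")
```

With these illustrative constants, 4 flows of 140 ms still favor the synchronous option, while 4 flows of 160 ms favor the asynchronous one, which matches the shape of the test results reported below even though the real break-even depends on your actual overhead.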

I have created a test application with a setup of 4 parallel flows in my gateway. I use an Inclusive instead of a Parallel Gateway. Performance-wise that does not make a difference, but it gives me the option to execute 1, 2, 3 or 4 flows at the same time. All flows call the same synchronous Integration, which in turn calls an external service that waits a configurable amount of time before returning a response. In this way I can vary both dimensions as I wish.


As you can imagine, I could have spent days trying out all kinds of combinations (they are countless). Like you, I also have many other things to do, so I limited myself to trying out and blogging about 2 test cases that I ran in a quiet hour, while making sure my wife was not watching Netflix or something. Just to give you some impression of where such a turning point might be. These test cases are:

  1. Initiating 4 parallel flows, each Integration called with the same delay;
  2. Initiating 4 parallel flows, 3 Integrations with a small delay and 1 with a larger one (the "slowest service").

And then I played with the delays until I found the point where synchronous and asynchronous were about equally fast.

What I found during test 1 is that with 4 parallel flows and the Integrations having an average response time of 150 ms, both options performed about the same. When the Integrations were faster (e.g. 140 ms), synchronous execution was quicker, and when they were slower (e.g. 160 ms), asynchronous was quicker.

What I found during test 2 is that with the 3 quicker ones having an average response time of around 125 ms, the asynchronous option started to outpace the synchronous one once the slow one took 290 ms or more.

Like I said, there are countless combinations I could have tried regarding the number of parallel flows and their response times, and there are also other aspects to consider, like the performance under stress of both the process and the services. So in practice you will have to load test your solution to find out what performs best in your case. Just remember: when using Service Handlers it is very easy to switch from one option to the other. What a great pattern that is!
