Friday, July 23, 2021

How Granular Should My Microprocesses Be?

As with all modularization principles, finding the right granularity is not always trivial, yet all the more important. Some of us have seen complete projects fail because they got this wrong. The Microprocess Architecture is no exception to this rule, and in the following I discuss this topic, hoping to guide you in getting it right.

As explained in the article introducing the Microprocess Architecture, the rationale for applying it is a combination of reducing the impact of implementing new features and bug fixes, easing their application to an already deployed business process, supporting parallel development, and a few others. Said differently, and in one word: agility.

To correct the mistake made in the introductory article of not defining what a microprocess stands for:

A microprocess is a subprocess of a larger business process, where the subprocess spans the execution of one or more activities to reach a business-significant state change of the business process, and which can be developed and deployed as a stand-alone component.

This definition implies that the scope of a microprocess has business visibility. However, that by itself does not yet clarify the right granularity: too coarse-grained and you risk not delivering on the core value of agility; too fine-grained and you risk issues with performance and scalability.

 

Too coarse-grained

Too fine-grained
So, what is the right granularity? First let me illustrate by example. I will then capture some of the main characteristics that should guide you in applying it to your use case.

Order handling example

An order handling process of a bank, which starts with a customer submitting an order form and ends with invoking one or more back-end systems to handle the delivery, could consist of the following microprocesses:

  • Customer checks: execution of several checks to determine if the bank can and should provide the product to the customer (e.g. criminal record check, credit check, etc.). This could involve orchestration of several calls to back-end systems, and even to services external to the bank, and may involve human intervention to deal with the situation that one or more checks fail (alternate scenarios). The state reached by this microprocess is “customer validated”. In the happy scenario where all checks succeed, all is done in a time span of seconds. But when one or more checks fail and some bank employee must decide, it could take hours or even days (especially with a weekend in between).

  • Generate quote: determining the price and conditions for the ordered product(s). For a customer order for a combination of a current account, a savings account and a credit card, this could involve the orchestration of calls to 3 different back-end systems (to get the price and conditions of each individual product), plus a call to some business rule to check if it is a valid product combination, another business rule to determine if some discount should be applied, and finally a call to some service to retrieve the conditions for the order (see the sketch after this list). The “quote generated” state is reached when the price and conditions of the order are presented to the customer, either online (in the same session) or by sending a link to some secure inbox to review it later (when the session is already closed). In the happy scenario all is done in a time span of seconds.

  • Sign order: signing of the order by all required signers. This can be anything from signing by one individual for a private account, up to some board of directors of a company in case of a business account. In the latter case the time span might range from minutes up to days or even weeks. The “order signed” state is reached when the order is signed by all signers.

  • Finalize order: execution of steps to persist all data, determine delivery dates, send an order confirmation to the customer and initiate the back-end system(s) to deliver the products. The “order finalized” state is reached when order delivery has started. In the happy scenario this is done in a time span of seconds.
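To give an impression of what happens inside such a microprocess, here is a minimal sketch of the Generate quote orchestration in Python. Mind that all functions are hypothetical stand-ins for the back-end and business rule calls, not anything OIC provides:

    def price_product(product: str) -> float:
        """Stand-in for a back-end call returning the monthly price of one product."""
        return {"current account": 2.50, "savings account": 0.00, "credit card": 5.00}[product]

    def is_valid_combination(products: list[str]) -> bool:
        """Stand-in for the business rule that validates the product combination."""
        return len(set(products)) == len(products)  # e.g. no duplicate products

    def discount_for(products: list[str]) -> float:
        """Stand-in for the business rule that determines the discount."""
        return 0.10 if len(products) >= 3 else 0.0

    def generate_quote(products: list[str]) -> dict:
        """Orchestrate the calls; ends in the "quote generated" state."""
        if not is_valid_combination(products):
            raise ValueError("invalid product combination")  # alternate scenario
        gross = sum(price_product(p) for p in products)      # one back-end call per product
        total = gross * (1 - discount_for(products))         # apply the discount rule
        return {"products": products, "monthly_price": total, "state": "quote generated"}

    # Example: quote for the product combination from the example above
    quote = generate_quote(["current account", "savings account", "credit card"])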

At a higher level there will be a process in which you can clearly recognize each individual step, be that as activities in a structured process flow (BPMN), or as activities in a case management application, like below:

When drilling down in any of the activities you might find a relatively complex structured process model. For example, the structured process backing the Sign Order activity generates the order agreement, handles multiple signers that may sign over a somewhat longer period of time, includes a loop for sending reminders after some deadline has been reached, and handles order cancellation when it has expired.


All states can be reached by the higher-level process in a timespan of a few seconds to minutes, which qualifies it as near “straight-through processing” (near-STP). But in case of issues, like some external system being unavailable, human intervention by an Applications Administrator may be required, which might not even happen the same day.

Microprocess characteristics

The following characteristics can help with determining the right granularity for your use case:
  • All activities in the same microprocess are tightly coupled to achieving a state of the process that is meaningful to the business. Put differently: when you can’t explain its purpose to a businessperson, it’s not a microprocess.
  • Although the happy scenario might concern near-STP, a microprocess typically involves human intervention for handling alternate or exception scenarios (*). When no human intervention of any kind is applicable, it’s not a microprocess. Therefore, microprocesses are asynchronous by definition.
  • Processing of the average microprocess has a timespan ranging from seconds to days (in case of human intervention); weeks are very rare exceptions. Therefore, the chance of a future need to “patch” an in-flight microprocess instance (that is: migrate it from one version to the next) is minimal if not non-existent. When in-flight instance migration is expected to be commonly required, it’s too coarse-grained, so don’t implement it as a microprocess.
  • With very few exceptions, a microprocess can be replaced by a newer version without impact on any of its peers. There can be some impact on the higher-level process, which tends to be restricted to its interface (for example, an extra id that needs to be passed on). Vice versa, changes at the higher-level process do not impact a microprocess. When a change has a high probability of impacting a peer, it’s too fine-grained, which implies the two should be part of the same microprocess.
  • A microprocess is an autonomous deployable unit and can be deployed on a different tier than the higher-level process. Moving a microprocess from one tier to another will only have impact on the endpoint used by the higher-level process.
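To illustrate that last characteristic: the higher-level process could resolve microprocess endpoints from a simple registry, so that moving one to another tier stays a configuration-only change. A minimal sketch, with all names and URLs made up for illustration:

    # Hypothetical endpoint registry; names and URLs are made up for illustration.
    # The higher-level process resolves each microprocess by logical name, so
    # moving "sign_order" to another tier is a one-line configuration change.
    MICROPROCESS_ENDPOINTS = {
        "customer_checks": "https://tier-a.bank.example.com/process/customer-checks/2.0",
        "generate_quote":  "https://tier-a.bank.example.com/process/generate-quote/1.0",
        "sign_order":      "https://tier-b.bank.example.com/process/sign-order/3.1",
        "finalize_order":  "https://tier-a.bank.example.com/process/finalize-order/1.0",
    }

    def endpoint_for(name: str) -> str:
        """Resolve the endpoint of a microprocess by its logical name."""
        return MICROPROCESS_ENDPOINTS[name]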

(*) Alternate scenarios in the end reach the same result as the “happy scenario” but in a different way. Exception scenarios are those when things go wrong and someone (typically an Applications or Systems Administrator) must intervene to put the process back on track.

As the goal of business process automation is to reduce human intervention, the end result might be a process without any human task (mind that this is still a business process). Consequently, a microprocess does not need to have human tasks, but human intervention will still be applicable to recover from exception scenarios.

As in the example given, microprocesses orchestrate human intervention with zero or more (synchronous or asynchronous) “services”. The services orchestrated by a microprocess can, in turn, be built using different technologies, ranging from a synchronous web service (like an OIC Integration) to an asynchronous structured (BPMN) process of its own. However, the latter does not qualify as a microprocess; it’s just another way of implementing a “service”.

Mind that not every activity in the higher-level process necessarily concerns a business process state change. Most business processes include a few “technical” activities for housekeeping-like logic and for keeping track of technical state changes (like “process initiated”, “service errored”, etc.). A typical example of such activities is some “Initiate Process” activity (the “Receive Order” in the example) which enriches and persists the data from the start request message and instantiates the internal business object the higher-level process works with. Such activities have no meaning to the business (and therefore do not concern microprocesses) but you can’t leave them out of the model either. They are typically tightly coupled to the higher-level process, tend to be part of the same deployable unit, and are often implemented as structured processes to provide points of recovery in case of issues.
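Purely as an illustration of such a housekeeping activity, a minimal Python sketch of an “Initiate Process” step; the names and the file-based persistence are made-up stand-ins:

    import json, uuid

    def initiate_process(start_request: dict) -> dict:
        """Hypothetical "Initiate Process" housekeeping step: enrich and persist
        the start request and return the internal business object. All names
        and the file-based persistence are illustrative stand-ins."""
        order = dict(start_request)
        order["orderId"] = str(uuid.uuid4())   # enrich with a generated id
        order["state"] = "process initiated"   # a technical, not a business, state
        with open(f"/tmp/order-{order['orderId']}.json", "w") as f:
            json.dump(order, f)                # stand-in for persisting a point of recovery
        return order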

Friday, April 23, 2021

OIC: Working Around not Having Complex Gateway

In this article I describe how you can work around not having the Complex Gateway in OIC Structured Process. I will end with what I believe to be the best workaround with respect to support for refactoring.

The Complex Gateway in BPMN 2.0 is one of the least used features of BPMN. However, when you find a use case for it, you may also find that there is no alternative way to model it, or at least no easy one. The challenge is that in case of parallel flows (be it via a Parallel or an Inclusive Gateway), the token must move to the merge gateway for each individual flow before it can move further.

One typical use case that I have run into a couple of times is the one where, at a specific point in the process, there is more than one way to do something, and either one of them might happen, after which the process can move on to the next activity. In case there are only two ways, most of the time you can model this by adding an interrupting Boundary Message or Timer Catch event to the activities.

For example, in the next process there is a Human Task in an order handling process where 1 of 3 events can happen:

  1. The order is handled
  2. The order is cancelled
  3. The order is expired

Obviously, handling is done via the task. Cancellation can be handled by the Message Boundary Event that exposes a "cancel" operation that - when called - will interrupt the task and move to the end. Expiry can be done via the Timer Boundary Event that - when expired - will interrupt the task and move to the end.

Things become more complex when you have two parallel tasks. In the following example an order can be updated by the Customer via the Update Order task, as long as it is not yet handled by the Clerk via the Handle Order task. The Customer can also cancel the order, which should withdraw the Handle Order task.


The solution chosen in this model is a Message Boundary Event on each Human Task: one to withdraw the Update Order task and one to withdraw the Handle Order task in case the Customer cancelled the order. It uses an integration that can be called after either one of the tasks is executed, which in turn will call the Message Boundary Event of the other task to withdraw it.

A cleverer solution is a more generic integration that leverages the OIC tasks API to withdraw any task based on a unique identifier (identificationKey). I could then leave the Message Boundary Events out of the model for withdrawing Human Tasks, but that would not work for other types of activities or events. "Hey, but what about the Event-Based Gateway?", you may think now. Not going to work, because you cannot use that in combination with a Human Task.
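To give an idea, a minimal sketch of what such a generic integration could do, written as a Python script. The endpoint, query parameters and payload are assumptions on my side; verify them against the OIC Process REST API reference of your version:

    import requests

    OIC_BASE = "https://myinstance.example.com"  # hypothetical OIC host
    AUTH = ("integration.user", "secret")        # use OAuth in any real setup

    def withdraw_task(identification_key: str) -> None:
        """Find open tasks carrying the given identificationKey and withdraw them.
        The endpoint, query parameters and payload are assumptions based on the
        OIC Process tasks REST API; check them against your version's reference."""
        resp = requests.get(f"{OIC_BASE}/ic/api/process/v1/tasks",
                            params={"status": "ASSIGNED", "keyword": identification_key},
                            auth=AUTH)
        resp.raise_for_status()
        for task in resp.json().get("items", []):
            requests.put(f"{OIC_BASE}/ic/api/process/v1/tasks/{task['number']}",
                         json={"action": "WITHDRAW"},
                         auth=AUTH).raise_for_status()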

As you can see, the second example is much more complex than the first one, even though I left the order expiry use case completely out of the picture. In contrast, the model would look pretty simple if we had a Complex Gateway, even with the expiry use case included (note: I "photo-shopped" the Complex Gateway 😉):

You will appreciate that with the workarounds mentioned before, adding alternate scenarios (like a third human task, or an operation to support an update via an integration instead of a task) will soon make any model incomprehensible, even for the most experienced BPMN modeler. It just does not scale.

The best approach I could think of is as in the next model, which I have implemented for a real customer case:

 

The trick is that each individual, parallel flow has its own Embedded Subprocess with a Message Boundary Event. The advantage over the previous workarounds is that within the Embedded Subprocess I'm now free to expand or change the logic by adding or removing activities, events or gateways without breaking the interface of the operations of the overall process. The Cancel Polling and Withdraw Task activities both call one single integration that invokes the boundary event of the other one. With a little bit more effort I can extend it to support more parallel flows.
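The idea behind that single integration could be sketched as follows; the operation names and the process URL are hypothetical, as the actual endpoints are whatever operations your process exposes for the two Message Boundary Events:

    import requests

    PROCESS_BASE = "https://myinstance.example.com/process/order-handling"  # hypothetical

    # Maps the flow that just completed to the boundary-event operation that must
    # interrupt the other flow. Operation names are made up for illustration.
    CANCEL_OPERATION = {
        "task_flow":    "cancelPolling",  # task completed -> interrupt the polling flow
        "polling_flow": "withdrawTask",   # polling completed -> interrupt the task flow
    }

    def cancel_other_flow(completed_flow: str, order_id: str) -> None:
        """Invoke the Message Boundary Event of the flow that did not complete."""
        operation = CANCEL_OPERATION[completed_flow]
        requests.post(f"{PROCESS_BASE}/{operation}",
                      json={"orderId": order_id}).raise_for_status()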

Obviously, adding or deleting any of the parallel flows will require the integration to be changed. Still not ideal, but at least this scales much better than any of the other workarounds.

Wednesday, March 31, 2021

OIC: Somewhat "Hidden" Feature: Reusable Subprocess

This blog article describes how to configure input and output parameters for Reusable Sub-processes (BPMN Call Activity) in OIC Structured Process.

More than once I have seen that OIC Process developers are not using the Reusable Subprocess in Structured Process, simply because they are not aware that you can pass arguments to the Start event and from the End event, like you can for processes that start and end with Message Events.

I will not discuss the benefits of and when to use a Reusable Subprocess, I will save that for a blog posting soon to come.

The thing is, the parameters are somewhat hidden. You create a Reusable Sub-process by creating a process with a None Start and None End event, as in the picture below. To add parameters to both events, do the following:

  1. Click on the Start Event.
  2. From the hamburger menu choose Open Properties. This will show you the properties of the Start Event, where (probably to your surprise) you cannot find the input parameters.
  3. With the Properties tab of the Start Event still shown, click anywhere on the process canvas as long as it is not another component (event, activity, flow, gateway). Ta-da!!

 


You can now use the Reusable sub-process in any other process using the Call Activity:


As you can see, you can pass arguments to it on the Input tab of the mapper, as well as receive arguments from it on the Output tab.


That easy ;-)

Wednesday, March 24, 2021

OIC Process Correlation: Take Good Care of Your Properties

The other day we had an issue with correlating process instances, which turned out to be caused by some "mistake" we made. It took quite some time to figure out, so I thought I'd share it with you, hoping it can save you some time.

First I will explain what correlation is about (you may want to check out a much more elaborate blog article on how to use it in OIC-Process by Martien van den Akker, or another one from Anthony Reynolds explaining the concept in the context of BPEL). Correlation in OIC-Process is not really different from what it is in Oracle BPM Suite, so if you already know that, or have done it before in OIC, you can skip the next paragraph.

When one process instance is calling another one, and of the latter there may be multiple instances, you need a way to make sure the second process calls back the right instance of the first process. That is done by making sure that the instance to call can be uniquely identified, or "correlated" as it is called. In many cases correlation is out-of-the-box, like for synchronous calls or asynchronous calls using WS-Addressing. When there is no out-of-the-box correlation, you need to configure it explicitly using what is called "message-based correlation". That means that instances are correlated using a key (a value or combination of values) which is (part of) the message that is sent from one instance to the other. In OIC that key is called (not surprisingly) the "correlation key" (same as "correlation set" in BPEL). The correlation key has one or more "properties" for which the (combination of) values must be unique, in such a way that at any time there cannot be two or more instance flows of the calling process using the same correlation key value(s).
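To make the concept tangible, here is a conceptual sketch of message-based correlation in plain Python. This is just to illustrate the mechanism, not how OIC implements it:

    # Conceptual illustration of message-based correlation (not OIC internals).
    # Waiting instance flows register under their unique correlation value; an
    # incoming callback carries that same value and is routed to exactly the
    # flow that initiated the correlation.

    waiting: dict[str, str] = {}  # correlation value -> waiting instance flow

    def initiate(flow_id: str, correlation_value: str) -> None:
        """At the Send activity: register the unique correlation value."""
        if correlation_value in waiting:
            raise ValueError("correlation value must be unique among live flows")
        waiting[correlation_value] = flow_id

    def correlate(correlation_value: str) -> str:
        """At the Receive activity: route the callback to the right flow."""
        return waiting.pop(correlation_value)

    # Two parallel flows of the same parent, each with its own unique value:
    initiate("parent-1/flow-A", "order-42-A")
    initiate("parent-1/flow-B", "order-42-B")
    assert correlate("order-42-B") == "parent-1/flow-B"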

The issue we ran into is that we had to call the same child Structured Process from a parent Structured Process in parallel, and that we made a "mistake" in defining the correlation keys. The mistake was that we defined 2 correlation keys for the 2 parallel flows, calling the same Structured Process but using different properties. When using 2 correlation keys sharing the same property, it worked.

The actual process model is similar to the following:


Both parallel flows call the same process. In the example the child process takes a string input argument and performs a callback using that same value. The child is started in the Send activity. To make sure that the proper flow is called back in the Receive activity, correlation must happen in the callback. The way to do so is to set up 2 different correlation keys, 1 for each flow, and then initiate a unique correlation in the Send activity and correlate on that unique value in the Receive activity. In the example this is done as follows:


 

This shows how two correlation keys, "ck1" and "ck2", are defined, both having the same "property1". The next picture shows how correlation is initiated in the Send activity:

 



The correlation happens in the Receive activity as shown in the next picture:

 


What could possibly go wrong, right? Well, you could define your correlation keys like this:

 


The difference is that the correlation keys now have different properties. The net effect is that, although either one or both sub-processes finish without any issue, the parent still waits for a callback that will never come:

 

When setting up correlation with the same process, there should be no reason to have different properties in the correlation keys, so other than being an inconvenience when you accidentally do it wrong, it should not be a practical problem. I do hope, though, that in some next release this issue goes away, either by supporting different properties (there might be a reason why that is not feasible) or by blocking you from configuring it wrong.

By the way, another easy-to-make mistake is to use "Initialize" instead of "Correlate" in the activity that should correlate. Happens to the best of us.