The other day we had
an issue with correlating process instances, which turned out to be caused by
a "mistake" we made. It took quite some time to figure out, so I
thought I would share it with you, hoping it can save you some time.
First I will explain what correlation is about (you may want
to check out a much more elaborate blog article on how to use it in
OIC-Process by Martien van den Akker, or
another one by Anthony Reynolds explaining the concept in the context of BPEL). Correlation in
OIC-Process is not really different from what it is in Oracle BPM Suite, so if you
already know that, or if you have done it before in OIC, you can skip the next
paragraph.
When one process instance calls another, and there may be
multiple instances of the latter, you need a way to make sure the second process calls back the right instance of the first process. That is done by making sure that the
instance to call can be uniquely identified, or "correlated" as it is
called. In many cases correlation works out of the box, like for synchronous calls
or asynchronous calls using WS-Addressing. When there is no out-of-the-box
correlation, you need to configure it explicitly using what is called
"message-based correlation". That means that instances are correlated
using a key (a value or combination of values) which is (part of) the message
that is sent from one instance to the other. In OIC that key is called (not surprisingly)
the "correlation key" (same as "correlation set" in
BPEL). The correlation key has one or more "properties" for which the
(combination of) values must be unique, in such a way that at any time there
cannot be two or more instance flows of the calling process using the same
correlation key value(s).
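To make the idea concrete, here is a minimal Python sketch of message-based correlation. It is a toy model and not how OIC implements it: the `CorrelationRegistry` class, its `initiate` and `correlate` methods, and the key name and values used are all hypothetical, made up for this illustration.

```python
class CorrelationRegistry:
    """Toy model of message-based correlation (hypothetical, not an OIC API)."""

    def __init__(self):
        # Maps (key name, property values) to the waiting instance flow.
        self._waiting = {}

    def initiate(self, key_name, values, flow_id):
        """Register a flow that expects a callback carrying these values."""
        entry = (key_name, tuple(values))
        if entry in self._waiting:
            # The value(s) must be unique among the waiting flows.
            raise ValueError(f"correlation value already in use: {entry}")
        self._waiting[entry] = flow_id

    def correlate(self, key_name, values):
        """Return the flow waiting for these values, or None if nothing matches."""
        return self._waiting.pop((key_name, tuple(values)), None)


registry = CorrelationRegistry()
registry.initiate("orderKey", ["order-1"], flow_id="parent-instance-A")
registry.initiate("orderKey", ["order-2"], flow_id="parent-instance-B")

# The callback message carries the value, so it reaches the right instance.
print(registry.correlate("orderKey", ["order-2"]))  # parent-instance-B
print(registry.correlate("orderKey", ["order-9"]))  # None
```

The second lookup shows what a failed correlation looks like in this model: the callback matches no waiting flow, so nobody is woken up.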
The issue we ran into is that we had to call the same child
Structured Process from a parent Structured Process in parallel, and that we
made a "mistake" in defining the correlation keys. The mistake was
that we defined 2 correlation keys for the 2 parallel flows calling the same
Structured Process, with each key using a different property. When we used 2 correlation
keys sharing the same property, it worked.
The actual process model is similar to the following:
Both parallel flows call the same process. In the example
the child process takes a string input argument and performs a callback using
that same value. The child is started by the Send activity. To make sure the
proper flow is called back in the Receive activity, correlation must
happen in the callback. The way to do so is to set up 2 different correlation
keys, 1 for each flow, then initiate a unique correlation in the Send
activity and correlate on that unique value in the Receive activity. In the
example that looks like the following:
This shows how two correlation keys, "ck1" and
"ck2", are defined, both having the same "property1". The
next picture shows how correlation is initiated in the Send activity:
The correlation happens in the Receive activity, as shown in
the next picture:
What could possibly go wrong, right? Well, you could define
your correlation keys like this:
The difference is that the correlation keys have different properties. The net effect is that after one or both of the
sub-processes have finished without any issue, the parent still waits for a
callback that will never come:
When setting up correlation with the same process, there should be no reason to have different properties in the correlation keys, so other than being an inconvenience when you accidentally do it wrong, it should not be a practical problem. I do hope, though, that in some future release this issue goes away, either because having different properties becomes supported (although there may be a reason why that is not feasible) or because you are blocked from configuring it wrong.
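Until then, a check like the one I am hoping for is easy to picture. The following Python sketch is purely hypothetical: the dictionary layout, the function name `find_mismatched_keys` and the property name "property2" are assumptions made up for this illustration and do not reflect any OIC API. It reports the correlation keys whose properties differ from those of the first key.

```python
def find_mismatched_keys(correlation_keys):
    """Return names of keys whose properties differ from the first key's.

    correlation_keys maps a key name to the list of its property names
    (a hypothetical representation, used for illustration only).
    """
    names = list(correlation_keys)
    if not names:
        return []
    reference = set(correlation_keys[names[0]])
    return [n for n in names[1:] if set(correlation_keys[n]) != reference]


# The setup that worked: both keys share the same property.
working = {"ck1": ["property1"], "ck2": ["property1"]}
# The setup that left the parent waiting; "property2" is an assumed name.
failing = {"ck1": ["property1"], "ck2": ["property2"]}

print(find_mismatched_keys(working))  # []
print(find_mismatched_keys(failing))  # ['ck2']
```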
By the way, another easy-to-make mistake is to use "Initialize" instead of "Correlate" in the activity that should correlate. It happens to the best of us.