A couple of weeks ago I saw a great presentation by Joram Barrez of Dolmen called "The Full Stack". Basically he showed how they selected jBPM and SeeWhy and composed them into a full BPM solution. Since I went to the event without many expectations, this was a pleasant surprise. Things became even more interesting when Joram showed the performance tests they did to see if jBPM would meet their demands. We had wanted to do those tests for ages, but never got to it. So big thanks, Joram, for sharing them with us!
As you might know, process execution in jBPM can happen with or without persistence. These measurements were done *with* persistence, as this is the most common way jBPM is used.
He started with the simplest process. That took 2 milliseconds to execute.
Then he measured processes that grew longer and longer:
Then he verified the effect of simple and complex process concurrency:
Amen.
So even this complex process runs with only 12 ms of jBPM overhead! Awesome. Even I was surprised :-)
And the last test was a complete realistic process of handling a hospital report. That took 3 milliseconds to execute.
All of this shows that the overhead created by jBPM runtime process management is really small. Doing the state management of such processes yourself will lead to a lot more development time, and in many cases, the performance will not be as optimized as just using jBPM.
This only highlights the performance evaluation part of the talk. I definitely recommend reading his blog post for the full contents of Joram's presentation.
I had expected some reactions to the measurements, so here is some clarification:
- The examples I used were very academic. In our production systems, we don't get numbers which are as high as these ones.
However, people tend to call jBPM 'slow' when referring to the execution time of the whole business process.
This is not fair, since the logic that is attached to the nodes is most of the time the culprit. jBPM is most of the time plain Hibernate, and we all know Hibernate can be very fast.
- In the examples, the interactions with the processes were minimal or non-existent. This allowed me to do quick measurements, but it is true that this is not a real scenario.
- Hibernate's 'show_sql' was set to false.
This little parameter had a huge impact on the total time. When it was set to true, however, you could see that there are a lot of non-cached actions (inserts/updates).
- No logging to the JBPM_LOG happened. This has an impact, since a lot of logging happens by default.
- Log4j was configured at the ERROR level, which means that no logging was happening.
- The database was a locally installed MySQL database, configured to cope with a high level of traffic.
- The tests ran isolated only to test the jBPM overhead. No logic was added (the ActionHandler classes are empty).
- In my experience, the biggest contributor to better jBPM performance is Hibernate caching.
It is true that jBPM does a lot of queries, but many of them can be cached without any problems. In fact, 90% of the time jBPM is doing Hibernate stuff. The process definition is always the same (same nodes), so a lot of caching can happen there.
The examples I showed were quite basic and almost completely static, so caching had a big impact on the numbers (I used EHCache for the presentation).
Do note that this does not mean that everything is cached: all the inserts/updates must still happen, but the reads are very fast.
I haven't done a real comparison with caching disabled; that would certainly make a big difference. But then again, who uses Hibernate without using a cache?
- When I run my 'realistic' business process at this very moment (note: 2 JBoss servers, 2 Eclipse instances and some databases are running, so the numbers are a bit higher than usual) I get about 10 ms when I take the average difference between the start timestamp and end timestamp which are saved in the JBPM_PROCESSINSTANCE table.
However, if you take the total running time, the whole measurement takes 65 ms/process. That figure is misleading, however, since XML is generated in a separate thread and the task lists are managed by 2 other threads. So the average difference between the timestamps is all I have. Unless these are wrong (which is hard to believe), I think that this is quite fast.
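The timestamp-based figure above boils down to a simple average of (end - start) per process instance. A minimal sketch of that arithmetic follows; `InstanceTiming` and `TimingStats` are hypothetical names used only for illustration, and reading the timestamps from the JBPM_PROCESSINSTANCE table is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

/** Hypothetical holder for one process instance's start and end timestamps. */
final class InstanceTiming {
    final Instant start;
    final Instant end;

    InstanceTiming(Instant start, Instant end) {
        this.start = start;
        this.end = end;
    }
}

/** Hypothetical helper; not part of jBPM. */
final class TimingStats {
    /** Average of (end - start) over all instances, in milliseconds. */
    static double averageMillis(List<InstanceTiming> timings) {
        if (timings.isEmpty()) {
            throw new IllegalArgumentException("no timings");
        }
        long total = 0;
        for (InstanceTiming t : timings) {
            total += Duration.between(t.start, t.end).toMillis();
        }
        return (double) total / timings.size();
    }
}
```

Because each instance is measured between its own two timestamps, work done afterwards in other threads (such as the XML generation mentioned above) does not inflate the average.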
To conclude: I'm not saying the numbers I showed can be mapped to real production systems.
In my opinion, jBPM is at a high level nothing more than a state machine implemented with Hibernate.
We know that Hibernate can be fast, so why would jBPM be slow?
The measurements I did were only to check whether the run-time overhead was decent enough for this customer, which was the case.
Comments and remarks are appreciated!
Joram,
Thanks for sharing that! It's very useful information.
regards, tom.
What I have seen to be the jBPM bottleneck is ActionHandlers, and parsing them. I don't believe such numbers would hold with action handling in the nodes.
Pou,
What do you mean by 'parsing actionhandlers'?
In my measurements the logic in the ActionHandlers was empty (only a signal), since I only wanted to test the overhead of jBPM.
In a few days, I will have some source code online about these measurements: keep an eye out for the update on my blog.
I mean configuring an ActionHandler through fields or setters. Runtime configuration of the ActionHandler is done by parsing the configuration XML every time the action handler is executed. This parsing is very slow compared to the execution of the process instance.
Pou,
I haven't done any measurements with an ActionHandler using this kind of parsing, since in our production systems we use Spring for the dependency injection.
The dependencies are injected once by Spring and then the action handler bean is used in the jBPM process.
But you have a point. I myself would like to see a better integration with Spring (since it is the de facto container for that sort of stuff). This would also help with the performance issues you have.
We use reusable ActionHandlers, so we configure every instance of the action handlers, like:
<action name="x" class="x.y.z">
  <param1>param1</param1>
  <param2>param1</param2>
  <param3>param1</param3>
</action>
This is very slow when you have a big process in which every node has a configured delegate (action handler). I think the best option would be to have some form of the delegate instantiated and configured with the ProcessDefinition, and to make a copy when you need it for a ProcessInstance. This way, the delegate doesn't need to be configured at runtime, but only when the ProcessDefinition is created.
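The "configure once, copy per process instance" idea can be sketched in plain Java. This is only an illustration under assumptions: `EchoHandler` and `HandlerTemplate` are hypothetical names, not jBPM classes, and the configuration is passed in as an already parsed map rather than as XML.

```java
import java.lang.reflect.Field;
import java.util.Map;

/** Hypothetical handler; its String fields play the role of <param1>, <param2>. */
class EchoHandler implements Cloneable {
    String param1;
    String param2;

    /** Shallow copy, which is enough here because the fields are immutable Strings. */
    EchoHandler copy() {
        try {
            return (EchoHandler) clone();
        } catch (CloneNotSupportedException e) {
            throw new IllegalStateException(e); // cannot happen: class is Cloneable
        }
    }
}

/** Configures a prototype once (definition time), then hands out cheap copies. */
final class HandlerTemplate {
    private final EchoHandler prototype = new EchoHandler();

    HandlerTemplate(Map<String, String> config) {
        // The expensive part (reflection, and in real life XML parsing) runs once.
        for (Map.Entry<String, String> e : config.entrySet()) {
            try {
                Field f = EchoHandler.class.getDeclaredField(e.getKey());
                f.setAccessible(true);
                f.set(prototype, e.getValue());
            } catch (ReflectiveOperationException ex) {
                throw new IllegalArgumentException("bad field: " + e.getKey(), ex);
            }
        }
    }

    /** Called per process instance: no parsing, no reflection. */
    EchoHandler newInstanceCopy() {
        return prototype.copy();
    }
}
```

Each process instance gets its own object, so a handler that mutates its fields cannot affect other instances, while the configuration cost is paid only when the template is built.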
Indeed, this would certainly benefit performance.
Like I said, I haven't done any measurements with configurable action handlers (since I don't use them), but I'll try to look into them when I find some time.
Pou,
You're right. For a long time I believed that it should be possible for action handlers to be either stateful or stateless.
So that implies that we would need to introduce a configuration option to activate automatic caching of action handlers.
Now I realize that stateful action handlers (the ones that change their own member fields containing configuration) are so rare that we can ignore them. So if we can assume that action handlers don't change their own configuration, and if they are stateless, then they can be safely shared by threads and reused.
This new assumption is going to make it easier to add caching of action handlers. One way to do the caching that I'm currently considering is making sure that the instantiated action handlers somehow get cached in Hibernate's second-level cache. That strategy would save us from writing our own caching, and cache configuration would remain centralized.
But probably it will take some time before we can actually implement it.
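To illustrate why statelessness makes sharing safe, here is a minimal sketch of a handler cache. It uses a plain in-memory `ConcurrentHashMap` rather than Hibernate's second-level cache, so it is a stand-in for the idea and not the strategy described above; `GreetingHandler` and `HandlerCache` are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stateless handler: its configuration is fixed at construction. */
final class GreetingHandler {
    private final String greeting;

    GreetingHandler(String greeting) {
        this.greeting = greeting;
    }

    /** Reads but never modifies its own fields, so concurrent calls are safe. */
    String execute(String name) {
        return greeting + ", " + name;
    }
}

/** Builds one handler per configuration key and shares it between callers. */
final class HandlerCache {
    private final ConcurrentMap<String, GreetingHandler> cache = new ConcurrentHashMap<>();
    private final AtomicInteger builds = new AtomicInteger();

    GreetingHandler get(String configKey) {
        // computeIfAbsent runs the factory at most once per key.
        return cache.computeIfAbsent(configKey, key -> {
            builds.incrementAndGet(); // counts how often the setup cost is paid
            return new GreetingHandler(key);
        });
    }

    int buildCount() {
        return builds.get();
    }
}
```

The same instance is returned for every request with the same configuration, which is only correct as long as handlers never change their own fields after construction.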
Why in Hibernate's second-level cache? What happens when you don't use persistence, or at least not jBPM's own persistence through Hibernate?
From what I've seen, a good way to make the action handler configuration dynamic would be custom VariableResolvers, with those variable fields resolved dynamically.
I would like to see that in PVM, action handler configuration is not done at runtime.
Thanks for sharing.
Joram, it would be great if you could benchmark an application with configurable action handlers, so we can see whether this is a performance weakness of jBPM.
Sir, I need to access the database from my process forms, so I was wondering how to do this. jBPM uses JSF, but how are the JavaBeans mapped to the JSF tags? I don't know, so please guide me. Where exactly are the JavaBeans for tasks, pilist, etc. mapped: in faces-config.xml or somewhere else?
What about jBPM with Seam?
Doesn't that solve the problem for stateful/stateless ActionHandlers?
(since the Node logic is put in a reusable contextual Seam component: EJB or POJO).
With Seam it actually is possible. Good thinking!
Are you kidding or what? jBPM doesn't withstand any significant workload. Real-world experience.
And what about ETL processes? I use jBPM in an app and it works fine (well, not always, but it's OK).
But when I start an ETL process with jBPM I have a lot of problems: Hibernate cannot open connections, or there are JTA problems.
Does anyone have any ideas?
My question is how to execute a jBPM process in an EJB or Servlet. I think it is very important, but I cannot find any sample on gool.com.
Thank you.