Labro Dimitriou

An Architectural Blueprint, Part 2

The reasons for building a model

  • Read part one of this series

    Let's dive into the murky waters of modeling, describe some of its challenges, and provide an overview of the state of business process modeling.

    In my first article in this series (WLDJ, Vol. 3, issue 4), I discussed the importance of architectural blueprints and best practices in establishing repeatable ways to build robust, enterprise-wide integration solutions for an adaptable and agile enterprise. I then established that the service is the unifying construct that merges SOA and BPMS, with Web services as the underpinning of connectivity in highly distributed and ever-changing business ecosystems. SOA is an evolutionary step in distributed computing, while business process and BPMS are a new technology innovation - a first-class citizen in computing. I concluded with the notion of elementary business services (EBS) - small units of work made available to the enterprise via the enterprise info bus (yet another anachronistic and overloaded term). The portfolio of EBS delivers the ultimate guide to enterprise-level reuse. New business processes are orchestrated in near real time by aligning existing EBS, new business events, and human resources with adaptive corporate objectives.

    In this article I discuss emerging business process design patterns that provide BPM-centric architectural solutions to long-standing, enterprise-wide business challenges. After a look at what it takes to model, we'll deconstruct business processes into components we know well and try to understand the surrounding design challenges. Finally, I propose a taxonomy for business-process design patterns.

    Meanwhile, the board of the fictitious Car Insurance Agency whose get car insurance quote business process I was going to model has asked me to wait a bit before presenting the solution. They are going through a business process redesign phase and are about to approve a new policy. Under the new policy, all quotes have to be authored by an external underwriter. The underwriters come in via a pool from a virtual collaborative B2B exchange. Additionally, they are negotiating, as part of the process, a credit report check via secure Web services. Therefore, despite what I promised in my first article, and for the sake of completeness, I will have to defer that discussion to the next article.

    The Problem with Modeling
    Booch et al. tell us that "a model is a simplification of reality" and that "the best models are connected to reality"; but whose reality are we modeling? Furthermore, what kind of modeling paradigm or framework do we use to model the modeling framework? And why do we need models at all? They can be dangerous because they make you forget reality. Consider the following analogy: business domains are like dreams, and models are what you answer when somebody asks you to explain them. Pretty soon you remember your answers and forget the dreams. To that end, my objective is not to answer all these matters exhaustively, even if I could. It is merely to bring awareness of the underlying concepts and provide a brief account of generally accepted notions and challenges involved in (1) modeling business ecosystems with processes - business processes; (2) using modeling frameworks that model business processes; (3) selecting a language/syntax to present the modeling framework; and (4) selecting a graphic notation to visually render business processes.

    Business processes model collaborative business ecosystems well, and the BPMS framework successfully bridges the impedance mismatch between business and IT. But whose modeling standard should you use? BPML has been around for a while, but BPEL4WS seems to be the winner - albeit by the sheer weight of the proverbial 900-pound IT gorilla(s). Clearly, both standards use XML as the implementation choice. On the other hand, the UML camp is not standing still. Are UML sequence diagrams good enough for modeling? What is this buzz about OMG's model-driven architecture - not to be confused with aspect-driven architecture or domain-driven design? Not to mention, of course, numerous other standards such as XPDL, ebXML, XLANG, WSCI, and WSFL, and many others brewing in the academic and research world, such as YAWL (yet another workflow language). Now is a good time to dispel a common fallacy: contrary to popular belief, workflow and process share very similar aspects. That was true well before workflow software companies hijacked the term workflow to mean document flow and work allocation, and before BPM evangelists, myself included, wanted nothing to do with things of the past like <e>AI and workflow engines. (I use the <e> instead of the traditional E to denote AI beyond the firewall and across business ecosystems.) Clearly, graphical control-flow representation and graph theory are common aspects of workflow and process alike. So from now on, I'll use the terms workflow and process interchangeably.

    Business-process models don't encapsulate business domain-specific knowledge, yet they have expressive power in defining business content. On the other hand, this opacity of business content has a nondeterministic effect on the business protocol, making exception handling and compensating transactions more challenging.

    Two mathematical/modeling theories are primarily used to model processes: (1) Petri nets and (2) π-calculus, along with their variants and supplemental notions - such as statecharts and timed stochastic nets for the former, and ambient calculus for the latter.

    Petri nets were introduced by C. A. Petri in the early 1960s as a mathematical tool for modeling distributed systems and, in particular, notions of concurrency, nondeterminism, communication, and synchronization. π-calculus was defined by Milner, Parrow, and Walker in "A Calculus of Mobile Processes." Petri net-based languages perform better for state-based workflows but suffer increased complexity when modeling multiple concurrent processes and complex synchronization requirements.

    Simply put, graphs have nodes that are connected by edges. A Petri net is a kind of graph whose nodes are places (drawn as circles) and transitions (drawn as rectangles); a place can hold one or more tokens. Nodes of different kinds are connected by arcs, of which there are two kinds: input arcs (from a place to a transition) and output arcs (from a transition to a place). The state of a process is modeled by places and tokens, and state changes are modeled by the firing of transitions. Figure 1 demonstrates a B2C and a B2B interaction as Petri nets: a client requesting an insurance quote from an agent, and the underwriting process, respectively. The private processes operate essentially independently but have to synchronize via the shared places. Professor Wil van der Aalst's presentation provides a simple introduction to Petri nets, with a number of good examples and an applet for designing your own Petri nets.
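    To make the firing rule concrete, here is a minimal sketch in Java - my own illustrative class, not any vendor's API - of places, transitions, and token firing:

        import java.util.*;

        /** Minimal Petri net: places hold tokens; a transition fires when every input place has a token. */
        public class PetriNet {
            private final Map<String, Integer> tokens = new HashMap<>();       // place -> token count
            private final Map<String, List<String>> inputs = new HashMap<>();  // transition -> input places
            private final Map<String, List<String>> outputs = new HashMap<>(); // transition -> output places

            public void place(String name, int initialTokens) { tokens.put(name, initialTokens); }

            public void transition(String name, List<String> in, List<String> out) {
                inputs.put(name, in);
                outputs.put(name, out);
            }

            /** A transition is enabled when each of its input places holds at least one token. */
            public boolean enabled(String t) {
                return inputs.get(t).stream().allMatch(p -> tokens.getOrDefault(p, 0) > 0);
            }

            /** Firing consumes one token from each input place and produces one in each output place. */
            public void fire(String t) {
                if (!enabled(t)) throw new IllegalStateException(t + " is not enabled");
                for (String p : inputs.get(t)) tokens.merge(p, -1, Integer::sum);
                for (String p : outputs.get(t)) tokens.merge(p, 1, Integer::sum);
            }

            public static void main(String[] args) {
                PetriNet net = new PetriNet();
                net.place("quoteRequested", 1);   // initial marking: the client's pending request
                net.place("quoteIssued", 0);
                net.transition("processQuote", List.of("quoteRequested"), List.of("quoteIssued"));
                if (net.enabled("processQuote")) net.fire("processQuote");  // token moves to quoteIssued
            }
        }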

    π-calculus models concurrent processes communicating via channels, where the network topology can change dynamically. Nodes are processes and edges are named channels. Processes can exchange names over channels, and there are no values other than channel names. The notation x(y).P means that a process P receives a name y over a channel x; x<y>.P (written with an overbar over x in the formal notation) means to send the name y over the channel x; and P1|P2 indicates that P1 and P2 are two concurrent processes. For example, the process of a user requesting a quote via an insurance Web site would look something like this:

    webChannel<sendData>.RequestQuote | webChannel(getData).ProcessQuote,

    where RequestQuote and ProcessQuote are two processes running in parallel: the first sends the request data over webChannel, and the second receives it.
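    The closest mainstream analogue of a named channel is a queue shared by two concurrent processes. Here is a rough Java sketch mirroring the expression above (the class name and data values are illustrative, not from any BPM product):

        import java.util.concurrent.*;

        public class QuoteChannels {
            public static void main(String[] args) throws Exception {
                // webChannel plays the role of the named channel
                BlockingQueue<String> webChannel = new LinkedBlockingQueue<>();
                ExecutorService pool = Executors.newFixedThreadPool(2);

                // RequestQuote: the webChannel<sendData> side - sends the request data
                pool.submit(() -> { webChannel.put("driver=alice;car=sedan"); return null; });

                // ProcessQuote: the webChannel(getData) side - receives the data
                pool.submit(() -> { System.out.println("quoting for: " + webChannel.take()); return null; });

                pool.shutdown();
                pool.awaitTermination(5, TimeUnit.SECONDS);
            }
        }

    What a plain queue cannot capture is mobility: in π-calculus the names exchanged are themselves channels, which is exactly what lets the network topology change dynamically.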

    Why complicate things with such formalism, you may ask? Because otherwise it would be equivalent to doing accounting without double-entry bookkeeping and general ledgers, and hoping that it all sums to zero at the end. The underlying process algebra helps us (and BPM engines and next-generation business activity monitors) find deadlocks and race conditions, reduce processes to simpler ones, find optimal paths and better opportunities (based on business rules), and answer other interesting questions. The vision of BPMS goes well beyond the simple execution of processes. It is exactly because of this formalism that we can now have a direct, real-time API to the runtime enterprise and achieve executive dashboard nirvana.
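    To show how mechanical such analysis can be, here is a naive deadlock finder over a small net - illustrative only; real BPM engines use far more sophisticated state-space techniques (and, unlike this sketch, memoize visited markings so that cyclic nets terminate):

        import java.util.*;
        import java.util.stream.*;

        public class DeadlockCheck {
            // A tiny net: t1 moves a token from p1 to p2; t2 needs both p2 and p3 to produce p4.
            static Map<String, List<String>> in = Map.of("t1", List.of("p1"), "t2", List.of("p2", "p3"));
            static Map<String, List<String>> out = Map.of("t1", List.of("p2"), "t2", List.of("p4"));

            /** True if some reachable marking enables no transition while tokens rest outside the end places. */
            static boolean deadlocks(Map<String, Integer> marking, Set<String> endPlaces) {
                List<String> enabled = in.keySet().stream()
                    .filter(t -> in.get(t).stream().allMatch(p -> marking.getOrDefault(p, 0) > 0))
                    .collect(Collectors.toList());
                if (enabled.isEmpty())
                    return marking.entrySet().stream()
                        .anyMatch(e -> e.getValue() > 0 && !endPlaces.contains(e.getKey()));
                for (String t : enabled) {                 // fire t on a copy of the marking and recurse
                    Map<String, Integer> next = new HashMap<>(marking);
                    in.get(t).forEach(p -> next.merge(p, -1, Integer::sum));
                    out.get(t).forEach(p -> next.merge(p, 1, Integer::sum));
                    if (deadlocks(next, endPlaces)) return true;
                }
                return false;
            }

            public static void main(String[] args) {
                // p3 never receives a token, so t2 can never fire: a token gets stuck in p2.
                System.out.println(deadlocks(Map.of("p1", 1), Set.of("p4")));  // prints true
            }
        }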

    In terms of graphical notation, Business Process Modeling Notation (BPMN) seems the only game in town outside academic and research campuses. While there is talk about implementing BPMN on top of BPEL4WS, a few vendors have announced compliance, including Popkin Software, a software analysis tool vendor, and Intalio, a pure-play BPM vendor. BEA's WebLogic Platform 8.1 has taken the route of an all-encompassing IDE for designing and deploying distributed applications, based on WebLogic Server, the de facto app server industry standard. The IDE uses a few powerful and intuitive constructs to facilitate business-process design and development. The visual paradigm successfully hides the rigors of OO programming, J2EE, the J2EE Connector Architecture, JMS, and WS-WSDL from the unwary. Most other vendors use typical Visio-like workflow or object-oriented UML notation.

    Deconstructing Business Processes
    The business process management approach is essentially a top-down approach. Starting at the top, the breadth of a process aligns all knowledge domains within the agile enterprise: business intelligence, subject matter expert interaction (UI and other), location and organizational boundaries, legacy applications, integration points, and data requirements. As we increase the depth of the process, we identify subprocesses - some within the boundaries of a line of business - down to the micro level of individual business rules. The layered, top-down approach makes BPMS the perfect vehicle for legacy retirement. Enterprise-wide processes can trigger and execute subprocesses, and so on. Clearly, this top-down approach facilitates incremental change rather than a "big bang" approach.

    Gregor Hohpe and Bobby Woolf, in Enterprise Integration Patterns, conclude:

    The business process component unites a series of services into a logical unit that interacts with other such units via messaging to achieve highly scalable, resilient flows of logic and data. The coalescence of process, object, and interaction patterns into the business process component is the future.

    Figure 2 provides a layered view of a business process and the associated design patterns involved at each layer. A business process starts with a business event triggered by an actor - a real person (internal or external to the organization) or a system. A business process contains dynamic and static aspects.

    The dynamic aspect is designed using drag-and-drop mechanisms within an IDE and is encapsulated by the control-flow language and the messaging/connection points, including <e>AI, legacy adaptors, and Web services. The control flow soft-wires the volatile parts, making it easy for domain experts and design modelers to make changes and deploy them with the click of a button. Messages are implemented using abstract and concrete classes; the concrete classes hide the protocol-dependent implementation details from the control flow.
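    To illustrate that last point, here is a minimal sketch - my own illustrative classes, not BEA's API - where the control flow sees only the abstract message and a concrete subclass hides the transport:

        /** What the control flow sees: business content only, no transport details. */
        abstract class QuoteMessage {
            private final String payload;
            protected QuoteMessage(String payload) { this.payload = payload; }
            public String payload() { return payload; }
            public abstract void send();   // each concrete subclass binds to one protocol
        }

        /** Concrete class: all JMS-specific plumbing lives here, invisible to the flow. */
        class JmsQuoteMessage extends QuoteMessage {
            JmsQuoteMessage(String payload) { super(payload); }
            @Override public void send() {
                // queues, sessions, and acknowledgment modes would be handled here, so
                // swapping JMS for a Web service touches only this class
                System.out.println("JMS send: " + payload());
            }
        }

        class MessageDemo {
            public static void main(String[] args) {
                QuoteMessage m = new JmsQuoteMessage("quote-request:alice");
                m.send();   // the flow never learns which protocol carried the message
            }
        }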

    The static layer contains the usual suspects: business services, business objects, and data. There are two places for data in a business process: within the exchange of messages, for protocol and business content; and at the bottom of the stack, for persistence and other business-process metadata repositories. To restrain any RDBMS-centric modeling ideas, I recall Ian Graham's view of data from Object-Oriented Methods: Principles and Practice: "It is my firm conviction that data driven methods are dangerous in the hands of someone educated or experienced in the relational tradition."

    BPEL4WS discusses messaging only in light of Web services. In contrast, BEA's WebLogic Platform 8.1 neatly enables encapsulation of any imaginable entity as a control and exposes method calls that can be used as connection points. Through introspection, it can even expose internal methods as "straight-through" connection points. I'll talk more about the power of controls in the next article.
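    In Java, introspection of this kind is ordinary reflection. As a rough sketch of the idea - not BEA's actual mechanism; the control and method names are hypothetical - the following discovers an object's public methods and exposes them as callable connection points:

        import java.lang.reflect.Method;
        import java.util.*;

        public class ConnectionPoints {
            /** A hypothetical control wrapping a legacy credit-check call. */
            public static class CreditCheckControl {
                public String score(String ssn) { return "A+"; }
            }

            /** Any object's public methods become named connection points. */
            public static Map<String, Method> expose(Object control) {
                Map<String, Method> points = new LinkedHashMap<>();
                for (Method m : control.getClass().getMethods())
                    if (m.getDeclaringClass() != Object.class)   // skip toString, equals, ...
                        points.put(m.getName(), m);
                return points;
            }

            public static void main(String[] args) throws Exception {
                CreditCheckControl control = new CreditCheckControl();
                Map<String, Method> points = expose(control);
                System.out.println(points.get("score").invoke(control, "123-45-6789"));  // prints A+
            }
        }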

    Business Processes Design Patterns
    Clearly, the dynamic layer of a business process establishes the foundation for business-process patterns: a combination of workflow patterns and Web services/messaging patterns. In addition, as a new breed of OEMs - merging subject matter expertise in specific business verticals with technology - starts rolling out portfolios or libraries of highly configurable executable business processes, we will witness the emergence of policy patterns, or best-practice business processes.

    They will tackle regulatory issues such as the Patriot Act and Sarbanes-Oxley, accounting best practices, complex and politically charged interorganizational reference data issues, risk management, and data caching policies and SLAs for grid computing, to name a few.

    Consider the reference data issue: a Tower reference data report tells us that (1) organizations are spending $3.2M on reference data annually; (2) 48% of interviewed organizations revealed that reference data is contained in more than 10 systems, and 8% reported a staggering 150 systems or more; (3) poor-quality reference data causes 30% of business failures; and (4) manual entry and error-prone manual maintenance are still widespread practices. Need I continue? You get the idea. Traditional EAI techniques, data replication, and naive notions of enterprise-level data normalization not only had limited success but actually magnified the problem by cloning bad, unreliable, and conflicting data throughout the enterprise.

    But how can a BPMS-based solution solve such a gargantuan problem? The complete answer and strategic approach could fill a chapter or two of a practical guide to BPM, or be a multimillion-dollar project in its own right. Without going into many details, here is the approach: visualize a piece of reference data having a number of attributes associated with it - say on the order of 600. The implied assumption is that different organizations within an enterprise have first authoring rights to different segments or clusters of attributes; therefore, segregate the attributes into principal clusters by primary LOB owner - say six domains, each with 100 attributes or so. Finally, for each domain, design and implement processes that manage the life cycle of its subset of attributes, including policies about the cluster's life cycle and approvals (see Figure 3).
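    A minimal sketch of that partitioning - the domain and attribute names are purely illustrative - where each cluster records its owning LOB and only the owner's process may advance its life cycle:

        import java.util.*;

        public class ReferenceDataClusters {
            enum Lifecycle { DRAFT, PENDING_APPROVAL, PUBLISHED, RETIRED }

            /** A cluster of reference-data attributes with a single authoring owner. */
            static final class Cluster {
                final String owner;              // the LOB with first authoring rights
                final Set<String> attributes;
                Lifecycle state = Lifecycle.DRAFT;

                Cluster(String owner, Set<String> attributes) {
                    this.owner = owner;
                    this.attributes = attributes;
                }

                /** Only the owning LOB's process may move the cluster through its life cycle. */
                void advance(String requestor, Lifecycle next) {
                    if (!owner.equals(requestor))
                        throw new SecurityException(requestor + " does not own this cluster");
                    state = next;                // approval policies would hook in here
                }
            }

            public static void main(String[] args) {
                Cluster settlement = new Cluster("Operations", Set.of("settlementDate", "custodian"));
                settlement.advance("Operations", Lifecycle.PENDING_APPROVAL);   // allowed: owner
                try {
                    settlement.advance("Trading", Lifecycle.PUBLISHED);          // rejected: not the owner
                } catch (SecurityException e) {
                    System.out.println("rejected: " + e.getMessage());
                }
            }
        }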

    In other words, take the departmental ownership rights, procedures, and policies and implement them in BPMS. That's what BPM is all about, right? Executable processes. The heart of the problem is now the key to a successful solution. And last but not least, you will have solved yet another challenging issue around enterprise integration: corporate governance and ownership of the technology solution.

    The academic community has developed a set of 20 basic patterns that accurately define all the major workflow patterns; workflow IDEs can then be measured against the established patterns. These are divided into six broad categories: (1) basic control flow, (2) structural, (3) state based, (4) advanced branching and synchronization, (5) cancellation, and (6) multiple instances. (Patterns 1(b) and 1(c) are sketched in code after the list.)

    1.  Basic control-flow patterns are (a) sequence, (b) parallel split, (c) synchronization, (d) exclusive choice, and (e) simple merge.
    2.  Structural patterns are (a) arbitrary cycles and (b) implicit termination.
    3.  State-based patterns are (a) deferred choice, (b) interleaved parallel routing, and (c) milestone.
    4.  Advanced branching and synchronization patterns are (a) multi-choice, (b) synchronizing merge, (c) multi-merge, and (d) discriminator.
    5.  Cancellation patterns are (a) cancel activity and (b) cancel case.
    6.  Multiple instances patterns are (a) multiple instances without synchronization, (b) multiple instances with a priori design-time knowledge, (c) multiple instances with a priori runtime knowledge, and (d) multiple instances without a priori runtime knowledge.
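    As promised, here is a minimal sketch of patterns 1(b) and 1(c) - a parallel split followed by a synchronization - using plain java.util.concurrent rather than any BPM engine (the branch tasks are hypothetical):

        import java.util.List;
        import java.util.concurrent.*;

        public class ParallelSplitSync {
            public static void main(String[] args) throws Exception {
                ExecutorService pool = Executors.newFixedThreadPool(2);

                // Parallel split: one thread of control becomes two concurrent branches
                Future<String> credit = pool.submit(() -> "credit: approved");
                Future<String> underwriting = pool.submit(() -> "underwriting: ok");

                // Synchronization: the flow continues only when every branch has completed
                List<String> results = List.of(credit.get(), underwriting.get());
                System.out.println("quote can be issued: " + results);
                pool.shutdown();
            }
        }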

    Wil van der Aalst of Eindhoven University of Technology provides a detailed account of some of the patterns above through an example process.

    Patterns 3(b), 5(a) and (b), and all of (6) require messaging and Web services patterns. These can be divided into the following categories: service access and configuration patterns, including wrapper façade, component configurator, and interceptor; event handling patterns, including reactor, proactor, and acceptor-connector; and concurrency patterns, including active object, monitor object, and leader/followers. Douglas Schmidt et al., in Pattern-Oriented Software Architecture, Volume 2: Patterns for Concurrent and Networked Objects, provide an excellent account of these patterns. The Addison-Wesley Signature Series from Martin Fowler et al. is another great source for connectivity patterns.
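    As a taste of the first category, a wrapper façade simply hides a low-level API behind one cohesive, type-safe class. A hedged sketch (the class name and protocol are mine) wrapping raw socket plumbing:

        import java.io.*;
        import java.net.Socket;

        /** Wrapper façade: encapsulates low-level socket calls behind one interface. */
        public class QuoteConnection implements Closeable {
            private final Socket socket;
            private final PrintWriter out;
            private final BufferedReader in;

            public QuoteConnection(String host, int port) throws IOException {
                socket = new Socket(host, port);
                out = new PrintWriter(socket.getOutputStream(), true);
                in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            }

            /** Callers send a request and read a reply; they never touch streams or sockets. */
            public String request(String message) throws IOException {
                out.println(message);
                return in.readLine();
            }

            @Override public void close() throws IOException { socket.close(); }
        }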

    The BEA 8.1 WebLogic Platform takes away much of the need for low-level implementation. However, it is important to realize that since BPM is indeed a programming paradigm in its own right, the prospect of creating spaghetti processes is as real as that of BASIC spaghetti code full of GOTOs. Proper design is highly recommended!

    Summary
    In this article I described the reasoning behind modeling, the need for modeling the model, and the current state of the standards front for BPM. I described how BPMS provides a top-down, incremental approach unifying all the knowledge domains within the enterprise, and presented a high-level, best-practice approach to the daunting problem of enterprise reference data. Finally, I presented a survey of workflow and connectivity patterns.

    In the final article of this series I'll describe the get insurance quote business case and present BPM modeling options. I will then implement a solution with BEA's 8.1 WebLogic Platform using some of the workflow and connectivity patterns I presented in this article, and discuss some of the limitations that still exist and possible solutions.

    Until then: processes are everywhere. Can you see them?

    References

  • Engberg, U. and Nielsen, M. (1986). "A calculus of communicating systems with label-passing." Report DAIMI PB-208, Computer Science Department, University of Aarhus, Denmark.
  • Milner, R.; Parrow, J.; and Walker, D. (1992). "A calculus of mobile processes, Parts I and II". Information and Computation, 100, 1, pp 1-77.
  • van der Aalst, W. "Classical Petri nets: The basic model." http://tmitwww.tm.tue.nl/staff/wvdaalst/Courses/pm/pm2classicalpn.pdf
  • ter Hofstede, A. (QUT); Kiepuszewski, B. (QUT); Barros, A. (UQ); Ommert, O. (EUT); Pijpers, T. (ATOS); et al. "Workflow Patterns." www.tm.tue.nl/it/research/patterns/
  • Booch, G.; Rumbaugh, J.; and Jacobson, I. (1999). The Unified Modeling Language User Guide. Addison-Wesley.
  • Hohpe, G. and Woolf, B. (2004). Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley.
  • BPEL4WS, Business Process Execution Language for Web Services. BEA, IBM, Microsoft.
  • BPML (2002). Business Process Modeling Language. BPMI.org.
  • Reference Data (2001). "The Key to Quality STP and T+1." A Tower/Reuters/CAPCO report.
  • Graham, I. (2000). Object-Oriented Methods: Principles and Practice. Addison-Wesley.
    About the Author

    Labro Dimitriou is a BPMS subject matter expert and grid computing advisor. He has been in the field of distributed computing, applied mathematics, and operations research for over 20 years, and has developed commercial software for trading, engineering, and geoscience. Labro has spent the last five years designing BPM-based business solutions.



    Most Recent Comments
    Labro Dimitriou 06/14/04 03:03:00 PM EDT

    Dr. Dumas> Thanks for your comments. Indeed it's a hot issue and I do not expect that to change any time soon. Glad to see you are teaching Web services, SOA, and workflows this semester.

    Yes, "supplemental" is the wrong word. I did not mean it as "optional" and I will double check the bibliography. Also, I am glad you found the deadlock!! Would a BPM engine be able to find it before I bet my enterprise on it?

    Thanks again -- Labro Dimitriou

    Robert DuWors 06/02/04 04:19:56 PM EDT

    This article is an excellent primer on the current state of the art regarding discrete process modeling in the distributed computing world. Basically, these techniques describe reactive (distributed) systems - it certainly appears that reactive systems are not just for "real time" control applications anymore!

    But all of these discrete techniques lack an essential application-level notion of temporal semantics (virtual time and wall-clock time) that increasingly pervades real business processes and business rules. Fundamentally, this means that the semantics of time itself must appear at the application level, e.g., real-valued time, temporal intervals, basic temporal operators, a clearly specified distinction between virtual and "real"/physical time in each design, and adequate modeling of true concurrency. Without significant progress in this area, the march of loosely (and many tightly) coupled distributed systems will either grind to a halt or at least be severely retarded.

    The lack of good explicit temporal semantics has been a failing of the computing world for nearly 60 years (my, how this field is aging). Consider how much basic business applications such as payroll or billing are actually temporally driven, yet we largely cobble around the problem in a manner not readily transferable to any other application (even those of the same kind - large organizations typically have billing systems that number in the dozens and even hundreds, each with subtly different and incompatible notions of time, typically buried in obtuse codes). A payroll or billing system in fact has the basic design pattern of a servo mechanism, which makes it inherently non-linear, indeterminate over time, capable of modifying its own environment and receiving feedback, and highly temporally dependent.

    The notion of time is also essential for the union of discrete and continuous computational systems, i.e., distributed systems that are both reactive/"digital" and continuous/"analog" in behavior. I believe this is a sleeping issue (perhaps even the major one) that will grow ever more important as the age of embedded control systems expands to eventually become the bulk of traffic on the Internet. Increasingly, "business processes" will become machine-to-machine(-to-machine, etc.) rather than following the current dominance of being mostly human-to-machine in nature. Simply put, there are no good general solutions today for hybrid digital/analog systems.

    The field similarly has failed to address the issues of geospatial data adequately, and only now, with position-based applications coming into their own, do we see interest in cleaning that up. (Again, massively deployed embedded systems will push on the geospatial envelope as well as the temporal one, so in fact these applications have great need for much better geotemporal semantics.)

    But most seriously of all, the field has failed to join data and processing in a unified manner, despite having used the term "Data Processing" in the early years, and despite von Neumann et al. stressing the opportunity for unified computation and data nearly 60 years ago in the foundational paper of all modern computer architecture. (Perhaps the bum steer was to emphasize self-modifying systems rather than dynamically composed systems.)

    The OOP paradigm did not solve the problem, but only encased it in more layers of cement. The "component" craze only buried things further. What we need is not "hardshelled" objects with destructive "information hiding" of application-level semantics, but rather a way to achieve rule flow as readily as data flow - better yet, to flow and to compose the two together.

    To do this we need units of communication to flow both units of data and units of computation ("programs") with COMPOSITION OPERATORS to recompose them on the fly, i.e. if a transaction step works on a "document" why not compose that document (containing both integrated rules and combined parametric data) dynamically from the multiple arriving documents, and then send out the results as multiple documents - the Petri net notion of "firing" makes one excellent form of coordination where each token is a document, as does any "tuple" based form of communication such as used in LOTOS, Linda, etc.

    It is not original to me to point out that tuple-oriented communication is an alternative to the dominant "Hoare monitor" style of distributed objects. Personally, I believe this dominance of OOP "objects" in distributed systems came from a naive notion of APIs as the starting point of all design and implementation - but obviously dynamic composition of documents could easily become the more important approach in open, distributed SOAs. In fact, if we consider the tremendous societal transforming success in the mid-1990s of HTML, MIME data types, URLs, Javascript, and HTTP versus the relative failure of CORBA/ODP/DCOM and their ilk, we can see in many ways it already has.

    Obviously under AOP influence, we need to be able to generate new entities that integrate multiple sets of business rules ("programs/scripts") and associated parameters (data) from multiple sources into new arrangements as needed in each step of the transaction. A successful version of this conceptual framework will move ALL business content out of the software implementation layer and totally into the application layer, which becomes directly "programmed", managed, and maintained by SMEs. This, of course, requires methods of representation at the application level that are self-sufficient (able to capture the relevant computations and data). For OLTP systems, documents play the obvious role of "container", but analytical systems may require transformation to "data cubes" etc.

    The underlying software layer will restrict itself to handling all of the automation issues such as processors, networks, databases, operating systems, low level security, interpreters and compilers of application specific languages, rendezvous managers, transaction integrity, caching, other local optimization, etc. (One minor aside, we desperately need to decouple the notion of "data" from "persistence" as many "programs" also need to be persisted and easily distributed, while much intermediate data does not.)

    But the really big payoff is that business knowledge will no longer be imprisoned in the midst of the vast wasteland of software implementation code (OO or otherwise). This is to say that Java, C#, C++, Perl, OOP in general, RDBMS, etc. make really lousy application knowledge representations - they can devour and irretrievably swallow up any real knowledge put into them. Our basic software development tools of today thus will drop down a level. To be fair, these tools are useful to bridge the gap between conception and execution within the supporting software infrastructure level, but should remain addressing only the tasks for which they are best suited. Meanwhile, business knowledge will leave the software implementation level entirely. Consequently, BPM and various methods of representing process flow AT APPLICATION LEVEL are an absolutely essential part of achieving this next step to directly capture business knowledge (data and computation) and to directly use it. The stakes may just be higher than generally supposed.

    David W. Wright 05/19/04 10:04:28 PM EDT

    Are you serious? I am a Business Systems Analyst who documents process and functional requirements every day, and you have forgotten the central element of business in general: people, the people who are responsible for and who carry out tasks in a process to meet the needs of the business's customers. I appreciate a good mathematically provable method or model as much as anyone, but you are going to have to prove to people like me that it helps improve the business (and makes my job easier) before you get to convince real business people, i.e., those who control budgets. I could not convince business people 10 to 15 years ago that a simple functional decomposition independent of the org chart was a useful model; I hope BPMS will have better luck today... perhaps the impressive folks at Intalio will be able to pull it off.

    Marlon Dumas 05/17/04 11:42:21 PM EDT

    It is interesting to see that BPM and SOA are among those (relatively few) areas where industry and academia are trying to meet each other. There used to be a time when academic research on workflow and distributed computing platforms was quite disconnected from industry developments, leading to research that did not make any real impact, and to commercial products with a lot of issues that an academic could spot at a glance (e.g., the erratic "behaviour" of synchronisation points in some commercial workflow engines).

    In this setting, this article does a relatively good job of stepping out of the hype and nailing down in a simple way a topic that is causing much debate at the moment, as people from industry are trying to incorporate ideas from academia into the BPMS/SOA picture (in particular, the idea of attaching formal semantics to process modelling notations).

    It should be pointed out that the article has a number of inaccuracies, for example stating that statecharts are "supplemental" to Petri nets or process algebra, something that I'm sure David Harel (the author of the statecharts formalism) would receive with great surprise. Let's not forget that statecharts stem from finite state automata and were designed in the context of reactive systems modelling. Also, Figure 1 seems to have an unintended deadlock in the transition corresponding to the underwriting task (although this might just be a glitch caused by the use of multiple colours in the drawing). Finally, the details in the bibliographic references are incomplete and sometimes inaccurate.

    But putting aside these minor comments, this article is more than welcome, and I hope that it will help stop the flow of marketing-driven vagaries currently being published claiming that BPMS and SOA are all about pi-calculus (or Petri nets, by the same token). We should be analysing the requirements of BPMS/SOA modelling techniques and deriving appropriate concepts and notations, whatever the source of these concepts/notations may be.

    Marlon Dumas
    Queensland University of Technology, Australia