I saw Mark Proctor post on the W3C Rule Interchange Format (RIF) over the weekend - W3C Rule Interchange Format for Production Rule Systems. As regular readers know, I chair the Production Rule Representation standard group for the OMG. As the Production Rule Representation (PRR) and the RIF are being coordinated and as PRR will likely end up being the modeling version of RIF/PR, it seemed like I should post something about PRR. The following metamodel defines the PRR. It features a definition of production rules for forward chaining inference and procedural processing and a definition of rulesets as collections of rules for a particular class of platform (procedural or inference rule engine).
Future extensions of PRR may address:
- rule metamodels for other classes of rules, such as Event-Condition-Action (ECA), backward chaining, and constraints
- rule representations that are specific to graphical notations, such as decision tables and decision trees
- representations of sequences of rulesets within larger decisions
- transformations between PRR and other MDA models such as SBVR.
Other concrete syntaxes may be applied to PRR Core in future. To this end, the PRR is designed to be extensible. We are submitting the PRR to OMG at the Brussels meeting in 3 weeks time so, if you have any thoughts or comments, do share them as comments so I can bring them up in the meeting. The rest of the post has more details.
<The following represents an extract from the submission to OMG. A PDF of this material, with some extra detail, is available here.
Production Rule definition
A production rule is a statement of programming logic that specifies the execution of one or more actions in the case that its conditions are satisfied. Production rules therefore have an operational semantic (formalizing state changes, e.g., on the basis of a state transition system formalism). The effect of executing production rules may depend on the ordering of the rules, irrespective of whether such ordering is defined by the rule execution mechanism or the ordered representation of the rules. The production rule is typically represented as:
if [condition] then [action-list]
Some implementations extend this definition to include an "else" construct as follows:
if [condition] then [action-list] else [alternative-action-list]
although this form is not considered for PRR; all rules that contain an "else" statement can be reduced to the first form without an "else", and the semantics for interpreting when "else" actions are executed may be complex in some Inferencing schemes. Note that this implies that a conversion from a PSM to a PIM might be complete but not reversible. Rules with "else" statements in a PSM would result in multiple PIM rules which could not then be translated back into the original rules. The new rules would be functionally equivalent, however.
Production Ruleset definition
The container for production rules is the production ruleset. The production ruleset provides
- a means of collecting rules related to some business process or activity as a functional unit,
- a runtime unit of execution in a rule engine together with the interface for rule invocation.
From an architecture and framework perspective, a ruleset is a Behavior in UML terms. The rules in a ruleset operate on a set of objects, called the "data source" in this document. The objects are provided by the ruleset's:
- context at invocation time.
The changed values at the end of execution represent the result or "output" of a ruleset invocation.
Rule Variable definition
The condition and action lists contain expressions (Boolean for condition) that refer to 2 different types of variables (which we term as standard variables and rule variables). At definition time:
- a standard variable has a type and an optional initial expression. In some systems, there may also be a constraint applied to the variable, but the latter is outside the scope of PRR. Standard variables are defined at the ruleset level.
- a rule variable has a type and a domain specified optionally by a filter applied to a data source. With no filter, its domain defaults to all objects conforming to its type that are within scope / in the data source. Rule variables may be defined at the rule level, or at the ruleset level; in the latter case the rule variable definitions are available to all rules.
Semantics of Rule Variables
- standard variables are bound to a single value (that could itself be a collection) within their domain. The value may be assigned by an initial expression, or assigned or reassigned in a rule action.
- rule variables are associated with the set of values within their domain specified by their type and filter. Each combination of values associated with each of the rule variables for a given rule is a tuple called a binding. It binds each rule variable to a value (object or collection) in the data source. These bindings are execution concepts: they are not modeled explicitly but are the result of referencing rule variables in rule definitions.
This means that a production rule is considered for instantiating against ALL the rule variable values. The use of rule variables means that the definition of a production rule is in fact:
for [rule variables] if [condition] then [action-list]
Note that there is an implied product of rule variables when multiple rule variables are defined e.g.:
for [rule variable 1] for [rule variable 2] if [condition] then [action-list]
Semantics of Production Rules
The operational semantics of production rules in general for forward chaining rules (via a rule engine) are as follows:
- Match: the rules are instantiated based on the definition of the rule conditions and the current state of the data source
- Conflict resolution: select rule instances to be executed, per strategy
- Act: change state of data source, by executing the selected rule instances' actions
However, where rule engines are not used and a simpler sequential processing of rules takes place, there is no conflict resolution and a simpler strategy for executing rules.
Operational Semantics for Forward-chaining production rules
A forward chaining production ruleset is defined without consideration of the explicit ordering of the rules; execution ordering is under the control of the inference engine that maintains a stateful representation of rule bindings. The operational semantics of forward-chaining production rules extend the general semantics as follows:
- Match: bind the rule variables based on the state of the data source, and then instantiate rules using the resulting bindings and the rule conditions. A rule instance consists of a binding and the rule whose condition it satisfies. All rule instances are considered for further processing
- Conflict resolution: the rule instance for execution is selected by some means such as a rule priority, if one has been specified
- Act: the action list for the selected rule instance is executed in some order
This sequence is repeated for each rule instance until no further rules can be matched, or an explicit end state is reached through an action. It is important to note that:
- In the case where more than one binding satisfies the condition, there is one separate rule instance per binding.
- An action may modify the data source, which can affect current as well as subsequent bindings and condition matches. For example, an existing rule instance may be removed because the match is no longer valid or an additional rule instance may be added due to a newly valid match.
One popular algorithm for implementing such a forward chaining production rules is the Rete algorithm
Operational Semantics for Sequential production rules
A sequential production rule is a production rule defined without re-evaluation of rule ordering during execution. The operational semantics of sequential production rules extends the general semantics by separating the match into bind and evaluate steps, where the bind step is once-only step, as follows:
- Bind: bind the rule variables based on the state of the data source at invocation time, and instantiate rules using the bindings.
- Evaluate: evaluate the rule conditions based on the current state of the data source. Each instance is treated as a separate rule. If the condition evaluates to false then the rule instance is not considered.
- Act: execute the action list of the current rule instance
This sequence 2-3 is repeated for one rule instance at a time until all the rules are processed, or an explicit end state is reached through an action. It is important to note that:
- The processing order is defined per rule, not per rule instance. It is specific to the engine what is the ordering of the rule instances.
- The instances to be executed are defined on the initial state of the data source. Side effects from the execution of one instance will not affect the eligibility of other instances for execution. Side effects may affect the whether specific conditions of those rules are satisfied.
- Rule execution order is determined by the specified sequence of the rules in the ruleset.