Rules
Rules are the fundamental to computation in Scallop.
Each rule defines the value and data flowing from some relation to another relation.
In the following program, we have defined a few facts for the edge
relation.
On the second line, we have defined that, for each edge (a, b)
, there is also a path (a, b)
.
We note that here, a
and b
are variables instead of constants as we have with defining facts.
During computation, the two facts in edge
will populate the path
relation.
This way, we have defined a rule for the path
, which is executed during computation.
rel edge = {(0, 1), (1, 2)}
rel path(a, b) = edge(a, b) // (0, 1), (1, 2)
In this section, we talk about how we write rules in Scallop and how intricate computation can be done through it.
Syntax
In general, the basic rules in Scallop are of the form
RULE ::= rel ATOM = FORMULA
FORMULA ::= ATOM
| not ATOM
| CONSTRAINT
| AGGREGATION
| FORMULA and FORMULA
| FORMULA or FORMULA
| ( FORMULA )
For each rule, we name the atom on the left to be the head of the rule, and the formula on the right to be the body. We read it from right to left: when the body formula holds, the head also holds. The formula might contain atoms, negated atoms, aggregations, conjunction, disjunction, and a few more constructs. For this section, we focus on simple (positive) atom, constraints, and their conjunctions and disjunctions. We will leave the discussion of negation and aggregation to the next sections.
Atom
Simple atoms are of the form RELATION(ARG_1, ARG_2, ...)
.
Similar to facts, we have the relation name followed by a tuple of numerous arguments.
Now, the arguments can be of richer forms, involving variables, constants, expressions, function calls, and many more.
Considering the most basic example from above:
rel path(a, b) = edge(a, b)
We have two variables a
and b
grounded by the edge
relation.
This means we are treating the variables a
and b
as source of information, which can be propagated to the head.
In this example, the head also contains two variables, both being grounded by the body.
Therefore the whole rule is well formed.
In case the head variables are not grounded by the body, such as the following,
rel path(a, c) = edge(a, b)
we would get an error that looks like the following:
[Error] Argument of the head of a rule is ungrounded
REPL:1 | rel path(a, c) = edge(a, b)
| ^
The error message points us to the variable c
that has not being grounded in the body.
For basic atoms, such as the ones that the user has defined, can be used to directly ground variables which are directly arguments of the atoms.
They can be used to ground other variables or expressions.
In the following example, although the rule itself might not make any sense, the variable a
is used to ground the expression a + 1
.
Therefore, the rule is completely valid.
rel output_relation(a, a + 1) = input_relation(a)
In certain cases, expressions can be used to bound variables as well!
rel output_relation(a, b) = input_relation(a, b + 1)
In the above example, the expression b + 1
can be used to derive b
, and thus making the variable b
grounded.
However, this might not be true for other expressions:
rel output_relation(b, c) = input_relation(b + c) // FAILURE
The input_relation
can ground the expression b + c
directly, however, the two arguments b
and c
cannot be derived from their sum, as there are (theoretically) infinite amount of combinations.
In this case, we will get a compilation failure.
There can be constraints present in atoms as well. For example, consider the following rule:
rel self_edge(a) = edge(a, a)
The atom edge(a, a)
in the body grounds only one variable a
.
But the pattern is used to match any edge that goes from a
and to a
itself.
Therefore, instead of grounding two values representing the "from" and "to" of an edge
, we are additionally posing constraint on the type of edge that we are matching.
Conceptually, we can view the above rule as the following equivalent rule:
rel self_edge(a) = edge(a, b) and a == b
where there is an additional constraint posed on the equality of a
and b
.
We are going to touch on and
and constraints in the upcoming sections.
Disjunction (Or)
The body formula can contain logical connectives such as and
, or
, not
, and implies
, used to connect basic formulas such as Atom.
In the following example, we are defining that if a
is b
's father or mother, then a
is b
's parent:
rel parent(a, b) = father(a, b)
rel parent(a, b) = mother(a, b)
In this program, we have divided the derivation of parent
into two separate rules, one processing the father
relationship and the other processing the mother
relationship.
This natually form a disjunction (or), as the derivation of parent
can come from 2 disjunctive sources.
Note that in Scallop (or Datalog in general), the ordering of the two rules does not matter.
Therefore, given that
rel father = {("Bob", "Alice")}
rel mother = {("Christine", "Alice")}
we can derive that the parent
relation holding two tuples, ("Bob", "Alice")
and ("Christine", "Alice")
.
The above program can be rewritten into a more compact form that looks like the following:
rel parent(a, b) = father(a, b) or mother(a, b)
// or
rel parent(a, b) = father(a, b) \/ mother(a, b)
We have used an explicit or
(\/
) keyword to connect the two atoms, father(a, b)
and mother(a, b)
.
The \/
symbol, which is commonly seen in the formal logics as the symbol vee (\(\vee\)), is also supported.
Notice that written in this way, each branch of the disjunction need to fully bound the variables/expressions in the head.
Conjunction (And)
To demonstrate the use of and
, let's look at the following example computing the relation of grandmother
based on father
and mother
:
rel grandmother(a, c) = mother(a, b) and father(b, c)
// or
rel grandmother(a, c) = mother(a, b) /\ father(b, c)
Notice that the symbol /\
is a replacement for the and
operator, which resembles the wedge (\(\wedge\)) symbol seen in formal logics.
As can be seen from the rule, the body grounds three variables a
, b
, and c
.
The variables a
and b
comes from mother
and the variables b
and c
comes from father
.
Notice that there is one variable, b
, in common.
In this case, we are joining the relation of mother
and father
on the variable b
.
Constraints
Rule body can have boolean constraints. For example, the conjunctive rule above can be re-written as
rel grandmother(a, c) = mother(a, b) and father(bp, c) and b == bp
Here, we are posing an equality (==
) constraint on b
and bp
.
Normally, constraints are such kind of binary expressions involving predicates such as
- equality and inequality (
==
and!=
) - numerical comparisons (
<
,>
,<=
, and>=
)
Other constructs
There are other constructs available for defining rules, which we continue to discuss in detail in other sections:
Traditional Datalog Syntax
If you are familiar with traditional Datalog, you can have it by swapping the =
with :-
, and the and
to ,
For example, the rule for defining grandmother
can be rewritten as
rel grandmother(a, c) :- mother(a, b), father(b, c)