Sketched Answer Set Programming

Sergey Paramonov; Christian Bessiere; Anton Dries; Luc De Raedt

arXiv:1705.07429·cs.AI·August 23, 2018


TL;DR

Sketched Answer Set Programming (SkASP) is a new method that helps users create ASP models by allowing uncertain parts to be marked and using examples to guide the rewriting process, simplifying ASP modeling.

Contribution

Introduces SkASP, a novel approach that enables easier ASP model development through sketching and example-guided rewriting, enhancing usability and reusability.

Findings

1. Successfully applied to 21 puzzles and combinatorial problems.
2. Demonstrated effectiveness in a database application case.
3. Facilitated easier ASP model creation with uncertain parts.

Abstract

Answer Set Programming (ASP) is a powerful modeling formalism for combinatorial problems. However, writing ASP models is not trivial. We propose a novel method, called Sketched Answer Set Programming (SkASP), aiming at supporting the user in resolving this issue. The user writes an ASP program while marking uncertain parts open with question marks. In addition, the user provides a number of positive and negative examples of the desired program behaviour. The sketched model is rewritten into another ASP program, which is solved by traditional methods. As a result, the user obtains a functional and reusable ASP program modelling her problem. We evaluate our approach on 21 well known puzzles and combinatorial problems inspired by Karp's 21 NP-complete problems and demonstrate a use-case for a database application based on ASP.

Tables

Table I. Dataset summary: the number of sketched variables, the number of rules, and the number of sketched variables of each type; e.g., “# ?not” indicates how many atoms with sketched negation occur in the program.

Problem	# Sketched	# ?=	# ?+	# ?not	# ?p	# Rules
Graph Clique	3	1	0	0	2	4
3D Matching	3	3	0	0	0	1
Graph Coloring	7	4	0	0	3	2
Domination Set	3	0	0	1	2	5
Exact Cover	7	2	0	1	4	3
Sudoku	5	4	0	1	0	4
B&W Queens	5	3	2	0	0	3
Hitting Set	3	0	0	1	2	2
FAP	3	0	0	1	2	3
Feedback Arc Set	4	0	0	2	2	3
Latin Square	4	4	0	0	0	2
Edge Domination	3	0	0	1	2	5
FAP	5	3	2	0	0	3
Set Packing	4	2	0	0	2	1
Clique Cover	4	3	0	1	0	3
Feedback Set	5	0	0	5	0	3
Edge Coloring	3	3	0	0	0	3
Set Splitting	5	2	0	1	2	3
N Queens	6	4	2	0	0	3
Vertex Cover	3	0	0	1	2	4
Subg. Isomorph.	5	2	0	1	2	4

Equations

$1\,\{\textit{decision}\_s_{i}(X):d_{i}(X)\}\,1.$

$\textit{reified}\_s(d_{i},X_{1},\dots,X_{n})\leftarrow d_{i}(X_{1},\dots,X_{n}).$

$\leftarrow\textit{body},\textit{positive}(E).$

$\textit{negsat}(E)\leftarrow\textit{body},\textit{negative}(E).$

$\leftarrow\textit{negative}(E),\textit{not }\textit{negsat}(E).$

$S=\#\textit{agg}\{Z_{1},\dots,Z_{n}:\textit{cond}(\underbrace{X_{1},\dots,X_{k}}_{\text{internal}},\underbrace{Y_{1},\dots,Y_{h}}_{\text{external}},\underbrace{Z_{1},\dots,Z_{n}}_{\text{aggregated}})\},\ \textit{external}(Y_{1},\dots,Y_{h})$

$\textit{reified}(S,\textit{sum},\bar{Y})\leftarrow S=\#\textit{sum}\{\bar{Z}:\textit{cond}(\bar{X},\bar{Y},\bar{Z})\},\ \textit{external}(\bar{Y}).$

$\leftarrow\textit{color}(X,Y_{1},C),\textit{color}(X,Y_{2},C),Y_{1}\neq Y_{2}.$

$\leftarrow\textit{color}(X_{1},Y,C),\textit{color}(X_{2},Y,C),X_{1}\neq X_{2}.$


Code & Models

Repositories

SergeyParamonov/sketching (official)


Full text

Sketched Answer Set Programming

This work has been partially funded by the ERC AdG SYNTH (Synthesising inductive data models).

Sergey Paramonov (KU Leuven, Leuven, Belgium)

Christian Bessiere (LIRMM, CNRS, Montpellier, France)

Anton Dries (KU Leuven, Leuven, Belgium)

Luc De Raedt (KU Leuven, Leuven, Belgium)

Abstract

Answer Set Programming (ASP) is a powerful modeling formalism for combinatorial problems. However, writing ASP models can be hard. We propose a novel method, called Sketched Answer Set Programming (SkASP), aimed at facilitating this. In SkASP, the user writes partial ASP programs, in which uncertain parts are left open and marked with question marks. In addition, the user provides a number of positive and negative examples of the desired program behaviour. SkASP then synthesises a complete ASP program. This is realized by rewriting the SkASP program into another ASP program, which can then be solved by traditional ASP solvers. We evaluate our approach on 21 well known puzzles and combinatorial problems inspired by Karp’s 21 NP-complete problems and on publicly available ASP encodings.

Index Terms:

inductive logic programming, constraint learning, answer set programming, sketching, constraint programming, relational learning

I Introduction

Many AI problems can be formulated as constraint satisfaction problems that can be solved by state-of-the-art constraint programming (CP) [34] or answer set programming (ASP) techniques [27]. Although these frameworks provide declarative representations that are in principle easy to understand, writing models in such languages is not always easy.

On the other hand, for traditional programming languages, significant attention has been paid to techniques that are able to complete [25] or learn a program from examples [17]. The idea of program sketching is to start from a sketched program and some examples, and to complete the program from them. A sketched program is essentially a program where some of the tests and constructs are left open because the programmer might not know what exact instruction to use. For instance, when comparing two variables $X$ and $Y$, the programmer might not know whether to use $X<Y$ or $X\leq Y$ or $X>Y$ and write $X~{}{?}{=}~{}Y$ instead (while also specifying the domain of ${?}{=}$, that is, which concrete operators are allowed). By providing a few examples of desired program behaviour and a sketch, the target program can then be automatically found. Sketching is thus a form of “lazy” programming, as one does not have to fill out all details in the programs; it can also be considered as program synthesis, although there are strong syntactic restrictions on the programs that can be derived; and it can be useful for repairing programs once a bug has been detected. Sketching has been used successfully in a number of applications [24, 35, 19] to synthesise imperative programs. It is these capabilities that this paper brings to the field of ASP.

As a motivating example, assume one needs to solve the Peacefully Coexisting Armies of Queens problem, a version of the $n$-queens problem with black and white queens, where queens of different colors do not attack each other. One might come up with the following sketched program (where $R_{w}$ ($C_{b}$) stands for the variable representing the row (column) of a white (black) queen):

This program might have been inspired by a solution written in the constraint programming language Essence available from the CSP library [32]. Intuitively, the sketched ASP specifies constraints on the relationship between two queens on the rows (first rule), columns (second rule) and diagonals (third rule), but it also expresses uncertainty about the particular operators that should be used between the variables through the built-in alternatives for ${?}{=}$ (which can be instantiated to one of $=,\neq,<,>,\leq,\geq$) and for ${?}+$ (for arithmetic operations). When an adequate set of examples is provided, the SkASP solver will then produce the correct program.

The key contributions of this paper are the following: 1) we adapt the notion of sketching for use with Answer Set Programming; 2) we develop an approach (using ASP itself) for computing solutions to a sketched Answer Set Program; 3) we contribute some simple complexity results on sketched ASP; and 4) we investigate the effectiveness and limitations of sketched ASP on a dataset of 21 typical ASP programs.

II ASP and Sketching

Answer Set Programming (ASP) is a form of declarative programming based on the stable model semantics [15] of logic programming [27]. We follow the standard syntax and semantics of ASP as described in the Potassco project [13]. A program is a set of rules of the form $a\leftarrow a_{1},\dots,a_{k},\textit{not }a_{k+1},\dots,\textit{not }a_{n}$. A positive or negative atom is called a literal; $a$ is a positive propositional literal, called the head; for $i$ between $1$ and $k$, $a_{i}$ is a positive propositional atom; and for $i$ between $k+1$ and $n$, $\textit{not }a_{i}$ is a negative propositional literal. The body is the conjunction of the literals. A rule of the form $a\leftarrow.$ is called a fact and abbreviated as $a.$, and a rule without a head is called an integrity constraint ($a$ is $\bot$ in this case). Conditional literals, written as $a:l_{1},\dots,l_{n}$, and cardinality constraints, written as $c_{\min}\{l_{1},\dots,l_{n}\}c_{\max}$, are also used ($l_{1},\dots,l_{n}$ are literals here, and $c_{\min},c_{\max}$ are non-negative integers). A conditional atom holds if its condition is satisfied, and a cardinality constraint is satisfied if between $c_{\min}$ and $c_{\max}$ literals hold in it. Furthermore, as ASP is based on logic programming and also allows for variables, denoted in upper-case, the semantics of a rule or expression with a variable is the same as that of its set of ground instances. We restrict the ASP language to the NP-complete subset specified here. For more details on ASP, see [13, 10].

We extend the syntax of ASP with sketched language constructions. Instead of allowing only atoms of the form $p(t_{1},...,t_{n})$, where $p/n$ is a predicate and the $t_{i}$ are terms (variables or constants), we now also allow sketched atoms of the form $?q(t_{1},...,t_{n})$, where $?q$ is a sketched predicate variable with an associated domain $d_{q}$ containing actual predicates of arity $n$. The meaning of the sketched atom $?q(t_{1},...,t_{n})$ is that it can be replaced by any real atom $p(t_{1},...,t_{n})$ provided that $p/n\in d_{q}$. It reflects the fact that the programmer does not know which $p/n$ from $d_{q}$ should be used. Sketched atoms can be used in the same places as any other atom.

We also provide some syntactic sugar for some special cases and variants. In particular, we use a sketched inequality $X~{}?{=}~{}Y$, a sketched arithmetic operator $X~{}?{+}~{}Y$ (strictly speaking, this is not a sketched predicate but an operator, but we only make this distinction where needed), and sketched negation ${?}\textit{not }p(X)$ (which is, in fact, a sketched operator of the form “?not <atom>”; it always takes a positive atom as input and its domain is {atom, -atom}, where -atom is a syntactically new atom, which represents the negation of the original atom). The domain of $X~{}?{=}~{}Y$ is the set $\{=,\neq,<,>,\geq,\leq,\top\}$, where $\top$ is the atom that is always satisfied by its arguments; the domain of $X~{}?{+}~{}Y$ is the set $\{+,-,\times,\div,\textit{dist}\}$, where $\textit{dist}(a,b)$ is defined as $|a-b|$; and the domain of ${?}\textit{not}$ is $\{\emptyset,\textit{not}\}$. An example of sketched inequality can be seen in Figure 1(c), examples of sketched predicates and negation in Figure 1(a), and sketched arithmetic in Line 3 of Sketch 1.

A sketched variable is a sketched predicate, a sketched negation, a sketched inequality or a sketched arithmetic operator. The set of all sketched variables is referred to as $S$ . Predicate $p$ directly positively (negatively) depends on $q$ iff $q$ occurs positively (negatively) in the body of a rule with $p$ in the head or $p$ is a sketched predicate and $q$ is in its domain; $p$ depends (negatively) on $q$ iff $(p,q)$ is in the transitive closure of the direct dependency relation. A sketch is stratified iff there is no negative cyclic dependency. We restrict programs to the stratified case. An example is a set of ground atoms.

A preference is a function from $\Theta$ (possible substitutions) to $\mathbb{Z}$. A substitution $\theta$ is preferred over $\theta^{\prime}$ given preferences $f$ iff for all $s_{i}\mapsto d_{i}\in\theta$ and $s_{i}\mapsto d_{i}^{\prime}\in\theta^{\prime}$ it holds that $f(s_{i}\mapsto d_{i})\geq f(s_{i}\mapsto d_{i}^{\prime})$ and at least one inequality is strict. When $f$ is constant, all substitutions are equally preferred, i.e., there are no preferences. Because specifying preferences might impose an extra burden on the user, we also provide default preferences for the built-in sketched variables (like inequality, etc.), cf. the experimental section.

The Language of Sketched Answer Set Programming (SkASP) supports some of the language features of ASP. The language of SkASP has the following characteristics:

• it allows for a set of rules of the form $a\leftarrow b_{1},\dots,b_{n},\textit{not }c_{1},\dots,\textit{not }c_{m}.$;

• predicates (such as a predicate $p/n$ or comparison $\leq$) and operators (such as arithmetic $+,-,\times$, etc.) in these rules can be sketched;

• aggregates can be used in the body of the rules as well (stratified; see Extension Section IV);

• the SkASP program has to be stratified;

• choice rules are not allowed.

The key idea behind our method is that the SkASP program is rewritten into a normal ASP program (with choice rules, etc.) in order to obtain a solution through the use of an ASP solver. As we will see in Theorem 2, the language of SkASP stays within the complexity bounds of normal ASP, which makes the rewriting possible (SkASP $\mapsto$ ASP).

Let us now formally introduce the problem of SkASP.

Definition 1 (The Problem of Sketched Answer Set Programming).

Given a sketched answer set program $P$ with sketched variables $S$ of domain $D$ and preferences $f$, and positive and negative sets of examples $\mathbf{E}^{+}$ and $\mathbf{E}^{-}$, the Sketched Answer Set Problem is to find all substitutions $\theta:S\mapsto D$ preferred by $f$ such that $P\theta\cup\{e\}$ has an answer set for all $e$ in $\mathbf{E}^{+}$ and for no $e$ in $\mathbf{E}^{-}$. The decision version of SkASP asks whether there exists such a substitution $\theta$.

III Rewriting Schema

One might consider a baseline approach that would enumerate all instances of the ASP sketch, and in this way produce one ASP program for each assignment that could then be tested on the examples. This naive grounding and testing approach is, however, infeasible: the number of possible combinations grows exponentially with the number of sketched variables. E.g., for the sketch of the Radio Frequency Problem [7] there are around $10^{5}$ possible assignments to the sketched variables. Multiplied by the number of examples, around a million ASP programs would have to be generated and tested. This is infeasible in practice.
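To make the baseline concrete, the following is a minimal Python sketch of enumerate-and-test on a toy sketch with two sketched inequalities (the row and column constraints of a Latin Square). Everything here is an illustrative assumption rather than part of the SkASP implementation: the names `OPS`, `violates` and `consistent_substitutions`, the two example grids, and the simplification that each example is a complete grid, so that testing an example reduces to checking whether an integrity constraint fires.

```python
from itertools import product
import operator

# Domain of the sketched inequality ?= ("T" stands for the always-true atom).
OPS = {
    "=": operator.eq, "!=": operator.ne, "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge, "T": lambda a, b: True,
}

def violates(grid, row_op, col_op):
    """True if one of the two sketched integrity constraints fires on the grid.

    grid maps a cell (x, y) to its color c, i.e. it encodes the atoms color(x, y, c).
    """
    cells = list(grid.items())
    for ((x1, y1), c1), ((x2, y2), c2) in product(cells, repeat=2):
        if c1 != c2:
            continue
        # :- color(X,Y1,C), color(X,Y2,C), Y1 ?= Y2.
        if x1 == x2 and OPS[row_op](y1, y2):
            return True
        # :- color(X1,Y,C), color(X2,Y,C), X1 ?= X2.
        if y1 == y2 and OPS[col_op](x1, x2):
            return True
    return False

def consistent_substitutions(positives, negatives):
    """Enumerate every assignment to the two sketched variables and test it."""
    solutions = []
    for row_op, col_op in product(OPS, repeat=2):
        accepts_pos = all(not violates(g, row_op, col_op) for g in positives)
        rejects_neg = all(violates(g, row_op, col_op) for g in negatives)
        if accepts_pos and rejects_neg:
            solutions.append((row_op, col_op))
    return solutions

def grid_from_rows(rows):
    return {(x, y): c for x, row in enumerate(rows) for y, c in enumerate(row)}

positive = grid_from_rows([[1, 2, 3], [2, 3, 1], [3, 1, 2]])  # a valid Latin square
negative = grid_from_rows([[1, 1, 2], [2, 3, 1], [3, 2, 3]])  # color 1 repeated in a row

solutions = consistent_substitutions([positive], [negative])
print(len(OPS) ** 2, "candidates,", len(solutions), "consistent with the examples")
```

With only two sketched variables the loop visits 49 candidates; every additional sketched inequality multiplies that number by seven. On these two examples nine substitutions survive, all of them combinations of `!=`, `<` and `>`, which are interchangeable in such symmetric constraints.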

The key idea behind our approach is to rewrite a SkASP problem $(P,S,D,f,\mathbf{E}^{+},\mathbf{E}^{-})$ into an ASP program such that the original sketching program has a solution iff the ASP program has an answer set. This is achieved by 1) inserting decision variables into the sketched predicates, and 2) introducing example identifiers in the predicates.

The original SkASP problem is then turned into an ASP problem on these decision variables, and solutions to the ASP problem allow us to reconstruct the SkASP substitution.

The rewriting procedure has four major steps: example expansion, substitution generation, predicate reification and constraint splitting. (Here we follow the notation on meta-ASP already used in the literature [21, 11].)

Example Identifiers. To allow the use of multiple examples in the program, every relevant predicate is extended with an extra argument that represents the example identifier. The following steps are used to accommodate this in the program, denoted as metaE($P,S,\mathbf{E}^{+},\mathbf{E}^{-}$).

1. Let SP be the set of all predicates that depend on a predicate occurring in one of the examples.

2. Replace each literal $p(t_{1},...,t_{n})$ for a predicate $p\in\textit{SP}$ in the program $P$ by the literal $p(E,t_{1},...,t_{n})$, where $E$ is a variable not occurring in the program.

3. Add the guard $\textit{examples}(E)$ (the index of all pos./neg. examples) to the body of each rule in $P$.

4. For each atom $p(t_{1},...,t_{n})$ in the $i$-th example, add the fact $p(i,t_{1},...,t_{n})$ to $P$.

5. For each positive example $i$, add the fact $\textit{positive}(i)$ to $P$, and for each negative one, the fact $\textit{negative}(i)$.

E.g., the sketched rule in Figure 1(a) is rewritten as shown in Figure 1(b), and the example given in Figure 1(a) is rewritten in the same way.
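The fact-generating part of this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `(predicate, args)` representation of ground atoms and the toy `edge` and `cycle` atoms are hypothetical, and emitting an explicit `examples(i)` fact is our reading of how the guard is made true.

```python
def example_facts(positives, negatives):
    """Turn examples into ASP facts that carry an example identifier.

    Each example is a list of ground atoms given as (predicate, args) pairs.
    """
    facts = []
    labelled = [("positive", e) for e in positives] + [("negative", e) for e in negatives]
    for i, (label, example) in enumerate(labelled, start=1):
        facts.append(f"examples({i}).")   # assumed: makes the guard examples(E) true
        facts.append(f"{label}({i}).")    # positive(i) or negative(i)
        for predicate, args in example:   # p(t1,...,tn) becomes p(i,t1,...,tn)
            terms = ",".join(str(t) for t in (i, *args))
            facts.append(f"{predicate}({terms}).")
    return facts

def add_example_argument(atom):
    """Insert the example variable E as first argument of an atom 'p(t1,...,tn)'."""
    predicate, rest = atom.split("(", 1)
    return f"{predicate}(E,{rest}"

pos = [[("edge", (1, 2)), ("edge", (2, 1))]]
neg = [[("edge", (1, 2))]]
for fact in example_facts(pos, neg):
    print(fact)
print(add_example_argument("cycle(X,Y)"))
```

Identifiers are shared between positive and negative examples, so a single guard `examples(E)` ranges over all of them.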

Substitution Generation. We now introduce the decision variables, referred to as metaD($S,D$):

1. For each sketched variable $s_{i}$ with domain $D_{i}$, add the constraint

$1\,\{\textit{decision}\_s_{i}(X):d_{i}(X)\}\,1.$

2. For each value $v$ in $D_{i}$, add the fact $d_{i}(v)$.

This constraint ensures that each answer set has exactly one value from the domain assigned to each sketched variable. This results in a one-to-one mapping between decision atoms and the sketching substitution $\theta$. An example can be seen in Figure 1(b).
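As a rough illustration, the decision rules and domain facts can be emitted with a small Python helper. The helper name, the dictionary layout and the constant names `eq`, `neq`, `lt` standing in for operators are assumptions made for this sketch; the actual engine may structure this differently.

```python
def substitution_generation(domains):
    """Emit one cardinality constraint and the domain facts per sketched variable.

    domains maps a sketched-variable name to the list of values in its domain.
    """
    lines = []
    for name, values in domains.items():
        # Exactly one domain value is chosen for the sketched variable.
        lines.append(f"1 {{ decision_{name}(X) : d_{name}(X) }} 1.")
        lines.extend(f"d_{name}({v})." for v in values)
    return lines

program = substitution_generation({"q": ["edge", "cycle"], "ineq": ["eq", "neq", "lt"]})
print("\n".join(program))
```

Each answer set of the emitted fragment contains exactly one `decision_q` atom and one `decision_ineq` atom, which is what gives the one-to-one correspondence with substitutions.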

Predicate Reification. We now introduce the reified predicates, referred to as metaR($S,D$):

1. Replace each occurrence of a sketched atom $s(t_{1},...,t_{n})$ in a rule of $P$ with the atom $\textit{reified}\_s(D,t_{1},\dots,t_{n})$, and add $\textit{decision}\_s(D)$ to the body of the rule.

2. For each sketched variable $s$ and value $d_{i}$ in its domain, add the following rule to $P$:

$\textit{reified}\_s(d_{i},X_{1},\dots,X_{n})\leftarrow d_{i}(X_{1},\dots,X_{n}).$

where the first argument is the decision variable for $s$.

Thus, semantically, $\textit{reified}\_s(d_{i},X_{1},\dots,X_{n})$ is equivalent to $d_{i}(X_{1},\dots,X_{n})$, and $\textit{decision}\_s(d_{i})$ indicates that the predicate $d_{i}$ has been selected for the sketched variable $s$. Notice that we focused here on the general case of a sketched predicate ${?}p(\dots)$. It is straightforward to adapt this for the sketched inequality, negation and arithmetic. Examples of reification can be seen in Figure 1(b), both for the sketched ${?}q$ of the sketch in Figure 1(a) and for reified negation.
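The two reification steps can be mimicked by string templates, as in the following Python sketch. The function names and the toy predicates are hypothetical, the example identifier introduced earlier is left out for brevity, and the variable name `D` is reused as in the text (a rule with several sketched atoms would need a distinct decision variable for each).

```python
def reify_predicate(sketched, domain, arity):
    """Emit one rule reified_s(d_i,X1,...,Xn) :- d_i(X1,...,Xn) per domain predicate."""
    variables = ",".join(f"X{i}" for i in range(1, arity + 1))
    return [f"reified_{sketched}({d},{variables}) :- {d}({variables})." for d in domain]

def reify_atom(sketched, terms):
    """Return the replacement for ?s(t1,...,tn) and the extra body literal."""
    args = ",".join(terms)
    return f"reified_{sketched}(D,{args})", f"decision_{sketched}(D)"

for rule in reify_predicate("q", ["edge", "cycle"], arity=2):
    print(rule)
atom, guard = reify_atom("q", ["X", "Y"])
print(f":- {atom}, {guard}, X = Y.")
```

The printed constraint shows how a rule body changes: the sketched atom is replaced by its reified form, and the decision literal ties `D` to the value chosen for the sketched variable.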

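Integrity Constraint Splitting (referred to as metaC($P$)).

1. Replace each integrity constraint $\leftarrow\textit{body}$ by:

$\leftarrow\textit{body},\textit{positive}(E).$

$\textit{negsat}(E)\leftarrow\textit{body},\textit{negative}(E).$

2. Add the following rule to the program:

$\leftarrow\textit{negative}(E),\textit{not }\textit{negsat}(E).$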

This ensures that all positive examples and none of the negative examples have a solution. For example, the constraint in Figure 1(a) is rewritten in Figure 1(b) into a positive constraint and into a negative one.
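A minimal Python sketch of the splitting step follows. It assumes that constraint bodies are available as plain strings that already carry the example variable `E`; the function name and the sample body are illustrative only.

```python
def split_constraints(bodies):
    """Split every integrity constraint into a positive and a negative variant."""
    lines = []
    for body in bodies:
        # A positive example must not trigger any constraint.
        lines.append(f":- {body}, positive(E).")
        # A negative example is rejected as soon as some constraint fires on it.
        lines.append(f"negsat(E) :- {body}, negative(E).")
    # Every negative example has to be rejected by at least one constraint.
    lines.append(":- negative(E), not negsat(E).")
    return lines

body = "color(E,X,Y1,C), color(E,X,Y2,C), Y1 != Y2"
print("\n".join(split_constraints([body])))
```

The closing rule is added once for the whole program, whereas the first two rules are produced for each original constraint.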

Another important result is that the preferences do not affect the decision complexity. Proofs can be found in the supplementary materials.

Theorem 1 (Sound and Complete Sketched Rewriting).

A sketched ASP program $(P,S,D,f,\mathbf{E}^{+},\mathbf{E}^{-})$ has a satisfying substitution $\theta$ iff the meta program

$T=\textit{metaE}(P,S,\mathbf{E}^{+},\mathbf{E}^{-})\cup\textit{metaD}(S,D)\cup\textit{metaR}(S,D)\cup\textit{metaC}(P)$ has an answer set.

Interestingly, the sketched ASP problem is in the same complexity class as the original ASP program.

Theorem 2 (Complexity of Sketched Answer Set Problem).

The decision version of propositional SkASP is in NP.

Proof.

Follows from the encoding of SkASP into a fragment of ASP which is in NP. ∎

Dealing with preferences. Preferences are, as we shall show in our experiments, useful to restrict the number of solutions. We have implemented preferences using a post-processing approach (which will also allow the schema to be applied to other formalisms such as CP or IDP [8]). We first generate the set of all solutions $O$ (without taking into account the preferences), and then post-process $O$. Basically, we filter out from $O$ any solution that is not preferred (using tests on pairs $(o,o^{\prime})$ from $O\times O$). The preferences introduce a partial order on the solutions. For example, assume ${?}p$ (${?}q$) can take value $p_{1}$ ($q_{1}$) with preference $1$ and $p_{2}$ ($q_{2}$) with preference $2$. If $(p_{1},q_{2})$ and $(p_{2},q_{1})$ are the only solutions, they are both kept because they are incomparable: $(1,2)$ is not dominated by $(2,1)$ (and vice versa). If $(p_{1},q_{1})$ is also a solution, $(p_{1},q_{2})$ and $(p_{2},q_{1})$ are removed because they are dominated by $(p_{1},q_{1})$.
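The post-processing filter amounts to keeping the non-dominated solutions. The Python sketch below replays the worked example; the function names are hypothetical and, following the numbers in that example, it assumes that a smaller value denotes a more preferred choice (the comparisons would be flipped under a larger-is-better convention).

```python
def dominates(a, b):
    """a dominates b: at least as good on every sketched variable, strictly better on one.

    Assumption taken from the worked example: a smaller number is a better rank.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def preferred(solutions, rank):
    """Keep the solutions whose rank vector is not dominated by another solution."""
    vectors = {s: tuple(rank[v] for v in s) for s in solutions}
    return [s for s in solutions
            if not any(dominates(vectors[o], vectors[s]) for o in solutions)]

rank = {"p1": 1, "p2": 2, "q1": 1, "q2": 2}
print(preferred([("p1", "q2"), ("p2", "q1")], rank))
print(preferred([("p1", "q2"), ("p2", "q1"), ("p1", "q1")], rank))
```

The pairwise test is quadratic in the number of solutions, which is acceptable as long as the examples keep the solution set small.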

While the number of potential answer sets is in general exponential for a sketched ASP, the number of programs actually satisfying the examples is typically rather small (in our experiments, below 10000-20000). If that is not the case, then the problem is under-constrained and needs more examples. No user would be able to go over a million proposed programs.

IV System Extension: Aggregates and Use-Case

An aggregate #agg is a function from a set of tuples to an integer. For example, $\#\textit{count}\{Column,Row:\textit{queen}(Column,Row)\}$ counts the number of instances queen(Column,Row) at the tuple level. Aggregates are often useful for modeling. However, adding aggregates to non-disjunctive ASP raises the complexity of an AS existence check, unless aggregate dependencies are stratified [12]. It is possible to add aggregates into our system under the following restrictions: stratified case, aggregates occur in the body in the form ${N=\#\textit{agg}\{\dots\}}$ , sketched with the keyword ?# , where #agg can be max, min, count and sum. This immediately allows us to model problems such as Equal Subset Sum (for details, see the repository), where one needs to split a list of values, specified as a binary predicate val(ID,Value) into two subsets, such as subset1(ID) (and subset2(ID) respectively), such that the sum of both subsets is equal. Essentially, we sketch the constraint of the form:

:- S1 != S2, S1 = ?#{V,X:val(X,V),subset1(X)}...

Formally, each aggregate can be seen as an expression of the form:

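$S=\#\textit{agg}\{Z_{1},\dots,Z_{n}:\textit{cond}(\underbrace{X_{1},\dots,X_{k}}_{\text{internal}},\underbrace{Y_{1},\dots,Y_{h}}_{\text{external}},\underbrace{Z_{1},\dots,Z_{n}}_{\text{aggregated}})\},\ \textit{external}(Y_{1},\dots,Y_{h})$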

where $S$ is an integer output, and $Y_{1},\dots,Y_{h}$, shortened as $\bar{Y}$ ($\bar{X}$ and $\bar{Z}$ are the same kind of shorthand), are bound to other atoms in the rule, to which we refer as external($\bar{Y}$) (“external” with respect to the condition in the aggregate; it is simply shorthand for a conjunction of atoms that share variables with the condition in the predicate).

To give an example of $\bar{X},\bar{Y},\bar{Z}$ in a simple context: if we were to compute an average salary per department in a company, we might have written a rule of the form:

avg_sal(A,D) :- A = #avg{S,N: salaries(N,S,D)}, department(D).

Then, $\bar{Z}$ consists of the variable S, and D is the external variable (with respect to the condition in the aggregate), i.e., $\bar{Y}$; $\bar{X}$ is composed of the variable N, since it is used neither in the aggregation nor in the other atoms outside of the aggregate.
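The role of the three groups of variables can be mirrored in plain Python. The sketch below uses made-up data and follows the set-of-tuples reading of ASP aggregates, in which the internal variable N only serves to keep tuples with equal salaries apart; `#avg` is taken from the rule above, and nothing here is part of the SkASP system.

```python
# salaries(N, S, D): name, salary, department (made-up data)
salaries = [("ann", 100, "it"), ("bob", 100, "it"), ("carl", 40, "it"), ("eve", 80, "hr")]
departments = ["it", "hr"]  # department(D) binds the external variable D

def avg_sal(salaries, departments):
    result = {}
    for d in departments:  # external variables: Y = (D)
        # The aggregate ranges over a set of tuples (S, N); the internal variable N
        # prevents ann's and bob's identical salaries from collapsing into one tuple.
        tuples = {(s, n) for (n, s, dept) in salaries if dept == d}
        if tuples:
            # aggregated variables: Z = (S)
            result[d] = sum(s for s, _ in tuples) / len(tuples)
    return result

print(avg_sal(salaries, departments))
```

Dropping N from the tuple would average the set {100, 40} for the first department and give 70 instead of 80.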

A sketched aggregate $?\#$ , can be reified similarly to the regular sketched atoms, i.e.:

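$\textit{reified}(S,\textit{sum},\bar{Y})\leftarrow S=\#\textit{sum}\{\bar{Z}:\textit{cond}(\bar{X},\bar{Y},\bar{Z})\},\ \textit{external}(\bar{Y}).$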

similarly for other aggregate functions; the same rules, e.g., the example extension, apply to aggregate reification.

With aggregates we can sketch a significantly larger class of problems. Consider the problem from the Functional Pearls Collection: “Finding celebrities problem” [5] (ASP code: hakank.org/answer_set_programming/finding_celebrities.lp4). Problem statement: “Given a list of people at a party and for each person the list of people they know at the party, we want to find the celebrities at the party. A celebrity is a person that everybody at the party knows but that only knows other celebrities. At least one celebrity is present at the party.” The sketch core looks as follows (names are shortened):

n(N) :- N = ?#{ P : p(P) }.

:- c(C), p(C), n(N), S = ?#{P : k(P,C), p(P)}, S < N-1.

:- c(C), p(C), not c(P), k(C,P).

The last rule is an integrity constraint verifying that no celebrity, c, knows a person who is not a celebrity. The first line sketches a rule that should find what aggregation metric on the people (unary predicate person, p) should be used in the problem. The sketched rule in the second line makes use of this metric, denoted as n, and says that an aggregation should be performed on the binary “knows” predicate, k (indicating that two persons know each other); so the outcome of the sketched aggregation on the connection between people should be compared to an overall metric on all people individually.

V Experimental Evaluation

For the experimental evaluation we have created a dataset consisting of 21 classical combinatorial problems among which most are NP-complete. For the problem list and precise sketch specifications used in the experiments, we refer to Table I. All problems, their code, and implementation details, can be found in the accompanying Github repository: https://github.com/SergeyParamonov/sketching

Dataset of Sketches. The key challenge in evaluating program synthesis techniques such as SkASP is the absence of benchmark datasets (as available in more typical machine learning tasks). At the same time, although many example ASP programs are available in blogs and books or come with software, these typically employ advanced features (such as incremental grounding, optimization or external sources) which are not yet supported by SkASP. Therefore we had to design our own dataset in a systematic way (and put it in the public domain). The dataset is based on a systematic concept (the 21 problems by Karp). When we could find encodings for these problems (such as Sudoku in Figure 1(c) from [18] and Hamiltonian Cycle in Figure 1(a) from [13]), we took them; in all other cases we developed a solution according to the standard generate-and-test development methodology of ASP. Specifically (see $\textbf{Q}_{5}$), we looked for different encodings in the public domain of ASP’s favorite, the N-queens problem (these encodings can tackle even its NP-complete version [16]).

After creating all the ASP programs, we turned them into sketches by looking for meaningful opportunities to use sketched variables. We introduced sketched variables to replace operators (equalities and inequalities), to replace arithmetic (such as plus and minus) and to decide whether to use negated literals or not, and to make abstraction of predicates for which another predicate existed with the same signature.

Finally, we had to select the examples in a meaningful way, that is, we selected examples that would be informative (as a user of SkASP would also do). Positive examples were selected more or less at random, while negative examples are meant to violate one or more of the properties of the problem. Furthermore, we also tried to select examples that carry different information (again as a user of SkASP would do). We selected between 4 and 7 examples for each model. Where relevant in the experiments, we sampled the sketched variables (e.g. $\textbf{Q}_{5}$) or the examples (e.g. $\textbf{Q}_{3}$).

The experimental questions are designed to evaluate how usable SkASP is in practice. In general, users want to provide only a few examples ($\textbf{Q}_{1}$-$\textbf{Q}_{3}$) and to obtain a small number of solutions, ideally only one ($\textbf{Q}_{1}$-$\textbf{Q}_{2}$); the examples should be small ($\textbf{Q}_{4}$) and the solutions should be correct (all questions); users also want to know whether and when to use preferences ($\textbf{Q}_{2}$), and how robust the technique is to changes in the encoding ($\textbf{Q}_{5}$), as it is well known in ASP that small changes in the encoding can have large effects. Finally, they are typically interested in how the learned programs change as the sketches become more complex ($\textbf{Q}_{3}$). With this in mind, we have designed and investigated the following experimental questions:

• $\textbf{Q}_{1}$: What is the relationship between the number of examples and the number of solutions? How many examples does it take to converge?

• $\textbf{Q}_{2}$: Are preferences useful?

• $\textbf{Q}_{3}$: What is the effect of the number of sketched variables on the convergence and correctness of the learned programs?

• $\textbf{Q}_{4}$: Do models learned on examples with small parameter values generalize to models with larger parameter values?

• $\textbf{Q}_{5}$: What is the effect of encoding variations on the number of solutions and their correctness?

Implementation details and limitations. The SkASP engine is written in Python 3.4 and requires pyasp. All examples have been run on a 64-bit Ubuntu 14.04, tested in Clingo 5.2.0. The current implementation does not support certain language constructs such as choice rules or optimization.

We use the default preferences in the experiments for the built-in inequality sketch $X~{}?{=}~{}Y$ : namely $=$ and $\neq$ have equal maximal preference. A user can redefine the preferences. Our experiments indicate that for other sketched types (e.g., arithmetic, etc) no default preferences are needed.

We investigate $\textbf{Q}_{1}$ by measuring the impact of the number of examples on the number of solutions of the 21 SkASP problems. An interesting observation is that even if the user wants to solve, say the Latin Square $20\times 20$ , she does not need to provide examples of size $20\cdot 20=400$ . She can simply provide $3\times 3$ examples and our SkASP problem will learn the generic Latin Square program (see Figure 1(d)).

Figure 2(a) shows how the number of solutions for some of our 21 SkASP problems depends on the number of examples. In some cases, 7 examples are sufficient to converge to a single solution e.g., FAP, B&W Queens.

On some other problems, however, after 7 examples there still remain many solutions (on average 18 for problems that do not converge). Figure 2(b) reports the same information as Figure 2(a) for all 21 problems: the average number of solutions; the average on the 9 that converge within 7 examples, referred to as the easy problems; and the average on the 12 that still have several solutions after 7 examples, referred to as the hard problems. When SkASP does not converge to a unique solution, this leaves the user with choices, often amongst equivalent ASP programs, which is undesirable.

For problems that do not converge after a few examples, we propose to use preferences, as provided by our SkASP framework. We use the default preference described earlier.

We investigate $\textbf{Q}_{2}$ by measuring again the impact of the number of examples on the number of solutions. In Figure 2(c), we observe that all problems have converged in fewer than 7 examples (under default preferences). The impact of preferences on the speed of convergence is even more visible on the whole set of problems, as reported in Figure 2(b). The number of solutions with preferences is smaller, and often much smaller, than without preferences, whatever the number of examples provided. With preferences, all our 21 problems are learned with 7 examples.

To analyze the number of solutions in $\textbf{Q}_{3}$, we look into the convergence of FAP by varying the number of sketched variables. The original sketched program of FAP contains 5 sketched variables. We vary it from 2 to 5 by turning 3, 2, 1, or 0 sketched variables into their actual value (chosen randomly and repeated over multiple runs). As expected, in Figure 2(d), we observe that the more sketched variables there are in the SkASP, the more slowly the number of solutions decreases. Furthermore, the number of sketched variables has a greater impact on the convergence without preferences, as we see in Figure 2(e). After 3-5 examples under preferences we have fewer than 10 solutions, while without preferences there are still dozens or hundreds of solutions.

To analyze correctness in $\textbf{Q}_{3}$ , we first need to define it. Informally, we mean that the program classifies arbitrary examples correctly, i.e., positive examples as positive and negative examples as negative. A typical metric to measure this is accuracy. However, there are no well-defined arbitrary positive and negative examples for most problems: what is an arbitrary random example for Feedback Arc Cover? Problems like Sudoku and $N$ -queens do have standard examples because they are parameterized with a single parameter, which has a default value. Furthermore, for the standard $8$ -queens we know all solutions analytically, i.e., 92 combinations. Another issue is that the negative and positive classes are unbalanced. The usual way to address this issue is to use precision, i.e., $\frac{\text{True Positives}}{\text{True Positives}+\text{False Positives}}$ . (Recall is typically one because the incorrect programs produce far too many solutions, which include the correct ones.) In Figure 2(f), we see that in all cases we were able to reach the correct solution (here the locations of sketched variables were fixed as specified in the dataset), while increasing the number of sketched variables generally decreases the precision.
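The precision and recall computation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `precision_recall` and the toy solution identifiers are hypothetical, and solutions are compared as opaque identifiers rather than as real answer sets produced by a solver.

```python
def precision_recall(predicted, intended):
    """Precision and recall of a learned program's solutions
    measured against the intended program's solutions."""
    predicted, intended = set(predicted), set(intended)
    true_pos = len(predicted & intended)    # correct solutions that were produced
    false_pos = len(predicted - intended)   # produced, but not intended
    false_neg = len(intended - predicted)   # intended, but missed
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if intended else 0.0
    return precision, recall


# Hypothetical toy data: identifiers stand in for answer sets.
intended = {"s1", "s2", "s3", "s4"}
# An under-constrained program keeps every intended solution and adds wrong ones.
too_permissive = intended | {"bad1", "bad2", "bad3", "bad4"}

p, r = precision_recall(too_permissive, intended)
print(p, r)  # 0.5 1.0
```

In this toy case recall stays at one while precision drops, which is the situation the text describes for programs that admit too many solutions.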

To investigate $\textbf{Q}_{4}$ , we have used the Latin Square from Listing 1(d). We have used examples for Latin Square $3\times 3$ , and verified its correctness on Latin Square $4\times 4$ (which can be checked analytically because all solutions are known). We have discovered that there is an inverse dependency between the number of solutions and accuracy, see Figure 3(a). This happens because there are typically very few useful or “intended” programs, while there are a lot of incorrect ones.

To investigate $\textbf{Q}_{5}$ , we have focused on the $N$ -queens problem and collected several encodings from multiple sources: Potassco, Hakank.org, an ASP course by Tran Cao Son (see footnote 2), and our encoding. Whereas all the encodings model the same problem, they show significant variations in expressing constraints. To reduce the bias in how the sketched variables are introduced and to measure the parameters systematically, we pick sketched variables randomly (inequalities and arithmetic) and use the same examples from our dataset (randomly sampling the required number) for all models.

Footnote 2: www.hakank.org/answer_set_programming/nqueens.lp, www.cs.uni-potsdam.de/~torsten/Lehre/ASP/Folien/asp-handout.pdf, www.cs.nmsu.edu/~tson/tutorials/asp-tutorial.pdf

In Figure 3(b), while there is a certain variation in the number of solutions, the encodings demonstrate similar behavior. For each encoding we have introduced $5$ sketched variables and measured the number of solutions and precision. In Figure 3(c) we see that there is indeed a slight variation in precision, with 3 out of 4 clearly reaching above 90% precision, one reaching 100% and one getting 82%. Thus, despite variations in encoding, they generally behave similarly on the key metrics. The results have been averaged over $100$ runs.

Overall, we observe that only a few examples are needed to converge to a unique solution or a small group of equivalent solutions. An example where such equivalent solutions are found is the edge coloring problem, where two equivalent (for undirected graphs) constraints are found:

[TABLE]

For this problem these two constraints are equivalent and cannot be differentiated by any valid example.

An interesting observation we made on these experiments is that the hardness (e.g., in terms of runtime) of searching for a solution of a problem is not directly connected to the hardness of learning the constraints of this problem. This can be explained by the incomparability of the search spaces. SkASP searches through the search space of sketched variables, which is usually much smaller than the search space of the set of decision variables of the problem to learn.

VI Related Work

The problem of sketched ASP is related to a number of topics. First, the idea of sketching originates from the area of programming languages, where it relates to so called self-completing programs [25], typically in C [24] and in Java [19], where an imperative program has a question mark instead of a constant and a programmer provides a number of examples to find the right substitution for it. While sketching has been used in imperative programming languages, it has – to the best of the authors’ knowledge – never been applied to ASP and constraint programming. What is also new is that the sketched ASP is solved using a standard ASP solver, i.e., ASP itself.

Second, there is a connection to the field of inductive (logic) programming (ILP) [9, 28, 17]. An example is meta-interpretive learning [29, 30], where a Prolog program is learned based on a set of higher-order rules, which act as a kind of template that can be used to complete the program. However, meta-interpretive learning differs from SkASP in that it induces full programs and pursues, like other ILP methods, a search- and trace-based approach guided by generality, whereas SkASP uses a constraint solver (i.e., ASP itself) directly. Furthermore, the target is different in that ASPs are learned, which include constraints. SkASP relates to meta-interpretation in ASP [11] in rule and decision materialization. The purpose is, however, different: they aim at synthesizing a program of higher complexity ( $\Sigma_{2}^{P}$ ) given programs of lower complexity (NP and Co-NP).

There are also interesting works in the intersection of ILP, program synthesis and ASP [21, 23, 33]. The ILASP system [22] learns an ASP program from examples and a set of modes, while minimizing a metric, typically the number of atoms. This program, learned completely from scratch, is not necessarily the best program from the user’s point of view and may limit the possibility to localize the uncertainty based on the user’s knowledge of the problem. Indeed, if all sketched predicates are added in the modes with corresponding background knowledge, then the set of hypotheses of sketched ASP is a subset of the set of hypotheses of ILASP. However, if we specify a sketched constraint :- p(X),q(Y),X?=Y with the negative example {p(1),q(2)} as modes for ILASP [22], it would learn a program like :- p(X) (minimal program), but that is clearly not the program intended by the sketch. Furthermore, we compute all preferred programs instead of a single solution.
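To make the hypothesis space of the sketch :- p(X),q(Y),X?=Y concrete, the following minimal sketch enumerates candidate substitutions for ?= by brute force. It rests on several assumptions that are not taken from the SkASP implementation: the candidate operator domain in `CANDIDATES`, the helper names, and the simplification that an example is a set of p/q facts and the program consists of this single constraint. SkASP itself performs this search with an ASP solver through the rewriting schema, not by Python enumeration.

```python
import operator

# Assumed candidate domain for the sketched comparison ?= (illustrative only).
CANDIDATES = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def constraint_fires(op, example):
    """True if ':- p(X), q(Y), X op Y' is violated by the facts, i.e. rejects the example."""
    return any(op(x, y) for x in example["p"] for y in example["q"])


def consistent_operators(positives, negatives):
    """Operators that reject every negative example and accept every positive one."""
    return [
        name
        for name, op in CANDIDATES.items()
        if all(constraint_fires(op, e) for e in negatives)
        and not any(constraint_fires(op, e) for e in positives)
    ]


negative = {"p": [1], "q": [2]}  # the negative example {p(1), q(2)} from the text
print(consistent_operators([], [negative]))  # ['!=', '<', '<=']

# A hypothetical positive example {p(2), q(1)} narrows the candidates further.
positive = {"p": [2], "q": [1]}
print(consistent_operators([positive], [negative]))  # ['<', '<=']
```

Every surviving candidate keeps both p(X) and q(Y) in the constraint body, unlike the minimal program :- p(X), which illustrates how the sketch localizes the uncertainty to the comparison operator.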

Third, there is also work on constraint learning, where systems such as CONACQ [4, 2] and QUACQ [3] learn a set of propositional constraints, and ModelSeeker [1] learns global constraints governing a particular set of examples. The subject has also been investigated in an ILP setting [20]. However, the idea in all these approaches is to learn the complete specification of CSPs from scratch. Our setting is probably more realistic from a user perspective, as it allows the user to exploit the knowledge that she no doubt possesses about the underlying problem, and it also requires far fewer examples. On the other hand, SkASP also makes, like other sketching approaches, the strong assumption that the intended target program is an instance of the sketched one. This may not always be true, for instance, when rules are missing in the program. This is an interesting issue for further research.

Fourth, our approach is related to debugging of ASP [14, 31]. Unlike SkASP, such debuggers can be used to locate bugs, but typically do not provide help in fixing them. On the other hand, once a bug is identified, SkASP could be used to repair it by introducing a sketch and a number of examples (see footnote 3). The approach of [26] is based on classical ILP techniques of generalization and specification and does not provide the freedom to indicate uncertain parts of the program.

Footnote 3: During the experiments, we stumbled upon a peculiar bug. One ASP encoding that we discovered in a public repository worked mostly by pure luck. The constraint :-queen(X1,Y1),queen(X2,Y2),X1<X2,abs(Y1-X1)==abs(Y2-X2). works because abs is not actually the absolute value but an uninterpreted function: comparing abs(X) with abs(Y) essentially checks $X==Y$ , and that is indeed the found solution. (This kind of bug would be extremely hard to find using traditional debuggers, since technically the encoding produced correct solutions.) Also, while working on the aggregate extension use-case, we discovered a subtle bug: the case of a single celebrity was not handled correctly. In both cases, the author has been contacted and the models have been updated.
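The two readings of abs in the constraint from footnote 3 can be compared with a small sketch. The function names are hypothetical, and the Python functions only model the comparison in the constraint body: `fires_uninterpreted` treats abs as a function symbol whose terms are equal exactly when their arguments are equal, while `fires_absolute` shows what the same constraint would check if abs were the arithmetic absolute value.

```python
def fires_uninterpreted(q1, q2):
    """abs(...) as an uninterpreted function symbol: abs(a) == abs(b) iff a == b."""
    (x1, y1), (x2, y2) = q1, q2
    return x1 < x2 and (y1 - x1) == (y2 - x2)


def fires_absolute(q1, q2):
    """The same constraint body if abs(...) were the arithmetic absolute value."""
    (x1, y1), (x2, y2) = q1, q2
    return x1 < x2 and abs(y1 - x1) == abs(y2 - x2)


def attack_diagonally(q1, q2):
    """Reference check: two queens share a diagonal."""
    (x1, y1), (x2, y2) = q1, q2
    return abs(x1 - x2) == abs(y1 - y2)


a, b = (1, 2), (4, 3)  # two queens that do not attack each other diagonally
print(attack_diagonally(a, b), fires_uninterpreted(a, b), fires_absolute(a, b))
# False False True
```

For this pair the arithmetic reading would reject a legal placement, whereas the uninterpreted reading reduces to an equality test on Y-X and leaves it alone, which is consistent with the observation that the encoding produced correct solutions.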

VII Discussion and Conclusions

Our contribution is four-fold: we have introduced the problem of sketched ASP; we have provided a rewriting schema for SkASP; we have created a dataset of sketches and we have evaluated our approach empirically demonstrating its efficiency and effectiveness.

User interaction is an interesting future direction, namely to suggest constraints and examples. For the former, if we are not able to reject a negative example, we can construct a constraint that rejects the negative examples and none of the positive examples. As for the examples, if we have two solutions to a problem, we can generate an example discriminating between them and ask the user to classify it. This might not always be possible, since symmetric assignments might lead to semantically identical programs. In practice, however, this might be an important addition to simplify sketching for end users. Another direction is to incorporate non-constant preference handling into the model using the extensions of ASP for preference handling, such as asprin [6].
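The idea of generating a discriminating example can be illustrated with a brute-force sketch over a tiny domain. Everything here is an assumption made for illustration: candidate programs are reduced to a single comparison operator in a constraint of the form :- p(X), q(Y), X op Y, examples are single pairs {p(x), q(y)}, and the helper names are hypothetical. A real system would more plausibly pose this query to the ASP solver.

```python
import operator
from itertools import product


def fires(op, x, y):
    """True if ':- p(X), q(Y), X op Y' rejects the example {p(x), q(y)}."""
    return op(x, y)


def discriminating_example(op_a, op_b, domain):
    """Return a pair (x, y) on which the two candidates disagree, or None."""
    for x, y in product(domain, repeat=2):
        if fires(op_a, x, y) != fires(op_b, x, y):
            return x, y
    return None


domain = range(1, 4)

# '<' and '!=' disagree on {p(2), q(1)}: only '!=' rejects it.
print(discriminating_example(operator.lt, operator.ne, domain))  # (2, 1)


def not_ge(x, y):
    """A syntactically different but semantically identical variant of '<'."""
    return not x >= y


# Equivalent candidates cannot be told apart by any example.
print(discriminating_example(operator.lt, not_ge, domain))  # None
```

The second call returns no example, mirroring the remark that semantically identical programs cannot be separated by asking the user further questions.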

Bibliography


[1] Beldiceanu, N., Simonis, H.: A model seeker: Extracting global constraint models from positive examples. In: CP. pp. 141–157 (2012)
[2] Bessiere, C., Koriche, F., Lazaar, N., O’Sullivan, B.: Constraint acquisition. Artif. Intell. In Press (2017)
[3] Bessiere, C., Coletta, R., Hebrard, E., Katsirelos, G., Lazaar, N., Narodytska, N., Quimper, C., Walsh, T.: Constraint acquisition via partial queries. In: IJCAI. pp. 475–481 (2013)
[4] Bessiere, C., Coletta, R., Koriche, F.: A SAT-based version space algorithm for acquiring constraint satisfaction problems. In: ECML. pp. 23–34. Springer (2005)
[5] Bird, R., Curtis, S.: Functional pearls: Finding celebrities: A lesson in functional programming. J. Funct. Program. 16(1), 13–20 (Jan 2006)
[6] Brewka, G., Delgrande, J., Romero, J., Schaub, T.: Asprin: Customizing answer set preferences without a headache. In: AAAI. pp. 1467–1474 (2015)
[7] Cabon, B., de Givry, S., Lobjois, L., Schiex, T., Warners, J.: Radio link frequency assignment. Constraints 4(1), 79–89 (Feb 1999)
[8] de Cat, B., Bogaerts, B., Bruynooghe, M., Denecker, M.: Predicate logic as a modelling language: The IDP system. CoRR abs/1401.6312 (2014)
