An Operational Semantics for the Cognitive Architecture ACT-R and its Translation to Constraint Handling Rules
Daniel Gall, Thom Fr\"uhwirth

TL;DR
This paper introduces a formal operational semantics for ACT-R, enabling formal reasoning and analysis of cognitive models, and provides a translation to Constraint Handling Rules (CHR) for execution and analysis.
Contribution
It formalizes ACT-R's core using an abstract semantics and translates models into CHR, making them analyzable and executable.
Findings
The semantics abstracts from technical details, focusing on core cognitive processes.
The translation to CHR is proven sound and complete.
This is the first semantics of ACT-R suitable for analysis and execution.
Abstract
Computational psychology has the aim to explain human cognition by computational models of cognitive processes. The cognitive architecture ACT-R is popular to develop such models. Although ACT-R has a well-defined psychological theory and has been used to explain many cognitive processes, there are two problems that make it hard to reason formally about its cognitive models: First, ACT-R lacks a formalization of its underlying production rule system and secondly, there are many different implementations and extensions of ACT-R with technical artifacts complicating formal reasoning even more. This paper describes a formal operational semantics - the very abstract semantics - that abstracts from as many technical details as possible keeping it open to extensions and different implementations of the ACT-R theory. In a second step, this semantics is refined to define some of its abstract…
| Architecture | Model | ||
|---|---|---|---|
| set of constants | set of types | ||
| set of variables | typing function | ||
| set of buffers | set of rules | ||
| set of action symbols | start state | ||
| rule delay | |||
| allowed additional information | |||
| chunk merging operator | |||
| rule selection function | |||
| interpretation functions | |||
| transformation of cognitive state after no rule transition | |||
| progress of time after no rule transition | |||
| Architecture | |
|---|---|
| set of constants | |
| set of variables | |
| set of buffers | |
| set of action symbols | |
| rule delay | |
| allowed additional information | |
| chunk merging operator | |
| result of requests | |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsConstraint Satisfaction and Optimization · AI-based Problem Solving and Planning · Semantic Web and Ontologies
An Operational Semantics for the Cognitive Architecture ACT-R and its Translation to Constraint Handling Rules
Daniel Gall
and
Thom Frühwirth
Ulm UniversityInstitute of Software Engineering and Programming LanguagesUlm89069Germany
(2017; February 2017; ?; ?)
Abstract.
Computational psychology has the aim to explain human cognition by computational models of cognitive processes. The cognitive architecture Adaptive Control of Thought – Rational (ACT-R) is popular to develop such models. Although ACT-R has a well-defined psychological theory and has been used to explain many cognitive processes, there are two problems that make it hard to reason formally about its cognitive models: First, ACT-R lacks a computational formalization of its underlying production rule system and secondly, there are many different implementations and extensions of ACT-R with many technical artifacts complicating formal reasoning even more.
This paper describes a formal operational semantics – the very abstract semantics – that abstracts from as many technical details as possible keeping it open to extensions and different implementations of the ACT-R theory. In a second step, this semantics is refined to define some of its abstract features that are found in many implementations of ACT-R – called the abstract semantics. It concentrates on the procedural core of ACT-R and is suitable for analysis of the general transition system since it still abstracts from details like timing, the sub-symbolic layer of ACT-R or conflict resolution.
Furthermore, a translation of ACT-R models to the declarative programming language Constraint Handling Rules (CHR) is defined. This makes the abstract semantics an executable specification of ACT-R. CHR has been used successfully to embed other rule-based formalisms like graph transformation systems or functional programming. There are many theoretical results and practical tools that support formal reasoning about and analysis of CHR programs. The translation of ACT-R models to CHR is proven sound and complete w.r.t. the abstract operational semantics of ACT-R. This paves the way to analysis of ACT-R models through CHR analysis results and tools. Therefore, to the best of our knowledge, our abstract semantics is the first abstract formulation of ACT-R suitable for both analysis and execution.
Computational cognitive modeling, ACT-R, operational semantics, source to source transformation, Constraint Handling Rules
††journalvolume: 9††journalnumber: 4††article: 39††journalyear: 2017††publicationmonth: 2††articleseq: 9††copyright: rightsretained††doi: 0000001.0000001††ccs: Theory of computation Operational semantics††ccs: Computing methodologies Cognitive science††ccs: Applied computing Psychology††ccs: Software and its engineering Constraint and logic languages††ccs: Software and its engineering Semantics
1. Introduction
Computational cognitive modeling tries to explore human cognition by building detailed computational models of cognitive processes (Sun, 2008). Cognitive architectures support the modeling process by providing a formal, well-investigated theory of cognition that allows for building cognitive models of specific tasks and cognitive features.
Currently, computational cognitive modeling architectures as well as the implementations of cognitive models are typically ad-hoc constructs. They lack a formalization from the computer science point of view. For instance, Adaptive Control of Thought – Rational (ACT-R) (Anderson et al., 2004) is a widely employed cognitive architecture. It is a modular production rule system with a special architecture of the working memory that operates on data stored as so-called chunks, i.e. the unit of knowledge in the human brain. It has a well-defined psychological theory, however, its computational system is not described formally leading to implementations that are full of technical artifacts as claimed in (Albrecht and Westphal, 2014; Stewart and West, 2007), for instance. This impedes formal reasoning about the underlying languages and the programmed models. It makes it hard to compare different implementation variants of the languages. Furthermore, it complicates verifying properties of the models. These issues call for a formal semantics of cognitive modeling languages together with proper analysis techniques.
In this paper, we describe a very abstract formulation of the operational semantics of ACT-R, called the very abstract semantics. This formalization of the fundamental parts of ACT-R is the basis for the specification of concrete implementations of the theory of the architecture, as well as for analysis of cognitive models. This very abstract semantics, extending and adapting the work of (Albrecht and Westphal, 2014), captures all possible kinds of ACT-R implementations. This can be too abstract to formally reason about actual computational models meaningfully.
Therefore, we give a concrete instance of this very abstract semantics that is suitable for analysis and implementation – the abstract semantics. It still abstracts from a variety of technical details like conflict resolution, times and latency that strongly depend on the actual implementation or even configuration of ACT-R, but includes the typical matching process of rules and some common actions that can be extended. This paves the way for computational analysis of actual cognitive models.
Eventually, we construct a sound and complete translation of ACT-R models in abstract semantics to Constraint Handling Rules (CHR). The abstract semantics is therefore directly executable via its CHR representation making it, to the best of our knowledge, the first formal operational semantics of ACT-R that is suitable for analysis and execution.
The paper is a revised and extended version of our prior work in (Gall and Frühwirth, 2015) and (Gall and Frühwirth, 2016). It has the following contributions:
- (1)
A very abstract operational semantics of ACT-R (c.f. section 3), 2. (2)
an instance of this semantics for analysis (abstract semantics, c.f. section 4), 3. (3)
a translation of ACT-R models to the declarative programming language Constraint Handling Rules (CHR) (c.f. section 5) and 4. (4)
a soundness and completeness proof of the translation w.r.t. the operational semantics (c.f. section 6).
The formulations of the semantics (1 and 2) have been improved compared to (Gall and Frühwirth, 2015) and the translation (3) has been revised substantially compared to (Gall and Frühwirth, 2016) making it suitable for the proofs. The soundness and completeness proofs (4) have not been published before and are fundamentally new.
Constraint Handling Rules (CHR) (Frühwirth, 2009) has a formally defined operational semantics as well as a declarative semantics with corresponding soundness and completeness results. There are many theoretical and practical results and tools for analysis of CHR programs (Frühwirth, 2009). Due to its strengths in formal program analysis and its strong relation to first order (Frühwirth, 2009) and linear logic (Frühwirth, 2009; Betz and Frühwirth, 2005), it has been used as a lingua franca that embeds many rule-based approaches (Frühwirth, 2009) like term rewriting systems (Raiser and Frühwirth, 2008), graph transformation systems (Raiser, 2007; Raiser and Frühwirth, 2011) and business rules (Martin and Fages, 2007). Such embeddings have been used successfully to make the analysis results of CHR available to other approaches. The sound and complete embedding of ACT-R in CHR enables the use of these results and tools to formally reason about cognitive models.
The paper is structured as follows: We first give a short introduction to ACT-R in section 2. The formal definition of the very abstract semantics is given in section 3 and the the abstract semantics is defined as an instance in section 4. The translation scheme of abstract ACT-R models to CHR programs is described in section 5 and proven sound and complete w.r.t. the operational semantics in section 6. Related work is discussed in section 7 with a detailed comparison to prior work.
2. Description of ACT-R
In this section, we describe ACT-R informally. For a detailed introduction to the theory, we refer to (Anderson and Lebiere, 1998; Anderson et al., 2004; Taatgen et al., 2006; Anderson, 2007). Adaptive Control of Thought – Rational (ACT-R) is a popular cognitive architecture that is used in many cognitive models to describe and explain human cognition. There have been applications in language learning models (Taatgen and Anderson, 2002) or in improving human computer interaction by the predictions of a cognitive model (Byrne, 2001). The components of the ACT-R architecture even have been mapped to brain regions (Anderson, 2007, chapter 2).
Using a cognitive architecture like ACT-R simplifies the modeling process, since well-investigated psychological results have been assembled to a unified theory about fundamental parts of human cognition. In the best-case, such an architecture constrains modeling to only plausible cognitive models (Taatgen et al., 2006). Computational cognitive models are described clearly and unambiguously since they are executed by a computer producing detailed simulations of human behavior (Sun, 2008). By performing the same experiments on humans and the implemented cognitive models, the resulting data can be compared and models can be validated.
2.1. Overview of the ACT-R Architecture
The ACT-R theory is built around a modular production rule system operating on data elements called chunks. A chunk is a structure consisting of a name and a set of labeled slots that are connected to other chunks. The slots of a chunk are determined by its type. The names of the chunks are only for internal reference – the information represented by a network of chunks comes from the connections. For instance, there could be chunks representing the cognitive concepts of numbers 1, 2, … By chunks with slots number and successor we can connect the individual numbers to an ordered sequence describing the concept of natural numbers. This is illustrated in figure 1.
As shown in figure 2, ACT-R consists of modules. The goal module keeps track of the current (sub-) goal of the cognitive model. The declarative module contains declarative knowledge, i.e. factual knowledge that is represented by a network of chunks. There are also modules for interaction with the environment like the visual and the manual module. The first perceives the visual field whereas the latter controls the hands of the cognitive agent. Each module is connected to a set of buffers that can hold at most one chunk at a time.
The heart of the system is the procedural module that contains the production rules controlling the cognitive process. It only has access to a part of the declarative knowledge: the chunks that are in the buffers. A production rule matches the content of the buffers and – if applicable – executes its actions. There are three types of actions:
**Modifications: **
overwrite information in a subset of the slots of a buffer, i.e. they change the connections of a chunk.
**Requests: **
ask a module to put new information into its buffer. The request is encoded in form of a chunk. The implementation of the module defines how it reacts on a request. For instance, there are modules that only accept chunks of a certain form like the manual module that only accepts chunks that encode a movement command for the hand according to a predefined set of actions.
Nevertheless, all modules share the same interface for requests: The module receives the arguments of the request encoded as a chunk and puts its result in the requested buffer. For instance, a request to the declarative module is stated as a partial chunk and the result is a chunk from the declarative knowledge (the fact base) that matches the chunk from the request.
**Clearings: **
remove the chunk from a buffer.
The system described so far is the so-called symbolic level of ACT-R. It is similar to standard production rule systems operating on symbols (of a certain form) and matching rules that interact with buffers and modules. However, to simulate the human mind, a notion of timing, latency, priorities etc. are needed. In ACT-R, those concepts are subsumed in the sub-symbolic level. It augments the symbolic structure of the system by additional information to simulate the previously mentioned concepts.
Therefore, ACT-R has a progressing simulation time. Certain actions can take some time that depends on the information from the sub-symbolic level. For instance, chunks are mapped to an activation level that determines how long it takes the declarative module to retrieve it. Activation levels also resolve conflicts between chunks that match the same request. The value of the activation level depends on the usage of the chunk in the model (inter alia): Chunks that have been retrieved recently and often have a high activation level. Hence, the activation level changes with the simulation time. This can be used to model learning and forgetting of declarative knowledge. Similarly to the activation level of chunks, production rules have a utility that also depends on the context and the success of a production rule in prior applications. Conflicts between applicable rules are resolved by their utilities which serve as dynamic, learned rule priorities.
2.2. Syntax
We use a simplified syntax of ACT-R that we have introduced in (Gall and Frühwirth, 2015a). It is based on sets of logical terms instead of the concatenation of syntactical elements. This enables an easier access to the syntactical parts. Our syntax can be transformed directly to the original ACT-R syntax and vice-versa.
The syntax of ACT-R is defined over two possibly infinite, disjoint sets of (constant) symbols and variable symbols . An ACT-R model consists of a set of types with type definitions and a set of rules . A production rule has the form where is a finite set of buffer tests and queries. A buffer test is a first-order term of the form where the buffer and is a set of slot-value pairs where and . This means that only the values in the slot-value pairs can consist of both constants and variables. The right-hand side of a rule is a finite set of actions where . I.e. an action is a term of the form where the functor of the action is in , the set of action symbols, the first argument is a constant (denoting a buffer), the second argument is a constant denoting a type, and the last argument is a set of slot-value pairs, i.e. a pair of a constant and a constant or variable. Usually, the action symbols are defined as for modifications, requests and clearings respectively. Only one action per buffer is allowed, i.e. if and , then (Bothell, ).
We define the function that maps an arbitrary set of terms to its set of variables in . For a production rule the following must hold: , i.e. no new variables must be introduced on the right-hand side of a rule. As we will see in the following sections about semantics, this restriction demands that all variables are bound on the left-hand side.
2.3. Informal Operational Semantics
In this section, we describe ACT-R’s operational semantics informally. The production rule system constantly checks for matching rules and applies their actions to the buffers. This means that it tests the conditions on the left hand side with the contents of the buffers (which are chunks) and applies the actions on the right hand side, i.e. modifies individual slots, requests a new chunk from a module or clears a buffer.
The left hand side of a production rule consists of buffer tests – that are terms with a buffer , a type and a set of slot-value pairs . The values of a slot-value pair can be either constants or variables. The test matches a buffer, if the chunk in the tested buffer has the specified type and all slot-value pairs in match the values of the chunk in . Thereby, variables of the rule are bound to the actual values of the chunk. Values of a chunk in the buffers are always ground. This is ensured by the previously mentioned condition in the syntax of a rule that the right hand side of a rule does not introduce new variables (see section 2.2). Hence the chunks in the buffers stay ground.
If there is more than one matching rule, a conflict resolution mechanism that depends on the sub-symbolic layer chooses one rule that is applied. After a rule has been selected, it takes a certain time (usually 50 ms) for the rule to fire. I.e. actions are applied after this delay. During that time the procedural module is blocked and no rule can match.
The right hand side consists of actions , where is an action symbol, is a constant denoting a buffer and is again a set of slot-value pairs. We have already explained the three types of actions (modifications, requests and clearings) roughly. In more detail, a modification overwrites only the slots specified in with the values from . A request clears the requested buffer and asks a module for a new chunk. It can take some time specified by the module (and often depending on sub-symbolic values) until the request is processed and the chunk is available. During that time, other rules still can fire, i.e. requests are executed in parallel. However, a module can only process one request for a buffer at the same time. Buffer clearings simply remove the chunk from a buffer. In the following, we disregard clearings in our definitions since they are easy to add.
We now give an example rule and informally explain its behavior.
Example 2.1 (production rule).
We want to model the counting process of a little child that has just learned how to count from one to ten. We use the natural number chunks described in section 2.1 as declarative knowledge. Furthermore, we have a goal chunk of another type that memorizes the current number in a current slot. We now define a production rule, that increments the number in the counting process (and call this rule inc). We denote variables with capital letters in our examples. The left-hand side of the rule inc consists of two tests:
- •
and
- •
.
This means that the rule tests if in the goal buffer there is a chunk of type that has some number (which is a variable) in the current slot. If this number is also in the number slot of the chunk in the retrieval buffer, the test succeeds and the variable is bound to the value in the successor slot. The actions of the rule are:
- •
and
- •
.
The first action modifies the chunk in the goal. A modification cannot change the type, that is why we just add an anonymous variable denoted by the underscore symbol in the type specification. The current slot of the goal chunk is adjusted to the successor number and the declarative module is asked for a chunk of type with in its number slot. This is called a retrieval request. After a certain amount of time, the declarative module will put a chunk with in its number and in its successor slot into the retrieval buffer and the rule can be applied again.
3. Very Abstract Operational Semantics
Our goal is to define an operational semantics that captures as many ACT-R implementations as possible and leaves room for extensions and modifications. Hence, we first give them a common theoretical foundation that is based on the formalization according to (Albrecht and Westphal, 2014) – the very abstract operational semantics. It describes the fundamental concepts of a production rule system that operates on buffers and chunks like ACT-R. This work extends the definition from (Albrecht and Westphal, 2014). We compare our work with the work from (Albrecht and Westphal, 2014) in section 7.1. Later on, in section 4, we define an instance of this very abstract semantics that is suitable for analysis of cognitive models.
An ACT-R architecture is a concrete instantiation of the very abstract semantics and defines general parts of the system that are left open by the very abstract semantics like the set of possible actions , the effect of such an action or the selection process. In contrast to that, an ACT-R model defines model-specific instantiations of parts like the set of types and the set of rules . Table 1 summarizes what is defined by the architecture and the model.
3.1. Chunk Stores
As described before in section 2, ACT-R operates on a network of typed chunks that we call a chunk store. Therefore, we first define the notion of types:
Definition 3.1 (chunk types).
A typing function maps each type from the set to a finite set of allowed slot names. Every type set must contain a special type with .
A chunk store is defined over a set of types and a typing function. We abstract from chunk names as they do not add any information to the system. In fact, chunks are defined as unique, immutable entities with a type and connections to other chunks:
Definition 3.2 (chunk store).
A chunk store is a multi-set of tuples where is a chunk type and is a function that maps each slot of the chunk (determined by the type ) to another chunk.
Every chunk store must contain a chunk that is defined as . Each chunk store has a bijective identifier function that maps each chunk of the multi-set a unique identifier. The inverse of is defined as follows:
[TABLE]
For a chunk , the following functions are defined:
- •
and
- •
.
The typing function maps a type from the set of type names to a set of allowed slots, hence the function of chunk has the slots of as domain.
Note that two chunks are only considered equivalent, if they have the same chunk identifier, type and value functions (in that case they are the indistinguishable in the multi-set and therefore treated as one and the same chunk). Hence, a chunk store can contain multiple elements with the same values that still are unique entities representing different concepts. We will see this in the following example: We model our well-known example from figure 1 as a chunk store.
Example 3.3 (chunk store of natural numbers).
The chunk store from figure 1 can be modeled as follows. Note that in the examples, we write to denote that the specified chunk on the right hand side has identifier . When it is clear from the context, we only write instead of . We also use the chunk identifiers to define the connections in the slots (the functions).
- •
The set of types is .
- •
The typing function is defined as and .
- •
We have the following chunks in our store :
- –
the unique entities with identifiers that are defined as ,
- –
with \mathord{\mathit{val}}_{b}(s)=\begin{cases}1&\text{if s=\mathord{\mathit{number}}}\\ 2&\text{if s=\mathord{\mathit{successor}}}\end{cases}
- –
with \mathord{\mathit{val}}_{b}(s)=\begin{cases}2&\text{if s=\mathord{\mathit{number}}}\\ 3&\text{if s=\mathord{\mathit{successor}}}\end{cases}
From the definition of a chunk store , we can derive a graph where for each slot-value pair of a chunk there is an edge with label . In the graphical representation, we can label vortices with the chunk identifiers. We can apply this to our example 3.3 to derive the graph illustrated in figure 1 on page 1.
Sometimes we want to build chunk stores that only have a few chunks in them that refer to chunks in other chunk stores in their slots. We call this concept a partial chunk store.
Definition 3.4 (partial chunk store).
A partial chunk store with reference to a chunk store , denoted as , is a multi-set of tuples where is a chunk type and is a function that maps each slot of the chunk (determined by the type ) to another chunk from chunk store . Every chunk in the partial chunk store has a unique identifier that is disjoint from the identifiers in . The function returns the chunk identifier for each chunk in .
The set of all partial chunk stores that refer to a chunk store is denoted as .
Example 3.5 (partial chunk store).
Let be the chunk store from example 3.3. We define as a partial chunk store that refers to . It contains the chunk with the following slots:
[TABLE]
We define an operation that merges two (partial) chunk stores. In an abstract way it can be considered as a special multi-set union that merges two elements of a chunk store, if they have the same chunk identifiers. However, since there are many different implementations of ACT-R, we do not want to limit our formulation to this special type of multi-set union, but define a more general operator . For the general understanding of the paper and the proofs it is sufficient to think of it as multi-set union that maintains uniqueness of chunk identifiers.
Definition 3.6 (chunk merging).
Let be the set of all chunk stores, a chunk store and the set of all partial chunk stores that refer to . Then the chunk merging operator. In the following, denotes that every element of is an element of with preservative chunk identifiers.
We require the following properties for . For all chunk stores and all (partial) chunk stores :
- (1)
, i.e. is associative. 2. (2)
is the neutral element. 3. (3)
For all and with , it must hold that , i.e. both chunks have same types and value functions. 4. (4)
For all (partial) chunk stores If and , then there exist (partial) chunk stores and such that and . This is a special definition of idempotency of for elements with the same chunk identifiers, i.e. chunks in both stores that have the same chunk identifiers, types and value functions are absorbed in the merge product.
There are the following interesting special cases:
- •
leads to with , i.e. the elements of that also appear in are absorbed in the merge product.
- •
leads to with and . 5. (5)
If , then and , i.e. with preservative chunk identifiers. 6. (6)
There is a mapping that maps chunks from the original chunk stores to the merged chunk store. It must hold that if , i.e. the chunks from remain untouched by the merging.
From the axioms it is clear that is a monoid.
We now give two examples for possible implementations of .
The simplest chunk merging operator would be the multi-set union that maintains uniqueness of chunk identifiers, i.e. if two chunks have the same identifiers, types and value functions, only one version will be kept in the merged store. The chunk identifiers are defined as
[TABLE]
For instance, let and with for and for . We assume that for , i.e. those chunks are equivalent and therefore have same types and value functions. Then . Due to the same identifiers, types and value functions of the two appearances of , the merged store only keeps one version of due to the idempotency axiom.
For chunk stores whose chunk identifiers have been renamed apart, this definition of is equivalent to real multi-set union. Therefore, for all chunk stores that have disjoint chunk identifiers.
However, in most implementations chunks that have the same structure are merged to one chunk, i.e. if and , then and would be merged to in . I.e., only would be kept in the merged store. The mapping function would return .
3.2. States
We first define the individual parts of an ACT-R state. The notion of a cognitive state defines which chunks are currently in which buffer and therefore visible to the production system that can only match chunks in buffers.
Definition 3.7 (cognitive state).
A cognitive state is a function that maps each buffer to a chunk and a delay. The set of cognitive states is denoted as , whereas denotes the set of partial cognitive states, i.e. cognitive states that are partial functions and do not necessarily map each buffer to a chunk. We define the following functions to access the individual parts of a cognitive state : If for an arbitrary buffer , then
- •
and
- •
.
The delay decides at which point in time the chunk in the buffer is available to the production system. A delay indicates that the chunk is not yet available to the production system. This implements delays of the processing of requests.
ACT-R adds a sub-symbolic level to the symbolic concepts that have been defined so far and that distinguish it from other production rule systems. To gather information from the sub-symbolic layer, we add the concept of (sub-symbolic) additional information that is needed to calculate sub-symbolic values. This information can be altered by an abstract function as we will see in section 3.3. The information will be expressed as multi-sets or conjunctions of predicates from first-order logic. The additional information is also used to manage data used in ACT-R’s modules.
Additionally, modules other than the procedural module hold their data in the additional information.
We now define ACT-R states as follows:
Definition 3.8 (very abstract state).
A very abstract state is a tuple where is a cognitive state in the sense of definition 3.7, is a multi-set of ground, atomic first order predicates (called additional information), is a time. The state space is denoted with .
Note that a very abstract state cannot contain variables from , but is only compound from terms, sets and functions over constants from : The chunk store contains a type that is denoted by a constant and a valuation function that connects slot names (constants) to other elements from , the cognitive state connects buffer names (constants) with chunks from and a delay in , the additional information is a multi-set of ground, atomic predicates and the time is also a number. The set of allowed predicates for additional information is denoted as . The additional information holds data of ACT-R’s modules as well as sub-symbolic information.
We continue our running example by defining a very abstract state with one of the chunks defined in example 3.3.
Example 3.9 (ACT-R states).
We want to model the counting process of a little child that has learned the sequence of the natural numbers from one to ten as declarative facts and can retrieve those facts from declarative memory. Therefore, we add a chunk of type with a current slot that memorizes the current number in the counting process.
The following state has a chunk of type in the goal buffer that has the current number . The retrieval buffer is currently retrieving the chunk with number and successor . The retrieval is finished in one second as denoted by the delay. Figure 3 illustrates the state. The formal definition is:
- •
where is the set of types from example 3.3.
- •
\tau_{\ref{ex:very_abstract_state}}(t)=\begin{cases}\{\mathord{\mathit{current}}\}&\text{if t=g}\\ \tau_{\ref{ex:chunk_store_nat}}(t)&\text{otherwise.}\par\end{cases}
- •
where .
- •
- •
- •
where is defined as in example 3.3.
3.3. Operational Semantics
We now define the state transition system of the very abstract semantics. As in every production rule system, we first define how matching rules are chosen. Therefore, we introduce a selection function that is defined by the architecture and maps a state to a set of matching rules and the variable bindings implied by the application of the rule.
Definition 3.10 (selection function).
Let be the set of possible substitutions over variables and constants . A selection function is a function that maps a state to a set of pairs where is a production rule and is a substitution of variables from with constants from , such that all variables from the rule are substituted, i.e. .
For actual implementations of ACT-R, the result of is usually restricted to sets with zero or one element, but for abstract definitions there can also be more than one rule. The function usually defines a notion of matching and makes sure that only rules can fire that match visible information in the buffers, i.e. chunks that are not delayed by a time greater than zero.
To define the modification of a state by a transition, we define interpretation functions of actions that determine the possible effects of an action.
Definition 3.11 (interpretation of actions).
An interpretation of an action is a function . The following conditions must hold: if
- (1)
, i.e. the interpretation of an action has at least one effect, 2. (2)
the resulting chunk store is is a partial chunk store that refers to and whose chunk identifiers are disjoint from , 3. (3)
the co-domain of is , i.e. the cognitive state can only refer to chunks in the resulting chunk store , and 4. (4)
if the action has the buffer in its scope, i.e. , then the resulting partial cognitive state has only in its domain, i.e. .
An interpretation maps each state and action of the form – where is an action symbol, a constant denoting a buffer, a type, and is a set of slot-value pairs – to a tuple . Thereby, is a partial chunk store that refers to , is a partial cognitive state, i.e. a partial function that assigns only the buffer from the action to a chunk and a delay. The partial cognitive state will be taken in the operational semantics to overwrite the changed buffer contents, i.e. it contains the new contents of the changed buffers. Analogously, the additional information defines the additions to the sub-symbolic level induced by the action.
Note that the interpretation of an action can return more than one possible effect. This is used in the abstract semantics where due to the lack of sub-symbolic information all possible effects have to be considered. For example, the declarative module can find more than one chunk matching the retrieval request. Usually, by comparing activation levels of chunks, one chunk will be returned. However, in the abstract semantics all matching chunks are possible. In the refined semantics, we restrict the selection to one possible effect as proposed by the ACT-R reference manual (Bothell, ).
Example 3.12 (interpretation of an action).
In this example we define an neutral effect. We will see later that if this effect is applied to a state, the state does not change modulo time.
Let be our action that produces the neutral effect and an ACT-R state. We define
[TABLE]
where for if and some and undefined for all other inputs.
Intuitively, the action returns a chunk store with only one chunk that has the same type and value as the chunk that has been in buffer in the state . The resulting partial cognitive state just links to this new chunk . No additional information () is added.
This is a valid interpretation, since , is a valid partial chunk store that refers to . Therefore is a valid value function. The co-domain of is since the co-domain of is and by definition 3.6, . Additionally, .
To combine interpretations of all actions of a rule, we first define how two interpretations can be combined. Therefore, we introduce the following set operator, that combines two sets of sets:
Definition 3.13 (combination operator for effects).
Let be effects of two actions, i.e. results of an interpretation function of an action in a state with chunk store , i.e. and . Let and where are partial chunk stores with disjoint chunk identifiers that refer to , are partial cognitive states with disjoint domains, i.e. and and is a conjunction of first-order predicates. Then the combination of the effects and w.r.t. a chunk store is defined as
[TABLE]
where with
[TABLE]
The intuition behind this definition is that two effects of an action, i.e. two triples of chunk store, cognitive state and additional information, are merged to one effect that combines them. Hence, we get a merged partial cognitive state that has the combined buffer-chunk mappings of the two original cognitive states. This is possible, since the domains of the partial cognitive states are required to be disjoint.
The partial chunk stores are merged to one partial chunk store referring to the same total chunk store. Note that from the definition of , the chunk store of the first effect is a subset of the merged store. However, the second chunk store might have lost some members. The mapping function assigns every chunk from the second store one from the merged store.
The merged additional information is a conjunction of the additional information from both effects.
The combination is well-defined: exists since and are partial chunk stores that refer to the same chunk store . Additionally, they have disjoint identifiers because they are results of an interpretation function (see definition 3.11) and therefore their merging cannot fail due to different chunks with same identifiers.
The cognitive state is valid, since it has the combined domain of and and just maps the chunks from those original partial cognitive states to their versions in the merged chunk store by , i.e. the co-domain of is just the merge product of the co-domains of and and keeps their connections of buffers to chunks.
The definition of combinations of effects can be lifted to sets of effects by the following definition. Let and be two sets of effects in some state with chunk store , then their combination is defined as all possible pairwise combination of their elements:
[TABLE]
Since the interpretation of an action is possibly non-deterministic, i.e. might have more than one effect triple, the combination of such sets of effect triples is a set that combines each effect from the first set with each effect from the second set. This leads to a set of combined effects from which the transition system will be able to choose one non-deterministically. However, since every effect set is required to have at least one effect, it the same applies for their combination.
We now define the interpretation function that maps a rule to all its possible effects (i.e. chunk store, cognitive state and additional information).
Definition 3.14 (interpretation of rules).
In a state , a rule is interpreted by an interpretation function that is defined as follows: applies the function to all tuples in the result set when combining the individual actions of the rule:
- •
is a function that applies some more effects at the end of the rule application and is defined by the architecture.
- •
For all actions , the resulting chunk stores are disjoint (i.e. chunk identifiers are renamed apart).
- •
The interpretation has the result
[TABLE]
where the combination operator refers to and the function is applied to each member of the combination set. Hence, all possible effects of the rules are combined and each of the resulting partial cognitive states is then modified by the function that is defined by the architecture.
The function can apply additional changes to the state that are not directly defined by its actions. For instance, it can change some sub-symbolic values that depend on the rule application like the utility of the rule itself. Note that by definition of the ACT-R syntax it is ensured that each of the in the combination of the individual actions is still a function, since only one action per buffer is allowed as defined in section 2.2.
It is important to mention, that there are two types of non-determinism in the interpretation of a rule:
- (1)
The first non-determinism comes from the non-deterministic nature of interpretations of an action. Each action can lead to different results (depending on their definition). This is why all interpretation functions have power sets of effects as co-domain. 2. (2)
The second type of non-determinism comes from the definition of the combination operator that merges chunk stores by using the chunk merging operator . Since is not required to be commutative, the result of the merged chunk stores may vary. This leads to possibly differing chunk identifiers. It is possible to abstract from this kind of non-determinism by introducing the concept of (graph) isomorphism on chunk stores.
We now define the operational semantics as the state transition system :
Definition 3.15 (very abstract operational semantics).
The transition relation in the very abstract operational semantics of ACT-R is defined as follows:
**Apply: **
For a rule the following transitions are possible:
[TABLE]
where
- **•: **
,
\gamma^{\prime}(b):=\begin{cases}(\mathord{\mathit{map}}_{\Delta,\Delta^{*}}(c),d)&\text{if }\gamma^{*}(b)=(c,d)\text{ is defined}\\ (\mathord{\mathit{map}}_{\Delta,\Delta^{*}}(c),d\ominus\delta)&\text{otherwise, if \gamma(b)=(c,d),}\end{cases}
- **•: **
x\ominus y:=\begin{cases}x-y&\text{if x>y}\\ 0&\text{otherwise}\end{cases} for two numbers , and
- **•: **
for a delay defined by the concrete instantiation of ACT-R.
When applying the rule, the resulting partial chunk store is merged with the chunk store from the state. Hence, for all actions on the right-hand side of the rule. Note that , i.e. all chunks in also appear in the merged chunk store with preservative chunk identifiers by definition 3.6 of the chunk merging.
The partial cognitive state that comes from the interpretation of the rule replaces all positions in the original cognitive state where it is defined, otherwise the original cognitive state remains untouched. Note that for all buffers and with we can also write instead of since the chunk merging guarantees that with preservative identifiers as mentioned before.
The delays are taken from the partial cognitive state or are reduced by a constant amount that models progression of time.
When it is clear from the context, we just use to denote that the transition applies rule .
**No Rule: **
[TABLE]
where
- **•: **
is a side condition in form of a logical predicate,
- **•: **
a function that describes how the cognitive state should be transformed,
- **•: **
a function that describes the time adjustment in dependency of the current state, and
- **•: **
is the updated state.
An apply transition applies a rule that satisfies the conditions of the selection function by overwriting the cognitive state with the result from the interpretations of the actions of rule . Thereby, one possible combination of all effects of the actions is considered. Note that the transition is also possible for all other combinations. Only the buffers with a new chunk are overwritten, the others keep their contents. The same applies for parameters: They keep their value except for those where defines a new value. Additionally, the rule application can take a certain time that is defined by the architecture. Time is forwarded by , i.e. the time in the state is incremented by and the delays in the cognitive state that determine when a chunk becomes visible to the system are decremented by (with a minimal delay of 0).
The no rule transition defines what happens if there is no rule applicable, but there are still effects of e.g. requests that can be applied. This means that there are buffers with and , i.e. information that is not visible to the production rule system. In that case there are no possible transitions in the original semantics. We generalized this case in our definition of the no rule transition that allows state transitions without rule applications. It ensures that if a side condition defined by the ACT-R instantiation, the cognitive state is updated according to the function and the current time of the system is set to a specified time . Both functions are also defined by the concrete architecture. This makes new information visible to the production system and hence new rules might fire. In typical ACT-R implementations, the side condition is that , i.e. that no rule is applicable, and where has the time component and is the minimum delay in the cognitive state of . This means that time is forwarded to the minimal delay in the cognitive state and makes for instance pending requests visible to the production rule system. It can be interpreted like if the production rule system waits with the next rule application until there is new information present that leads to a rule matching the state. This behavior coincides with the specification from the ACT-R reference implementation (Bothell, ). If no transition is applicable in a state , i.e. there is no matching rule and no invisible information in , then is a final state and the computation stops.
The definition of our very abstract semantics leaves parts to be defined by the actual architecture and the model. Table 1 summarizes what has to be defined by an architecture and a model.
We now show that the neutral effect from example 3.12 is a neutral element of the rule application except for the time component.
Example 3.16 (neutral element of rule applications).
Let be an ACT-R state and an ACT-R rule with only one action . Let .
We define to be the only effect of where and for some and undefined for all other inputs. Then
[TABLE]
where with
Let . Then The follow-up cognitive state is
[TABLE]
This can be reduced to
[TABLE]
Since has the same type and value function, can be considered equivalent to modulo delays for all . Hence, is equivalent to modulo delays and the time component.
4. Abstract Semantics as Instance of the Very Abstract Semantics
The abstract semantics is defined as an instance of the very abstract semantics. It is suitable for the analysis of procedural core of cognitive models because it abstracts from timings and conflict resolution by leaving parts of the transition system to non-deterministic choices and still giving room for extensions and modifications. The idea is to define the minimal core of all implementations of ACT-R’s procedural system disregarding parameter choices, timings, sub-symbolic information and module configuration. The abstract semantics captures all possible state transitions the procedural system can make.
4.1. Definition of the Abstract Semantics
Since it is a central part of the procedural system of ACT-R, we first define the notion of matchings:
Definition 4.1 (matching).
A buffer test for a buffer testing for a type and slot-value pairs matches a state , written , if and only if there is a substitution such that and for all . We define as the function that returns the smallest substitution that satisfies the matching .
This definition can be extended to rules: A rule matches a state , written as , if and only if for all buffer tests match . The function returns the smallest substitution that satisfies the matching .
This means that a buffer test matches a state, if the tested buffer contains a chunk of the tested type and all slot tests hold in the state, i.e. the variables in the test can be substituted by values consistently such that they match the values from the state. The values in the test are denoted by the identifiers of the chunks. Note that a test can only match chunks in the cognitive state that are visible to the system, i.e. whose delay is zero. A test cannot match chunks with a delay greater than zero.
We give the architectural parameters that are left open in the very abstract semantics:
**States: **
We set the time in every state to (or any other constant) because abstract states are not timed. Hence, each abstract state is a tuple where is a cognitive state. We sometimes project an abstract state to for the sake of brevity.
**Selection Function: **
The rule selection in the abstract semantics is simply defined as . Hence we select all matching rules in state and replace the variables from the rules by their actual values from the matching, since in the state transition system the substitution is applied to the rule when calculating the effects.
**Effects: **
For a state the interpretation function for actions in the abstract semantics is defined as follows:
- **•: **
for modifications
where
- **–: **
,
- **–: **
with for a fresh id ,
- **–: **
, and
- **–: **
the new slot values are:
[TABLE]
for .
This means that a modification creates a new chunk that modifies only the slots specified by and takes the remaining values from the chunk that has been in the buffer. Note that the type cannot be modified, since the resulting chunk always has the type derived from the chunk that has previously been in the buffer. Modifications are deterministic, i.e. that there is only one possible effect.
The slot-value function is well-defined, since it appears in a partial chunk store that references , it has as co-domain by definition 3.4 of a partial chunk store.
If the action contains a slot-value pair that modifies to a chunk that was not existent in the original state , this chunk is not magically constructed, since we do not know its type or values. Instead, we map this slot to . This comes from the definition of , which is for , since the chunk referenced by does not exist in .
- **•: **
for requests
if
- **–: **
is a function defined by the architecture for each buffer. It calculates the set of possible answers for a request that is specified by a type and a set of slot value pairs. Possible answers are tuples of a chunk , delay and parameter valuation function .
- **–: **
For all we set \gamma^{*}(b):=\begin{cases}(c^{*},1)&\text{if d^{*}>0}\\ (c^{*},0)&\text{otherwise,}\end{cases}
and .
Note that the additional information in the result of the request is directly added to the result of the interpretation function to update the internal state of the requested module.
- **•: **
The function from definition 3.14 that adds additional changes to the state when a rule is applied is defined as the identity function, i.e. no changes to the state are introduced by the rule application itself but only by its actions.
**Rule Application Delay: **
The delay of a rule application is set to , since the abstract semantics does not care about timings.
**No Rule Transition: **
In the no rule transition, there are three parameters to be defined by the actual ACT-R instantiation: The side condition , the state update function and the time adjustment function . We define them for a state as follows:
- **•: **
if and only if there is a such that with , i.e. there is a buffer with a chunk that is not visible to the system. Those are the cases where there is a pending request. This means that the no rule transition is possible as soon as there is at least one pending request. We call the buffer of one such request .
- **•: **
if and for one . This means that one pending request is chosen to be applied (the one appearing in ). Since this is a rule scheme and can be chosen arbitrarily, the transition is possible for all assignments of . This coincides with the original definition of our abstract semantics where one request is chosen from the set of pending requests.
- **•: **
The function that determines how the time is adjusted after a chunk has been made visible is defined as , i.e. the time is not adjusted.
Table 2 shows the parameters of the abstract semantics that have to be defined by the architecture.
We summarize the transition scheme of the abstract semantics:
**Rule transition: **
[TABLE]
where ,
\gamma^{\prime}(b):=\begin{cases}(\mathord{\mathit{map}}_{\Delta,\Delta^{*}}(c),d)&\text{if }\gamma^{*}(b)=(c,d)\text{ is defined}\\ (\mathord{\mathit{map}}_{\Delta,\Delta^{*}}(c),d)&\text{otherwise, if \gamma(b)=(c,d).}\end{cases}
Again, since with preservative chunk identifiers, we can also write
[TABLE]
**No rule transition: **
[TABLE]
where \gamma^{\prime}(b):=\begin{cases}(c^{*},0)&\text{if b=b^{*}}\\ \gamma(b)&\text{otherwise.}\end{cases}
We now extend our running example by a derivation in the abstract semantics:
Example 4.2 (abstract semantics).
We begin with the state from example 3.9. It is visualized in figure 3. Then, the following derivations are possible.
[TABLE]
where
- •
(and as in ),
- •
and
- •
where
In no rule is applicable, but there is a pending request whose result is not visible for the production system. Hence, we can apply the no rule transition which makes the chunk visible. Then the rule inc from example 2.1 is applicable. If we assume that for all additional information where , i.e. a chunk of type with the number 2 in the slot and 3 in the slot, we reach the state that is illustrated in figure 4. Note that in this state again the no rule transition is possible.
5. Translation of ACT-R Models to Constraint Handling Rules
In this section we show how to translate an ACT-R model to a Constraint Handling Rules (CHR) program. This is one of the main contributions of this paper. The translation is the first that matches the current operational semantics of ACT-R. The proof of the soundness and completeness of the translation w.r.t. the abstract operational semantics of ACT-R is a new contribution of this paper.
Therefore, we first give a quick introduction to CHR and its basic concepts we need for definitions and proofs in section 5.1. Then we introduce a normal form of ACT-R rules that simplifies the translation process and proofs in section 5.2. In section 5.4 we show the translation of ACT-R states to CHR states and in section 5.5 we define the translation scheme for rules. Finally, in section 6 our translation of ACT-R models to CHR programs is proven sound and complete w.r.t. the abstract operational semantics of ACT-R.
5.1. Constraint Handling Rules (CHR)
Before we define the translation scheme of ACT-R models to CHR program, we first recapitulate syntax and semantics of CHR briefly. For an extensive introduction to CHR, its semantics, analysis and applications, we refer to (Frühwirth, 2009). We use the latest definition of the state transition system of CHR that is based on state equivalence (Raiser et al., 2009). This syntax allows for more elegant proofs and has been proven to be equivalent to the canonical so-called very abstract semantics of CHR. The definitions from those canonical sources are now reproduced. We assume the reader to be familiar with the common concepts of first-order predicate logic like predicate symbols, function symbols, predicates, functions, constants, variables, terms and formulas.
The syntax of CHR is defined over a set of variables , a set of function symbols (with arities) and a set of predicate symbols with arities that is disjointly composed of CHR constraint symbols and built-in constraint symbols. The set of constraint symbols contains at least the symbols , and .
For a constraint symbol and terms over and for , is called a CHR constraint, if is a CHR constraint symbol or a built-in constraint if is a built-in constraint respectively. We now define the notion of CHR states.
Definition 5.1 (CHR state).
A CHR state is a tuple where the goal is a multi-set of constraints, the built-in constraint store is a conjunction of built-in constraints and is a set of global variables.
All variables occurring in a state that are not global are called local and the local variables that are only used for built-in constraints are called strictly local variables.
CHR states can be modified by rules that together form a CHR program.
Definition 5.2 (CHR program).
A CHR program is a finite set of rules of the form
[TABLE]
where is an optional rule name, the heads and are multi-sets of CHR constraints, the guard is a conjunction of built-in constraints and the body is a multi-set of CHR constraints and a conjunction of built-in constraints . Note that at most one of and can be empty. If is empty, it is interpreted as the built-in constraint .
If , the rule is called a simplification rule and we write
[TABLE]
Conversely, if , the rule is called a propagation rule and we write
[TABLE]
Informally, a rule is applicable, if the head matches constraints from the store and the guard holds, i.e. is a consequence of the built-in constraints . In that case, the matching constraints from are kept in the store, the constraints matching are removed and the constraints from , and are added.
To define the operational semantics formally, we first have to define state equivalence over CHR states following the work in (Raiser et al., 2009). Therefore, we assume a constraint theory for the interpretation of the built-in constraints in .
Definition 5.3 (state equivalence of CHR states).
Equivalence between CHR states is the smallest equivalence relation over CHR states that satisfies the following conditions:
**Equality as substitution: **
[TABLE]
**Transformation of the constraint store: **
If where are the strictly local variables of , respectively, then:
[TABLE]
**Omission of non-occurring global variables: **
[TABLE]
**Equivalence of failed states: **
[TABLE]
The operational semantics is now defined by the following transition scheme over equivalence classes over CHR states i.e.
Definition 5.4 (operational semantics of CHR).
For a CHR program the state transition system over CHR states and the rule transition relation is defined as the following transition scheme:
[TABLE]
Thereby, we assume that is a variant of a rule in the program such that its local variables are disjoint from the variables occurring in the representative of the pre-transition state.
We may just write instead of if the rule is clear from the context.
5.2. Set Normal Form
To simplify the translation scheme, we assume the ACT-R production rules to be in set normal form. The idea is that each buffer test contains each slot exactly once. In the general ACT-R syntax it is possible to specify more than one slot-value pair for each slot or none at all. If we assume set-normal form, we can reduce the cases to consider in the translation and soundness and completeness proofs.
We now define the set-normal form formally and show that every ACT-R production rule can be transformed to set-normal form with same semantics.
Definition 5.5 (set normal form).
An ACT-R rule is in set normal form, if and only if for all buffer tests and there is exactly one such that .
Theorem 5.6 (set normal form).
For all ACT-R models with rules , there is a set of rules with the following properties: For all rules with that are applicable in at least one state there is a rule with in set-normal form such that for all ACT-R states it holds that if and only if (in the abstract semantics).
We now give the construction of . All rules in that are not applicable in any state do not appear in . For all remaining rules , we first transform to a rule that is in set-normal form. For every buffer test there is a test for a substitution . Thereby, has the following slot-value pairs:
- •
For all where there is exactly one such that , .
- •
For all where there is no such that , there is a slot-value pair for a fresh variable .
- •
For all where there is are such that for we produce a substitution and add a slot-value pair for a . There are the following cases:
- (1)
Then is a fresh variable. Intuitively, we just introduce a new variable and replace all other variables by this new variable. 2. (2)
Then . Intuitively, since is applicable in at least one state, . The slot-value pairs are redundant and therefore we only add one of them. 3. (3)
W.l.o.g., . Then .
We define the substitution as the composition of all substitutions . The rule is obviously in set normal form.
The proof idea is that if all variables or constants that appear in the same slot test in the same buffer have the same values. If there is a buffer test with two slot-value pairs and , then for . Hence, we can replace all occurrences of and by and remove from . This is exactly what our construction of the set-normal form does.
5.3. Notational Aspects
We use the following symbols:
- •
Lists are denoted by enumerating their elements, e.g. for the list with elements and . denotes the empty list. We use for a list with head element and tail list . We also use the following notation for list comprehension: is the list with all elements from a (multi-)set that satisfy .
- •
The list concatenation is denoted with .
- •
For a function with finite domain , we define sorted by an order on and , i.e. the (sorted) enumerative list notation of the function . It can be understood as the list representation of the relational representation of as a set of tuples .
5.4. Translation of States
To translate an ACT-R state to CHR, we have to define translations for the individual components of such a state.
Definition 5.7 (translation of chunk stores).
A (partial) chunk store can be translated to a first order term as follows:
[TABLE]
Thereby, denotes the explicit relational notation of the function as a sorted list of tuples and is the list comprehension as defined in section 5.3.
We denote the translation of a (partial) chunk store with .
Each chunk in a chunk store is translated to a term that is member of a list. Note that we do not have defined the order of chunks in the list that represents the chunk store. We consider all permutations of such chunk store lists as equivalent, since for the proofs the actual choice of order will not make any difference, since we always can choose just the translation with the “correct” order. Where necessary, we comment on that issue in our proofs in section 6. The slot-value pairs in the terms are sorted as defined in .
The cognitive state will be represented by constraints that map a buffer to a chunk identifier and a delay. Since additional information is already represented as logical predicates, we represent them as built-in constraints in the CHR store. We can now define the translation of an abstract state.
Definition 5.8 (translation of abstract states).
An abstract ACT-R state can be translated to the following CHR state:
[TABLE]
We denote the translation of an ACT-R state by .
The chunk store is represented by a constraint that contains the translated chunk store as defined in definition 5.7. Hence, a valid translation of an ACT-R state can only contain exactly one constraint.
For every buffer of the given architecture, a constraint with buffer name, chunk id and delay is added to the state. Since is a total function, every buffer has exactly one constraint. Additionally, the chunk id in the constraint must appear in exactly one constraint because the co-domain of gamma refers to and the chunk ids are unique.
Additional information is used directly as built-in constraints.
5.5. Translation of Rules
In our translation scheme, ACT-R rules are translated to corresponding CHR rules.
5.5.1. Auxiliary Functions for Variable Names
To manage relations between newly introduced variables, we define some auxiliary functions. The functions all produce variable names from the set of variables for a set of arguments that are from the set of constants (or a subset of it) and are applied during the translation, i.e. they do not appear in the generated CHR code.
Definition 5.9 (variable functions).
Let be the set of variables and constants of an ACT-R architecture respectively, the set of buffers of this architecture and for are disjoint subsets of the set of variables. Then the following auxiliary functions are defined:
**Chunk variable function: **
that returns a fresh, unique variable for each buffer . It identifies the chunk of a particular buffer in the translation.
**Result variable functions: **
for are defined as and return a fresh, unique variable for each buffer . They are needed to memorize the results of an action.
**Merge variable function: **
that returns a fresh, unique variable for each buffer . It is needed to memorize the new chunk identifier after merging the chunk in with the existing chunk store.
**Cognitive state variable function: **
(where the list is sorted by the ). The function returns a set of buffer-variable pairs for a set of buffers. From it follows that , i.e. can be considered as a pattern for the cognitive state.
5.5.2. Built-in Constraints for Actions
We define some built-in constraints that are needed for the translation. The idea is that they calculate the results of actions, chunk merging and chunk mapping as defined in the operational semantics of ACT-R. For actual instantiations of the abstract semantics (i.e. with a defined set of actions and chunk merging and mapping mechanisms), it has to be shown that their CHR implementations obey the properties that we define in the following.
Definition 5.10 (action built-ins).
Let be an action and an ACT-R state. Let be the CHR representation of the chunk store in and be the enumerative list representation of the cognitive state .
For all : If is the result of the interpretation of in with , the built-in constraint is defined as follows:
[TABLE]
Intuitively, the built-in constraint represents a function that gets the action , CHR representations of the constraint store and the cognitive state as input and returns by the help of the additional information the CHR representation of . The constraint theory of the constraint is well-defined, since in definition 3.11, has the domain and co-domain , hence is defined. Since , is defined.
5.5.3. Built-in Constraints for Chunk Merging
In the abstract semantics, chunk stores are merged by the operator . In the following, built-in constraints are defined that implement and the corresponding mapping function .
Definition 5.11 (merge built-in).
For a set of chunk stores with for , the built-in is defined as follows:
[TABLE]
Definition 5.12 (map built-in).
For two CHR representations of chunk stores and , the built-in constraint is defined as:
[TABLE]
The main difference to the definition of the function from the abstract semantics is that the built-in constraint operates on chunk identifiers whereas the function operates directly on chunks. Note that in the case that the chunk with identifier appears in neither in nor in , is bound to by definition of (see definition 3.2).
5.5.4. List Operations
We use the built-in constraint to denote that a term is member of a list.
Definition 5.13 (member of a list).
For a term and a list , the constraint holds, iff there is a term that is member of and .
Note that variables in are bound to the values in by this definition.
5.5.5. Translation Scheme for Rules
We can now define the translation scheme for rules.
Definition 5.14 (translation of rules).
An ACT-R rule in set-normal form can be translated to a CHR rule of the following form:
[TABLE]
Note that ACT-R constants and variables from and are implicitly translated to corresponding CHR variables.
We denote the translation of a rule by and the translation of an ACT-R model that is a set of ACT-R rules by . Thereby,
The intuition behind the translation can be described as follows:
The CHR rule tests the state for a constraint representing the chunk store and constraints that come from the buffer tests of the rule. In the constraints a variable for the chunk identifier is introduced. In the guard, the built-in constraint checks, if the chunk store represented as a list contains a term with the same type and slot-value pairs as specified in the buffer tests. The connection to the buffer is realized by the same variable for the chunk identifier (through the variable function ). Since the rule is in set-normal form, the buffer tests are already completed (i.e. all slots are tested) and represented as a sorted list of slot-value pairs as in the state.
In the body of the rule, the built-in constraint calculates the result of each action from the right-hand side of the ACT-R rule. The resulting chunk stores are merged to one store by the built-in constraint . Note that the order of merging is not specified by the translation scheme (as it is not specified by the ACT-R abstract semantics).
The built-in constraint implements the function of the ACT-R semantics and gives access to the possibly modified chunk identifier of all elements in . By definition of , only chunk identifiers in are modified by merging. The resulting chunk identifiers are bound to a variable specified by the variable function .
The resulting chunk store and constraints for all buffers are added. If the buffers have been modified, the constraints points to the resulting chunk of the action, if not it shows to the chunk that has been in the buffer before.
5.6. No Rule Transition
In addition to transitions by rule applications, ACT-R can also have state transitions without rule applications. This is useful for instance, if no rule is applicable (i.e. computation is stuck in a state) but there are pending requests, then simulation time can be forwarded to the point where the next request is finished and its results are visible to the procedural system. This may trigger new rules and continue the computation.
The no rule transition can be modeled in CHR by one individual generic rule:
[TABLE]
This transition is possible for all requests that are pending (i.e. that have a delay ). Hence, the system chooses one request non-deterministically.
6. Soundness and Completeness of the Translation
In this section, we show that our translation is sound and complete w.r.t. the abstract semantics of ACT-R. This means that every transition that is possible in the abstract ACT-R semantics is also possible in CHR and vice versa.
The first step is to show that the results of the built-in constraints in the body of a translated ACT-R rule are equivalent to the interpretation of the right-hand side of the ACT-R rule. Therefore, we use that by definition the built-in constraints and are equivalent to their ACT-R counterparts. Additionally, we have to show that the combination of the results of individual actions leads to the same result as the built-in constraints. We use induction to show that in the following lemma.
Lemma 6.1 (equivalence of effects).
For an ACT-R rule in set-normal form, a state and . Let with . Then the following two propositions are equivalent:
- (1)
[TABLE] 2. (2)
[TABLE]
Proof.
We use induction over the number of actions in .
**base case: : **
Let be an ACT-R rule in set-normal form with one action and . Let and . Let with and with .
We start with
[TABLE]
First of all, we reduce the constraint by use of definition 5.10:
[TABLE]
By definition 5.11, we can reduce the constraints to
[TABLE]
which is by definition of from the assumptions and from the last step equivalent to
[TABLE]
Since and therefore a term with identifier appears in (and not in ), we can now reduce the built-in by definition 5.12 and get together with the definitions of and to
[TABLE]
Since we have that , this is equivalent to
[TABLE]
By definition 3.14, we have that for one action and . Hence, this is equivalent to
[TABLE]
All in all, this proves the proposition for .
**induction step: : **
Let be an ACT-R state and the CHR representation of the chunk store and the enumerative list representation of the cognitive state.
Let with be a rule that has been constructed from a rule . Let with be the interpretation of the smaller rule with actions.
Let with be the interpretation of the rule that has actions..
We begin with
[TABLE]
We need to apply the induction hypothesis. Therefore, we split the conjunctions and lists and get the equivalent formula
[TABLE]
By definition 5.10 of and associativity of and neutral element , we can split the merging as follows: We first merge the actions in to and then merge with the result chunk of action to :
[TABLE]
We can now introduce an intermediate result chunk store that merges the original store with the results from , i.e. , to a chunk store . We introduce some auxiliary variables that map the chunk identifiers of the intermediate chunk store to the resulting chunk store :
[TABLE]
We can now apply the induction hypothesis:
[TABLE]
Thereby, it holds by definition of that .
If we apply definition 5.10 to the remaining constraint of action , we get
[TABLE]
Hence, we have that
[TABLE]
Thus, is the CHR version of the merging of the results from the actions in merged with the results from . By definition 3.14 of the interpretation of rules, we have that
[TABLE]
This is equivalent to
[TABLE]
by definition of and the interpretation of rules (definition 3.14).
We now reduce the last constraint to:
[TABLE]
We can now reconnect the constraints and get
[TABLE]
By definition 3.15, for all defined buffers and we have assumed that . By definition 5.12, this yields
[TABLE]
. exists, since the co-domain of is by definition 3.4 and the are identifiers for the chunks in .
If we apply all this to the conjunction in , we get
[TABLE]
∎
The next lemma proves that rule application transitions are sound.
Lemma 6.2 (soundness of rule applications).
For all ACT-R rules and ACT-R states if then .
Proof.
Let and . Since , we know that for every buffer test there is a substitution such that and for all .
Let the CHR translation of the ACT-R state . By definition 5.8, we have that
[TABLE]
In the next step we split the state to only concentrate on the parts we are interested in for the rule application, i.e. constraints for buffers that occur in a test. We do the same by splitting the representation of the constraint store to the chunks that occur in buffers and all other chunks. This is possible, since does not define an order on the terms in the list in the constraint.
[TABLE]
Note that by now we only have rewritten the CHR state without requiring that the tests we refer to match.
Due to the fact that , we can apply this knowledge to the translated state:
[TABLE]
Since is in set-normal form, we can assume that there is only one test for each buffer in . Hence, we find a constraint in for each such test without violating definition 5.8, that only produces one constraint for each buffer.
Due to set-normal form of and hence totality of w.r.t. the domain of , it is clear that . Hence, we can reduce the state to:
[TABLE]
We introduce a substitution with fresh variables to replace the chunk identifiers in the state, i.e. .
[TABLE]
Let be the conjunction of syntactic equality constraints that can be derived from the substitution , i.e. each substitution appears in as . By definition **Equality as substitution: **, we can move the substitution to the built-in store:
[TABLE]
We introduce a fresh variable and add to the built-in store, which leads to the equivalent state:
[TABLE]
By definition 5.13, we have that . We can replace the corresponding built-ins by definition **Transformation of the constraint store: **:
[TABLE]
Let . By definition 5.14, we have that
[TABLE]
By definition 5.4, is applicable in and with
[TABLE]
Due to the definition 5.14 of , we have that
[TABLE]
Since is in set-normal form, all buffers appearing in also appear in . Hence, all constraints removed by are added in . There is exactly one constraint in . It remains to show that the and constraints are the ones describing the state .
By lemma 6.1, we have that for all with for all
[TABLE]
All in all, we get by definition 5.8 that . Hence, the translation of rule applications is sound w.r.t. the abstract operational semantics of ACT-R. ∎
We have now shown that our translation is sound regarding rule applications, i.e. a rule that can be applied in the abstract semantics of ACT-R can also be applied in the translated CHR program and leads to the same result. The state transition system of ACT-R has a second type of transitions: the no-rule transitions, that can be applied if there is a buffer with a delay , i.e. an invisible chunk. In that case, the no-rule transition allows us to make this chunk visible to the production system by setting the delay to zero. We now show that our translation is sound and complete regarding the no-rule transition.
Lemma 6.3 (soundness and completeness of the no-rule transition).
For an ACT-R model , ACT-R states and and their CHR counterparts and the following propositions are equivalent:
- (1)
* in the model * 2. (2)
**
Proof.
We start with proposition 1. Let and . In the abstract semantics, the no-rule transition is defined as
[TABLE]
where \gamma^{\prime}(b)=\begin{cases}(c^{*},0)&\text{if b=b^{*},}\\ \gamma(b)&\text{otherwise.}\end{cases}
The no-rule transition in CHR is represented by the following CHR rule in :
[TABLE]
Since the no-rule transition is applicable in for some arbitrary but fixed , it holds that
[TABLE]
Therefore, in there must have the following form by definition 5.8:
[TABLE]
Since , this is equivalent to
[TABLE]
and therefore the rule no can be applied to by definition 5.4. This leads to the state with
[TABLE]
By definition 5.8, we have that .
The other direction is analogous. ∎
Lemma 6.4 (completeness of rule applications).
Let be CHR states that have been translated from ACT-R states, i.e. there are ACT-R states such that and . Let be a CHR rule translated from an ACT-R rule . If then .
Proof.
The CHR rule has been translated from an ACT-R rule . Let be an ACT-R rule in set normal form. Then has the following form by definition 5.14:
[TABLE]
where are the built-in constraints of the body and the CHR constraints.
Since is applicable in , by definition 5.4, must have the following form:
[TABLE]
Thereby, is a multi-set of user-defined constraints and a conjunction of built-in constraints. Since by definition of ACT-R states, must be ground and therefore there must be some built-in constraints in that bind those variables to the values from the state:
[TABLE]
is a conjunction of constraints of the form for a variable and a term , that binds the variables from and to the values from the state, i.e. the variables (for all ) and the variables appearing in the from the guard are bound to some values from the state. We denote as the substitution that follows from . Furthermore, must contain the additional information :
[TABLE]
We now want to construct the ACT-R state . Let . Thereby, and for all by definition 5.8. Additionally, . We can now apply the same equivalences as in lemma 6.2 in reverse order and get
[TABLE]
From there it is clear that by definition 4.1. We now apply to according to definition 3.15 and get
[TABLE]
where and \gamma^{\prime}(b):=\begin{cases}(\mathord{\mathit{map}}_{\Delta,\Delta^{*}}(c),d)&\text{if }\gamma^{*}(b)=(c,d)\text{ is defined}\\ (\mathord{\mathit{map}}_{\Delta,\Delta^{*}}(c),d)&\text{otherwise, if \gamma(b)=(c,d).}\end{cases}
By definition 5.4, the application of in leads to the state with
[TABLE]
By lemma 6.1, we get that is equivalent to
[TABLE]
Hence, and correspond directly to each other, i.e. and therefore rule transitions in the translation are complete w.r.t. the abstract operational semantics of ACT-R. ∎
Theorem 6.5 (soundness and completeness of translation).
For an ACT-R model , ACT-R states and and their CHR counterparts and the following propositions are equivalent:
- (1)
* in the model * 2. (2)
**
Proof.
This follows directly from lemmas 6.2, 6.3 and 6.4. ∎
7. Related Work
We want to highlight the contribution of (Albrecht and Westphal, 2014) that we have used as a starting point to improve our work by unifying and extending it by our needs. We discuss this line of work in detail in section 7.1. In section 7.2 we summarize other work related to this paper.
7.1. Formal Semantics According to Albrecht et al.
The formalization according to (Albrecht and Westphal, 2014) has been developed independently from our work in (Gall, 2013; Gall and Frühwirth, 2015a). Our very abstract semantics is based on it. (Albrecht and Westphal, 2014) basically defines a general production rule system that works on sets of buffers and chunks without specifying actual matching, actions and effects for the sake of modularity and reusability. We briefly summarize the differences between the semantics in (Albrecht and Westphal, 2014) and our very abstract semantics. For details, we refer to the original papers. The nomenclature in this paper differs in some points from the original paper (Albrecht and Westphal, 2014) to unify it with our previous work. We omit module queries for the sake of brevity.
The sets of buffers and action symbols are defined as in section 2. For the sake of brevity, we have omitted the so-called buffer queries in our definition of the very abstract semantics. Queries are an additional type of test on the left-hand side of a rule. The very abstract semantics can be easily extended by queries. We have adopted the definition of chunk types, chunks and cognitive states from the formalization of (Albrecht and Westphal, 2014), although the set of chunks in (Albrecht and Westphal, 2014) should be a multi-set as example 3.3 shows. However, we have reduced the definition in our very abstract semantics by omitting the notion of a finite trace, which is a sequence of cognitive states. Those traces are used to compute the effects of an action. This definition seems inaccurate as the information of a finite trace that only logs the contents of the buffers at each step does not suffice to calculate sub-symbolic information. In typical definitions the calculation of production rule utilities needs the times of all rule applications that are not part of the trace. In other implementations and instantiations of ACT-R, there can be more additional information that is needed for sub-symbolic calculations. That is why we have extended the states by a parameter valuation function that abstracts from the information needed and leaves it to the architecture to define which information is stored.
In (Albrecht and Westphal, 2014), effects of actions with action symbol are defined by an interpretation function (we have omitted queries as stated before). Similarly to the very abstract semantics, it assigns to each finite trace the possible effects of an action. Effects are a partial cognitive state that overwrites the contents of the buffers as in the very abstract semantics and a set that defines the chunks that are removed. In typical implementations of ACT-R, the chunks in are moved to the declarative module which explains the need to define such a set. We have generalized this information by the notion parameter valuations that can be manipulated by an interpretation function. This enables us to abstract from the specific concept of moving chunks to declarative memory in our abstract semantics for example. Note that in (Albrecht and Westphal, 2014), the combination of interpretation functions to a rule interpretation is only stated informally. Additionally, we have extended the domain of an action interpretation function to actions, i.e. terms over the actions symbols in , and states instead of only action symbols, since more information is needed to calculate, like the parameters of the actions (i.e. the slot-value pairs) and information from the state.
The production rule selection function maps a set of applicable rules to each finite trace. In the very abstract semantics we have extended the domain from traces to a whole state since again additional information might be needed to resolve rule conflicts. With parameter valuations, we abstract from the information that is actually needed and leave it to the architecture definition. Additionally, our definition of selection function adds the notion of variable bindings that are not considered in (Albrecht and Westphal, 2014).
The operational semantics in (Albrecht and Westphal, 2014) is defined as a labeled, timed transition system with the following transition relation over time-stamped cognitive states from :
[TABLE]
for a production rule , an execution delay , a set of chunks and a finite trace , if and only if , i.e. is applicable in , the actions of according to the interpretation functions yield and .
Note that the set of chunks has been used but never defined in the original paper (Albrecht and Westphal, 2014). We suspect that it represents an equivalent to the chunk store from our abstract semantics, i.e. the used subset of all possible chunks (which is how is defined according to the paper). Although we consider it an integral part of ACT-R, the matching of rules – and particularly binding of variables by the matching – is completely hidden in or even not defined. On the one hand this simplifies exchanging the matching, on the other hand the function should then be defined slightly different to enable proper handling of variable bindings and conflict resolution as we discuss in section 3.
In the original semantics according to (Albrecht and Westphal, 2014) there is no definition of what happens if there is no rule applicable, but there are still effects of e.g. requests that can be applied. We have treated this case by adding the no rule transition to the very abstract semantics.
7.2. Other Work
The reference implementation of ACT-R is described in a technical document (Bothell, ) that defines the operational semantics mostly verbally and determines various technical details that are important for this exact implementation but not the architecture itself. In (Gall and Frühwirth, 2015), we have defined a semantics that describes the core of the reference implementation of ACT-R and show that every transition possible in this refined semantics is also possible in the abstract semantics. This shows that formal reasoning about our abstract semantics is meaningful to actual implementations.
There are approaches of implementing ACT-R in other languages, for example a Python implementation (Stewart and West, 2007) or (at least) two Java implementations (Harrison, 2008; Salvucci, ). All those approaches do not concentrate on formalization and analysis, but only introduce new implementations. In (Stewart and West, 2007) it is stated that exchanging integral parts of the ACT-R reference implementation is difficult due to the need of an extensive knowledge of technical details. They propose an architecture that is more concise and reduced to the fundamental concepts (that they also identify in their paper). However, their work still lacks a formalization of the operational semantics.
In (Albrecht et al., 2014), the authors summarize the work on semantics in the ACT-R context. They also come to the conclusion that there are only new implementations available that sometimes try to formalize parts of the architecture, but no formal definition of ACT-R’s operational semantics. The authors use this result as a motivation for their work in (Albrecht and Westphal, 2014).
We describe an adaptable implementation of ACT-R using Constraint Handling Rules (CHR) in (Gall, 2013; Gall and Frühwirth, 2014, 2015b) that is based on our formalization. Due to the declarativity of CHR, the implementation is very close to the formalization and easy to extend. This has been proved by exchanging the conflict resolution mechanism (that is an integral part of typical implementations) with very low effort (Gall and Frühwirth, 2014). Even the integration of refraction, i.e. inhibiting rules to fire twice on the same (partial) state, has been exemplified and can be combined with other conflict resolution strategies. The translation presented there is close to both the core of the reference implementation and the abstract semantics whose abstract parts are defined such that the match the reference implementation and some of its extensions.
8. Conclusion
In this paper, we have defined a very abstract operational semantics for ACT-R that can serve as a common base to analyze other operational semantics since it leaves enough room for various ACT-R variants. We then have refined this semantics to an abstract semantics.
Similar to the very abstract semantics, the abstract semantics abstracts from details like timings, latencies, forgetting, learning and specific conflict resolution. However, it defines the matching of rules and the processing of actions as they are typically found in ACT-R implementations. Hence, the abstract semantics concentrates on an abstract version of the typical implementations of ACT-R’s procedural system. This makes it possible to reason about the general transitions that are possible in many ACT-R implementations.
We have defined a translation of ACT-R models to Constraint Handling Rules (CHR) that is sound and complete w.r.t. our abstract semantics of ACT-R. To the best of our knowledge, the abstract semantics together with the sound and complete translation to CHR is the first formal formulation of ACT-R that is suitable to implementation.
For the future, we want to investigate how we can use our abstract semantics in practice, since the faithful embedding in CHR opens many possibilities to reason about cognitive models by applying theoretical results from the CHR world to ACT-R models. For instance, confluence is the property that a program always yields the same result for the same input regardless of the order rules are applied. In CHR, there is a decidable confluence criterion for terminating programs (Frühwirth, 2009). Although the human mind is probably not confluent because there are many competing strategies with different outputs for the same task, there are always sequences of rules in cognitive models that should not be interfered by any other rule. A confluence criterion helps identifying the parts of the model that are not confluent. This can improve model quality by allowing for controlled non-confluence where desired guaranteeing the rules in the rest of the program not interfering with each other.
However, in practice confluence usually is too strict. With the notion of invariant-based confluence (Duck et al., 2007) only valid states that can be reached are considered, making confluence analysis applicable for practical use. Confluence modulo equivalence (Christiansen and Kirkeby, 2016) is a recent approach to test if programs always yield states that are considered equivalent by an arbitrary definition of state equivalence for the same input. This could be used to analyze if an ACT-R model always yields a certain class of chunks for the same input. For instance it could be interesting for the modeler to know if a certain buffer always contains a chunk of a certain chunk type or with a certain value in some slot at the end of a computation. By that method, models could guarantee certain properties on their final states improving explanatory power and quality of cognitive models.
To make predictions on the probability that a cognitive model has a certain result, we plan to use the CHR extension CHRiSM (Sneyers et al., 2010; Sneyers et al., 2009) that allows to enrich CHR rules with probabilities. It supports probability computation and even an expectation-maximization learning algorithm that could be used for parameter learning of cognitive models.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1(1)
- 2Albrecht et al . (2014) Rebecca Albrecht, Michael Gießwein, and Bernd Westphal. 2014. Towards formally founded ACT-R simulation and analysis, In Proceedings of the 12th Biannual conference of the German cognitive science society (Gesellschaft für Kognitionswissenschaft). Cognitive Processing 15 (Suppl. 1) (2014), 27–28.
- 3Albrecht and Westphal (2014) Rebecca Albrecht and Bernd Westphal. 2014. F-ACT-R: defining the ACT-R architectural space, In Proceedings of the 12th Biannual conference of the German cognitive science society (Gesellschaft für Kognitionswissenschaft). Cognitive Processing 15 (Suppl. 1) (2014), 79–81.
- 4Anderson (2007) John R. Anderson. 2007. How can the human mind occur in the physical universe? Oxford University Press.
- 5Anderson et al . (2004) John R. Anderson, Daniel Bothell, Michael D. Byrne, Scott Douglass, Christian Lebiere, and Yulin Qin. 2004. An Integrated Theory of the Mind. Psychological Review 111, 4 (2004), 1036–1060.
- 6Anderson and Lebiere (1998) John R. Anderson and Christian Lebiere. 1998. The Atomic Components of Thought . Lawrence Erlbaum Associates, Inc.
- 7Betz and Frühwirth (2005) Hariolf Betz and Thom Frühwirth. 2005. A Linear-Logic Semantics for Constraint Handling Rules . Springer Berlin Heidelberg, Berlin, Heidelberg, 137–151. DOI: http://dx.doi.org/10.1007/11564751_13 · doi ↗
- 8(8) Dan Bothell. ACT-R 6.0 Reference Manual – Working Draft . Department of Psychology, Carnegie Mellon University, Pittsburgh, PA.
