An interdisciplinary overview of developmental indices and behavioral measures of the minimal self
Yasmin Kim Georgie, Guido Schillaci, Verena Vanessa Hafner

TL;DR
This review explores how the minimal self develops in humans, how it is measured behaviorally, and how robotics research is attempting to replicate and expand these concepts in artificial agents.
Contribution
It provides an interdisciplinary overview linking developmental psychology, behavioral measures, and robotics research on the minimal self.
Findings
Behavioral measures of body ownership and agency are key indicators of the minimal self.
Robotics research is increasingly integrating concepts of the minimal self to develop autonomous agents.
Potential pathways exist for expanding robotics research to better understand and simulate human self-development.
Abstract
In this review paper we discuss the development of the minimal self in humans, the behavioural measures indicating the presence of different aspects of the minimal self, namely, body ownership and sense of agency, and also discuss robotics research investigating and developing these concepts in artificial agents. We investigate possible avenues for expanding the research in robotics to further explore the development of an artificial minimal self.
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 402790442 (“Prerequisites for the Development of an Artificial Self”) within the SPP “The Active Self” (SPP 2134). The work of GS has partially received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 838861 (Predictive Robots).
Yasmin Kim Georgie
Adaptive Systems Group
Humboldt-Universität zu Berlin
Berlin, Germany
Guido Schillaci
The BioRobotics Institute
Scuola Superiore Sant’Anna, Pisa, Italy
and Adaptive Systems Group
Humboldt-Universität zu Berlin
Verena Vanessa Hafner
Adaptive Systems Group
Humboldt-Universität zu Berlin
Berlin, Germany
Index Terms:
Models of self and agency; Sensorimotor development; Machine Learning methods for robot development
I Introduction
For centuries, philosophers and scholars from different disciplines have been debating the nature of subjective experience and self-consciousness. A recent account brings back a phenomenological view on this debate, researching self-consciousness in its minimal form, that is studying subjective experiences in their immediate and first-personal ”mineness”. According to Gallagher, this so-called minimal self refers to the ”consciousness of oneself as an immediate subject of experience, unextended in time” [1]. Aspects of this minimal self involve the sense of agency—the sense of the self as the one causing or generating an action, and the sense of ownership—the sense of the self as the one subjected to an experience [1].
This view is distinct from more elaborated aspects of the self, such as the reflexive self and the narrative self [2]. The minimal self is closer to a minimalist level of subjective experience, where the focus is more on the contribution of the here-and-now bodily experience to its construction [3]. Other low-level notions of the self have been proposed in the literature, such as the proto-self and the immunological self [4]. Here, we do not enter the debate about which of them constitutes the lowest level of consciousness, as researchers have not yet reached an agreement. We commit to the notion of the minimal self, as this aspect of consciousness is perhaps the most easily accessible in terms of experimental exploration and quantification, and is in fact receiving greater attention from disciplines such as neuroscience, behavioural and cognitive sciences, and developmental psychology [5, 6].
Developmental psychologists consider the emergence of a sense of the self as a key step in cognitive development. By the second year of life, thus a few months after having acquired basic linguistic skills, toddlers are capable of using self-referential language such as I, me, my, suggesting that the acquisition of a self-concept has started earlier. The minimal self is pre-linguistic and non-conceptual, and is suggested to unfold already during early developmental stages [7].
This paper presents an interdisciplinary overview of developmental indices and behavioural measures of the minimal self. The minimal self is argued to include two main aspects—a sense of agency and a sense of body ownership—which are thought to be dependent on an internal body representation maintained by our brain. This manuscript thus starts with discussing the development of body representations as a necessary condition for the emergence of the senses of body ownership and of agency (section II). Behavioural paradigms and measures indicating the presence of different aspects of the minimal self (section III) are analysed. In particular, we survey studies on self-touch, intentional binding and sensory attenuation, and the rubber hand illusion.
Alongside the survey on the developmental aspects of the minimal self and the aforementioned behavioural paradigms and measures, the main goal of this work is to discuss the most prominent related studies in robotics. In fact, there is a growing interest in the developmental robotics community in implementing processes capable of enabling the experience of the self in artificial agents. Self-awareness may improve adaptivity and autonomy in robots and, as a result, reduce human intervention in their programming. Using robots as test-beds for studying the minimal self may also shed light on the cognitive mechanisms underlying subjective experience. Nonetheless, the investigation of the artificial self is still young and fragmented. This work contributes by identifying current knowledge gaps and limitations in robotics studies and by suggesting research directions for the implementation of behavioural paradigms and measures for the artificial self.
II The development of body representations
In order to effectively interact with the environment, an embodied agent must form and maintain an internal representation of its own body situated within the environment. The term representation may sound controversial. In fact, scholars usually take either a representationalist or a sensorimotor position in the philosophical study of bodily awareness (see [8], section 2, for a review), which may lead to the definition of body image (the mental representation of the body, constituted by a combination of sensorimotor experience and social and psychological concepts about it) and body schema (the integrated, neural organisation of multimodal stimuli coming from the different parts of the body, which is essential for movements). Here, we commit to the latter interpretation of the term body representation, that is, a mapping of the body in its various modalities (tactile, proprioceptive, visual, motor, etc.), which is operated in a nonconscious way [9]. The neural foundations for these representations are the so-called cortical “homunculi” in the primary sensory (S1) and motor (M1) cortices. These are neurological representations of the different anatomical divisions of the body, mapped onto brain areas charged with sensory and motor processing along S1 and M1, respectively. These specialised areas are organised in a somatotopic map where adjacent body parts are represented closely together. The extent of cortex dedicated to a body region is not proportional to its size in the body, but rather to the density of innervation in that specific part (e.g. the mouth and palms). The establishment of the somatotopic organisation in S1 and M1 is driven by genetic factors that are later elaborated through changes in connectivity driven by embodied interactions both before and after birth [10].
Body representations dynamically integrate information from different sensory modalities: tactile, proprioceptive, vestibular, motor, and visual. Studies suggest that the first sense to emerge in the foetus is the somatosensory sense [11]: foetuses are in a state of constantly being touched by their environment. In addition, they engage in self-touch in the womb, often touching their mouth and feet (body parts that are highly innervated and therefore most sensitive to touch) and, later on, other parts of the body. The early inclination for movements and self-touch in parts of the body that are more sensitive suggests that the foetus shows a preference towards movements that induce more informative sensations [12].
From as early as 19 weeks, foetuses can anticipate hand-to-mouth touch, opening their mouth prior to contact [13, 14], indicating the early presence of a sort of sensorimotor mapping and inference. From 22 weeks, movements seem to show a form of intentionality, as they become more direct depending on the action goal [15]. Evidence from neural development studies suggests that even before birth, the prenatal brain should be able to perceive information arising from the body, while higher level (multimodal) representations are possibly formed during the first year after birth, in accordance with the development of association areas [16].
At around 2 months of age, the dominant control of behaviours transitions from subcortical to higher order cortical systems [17]: PET studies show dominant metabolic activity in subcortical regions and the sensorimotor cortex in infants younger than 5 weeks, and by 3 months, an increase in metabolic activity in the parietal, temporal, and dorsolateral occipital cortices [18]. Hand-mouth coordination continues to develop after birth, and from birth to 6 months, infants display self-touch in a progressive manner throughout their body, from frequently touching rostral parts such as the head and trunk, to more caudal parts of the body such as the hips, legs, and feet later on [19]. The development of goal-directed reaching considerably speeds up at about 5 months of age. Reaching to one’s own body is thought to be the product of interactions between multiple subsystems. The body representation is constructed through reaching to the body because, in doing so, the body that is used to act upon the environment becomes the target itself, and therefore needs to be modelled. In certain cases, the self-touch process seems to bypass vision, as when the target is the face, relying only on somatosensory information.
In summary, genetically predetermined cortical maps—the ”homunculi” in the primary sensory (S1) and motor (M1) cortices—facilitate the formation of body representations through cortical learning of sensorimotor contingencies—i.e. the statistical connections between sensory and motor information, and sensorimotor integration—integrating this information into common percepts. This learning of sensorimotor contingencies both drives and is rendered through interactions between brain-body-environment. Specifically, interlinked with the neural ontogenetic process (brain maturation, brain-body interaction), self-exploration (body babbling), self-touch, and goal-directed reaching are considered the necessary behavioural conditions that facilitate and reflect this process. Importantly, this process is thought to be driven and progressively refined by the reduction of prediction errors between predicted sensory outcomes and motor actions, such that the agent learns not only to predict the outcomes of its actions on the environment, but also to predict the (sensory) outcomes of its (motor) actions on its own body [20, 21].
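The prediction-error-driven refinement summarised above can be illustrated with a minimal sketch. It assumes a hypothetical one-dimensional "body" whose sensory outcome is a noisy linear function of the motor command; the values `true_gain` and `true_offset`, the learning rate, and the delta-rule update are illustrative assumptions and are not taken from the cited studies. Random motor commands stand in for body babbling.

```python
import random


def babble_and_learn(n_steps=2000, lr=0.05, seed=0):
    """Learn a linear forward model (motor -> sensation) from random babbling."""
    rng = random.Random(seed)
    # Hypothetical body: sensation = true_gain * motor + true_offset + noise.
    true_gain, true_offset = 0.8, 0.1
    w, b = 0.0, 0.0  # forward-model parameters, initially uninformed
    abs_errors = []
    for _ in range(n_steps):
        motor = rng.uniform(-1.0, 1.0)  # exploratory motor command
        sensed = true_gain * motor + true_offset + rng.gauss(0.0, 0.01)
        predicted = w * motor + b
        error = sensed - predicted  # sensory prediction error
        # Delta rule: nudge the model so that the prediction error shrinks.
        w += lr * error * motor
        b += lr * error
        abs_errors.append(abs(error))
    return w, b, abs_errors


w, b, errs = babble_and_learn()
early = sum(errs[:100]) / 100
late = sum(errs[-100:]) / 100
print(f"gain={w:.3f} offset={b:.3f} early_err={early:.3f} late_err={late:.3f}")
```

In this toy setting the learned parameters approach those of the simulated body, and the average prediction error late in babbling is much smaller than at the start. A real agent would of course face nonlinear, multimodal, and delayed contingencies.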
Rochat [22] describes the idea that infants’ self-exploration, and interactions with the environment, give rise to the sense of body ownership through the ”ability to detect intermodal invariants and regularities in their sensorimotor experience, which specify themselves as separate entities agent in the environment.” Therefore, self-exploration (body babbling), self-touch, and goal-directed reaching are necessary conditions for the development of motor control and the emergence of body ownership and sense of agency.
III Behavioural paradigms and measures of the self
A challenging task in the study of the minimal self is to experimentally quantify the attribution of subjective experience. A number of behavioural paradigms and attempts for objective measures have been proposed. This section reviews the most prominent ones. In particular, we analyse studies on self-touch, intentional binding and sensory attenuation, and the rubber hand illusion. There are different reasons why we consider them important in this study:
- A. Self-touch is likely to contribute to the formation of initial sensorimotor representations, and may therefore constitute one of the very first cues for subjective experience during early developmental stages.
- B. The way the brain interprets action effects has been shown to differ depending on whether the sensory perception is self- or externally produced, with respect both to the perceived timing of their occurrence (intentional binding) and to their intensity (sensory attenuation) [23].
- C. The rubber hand illusion has been extensively used as a paradigm for investigating the mechanisms underlying the sensorimotor minimal self.
An obvious advantage of using robots as test-beds for cognitive models of humans is that they allow unmediated access to the actual process (the algorithm) and to the information that is registered and processed in the system. In human research, even with advanced neuroimaging and analysis methods, researchers can only ever approximate the actual cognitive processes underlying behaviour. This is especially true when investigating subjective experiences such as the illusion of a dummy hand or object being “mine”, which is measured by observing behaviour or with subjective self-reports from human participants; the difficulty lies in developing and, most importantly, validating objective measures of cognitive processes (e.g. neuroimaging, proprioceptive drift). The question then arises of how we can measure a “subjective” experience in a robot.
III-A Self-Touch
How does self-touch relate to body ownership?
Throughout development, the brain must establish links between sensory and motor maps. Refinement of these links eventually leads to goal directed actions. Establishing the basis of this lies in forming sensorimotor contingencies—the statistical links between sensory and motor information.
These sensorimotor contingencies are thought to be established through body babbling, where the infant moves its body in an exploratory manner while the brain initiates actions and organises the resulting sensory outcomes, continuously refining its ability to predict sensory outcomes from motor actions. The process of establishing sensorimotor contingencies is gradual due to the large amount of information that is being processed: sensory inputs, motor outputs, and the statistical correlations between them.
Hoffmann et al. [24] suggest that the most systematic correlations are those that will emerge most easily, and therefore, the links between motor actions and proprioceptive changes are presumably the simplest to be extracted, followed by the links between motor actions and tactile input. Hoffmann [16] suggests that the redundant information arising from the configurations of self-touch in the proprioceptive-tactile space facilitates learning the body model in space. He asserts that pre-natal self-touch likely contributes to the formation of initial somatosensory representations. Evidence for an early integration between modalities comes from the instances of hand-to-mouth anticipation already in the womb [13, 14]. However, the formation of more comprehensive multimodal body representations probably occurs after birth, from 2-3 months, to include the visual modality and its connections to tactile-proprioceptive modalities. These are learned through self-exploration including self-touch within the environment, which involves learning temporal contingencies, spatial congruence, and redundancies of information coming from different sensory modalities.
Rochat et al. [25] provide evidence that early on infants are capable of discriminating perceptual events (tactile stimuli) that are either self- or not self-produced. The authors tested the rooting behaviour (a reflex triggered by touching the cheek of the infant) in 24-hour-old and 4-week-old infants, and reported that infants tended to manifest rooting responses almost three times more often in response to external compared to self-stimulation. This suggests that infants already at birth pick up sensorimotor contingencies (single touch or double touch) that specify self- versus external stimulation.
In a longitudinal study, Hoffmann et al. [24] observed how infants between 3 and 21 months react to vibrotactile stimulation applied to different body parts. They report responses that varied between movement of the stimulated body part and successfully reaching and removing the buzzer. They found an overall developmental progression from general to specific movement patterns, especially from 3 to 12 months. Specifically, their results suggest that at 3-4 months, the infant responds to the buzzer in a non-specific way by moving its whole body, rather than moving the particular limb that was stimulated. However, between 4 and 12 months, limb-specific, buzzer-oriented reaching develops.
Robotics research
Yamada et al. [26] worked on an embodied brain model of a human foetus in order to examine the causal link between sensorimotor experiences through embodied interactions and cortical learning of body representations. The embodied brain model was based on anatomical and physiological data and included a cortex, a spinal circuit, and a musculoskeletal body with sensory receptors for proprioception, tactile perception, and vision, within a model of a uterine environment. The results of this study showed that embodied interactions between brain, body, and environment within the womb help to guide the cortical learning of body representations through regularities in sensorimotor experiences. Also, the embodied interactions inside the womb provided a better arena for forming cortical body representations when compared to extrauterine embodied interactions. These findings support the notion suggested by previous studies on animal newborns and preterm infants that the formation of body representations begins even before birth. In addition, their findings suggest that embodied interactions inside the womb set the stage for visual-somatosensory integration after birth [26].
Expanding on this line of research but in a real robot body, Noda et al. [27] equipped a humanoid robot with soft skin sensors and let it engage in human-robot touch interactions. The tactile data generated through such interactions were used by the authors to form a self-organising somatosensory map, where the feature space was composed of spatially-adjacent sensor pairs. Computational body representations based on self-organising maps and multimodal integration through Hebbian connections have been proposed by Schillaci et al. [28], although not in the context of self-touch. Nonetheless, the model endowed a humanoid robot with predictive capabilities, fundamental for the study of sensory attenuation processes and sense of agency (see next section).
In [29], Hoffmann et al. targeted the development of self-organising body representations in the iCub humanoid robot, learning somatosensory representations of the tactile space. They examined how the iCub robot, equipped with artificial skin, could form a topographic body representation by learning from tactile stimulations over the surface of its skin. They used modified self-organising maps (SOMs) that restrict the size of the maximum receptive field (MRF) of neuron groups at the output layer in order to reproduce the genetically predetermined organisation of somatotopic cortical maps. The formation of a tactile map organisation (predetermined with the MRF-SOM) has been reproduced by using training data obtained from a “double-touch” procedure, in which two experimenters provided tactile stimulation in two different places on the artificial skin.
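To make the self-organising map idea concrete, the following is a minimal sketch of a plain one-dimensional SOM trained on touch locations along a hypothetical skin strip normalised to [0, 1]. It is a deliberate simplification: the maximum-receptive-field restriction of the MRF-SOM is omitted, and the number of units, the decay schedules, and the uniform touch distribution are assumptions chosen for illustration, not the configuration used in [29].

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose preferred location is closest to input x."""
    return min(range(len(weights)), key=lambda i: abs(weights[i] - x))


def train_som_1d(n_units=10, n_steps=3000, seed=0):
    """Train a 1-D SOM on touch locations drawn from a skin strip [0, 1]."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]
    for t in range(n_steps):
        frac = t / n_steps
        lr = 0.5 * (1.0 - frac) + 0.01  # decaying learning rate
        sigma = (n_units / 2.0) * (1.0 - frac) + 0.5  # shrinking neighbourhood
        x = rng.random()  # location of one touch event
        bmu = best_matching_unit(weights, x)
        for i in range(n_units):
            # Units near the winner on the map lattice are updated more strongly.
            h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
            weights[i] += lr * h * (x - weights[i])
    return weights


def adjacent_inversions(weights):
    """Count neighbouring pairs that break a monotone (topographic) ordering."""
    ups = sum(1 for a, b in zip(weights, weights[1:]) if b > a)
    downs = len(weights) - 1 - ups
    return min(ups, downs)


som = train_som_1d()
print([round(v, 2) for v in som], "inversions:", adjacent_inversions(som))
```

In one-dimensional settings such as this, training typically yields a map in which neighbouring units prefer neighbouring skin locations, which `adjacent_inversions` can be used to check. Restricting which units may respond to which body part, as in the MRF-SOM, would be an additional constraint on the winner search.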
Current limitations in robotics research
The authors in [29] focused on mimicking the formation process of the cortical “homunculus” in a real robot, with a topography determined a priori. The training data for the model was not generated by the robot itself, but rather with two methods: either with simulated data or with a human (or two, for studying multi-touch) experimenter touching the robot. Indeed, investigating self-touch in robotics would require the robot itself to generate the behaviour that drives the formation of the body representation, which could incorporate the work on proprioceptive representations in [30]. Another possible expansion could be to implement this in a predictive coding framework, such that the self-touch behaviour would be driven by prediction error minimization.
III-B Intentional binding and sensory attenuation
How do intentional binding and sensory attenuation relate to agency?
The human brain interprets action effects differently depending on whether the sensory perception is self-produced or externally generated, with respect both to the perceived timing of their occurrence (intentional binding) and to their intensity (sensory attenuation) [23].
In 2002, Haggard et al. studied the perceived time of intentional actions and their sensory consequences, while investigating action awareness [31]. Interestingly, they found that voluntary actions and their effects are perceived as closer in time, compared to the perceived time shift between involuntary movements—induced by transcranial magnetic stimulation (TMS)—followed by the same effects. Specifically, subjects perceived voluntary movements as occurring later than when they actually occurred, and their sensory consequences as occurring earlier. This effect, known as intentional binding, has attracted the attention of many scholars interested in shedding light on the nature of the sense of agency (see [32] for a review). Engel and Singer reported several pieces of evidence from animal and human studies suggesting that temporal dynamics in neuronal activity may be critically involved in conscious states, in particular that synchronisation may be involved in the generation and maintenance of sensory awareness [5].
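The way such a binding effect can be quantified is sketched below as simple arithmetic. The sketch assumes that each judgement error is defined as perceived time minus actual time, measured once in a baseline condition (the event occurs alone) and once in an operant condition (the action is followed by its effect). All numbers are hypothetical and are not data from [31].

```python
def binding_shift(operant_error_ms, baseline_error_ms):
    """Change in judgement error when action and effect occur together."""
    return operant_error_ms - baseline_error_ms


def interval_compression(action_shift_ms, effect_shift_ms):
    """How much the perceived action-effect interval shrinks."""
    return action_shift_ms - effect_shift_ms


# Hypothetical judgement errors in milliseconds (perceived minus actual time).
action_shift = binding_shift(operant_error_ms=20, baseline_error_ms=5)
effect_shift = binding_shift(operant_error_ms=-30, baseline_error_ms=15)

print(action_shift)   # positive: the action is perceived as later
print(effect_shift)   # negative: the effect is perceived as earlier
print(interval_compression(action_shift, effect_shift))
```

With these illustrative values the action shifts by +15 ms and the effect by -45 ms, so the perceived interval is compressed by 60 ms. An involuntary movement would be expected to show a smaller or reversed compression under the same computation.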
Sensory attenuation refers to the partial cancellation, or reduction, of the perceived intensity of the effects of a self-initiated action. Several studies show similar effects. Blakemore et al., for instance, found that self-produced tactile stimulation was perceived as less intense compared to when the same stimulus was produced externally [20]. In the experiment, participants moved the arm of a robot with their left hand in order to produce tactile stimuli on their right hand via a second robot. The authors found that varying the delay between the movements of the left hand and the resulting movements producing the tactile stimuli on the right hand, and varying the degree of trajectory perturbation, both had an effect on the rating of tickliness. Participants perceived the stimuli produced by more delayed and more perturbed movements as more tickly, suggesting that self-produced movements attenuate resulting sensations and that a prerequisite for this attenuation is that stimuli and their causal motor commands correspond in time and space.
Robotics research
In [33], Michel et al. experimented with the incremental learning of the characteristic time delay inherent in the action-perception loop of a humanoid robot, using a sequence of random arm motions within its visual field. Interestingly, the study showed that the learned time delay can be successfully used to identify own body parts in the visual field.
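The idea of learning a characteristic delay and then using it to label visual motion as self-generated can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of [33]: signals are binary onset sequences sampled at a fixed rate, the delay is taken as the most frequent lag between a motor onset and the next motion onset, and the helper names (`make_signal`, `learn_delay`, `self_score`) and all onset times are hypothetical.

```python
def make_signal(onsets, length=40):
    """Binary time series with 1 at each onset sample."""
    return [1 if t in onsets else 0 for t in range(length)]


def onset_times(signal):
    return [t for t, v in enumerate(signal) if v]


def learn_delay(motor, motion, max_lag=10):
    """Most frequent lag between a motor onset and the next motion onset.

    Raises ValueError if no motion follows any motor onset within max_lag.
    """
    counts = {}
    motion_on = onset_times(motion)
    for t in onset_times(motor):
        lags = [u - t for u in motion_on if 0 <= u - t <= max_lag]
        if lags:
            counts[min(lags)] = counts.get(min(lags), 0) + 1
    return max(counts, key=counts.get)


def self_score(motor, motion, delay, tolerance=1):
    """Fraction of motor onsets followed by motion at about the learned delay."""
    motor_on = onset_times(motor)
    motion_on = set(onset_times(motion))
    hits = sum(
        any((t + delay + k) in motion_on for k in range(-tolerance, tolerance + 1))
        for t in motor_on
    )
    return hits / len(motor_on)


motor = make_signal({2, 9, 15, 24, 30})
own_arm = make_signal({5, 12, 18, 27, 33})       # motion 3 samples after each command
other_object = make_signal({0, 7, 20, 21, 36})   # motion unrelated to the commands

delay = learn_delay(motor, own_arm)
print(delay, self_score(motor, own_arm, delay), self_score(motor, other_object, delay))
```

A region whose motion reliably follows motor onsets at the learned delay receives a high score and can be labelled as part of the robot's own body. Note that the sketch relies on timing alone and integrates no other modality.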
Lang, Schillaci and Hafner [34] studied how a humanoid robot can learn, through a self-exploration behaviour, the sensory outcomes (in the visual domain) of self-generated movements. The sensorimotor experience gathered during this process was used as training data for a deep convolutional neural network that mapped proprioceptive and motor data (e.g. initial arm joint positions and applied motor commands) onto the visual outcomes of these actions. The authors then used this forward model in two experiments. In the first, the model generated visual predictions of self-generated movements, which were compared to actual visual perceptions in order to compute a prediction error. The system generated higher prediction errors when an external subject was performing actions in front of the robot, compared to situations in which the robot was observing only itself doing the same arm movements. In the second, the authors showed how predictions can be used to attenuate self-generated movements, and thus create enhanced visual perceptions, where the sight of objects originally occluded by the robot body was still maintained. This suggests that similar processes may also shed light on the understanding of the sense of object permanence and of short term memory systems in humans.
In [28], Schillaci et al. presented a biologically inspired model for learning multimodal body representations in artificial agents in the context of learning and predicting robot ego-noise, i.e. the auditory noise produced by the robot’s motors while it moves. The authors performed an ego-noise attenuation experiment, which showed how coherent and incoherent proprioceptive and motor information, passed as inputs to the predictive process implemented by a forward model, affect ego-noise suppression performance. In line with the aforementioned studies, the experiments showed that ego-noise attenuation was more pronounced when the robot was the owner of the action. Sensory attenuation was less pronounced when the robot (not moving itself) was listening to a simulated moving robot, as the incongruence of the proprioceptive and motor information with the perceived ego-noise generated larger prediction errors.
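The comparator logic shared by these studies can be illustrated with a minimal sketch. It assumes a hypothetical forward model in which ego-noise amplitude is proportional to the magnitude of the motor command; the `gain` value and all signal values are illustrative assumptions, and the models in [28, 34] are learned networks rather than a fixed linear rule.

```python
def forward_model(motor_commands, gain=0.5):
    """Hypothetical forward model: predicted ego-noise amplitude per command."""
    return [gain * abs(m) for m in motor_commands]


def attenuate(sensed, predicted):
    """Subtract the predicted self-generated component from the sensed signal."""
    return [s - p for s, p in zip(sensed, predicted)]


def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)


sensed_noise = [0.11, 0.39, 0.21, 0.50]  # the same sound in both conditions

# Condition 1: the robot itself moves, so motor commands match the sound.
own_commands = [0.2, 0.8, 0.4, 1.0]
residual_self = attenuate(sensed_noise, forward_model(own_commands))

# Condition 2: the robot is still and only hears another (simulated) robot.
idle_commands = [0.0, 0.0, 0.0, 0.0]
residual_other = attenuate(sensed_noise, forward_model(idle_commands))

print(mean_abs(residual_self), mean_abs(residual_other))
```

When motor information is congruent with the sound, most of it is predicted and cancelled, leaving a small residual. When the robot is not the owner of the action, the prediction is incongruent, the residual (prediction error) stays large, and attenuation is weaker.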
Other robotics studies can be found in the literature implementing top-down processes for interpreting bottom-up sensory streams. An example is the interesting work of Jun Tani [35], who implemented an incremental learning system based on recurrent neural networks and self-organising networks that evolves by showing steady and unsteady phases. The author explains these fluctuations as a result of the interaction between top-down and bottom-up processes, and makes a parallel between them and phenomenological observations.
Current limitations in robotics research
For many years, sense of agency has been measured using explicit self-reported judgements (e.g. [36]). However, the self-report approach has limitations, as it is sensitive to different biases [37]. Moreover, this is not a feasible approach in current robotics technologies if we look at it from a phenomenological perspective, as the only level of judgement available in robots comes at the point when the experimenter observes and interprets the internal states of the machine.
Tani [35], for instance, compares the dynamical structure of the system created in his robotic study to the structure of the “self”, and observes that “the ‘self’ is made aware when the unsteady phase appears in the course of the time-development of the system”. Although we recognise the quality of the proposed model, we believe that such statements on subjective mind and self in robots, as reported in the paper, are prone to criticisms about their objective validity.
In the work of Michel et al. on learning time delays in action-perception loops in humanoid robots [33], the learning mechanism was simply looking for regularities in the delays between the motor activations and the detections of moving objects on the screen. The authors performed a basic processing of the visual input, which was not at any stage mapped to other sensory modalities (e.g. to proprioceptive modalities). The authors pointed out the need for a more robust method of visual recognition of the robot body. However, we find the lack of multimodal integration more critical: in that study, integration across modalities was limited to the analysis of the timestamps of intermodal events. In fact, as discussed in section II, the development of multimodal body representations is a necessary condition for the emergence of a minimal self. Further work on intentional binding effects in the investigation of the sense of body ownership and of agency in robots should thus also consider this aspect.
In general, there is a need for further investigation of the intentional binding of action effects, especially in robotics. In fact, there are studies [38] showing that motor prediction does not seem to modulate this effect, casting doubt on the assumption that intentional binding of action effects is linked to an internal forward predictive process. These studies suggest that just the temporal control of a stimulus, by means of a voluntary action, might be sufficient to trigger the binding effect.
Regarding the sensory attenuation measure, most of the robotics studies addressing this topic (e.g. [28, 34]) adopt comparator computation models for sensory prediction and attenuation. Nonetheless, more recent probabilistic computational proposals—such as the predictive coding framework described in the next section—would represent a more biologically plausible approach.
III-C Rubber Hand Illusion (RHI)
How does the RHI relate to body ownership?
The rubber hand illusion (RHI) [39], along with other body ownership illusions, is a widely used paradigm for investigating the mechanisms underlying the (sensorimotor) minimal self. In the rubber hand illusion, an observer sees a dummy hand receiving touch (e.g. brush strokes) while at the same time receiving tactile stimulation on their real hidden hand at the same location. This usually elicits an illusory experience of sensing the stimulation on the dummy hand [39, 40], and as a result, incorporating the fake hand as a part of the observer’s own body [41, 42]. This effect was found to be reflected behaviourally as a fear response when the fake hand is being threatened [43, 44] or in perceiving the location of the real hand as closer to where the dummy hand is located (a “proprioceptive drift”) [39, 41], suggesting that the dummy hand is treated by the brain as part of one’s own body as a result of the multisensory stimulation [42]. These illusory effects are thought to be an indicator for the presence of multimodal, integrated body representations. RHI studies show that both top-down and bottom-up processes influence the embodiment of the dummy hand. On the top-down side, expectations about the appearance of the human hand are thought to result from internal body representations: stronger illusory effects are experienced when the dummy hand closely resembles a human hand. On the bottom-up side, the illusory effects depend on the spatiotemporal congruence of the stimulation and on the proximity of the dummy hand to the real hand [41].
A predictive coding account of the RHI
Different explanations for the RHI effects have been proposed in the cognitive neuroscience literature. The predictive coding account [21], which has recently received strong support, proposes that the illusion arises because, to reduce uncertainty, the brain makes inferences about the causes of sensory events in a probabilistic (Bayesian) manner: prior beliefs (biases) represented in internal models generate top-down predictions about sensory input. When predictions contradict the actual sensory input, “prediction errors” are generated that propagate bottom-up through the hierarchy, to unimodal, multimodal, and representational areas. The contradiction results in “surprise” that needs to be “explained away” by updating the model, thus reducing the prediction error (see [45] for a review of the minimal self within the predictive coding framework).
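One common way to make this account concrete, under simplifying Gaussian assumptions that are not part of the original text, is to write the quantity being minimized (up to additive constants) as a sum of precision-weighted squared prediction errors. Here $s$ is a sensory input, $\mu$ the current belief about its hidden cause, $g(\mu)$ the predicted input, $\mu_p$ the prior expectation, and $\pi_s$, $\pi_p$ the sensory and prior precisions; all symbols are introduced here for illustration only.

```latex
\begin{align}
\varepsilon_s &= s - g(\mu), \qquad \varepsilon_p = \mu - \mu_p, \\
F(\mu) &= \tfrac{1}{2}\,\pi_s\,\varepsilon_s^{2} + \tfrac{1}{2}\,\pi_p\,\varepsilon_p^{2}, \\
\dot{\mu} &= -\frac{\partial F}{\partial \mu}
           = \pi_s\,\varepsilon_s\,g'(\mu) - \pi_p\,\varepsilon_p .
\end{align}
```

Under these assumptions, the belief moves toward the sensory evidence in proportion to the sensory precision and is pulled back toward the prior in proportion to the prior precision, which is one reading of "explaining away" the prediction error.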
During the RHI, the co-occurrence of the visual input from observing touch on the dummy hand and the tactile input from the stimulation of the real hand evokes, according to the predictive coding account, a prediction error (or surprise), because this spatiotemporal congruence is not predicted by the initial generative forward model. According to [46], the illusion is induced when the probability of the dummy hand being “me” exceeds the probability of the real hand being “one’s own”, given the sensory input. If the prediction error can be explained away by adjusting the body model to incorporate the dummy hand, then the RHI will be induced. The explanation that the dummy hand is “mine” is equivalent to inferring that the visual and tactile inputs occur at the same location and arise from a common cause.
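The common-cause reading of the illusion can be sketched as a small Bayesian comparison between two hypotheses: the seen and felt touches share one cause, or they have separate causes. The Python sketch below is a simplified illustration and not the model of [46]; the Gaussian likelihoods, the prior, the noise parameters, and the function names are all assumptions chosen for the example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def p_common_cause(discrepancy, prior_common=0.5, std_common=2.0, std_separate=20.0):
    """Posterior probability that seen and felt touch share a single cause.

    `discrepancy` is a hypothetical visuotactile mismatch in arbitrary units.
    A common cause predicts small mismatches (narrow Gaussian); separate
    causes allow large mismatches (broad Gaussian).
    """
    like_common = gaussian_pdf(discrepancy, 0.0, std_common)
    like_separate = gaussian_pdf(discrepancy, 0.0, std_separate)
    evidence_common = prior_common * like_common
    evidence_separate = (1.0 - prior_common) * like_separate
    return evidence_common / (evidence_common + evidence_separate)


p_congruent = p_common_cause(0.0)     # congruent stimulation
p_incongruent = p_common_cause(10.0)  # strongly incongruent stimulation
```

With these illustrative parameters, congruent stimulation favours the common-cause hypothesis and incongruent stimulation favours separate causes, mirroring the dependence of the illusion on spatiotemporal congruence.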
RHI and sense of agency: ”passive” versus ”active” RHI
In the classic RHI paradigm, participants are not allowed to move their hand. It has been observed that if they do move their hand and do not observe congruent movement of the dummy hand, the illusion is immediately abolished. The proposed explanation is that, once the participant moves their hand and thereby “tests” ownership of the stationary dummy hand, the prediction error can no longer be resolved in favour of perceiving the dummy hand as one’s own [45]. This classic RHI paradigm is therefore called “passive”: while it induces the illusion of body ownership, its effect on the sense of agency cannot be directly examined.
In the “active” RHI, in addition to the concurrent multisensory stimulation, the participant moves their real hand while observing a dummy hand that moves along with it [47, 48]. In this version of the paradigm (which can also be induced in a virtual environment), the illusion of embodying a dummy hand (or object) is induced by the congruence between the participant’s motor actions and the sensory outcomes of those actions, namely the perceived movement of the dummy hand, rather than by multimodal sensory integration alone. Notably, in this case the visual properties of the dummy hand or object do not necessarily have to resemble those of a real human hand [47, 49]. In the “active” RHI, one can directly manipulate the sense of agency, or possibly even dissociate agency from body ownership [48]. In line with this, there is evidence that body ownership or embodiment of an object, even an anatomically implausible one, can be induced given systematic synchrony between the visual input from observing the object and one’s own movement. In [50], Ma and Hommel induced, in a virtual setting, body ownership of virtual 2-D shapes (a virtual balloon changing in size, and a virtual square changing in size and color) when the changes in the shapes were systematically congruent with participants’ actions. In addition to the concurrent multisensory stimulation, the induced illusion of body ownership is thought to be fostered by the congruence between the predicted sensory outcomes of motor actions and the actual sensory input, pointing to the role of agency in body ownership. This is reminiscent of the way in which the sense of body ownership emerges during ontogenetic development.
In another study in a virtual setting [49], Ma and Hommel manipulated the similarity of the object-to-be-embodied (end-effector) to the real hand, the synchrony between stimulation or movement of the end-effector and stimulation or movement of the participant’s real hand, and the degree of agency, operationalized as the level of control over the end-effector. They found that agency, but not similarity, strongly affected synchrony-induced body ownership. However, both similarity and agency induced a bias towards body ownership of the end-effector. This shows that agency contributes to body ownership.
Robotics research
When humans are subjected to the RHI, they show a perceptual drift in the location of the real hand toward the dummy hand, which suggests an update of the body representation. Using a multisensory robotic arm, Hinz et al. [51] replicated these drift patterns in both human and robot experiments with the classic (“passive”) RHI paradigm. The learning and estimation algorithm [52] used in the study was based on the predictive coding framework [21]. Specifically, Lanillos and Cheng [52] developed a method for integrating different sources of information (tactile, visual, and proprioceptive) that drives the robot’s priors to infer its body configuration. This computational perceptual model enables a multisensory robot to learn, make inferences, and update its body configuration from its sensors. They modeled the robot’s body estimation as a process of minimizing the prediction error between the body configuration “belief” (prediction) and the observed posterior, and of minimizing the variational free energy [21] by using the sensory prediction error. Using the algorithm in [52], Hinz et al. [51] showed that body configuration estimation can be achieved through minimization of prediction error, as a single process that involves both predictive coding and causal inference. The results from the human and robot experiments suggest that the perceived locations of both the real and the dummy hand drift to a common location between them. In the human data, illusion scores (self-report) were not correlated with the proprioceptive drift, suggesting that the drift and the body-ownership illusion are related but distinct processes [53].
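To give a feel for how prediction error minimization can produce a drift, the following Python sketch estimates a one-dimensional hand position from a proprioceptive reading (real hand) and a visual reading (dummy hand) by gradient descent on precision-weighted squared errors. This is a deliberately minimal toy and not the algorithm of [52]; the positions, precisions, learning rate, and function name are hypothetical values chosen for illustration.

```python
def estimate_hand_position(proprio_obs, visual_obs,
                           proprio_precision, visual_precision,
                           learning_rate=0.05, steps=500):
    """Iteratively reduce precision-weighted prediction errors for one belief.

    The belief `mu` starts at the proprioceptive reading and is nudged by both
    error terms until they balance (the precision-weighted mean).
    """
    mu = proprio_obs
    for _ in range(steps):
        proprio_error = proprio_obs - mu
        visual_error = visual_obs - mu
        mu += learning_rate * (proprio_precision * proprio_error
                               + visual_precision * visual_error)
    return mu


real_hand_cm = 0.0    # hypothetical position of the hidden real hand
dummy_hand_cm = 15.0  # hypothetical position of the visible dummy hand

estimate_cm = estimate_hand_position(real_hand_cm, dummy_hand_cm,
                                     proprio_precision=1.0, visual_precision=0.5)
drift_cm = estimate_cm - real_hand_cm
```

In this sketch the estimated position settles between the two hands, shifted toward the dummy hand by an amount that grows with the relative precision assigned to vision.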
Current limitations in robotics research
Many studies in the literature use robotic or virtual hands in active RHI experiments, that is, in scenarios where participants move their real hand while observing a dummy robotic or virtual hand that moves along with it. However, research on the RHI as experienced by artificial systems is very scarce. To the best of our knowledge, the work of Hinz et al. [51] is the only study that replicates the rubber hand illusion in a robot.
Another concern relates to the “proprioceptive drift” as the classic “objective measure” of the RHI, as used in the experiment mentioned above [51]. Both the human data from that study and previous work in humans [53, 54] failed to find a correlation between the proprioceptive drift and participants’ self-reports, casting doubt on the validity of the proprioceptive drift as an objective measure of body ownership. Further investigation is therefore warranted. In addition, now that the classic, “passive” RHI has been reproduced in a robot using free energy minimization [51], the algorithm could be examined in an “active” RHI setup, which would allow the sense of agency in a robot to be examined separately from body ownership.
IV Conclusions
This manuscript presented an interdisciplinary overview of developmental indices and behavioural measures of the minimal self. The fundamental role of the development of body representation in the emergence of body ownership and sense of agency has been discussed. This work also addressed the task of experimentally quantifying the attribution of subjective experience, and surveyed a number of behavioural paradigms and measures indicating the presence of different aspects of the minimal self, namely self-touch, intentional binding and sensory attenuation, and the rubber hand illusion.
Self-touch is likely to contribute to the formation of initial sensorimotor representations, and may therefore constitute one of the very first cues for subjective experience during early developmental stages. Moreover, the way in which our brain interprets action effects has been shown to differ depending on whether the sensory perception is self-produced or externally triggered, with respect both to the perceived timing of their occurrence (intentional binding) and to their intensity (sensory attenuation). Finally, we addressed the rubber hand illusion as it has been extensively used as a paradigm for investigating the mechanisms underlying the sensorimotor minimal self.
We reviewed the most prominent studies addressing these paradigms and measures from the literature in neuroscience, cognitive and developmental sciences. For each of these topics, we presented related robotics studies. Equipping robots with self-awareness and studying the possibility of subjective experience in artificial systems is, in fact, of high interest for the cognitive and developmental robotics communities. This manuscript contributed to this quest by identifying current knowledge gaps and limitations in robotics. In the next section, we conclude this work by highlighting the most critical gaps and by suggesting further research directions.
Further research directions in robotics
The development of multimodal body representations has been discussed as fundamental to the emergence of self-awareness. Further robotics research should address the implementation of multimodal integration through online developmental processes. Current research on self-touch and on the self-organisation of somatosensory maps in robots does not explicitly consider the active role that the robot should have in generating sensorimotor experience. We therefore encourage further experimentation that considers self-exploration behaviours in such a developmental process.
Recent proposals based on predictive processes are promising lines of research, and not only because of their higher biological plausibility. Prediction error minimization could give rise to intelligent exploration behaviours in robots, where the intrinsic motivation to reduce uncertainty would generate artificial curiosity and goal-directed behaviours, both of which are prerequisites for motor and cognitive development.
Intentional binding effects and sensory attenuation processes are recognised by the neuroscience and cognitive science communities as important measures for the definition of self-boundaries. Current studies, however, mostly focus on explicit judgements from human participants. This self-report approach is clearly not feasible in robotics. Nonetheless, robots allow experimenters to inspect their internal states, the flow of sensorimotor data, and the predictive processes implemented by their computational models. These data are, for obvious reasons, not accessible in humans. Robots are therefore promising tools for the investigation of intentional binding effects and sensory attenuation processes. Besides encouraging further investigations in robotics, we also suggest more experiments that consider the effects of the developmental path on the performance of such measures. In particular, how do developmental stages (for instance, the level of multimodal integration reached after a certain stage of sensorimotor exploration) affect predictive performance, and consequently sensory attenuation and intentional binding effects? Can this be linked to stages in the early development of the minimal self in humans?
The RHI is a well-established paradigm for measuring subjective experience in humans, and it therefore makes sense to extend it to the investigation of subjective experience in robots. Further use of this measure in the study of the artificial self is thus encouraged. In particular, we suggest testing the “active” RHI paradigm in robots in order to also investigate the sense of agency. Moreover, developmental effects similar to those mentioned above could be studied for the RHI paradigm: further studies could address whether and how the perceptual drift measure is affected by the developmental stage of the agent.
References
- [1] S. Gallagher, “Philosophical conceptions of the self: implications for cognitive science,” Trends in cognitive sciences, vol. 4, no. 1, pp. 14–21, 2000.
- [2] B. Martin, M. Wittmann, N. Franck, M. Cermolacce, F. Berna, and A. Giersch, “Temporal structure of consciousness and minimal self in schizophrenia,” Frontiers in psychology, vol. 5, p. 1175, 2014.
- [3] F. Ferri, F. Frassinetti, M. Costantini, and V. Gallese, “Motor simulation and the bodily self,” PLoS ONE, vol. 6, no. 3, p. e17927, 2011.
- [4] A. Damasio, “Mental self: The person within,” Nature, vol. 423, no. 6937, p. 227, 2003.
- [5] A. K. Engel and W. Singer, “Temporal binding and the neural correlates of sensory awareness,” Trends in cognitive sciences, vol. 5, no. 1, pp. 16–25, 2001.
- [6] F. Crick and C. Koch, “Towards a neurobiological theory of consciousness,” in Seminars in the Neurosciences, vol. 2. Saunders Scientific Publications, 1990, pp. 263–275.
- [7] P. Rochat, “Five levels of self-awareness as they unfold early in life,” Consciousness and cognition, vol. 12, no. 4, pp. 717–731, 2003.
- [8] F. de Vignemont, “Bodily awareness,” in The Stanford Encyclopedia of Philosophy, spring 2018 ed., E. N. Zalta, Ed. Metaphysics Research Lab, Stanford University, 2018.
