# Logic Tensor Networks for Semantic Image Interpretation

**Authors:** Ivan Donadello, Luciano Serafini, Artur d'Avila Garcez

arXiv: 1705.08968 · 2017-05-26

## TL;DR

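This paper develops and applies Logic Tensor Networks (LTNs), a Statistical Relational Learning framework that integrates neural networks with first-order fuzzy logic, to semantic image interpretation. Adding background knowledge as logical constraints improves on purely data-driven baselines, including Fast R-CNN, and makes learning more robust to label errors in the training data.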

## Contribution

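To the authors' knowledge, this is the first successful application of Statistical Relational Learning (SRL) to two Semantic Image Interpretation (SII) tasks: classifying an image's bounding boxes and detecting part-of relations between objects. LTNs supply the logical constraints that guide both learning and reasoning.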

## Key findings

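- Background knowledge expressed as logical constraints can improve the performance of purely data-driven approaches.
- Logical background knowledge adds robustness when the training labels contain errors.
- On a standard image processing benchmark, LTNs with constraints improve on Fast R-CNN, the state of the art at the time.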

## Abstract

Semantic Image Interpretation (SII) is the task of extracting structured semantic descriptions from images. It is widely agreed that the combined use of visual data and background knowledge is of great importance for SII. Recently, Statistical Relational Learning (SRL) approaches have been developed for reasoning under uncertainty and learning in the presence of data and rich knowledge. Logic Tensor Networks (LTNs) are an SRL framework which integrates neural networks with first-order fuzzy logic to allow (i) efficient learning from noisy data in the presence of logical constraints, and (ii) reasoning with logical formulas describing general properties of the data. In this paper, we develop and apply LTNs to two of the main tasks of SII, namely, the classification of an image's bounding boxes and the detection of the relevant part-of relations between objects. To the best of our knowledge, this is the first successful application of SRL to such SII tasks. The proposed approach is evaluated on a standard image processing benchmark. Experiments show that the use of background knowledge in the form of logical constraints can improve the performance of purely data-driven approaches, including the state-of-the-art Fast Region-based Convolutional Neural Networks (Fast R-CNN). Moreover, we show that the use of logical background knowledge adds robustness to the learning system when errors are present in the labels of the training data.
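To make the idea of a fuzzy logical constraint over bounding boxes concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The Łukasiewicz connectives, the mean aggregator for the universal quantifier, the box names (`b1`, `b2`), the predicate scores, and the example rule "a tail that is part of something implies that thing is a cat" are all assumptions chosen for illustration. In an LTN the scores would come from trainable networks rather than hand-written dictionaries.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def luk_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: fuzzy conjunction of two truth degrees in [0, 1]."""
    return max(0.0, a + b - 1.0)


def luk_implies(a: float, b: float) -> float:
    """Lukasiewicz residuum: fuzzy implication a -> b."""
    return min(1.0, 1.0 - a + b)


def forall(truths: List[float]) -> float:
    """Universal quantifier approximated by the mean (an assumed aggregator)."""
    return sum(truths) / len(truths)


def constraint_satisfaction(
    is_tail: Dict[str, float],
    is_cat: Dict[str, float],
    part_of: Dict[Pair, float],
) -> float:
    """Truth degree of: forall x, y. Tail(x) AND partOf(x, y) -> Cat(y)."""
    truths = []
    for (x, y), p in part_of.items():
        body = luk_and(is_tail[x], p)
        truths.append(luk_implies(body, is_cat[y]))
    return forall(truths)


# Hypothetical predicate scores for two bounding boxes.
is_tail = {"b1": 0.9, "b2": 0.1}
part_of = {("b1", "b2"): 0.85, ("b2", "b1"): 0.02}

consistent = constraint_satisfaction(is_tail, {"b1": 0.05, "b2": 0.8}, part_of)
violating = constraint_satisfaction(is_tail, {"b1": 0.05, "b2": 0.2}, part_of)
print(consistent, violating)
```

The consistent labelling satisfies the rule fully, while lowering the cat score of `b2` reduces the satisfaction degree. A learner that maximises this degree alongside the data fit is pushed toward labellings that agree with the background knowledge, which is the intuition behind the robustness to noisy labels described in the abstract.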

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.08968/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1705.08968/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1705.08968/full.md

---
Source: https://tomesphere.com/paper/1705.08968