# Is Japanese CCGBank empirically correct? A case study of passive and causative constructions

**Authors:** Daisuke Bekki, Hitomi Yanaka

arXiv: 2302.14708 · 2023-03-01

## TL;DR

This paper examines the linguistic validity of the Japanese CCGBank and shows that, combined with the compositional semantics of the ccg2lambda semantic parsing system, its analysis yields empirically wrong predictions for nested passive and causative constructions.

## Contribution

A case study of passive/causative constructions in the Japanese CCGBank, showing where the semantics derived from its annotations diverges from the empirical facts of Japanese.

## Key findings

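- Combined with the compositional semantics of ccg2lambda, the Japanese CCGBank yields empirically wrong predictions for nested passives and causatives
- The CCGBank is automatically generated from the Kyoto Corpus, a dependency treebank, so its linguistic validity has not yet been sufficiently verified
- The case study highlights the need for closer linguistic validation of automatically generated treebanks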
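
To make the first finding concrete, here is a minimal Python sketch of what checking a compositional prediction for a nested construction can look like. Every lexical entry in it is a hypothetical toy written for illustration; none of them is taken from the Japanese CCGBank, from ccg2lambda, or from the paper. The example sentence is the causative-passive "Jiro-ga Taro-ni hashir-ase-rare-ta" ("Jiro was made to run by Taro"), whose intended reading is that Taro causes Jiro to run.

```python
# Toy illustration only: all lexical entries below are hypothetical and are
# NOT the entries used by the Japanese CCGBank or ccg2lambda.

def run(agent):
    """Verb stem 'hashir-' (run): a one-place predicate."""
    return ("run", agent)

def causative(verb):
    """Hypothetical '-ase-': wraps a one-place verb and adds a causer.

    Returns a curried two-place predicate: causee -> causer -> formula.
    """
    return lambda causee: lambda causer: ("cause", causer, verb(causee))

def passive(two_place):
    """Hypothetical direct passive '-rare-' over a curried two-place predicate.

    The ga-marked subject fills the inner argument slot and the ni-marked
    phrase fills the outer one.
    """
    return lambda ni_phrase: lambda ga_subject: two_place(ga_subject)(ni_phrase)

def nested_reading(ga, ni):
    """Compose stem, causative, and passive in that order, then apply arguments."""
    return passive(causative(run))(ni)(ga)

def flat_reading(ga, ni):
    """Hypothetical flat analysis that ignores the nesting of the suffixes and
    reads roles off the case markers as in a plain causative."""
    return ("cause", ga, ("run", ni))

# Intended reading of 'Jiro-ga Taro-ni hashir-ase-rare-ta'.
expected = ("cause", "taro", ("run", "jiro"))

nested_ok = nested_reading(ga="jiro", ni="taro") == expected
flat_ok = flat_reading(ga="jiro", ni="taro") == expected
print("nested analysis matches intended reading:", nested_ok)
print("flat analysis matches intended reading:", flat_ok)
```

In this toy setup the nested composition recovers the intended reading, while the flat analysis swaps causer and causee. The sketch only shows the shape of such a check: derive a formula from the lexical entries, then compare it with the reading speakers actually get.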

## Abstract

The Japanese CCGBank serves as training and evaluation data for developing Japanese CCG parsers. However, since it is automatically generated from the Kyoto Corpus, a dependency treebank, its linguistic validity still needs to be sufficiently verified. In this paper, we focus on the analysis of passive/causative constructions in the Japanese CCGBank and show that, together with the compositional semantics of ccg2lambda, a semantic parsing system, it yields empirically wrong predictions for the nested construction of passives and causatives.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14708/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14708/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/2302.14708/full.md

---
Source: https://tomesphere.com/paper/2302.14708