Seeing Through Green: Text-Based Classification and the Firm's Returns from Green Patents
Lapo Santarlasci, Armando Rungi, Antonio Zinilli

TL;DR
This paper employs NLP to accurately identify genuine green patents, revealing their economic impact and heterogeneity, thereby improving patent classification for policymaking and firm strategy.
Contribution
It introduces a neural network-based NLP method to distinguish true green patents from those merely classified as green in previous literature, improving the accuracy of green patent identification.
Findings
Only about 20% of patents classified as green in previous literature are truly green.
True green patents are about 1% less cited by subsequent inventions.
Holding true green patents correlates with higher firm sales, market share, and productivity.
Abstract
This paper introduces Natural Language Processing for identifying "true" green patents from official supporting documents. We start our training on about 12.4 million patents that had been classified as green from previous literature. Thus, we train a simple neural network to enlarge a baseline dictionary through vector representations of expressions related to environmental technologies. After testing, we find that "true" green patents represent about 20% of the total of patents classified as green from previous literature. We show heterogeneity by technological classes, and then check that "true" green patents are about 1% less cited by following inventions. In the second part of the paper, we test the relationship between patenting and a dashboard of firm-level financial accounts in the European Union. After controlling for reverse causality, we show that holding at least one…
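The dictionary-enlargement step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the vectors below are hand-made toy values, and the function names (`cosine`, `expand_dictionary`), the threshold, and the example terms are all hypothetical. In a real pipeline, the vectors would presumably come from an embedding model trained on patent text.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def expand_dictionary(seed_terms, vectors, threshold):
    """Add every term whose vector is close to at least one seed term.

    seed_terms: baseline dictionary (set of expressions)
    vectors:    mapping from expression to its vector representation
    threshold:  minimum cosine similarity required to admit a new term
    """
    known_seeds = [s for s in seed_terms if s in vectors]
    expanded = set(seed_terms)
    for term, vec in vectors.items():
        if term in expanded:
            continue
        if any(cosine(vec, vectors[s]) >= threshold for s in known_seeds):
            expanded.add(term)
    return expanded

# Toy vectors (assumed for illustration only, not learned from data).
TOY_VECTORS = {
    "solar panel": (0.9, 0.1, 0.0),
    "photovoltaic cell": (0.85, 0.15, 0.05),
    "wind turbine": (0.8, 0.3, 0.1),
    "combustion engine": (0.1, 0.9, 0.2),
    "data compression": (0.0, 0.1, 0.95),
}

expanded = expand_dictionary({"solar panel"}, TOY_VECTORS, threshold=0.95)
print(sorted(expanded))
```

With these toy values, the single seed expression pulls in the two expressions whose vectors point in nearly the same direction, while unrelated expressions stay out. The threshold controls how aggressively the dictionary grows, so a lower value would admit more terms at the cost of more false positives.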
