FinTree: Financial Dataset Pretrain Transformer Encoder for Relation Extraction
Hyunjong Ok

TL;DR
FinTree is a transformer encoder further pretrained on financial data. It improves relation extraction accuracy by predicting a masked token instead of using the [CLS] token, and by using an input pattern that gives the model contextual and positional information about the two entities.
Contribution
The paper introduces FinTree, a pretraining structure for financial relation extraction that predicts the relation at a masked token rather than from the conventional [CLS] token, which improves relation prediction accuracy.
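To make the masked-token idea concrete, the following is a minimal sketch of how a cloze-style input could be assembled. The summary does not give the exact pattern FinTree uses, so the entity markers (`<e1>`, `<e2>`), the template wording, and the function name `build_masked_input` are all assumptions made for illustration only.

```python
from typing import List, Tuple

MASK = "[MASK]"  # mask token used by BERT-style encoders


def build_masked_input(
    tokens: List[str],
    head: Tuple[int, int],
    tail: Tuple[int, int],
) -> str:
    """Wrap both entity spans in hypothetical markers and append a cloze
    pattern whose single [MASK] slot stands in for the relation label."""
    (hs, he), (ts, te) = head, tail  # half-open [start, end) token spans
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if i == hs:
            out.append("<e1>")   # assumed head-entity opening marker
        if i == ts:
            out.append("<e2>")   # assumed tail-entity opening marker
        out.append(tok)
        if i == he - 1:
            out.append("</e1>")
        if i == te - 1:
            out.append("</e2>")
    head_text = " ".join(tokens[hs:he])
    tail_text = " ".join(tokens[ts:te])
    # Assumed template; the encoder would be trained to fill the [MASK] slot.
    pattern = f"The relation between {head_text} and {tail_text} is {MASK} ."
    return " ".join(out) + " " + pattern


example = build_masked_input(
    ["Acme", "Corp", "appointed", "Jane", "Doe", "as", "CEO", "."],
    head=(3, 5),
    tail=(0, 2),
)
print(example)
```

In a setup like this, the markers carry the positional information about the entities, and the relation is read from the model's output at the `[MASK]` position instead of from a classifier on `[CLS]`.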
Findings
FinTree outperforms existing models on the REFinD dataset.
The masked token prediction structure improves relation extraction accuracy.
The approach effectively incorporates contextual and positional information.
Abstract
We present FinTree, a Financial Dataset Pretrain Transformer Encoder for Relation Extraction. Starting from an encoder language model, we further pretrain FinTree on a financial dataset to adapt the model to financial-domain tasks. FinTree stands out with its novel structure that predicts a masked token instead of using the conventional [CLS] token, inspired by the Pattern Exploiting Training methodology. This structure allows for more accurate relation predictions between two given entities. The model is trained with a unique input pattern that provides contextual and positional information about the entities of interest, and a post-processing step ensures that predictions are consistent with the entity types. Our experiments demonstrate that FinTree outperforms existing models on REFinD, a large-scale financial relation extraction dataset. The code and pretrained models are available at…
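The post-processing step mentioned in the abstract can be pictured as a filter that keeps only relations compatible with the two entity types. The sketch below is one plausible reading, not the authors' implementation: the `ALLOWED` table, the relation names, and the `constrain_prediction` helper are hypothetical, and the real label set would come from the dataset.

```python
from typing import Dict, Set, Tuple

# Hypothetical table mapping (head type, tail type) to permitted relations.
ALLOWED: Dict[Tuple[str, str], Set[str]] = {
    ("PERSON", "ORG"): {"employee_of", "founder_of", "no_relation"},
    ("ORG", "DATE"): {"formed_on", "acquired_on", "no_relation"},
}


def constrain_prediction(
    scores: Dict[str, float],
    head_type: str,
    tail_type: str,
    fallback: str = "no_relation",
) -> str:
    """Return the highest-scoring relation that is valid for the given
    entity-type pair, or the fallback label if none qualifies."""
    allowed = ALLOWED.get((head_type, tail_type), {fallback})
    candidates = {rel: s for rel, s in scores.items() if rel in allowed}
    if not candidates:
        return fallback
    return max(candidates, key=candidates.get)


# The raw top score is "formed_on", which is invalid for PERSON -> ORG.
scores = {"formed_on": 0.6, "employee_of": 0.3, "no_relation": 0.1}
print(constrain_prediction(scores, "PERSON", "ORG"))
```

Here the highest raw score is discarded because it is not valid for the entity-type pair, and the best remaining valid relation is returned.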
Taxonomy
Topics: Stock Market Forecasting Methods
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Dropout · Position-Wise Feed-Forward Layer · Adam · Label Smoothing · Byte Pair Encoding · Residual Connection · Linear Layer
