An Analysis of Embedding Layers and Similarity Scores using Siamese Neural Networks
Yash Bingi, Yiqiao Yin

TL;DR
This paper compares embedding algorithms from leading LLMs using medical data, analyzing similarity scores, implementing Siamese Neural Networks for enhancement, and evaluating their performance and carbon footprint.
Contribution
It provides a comparative analysis of embedding algorithms from top industry models and introduces Siamese Neural Networks to improve embedding performance.
Findings
Differences in similarity scores among algorithms
Siamese Neural Networks improved embedding accuracy
Carbon footprint varies across models and training epochs
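The summary above does not describe the Siamese architecture the authors used. As a generic illustration only: a Siamese network passes both inputs through the same encoder (shared weights) and compares the two outputs with a distance. The sketch below assumes a single linear layer with tanh as the encoder, Euclidean distance, and a contrastive loss. The names `encode`, `siamese_distance`, and `contrastive_loss` and the toy weights are hypothetical and not taken from the paper.

```python
import math

def encode(x, weights):
    """Shared encoder (assumed form): one linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def siamese_distance(x1, x2, weights):
    """Encode both inputs with the SAME weights, then take Euclidean distance."""
    h1 = encode(x1, weights)
    h2 = encode(x2, weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def contrastive_loss(distance, is_similar, margin=1.0):
    """Pull similar pairs together; push dissimilar pairs at least `margin` apart."""
    if is_similar:
        return distance ** 2
    return max(0.0, margin - distance) ** 2

# Toy 2x3 weight matrix and 3-dimensional inputs (hypothetical values).
W = [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.5]]
a = [0.9, 0.1, 0.3]
b = [0.1, 0.9, 0.2]

d = siamese_distance(a, b, W)
print(d, contrastive_loss(d, is_similar=False))
```

Because the weights are shared, the distance is symmetric in its two inputs and is zero when both inputs are identical. In training, the loss would be minimized over `W` using labeled similar and dissimilar pairs.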
Abstract
Large Language Models (LLMs) are gaining increasing popularity in a variety of use cases, from language understanding and writing to assistance in application development. One of the most important aspects for optimal functionality of LLMs is embedding layers. Word embeddings are distributed representations of words in a continuous vector space. In the context of LLMs, words or tokens from the input text are transformed into high-dimensional vectors using unique algorithms specific to the model. Our research examines the embedding algorithms from leading companies in the industry, such as OpenAI, Google's PaLM, and BERT. Using medical data, we have analyzed similarity scores of each embedding layer, observing differences in performance among each algorithm. To enhance each model and provide an additional encoding layer, we also implemented Siamese Neural Networks. After observing changes…
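The abstract does not say which similarity measure was used. A minimal sketch, assuming cosine similarity (a common choice for comparing embedding vectors), is shown below. The three-dimensional vectors and their names are made-up stand-ins for real model embeddings, which typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Toy vectors standing in for embeddings of three medical terms (hypothetical values).
emb_fever = [0.9, 0.1, 0.3]
emb_pyrexia = [0.8, 0.2, 0.4]
emb_fracture = [0.1, 0.9, 0.2]

print(cosine_similarity(emb_fever, emb_pyrexia))   # related terms: closer to 1
print(cosine_similarity(emb_fever, emb_fracture))  # unrelated terms: closer to 0
```

Comparing models this way would mean embedding the same text pairs with each model and checking how well the resulting scores separate related pairs from unrelated ones.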
Taxonomy
Topics: Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Attention Dropout · Dense Connections · Weight Decay · WordPiece
