On Design Choices in Similarity-Preserving Sparse Randomized Embeddings

Denis Kleyko; Dmitri A. Rachkovskij

arXiv:2501.14741·cs.NE·January 28, 2025

TL;DR

This paper investigates how different design choices in the FlyHash algorithm affect the quality of similarity-preserving embeddings used for tasks like pattern recognition and search.

Contribution

It systematically analyzes the impact of preprocessing, activation functions, and projection matrix formation on FlyHash's performance.

Findings

01. Optimal design choices significantly improve search accuracy.

02. Certain preprocessing and activation functions enhance embedding quality.

03. Performance varies drastically with different parameter configurations.

Abstract

Expand & Sparsify is a principle observed in anatomically similar neural circuits found in the mushroom body (insects) and the cerebellum (mammals). Sensory data are projected randomly to a much higher-dimensional space (the expand part), where only a few of the most strongly excited neurons are activated (the sparsify part). This principle has been leveraged to design the FlyHash algorithm, which forms similarity-preserving sparse embeddings that have been found useful for tasks such as novelty detection, pattern recognition, and similarity search. Despite its simplicity, FlyHash involves a number of design choices, such as the preprocessing of the input data, the choice of sparsifying activation function, and the formation of the random projection matrix. In this paper, we explore the effect of these choices on the performance of similarity search with FlyHash embeddings. We find that the right…
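
The expand and sparsify steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mean-centering preprocessing, the sparse binary projection with a fixed number of inputs per output unit, the top-k (winner-take-all) activation, and every name and parameter value (`make_projection`, `flyhash`, `overlap`, `fan_in`, `k`) are assumptions chosen for illustration, each being one possible setting of the design choices the paper studies.

```python
import random


def make_projection(n_in, n_out, fan_in, seed=0):
    """Assumed projection: each output unit sums `fan_in` distinct, randomly
    chosen input components (a sparse binary random matrix stored as index lists)."""
    rng = random.Random(seed)
    return [rng.sample(range(n_in), fan_in) for _ in range(n_out)]


def flyhash(x, projection, k):
    """Expand x through the projection, then keep only the k most excited units."""
    # Assumed preprocessing: subtract the mean of the input vector.
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    # Expand: one activation per output unit.
    activations = [sum(centered[i] for i in idx) for idx in projection]
    # Sparsify: top-k winner-take-all, ties broken by lower index.
    winners = sorted(range(len(activations)), key=lambda j: (-activations[j], j))[:k]
    code = [0] * len(activations)
    for j in winners:
        code[j] = 1
    return code


def overlap(a, b):
    """Number of active units shared by two binary codes."""
    return sum(p & q for p, q in zip(a, b))


# Illustrative usage with hypothetical sizes: 20 inputs expanded to 200 units, 10 active.
proj = make_projection(n_in=20, n_out=200, fan_in=4, seed=1)
rng = random.Random(2)
x = [rng.random() for _ in range(20)]
x_near = [v + 0.01 * rng.random() for v in x]   # small perturbation of x
x_far = [rng.random() for _ in range(20)]       # unrelated vector
print(overlap(flyhash(x, proj, 10), flyhash(x_near, proj, 10)))
print(overlap(flyhash(x, proj, 10), flyhash(x_far, proj, 10)))
```

The overlap between two codes serves here as a simple similarity estimate: nearby inputs would be expected to share more active units than unrelated ones, though the exact counts depend on the seeds and parameters chosen.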

Taxonomy

Methods: Sparse Evolutionary Training