On Design Choices in Similarity-Preserving Sparse Randomized Embeddings
Denis Kleyko, Dmitri A. Rachkovskij

TL;DR
This paper investigates how different design choices in the FlyHash algorithm affect the quality of similarity-preserving embeddings used for tasks like pattern recognition and search.
Contribution
It systematically analyzes the impact of preprocessing, activation functions, and projection matrix formation on FlyHash's performance.
Findings
Optimal design choices significantly improve search accuracy.
Certain preprocessing and activation functions enhance embedding quality.
Performance varies drastically with different parameter configurations.
Abstract
Expand & Sparsify is a principle observed in anatomically similar neural circuits found in the mushroom body (insects) and the cerebellum (mammals). Sensory data are randomly projected to a much higher-dimensional space (the expand part), where only a few of the most strongly excited neurons are activated (the sparsify part). This principle has been leveraged to design the FlyHash algorithm, which forms similarity-preserving sparse embeddings that have been found useful for tasks such as novelty detection, pattern recognition, and similarity search. Despite its simplicity, FlyHash has a number of design choices to be set, such as the preprocessing of the input data, the choice of sparsifying activation function, and the formation of the random projection matrix. In this paper, we explore the effect of these choices on the performance of similarity search with FlyHash embeddings. We find that the right…
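To make the expand and sparsify steps concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The mean-centering preprocessing, the sparse binary projection (each output unit sums a few randomly chosen inputs), and the top-k activation are one possible setting of the three design choices named in the abstract, picked here as assumptions. The names make_projection, flyhash, overlap, fan_in, and k are hypothetical and used only for illustration.

```python
import random


def make_projection(n_in, n_out, fan_in, seed=0):
    """Assumed design choice: sparse binary projection.

    Each of the n_out output units is connected to `fan_in` distinct,
    randomly chosen input indices.
    """
    rng = random.Random(seed)
    return [rng.sample(range(n_in), fan_in) for _ in range(n_out)]


def flyhash(x, projection, k):
    """Expand x with the projection, then keep only the k strongest units."""
    # Assumed preprocessing: subtract the mean of the input vector.
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]

    # Expand: each output unit sums the inputs it is connected to.
    activations = [sum(centered[i] for i in idx) for idx in projection]

    # Sparsify: top-k winner-take-all, producing a binary code.
    order = sorted(range(len(activations)),
                   key=lambda j: activations[j], reverse=True)
    code = [0] * len(activations)
    for j in order[:k]:
        code[j] = 1
    return code


def overlap(a, b):
    """Number of active units shared by two binary codes."""
    return sum(u & v for u, v in zip(a, b))


# Toy usage: a slightly perturbed copy of x should share far more
# active units with x than an unrelated random vector does.
rng = random.Random(1)
proj = make_projection(n_in=50, n_out=1000, fan_in=5)
x = [rng.random() for _ in range(50)]
x_near = [v + 0.01 * rng.random() for v in x]
x_far = [rng.random() for _ in range(50)]

code_x = flyhash(x, proj, k=32)
code_near = flyhash(x_near, proj, k=32)
code_far = flyhash(x_far, proj, k=32)
print(overlap(code_x, code_near), overlap(code_x, code_far))
```

Swapping the centering step, the way index sets are drawn in make_projection, or the top-k rule in flyhash corresponds to varying the preprocessing, projection matrix, and activation function, respectively.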
Taxonomy
Methods: Sparse Evolutionary Training
