PAPERCLIP: Associating Astronomical Observations and Natural Language with Multi-Modal Models

Siddharth Mishra-Sharma, Yiding Song, and Jesse Thaler

arXiv:2403.08851 · astro-ph.IM · March 15, 2024 · 1 citation

TL;DR

PAPERCLIP introduces a neural network model that links astronomical images with natural language descriptions, enabling effective image and description retrieval to enhance interaction with astronomical data.

Contribution

The paper presents a novel approach to fine-tuning CLIP models on successful astronomical observing proposals and their corresponding observations, optionally using LLM-generated summaries of the proposal abstracts to improve the multimodal association.
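As a rough illustration of what contrastive fine-tuning optimizes, the sketch below implements the standard symmetric CLIP-style contrastive loss in NumPy. It is an assumption that the paper uses this exact objective; the function name `clip_contrastive_loss`, the helper `log_softmax`, and the fixed `temperature` value are illustrative choices, not taken from the paper or its repository.

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_emb, text_emb: arrays of shape (batch, dim); row i of each
    is assumed to come from the same observation/abstract pair.
    temperature: illustrative fixed value (CLIP learns this scale).
    """
    # Normalize so that dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix; matching pairs lie on the diagonal.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

The loss is small when each image embedding is most similar to its own text embedding and large when the pairs are mismatched, which is the behavior fine-tuning pushes toward.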

Findings

01 The fine-tuned model achieves meaningful image retrieval and description retrieval results.

02 The results demonstrate the potential of generalist foundation models in astronomy.

03 The method is validated on Hubble Space Telescope (HST) observations.

Abstract

We present PAPERCLIP (Proposal Abstracts Provide an Effective Representation for Contrastive Language-Image Pre-training), a method which associates astronomical observations imaged by telescopes with natural language using a neural network model. The model is fine-tuned from a pre-trained Contrastive Language-Image Pre-training (CLIP) model using successful observing proposal abstracts and corresponding downstream observations, with the abstracts optionally summarized via guided generation using large language models (LLMs). Using observations from the Hubble Space Telescope (HST) as an example, we show that the fine-tuned model embodies a meaningful joint representation between observations and natural language through tests targeting image retrieval (i.e., finding the most relevant observations using natural language queries) and description retrieval (i.e., querying for…
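The image retrieval test described in the abstract can be pictured as a nearest-neighbor search in the joint embedding space. The following is a minimal sketch, assuming the text query and the observations have already been embedded by the fine-tuned model; `retrieve_top_k` and its arguments are hypothetical names for illustration, not part of the paper's code.

```python
import numpy as np


def retrieve_top_k(query_emb, image_embs, k=3):
    """Rank observations by cosine similarity to a text query.

    query_emb: array of shape (dim,), embedding of the text query.
    image_embs: array of shape (n_images, dim), observation embeddings.
    Returns the indices of the k best matches and their scores.
    """
    # Normalize so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)

    scores = imgs @ q
    order = np.argsort(-scores)[:k]  # highest similarity first
    return order, scores[order]
```

Description retrieval works the same way with the roles swapped: an observation embedding serves as the query, and candidate text embeddings are ranked against it.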

Code & Models

Repositories

smsharma/paperclip-hubble (JAX, official)

Taxonomy

Topics: Environmental Monitoring and Data Management