Sensing for Space Safety and Sustainability: A Deep Learning Approach with Vision Transformers
Wenxuan Zhang, Peng Hu

TL;DR
This paper introduces two novel deep learning models, GELAN-ViT and GELAN-RepViT, that incorporate vision transformers for satellite object detection, achieving high accuracy with reduced computational costs to enhance space safety and sustainability.
Contribution
The paper proposes innovative ViT-based models, GELAN-ViT and GELAN-RepViT, improving satellite object detection performance and efficiency over existing methods.
Findings
Both models achieve around 95% mAP50 on the SOD dataset.
Computational cost is reduced by over 5.0 GFLOPs compared with state-of-the-art models.
Both models outperform YOLOv9-t in accuracy and efficiency.
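For readers unfamiliar with the metric, mAP50 conventionally denotes mean average precision with an intersection-over-union (IoU) threshold of 0.5 for matching a predicted box to a ground-truth box. The sketch below illustrates only that matching criterion. It is not the paper's evaluation code, and the function names and the (x1, y1, x2, y2) box format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) tuples; this format is an assumption
    for illustration, not something specified by the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """Hypothetical helper: a prediction matches a ground-truth box at IoU >= 0.5."""
    return iou(pred_box, gt_box) >= 0.5
```

Average precision is then computed per class from the matches ranked by confidence and averaged over classes to give mAP50; that step is omitted here.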
Abstract
The rapid increase of space assets represented by small satellites in low Earth orbit can enable ubiquitous digital services for everyone. However, due to the dynamic space environment, numerous space objects, complex atmospheric conditions, and unexpected events can easily introduce adverse conditions affecting space safety, operations, and sustainability of the outer space environment. This challenge calls for responsive, effective satellite object detection (SOD) solutions that allow a small satellite to assess and respond to collision risks, with the consideration of constrained resources on a small satellite platform. This paper discusses the SOD tasks and onboard deep learning (DL) approach to the tasks. Two new DL models are proposed, called GELAN-ViT and GELAN-RepViT, which incorporate vision transformer (ViT) into the Generalized Efficient Layer Aggregation Network (GELAN)…
Taxonomy
Topics: Earthquake Detection and Analysis · Infrared Target Detection Methodologies
Methods: Softmax · Dense Connections · Linear Layer · Multi-Head Attention · Layer Normalization · Residual Connection · Attention Is All You Need · Vision Transformer
