Distributed Lustre activity tracking
Henri Doreau

TL;DR
This paper presents enhancements to Lustre's changelog system that improve its scalability and flexibility, together with new external tools such as the LCAP proxy, enabling better near-real-time activity monitoring on distributed filesystems.
Contribution
The paper introduces modifications to Lustre and develops external tools that distribute changelog processing and simplify its use for diverse applications.
Findings
Improved scalability of Lustre changelog system
Development of LCAP proxy for distributed processing
Enhanced near-real-time activity visibility
Abstract
Numerous administration tools and techniques require a near-real-time view of the activity occurring on a distributed filesystem. The changelog facility that Lustre provides to address this need suffers from limitations in scalability and flexibility. We have been working to reduce those limitations by enhancing Lustre itself and by developing external tools such as the Lustre ChangeLog Aggregate and Publish (LCAP) proxy. Beyond the ability to distribute changelog processing, this effort aims to open new perspectives by making the changelog stream simpler to leverage for various purposes.
Taxonomy
Topics: Distributed and Parallel Computing Systems · Software System Performance and Reliability · Service-Oriented Architecture and Web Services
