Benchmarking Harmonized Tariff Schedule Classification Models
Bryce Judy

TL;DR
This paper introduces a standardized benchmarking framework for evaluating HTS classification tools, enabling systematic comparison of their performance in accuracy, speed, and rationality to improve international trade processes.
Contribution
It establishes the first comprehensive benchmark framework for HTS classification models, inspired by language model evaluation methods, to assess and compare industry solutions.
Findings
Zonos excels in speed and accuracy
Tarifflo shows strong rationality and code alignment
Avalara demonstrates balanced performance across metrics
Abstract
The Harmonized Tariff Schedule (HTS) classification industry, essential to e-commerce and international trade, currently lacks standardized benchmarks for evaluating the effectiveness of classification solutions. This study establishes and tests a benchmark framework for imports to the United States, inspired by the benchmarking approaches used in language model evaluation, to systematically compare prominent HTS classification tools. The framework assesses key metrics, such as speed, accuracy, rationality, and HTS code alignment, to provide a comprehensive performance comparison. The study evaluates several industry-leading solutions, including those provided by Zonos, Tarifflo, Avalara, and WCO BACUDA, identifying each tool's strengths and limitations. Results highlight areas for industry-wide improvement and innovation, paving the way for more effective and standardized HTS…
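The summary above does not define how the metrics are computed, so the following is only a minimal sketch of how three of them (accuracy, HTS code alignment, and speed) might be scored. It assumes that alignment is measured as the deepest digit level at which a predicted code agrees with the reference code, and that accuracy means an exact match on the full code; both definitions are assumptions, not the paper's. The names `alignment_depth` and `evaluate`, the `classify` callable, and the sample codes are hypothetical. Rationality is not covered, since scoring it would require a judgment of the tool's explanation.

```python
import re
import time
from typing import Callable, Dict, Iterable, Tuple

# Digit levels of a 10-digit US HTS code: chapter, heading, subheading,
# US tariff line, statistical suffix.
HTS_LEVELS = (2, 4, 6, 8, 10)


def normalize(code: str) -> str:
    """Keep only digits, so '6109.10.00.12' becomes '6109100012'."""
    return re.sub(r"\D", "", code)


def alignment_depth(predicted: str, reference: str) -> int:
    """Deepest level (0, 2, 4, 6, 8 or 10) at which both codes agree.

    Assumed definition of 'HTS code alignment'; the paper may differ.
    """
    p, r = normalize(predicted), normalize(reference)
    depth = 0
    for level in HTS_LEVELS:
        if len(p) >= level and len(r) >= level and p[:level] == r[:level]:
            depth = level
        else:
            break
    return depth


def evaluate(
    classify: Callable[[str], str],
    samples: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Score a classifier on (description, reference_code) pairs."""
    samples = list(samples)
    if not samples:
        raise ValueError("samples must not be empty")

    exact = 0
    depth_total = 0
    elapsed = 0.0
    for description, reference in samples:
        start = time.perf_counter()
        predicted = classify(description)  # only the call itself is timed
        elapsed += time.perf_counter() - start

        if normalize(predicted) == normalize(reference):
            exact += 1
        depth_total += alignment_depth(predicted, reference)

    n = len(samples)
    return {
        "accuracy": exact / n,
        "mean_alignment_depth": depth_total / n,
        "mean_seconds": elapsed / n,
    }


# Toy stand-in for a real classification tool; the codes are illustrative
# placeholders, not verified classifications.
_toy_answers = {
    "cotton t-shirt": "6109.10.00.12",
    "wool sweater": "6110.11.00.15",
}


def toy_classifier(description: str) -> str:
    return _toy_answers[description]


toy_samples = [
    ("cotton t-shirt", "6109.10.00.12"),
    ("wool sweater", "6110.20.20.10"),
]
report = evaluate(toy_classifier, toy_samples)
```

Reporting alignment depth alongside exact-match accuracy gives partial credit to a prediction that lands in the right chapter or heading but misses the final digits, which is one plausible reason a benchmark would track the two separately.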
Taxonomy
Topics: Energy, Environment, and Transportation Policies · Environmental Impact and Sustainability
