Using SVM to pre-classify government purchases
Thiago Marzagão

TL;DR
This paper shows how a support vector machine (SVM) classifier trained on 20 million Brazilian government purchases can reduce product misclassification, making public spending easier to audit.
Contribution
It introduces a machine learning approach using SVM to pre-classify government purchases, reducing misclassification and supporting transparency in procurement.
Findings
Correct category code among the classifier's three most likely codes in 83.3% of cases
Trained on 20 million records from 1999 to 2015
Open-sourced web app for practical use
Abstract
The Brazilian government often misclassifies the goods it buys. That makes it hard to audit government expenditures. We cannot know whether the price paid for a ballpoint pen (code #7510) was reasonable if the pen was misclassified as a technical drawing pen (code #6675) or as any other good. This paper shows how we can use machine learning to reduce misclassification. I trained a support vector machine (SVM) classifier that takes a product description as input and returns the most likely category codes as output. I trained the classifier using 20 million goods purchased by the Brazilian government between 1999-04-01 and 2015-04-02. In 83.3% of the cases the correct category code was one of the three most likely category codes identified by the classifier. I used the trained classifier to develop a web app that might help the government reduce misclassification. I open sourced the code…
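The 83.3% figure is a top-three accuracy: an item counts as correct when its true category code is among the three codes the classifier ranks highest. The following is a minimal sketch of that metric, not the author's code. It assumes the classifier exposes one score per category code for each product description. The function names and the score values are hypothetical, and only codes 7510 and 6675 come from the abstract (7520 and 7530 are placeholders).

```python
from typing import Dict, List, Sequence


def top_k_codes(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k category codes with the highest classifier scores."""
    # Sort by score, highest first; break ties by code so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [code for code, _ in ranked[:k]]


def top_k_accuracy(
    all_scores: Sequence[Dict[str, float]],
    true_codes: Sequence[str],
    k: int = 3,
) -> float:
    """Share of items whose true code is among the k highest-scoring codes."""
    if len(all_scores) != len(true_codes):
        raise ValueError("need exactly one true code per item")
    if not true_codes:
        raise ValueError("need at least one item")
    hits = sum(
        true in top_k_codes(scores, k)
        for scores, true in zip(all_scores, true_codes)
    )
    return hits / len(true_codes)


# Hypothetical scores for two product descriptions (illustrative values only).
# Codes 7510 and 6675 appear in the abstract; 7520 and 7530 are placeholders.
example_scores = [
    {"7510": 1.2, "6675": 0.4, "7520": -0.3, "7530": -1.1},
    {"7510": -0.8, "6675": -0.2, "7520": 0.1, "7530": 0.9},
]
example_truth = ["7510", "7510"]

print(top_k_codes(example_scores[0]))
print(top_k_accuracy(example_scores, example_truth))
```

In this toy data the first item is a hit and the second is a miss, because code 7510 ranks fourth for the second description. Returning several candidate codes rather than one also matches the pre-classification use case, where a person picks the final code from a short list.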
Taxonomy
Topics: Stock Market Forecasting Methods · Forecasting Techniques and Applications
