Critic Sequential Monte Carlo
Vasileios Lioutas, Jonathan Wilder Lavington, Justice Sefas, Matthew Niedoba, Yunpeng Liu, Berend Zwartsenberg, Setareh Dabiri, Frank Wood, Adam Scibior

TL;DR
CriticSMC is a novel planning algorithm combining sequential Monte Carlo with learned heuristics, improving efficiency and effectiveness in environments with sparse constraints, demonstrated in high-dimensional driving simulations.
Contribution
We propose CriticSMC, integrating learned Soft-Q heuristics into SMC for more efficient planning, especially in environments with sparse constraints, and show its applicability as a model-free control method.
Findings
Reduces collision rates in simulated driving tasks
Increases inference and planning efficiency
Maintains diverse and realistic behaviors
Abstract
We introduce CriticSMC, a new algorithm for planning as inference built from a composition of sequential Monte Carlo with learned Soft-Q function heuristic factors. These heuristic factors, obtained from parametric approximations of the marginal likelihood ahead, more effectively guide SMC towards the desired target distribution, which is particularly helpful for planning in environments with hard constraints placed sparsely in time. Compared with previous work, we modify the placement of such heuristic factors, which allows us to cheaply propose and evaluate large numbers of putative action particles, greatly increasing inference and planning efficiency. CriticSMC is compatible with informative priors, whose density function need not be known, and can be used as a model-free control algorithm. Our experiments on collision avoidance in a high-dimensional simulated driving task show that…
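The core idea in the abstract, scoring many putative actions per particle with a critic before any environment step is taken, can be illustrated with a small toy sketch. This is not the authors' implementation: the 1D dynamics, the boundary |x| <= 1, the hand-written `toy_critic` (standing in for a learned soft-Q function), and the simplified weight update are all assumptions made for illustration.

```python
import numpy as np

def toy_critic(state, action):
    # Hypothetical stand-in for a learned soft-Q heuristic: penalizes actions
    # that would move the state outside the assumed boundary |x| <= 1.
    next_state = state + 0.5 * action
    return -10.0 * np.maximum(0.0, np.abs(next_state) - 1.0)

def resample(log_w, rng):
    # Multinomial resampling of particle indices from normalized weights.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return rng.choice(len(w), size=len(w), p=w)

def critic_guided_smc(n_particles=64, n_putative=32, horizon=20, seed=0):
    rng = np.random.default_rng(seed)
    states = np.zeros(n_particles)
    for _ in range(horizon):
        # Propose many putative actions per particle from the prior
        # (a standard normal here; only samples are needed, not its density).
        actions = rng.normal(size=(n_particles, n_putative))
        # Score every putative action with the critic; no environment step yet.
        q = toy_critic(states[:, None], actions)
        # Pick one action per particle with probability proportional to exp(q),
        # using the Gumbel-max trick.
        idx = np.argmax(q + rng.gumbel(size=q.shape), axis=1)
        chosen = actions[np.arange(n_particles), idx]
        # Particle log-weight: log of the mean heuristic factor over putative actions.
        m = q.max(axis=1)
        log_w = m + np.log(np.mean(np.exp(q - m[:, None]), axis=1))
        # Step the (toy) environment once per particle, then resample particles.
        states = states + 0.5 * chosen
        states = states[resample(log_w, rng)]
    return states

final_states = critic_guided_smc()
print("fraction inside boundary:", np.mean(np.abs(final_states) <= 1.0))
```

The point of the design is that the environment is stepped once per particle, while the critic is evaluated `n_putative` times per particle, so the expensive simulator is not called for actions that are discarded.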
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Markov Chains and Monte Carlo Methods · Machine Learning and Algorithms
