# Generalizable Meta-Heuristic based on Temporal Estimation of Rewards for Large Scale Blackbox Optimization

**Authors:** Mingde Zhao, Hongwei Ge, Yi Lian, Kai Zhang

arXiv: 1812.06585 · 2019-09-19

## TL;DR

This paper introduces a meta-heuristic based on Temporal Estimation of Rewards (TER) for large-scale blackbox optimization. It controls an ensemble of heuristics, using a window for temporal credit assignment and Boltzmann exploration to balance exploration and exploitation, with the aim of generalizing across high-dimensional problems.

## Contribution

The paper proposes a new meta-heuristic based on non-stationary multi-armed bandits that generalizes across LSBO tasks and adapts to various resource constraints.

## Key findings

- TER achieves competitive performance on benchmarks with up to 10,000 dimensions.
- The methodology effectively transforms LSBO problems into online decision processes.
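- When articulated with three heuristics, TER performs competitively across different sets of benchmark problems and accommodates different types of limited computational resources (e.g., time, money).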

## Abstract

The generalization abilities of heuristic optimizers may deteriorate as the dimensionality of the search space increases. To achieve generalized performance across Large Scale Blackbox Optimization (LSBO) tasks, it is possible to ensemble several heuristics and devise a meta-heuristic to control their initiation. This paper first proposes a methodology of transforming LSBO problems into online decision processes to maximize the efficiency of resource utilization. Then, using the perspective of multi-armed bandits with non-stationary reward distributions, we propose a meta-heuristic based on Temporal Estimation of Rewards (TER) to address such a decision process. TER uses a window for temporal credit assignment and Boltzmann exploration to balance the exploration-exploitation tradeoff. The prior-free TER generalizes across LSBO tasks with flexibility for different types of limited computational resources (e.g., time, money) and is easy to adapt to new tasks thanks to its simplicity and its simple interface for heuristic articulation. Tests on the benchmarks validate the problem formulation and suggest significant effectiveness: when TER is articulated with three heuristics, competitive performance is reported across different sets of benchmark problems with search dimensions up to 10000.
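
The two mechanisms named in the abstract, a window for temporal credit assignment and Boltzmann exploration, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `WindowedBoltzmannSelector`, the parameters `window` and `temperature`, the plain windowed mean, and the default estimate of 0.0 for untried heuristics are all assumptions made for illustration. The reward passed to `update` is left to the caller; it might be, for example, the objective improvement per unit of resource spent.

```python
import math
import random
from collections import deque


class WindowedBoltzmannSelector:
    """Hypothetical sketch: choose among heuristics from recent rewards only."""

    def __init__(self, n_heuristics, window=5, temperature=1.0, seed=None):
        # One bounded history per heuristic; old rewards fall out of the window,
        # which lets the estimate track a non-stationary reward distribution.
        self.rewards = [deque(maxlen=window) for _ in range(n_heuristics)]
        self.temperature = temperature
        self.rng = random.Random(seed)

    def estimates(self):
        # Windowed mean reward; untried heuristics default to 0.0 (an assumption).
        return [sum(r) / len(r) if r else 0.0 for r in self.rewards]

    def probabilities(self):
        # Boltzmann (softmax) distribution over the estimates.
        # Subtracting the maximum keeps exp() numerically stable.
        est = self.estimates()
        top = max(est)
        weights = [math.exp((e - top) / self.temperature) for e in est]
        total = sum(weights)
        return [w / total for w in weights]

    def select(self):
        # Sample the index of the heuristic to run next.
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def update(self, index, reward):
        # Record the reward observed after running heuristic `index`.
        self.rewards[index].append(reward)
```

In a control loop, the caller would alternate `select()`, run the chosen heuristic for one episode, and call `update()` with the observed reward. A lower `temperature` concentrates probability on the heuristic with the best recent rewards, while a higher one spreads it more evenly.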

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.06585/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1812.06585/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1812.06585/full.md

---
Source: https://tomesphere.com/paper/1812.06585