# Composite Optimization Algorithms for Sigmoid Networks

**Authors:** Huixiong Chen, Qi Ye

arXiv: 2303.00589 · 2023-07-10

## TL;DR

This paper reformulates the training of sigmoid networks as an equivalent convex composite optimization problem and solves it with algorithms built on linearized proximal methods and ADMM. Convergence to a global optimum is proven under weak sharp minima and a regularity condition, and numerical experiments on Franke's function fitting and handwritten digit recognition support the approach.

## Contribution

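The paper develops composite optimization algorithms for sigmoid networks based on linearized proximal methods and the alternating direction method of multipliers (ADMM). Under weak sharp minima and a regularity condition, the algorithms are guaranteed to converge to a globally optimal solution even when the problem is non-convex and non-smooth.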
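For orientation, the block below sketches the generic shape of a convex composite problem and of one linearized proximal step, as they usually appear in the convex composite optimization literature. The symbols $h$, $F$, $v$ and $x_k$ are assumptions chosen for illustration; the paper's own notation, and the way it splits the subproblem for ADMM, are not reproduced in this summary. One plausible reading for a sigmoid network is that $F$ collects the network outputs minus the targets as a function of the weights, and $h$ is a convex loss such as a norm.

```latex
% Convex composite problem: h is convex (possibly non-smooth),
% F is continuously differentiable (possibly making h \circ F non-convex).
\min_{x \in \mathbb{R}^n} \; f(x) := h\bigl(F(x)\bigr)

% One linearized proximal step with proximal parameter v > 0:
% linearize F at x_k, keep h exact, and add a quadratic penalty on the step.
d_k \in \arg\min_{d \in \mathbb{R}^n}
  \Bigl\{ h\bigl(F(x_k) + F'(x_k)\, d\bigr) + \tfrac{1}{2v}\,\lVert d \rVert^2 \Bigr\},
\qquad
x_{k+1} = x_k + d_k
```

Each subproblem is convex in $d$ even when $f$ is not, which is what makes a splitting method such as ADMM applicable to it.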

## Key findings

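- Under weak sharp minima and a regularity condition, the algorithms converge to a globally optimal solution of the objective function.
- Numerical experiments on Franke's function fitting and handwritten digit recognition show satisfactory and robust performance.
- The convergence results relate directly to the amount of training data, giving a general guide for setting the size of a sigmoid network.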
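To make the idea of a linearized proximal iteration concrete, here is a minimal sketch for a one-hidden-layer sigmoid network. It is not the authors' implementation. It assumes the outer function is the smooth choice h(u) = ½‖u‖², so that each subproblem reduces to a damped linear solve (a Levenberg-Marquardt-type step) rather than the ADMM solve the paper uses for non-smooth cases. The function names, the network layout, the default parameters and the step-acceptance safeguard are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def residual_and_jacobian(theta, X, y, m):
    """Return F(theta) = net(X) - y and its Jacobian for a net with m hidden units."""
    n, d = X.shape
    W = theta[: m * d].reshape(m, d)       # hidden-layer weights
    b = theta[m * d : m * d + m]           # hidden-layer biases
    a = theta[m * d + m :]                 # output weights
    S = sigmoid(X @ W.T + b)               # hidden activations, shape (n, m)
    F = S @ a - y                          # residual vector, shape (n,)
    dS = S * (1.0 - S) * a                 # d(output)/d(pre-activation), shape (n, m)
    J_W = (dS[:, :, None] * X[:, None, :]).reshape(n, m * d)
    J = np.hstack([J_W, dS, S])            # columns ordered as [W, b, a]
    return F, J


def linearized_proximal_fit(X, y, m=8, v=1.0, iters=200, seed=0):
    """Assumed setting: minimize 0.5 * ||F(theta)||^2 by linearized proximal steps."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(scale=0.5, size=m * X.shape[1] + 2 * m)
    history = []
    for _ in range(iters):
        F, J = residual_and_jacobian(theta, X, y, m)
        loss = 0.5 * F @ F
        history.append(loss)
        # Subproblem: min_d 0.5*||F + J d||^2 + (1/(2v))*||d||^2,
        # whose solution satisfies (J^T J + I/v) d = -J^T F.
        step = np.linalg.solve(J.T @ J + np.eye(theta.size) / v, -J.T @ F)
        F_new, _ = residual_and_jacobian(theta + step, X, y, m)
        if 0.5 * F_new @ F_new < loss:
            theta = theta + step           # accept the step
        else:
            v *= 0.5                       # safeguard (an addition): damp harder and retry
    return theta, history
```

Calling `linearized_proximal_fit(X, y)` with `X` of shape `(n, d)` and targets `y` of shape `(n,)` returns the fitted parameter vector and the loss recorded at the start of each iteration. Because a step is only accepted when it lowers the loss, that recorded sequence never increases.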

## Abstract

In this paper, we use composite optimization algorithms to train sigmoid networks. We equivalently transform the sigmoid networks into a convex composite optimization problem and propose composite optimization algorithms based on the linearized proximal algorithms and the alternating direction method of multipliers. Under the assumptions of weak sharp minima and the regularity condition, the algorithm is guaranteed to converge to a globally optimal solution of the objective function even in the case of non-convex and non-smooth problems. Furthermore, the convergence results can be directly related to the amount of training data and provide a general guide for setting the size of sigmoid networks. Numerical experiments on Franke's function fitting and handwritten digit recognition show that the proposed algorithms perform satisfactorily and robustly.
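For readers unfamiliar with the first benchmark, Franke's function is a standard two-dimensional test surface on the unit square. The sketch below gives its usual closed form along with a hypothetical helper for drawing training samples. The uniform sampling, the optional Gaussian noise and the helper name `sample_franke` are assumptions for illustration; the paper's actual sampling scheme is not stated in this summary.

```python
import numpy as np


def franke(x, y):
    """Standard Franke test function on [0, 1] x [0, 1]."""
    t1 = 0.75 * np.exp(-((9 * x - 2) ** 2 + (9 * y - 2) ** 2) / 4)
    t2 = 0.75 * np.exp(-((9 * x + 1) ** 2) / 49 - (9 * y + 1) / 10)
    t3 = 0.5 * np.exp(-((9 * x - 7) ** 2 + (9 * y - 3) ** 2) / 4)
    t4 = -0.2 * np.exp(-((9 * x - 4) ** 2) - (9 * y - 7) ** 2)
    return t1 + t2 + t3 + t4


def sample_franke(n, noise=0.0, seed=0):
    """Hypothetical helper: n uniform points in the unit square with noisy targets."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, 2))
    z = franke(X[:, 0], X[:, 1]) + noise * rng.normal(size=n)
    return X, z
```

A call such as `X, z = sample_franke(500, noise=0.01)` yields inputs of shape `(500, 2)` and targets of shape `(500,)`, which can be passed to any regression routine for a function-fitting experiment.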

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2303.00589/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/2303.00589/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/2303.00589/full.md

---
Source: https://tomesphere.com/paper/2303.00589