Generalization ability of a perceptron with non-monotonic transfer function

Jun-ichi Inoue; Hidetoshi Nishimori; Yoshiyuki Kabashima

arXiv:cond-mat/9705190 · cond-mat.dis-nn · October 30, 2009

TL;DR

This paper studies the generalization ability of a perceptron with a non-monotonic transfer function, comparing various learning algorithms and proposing improvements for better performance across parameter ranges.

Contribution

It introduces a modified AdaTron algorithm that performs well for all parameter values and analyzes the effects of learning rate optimization, linking to Bayesian statistics.

Findings

01

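With the perceptron algorithm, the generalization error decays as $\alpha^{-1/3}$ only in a restricted range of the parameter $a$; for other values of $a$ the student's weight vector ends up opposite to the teacher's.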

02

Modified AdaTron algorithm achieves good performance across all parameter ranges.

03

Learning rate optimization enhances learning efficiency, with results related to Bayesian methods.

Abstract

We investigate the generalization ability of a perceptron with a non-monotonic transfer function of a reversed-wedge type in on-line mode. This network is identical to a parity machine, a multilayer network. We consider several learning algorithms. With the perceptron algorithm, the generalization error is shown to decrease according to the $\alpha^{-1/3}$ law, as for a simple perceptron, in a restricted range of the parameter $a$ characterizing the non-monotonic transfer function. For other values of $a$, the perceptron algorithm leads to a state where the weight vector of the student is exactly opposite to that of the teacher. The Hebbian learning algorithm has a similar property; it works only in a limited range of the parameter. The conventional AdaTron algorithm does not give a vanishing generalization error for any value of $a$. We thus introduce a modified AdaTron algorithm…
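
The setup in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' code: the abstract does not write out the transfer function, so the form $T_a(u) = \mathrm{sign}[u(a-u)(a+u)]$ commonly used for reversed-wedge perceptrons is assumed here, and the Gaussian inputs, the unit-norm teacher, and the helper names `reversed_wedge` and `hebbian_overlap` are hypothetical choices made for illustration. The loop applies a plain on-line Hebbian update and reports the normalized overlap between student and teacher weight vectors.

```python
import numpy as np


def reversed_wedge(u, a):
    """Assumed reversed-wedge transfer function: sign[u (a - u)(a + u)].

    Gives +1 for u < -a and for 0 < u < a, and -1 for -a < u < 0 and u > a.
    """
    return np.sign(u * (a - u) * (a + u))


def hebbian_overlap(a, n=50, n_examples=2000, seed=0):
    """On-line Hebbian learning from a teacher with wedge width `a`.

    Returns the normalized overlap R = J.B / (|J| |B|) after training.
    All settings (Gaussian inputs, n, n_examples) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Teacher weight vector B, normalized to unit length.
    teacher = rng.standard_normal(n)
    teacher /= np.linalg.norm(teacher)

    # Student weight vector J starts at zero.
    student = np.zeros(n)

    for _ in range(n_examples):
        x = rng.standard_normal(n)                 # one fresh example per step
        label = reversed_wedge(teacher @ x, a)     # teacher's output
        student += label * x                       # Hebbian update

    return float(student @ teacher / np.linalg.norm(student))


if __name__ == "__main__":
    for a in (10.0, 0.0):
        print(f"a = {a}: overlap R = {hebbian_overlap(a):.3f}")
```

In this sketch, a large $a$ makes the assumed function behave like $\mathrm{sign}(u)$ on typical inputs, so the Hebbian student aligns with the teacher. At $a = 0$ the labels are fully inverted and the same rule drives the student toward the opposite direction, which is the kind of anti-aligned state the abstract describes for the perceptron algorithm outside its working range.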
