Online Heterogeneous Mixture Learning for Big Data

Kazuki Seshimo; Akira Ota; Daichi Nishio; Satoshi Yamane

arXiv:1906.08068·cs.LG·June 20, 2019

TL;DR

This paper introduces an online learning approach for big data analysis that handles heterogeneity, demonstrating rapid convergence to batch-level accuracy through experiments.

Contribution

It presents a novel online heterogeneous mixture learning method that achieves comparable accuracy to batch methods with faster convergence.

Findings

01. Online method converges quickly to batch accuracy.
02. Achieves comparable accuracy to traditional batch learning.
03. Effective for big data heterogeneity.

Abstract

We propose an online machine learning method for analyzing big data with heterogeneity. In an experiment, we compared the accuracy at each iteration between the batch method and the online method. The online method converges quickly while reaching the same accuracy as the batch method.

Tables

Table I: Data

Parameter	Value
Number of data points	10,000
Number of mixture components	4
Mixing coefficients	0.1, 0.2, 0.3, 0.4
Means, covariances	random
Number of dimensions	10

Equations

$$\gamma(z_{nk})^{(t+1)} = \frac{\pi_{k}^{(t)} \mathcal{N}(x_{n} \mid \mu_{k}^{(t)}, \Sigma_{k}^{(t)})}{\sum_{j=1}^{K} \pi_{j}^{(t)} \mathcal{N}(x_{n} \mid \mu_{j}^{(t)}, \Sigma_{j}^{(t)})}, \quad (k = 1, \dots, K)$$

$$s_{nk}^{(t+1)} = \gamma(z_{nk})^{(t+1)} - \gamma(z_{nk})^{(t)}$$

$$N_{k}^{(t+1)} = \sum_{n=1}^{N} \gamma(z_{nk})^{(t)} + s_{nk}^{(t+1)} = N_{k}^{(t)} + s_{nk}^{(t+1)}$$

$$\pi_{k}^{(t+1)} = \pi_{k}^{(t)} + \frac{s_{nk}^{(t+1)}}{N}$$

$$\mu_{k}^{(t+1)} = \mu_{k}^{(t)} + \frac{s_{nk}^{(t+1)}}{N_{k}^{(t+1)}} (x_{n} - \mu_{k}^{(t)})$$

$$\Sigma_{k}^{(t+1)} = \Sigma_{k}^{(t)} + \frac{s_{nk}^{(t+1)}}{N_{k}^{(t+1)}} (x_{n} - \mu_{k}^{(t)}) (x_{n} - \mu_{k}^{(t)})^{T}$$

$$FIC_{online}(x_{n}, M) = \max_{q} \left\{ J_{online}(q, \overline{\theta}, x_{n}) \right\}$$

$$J_{online}(q, \overline{\theta}, x_{n}) = \dots + q(z_{nc}) \left[ \log p(x_{n}, z_{nc} \mid \overline{\theta}) - \frac{1}{2} \log N - \sum_{c=1}^{C} \frac{D_{c}}{2} \left\{ \log z_{nc} - \log q(z_{nc}) \right\} \right]$$

Taxonomy

Topics: Machine Learning and Algorithms · Bayesian Methods and Mixture Models · Data Stream Mining Techniques

Full text

1st Kazuki Seshimo

Kanazawa University, Kanazawa, Japan

2nd Akira Ota

Kanazawa University, Kanazawa, Japan

3rd Daichi Nishio

Kanazawa University, Kanazawa, Japan

4th Satoshi Yamane

Kanazawa University, Kanazawa, Japan

I Introduction

Heterogeneous mixture learning is a method for analyzing big data with heterogeneity. The existing method is a batch learner: it generates its models with a batch EM algorithm [1]. We therefore propose online heterogeneous mixture learning, built on the incremental EM algorithm [2,3,4], which is an online variant of EM. Online heterogeneous mixture learning can converge faster than the batch version while reaching the same accuracy.

II Online Heterogeneous Mixture Learning

We propose an online version of heterogeneous mixture learning that uses the online form of the EM algorithm for Gaussian mixtures. We first introduce the incremental EM algorithm [2,3].

$E_{incremental}Step:$ We fix the parameters and calculate the responsibility $\gamma$ and the amount of change in the responsibility $s_{nk}$. We update the responsibility for one data point $x_{n}$ from the observed data $x^{N}$.

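$$\gamma(z_{nk})^{(t+1)} = \frac{\pi_{k}^{(t)} \mathcal{N}(x_{n} \mid \mu_{k}^{(t)}, \Sigma_{k}^{(t)})}{\sum_{j=1}^{K} \pi_{j}^{(t)} \mathcal{N}(x_{n} \mid \mu_{j}^{(t)}, \Sigma_{j}^{(t)})}, \quad (k = 1, \dots, K)$$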

We calculate the amount of change in the responsibility $s_{nk}$.

$$s_{nk}^{(t+1)} = \gamma(z_{nk})^{(t+1)} - \gamma(z_{nk})^{(t)}$$

$M_{incremental}Step:$ We fix the responsibility $\gamma(z_{nk})$ and the amount of change in the responsibility $s_{nk}$, and update each parameter.

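$$N_{k}^{(t+1)} = \sum_{n=1}^{N} \gamma(z_{nk})^{(t)} + s_{nk}^{(t+1)} = N_{k}^{(t)} + s_{nk}^{(t+1)}$$

$$\pi_{k}^{(t+1)} = \pi_{k}^{(t)} + \frac{s_{nk}^{(t+1)}}{N}$$

$$\mu_{k}^{(t+1)} = \mu_{k}^{(t)} + \frac{s_{nk}^{(t+1)}}{N_{k}^{(t+1)}} (x_{n} - \mu_{k}^{(t)})$$

$$\Sigma_{k}^{(t+1)} = \Sigma_{k}^{(t)} + \frac{s_{nk}^{(t+1)}}{N_{k}^{(t+1)}} (x_{n} - \mu_{k}^{(t)}) (x_{n} - \mu_{k}^{(t)})^{T}$$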
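The incremental E and M steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `gaussian_pdf` and `incremental_em_step` and the array layout are hypothetical. The covariance line uses the exact weighted-moment identity, which carries a factor $N_{k}^{(t)} / N_{k}^{(t+1)}$ that does not appear in the covariance update as listed in this document; that factor is an assumption of this sketch.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(x | mean, cov)."""
    d = x.shape[0]
    diff = x - mean
    # Solve a linear system instead of inverting the covariance explicitly.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (d * np.log(2 * np.pi) + logdet + maha))

def incremental_em_step(x_n, gamma_n, pi, mu, sigma, N_k, N):
    """One incremental E step and M step for a single data point x_n.

    gamma_n: previous responsibilities of x_n, shape (K,)
    pi, mu, sigma, N_k: current parameters, shapes (K,), (K, D), (K, D, D), (K,)
    N: total number of data points
    Returns updated copies (gamma_n, pi, mu, sigma, N_k).
    """
    K = pi.shape[0]
    # E step: responsibilities of x_n under the current parameters.
    weighted = np.array([pi[k] * gaussian_pdf(x_n, mu[k], sigma[k]) for k in range(K)])
    gamma_new = weighted / weighted.sum()
    s = gamma_new - gamma_n  # change in responsibility s_nk

    # M step: shift every statistic by the change s.
    N_k_new = N_k + s
    pi_new = pi + s / N
    mu_new = mu.copy()
    sigma_new = sigma.copy()
    for k in range(K):
        diff = x_n - mu[k]
        ratio = s[k] / N_k_new[k]
        mu_new[k] = mu[k] + ratio * diff
        # Assumption: exact weighted-moment form, including the
        # prefactor N_k / N_k_new that the listed equation omits.
        sigma_new[k] = (N_k[k] / N_k_new[k]) * (sigma[k] + ratio * np.outer(diff, diff))
    return gamma_new, pi_new, mu_new, sigma_new, N_k_new
```

Calling `incremental_em_step` once per data point, and storing the returned responsibilities for the next visit to that point, gives one pass of the incremental algorithm. With the prefactor included, the updated parameters equal the ones obtained by recomputing the weighted statistics from all responsibilities.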

The crucial components of heterogeneous mixture learning are the factorized information criterion (FIC) and factorized asymptotic Bayesian inference (FAB) [1]. Both have to be made available online. First, we adapt FIC, which is the metric used to evaluate the model. Second, we adapt FAB to match the change in FIC.

The $FIC_{online}$ which supports online learning is shown below.

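$$FIC_{online}(x_{n}, M) = \max_{q} \left\{ J_{online}(q, \overline{\theta}, x_{n}) \right\}$$

$$J_{online}(q, \overline{\theta}, x_{n}) = \dots + q(z_{nc}) \left[ \log p(x_{n}, z_{nc} \mid \overline{\theta}) - \frac{1}{2} \log N - \sum_{c=1}^{C} \frac{D_{c}}{2} \left\{ \log z_{nc} - \log q(z_{nc}) \right\} \right]$$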

It is not possible to evaluate $FIC_{online}$ directly because the parameters cannot be determined analytically. To evaluate FIC, FAB maximizes an asymptotically consistent lower bound of FIC. To make the updates incremental, we modify FAB to use the change in the variational probability of the latent variable.

We calculate sequentially by repeating the following two steps $t$ times.

$V_{online}Step:$ We optimize the distribution $q(z_{nc})$ of the latent variables $z^{N}$, and calculate the distribution of the latent variables and its change $s_{nc}$ for the additional data point $x_{n}$.

$M_{online}Step:$ We optimize the components of the Gaussian mixture and the parameters $\theta$.

III Experimental Results

We compare conventional batch heterogeneous mixture learning [1] with the online heterogeneous mixture learning proposed in this paper, under the same environment and conditions.

In this experiment, the training data are normal random numbers generated from a Gaussian mixture. A Gaussian mixture needs three parameters: means, covariances, and mixing coefficients. We specified these three parameters and the number of dimensions to create the dataset for this experiment. Table I shows the details.
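A dataset of this kind could be generated along the following lines. This is a sketch under assumptions: the text only says that means and covariances are random, so the sampling distributions used here (normal means with scale 5, covariances built as $AA^{T}/D + I$) and the function name `make_mixture_dataset` are hypothetical choices, not the authors' procedure. The defaults follow the values given in Table I.

```python
import numpy as np

def make_mixture_dataset(n_samples=10_000, n_dims=10,
                         weights=(0.1, 0.2, 0.3, 0.4), seed=0):
    """Draw samples from a Gaussian mixture with random means and covariances."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    K = weights.shape[0]
    # Random means; the scale 5.0 is an arbitrary choice for this sketch.
    means = rng.normal(scale=5.0, size=(K, n_dims))
    # Random covariances: A A^T / D + I is symmetric positive definite.
    covs = np.empty((K, n_dims, n_dims))
    for k in range(K):
        A = rng.normal(size=(n_dims, n_dims))
        covs[k] = A @ A.T / n_dims + np.eye(n_dims)
    # Pick a component for every sample, then draw from that component.
    labels = rng.choice(K, size=n_samples, p=weights)
    X = np.empty((n_samples, n_dims))
    for k in range(K):
        idx = np.flatnonzero(labels == k)
        X[idx] = rng.multivariate_normal(means[k], covs[k], size=idx.size)
    return X, labels, means, covs
```

Passing a different `n_samples` or `n_dims` gives the other settings varied in the experiment, and a different `seed` gives a fresh random mixture for each repetition.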

We measured how the FIC changed at each iteration to compare the convergence speed of online learning with that of batch learning. The number of iterations until convergence was also included in the evaluation. Each experiment was performed 10 times, and the average was taken as the experimental result. We varied the number of data points [500, 10000] (Fig. 1) and the number of dimensions [2, 4, 20] (Fig. 2), keeping the other parameters fixed.
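The measurement protocol (a per-iteration metric, averaged over 10 runs, for a batch and an online learner) can be illustrated with the sketch below. It is not the experiment of this paper: plain Gaussian-mixture EM stands in for FAB, the log-likelihood stands in for FIC because the text does not give enough detail to compute $FIC_{online}$, the data are a small toy set, and all function names are hypothetical. Both learners are written in terms of sufficient statistics, so the incremental one only adds the change in responsibility of one data point at a time.

```python
import numpy as np

def log_joint(X, pi, mu, sigma):
    """log(pi_k * N(x_n | mu_k, sigma_k)) for every n and k, shape (N, K)."""
    N, D = X.shape
    out = np.empty((N, pi.shape[0]))
    for k in range(pi.shape[0]):
        diff = X - mu[k]
        maha = np.einsum("nd,nd->n", diff, np.linalg.solve(sigma[k], diff.T).T)
        _, logdet = np.linalg.slogdet(sigma[k])
        out[:, k] = np.log(pi[k]) - 0.5 * (D * np.log(2 * np.pi) + logdet + maha)
    return out

def responsibilities(X, pi, mu, sigma):
    lj = log_joint(X, pi, mu, sigma)
    g = np.exp(lj - lj.max(axis=1, keepdims=True))  # numerically stable softmax
    return g / g.sum(axis=1, keepdims=True)

def log_likelihood(X, pi, mu, sigma):
    lj = log_joint(X, pi, mu, sigma)
    m = lj.max(axis=1)
    return float(np.sum(m + np.log(np.exp(lj - m[:, None]).sum(axis=1))))

def stats_from_gamma(X, gamma):
    """Sufficient statistics per component: sums of gamma, gamma*x, gamma*x*x^T."""
    return gamma.sum(axis=0), gamma.T @ X, np.einsum("nk,nd,ne->kde", gamma, X, X)

def params_from_stats(S0, S1, S2, N):
    mu = S1 / S0[:, None]
    sigma = S2 / S0[:, None, None] - np.einsum("kd,ke->kde", mu, mu)
    return S0 / N, mu, sigma

def fit_curve(X, K, n_passes, incremental, seed):
    """Log-likelihood after each pass over the data."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    gamma = rng.dirichlet(np.ones(K), size=N)  # random soft initialisation
    S0, S1, S2 = stats_from_gamma(X, gamma)
    params = params_from_stats(S0, S1, S2, N)
    curve = []
    for _ in range(n_passes):
        if incremental:
            for n in range(N):  # parameters change after every data point
                g = responsibilities(X[n:n + 1], *params)[0]
                s = g - gamma[n]  # change in responsibility
                S0 = S0 + s
                S1 = S1 + np.outer(s, X[n])
                S2 = S2 + s[:, None, None] * np.outer(X[n], X[n])
                gamma[n] = g
                params = params_from_stats(S0, S1, S2, N)
        else:  # batch: one parameter update per pass
            gamma = responsibilities(X, *params)
            params = params_from_stats(*stats_from_gamma(X, gamma), N)
        curve.append(log_likelihood(X, *params))
    return np.array(curve)

def average_curves(X, K=2, n_passes=10, n_runs=10):
    """Average the per-pass curves over n_runs random initialisations."""
    batch = np.mean([fit_curve(X, K, n_passes, False, r) for r in range(n_runs)], axis=0)
    online = np.mean([fit_curve(X, K, n_passes, True, r) for r in range(n_runs)], axis=0)
    return batch, online

# Toy data: two well-separated 2-D Gaussians (not the dataset of Table I).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 1.0, size=(100, 2)), rng.normal(3.0, 1.0, size=(100, 2))])
batch_curve, online_curve = average_curves(X)
```

Plotting `batch_curve` and `online_curve` against the pass index gives curves of the same kind as the per-iteration comparison described here, with the log-likelihood in place of FIC.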

In all figures, online learning needs fewer iterations than batch learning. In general, when online and batch algorithms are compared, the batch algorithm usually converges faster. For the EM algorithm, however, the online algorithm converges faster than the batch algorithm.

IV Conclusion

We proposed online heterogeneous mixture learning to speed up the convergence of machine learning on heterogeneous data. It reaches the same accuracy as the batch method in fewer iterations. As future work, it is also necessary to consider the stepwise EM algorithm [4], in which the work area is scalable.

Bibliography

[1] Ryohei Fujimaki, Satoshi Morinaga. "Factorized Asymptotic Bayesian Inference for Mixture Modeling". Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:400-408, 2012.
[2] Radford M. Neal, Geoffrey E. Hinton. "A view of the EM algorithm that justifies incremental, sparse, and other variants". 1998.
[3] Percy Liang, Dan Klein. "Online EM for unsupervised models". Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL '09), pp. 611-619.
[4] Masa-Aki Sato, Shin Ishii. "On-line EM algorithm for the normalized Gaussian network". Neural Computation, Vol. 12, No. 2, pp. 407-432, 2000.