Bayesian Hamiltonian Selection in X-ray Photoelectron Spectroscopy

Yoh-ichi Mototake; Masaichiro Mizumaki; Ichiro Akai; Masato Okada

arXiv:1812.01205·cond-mat.str-el·March 27, 2019


TL;DR

This paper introduces an automated Bayesian approach for selecting and parameterizing effective Hamiltonians in core-level XPS spectra analysis, improving accuracy and providing uncertainty quantification.

Contribution

The paper presents a novel Bayesian model selection framework using exchange Monte Carlo sampling for automatic Hamiltonian selection and parameter estimation in XPS spectra analysis.

Findings

1. Successfully applied to the 3d XPS spectra of Ce and La compounds
2. Confirmed that the method's selections align with physical knowledge
3. Enabled uncertainty evaluation of the estimated parameters


Tables

Table 1: Reproduction parameters of XPS spectra of La2O3 and CeO2.

Parameter	$\Delta$	$V$	$U_{ff}$	$U_{fc}$	$\Gamma$
La₂O₃	12.5	0.57	10.5	12.7	0.5
CeO₂	1.6	0.76	10.5	12.5	0.7

Table 2: Properties of spectrum structure. $E_{j}-E_{g}$ corresponds to the peak position, and $\left|\langle F_{j}|a_{c}|G\rangle\right|^{2}\frac{\Gamma/\pi}{\left[(E_{j}-E_{g})-(E_{j}-E_{g})\right]^{2}+\Gamma^{2}}=\frac{|\langle F_{j}|a_{c}|G\rangle|^{2}}{\Gamma\pi}$ corresponds to the peak intensity. The number of peaks is defined as the number of peaks whose intensity is larger than the noise intensity $\sigma_{noise}$.

$\Delta$	Number of peaks	$E_{0}-E_{g}$	$E_{1}-E_{g}$	$E_{2}-E_{g}$	$\frac{|\langle F_{0}|a_{c}|G\rangle|^{2}}{\Gamma\pi}$	$\frac{|\langle F_{1}|a_{c}|G\rangle|^{2}}{\Gamma\pi}$	$\frac{|\langle F_{2}|a_{c}|G\rangle|^{2}}{\Gamma\pi}$
12.5	2	-2.8	2.72	12.48	0.3850	0.2512	0.0005
10.08	2-3	-4.53	1.94	8.15	0.2824	0.3509	0.0033
7.66	3	-6.67	1.25	4.3746	0.2330	0.3581	0.0456
5.23	3	-9.03	2.74	-0.92	0.2298	0.2414	0.1654
2.81	3	2.91	-11.5	-3.93	0.2372	0.2623	0.1372
1.6	3	3.29	-5.22	-12.71	0.1937	0.1575	0.2854

Table 3: Range of uniform prior densities of parameters.

	$\Delta$	$\Delta'$	$V$	$U_{ff}$	$U_{fc}$	$\Gamma$	$b$
Min	0.0	-20.0	0.0	0.0	0.0	0.01	-5.0
Max	20.0	20.0	4.0	20.0	20.0	1.0	5.0

Table 4: Estimated values of $\Delta$ by the MAP and MPM methods. The estimation accuracy can be evaluated by comparing the estimated parameter with the true parameter, and the uncertainty of the estimation can be evaluated by the variation $\chi$.

$\Delta$ \ $\sigma_{noise}$		0.001		0.0028		0.0046		0.0064		0.0082		0.01
	True	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$
12.5	12.5	14.68	13.65 $\pm$ 34.53	18.56	18.97 $\pm$ 94.66	8.81	10.77 $\pm$ 30.62	16.07	18.16 $\pm$ 77.65	18.71	17.89 $\pm$ 77.3	15.68	16.32 $\pm$ 58.92
10.08	10.08	8.85	8.93 $\pm$ 0.28	8.72	8.41 $\pm$ 8.92	7.94	7.21 $\pm$ 27.91	4.64	6.65 $\pm$ 33.98	5.91	9.42 $\pm$ 24.6	2.33	6.2 $\pm$ 39.8
7.66	7.66	7.62	7.64 $\pm$ 0.0	7.58	7.62 $\pm$ 0.0	7.62	7.58 $\pm$ 0.01	7.65	7.59 $\pm$ 0.03	7.32	7.56 $\pm$ 0.05	7.48	7.58 $\pm$ 0.07
5.23	5.23	5.22	5.21 $\pm$ 0.0	5.15	5.17 $\pm$ 0.0	5.1	5.14 $\pm$ 0.01	5.13	5.1 $\pm$ 0.01	5.04	5.05 $\pm$ 0.02	5.02	5.02 $\pm$ 0.03
2.81	2.81	2.82	2.81 $\pm$ 0.0	2.8	2.81 $\pm$ 0.0	2.79	2.8 $\pm$ 0.0	2.82	2.8 $\pm$ 0.0	2.75	2.81 $\pm$ 0.01	2.71	2.81 $\pm$ 0.01
1.6	1.6	1.6	1.6 $\pm$ 0.0	1.62	1.59 $\pm$ 0.0	1.6	1.6 $\pm$ 0.0	1.56	1.6 $\pm$ 0.0	1.61	1.6 $\pm$ 0.0	1.63	1.59 $\pm$ 0.0

Table 5: Estimated values of $U_{fc}$ by the MAP and MPM methods. The estimation accuracy can be evaluated by comparing the estimated parameter with the true parameter, and the uncertainty of the estimation can be evaluated by the variation $\chi$.

$\Delta$ \ $\sigma_{noise}$		0.001		0.0028		0.0046		0.0064		0.0082		0.01
	True	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$
12.5	12.5	14.31	5.82 $\pm$ 53.4	18.0	5.78 $\pm$ 55.52	9.68	5.81 $\pm$ 52.92	15.92	5.93 $\pm$ 56.29	17.91	5.92 $\pm$ 54.12	15.33	5.99 $\pm$ 51.92
10.08	12.5	11.61	11.66 $\pm$ 0.14	11.53	11.32 $\pm$ 5.58	10.9	10.68 $\pm$ 17.29	8.79	9.85 $\pm$ 23.41	9.54	9.66 $\pm$ 25.75	7.6	9.73 $\pm$ 25.67
7.66	12.5	12.46	12.48 $\pm$ 0.0	12.43	12.46 $\pm$ 0.0	12.48	12.43 $\pm$ 0.01	12.46	12.44 $\pm$ 0.02	12.19	12.4 $\pm$ 0.03	12.35	12.39 $\pm$ 0.05
5.23	12.5	12.48	12.49 $\pm$ 0.0	12.45	12.47 $\pm$ 0.0	12.42	12.45 $\pm$ 0.0	12.42	12.43 $\pm$ 0.01	12.41	12.4 $\pm$ 0.01	12.39	12.39 $\pm$ 0.01
2.81	12.5	12.5	12.5 $\pm$ 0.0	12.49	12.5 $\pm$ 0.0	12.47	12.49 $\pm$ 0.0	12.49	12.5 $\pm$ 0.0	12.46	12.49 $\pm$ 0.01	12.51	12.48 $\pm$ 0.01
1.6	12.5	12.48	12.49 $\pm$ 0.0	12.51	12.48 $\pm$ 0.0	12.43	12.47 $\pm$ 0.0	12.39	12.48 $\pm$ 0.01	12.37	12.43 $\pm$ 0.01	12.45	12.42 $\pm$ 0.01

Table 6: Estimated values of $U_{ff}$ by the MAP and MPM methods. The estimation accuracy can be evaluated by comparing the estimated parameter with the true parameter, and the uncertainty of the estimation can be evaluated by the variation $\chi$.

$\Delta$ \ $\sigma_{noise}$		0.001		0.0028		0.0046		0.0064		0.0082		0.01
	True	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$
12.5	10.5	9.75	13.43 $\pm$ 11.0	12.41	14.91 $\pm$ 10.46	11.83	18.66 $\pm$ 31.3	15.57	14.52 $\pm$ 13.12	8.93	14.71 $\pm$ 15.13	11.71	13.37 $\pm$ 14.95
10.08	10.5	11.13	11.05 $\pm$ 0.13	11.32	11.41 $\pm$ 7.87	10.68	15.17 $\pm$ 8.03	12.34	17.38 $\pm$ 11.87	11.1	18.73 $\pm$ 20.31	13.67	18.93 $\pm$ 22.6
7.66	10.5	10.52	10.52 $\pm$ 0.0	10.6	10.54 $\pm$ 0.01	10.62	10.55 $\pm$ 0.01	10.5	10.56 $\pm$ 0.03	10.64	10.58 $\pm$ 0.05	10.58	10.61 $\pm$ 0.08
5.23	10.5	10.51	10.52 $\pm$ 0.0	10.57	10.56 $\pm$ 0.0	10.59	10.58 $\pm$ 0.01	10.56	10.62 $\pm$ 0.01	10.69	10.66 $\pm$ 0.02	10.71	10.68 $\pm$ 0.03
2.81	10.5	10.5	10.5 $\pm$ 0.0	10.51	10.49 $\pm$ 0.01	10.51	10.53 $\pm$ 0.02	10.47	10.49 $\pm$ 0.04	10.58	10.54 $\pm$ 0.06	10.83	10.51 $\pm$ 0.09
1.6	10.5	10.46	10.48 $\pm$ 0.0	10.47	10.46 $\pm$ 0.01	10.3	10.39 $\pm$ 0.02	10.28	10.38 $\pm$ 0.03	10.09	10.39 $\pm$ 0.05	10.25	10.28 $\pm$ 0.08

Table 7: Estimated values of $V$ by the MAP and MPM methods. The estimation accuracy can be evaluated by comparing the estimated parameter with the true parameter, and the uncertainty of the estimation can be evaluated by the variation $\chi$.

$\Delta$ \ $\sigma_{noise}$		0.001		0.0028		0.0046		0.0064		0.0082		0.01
	True	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$
12.5	0.76	0.77	0.75 $\pm$ 0.02	0.76	0.75 $\pm$ 0.02	0.72	0.75 $\pm$ 0.02	0.75	0.75 $\pm$ 0.01	0.78	0.75 $\pm$ 0.02	0.76	0.75 $\pm$ 0.02
10.08	0.76	0.73	0.73 $\pm$ 0.00	0.72	0.72 $\pm$ 0.01	0.71	0.8 $\pm$ 0.03	0.58	0.8 $\pm$ 0.03	0.64	0.8 $\pm$ 0.03	0.42	0.8 $\pm$ 0.03
7.66	0.76	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.75	0.76 $\pm$ 0.00
5.23	0.76	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.75	0.75 $\pm$ 0.00	0.75	0.75 $\pm$ 0.00
2.81	0.76	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00
1.6	0.76	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00	0.76	0.76 $\pm$ 0.00

Table 8: Estimated values of $\Gamma_{2}$ by the MAP and MPM methods. The estimation accuracy can be evaluated by comparing the estimated parameter with the true parameter, and the uncertainty of the estimation can be evaluated by the variation $\chi$.

$\Delta$ \ $\sigma_{noise}$		0.001		0.0028		0.0046		0.0064		0.0082		0.01
	True	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$
12.5	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.51	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.51 $\pm$ 0.00	0.5	0.51 $\pm$ 0.00
10.08	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.49	0.5 $\pm$ 0.00	0.48	0.49 $\pm$ 0.00	0.49	0.49 $\pm$ 0.00	0.5	0.49 $\pm$ 0.00
7.66	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00
5.23	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.51	0.5 $\pm$ 0.00	0.51	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00
2.81	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.52	0.51 $\pm$ 0.00	0.51	0.51 $\pm$ 0.00	0.52	0.51 $\pm$ 0.00	0.5	0.52 $\pm$ 0.00
1.6	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.51	0.5 $\pm$ 0.00	0.51	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00

Table 9: Estimated values of $\Gamma_{3}$ by the MAP and MPM methods. The estimation accuracy can be evaluated by comparing the estimated parameter with the true parameter, and the uncertainty of the estimation can be evaluated by the variation $\chi$.

$\Delta$ \ $\sigma_{noise}$		0.001		0.0028		0.0046		0.0064		0.0082		0.01
	True	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$	MAP	MPM $\pm\chi$
12.5	0.5	0.23	0.13 $\pm$ 0.21	0.06	0.34 $\pm$ 0.11	0.34	0.63 $\pm$ 0.1	0.95	0.2 $\pm$ 0.18	0.15	0.81 $\pm$ 0.17	0.84	0.69 $\pm$ 0.11
10.08	0.5	0.3	0.32 $\pm$ 0.01	0.21	0.25 $\pm$ 0.1	0.66	0.71 $\pm$ 0.12	0.05	0.93 $\pm$ 0.25	0.05	0.93 $\pm$ 0.25	0.99	0.48 $\pm$ 0.08
7.66	0.5	0.5	0.5 $\pm$ 0.00	0.48	0.5 $\pm$ 0.00	0.49	0.49 $\pm$ 0.00	0.51	0.49 $\pm$ 0.00	0.46	0.49 $\pm$ 0.00	0.47	0.49 $\pm$ 0.01
5.23	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.51	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.51 $\pm$ 0.00	0.51	0.51 $\pm$ 0.00
2.81	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.48	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00
1.6	0.5	0.5	0.5 $\pm$ 0.00	0.5	0.5 $\pm$ 0.00	0.49	0.5 $\pm$ 0.00	0.49	0.5 $\pm$ 0.00	0.5	0.49 $\pm$ 0.00	0.51	0.49 $\pm$ 0.00

Equations

$$\mathcal{H}=\epsilon_{L}\sum_{\nu}a_{L\nu}^{\dagger}a_{L\nu}+\epsilon_{f}^{0}\sum_{\nu}a_{f\nu}^{\dagger}a_{f\nu}+\epsilon_{c}a_{c}^{\dagger}a_{c}+\frac{V}{N_{f}}\sum_{\nu}\left(a_{L\nu}^{\dagger}a_{f\nu}+a_{L\nu}a_{f\nu}^{\dagger}\right)+U_{ff}\sum_{\nu>\nu'}a_{f\nu}^{\dagger}a_{f\nu}a_{f\nu'}^{\dagger}a_{f\nu'}-U_{fc}\sum_{\nu}a_{f\nu}^{\dagger}a_{f\nu}\left(1-a_{c}^{\dagger}a_{c}\right),$$

$$F(\omega;\bm{\vartheta})=\sum_{j=0}^{2}\left|\langle F_{j}|a_{c}|G\rangle\right|^{2}\delta\left(\omega-E_{j}(\bm{\vartheta})+E_{G}(\bm{\vartheta})\right).$$

$$\mathcal{I}(\omega;\bm{\vartheta})=\sum_{j=0}^{2}\left|\langle F_{j}|a_{c}|G\rangle\right|^{2}\frac{\Gamma/\pi}{\left(\omega-\left(E_{j}(\bm{\vartheta})-E_{G}(\bm{\vartheta})\right)\right)^{2}+\Gamma^{2}}+\epsilon,$$

$$H_{2}=\epsilon_{L}\sum_{\mu}a_{L\mu}^{\dagger}a_{L\mu}+\epsilon_{f}^{0}\sum_{\mu}a_{f\mu}^{\dagger}a_{f\mu}+\epsilon_{c}a_{c}^{\dagger}a_{c}+\frac{V}{N_{f}}\sum_{\mu}\left(a_{L\mu}^{\dagger}a_{f\mu}+a_{L\mu}a_{f\mu}^{\dagger}\right)-U_{fc}\sum_{\mu}a_{f\mu}^{\dagger}a_{f\mu}\left(1-a_{c}^{\dagger}a_{c}\right),$$

$$I_{2}(\omega;\bm{\theta}_{2})=\sum_{j=0}^{1}\left|\langle F_{j}|a_{c}|G\rangle\right|^{2}\frac{\Gamma_{j}/\pi}{\left(\omega-\left(E_{j}(\bm{\theta}_{2})-E_{g}(\bm{\theta}_{2})-b\right)\right)^{2}+\Gamma_{j}^{2}}+\epsilon,$$

$$H_{3}=\epsilon_{L}\sum_{\nu}a_{L\nu}^{\dagger}a_{L\nu}+\epsilon_{f}^{0}\sum_{\nu}a_{f\nu}^{\dagger}a_{f\nu}+\epsilon_{c}a_{c}^{\dagger}a_{c}+\frac{V}{N_{f}}\sum_{\nu}\left(a_{L\nu}^{\dagger}a_{f\nu}+a_{L\nu}a_{f\nu}^{\dagger}\right)+U_{ff}\sum_{\nu>\nu'}a_{f\nu}^{\dagger}a_{f\nu}a_{f\nu'}^{\dagger}a_{f\nu'}-U_{fc}\sum_{\nu}a_{f\nu}^{\dagger}a_{f\nu}\left(1-a_{c}^{\dagger}a_{c}\right).$$

$$I_{3}(\omega;\bm{\theta}_{3})=\sum_{j=0}^{2}\left|\langle F_{j}|a_{c}|G\rangle\right|^{2}\frac{\Gamma_{j}/\pi}{\left(\omega-\left(E_{j}(\bm{\theta}_{3})-E_{G}(\bm{\theta}_{3})-b\right)\right)^{2}+\Gamma_{j}^{2}}+\epsilon,$$

$${\rm P}(H_{k}|D)=\frac{{\rm P}(D|H_{k}){\rm P}(H_{k})}{{\rm P}(D)}\propto{\rm P}(D|H_{k}){\rm P}(H_{k}),$$

$${\rm P}(H_{k}|D)\propto{\rm P}(D|H_{k})={\rm P}(\bm{\mathcal{I}}|H_{k})=\int_{-\infty}^{\infty}{\rm P}(\bm{\mathcal{I}}|\bm{\theta}_{k},H_{k}){\rm P}(\bm{\theta}_{k}|H_{k})d\bm{\theta}_{k},$$

$$\begin{aligned}{\rm P}(\bm{\mathcal{I}}|\bm{\theta}_{k},H_{k})&=\prod_{i=1}^{N}{\rm P}(\mathcal{I}(w_{i})|\bm{\theta}_{k},H_{k})\\ &=\left(\frac{1}{2\pi\sigma_{noise}^{2}}\right)^{N/2}\prod_{i=1}^{N}\exp\left[-\frac{1}{2\sigma_{noise}^{2}}\left(\mathcal{I}(w_{i};\bm{\vartheta})-I_{k}(w_{i};\bm{\theta}_{k})\right)^{2}\right]\\ &=\left(\frac{1}{2\pi\sigma_{noise}^{2}}\right)^{N/2}\exp\left\{-\sum_{i=1}^{N}\left[\frac{1}{2\sigma_{noise}^{2}}\left(\mathcal{I}(w_{i};\bm{\vartheta})-I_{k}(w_{i};\bm{\theta}_{k})\right)^{2}\right]\right\}.\end{aligned}$$

$$\begin{aligned}{\rm P}(\bm{\mathcal{I}}|H_{k})&=\left(\frac{1}{2\pi\sigma_{noise}^{2}}\right)^{N/2}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2\sigma_{noise}^{2}}\sum_{i=1}^{N}\left[\mathcal{I}(w_{i};\bm{\vartheta})-I_{k}(w_{i};\bm{\theta}_{k})\right]^{2}\right\}{\rm P}(\bm{\theta}_{k}|H_{k})d\bm{\theta}_{k}\\ &=\left(\frac{1}{2\pi\sigma_{noise}^{2}}\right)^{N/2}\int_{-\infty}^{\infty}\exp\left[-NE(\bm{\theta}_{k})\right]{\rm P}(\bm{\theta}_{k}|H_{k})d\bm{\theta}_{k},\end{aligned}$$

$$E(\bm{\theta}_{k})=\frac{1}{2N\sigma_{noise}^{2}}\sum_{i=1}^{N}\left[\mathcal{I}(w_{i};\bm{\vartheta})-I_{k}(w_{i};\bm{\theta}_{k})\right]^{2}.$$
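As a rough illustration of the error function $E(\bm{\theta}_{k})$ and the Gaussian likelihood it enters, the following Python sketch evaluates both for a spectrum sampled at $N$ points. The function names and the NumPy array layout are assumptions made for this example and do not come from the paper.

```python
import numpy as np


def error_function(observed, model, sigma_noise):
    """E(theta) = sum_i (I_obs_i - I_model_i)^2 / (2 N sigma_noise^2)."""
    observed = np.asarray(observed, dtype=float)
    model = np.asarray(model, dtype=float)
    n = observed.size
    return np.sum((observed - model) ** 2) / (2.0 * n * sigma_noise ** 2)


def log_likelihood(observed, model, sigma_noise):
    """log P(I | theta, H_k) = -(N/2) log(2 pi sigma_noise^2) - N E(theta)."""
    n = np.asarray(observed).size
    prefactor = -0.5 * n * np.log(2.0 * np.pi * sigma_noise ** 2)
    return prefactor - n * error_function(observed, model, sigma_noise)
```

Working with the logarithm of the likelihood avoids underflow when $N$ is large or $\sigma_{noise}$ is small.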

$$F(H_{k})=-\log{\rm P}(\bm{\mathcal{I}}|H_{k}),$$

$$\begin{aligned}F(H_{k})&=-\log\int_{-\infty}^{\infty}\exp\left[-NE(\bm{\theta}_{k})\right]{\rm P}(\bm{\theta}_{k}|H_{k})d\bm{\theta}_{k}\\ &=\int_{0}^{1}\frac{\partial}{\partial\beta}\left\{-\log\left[\int_{-\infty}^{\infty}\exp\left(-\beta NE(\bm{\theta}_{k})\right){\rm P}(\bm{\theta}_{k}|H_{k})d\bm{\theta}_{k}\right]\right\}d\beta\\ &=\int_{0}^{1}\int_{-\infty}^{\infty}NE(\bm{\theta}_{k}){\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}},\beta)d\bm{\theta}_{k}d\beta\\ &=\int_{0}^{1}\left\langle NE(\bm{\theta}_{k})\right\rangle_{{\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}},\beta)}d\beta,\end{aligned}$$

$${\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}},\beta)=\frac{\exp\left[-\beta NE(\bm{\theta}_{k})\right]{\rm P}(\bm{\theta}_{k}|H_{k})}{\int_{-\infty}^{\infty}\exp\left[-\beta NE(\bm{\theta}_{k})\right]{\rm P}(\bm{\theta}_{k}|H_{k})d\bm{\theta}_{k}}.$$

$$F(H_{k})\simeq\sum_{l=0}^{L}\left\langle NE(\bm{\theta}_{k})\right\rangle_{{\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}},\beta_{l})}\Delta\beta_{l},$$
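The identity between the free energy and the integral of $\langle NE\rangle_{\beta}$ over the inverse temperature can be checked numerically on a toy problem. The sketch below is not the paper's procedure: it assumes a parameter that takes only a few discrete values, so that each tempered expectation can be computed exactly instead of being estimated from exchange Monte Carlo samples, and it uses the trapezoidal rule in place of the plain sum over $\Delta\beta_{l}$. All names are hypothetical.

```python
import numpy as np


def free_energy_by_integration(n_e, prior, betas):
    """Integrate <N E>_beta over beta for a discrete parameter grid.

    n_e   : values of N * E(theta) at each grid point
    prior : prior probability of each grid point (sums to 1)
    betas : increasing inverse temperatures from 0 to 1
    """
    n_e = np.asarray(n_e, dtype=float)
    prior = np.asarray(prior, dtype=float)
    betas = np.asarray(betas, dtype=float)
    means = np.empty_like(betas)
    for i, beta in enumerate(betas):
        weights = prior * np.exp(-beta * n_e)  # unnormalized tempered posterior
        means[i] = np.sum(weights * n_e) / np.sum(weights)
    # Trapezoidal rule over the beta grid (assumption: replaces the Riemann sum).
    return float(np.sum(0.5 * (means[1:] + means[:-1]) * np.diff(betas)))


def free_energy_direct(n_e, prior):
    """F = -log sum_theta exp(-N E(theta)) P(theta), evaluated directly."""
    n_e = np.asarray(n_e, dtype=float)
    prior = np.asarray(prior, dtype=float)
    return float(-np.log(np.sum(prior * np.exp(-n_e))))
```

With a fine enough grid of inverse temperatures the two functions agree closely, which is the property the exchange Monte Carlo estimate of $F(H_{k})$ relies on.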

$${\rm P}(\bm{\theta}_{k}^{1},\bm{\theta}_{k}^{2},\dots,\bm{\theta}_{k}^{L}|\bm{\mathcal{I}})=\prod_{l=1}^{L}{\rm P}(\bm{\theta}_{k}^{l}|\bm{\mathcal{I}},\beta_{l}),$$

$$\begin{aligned}r&=\frac{{\rm P}(\bm{\theta}_{k}^{1},\dots,\bm{\theta}_{k}^{l+1},\bm{\theta}_{k}^{l},\dots,\bm{\theta}_{k}^{L}|\bm{\mathcal{I}})}{{\rm P}(\bm{\theta}_{k}^{1},\dots,\bm{\theta}_{k}^{l},\bm{\theta}_{k}^{l+1},\dots,\bm{\theta}_{k}^{L}|\bm{\mathcal{I}})}=\frac{{\rm P}(\bm{\theta}_{k}^{l+1}|\bm{\mathcal{I}},\beta_{l}){\rm P}(\bm{\theta}_{k}^{l}|\bm{\mathcal{I}},\beta_{l+1})}{{\rm P}(\bm{\theta}_{k}^{l}|\bm{\mathcal{I}},\beta_{l}){\rm P}(\bm{\theta}_{k}^{l+1}|\bm{\mathcal{I}},\beta_{l+1})}\\ &=\exp\left\{N\left[\beta_{l+1}-\beta_{l}\right]\left[E(\bm{\theta}_{k}^{l+1})-E(\bm{\theta}_{k}^{l})\right]\right\}.\end{aligned}$$
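The exchange ratio $r$ can be turned into a replica-swap step as sketched below. The sketch assumes the usual Metropolis rule, accepting the swap with probability $\min(1,r)$, which this excerpt does not state explicitly. The function names, the use of Python lists for the replicas, and the 0-based replica index are likewise assumptions for illustration.

```python
import numpy as np


def exchange_log_ratio(n, beta_l, beta_next, e_l, e_next):
    """log r = N (beta_{l+1} - beta_l) (E(theta^{l+1}) - E(theta^l))."""
    return n * (beta_next - beta_l) * (e_next - e_l)


def try_exchange(thetas, energies, betas, l, n, rng):
    """Attempt to swap replicas l and l+1 in place; return True if accepted.

    thetas and energies are Python lists with one entry per replica.
    """
    log_r = exchange_log_ratio(n, betas[l], betas[l + 1],
                               energies[l], energies[l + 1])
    # Metropolis acceptance with probability min(1, r), evaluated in log space.
    if log_r >= 0.0 or rng.random() < np.exp(log_r):
        thetas[l], thetas[l + 1] = thetas[l + 1], thetas[l]
        energies[l], energies[l + 1] = energies[l + 1], energies[l]
        return True
    return False
```

A swap that moves the lower-error state to the larger inverse temperature has $r\geq 1$ and is always accepted under this rule.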

$$\beta_{l}=\begin{cases}0.0&(l=1)\\ \gamma^{l-L}&(l\neq 1)\end{cases},$$
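A minimal sketch of this inverse-temperature ladder follows. The function name and the argument name `num_replicas` (standing for $L$) are assumptions for illustration.

```python
import numpy as np


def inverse_temperatures(num_replicas, gamma):
    """Return beta_1..beta_L with beta_1 = 0 and beta_l = gamma**(l - L) otherwise."""
    l = np.arange(1, num_replicas + 1)
    betas = float(gamma) ** (l - num_replicas)  # geometric ladder ending at beta_L = 1
    betas[0] = 0.0                              # the first replica samples the prior
    return betas
```

For example, `inverse_temperatures(4, 2.0)` gives the ladder 0, 0.25, 0.5, 1. For $\gamma>1$ the ladder is increasing and its last entry is the untempered posterior.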

$${\rm P}(H_{3}|D)=\frac{\exp\left[-F(H_{3})\right]}{\exp\left[-F(H_{2})\right]+\exp\left[-F(H_{3})\right]}.$$
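The posterior probability of each candidate Hamiltonian follows from the free energies by a normalized exponential. The sketch below assumes equal prior probabilities for the candidates, as the formula implies, and subtracts the smallest free energy before exponentiating for numerical stability. The function name is hypothetical.

```python
import numpy as np


def posterior_model_probabilities(free_energies):
    """P(H_k | D) proportional to exp(-F(H_k)), assuming equal model priors."""
    f = np.asarray(free_energies, dtype=float)
    weights = np.exp(-(f - f.min()))  # shift by min(F) so the largest weight is 1
    return weights / weights.sum()
```

Calling it with the pair of free energies $[F(H_{2}),F(H_{3})]$ returns $[{\rm P}(H_{2}|D),{\rm P}(H_{3}|D)]$.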

$${\rm P}(\theta_{k}^{m}|\bm{\mathcal{I}})=\int_{-\infty}^{\infty}{\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}}){\rm P}(\bm{\theta}_{k})d\bm{\theta}_{k}^{\neg m}.$$

$$\bm{\theta}_{k}^{\mathrm{MAP}}=\operatorname*{arg\,max}_{\bm{\theta}_{k}}{\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}}).$$

$$\theta_{k}^{m\,\mathrm{MPM}}=\operatorname*{arg\,max}_{\theta_{k}^{m}}{\rm P}(\theta_{k}^{m}|\bm{\mathcal{I}})=\operatorname*{arg\,max}_{\theta_{k}^{m}}\int_{-\infty}^{\infty}{\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}}){\rm P}(\bm{\theta}_{k})d\bm{\theta}_{k}^{\neg m},$$
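Given posterior samples, the MAP and MPM estimates can be approximated as in the following sketch. It assumes the samples are stored as a NumPy array of shape (T, d) and that the marginal posterior of one parameter is approximated by a histogram. The binning and the function names are choices made for this example; the excerpt does not specify an estimator.

```python
import numpy as np


def map_estimate(samples, log_posterior):
    """Return the sample with the highest (unnormalized) log posterior.

    samples       : array of shape (T, d)
    log_posterior : array of shape (T,)
    """
    return samples[np.argmax(log_posterior)]


def mpm_estimate(samples, m, bins=50):
    """Return the center of the most populated histogram bin of parameter m."""
    counts, edges = np.histogram(samples[:, m], bins=bins)
    i = np.argmax(counts)
    return 0.5 * (edges[i] + edges[i + 1])
```

The MAP estimate maximizes the joint posterior over all parameters, whereas the MPM estimate maximizes the marginal of a single parameter, so the two can differ when the posterior is broad or multimodal.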

$$\chi^{m}=\frac{1}{T}\sum_{t=1}^{T}\left(\theta_{k}^{m}(t)-\theta_{k}^{m\,\mathrm{MPM}}\right)^{2}.$$


Taxonomy

Topics: Electron and X-Ray Spectroscopy Techniques · Advanced Chemical Physics Studies · Machine Learning in Materials Science

Full text

Bayesian Hamiltonian Selection in X-ray Photoelectron Spectroscopy

Yoh-ichi Mototake (1), Masaichiro Mizumaki (2), Ichiro Akai (3, 4), Masato Okada (1, 5)

(1) Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan

(2) Japan Synchrotron Radiation Research Institute (JASRI), 1-1-1, Kouto, Sayo-cho, Sayo-gun, Hyogo 679-5198, Japan

(3) Institute of Pulsed Power Science, Kumamoto University, Kumamoto 860-8555, Japan

(4) Kyushu Synchrotron Light Research Center, Tosu, Saga 841-0005, Japan

(5) Research and Services Division of Materials Data and Integrated Systems, National Institute for Materials Science, Sengen, Tsukuba, Ibaraki 305-0047, Japan

Abstract

Core-level X-ray photoelectron spectroscopy (XPS) is a useful measurement technique for investigating the electronic states of a strongly correlated electron system. Usually, to extract physical information of a target object from a core-level XPS spectrum, we need to set an effective Hamiltonian by physical consideration so as to express complicated electron-to-electron interactions in the transition of core-level XPS, and manually tune the physical parameters of the effective Hamiltonian so as to represent the XPS spectrum. Then, we can extract physical information from the tuned parameters. In this paper, we propose an automated method for analyzing core-level XPS spectra based on the Bayesian model selection framework, which selects the effective Hamiltonian and estimates its parameters automatically. The Bayesian model selection, which often has a large computational cost, was carried out by the exchange Monte Carlo sampling method. By applying our proposed method to the 3 $d$ core-level XPS spectra of Ce and La compounds, we confirmed that our proposed method selected an effective Hamiltonian and estimated its parameters appropriately; these results were consistent with conventional knowledge obtained from physical studies. Moreover, using our proposed method, we can also evaluate the uncertainty of its estimation values and clarify why the effective Hamiltonian was selected. Such information is difficult to obtain by the conventional analysis method.


I Introduction

Core-level X-ray photoelectron spectroscopy (XPS) is a useful means of investigating the electronic state of a strongly correlated electron system [kanamori87]. The XPS spectral distribution is reproduced from its state transition probability, which is calculated on the basis of an effective Hamiltonian that describes the complex electron-electron interactions in the XPS transition. The parameters of the effective Hamiltonian are the physical parameters of the measurement target. Therefore, by adjusting the parameters of the effective Hamiltonian so as to reproduce the measured XPS spectrum, the physical parameters of the target material are estimated as the tuned parameters [Kotani74; Kotani85; Kotani87; Groot08]. Normally, this parameter adjustment is performed manually, and the effective Hamiltonian, which is the premise of the analysis, is chosen by researchers on the basis of physical consideration of the XPS transition process [Kotani74; Kotani85; Kotani87; Groot08].

There is an information science methodology called spectral deconvolution, which regresses a spectrum as a linear sum of unimodal basis functions such as Gaussian functions. Nagata et al. [Nagata12] introduced Bayesian inference to spectral deconvolution using the exchange Monte Carlo method [Hukushima96], which is an efficient sampling method. Bayesian spectral deconvolution has made quantitative selection of the spectral model, such as the selection of the number of peaks, possible. It has since been developed further: the efficient estimation of noise intensity [tokuda16], the analysis of time-series spectra [murata16], and a fast Bayesian spectral deconvolution algorithm applicable to high-dimensional spectra [mototake18] have been realized. Many studies have used Bayesian spectral deconvolution to analyze spectra measured in various scientific fields [kasai16; iwamitsu16; hong16]. Normally, spectral deconvolution estimates the parameters of the unimodal basis functions, such as their positions, intensities, and variances. Therefore, if the parameters of the physical model do not correspond directly to the parameters of the unimodal basis functions, the physical parameters must be estimated indirectly. In the analysis of core-level XPS spectra, the parameters of the effective Hamiltonian likewise do not correspond directly to the peak positions or peak intensities. To realize the direct estimation of physical parameters, Bayesian spectral deconvolution methods that build an internal model into the spectrum model have been developed [kasai16; murata16]. Here, the internal model is defined as a model whose parameters are the physical parameters themselves, unlike the parameters of the unimodal basis functions such as the mean, intensity, and variance. For instance, Kasai et al. used the relationship between the protein species and the peak intensity ratio as an internal model and estimated the protein species directly [kasai16]. Also, Murata et al. used a probability differential equation representing the time evolution of the peak positions of the spectrum and estimated the parameters of the latent dynamics directly [murata16].

In this study, we aim to achieve the automatic selection of an effective Hamiltonian and the estimation of its parameters on a statistical footing. By incorporating the effective Hamiltonian of core-level XPS into the Bayesian spectral deconvolution method as an internal model, we propose a Bayesian spectral deconvolution method that enables the automatic analysis of core-level XPS spectra. Our proposed method is not merely an automated spectral analysis method for estimating the parameters and selecting a model. In Bayesian inference, a physical parameter is treated as a random variable. The physical parameter is therefore estimated as a probability distribution over the values that it can take, which is called the posterior distribution. As a result, we can estimate the parameter and, at the same time, determine its estimation accuracy. This makes it possible to obtain the information needed to plan measurements, such as the number of measurements and the measurement method, so as to satisfy a required estimation accuracy. In addition, by analyzing the shape of the distribution, we can discuss the nature of the effective model. For example, we can discuss the uncertainty of the model parameters when fitting the model to the data, which provides information such as the relationship between the expressive ability of the model and the complexity of the data. It is difficult to obtain this information from simple parameter fitting. Thus, our proposed method can extract information that is useful for physical discussion from XPS spectra.

In the current methods of analyzing core-level XPS spectra, researchers use effective Hamiltonians such as the cluster model [groot90] and the impurity Anderson model [gunnarsson83; jo88]. The cluster model contains the whole process of core-level XPS spectrum generation, such as the spin-orbit interaction, multiplet effect, and crystal field effect. In the impurity Anderson model, the effect of the conduction band structure is considered in the hybridization interaction between $4f$ electrons and conduction electrons. In contrast, as the internal model, our proposed method adopts two of the simplest effective Hamiltonians that can represent the XPS spectra of La2O3 and CeO2. The analysis of core-level XPS spectra started from these effective Hamiltonians [Kotani74; Kotani85]. Since these effective Hamiltonians omit many factors, such as the effects of the spin-orbit interaction, multiplet, and crystal field, they cannot be applied to the general core-level XPS spectra of rare-earth insulator compounds. However, the framework of the proposed method does not limit the internal model to these effective Hamiltonians. Replacing the effective Hamiltonian with an arbitrary one simply increases the number of parameters to be estimated and does not change the principle of the framework. This suggests that our proposed method is applicable to general core-level XPS spectra.

Finally, we applied the proposed method to emulated measurement data of La2O3 and CeO2 and their intermediate electron states. As a result, it was confirmed that the two effective Hamiltonians that are also regarded as the effective Hamiltonians in conventional physical studies [Kotani74; Kotani85] were selected for the spectrum data of La2O3 and CeO2. Furthermore, the analysis of the posterior distribution showed that the effective Hamiltonian of CeO2 has too many degrees of freedom to express the spectrum data of La2O3. In addition, we applied our proposed method to spectra that are intermediate states of a seamless transition from the CeO2 spectrum (three peaks) to the La2O3 spectrum (two peaks). As a result, it was confirmed that the proposed method selects an effective Hamiltonian on the basis of not only the number of peaks but also other information contained in the effective Hamiltonian. This cannot be realized by existing spectral deconvolution methods applicable to XPS spectra, which select the model on the basis of the number of peaks [Nagata12; tokuda16].

In this way, it was shown that the proposed method can be used to select an effective Hamiltonian that is consistent with physical knowledge. It was also shown that, unlike simple parameter fitting, the proposed method provides the knowledge necessary for physical discussion.

II Generative Model of Spectrum

In this study, we analyzed spectra that were simulated on the basis of an effective model applicable to emulating all types of simplified $4f$-electron-derived $3d$ core-level XPS spectra of rare-earth insulating compounds. We refer to this effective model as the generative model. The generative model was proposed by Kotani et al. [Kotani85]. It employs one of the simplest cluster models as the effective Hamiltonian, in which the spin-orbit interaction, the multiplet effect, and the crystal field effect are ignored. Moreover, the hybridization between $4f$ electrons and conduction electrons is simplified by assuming that there is only one ligand state. Note that the generative model contains two effective models, which we later apply as the internal models of the spectral deconvolution method. We refer to these effective models as recognition models; they are explained later.

There are three eigenstates of $4f$ electrons: $|f^{0}\rangle$, $|f^{1}\rangle$, and $|f^{2}\rangle$. In the $|f^{0}\rangle$ state, there is no electron in the $4f$ orbital, whereas there is one electron in the $|f^{1}\rangle$ state and two in the $|f^{2}\rangle$ state. In such a state space of $4f$ electrons, the effective Hamiltonian of the generative model is given as

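$$\mathcal{H}=\epsilon_{L}\sum_{\nu}a_{L\nu}^{\dagger}a_{L\nu}+\epsilon_{f}^{0}\sum_{\nu}a_{f\nu}^{\dagger}a_{f\nu}+\epsilon_{c}a_{c}^{\dagger}a_{c}+\frac{V}{N_{f}}\sum_{\nu}\left(a_{L\nu}^{\dagger}a_{f\nu}+a_{L\nu}a_{f\nu}^{\dagger}\right)+U_{ff}\sum_{\nu>\nu'}a_{f\nu}^{\dagger}a_{f\nu}a_{f\nu'}^{\dagger}a_{f\nu'}-U_{fc}\sum_{\nu}a_{f\nu}^{\dagger}a_{f\nu}\left(1-a_{c}^{\dagger}a_{c}\right),$$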

where $\epsilon_{L}$, $\epsilon^{0}_{f}$, and $\epsilon_{c}$ are the energies of the conduction electrons of $4f$ rare-earth metals ($5d$ and $6s$ electrons), the $4f$ electron, and the core electron, respectively. The index $\nu$ $(\nu=1,\dots,N_{f};\ N_{f}=14)$ represents the quantum number of the spin and $f$-orbital. $V$, $U_{ff}$, and $-U_{fc}$ are the energies of the hybridization interaction between $4f$ electrons and conduction electrons, the Coulomb interaction between $4f$ electrons, and the core-hole Coulomb potential for $4f$ electrons, respectively. In both the initial and final states, we set the energy level of $|f^{0}\rangle$ to 0 as the reference energy level. Then, the number of parameters of the effective Hamiltonian $\mathcal{H}$ is reduced to five: $\Delta(=\epsilon^{0}_{f}-\epsilon_{L})$, $V$, $U_{ff}$, $U_{fc}$, and $\Gamma$. For later use, we define the parameter set of the effective Hamiltonian $\mathcal{H}$ as $\bm{\vartheta}=\{\Delta,V,U_{ff},U_{fc},\Gamma\}$.

The conceptual diagram of the $4f$ electron state from the initial state to the final state is shown in Fig. 1. The initial state is defined as the state before X-ray irradiation, and the final state is defined as the state after X-ray irradiation, in which a core hole has been generated. Furthermore, we define the initial eigenstate of the minimum energy $E_{G}(\bm{\vartheta})$ as $|G\rangle$ and the final eigenstates of the three energy levels $E_{j}(\bm{\vartheta})$ as $|F_{j}\rangle$ $(j=0,1,2)$ (Fig. 1). Using Fermi’s golden rule, we define the transition probability from the initial ground state to the final states as follows:

[TABLE]

By convolving $F(\omega;\bm{\vartheta})$ with the Lorentz function and adding the Gaussian noise $\epsilon$, we obtain the intensity distribution of the spectrum as

[TABLE]

where $\Gamma$ is one-half the width of the Lorentz function. We define the data generated following $\mathcal{I}(\omega;\bm{\vartheta})$ as “emulated measurement data”. Because the generative model is uniquely defined by the effective Hamiltonian $\mathcal{H}$ , we label this model as $\mathcal{H}$ .

Following de Groot and Kotani [Groot08], the parameters used to emulate the XPS spectra of La2O3 and CeO2 are set as listed in Table 1. The data generated on the basis of Table 1 are shown in Fig. 2. The two-peak spectrum in Fig. 2(a) corresponds to La2O3 and the three-peak spectrum in Fig. 2(b) corresponds to CeO2.

With the parameters of the two-peak spectrum, the intensity of the peak at the highest energy $E_{2}$ becomes almost zero. This is caused by the decreased interaction between the $|f^{2}\rangle$ state and the $|f^{0}\rangle$ state, originating from the increased energy of the $|f^{2}\rangle$ state. This suggests that $\Delta$ is essential for changing the number of peaks, because $\Delta$ controls the energy of the initial and final $|f^{2}\rangle$ states. In fact, the most significant difference between the parameters of La2O3 and CeO2 is in $\Delta$ (Table 1). Thus, when we discuss the properties of the La2O3 and CeO2 spectra, we can ignore the differences in the parameters other than $\Delta$.
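The generation of an emulated spectrum described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' code: the matrix representation in the basis ($|f^{0}\rangle$, $|f^{1}\rangle$, $|f^{2}\rangle$), in particular the hybridization elements $\sqrt{N_{f}}V$ and $\sqrt{2(N_{f}-1)}V$ and the diagonal energies, is the form commonly used for this kind of cluster model and is assumed here because the Hamiltonian itself is not reproduced in the text. The function names are hypothetical.

```python
import numpy as np

N_F = 14  # spin-orbital degeneracy of the 4f shell (N_f in the text)


def cluster_matrices(delta, v, u_ff, u_fc, n_f=N_F):
    """Initial- and final-state matrices in the basis (f^0, f^1, f^2).

    ASSUMPTION: energies are measured from |f^0>, and the hybridization
    elements are sqrt(n_f)*v and sqrt(2*(n_f-1))*v. This is a common
    convention for the cluster model, not a form quoted from the text.
    """
    v01 = np.sqrt(n_f) * v
    v12 = np.sqrt(2.0 * (n_f - 1)) * v

    def build(d):
        return np.array([[0.0, v01, 0.0],
                         [v01, d, v12],
                         [0.0, v12, 2.0 * d + u_ff]])

    # The core hole lowers the energy of each 4f electron by u_fc.
    return build(delta), build(delta - u_fc)


def line_spectrum(delta, v, u_ff, u_fc):
    """Peak positions E_j - E_G and intensities |<F_j|G>|^2."""
    h_init, h_final = cluster_matrices(delta, v, u_ff, u_fc)
    e_init, vec_init = np.linalg.eigh(h_init)     # eigenvalues ascending
    ground = vec_init[:, 0]                       # |G>, lowest initial state
    e_final, vec_final = np.linalg.eigh(h_final)  # columns are |F_j>
    intensities = (vec_final.T @ ground) ** 2
    return e_final - e_init[0], intensities


def emulate_spectrum(omega, theta, sigma_noise=0.0, rng=None):
    """Lorentz-broadened spectrum with optional Gaussian noise."""
    delta, v, u_ff, u_fc, gamma = theta
    pos, inten = line_spectrum(delta, v, u_ff, u_fc)
    lorentz = (gamma / np.pi) / ((omega[:, None] - pos[None, :]) ** 2 + gamma ** 2)
    spectrum = lorentz @ inten
    if sigma_noise > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        spectrum = spectrum + rng.normal(0.0, sigma_noise, size=omega.shape)
    return spectrum
```

Calling `line_spectrum(12.5, 0.57, 10.5, 12.7)` with the La2O3 values of Table 1 gives three lines whose intensities sum to one; under the assumed matrix elements, the line at the highest energy carries almost no weight, in line with the two-peak appearance discussed above.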

III Recognition Model of Emulated Spectrum

In this section, we describe two effective models for recognizing the emulated measurement data generated by the generative model $\mathcal{H}$: the two-state Hamiltonian model and the three-state Hamiltonian model. Hereinafter, we refer to these effective models as the recognition models.

III.1 Two-state Hamiltonian model

Kotani and Toyozawa [Kotani74] proposed a two-state Hamiltonian model as an effective model of La2O3 XPS spectra. The two-state Hamiltonian model is defined using an effective Hamiltonian on two eigenstates of $4f$ electrons, $|f^{0}\rangle$ and $|f^{1}\rangle$. The effective Hamiltonian is given as

[TABLE]

where, as in the generative model Hamiltonian $\mathcal{H}$, $\epsilon_{L}$, $\epsilon^{0}_{f}$, and $\epsilon_{c}$ are the energies of the conduction electron of $4f$ rare-earth metals ($5d$, $6s$ electrons), the $4f$ electron, and the core electron, respectively. The index $\nu$ $(\nu=1,\ldots,N_{f};\ N_{f}=14)$ represents the quantum number of the spin and $f$ orbital. $V$ and $-U_{fc}$ are the energies of the hybridization interaction and the core-hole Coulomb potential for the $4f$ electrons, respectively.

We define the initial eigenstate of the minimum energy $E_{G}(\bm{\vartheta})$ as $|G\rangle$ and the final eigenstates of the two energy levels $E_{j}(\bm{\vartheta})$ as $|F_{j}\rangle$ $(j=0,1)$. In the two-state Hamiltonian model, the initial eigenstate $|G\rangle$ is set to be equal to the $4f$ electron eigenstate $|f^{0}\rangle$. By using Fermi’s golden rule, convolving the transition probability with the Lorentz function, and adding the Gaussian noise $\epsilon$, we obtain the spectral distribution

[TABLE]

where, as in the generative model $\mathcal{H}$ , we set the energy level of the initial and final states $|f^{0}\rangle{}$ to 0 as the standard energy level. Thus, the number of parameters of the two-state Hamiltonian model is five: $\Delta^{\prime}(=\epsilon^{0}_{f}-\epsilon_{L}-U_{fc}),V,\Gamma_{1},\Gamma_{2}$ , and $b$ . The energy shift parameter $b$ is added to the model to compensate for the difference in the standard energy level between the models. The energy shift parameter is also added to the model in the conventional analysis of the XPS spectra. Because the two-state Hamiltonian model is uniquely defined by the effective Hamiltonian $H_{2}$ , we label this model as $H_{2}$ .

It was reported that the XPS spectra of La2O3 can be reproduced by the two-state Hamiltonian model $H_{2}$ [Kotani74]. As we mentioned, the emulated measurement data of the La2O3 XPS spectrum are generated on the basis of the generative model $\mathcal{H}$, which is more complex than the two-state Hamiltonian model $H_{2}$. One purpose of this study is to confirm that the proposed method reproduces this physical knowledge, that is, that it selects $H_{2}$ for the La2O3 spectrum.
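As a worked illustration of why two states give two peaks, the final-state problem of the two-state model can be solved in closed form. The block below is a sketch under an assumption: the hybridization enters the $2\times 2$ final-state matrix through a single effective off-diagonal element, written here as $V_{\mathrm{eff}}$ (for example $V_{\mathrm{eff}}=\sqrt{N_{f}}V$); this symbol is not used in the original text. The diagonal entries follow from setting the energy of $|f^{0}\rangle$ to 0 and from the definition of $\Delta^{\prime}$, and the energy shift $b$ and the Lorentz broadening are omitted.

```latex
% Final-state matrix in the basis (|f^0>, |f^1>); V_eff is an assumed symbol.
H_{2}^{\mathrm{final}} =
\begin{pmatrix} 0 & V_{\mathrm{eff}} \\ V_{\mathrm{eff}} & \Delta^{\prime} \end{pmatrix},
\qquad
E_{\pm} = \frac{\Delta^{\prime}}{2} \pm \frac{1}{2}\sqrt{\Delta^{\prime 2} + 4V_{\mathrm{eff}}^{2}},

% Peak intensities for the initial state |G> = |f^0>:
|\langle F_{\pm}|f^{0}\rangle|^{2}
  = \frac{1}{2}\left(1 \mp \frac{\Delta^{\prime}}{\sqrt{\Delta^{\prime 2} + 4V_{\mathrm{eff}}^{2}}}\right),
\qquad
|\langle F_{+}|f^{0}\rangle|^{2} + |\langle F_{-}|f^{0}\rangle|^{2} = 1 .
```

The two intensities become nearly equal when $|\Delta^{\prime}|\ll V_{\mathrm{eff}}$. With the La2O3 values of Table 1, $\Delta-U_{fc}=12.5-12.7=-0.2$, so this regime would apply whenever the effective hybridization is of order one or larger.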

III.2 Three-state Hamiltonian model

Kotani et al. [Kotani85] proposed a three-state Hamiltonian model as an effective model of CeO2 XPS spectra. The effective Hamiltonian of the three-state Hamiltonian model is defined as

[TABLE]

Then, as in the previously described model, the XPS spectral model is derived as

[TABLE]

where $|G\rangle$ is the initial eigenstate of the minimum energy $E_{G}(\bm{\vartheta})$, and $|F_{j}\rangle$ $(j=0,1,2)$ are the final eigenstates of the three energy levels $E_{j}(\bm{\vartheta})$. We set the energy level of the initial and final states $|f^{0}\rangle$ to 0 as the standard energy level. Thus, the number of parameters of the three-state Hamiltonian model is eight: $\Delta(=\epsilon^{0}_{f}-\epsilon_{L})$, $V$, $U_{ff}$, $U_{fc}$, $\Gamma_{1}$, $\Gamma_{2}$, $\Gamma_{3}$, and $b$. The three-state Hamiltonian model is the same as the generative model except for the energy shift $b$ and the additional degrees of freedom of $\Gamma$.

As we mentioned earlier, it was reported that the XPS spectra of CeO2 can be reproduced by the three-state Hamiltonian model $H_{3}$ [Kotani85]. Because the three-state Hamiltonian model is uniquely defined by the effective Hamiltonian $H_{3}$, we label this model as $H_{3}$.

IV Method

IV.1 Bayesian model selection

We evaluate the recognition models $H_{2}$ and $H_{3}$ in terms of their capability to represent the spectrum data $\bm{D}=\{\bm{w},\bm{\mathcal{I}}\}=\{(w_{1},w_{2},\cdots,w_{N}),(\mathcal{I}(w_{1};\bm{\vartheta}),\mathcal{I}(w_{2};\bm{\vartheta}),\cdots,\mathcal{I}(w_{N};\bm{\vartheta}))\}$ generated by the generative model $\mathcal{H}$. The likelihood of the recognition model $H_{k}$ $(k=2,3)$ for the dataset $\bm{D}$, that is, its posterior probability ${\rm P}(H_{k}|\bm{D})$, is defined as

[TABLE]

where ${\rm P}(\bm{D})$ is a normalization constant. In this study, we assume that there is no prior knowledge about the likelihood of the model. Thus, we set the prior probability ${\rm P}(H_{k})$ as a uniform distribution; in this study, it is equal to $\frac{1}{2}$ . We also assume that $\bm{w}$ in the dataset $\bm{D}=\{\bm{w},\bm{\mathcal{I}}\}$ is given deterministically, that is, non-probabilistically. Then, the likelihood of the model is transformed as

[TABLE]

where $\bm{\theta}_{k}$ is the parameter set of the recognition model $H_{k}$ described in Sect. III.

The conditional probability ${\rm P}({\bm{\mathcal{I}}}|\bm{\theta}_{k},H_{k})$ of Eq. (9) is a stochastic generative model of the recognition model $H_{k}$ . When the additive noise $\epsilon$ of the XPS spectra is given as an independent and identically distributed Gaussian with average 0 and standard deviation $\sigma_{noise}$ ,

[TABLE]

The probability ${\rm P}(\bm{\theta}_{k}|H_{k})$ in Eq. (9) represents the prior knowledge about the model parameters $\bm{\theta}_{k}$ as a probability distribution. By substituting Eq. (10) into Eq. (9), we obtain the following:

[TABLE]

where

[TABLE]

The probability ${\rm P}(\bm{\mathcal{I}}|H_{k})$ is often referred to as the marginal likelihood and, under the uniform prior ${\rm P}(H_{k})$, is proportional to the likelihood of the recognition model $H_{k}$. Its negative logarithm,

[TABLE]

is often referred to as the Bayesian free energy ( $FE$ ). The effective model $H_{k}$ with the smallest $FE$ value represents the best model.
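Because the free energy is the negative logarithm of the marginal likelihood, a set of $FE$ values can be converted into model probabilities by Bayes' theorem. The following Python sketch shows this conversion; the function name and the example values are hypothetical, and the maximum is subtracted before exponentiation only for numerical stability.

```python
import math


def model_posteriors(free_energies, priors=None):
    """Posterior P(H_k | D) from free energies FE_k = -log P(I | H_k).

    A uniform prior over the models is used when `priors` is None.
    """
    k = len(free_energies)
    priors = [1.0 / k] * k if priors is None else list(priors)
    # log of the unnormalized posterior: log P(H_k) - FE_k
    log_w = [math.log(p) - fe for p, fe in zip(priors, free_energies)]
    shift = max(log_w)  # avoids underflow for large free energies
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]
```

For two models with a uniform prior this reduces to ${\rm P}(H_{3}|\bm{D}) = 1/\{1+\exp[FE(H_{3})-FE(H_{2})]\}$, so the model with the smaller $FE$ has a probability above 0.5.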

IV.2 Exchange Monte Carlo method

To obtain the value of $FE$, we need to execute the integration in Eq. (11). However, it is difficult to execute the integration analytically owing to the complicated relationship between $\bm{\mathcal{I}}$ and $\bm{\theta}_{k}$. We overcame this difficulty by numerical integration using the exchange Monte Carlo method [Hukushima96].

Markov chain Monte Carlo (MCMC) methods [metropolis53] are efficient for sampling from a probability distribution in a high-dimensional space, such as that of $\bm{\theta}_{k}$. To apply an MCMC method to an integration, we need to transform the integration into a mean value calculation. To apply an MCMC method to the calculation of $FE$, we introduce an auxiliary variable $\beta$ and transform Eq. (13) into

[TABLE]

where $\langle\cdot\rangle$ represents an average and

[TABLE]

When $NE(\bm{\theta}_{k})$ is regarded as an energy, this equation suggests that ${\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}},\beta)$ corresponds to the Boltzmann distribution in statistical physics. In the same way, the auxiliary variable $\beta$ corresponds to the inverse temperature. Equation (14) is approximated by a piecewise quadrature,

[TABLE]

where $\beta_{l}$ is given as a sequence of inverse temperatures $0=\beta_{1}<\beta_{2}<\cdots<\beta_{L}=1$ obtained by discretizing the interval from $\beta=0$ to $\beta=1$ at $L$ points, and $\langle NE(\bm{\theta}_{k})\rangle_{{\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}},\beta_{l})}$ is obtained by MCMC sampling performed independently at each $\beta_{l}$. However, MCMC sampling is often trapped at local minima.
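The idea of obtaining a free energy from averages over an inverse-temperature ladder can be checked on a toy problem. The sketch below relies on the standard identity $-\log Z(1)=\int_{0}^{1}\langle NE\rangle_{\beta}\,d\beta$, where $Z(\beta)=\int\exp[-\beta NE(\bm{\theta})]{\rm P}(\bm{\theta})d\bm{\theta}$ and $Z(0)=1$. Several things here are assumptions for illustration: the averages are computed by direct summation on a grid instead of MCMC, the trapezoidal rule stands in for whatever quadrature rule the elided equation uses, the toy energy is arbitrary, and any additive constant in the definition of $FE$ is ignored.

```python
import numpy as np


def mean_energy(beta, energy, prior):
    """<NE>_beta under p(theta | beta) proportional to exp(-beta*NE)*prior."""
    log_w = -beta * energy + np.log(prior)
    w = np.exp(log_w - log_w.max())  # stabilized weights on the grid
    return float(np.sum(w * energy) / np.sum(w))


def free_energy_by_quadrature(betas, energy, prior):
    """Trapezoidal estimate of the integral of <NE>_beta over the ladder."""
    means = np.array([mean_energy(b, energy, prior) for b in betas])
    return float(np.sum(0.5 * (means[1:] + means[:-1]) * np.diff(betas)))


# Toy one-parameter problem (hypothetical): uniform prior on [-1, 1]
# and a quadratic energy NE(theta) = theta^2 / (2 * 0.5^2).
theta = np.linspace(-1.0, 1.0, 4001)
prior = np.full_like(theta, 0.5)
energy = theta ** 2 / (2.0 * 0.5 ** 2)
betas = np.linspace(0.0, 1.0, 101)

fe_quadrature = free_energy_by_quadrature(betas, energy, prior)
# Reference value: -log Z(1), evaluated on the same grid.
fe_direct = float(-np.log(np.sum(np.exp(-energy) * prior) / np.sum(prior)))
```

On this smooth toy problem the two values agree closely. For realistic posteriors the averages must come from sampling, which is where the local trapping mentioned above becomes a problem.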

The exchange Monte Carlo (EMC) method is an MCMC algorithm designed to avoid trapping at local minima. This method simulates multiple samplings from multiple densities with different inverse temperatures $\{\beta_{l}\}_{l=1}^{L}$. The EMC takes samples from the joint density

[TABLE]

where the probability density ${\rm P}(\bm{\theta}_{k}^{l}|\bm{\mathcal{I}},\beta_{l})$ is defined in Eq. (15). The EMC algorithm is based on the following updates, in which the joint density ${\rm P}(\bm{\theta}_{k}^{1},\bm{\theta}_{k}^{2}\cdots\bm{\theta}_{k}^{L}|\bm{\mathcal{I}})$ is invariant.

1. Sampling from each density ${\rm P}(\bm{\theta}_{k}^{l}|\bm{\mathcal{I}},\beta_{l})$: Each $\bm{\theta}_{k}^{l}$ is sampled from ${\rm P}(\bm{\theta}_{k}^{l}|\bm{\mathcal{I}},\beta_{l})$ by a conventional MCMC method, such as the Metropolis–Hastings algorithm [hastings70].

2. Exchange process between two densities corresponding to adjacent inverse temperatures: The configurations $\bm{\theta}_{k}^{l}$ and $\bm{\theta}_{k}^{l+1}$, which correspond to adjacent inverse temperatures, are exchanged with the probability $R=\min(1,r)$, where

[TABLE]

Sampling from a distribution with a smaller $\beta$ corresponds to sampling from a distribution with a larger noise intensity; thus, the distribution tends not to have local minima. On the other hand, sampling from a distribution with a larger $\beta$ corresponds to sampling from a distribution with local minima. Hence, sampling from the joint density ${\rm P}(\bm{\theta}_{k}^{1},\bm{\theta}_{k}^{2}\cdots\bm{\theta}_{k}^{L}|\bm{\mathcal{I}})$ allows the samples to escape from local minima and enables the fast convergence of sampling.

Using the sampling result of the $\beta=1$ state, we obtain the posterior density of the parameters ${\rm P}(\bm{\theta}_{k}|\bm{\mathcal{I}})$ [Eq. (15)]. From the posterior density of $\bm{\theta}_{k}$, we can estimate the model parameters $\bm{\theta}_{k}$ of $H_{k}$ and related information, such as the estimation accuracy.
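The two updates of the EMC method can be sketched for a one-parameter toy problem as follows. This is a minimal illustration, not the authors' implementation. The assumptions are: the exchange ratio is the standard replica-exchange form $r=\exp[(\beta_{l+1}-\beta_{l})(NE(\bm{\theta}^{l+1})-NE(\bm{\theta}^{l}))]$, the inverse-temperature ladder takes the form $\beta_{1}=0$, $\beta_{l}=\gamma^{l-L}$ for $l\geq 2$, the prior is uniform on an interval, and a single random-walk Metropolis step is used per replica and sweep. All function names are hypothetical.

```python
import numpy as np


def inverse_temperatures(n_replica, gamma):
    """Ladder with beta_1 = 0 and beta_l = gamma**(l - L) (assumed form)."""
    exponents = np.arange(1, n_replica + 1, dtype=float) - n_replica
    betas = gamma ** exponents
    betas[0] = 0.0
    return betas


def exchange_monte_carlo(energy, lower, upper, betas, n_sweeps, step, rng):
    """Sample theta at every beta from exp(-beta*energy) on [lower, upper].

    `energy` must accept a NumPy array and return one value per element.
    Returns an array of shape (n_sweeps, len(betas)).
    """
    n_rep = len(betas)
    theta = rng.uniform(lower, upper, size=n_rep)
    e = energy(theta)
    samples = np.empty((n_sweeps, n_rep))
    for sweep in range(n_sweeps):
        # Update 1: one Metropolis step in every replica.
        proposal = theta + step * rng.normal(size=n_rep)
        inside = (proposal >= lower) & (proposal <= upper)  # uniform prior
        e_prop = energy(proposal)
        log_accept = -betas * (e_prop - e)
        accept = inside & (np.log(rng.random(n_rep)) < log_accept)
        theta = np.where(accept, proposal, theta)
        e = np.where(accept, e_prop, e)
        # Update 2: exchange between adjacent inverse temperatures,
        # alternating between even and odd pairs.
        for l in range(sweep % 2, n_rep - 1, 2):
            log_r = (betas[l + 1] - betas[l]) * (e[l + 1] - e[l])
            if np.log(rng.random()) < log_r:
                theta[[l, l + 1]] = theta[[l + 1, l]]
                e[[l, l + 1]] = e[[l + 1, l]]
        samples[sweep] = theta
    return samples
```

With a double-well energy such as `lambda t: 8.0 * (t**2 - 1.0)**2`, a single Metropolis chain at $\beta=1$ tends to stay in one well, whereas the last column of the returned array (the $\beta=1$ replica) visits both wells because configurations are passed down from the high-temperature replicas.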

V Results

We applied our proposed method to the emulated measurement data and estimated the likelihood of the effective models $H_{2}$ and $H_{3}$. As described in Sect. III, from physical knowledge, the recognition models $H_{2}$ and $H_{3}$ are expected to be selected for the emulated measurement data of the La2O3 and CeO2 XPS spectra, respectively. The spectrum of CeO2 has a three-peak structure, and that of La2O3 has a two-peak structure. If the selection of an effective model were based only on the peak number, the effective model could also be selected indirectly by the existing spectral deconvolution methods, which have no internal model [Nagata12; tokuda16]. On the other hand, because our proposed method builds an effective model into spectral deconvolution, various information about the peak structure, such as the peak positions or the order of the peak intensities, is expected to be used to select the effective model. To confirm this, we applied our proposed method not only to the spectra of CeO2 and La2O3 but also to spectra that are the intermediate states of a seamless transition from the CeO2 spectrum to the La2O3 spectrum.

As we mentioned in Sect. II, $\Delta$ is an important parameter for controlling the properties of XPS spectra from La2O3 to CeO2. The generating parameters $V$ and $U_{fc}$ of the La2O3 spectrum are also slightly different from those of the CeO2 spectrum (Table 1). However, the emulated measurement data of the La2O3 spectrum generated by replacing the $V$ and $U_{fc}$ values with those of the CeO2 spectrum have almost the same peak positions and peak intensities as those of the La2O3 spectrum [Figs. 3(a)-(1), 3(a)-(2), and 2(a)]. Therefore, we also refer to this parameter-replaced emulated spectrum as the La2O3 spectrum. That is, by shifting $\Delta$ from 1.6 to 12.5 and fixing the other parameters to those of CeO2, we can generate spectra that have intermediate structures between the La2O3 and CeO2 spectra.

The increase in the parameter $\Delta$ from CeO2 to La2O3 induces a decrease in the transition amplitude $\langle F_{max}|a_{c}|G\rangle$ from the initial ground state $|G\rangle$ to the final state $|F_{max}\rangle$ with the maximum eigenenergy $E_{max}$. Because the squared modulus of the transition amplitude, $|\langle F_{max}|a_{c}|G\rangle|^{2}$, is the peak intensity, this decrease transforms the three-peak CeO2 spectrum into a two-peak spectrum. To be precise, the emulated measurement data of La2O3 have three peaks, but the intensity of the peak corresponding to the largest eigenvalue $E_{2}$ is very small. Hence, in this study, we define the peak number of the emulated measurement data as the number of peaks whose intensity is larger than the noise intensity $\sigma_{noise}$.

In this way, we generated emulated measurement data by shifting the parameter $\Delta$ and applied our proposed method to them. More specifically, all parameters except $\Delta$ are the same, $\left[V=0.76,U_{ff}=10.5,U_{fc}=12.5,\Gamma=0.5\right]$, for all emulated measurement data. As mentioned above, the peak number is defined using the noise intensity $\sigma_{noise}$ and the peak intensity. Therefore, we evaluated the effect of not only $\Delta$ but also the noise intensity $\sigma_{noise}$, and generated the emulated measurement data by setting the noise standard deviation to $\sigma_{noise}\in\{0.001,0.0028,0.0046,0.0064,0.0082,0.01\}$. The peak number, peak positions, and peak intensities of the emulated measurement data are listed in Table 2, and some examples of emulated measurement data are shown in Fig. 3(a). In the emulated measurement data of $\Delta=10.08$, if the noise intensity $\sigma_{noise}$ is smaller than 0.0052, the peak number is three, whereas if the noise intensity $\sigma_{noise}$ is larger than 0.0052, the peak number is two (Table 2). In this study, each set of emulated measurement data consists of $N=400$ samples.

To apply Bayesian estimation, we set the prior density ${\rm P}(\bm{\theta}_{k}|H_{k})$ to a uniform distribution over the range listed in Table 3. In the execution of EMC sampling, we adopted the Metropolis–Hastings algorithm [hastings70] to sample at each inverse temperature. The inverse temperatures were determined following the exponential function [Nagata08]:

[TABLE]

where $L=40$ and $\gamma=1.4$. We discarded the first 10,000 steps as burn-in and sampled the next 1,000,000 steps.

On the basis of the obtained Bayesian free energies $F(H_{2})$ and $F(H_{3})$, which correspond to the two-state Hamiltonian model $H_{2}$ and the three-state Hamiltonian model $H_{3}$, respectively, the posterior probability of $H_{3}$, ${\rm P}(H_{3}|\bm{D})$, was calculated as

[TABLE]

If ${\rm P}(H_{3}|\bm{D})>0.5$, then $H_{3}$ is a more plausible model than $H_{2}$. Conversely, if ${\rm P}(H_{3}|\bm{D})<0.5$, then $H_{2}$ is a more plausible model than $H_{3}$. From the phase diagram of ${\rm P}(H_{3}|\bm{D})$ (Fig. 4), the three-state Hamiltonian model $H_{3}$ was selected for the emulated measurement data of $\Delta=1.6$, corresponding to the CeO2 spectra, and the two-state Hamiltonian model $H_{2}$ was selected for the emulated measurement data of $\Delta=12.5$, corresponding to the La2O3 spectra. Furthermore, the proposed method selected the three-state Hamiltonian model $H_{3}$ for all intermediate emulated measurement data between $\Delta=1.6$ and $\Delta=12.5$. This includes the emulated measurement data of $\Delta=10.08$ and $\sigma_{noise}\geq 0.0046$, for which the intensity of the peak at the largest energy was smaller than the noise intensity. For further analysis, we evaluated the ratio of the $FE$s, $F(H_{3})/F(H_{2})$, which corresponds to the difference in likelihoods, ${\rm P}(H_{3}|\bm{D})-{\rm P}(H_{2}|\bm{D})$, in logarithm space. $F(H_{3})/F(H_{2})<1$ means that $H_{3}$ is a better model than $H_{2}$, and $F(H_{3})/F(H_{2})>1$ means that $H_{2}$ is a better model than $H_{3}$. The value of $F(H_{3})/F(H_{2})$ tends to increase gradually as the parameters $\Delta$ and $\sigma_{noise}$ used to generate the data increase [Fig. 4(b)].

To estimate the model parameters and their estimation uncertainty on the basis of the effective model $H_{3}$, we evaluated the posterior density of the parameters ${\rm P}(\bm{\theta}_{3}|\bm{\mathcal{I}})$. The posterior density should be estimated from independent samples. However, the sampling time series of the EMC is correlated in time. Therefore, we extracted samples at intervals separated sufficiently for the correlation coefficient to be 0.5 or lower, and calculated the posterior distribution from them. Furthermore, to visualize a posterior distribution with more than three dimensions, we calculated the following marginal posterior density, in which the parameters $\bm{\theta}_{k}^{\lnot m}$ other than the parameter of interest $\theta_{k}^{m}$ are marginalized out:

[TABLE]

Assuming that $T$ samples of one parameter $m$, $\{\theta_{k}^{m}(t)\}_{t=1}^{T}$, are extracted from the sampling result of the EMC, the marginalized posterior distribution ${\rm P}(\theta_{k}^{m}|\bm{\mathcal{I}})$ can be estimated from them by the kernel density estimation method using Gaussian kernels. We determined the bandwidth of the Gaussian kernels using Scott’s rule [scott15].

From the evaluation of the marginal posterior density for the emulated measurement data with $\Delta$ equal to 7.66 or less, we found a sharp peak in the posterior distributions around the true parameters, which we used to generate the emulated measurement data (Figs. 6–10). From the evaluation of the marginal posterior density for the emulated measurement data with $\Delta=10.08$, which is the boundary at which the estimated model switches from $H_{3}$ to $H_{2}$, we found that the widths of the posterior distributions of $\Delta$, $U_{fc}$, and $U_{ff}$ increased as the noise intensity $\sigma_{noise}$ increased (Figs. 6, 6, and 8). On the other hand, the marginal posterior densities of $\Delta$, $U_{ff}$, and $U_{fc}$ for the emulated measurement data of $\Delta=12.5$ are almost uniform.

In this study, model parameters were estimated from such posterior distributions using the following two methods. The first method was the maximum a posteriori (MAP) method. The MAP method estimates parameters on the basis of the following equation:

[TABLE]

The second method was the maximizer of the posterior marginal (MPM) method. The MPM method estimates parameters on the basis of the following equation:

[TABLE]

where, as with the marginal posterior density, $m$ is the index of a certain parameter $\theta_{k}^{m}$ included in the parameter set $\bm{\theta}_{k}$ . The MPM corresponds to using the maximum marginal posterior density as an estimation value. To evaluate the estimation uncertainty of a parameter, we defined the variation $\chi^{m}$ of sampling data $\{\theta_{k}^{m}(t)\}_{t=1}^{T}$ from the MPM as

[TABLE]
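The post-processing of the EMC samples described above (thinning, kernel density estimation, and MPM estimation) can be sketched as follows. This is an illustration under assumptions rather than the authors' procedure: the thinning interval is taken as the first lag at which the sample autocorrelation drops to 0.5 or below, Scott's rule is used in its one-dimensional form $h=\hat{\sigma}\,T^{-1/5}$, and the variation $\chi^{m}$ is assumed to be the root-mean-square deviation of the samples from the MPM estimate, since its defining equation is not reproduced here. The function names are hypothetical.

```python
import numpy as np


def thin_by_autocorrelation(chain, threshold=0.5):
    """Keep every k-th sample, k = first lag with autocorrelation <= threshold."""
    chain = np.asarray(chain, dtype=float)
    x = chain - chain.mean()
    var = np.mean(x * x)
    lag = 1
    while lag < len(x) - 1 and np.mean(x[:-lag] * x[lag:]) / var > threshold:
        lag += 1
    return chain[::lag]


def gaussian_kde_scott(samples, grid):
    """1-D Gaussian kernel density estimate with Scott's bandwidth."""
    samples = np.asarray(samples, dtype=float)
    bandwidth = samples.std(ddof=1) * len(samples) ** (-1.0 / 5.0)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm


def mpm_and_variation(samples, grid):
    """MPM estimate (mode of the marginal density) and assumed variation chi."""
    samples = np.asarray(samples, dtype=float)
    density = gaussian_kde_scott(samples, grid)
    theta_mpm = grid[np.argmax(density)]
    # ASSUMPTION: chi is the RMS deviation of the samples from the MPM value.
    chi = float(np.sqrt(np.mean((samples - theta_mpm) ** 2)))
    return float(theta_mpm), chi
```

A typical use would be `mpm_and_variation(thin_by_autocorrelation(chain), grid)`, where `chain` is the $\beta=1$ time series of one parameter and `grid` covers its prior range. A sharply peaked marginal gives a small $\chi$, whereas an almost uniform marginal gives a $\chi$ comparable to the width of the prior.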

Here, we focused on the parameters of the effective model $H_{3}$ , which are $\Delta$ , $U_{fc}$ , $U_{ff}$ , $V$ , $\Gamma_{2}$ , and $\Gamma_{3}$ (Tables 4, 5, 6, 7, 8, and 9, respectively).

For the emulated measurement data with $\Delta<10.08$, all parameters were estimated correctly by both the MAP and MPM methods. In greater detail, the gaps between the estimated parameters and the true parameters increased as the noise intensity $\sigma_{noise}$ increased, and the variation $\chi$ also increased. For the emulated measurement data with $\Delta\geq 10.08$, the gaps between the estimated and true parameters were much larger than in the other cases, except for the emulated measurement data with $\Delta=10.08$ and $\sigma_{noise}=0.001$ and 0.0028. We evaluated the uncertainty of parameter estimation from the variation $\chi$ of the marginal posterior density. As a result, at $\Delta=10.08$ and $\sigma_{noise}=0.0046$, a jump of more than one order of magnitude in the variation $\chi$ of $\Delta$, $U_{fc}$, and $U_{ff}$ was observed (Tables 4, 5, and 6). Also, at $\Delta=12.5$, a large $\chi$ was observed regardless of the noise intensity (Tables 4, 5, and 6).

VI Discussion

By applying our proposed method to emulated measurement data, we determined that the two-state Hamiltonian model $H_{2}$ should be applied to the emulated measurement data corresponding to the La2O3 spectra. On the other hand, the three-state Hamiltonian model $H_{3}$ should be applied to the emulated measurement data corresponding to the CeO2 spectra. These results are consistent with those of previous studies [Kotani74; Kotani85]. For the emulated measurement data of $\Delta=10.08$ and $\sigma_{noise}\geq 0.0046$, our proposed method selected the three-state Hamiltonian model $H_{3}$. These spectra, like the La2O3 spectra, are two-peak spectra, as shown in Fig. 3(a)-(2) and Table 2. Here, we consider applying the existing methods to the emulated measurement data of $\Delta=10.08$. The existing Bayesian spectral deconvolution methods that are applicable to the analysis of the core-level $3d$ XPS spectrum have no internal model [Nagata12; tokuda16]. Such existing methods simply select the model whose number of peaks is the same as the number of peaks appearing in the spectra [tokuda_Dthesis16]. Therefore, if the existing methods are applied to the emulated measurement data of $\Delta=10.08$, a model that has a two-peak structure should be selected. Such differences in model selection results between our proposed method and existing Bayesian spectral deconvolution methods arise from whether the internal model, the effective Hamiltonian, is built into the spectral deconvolution model. This suggests that our proposed method should be applied to the analysis of core-level XPS spectra when candidates for the effective Hamiltonian are given.

From the posterior distribution, we can estimate the uncertainties of the estimated parameters. In fact, the analysis of the posterior distribution of the $H_{3}$ model confirmed that the estimation uncertainty, which corresponds to the variation $\chi$, increased as the noise intensity $\sigma_{noise}$ increased. The marginalized posterior distributions, except for those of $\Gamma$ and $V$, at $\Delta=10.08$ and $\sigma_{noise}=0.0046$ were broad and had no sharp peak structures, whereas the marginalized posterior distributions at $\Delta=10.08$ and $\sigma_{noise}=0.0028$ had sharp peak structures. These properties suggest that the estimation uncertainty decreases significantly around $\Delta=10.08$ and $\sigma_{noise}=0.0028$, where the peak number of the spectrum changes. This suggests that, to obtain high estimation accuracy, the noise intensity must be reduced to less than $\sigma_{noise}=0.0028$ [tokuda_Dthesis16]. Through such an analysis of the posterior distribution, it is possible to make a measurement plan, such as the number of measurements and the measurement method, that satisfies the required estimation accuracy [tokuda_Dthesis16]. Such an expansion of the variation $\chi$ of the marginalized posterior distribution is presumed to arise because some parameters become indeterminate as a result of the effective model $H_{3}$ having excessive expressive power. In particular, the fact that the marginalized posterior density began to spread at $\Delta=10.08$ and $\sigma_{noise}=0.0046$, where the peak number changed, suggests that the effective model $H_{3}$ has excessive expressive power for the two-peak spectrum. Information about the estimation accuracy of parameters or the expressive power of the effective model is difficult to obtain by conventional methods of analysis, such as core-level XPS analysis using manual tuning and spectral deconvolution using a simple fitting method.

In this study, we adopted the simplified cluster model as the effective model of core-level $3d$ XPS spectra. This effective model does not take into account the spin-orbit interaction, the multiplet effect, or the crystal field effect. It is generally too simplistic to explain actually measured core-level XPS spectra. However, as long as the spectral intensity model is derived from Fermi’s golden rule, the internal model of our proposed method can easily be replaced with another model. For example, apart from the difficulty related to the increased number of parameters, the effective Hamiltonian used in this study can be easily replaced with a model that takes into account the spin-orbit interaction, the multiplet effect, and the crystal field effect. This capability means that we can apply our proposed method to a wider range of actually observed XPS spectra by replacing the effective model with a cluster model that includes more interactions or with the impurity Anderson model. In the impurity Anderson model, the band structure of the conduction electrons is taken into account. Thus, it is suggested that the framework of the proposed method has wide applicability to actual measurement data of core-level XPS. Also, the analysis performed here on emulated measurement data in the intermediate states can be carried out on actual spectra that have the same kind of spectral structure.

VII Summary

By incorporating the effective Hamiltonian into the stochastic model of spectral deconvolution, we developed a Bayesian spectral deconvolution method for core-level XPS to realize the automatic analysis of core-level XPS spectra. By applying our proposed method to the emulated $3d$ core-level XPS spectra of La2O3 and CeO2, it was confirmed that our proposed method selects effective Hamiltonians that are consistent with the knowledge obtained from conventional physical studies. We also applied our proposed method to spectra that are the intermediate states of a seamless transition from the CeO2 spectrum (three peaks) to the La2O3 spectrum (two peaks). As a result, it was confirmed that the proposed method selects an effective Hamiltonian on the basis of not only the information about the peak number but also other information contained in the effective Hamiltonian. This cannot be realized by existing Bayesian spectral deconvolution methods applicable to XPS spectra, which select the model on the basis of the peak number [Nagata12; tokuda16]. Our proposed method also enables the parameter estimation of the effective model using the posterior distribution of its parameters. Using the MAP or MPM method, we were able to estimate the true parameters of the generative model $\mathcal{H}$ from the posterior distributions. Furthermore, using the posterior distribution, we were able to evaluate the parameter estimation accuracy and obtain information about how well the effective model suits the spectrum. This capability of our proposed method can yield information for scientific discussion, e.g., the detection limit [tokuda_Dthesis16], or for the improvement of the effective Hamiltonian using the observed data. In conventional analysis methods, such as those using manual tuning or a simple fitting technique, such information cannot be obtained. It is also suggested through our discussion that the framework of the proposed method has wide applicability to actual measurement data of core-level XPS. We expect that our proposed method will pave the way for the highly quantitative analysis of core-level XPS spectra.

Acknowledgments This work was supported by the Cross-ministerial Strategic Innovation Promotion Program (SIP), “Structural Materials for Innovation” (Funding agency: JST) and JST CREST (JPMJCR1761, JPMJCR1861).

Bibliography

(1) J. Kanamori and A. Kotani, Proc. 10th Taniguchi Int. Symp., 1987, p. 81.
(2) A. Kotani and Y. Toyozawa, J. Phys. Soc. Jpn. 37, 912 (1974).
(3) A. Kotani, H. Mizuta, T. Jo, and J. C. Parlebas, Solid State Commun. 53, 805 (1985).
(4) A. Kotani, M. Okada, T. Jo, A. Bianconi, A. Marcelli, and J. C. Parlebas, J. Phys. Soc. Jpn. 56, 798 (1987).
(5) F. de Groot and A. Kotani, Core Level Spectroscopy of Solids (CRC Press, London, 2008).
(6) K. Nagata, S. Sugita, and M. Okada, Neural Networks 28, 82 (2012).
(7) K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn. 65, 1604 (1996).
(8) S. Tokuda, K. Nagata, and M. Okada, J. Phys. Soc. Jpn. 86, 2 (2016).