Under-Balcony Acoustic Diagnosis Using FOA-Based Directional Metrics: Early–Late Entropy and Vertical-Energy Discrepancy at 125 Hz, 1 kHz, and 4 kHz
Po-Chun Ting, Yu-Cheng Liu

TL;DR
This paper introduces a 3D acoustic sensing method using first-order Ambisonics to diagnose directional imbalances in under-balcony seating areas of concert halls.
Contribution
A novel FOA-based framework using directional metrics to detect early vertical-reflection deficiencies in shadowed seating zones.
Findings
Under-balcony positions show a consistent negative early–late vertical energy discrepancy across 125 Hz, 1 kHz, and 4 kHz.
Directional entropy analysis reveals stronger early directional constraints in shadowed seats compared to other areas.
Traditional scalar metrics fail to capture directional imbalances, while FOA-based metrics provide clearer insights.
Abstract
Traditional concert-hall evaluations primarily rely on ISO 3382-1 scalar parameters (e.g., C50 and C80), which summarize temporal energy behavior but provide limited insight into the directional composition of early reflections, particularly in geometrically shadowed seating zones. This paper presents a first-order Ambisonics (FOA)-based 3D acoustic sensing framework to diagnose under-balcony directional imbalance, with emphasis on early vertical-reflection deficiency. Scene-based FOA impulse responses (WXYZ) were measured at 11 audience positions (P1–P11) in the National Concert Hall (Taipei) and analyzed using intensity-based direction-of-arrival (DoA) proxies, axis-resolved directional energy build-up, and a distributional descriptor based on directional spatial entropy. Results are presented at three representative frequencies (125 Hz, 1 kHz, and 4 kHz) and analyzed within full…
1. Introduction
Acoustic performance in concert halls is typically evaluated using scalar parameters derived from omnidirectional room impulse responses (RIRs). The standardized procedures and parameter definitions specified in ISO 3382-1 enable reproducible benchmarking across venues and measurement campaigns [1,2]. While these metrics capture the temporal aspects of energy decay and clarity, they do not explicitly describe the directional distribution of reflections, which is essential to spatial impression and listener perception in music performance spaces [3,4]. Both classical and contemporary studies have shown that early reflection directionality—particularly lateral components—plays a central role in perceived spaciousness and apparent source width [3,4,5].
A persistent seat-dependent issue in auditoria is acoustic degradation in under-balcony regions. Balcony overhangs alter reflection availability and reduce late energy in shadowed zones. The magnitude and characteristics of this degradation depend on local geometry, including the soffit and rear-wall configuration [6]. Consequently, under-balcony problems are often not limited to overall level reduction but involve directional deficits and altered early/late energy balance, which may not be adequately captured by conventional scalar metrics [1,2,3,6].
Recent spatial measurement and analysis methods provide practical tools to address these limitations. Scene-based microphone techniques, particularly first-order Ambisonics (FOA), represent the local sound field using four components (W, X, Y, Z): an omnidirectional pressure channel and three orthogonal figure-of-eight channels. This representation supports direction-of-arrival (DoA) estimation through intensity-based analysis and time–frequency directional descriptors [7,8,9,10]. Related parametric frameworks for spatial impulse response analysis, including Spatial Impulse Response Rendering (SIRR) and Directional Audio Coding (DirAC), operationalize DoA and diffuseness estimation within frequency bands [7,8,9,10]. In addition, the Spatial Decomposition Method (SDM) encodes spatial RIRs into reflection-like components suitable for analysis and rendering [11,12,13]. While these methods enable direction-resolved characterization at discrete receiver positions, meaningful seat-to-seat comparison still benefits from compact distributional metrics that quantify how concentrated or spread the directional energy is.
Information-theoretic measures provide a principled means to quantify directional complexity. Shannon entropy characterizes uncertainty in a discrete distribution and can be applied to the angular energy distribution of DoA samples. In this context, entropy indicates whether energy is broadly distributed (diffuse-like) or concentrated in a few directions (anisotropic dominance) [14,15,16]. Such an entropy-based descriptor complements axis-wise energy summaries and intensity/energy diffuseness indicators by explicitly capturing distributional structure.
Another limitation in large-venue field studies is the sparsity of spatial sampling. Practical measurement campaigns often rely on a limited number of receiver positions, and geometry-free interpolation does not enforce acoustic physics. Physics-Informed Neural Networks (PINNs) embed governing equations (e.g., wave/Helmholtz equations) into the learning objective and have been demonstrated as effective tools for field reconstruction from limited observations [17,18,19]. These developments suggest a viable route toward continuous field mapping that remains consistent with propagation physics.
1.1. Research Motivation and Gap
Under-balcony seats remain a representative case of geometry-driven, seat-dependent degradation in concert halls [6]. Standard ISO 3382-1 parameters support global benchmarking but are not designed to diagnose directional deficits, such as reduced vertical early-reflection energy or directionally concentrated reflection patterns under overhangs [1,2,3,6]. Direction-resolved FOA/parametric methods (e.g., intensity-based DoA estimation, SIRR/DirAC, SDM) provide reflection-structure cues and directional features [7,8,9,10,11,12,13]. However, two practical gaps remain: (i) the absence of a compact metric capable of summarizing directional-distribution structure across seats and time–frequency windows; and (ii) the lack of a physics-consistent framework for interpreting sparse measurements spatially, rather than only point-wise, under limited receiver coverage [17,18,19].
1.2. Objective and Contribution
To bridge these gaps, this study aims to diagnose under-balcony directional deficits using sparse FOA measurements and to provide interpretable indicators and physics-consistent spatial mappings. The main contributions of this work are as follows:
- FOA directional diagnostics: active-intensity-based DoA analysis with time-resolved hedgehog visualization and axis-resolved directional energy build-up over 0–200 ms for cross-seat comparison [7,8].
- Directional spatial entropy: a normalized Shannon-entropy metric computed from binned angular energy distributions to quantify concentration versus spread of directional energy in selected time–frequency windows [14,15,16].
- Physics-regularized reconstruction: a frequency-domain PINN constrained by the Helmholtz equation to reconstruct continuous pressure-field maps from sparse receiver positions and support spatial interpretation of under-balcony shadowing [17,18,19].
2. Methodology
2.1. Receiver Layout and Spatial Reference
Measurements were conducted in the main auditorium of the National Concert Hall (Taipei). Eleven receiver positions (P1–P11) were selected to represent (i) central open seating, (ii) side seating near the wall, and (iii) under-balcony seating, including side-under-balcony locations. Position descriptors (location/row) were used to define grouping for cross-seat comparison (Table 1). Source–receiver spatial relationships were established from full-scale architectural CAD drawings and used consistently throughout the analysis (Figure 1).
2.2. Measurement Procedure and Signal Chain
Measurements were conducted following ISO 3382-1 procedures for performance spaces [1], and under-balcony effects were interpreted in relation to balcony-overhang degradation mechanisms reported in prior literature [6]. An omnidirectional dodecahedron loudspeaker (Brüel & Kjær Type 4292 OmniSource, Brüel & Kjær, Nærum, Denmark) was used as the sound source and placed on the stage at a height of 1.7 m, approximately 1 m from the stage edge (Figure 1). Excitation employed an exponential swept-sine (ESS) signal to improve signal-to-noise ratio (SNR) and suppress nonlinear distortion components; a 20 s sweep duration was adopted to ensure stable impulse-response extraction in the large-volume hall.
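As an illustration of the ESS excitation and impulse-response extraction step, the following is a minimal sketch rather than the authors' measurement code. The sweep band (`f1`, `f2`), the function names, and the amplitude-compensated time-reversed inverse filter are assumptions introduced here; the paper specifies only the 20 s duration and the 48 kHz sampling rate.

```python
import numpy as np

def ess_and_inverse(f1, f2, duration, fs):
    """Exponential swept sine and a matching inverse filter (hypothetical helper).

    f1, f2 are assumed band edges in Hz; the paper does not state them.
    """
    n = int(round(duration * fs))
    t = np.arange(n) / fs
    rate = np.log(f2 / f1)
    sweep = np.sin(2.0 * np.pi * f1 * duration / rate
                   * (np.exp(t * rate / duration) - 1.0))
    # Time-reverse the sweep and apply a decaying envelope so that the
    # low-frequency emphasis of the exponential sweep is compensated.
    inverse = sweep[::-1] * np.exp(-t * rate / duration)
    return sweep, inverse

def deconvolve(recording, inverse):
    """Linear convolution of a recorded channel with the inverse filter via FFT."""
    n_fft = len(recording) + len(inverse) - 1
    spectrum = np.fft.rfft(recording, n_fft) * np.fft.rfft(inverse, n_fft)
    return np.fft.irfft(spectrum, n_fft)

# Short demonstration sweep (not the 20 s / 48 kHz field setting).
sweep, inverse = ess_and_inverse(100.0, 2000.0, 0.5, 16000)
ir = deconvolve(sweep, inverse)
peak_index = int(np.argmax(np.abs(ir)))
```

Applied to each of the four recorded FOA channels, the same inverse filter would yield the W, X, Y, and Z impulse responses; the linear part of the response is concentrated around sample `len(sweep) - 1` of the deconvolved output.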
Spatial capture was performed using a first-order Ambisonics (FOA) scene-based microphone providing four-channel B-format signals (W, X, Y, Z), where W represents sound pressure and X/Y/Z correspond to orthogonal particle-velocity directions (left–right, front–back, vertical). Audio I/O was handled by a PreSonus Studio 192 interface expanded with a Behringer ADA8000 preamplifier, and recordings were made at 48 kHz/24-bit. At each receiver position (P1–P11), three repeated 20 s sweeps were acquired and averaged to improve SNR and reduce random variance. A schematic of the measurement system architecture and signal chain is shown in Figure 2a, and an on-site photograph of the measurement setup is provided in Figure 2b.
The overall analysis pipeline is summarized in Figure 3, covering FOA RIR acquisition, pre-processing (time alignment and band-limiting), time-domain directional metrics (intensity-based DoA proxy, hedgehog visualization, 0–200 ms directional energy build-up, and directional spatial entropy), and the optional physics-informed reconstruction module in the frequency domain. Impulse responses were time-aligned to the direct-sound onset (time-of-arrival alignment), and all subsequent feature extraction was performed within a 0–200 ms analysis horizon. As illustrated in Figure 4, the aligned time axis was further partitioned into successive time bins, and an early/late split (0–80 ms/80–200 ms) was used for early–late discrepancy metrics.
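The alignment and early/late partition described above can be sketched as follows. The onset rule (first W-channel sample within 20 dB of the W peak) and the function names are assumptions for illustration; the paper states only that responses were aligned to the direct-sound onset and split at 80 ms within a 200 ms horizon.

```python
import numpy as np

def align_to_onset(foa_ir, threshold_db=-20.0):
    """Trim a (4, N) FOA impulse response so the direct sound sits at sample 0.

    The onset is taken as the first W-channel sample whose magnitude is within
    threshold_db of the W-channel peak (an assumed detection rule).
    """
    w_mag = np.abs(foa_ir[0])
    threshold = w_mag.max() * 10.0 ** (threshold_db / 20.0)
    onset = int(np.argmax(w_mag >= threshold))  # index of first True
    return foa_ir[:, onset:], onset

def split_early_late(aligned_ir, fs, split_ms=80.0, horizon_ms=200.0):
    """Return the early (0 to split_ms) and late (split_ms to horizon_ms) segments."""
    n_split = int(round(split_ms * 1e-3 * fs))
    n_end = int(round(horizon_ms * 1e-3 * fs))
    return aligned_ir[:, :n_split], aligned_ir[:, n_split:n_end]

# Synthetic example: direct sound at sample 500, one weaker reflection later.
fs = 48000
ir = np.zeros((4, 20000))
ir[0, 500] = 1.0
ir[0, 600] = 0.5
aligned, onset = align_to_onset(ir)
early, late = split_early_late(aligned, fs)
```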
2.3. FOA Representation and Intensity-Based Directional Features
Directional analysis was performed using intensity-based features derived from FOA impulse responses, consistent with parametric spatial impulse-response frameworks used in room analysis and reproduction [7,8,9,10,11,12,13]. Let acoustic pressure be approximated by the omnidirectional FOA component, $p(t) \approx W(t)$, and let the particle-velocity vector be $\mathbf{u}(t) \propto [X(t),\, Y(t),\, Z(t)]^{\mathsf{T}}$, with proportionality set by the adopted FOA normalization. Instantaneous active-intensity components were computed as:
$$I_x(t) = W(t)\,X(t), \qquad I_y(t) = W(t)\,Y(t), \qquad I_z(t) = W(t)\,Z(t).$$
The use of active and reactive intensity follows standard formulations [20,21,22]. In this study, the instantaneous active-intensity direction was used as a DoA proxy for reflection-direction analysis (Figure 5).
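A minimal sketch of the intensity-based DoA proxy is given below. It assumes the FOA response is stored as a `(4, N)` array ordered W, X, Y, Z and that azimuth is measured in the x–y plane from the +x axis; both conventions, as well as the function names, are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def active_intensity(foa):
    """Instantaneous active-intensity proxy from a (4, N) W, X, Y, Z array.

    Returns a (3, N) array [I_x, I_y, I_z], with I_a(t) = W(t) * A(t) up to a
    constant that depends on the FOA normalization.
    """
    w, xyz = foa[0], foa[1:4]
    return w * xyz

def doa_from_intensity(intensity, eps=1e-12):
    """Unit direction vectors, azimuth, elevation, and intensity magnitude."""
    magnitude = np.linalg.norm(intensity, axis=0)
    unit = intensity / (magnitude + eps)  # eps guards against division by zero
    azimuth = np.arctan2(unit[1], unit[0])
    elevation = np.arcsin(np.clip(unit[2], -1.0, 1.0))
    return unit, azimuth, elevation, magnitude

# Two synthetic samples: one along +y, one along +z.
foa = np.array([[1.0, 1.0],   # W
                [0.0, 0.0],   # X
                [1.0, 0.0],   # Y
                [0.0, 1.0]])  # Z
unit, azimuth, elevation, magnitude = doa_from_intensity(active_intensity(foa))
```

Whether the resulting vector points toward the source or along the direction of propagation depends on the sign convention of the microphone encoding, so it is worth checking against the direct sound before interpreting reflections.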
2.4. Directional Visualization Using Spherical Projection and Planar Views
For directional visualization, the DoA at each time sample was defined from the normalized active-intensity vector as
$$\hat{\mathbf{d}}(t) = \frac{\mathbf{I}(t)}{\lVert \mathbf{I}(t) \rVert}, \qquad \mathbf{I}(t) = [I_x(t),\, I_y(t),\, I_z(t)]^{\mathsf{T}}.$$
Here, $\hat{\mathbf{d}}(t)$ specifies the instantaneous direction of energy flow, while $\lVert \mathbf{I}(t) \rVert$ represents its relative magnitude. Direct plotting of arrows for all time samples was avoided because dense vector rendering tends to obscure the overall directional structure, especially when early and late components are compared within the same spatial framework. Instead, the directional distribution was represented using an energy-weighted directional endpoint density, in which normalized directions were accumulated with weights proportional to $\lVert \mathbf{I}(t) \rVert$.
To improve interpretability and avoid the perspective distortion and partial occlusion associated with purely three-dimensional views, the directional distributions were primarily displayed using spherical azimuth–elevation projections. Early (0–80 ms) and late (80–200 ms) directional distributions were shown separately to allow direct comparison of the dominant arrival regions between the two temporal windows. In addition, an early–late difference map was included to summarize the redistribution of directional density between the early and late fields.
To further examine the vertical structure of the directional field, planar projections on the $x$–$z$ and $y$–$z$ planes were computed. These sectional views help assess whether the directional distribution remains concentrated near the horizontal plane or exhibits a constrained vertical spread. The combined use of spherical projection, early–late difference visualization, and vertical planar projections is intended to improve interpretability of the directional patterns while retaining physical correspondence with the FOA intensity field (Figure 6).
2.5. Axis-Resolved Directional Energy Build-Up (0–200 ms)
To quantify the temporal build-up of direction-dependent acoustic energy, the FOA intensity components $I_x(t)$, $I_y(t)$, and $I_z(t)$ were integrated over successive non-overlapping time bins within a 0–200 ms analysis horizon. Let the $k$-th time bin be defined as $T_k = [t_k,\, t_{k+1})$, and let $w_k(t)$ denote a rectangular window that selects samples within $T_k$ (Figure 7a). The axis-resolved directional energy in each time bin was defined as
$$E_a(k) = \sum_{t} w_k(t)\, \lvert I_a(t) \rvert^{2}, \qquad a \in \{x, y, z\}.$$
Here, $\lvert I_a(t) \rvert$ denotes the magnitude of the (potentially signed) intensity component, such that $\lvert I_a(t) \rvert^{2}$ represents the squared magnitude used for energy accumulation on axis $a$.
For cross-position comparison, the energy sequences were normalized by the global maximum across all bins and axes:
$$\tilde{E}_a(k) = \frac{E_a(k)}{\max_{a' \in \{x,y,z\},\; k' \in \mathcal{K}} E_{a'}(k')},$$
where $\mathcal{K}$ denotes the set of time-bin indices spanning 0–200 ms. The resulting normalized sequences were used as compact descriptors of the axis-resolved directional energy build-up over 0–200 ms for cross-seat comparison.
To support early–late comparisons, window-accumulated axis energies were computed by summing the bin-wise energies within the corresponding time ranges. Denoting the early window as 0–80 ms and the late window as 80–200 ms, the accumulated energies were defined as
$$E_a^{\mathrm{early}} = \sum_{k:\, T_k \subset [0,\,80)\ \mathrm{ms}} E_a(k), \qquad E_a^{\mathrm{late}} = \sum_{k:\, T_k \subset [80,\,200]\ \mathrm{ms}} E_a(k).$$
These accumulated energies provide a consistent basis for early–late directional descriptors reported in the Results section (e.g., ratios formed from $E_a^{\mathrm{early}}$ and $E_a^{\mathrm{late}}$).
Recommended bin settings are summarized in Table 2, and the computation procedure is illustrated in Figure 7b.
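The bin-wise accumulation, global normalization, and early/late summation can be sketched as below. The 10 ms bin width is a placeholder (the recommended settings are in Table 2), and the vertical ratio $R_z = E_z / (E_x + E_y + E_z)$ used in the last helper is an assumed definition chosen to illustrate how an early–late vertical discrepancy could be formed; the text does not spell it out.

```python
import numpy as np

def axis_energy_buildup(intensity, fs, bin_ms=10.0, horizon_ms=200.0):
    """Bin energies E_a(k) = sum of |I_a|^2 over bin k, returned as (3, K).

    intensity is a (3, N) array; N must cover at least horizon_ms.
    """
    n_bin = int(round(bin_ms * 1e-3 * fs))
    n_bins = int(round(horizon_ms / bin_ms))
    squared = np.abs(intensity[:, : n_bin * n_bins]) ** 2
    return squared.reshape(3, n_bins, n_bin).sum(axis=2)

def normalize_global(energy):
    """Normalize by the global maximum across all axes and bins."""
    return energy / energy.max()

def early_late_energy(energy, bin_ms=10.0, split_ms=80.0):
    """Sum bin energies before and after the early/late split, per axis."""
    k_split = int(round(split_ms / bin_ms))
    return energy[:, :k_split].sum(axis=1), energy[:, k_split:].sum(axis=1)

def vertical_ratio_discrepancy(e_early, e_late):
    """Assumed definition: R_z = E_z / (E_x + E_y + E_z); returns R_z early minus late."""
    return e_early[2] / e_early.sum() - e_late[2] / e_late.sum()

# Synthetic case: no vertical intensity before 80 ms, equal on all axes after.
fs = 1000
intensity = np.ones((3, 200))
intensity[2, :80] = 0.0
energy = axis_energy_buildup(intensity, fs)
e_early, e_late = early_late_energy(energy)
delta_rz = vertical_ratio_discrepancy(e_early, e_late)
```

In this synthetic case the discrepancy is negative, which is the sign pattern the paper associates with a deficit of early vertical energy.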
2.6. Directional Spatial Entropy
Directional spatial entropy was employed to quantify the degree of concentration versus spread in the angular energy distribution. For each receiver position and analysis window, the entropy was evaluated at each target frequency $f$ by constructing a direction-of-arrival (DoA) distribution from intensity-based directional samples. Specifically, the FOA components were first projected to a narrowband representation at $f$ using windowed complex demodulation. Let $W(t)$, $X(t)$, $Y(t)$, and $Z(t)$ denote the FOA components within the analysis window, and let $g(t)$ be the selected temporal window (rectangular or Hann). The real-valued demodulated signals were obtained as
$$\tilde{S}(t) = \Re\!\left\{ g(t)\, S(t)\, e^{-j 2\pi f t} \right\}, \qquad S \in \{W, X, Y, Z\}.$$
Intensity components were then formed as
$$I_x(t) = \tilde{W}(t)\,\tilde{X}(t), \qquad I_y(t) = \tilde{W}(t)\,\tilde{Y}(t), \qquad I_z(t) = \tilde{W}(t)\,\tilde{Z}(t),$$
and the instantaneous intensity magnitude proxy was defined as
$$I_{\mathrm{mag}}(t) = \sqrt{I_x^{2}(t) + I_y^{2}(t) + I_z^{2}(t)} + \varepsilon,$$
where $\varepsilon$ is a small constant for numerical stability. DoA unit vectors were computed from the normalized intensity direction:
$$\hat{\mathbf{d}}(t) = \frac{[I_x(t),\, I_y(t),\, I_z(t)]^{\mathsf{T}}}{I_{\mathrm{mag}}(t)}.$$
For entropy computation, DoA samples were assigned to a spherical partition comprising $B$ angular bins (azimuth–elevation grid). A nonnegative intensity-based weight $q(t)$ was used to accumulate contributions in each angular bin. In this study, the weight was set to one of two intensity-magnitude-based forms (a default and an alternative), depending on the configuration. The accumulated bin energies were normalized to form a discrete probability mass function (PMF):
$$p_b = \frac{\sum_{t \in \mathcal{T}_b} q(t)}{\sum_{b'=1}^{B} \sum_{t \in \mathcal{T}_{b'}} q(t)}, \qquad b = 1, \dots, B,$$
where $\mathcal{T}_b$ denotes the set of time samples whose DoA estimates fall into the $b$-th angular bin.
Shannon entropy was then computed from the PMF as
$$H = -\sum_{b=1}^{B} p_b \log p_b,$$
and normalized by the maximum entropy to obtain a bounded directional entropy index:
$$H_n = \frac{H}{\log B}, \qquad 0 \le H_n \le 1.$$
In this formulation, $H_n \to 0$ indicates strong directional concentration, that is, energy dominated by a small number of angular bins, whereas $H_n \to 1$ indicates a highly distributed angular energy pattern.
To characterize temporal evolution, $H_n$ was evaluated for both the early (0–80 ms) and late (80–200 ms) windows, denoted as $H_n^{\mathrm{early}}$ and $H_n^{\mathrm{late}}$, respectively. The early–late discrepancy was defined as
$$\Delta H_n = H_n^{\mathrm{early}} - H_n^{\mathrm{late}}.$$
Because the normalized directional entropy depends on the adopted angular discretization, its sensitivity to bin resolution was also checked under moderate variations of $B$. As expected, the absolute values of $H_n$ varied slightly with bin resolution. However, the main qualitative trends remained unchanged. In particular, the predominantly negative $\Delta H_n$ observed at the under-balcony positions was preserved across the tested binning settings, indicating that the entropy-based comparison is qualitatively robust to moderate changes in angular resolution.
Entropy configuration details, including angular binning and weighting settings, are summarized in Table 3, and the computation workflow is illustrated in Figure 8.
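The entropy computation can be sketched with a weighted two-dimensional histogram. The 36 × 18 equal-angle azimuth–elevation grid and the function name are assumptions for illustration (the actual binning is given in Table 3); note that equal-angle bins are not equal-area on the sphere, which is one reason the absolute values depend on the discretization.

```python
import numpy as np

def directional_entropy(unit, weight, n_az=36, n_el=18):
    """Normalized Shannon entropy of a weighted DoA histogram.

    unit   : (3, N) DoA unit vectors
    weight : (N,) nonnegative weights (e.g., an intensity-magnitude proxy)
    n_az, n_el : assumed grid resolution, giving B = n_az * n_el bins
    """
    azimuth = np.arctan2(unit[1], unit[0])
    elevation = np.arcsin(np.clip(unit[2], -1.0, 1.0))
    hist, _, _ = np.histogram2d(
        azimuth, elevation, bins=[n_az, n_el],
        range=[[-np.pi, np.pi], [-np.pi / 2, np.pi / 2]], weights=weight)
    total = hist.sum()
    if total <= 0.0:
        return 0.0
    pmf = hist.ravel() / total
    pmf = pmf[pmf > 0.0]                    # 0 * log(0) is treated as 0
    entropy = -np.sum(pmf * np.log(pmf))
    return float(entropy / np.log(n_az * n_el))

# One direction only: fully concentrated distribution.
single = np.array([[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
h_single = directional_entropy(single, np.array([1.0, 1.0]))

# Two directions (+x and +y) with equal weight: entropy of two equal bins.
pair = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
h_pair = directional_entropy(pair, np.array([1.0, 1.0]))
```

Calling the function separately on the early and late DoA samples and subtracting the late value from the early one gives the early–late discrepancy.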
2.7. Spatial Mapping and Physics-Regularized Reconstruction Under Sparse Measurements
2.7.1. Spatial Visualization via Gaussian RBF Interpolation
To visualize spatial trends of point-wise metrics over the audience region Ω, a Gaussian radial basis function (RBF) interpolation was employed to obtain a smooth field representation from discrete receiver positions. Given a scalar metric $m_i$ evaluated at receiver locations $\mathbf{r}_i$ ($i = 1, \dots, N$), the interpolated field was computed as
$$\hat{m}(\mathbf{r}) = \sum_{i=1}^{N} \lambda_i \exp\!\left( -\frac{\lVert \mathbf{r} - \mathbf{r}_i \rVert^{2}}{2\sigma^{2}} \right),$$
where the coefficients $\lambda_i$ are determined from the receiver values $m_i$ and $\sigma$ denotes the kernel width. The interpolation was applied within the audience mask to facilitate visualization of spatial variation. This RBF mapping is used solely for visualization and does not impose physical constraints.
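A minimal sketch of Gaussian RBF interpolation is shown below, assuming the standard form in which coefficients are solved so that the field reproduces the receiver values. The kernel scaling $\exp(-d^2 / 2\sigma^2)$, the small ridge term, and the function names are assumptions; the paper does not specify the exact variant used.

```python
import numpy as np

def gaussian_rbf_fit(points, values, sigma, ridge=1e-10):
    """Solve for RBF coefficients so the field matches values at points.

    points : (N, 2) receiver coordinates, values : (N,) metric values.
    ridge is a small diagonal term added for numerical stability (assumed).
    """
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.linalg.solve(kernel + ridge * np.eye(len(points)), values)

def gaussian_rbf_eval(query, points, coeffs, sigma):
    """Evaluate the interpolated field at (M, 2) query coordinates."""
    d2 = np.sum((query[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2)) @ coeffs

# Four hypothetical receivers on a unit square.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([1.0, 2.0, 3.0, 4.0])
coeffs = gaussian_rbf_fit(points, values, sigma=0.7)
reproduced = gaussian_rbf_eval(points, points, coeffs, sigma=0.7)
center = gaussian_rbf_eval(np.array([[0.5, 0.5]]), points, coeffs, sigma=0.7)
```

With only 11 receivers the interpolated surface is sensitive to the choice of `sigma`, which is consistent with treating these maps as visual aids rather than quantitative fields.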
2.7.2. Frequency-Domain PINN for Physics-Regularized Sparse Sound-Field Reconstruction
To support physics-regularized interpretation under sparse measurement conditions, an optional frequency-domain physics-informed neural network (PINN) formulation was considered as a reconstruction module for the complex acoustic pressure field. For each target frequency $f$, a complex narrowband pressure proxy $p_{\mathrm{meas},i}(f)$ at the $i$-th receiver position was obtained from the FOA W-channel using windowed complex projection over a prescribed analysis interval (e.g., 0–200 ms):
$$p_{\mathrm{meas},i}(f) = \sum_{n \in \mathcal{N}} g[n]\, W_i[n]\, e^{-j 2\pi f n / f_s},$$
where $g[n]$ denotes the selected temporal window (rectangular or Hann), $f_s$ is the sampling rate, and $\mathcal{N}$ represents the discrete time samples within the analysis window. This complex quantity provides both magnitude and phase information at $f$ for the sparse receiver set.
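The windowed complex projection amounts to evaluating a single windowed Fourier coefficient at the target frequency. The sketch below assumes no additional amplitude normalization and uses a hypothetical function name.

```python
import numpy as np

def narrowband_projection(signal, fs, freq, window="rect"):
    """Complex projection of a real signal onto exp(-j 2 pi f n / fs).

    window : "rect" or "hann"; no amplitude normalization is applied (assumed).
    """
    n = np.arange(len(signal))
    g = np.hanning(len(signal)) if window == "hann" else np.ones(len(signal))
    return np.sum(g * signal * np.exp(-2j * np.pi * freq * n / fs))

# A 1 kHz cosine over a 200 ms window at 48 kHz (exactly 200 cycles).
fs = 48000
n = np.arange(int(0.2 * fs))
tone = np.cos(2.0 * np.pi * 1000.0 * n / fs)
p_1k = narrowband_projection(tone, fs, 1000.0)
p_4k = narrowband_projection(tone, fs, 4000.0)
```

For this tone the 1 kHz projection has magnitude equal to half the number of samples, while the 4 kHz projection is numerically zero because both frequencies complete an integer number of cycles in the window.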
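Within the audience domain Ω, the complex pressure field $p(\mathbf{r}; f)$ was assumed to approximately satisfy the homogeneous Helmholtz equation [19,20,21]:
$$\nabla^{2} p(\mathbf{r}; f) + k^{2}\, p(\mathbf{r}; f) = 0, \qquad k = \frac{2\pi f}{c},$$
where $c$ is the speed of sound. A neural network can be used to represent the complex field via real and imaginary outputs:
$$p_{\theta}(\mathbf{r}) = p_{\theta}^{\mathrm{Re}}(\mathbf{r}) + j\, p_{\theta}^{\mathrm{Im}}(\mathbf{r}).$$
The training objective combines a data-fidelity term evaluated at the receiver positions $\mathbf{r}_i$ (P1–P11 in this study), where $p_{\mathrm{meas},i}$ is the measured narrowband pressure proxy at receiver $i$, and a physics residual term evaluated at collocation points $\mathbf{r}_j^{c} \in \Omega$:
$$\mathcal{L}(\theta) = \lambda_d\, \frac{1}{N} \sum_{i=1}^{N} \left\lvert p_{\theta}(\mathbf{r}_i) - p_{\mathrm{meas},i} \right\rvert^{2} + \lambda_p\, \frac{1}{M} \sum_{j=1}^{M} \left\lvert \nabla^{2} p_{\theta}(\mathbf{r}_j^{c}) + k^{2}\, p_{\theta}(\mathbf{r}_j^{c}) \right\rvert^{2}.$$
The weighting parameters $\lambda_d$ and $\lambda_p$ balance measurement agreement and physics regularization.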
Because explicit boundary conditions and detailed impedance models are not imposed in the present configuration, the PINN-based reconstruction should be interpreted as a physics-regularized, domain-interior approximation constrained by sparse observations, rather than a full boundary-value room simulation (e.g., Figure 9).
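To illustrate what the physics residual term penalizes, the sketch below evaluates a Helmholtz residual with finite differences on a regular 2-D grid and combines it with a data term. This is not a PINN: a PINN would obtain the Laplacian by automatic differentiation of the network output. The grid, the speed of sound of 343 m/s, the equal loss weights, and the function names are all assumptions for illustration.

```python
import numpy as np

def helmholtz_residual(p, dx, freq, c=343.0):
    """Finite-difference residual of (Laplacian p + k^2 p) on interior grid points."""
    k = 2.0 * np.pi * freq / c
    laplacian = (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2]
                 - 4.0 * p[1:-1, 1:-1]) / dx ** 2
    return laplacian + k ** 2 * p[1:-1, 1:-1]

def composite_loss(p_pred, p_meas, residual, lambda_d=1.0, lambda_p=1.0):
    """Weighted sum of mean squared data misfit and mean squared physics residual."""
    data_term = np.mean(np.abs(p_pred - p_meas) ** 2)
    physics_term = np.mean(np.abs(residual) ** 2)
    return lambda_d * data_term + lambda_p * physics_term

# A plane wave at 125 Hz satisfies the Helmholtz equation; a field with twice
# the wavenumber does not.
freq, c, dx = 125.0, 343.0, 0.05
k = 2.0 * np.pi * freq / c
x, y = np.meshgrid(np.arange(41) * dx, np.arange(41) * dx, indexing="ij")
plane_wave = np.exp(1j * k * (x * np.cos(0.3) + y * np.sin(0.3)))
wrong_field = np.exp(1j * 2.0 * k * x)
res_ok = helmholtz_residual(plane_wave, dx, freq, c)
res_bad = helmholtz_residual(wrong_field, dx, freq, c)
loss_ok = composite_loss(plane_wave[::10, ::10], plane_wave[::10, ::10], res_ok)
```

The residual of the plane wave is small (limited by discretization error), whereas the mismatched field produces a residual on the order of $k^2$, which is the behavior the physics term exploits to regularize the fit between sparse receivers.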
2.8. Implementation
All processing was implemented in MATLAB R2023a (Academic Use, The MathWorks, Natick, MA, USA), including impulse-response extraction, FOA directional metrics, entropy computation, and PINN training/inference.
3. Results
Results are reported at three representative frequencies (125 Hz, 1 kHz, and 4 kHz) using the directional metrics defined in Section 2. The FOA coordinate convention and intensity-based DoA proxy follow Figure 5, the early/late windowing and axis-resolved energy accumulation follow Figure 4 and Figure 7, and the directional entropy follows the binning and normalization procedure in Figure 8 (Table 3). Figure 6 provides a qualitative illustration of how the directional energy distribution differs between the early and late windows and how the corresponding vertical structure can be assessed using the $x$–$z$ and $y$–$z$ projections. For each frequency and metric, the corresponding figure contains: (a) point-wise values mapped to the plan view, (b) normalized point-wise values across P1–P11, and (c) Gaussian RBF interpolation within the audience domain for visualization of spatial trends only (Section 2.7.1). Unless otherwise stated, quantitative cross-seat comparisons are drawn from panel (b) of Figures 10–21, while panel (c) is used only to illustrate spatial gradients.
3.1. Measured Pressure Proxy ∣pmeas(f)∣ (W-Channel, 0–200 ms)
Figure 10, Figure 11 and Figure 12 show the spatial distribution of the measured pressure-proxy magnitude $|p_{\mathrm{meas}}(f)|$ (W-channel, 0–200 ms) at 125 Hz, 1 kHz, and 4 kHz. The under-balcony receivers correspond to P7–P11 (Table 1). In each figure, panel (b) provides the point-wise comparison across P1–P11, while panel (c) is a Gaussian RBF visualization within Ω.
At 125 Hz (Figure 10), $|p_{\mathrm{meas}}(f)|$ varies across seats and the under-balcony region tends to show lower magnitudes than open seating. At 1 kHz (Figure 11), the distribution becomes more seat-dependent, and the under-balcony region contains both relatively high- and low-magnitude points. At 4 kHz (Figure 12), the contrast between open and under-balcony seats becomes clearer again, while variability within P7–P11 remains evident. These results indicate that under-balcony effects cannot be characterized by a uniform level reduction. Instead, the response depends strongly on receiver location, especially in the mid and high bands.
Figure 10. Spatial distribution of $|p_{\mathrm{meas}}(f)|$ at 125 Hz. (a) Point-wise normalized $|p_{\mathrm{meas}}(f)|$ (FOA W-channel, 0–200 ms) mapped to the plan view; the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Normalized values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise normalized values projected onto the audience domain, with colors referenced to the color bar.
Figure 11. Spatial distribution of $|p_{\mathrm{meas}}(f)|$ at 1 kHz. (a) Point-wise normalized $|p_{\mathrm{meas}}(f)|$ (FOA W-channel, 0–200 ms) mapped to the plan view; the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Normalized values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise normalized values projected onto the audience domain, with colors referenced to the color bar.
Figure 12. Spatial distribution of $|p_{\mathrm{meas}}(f)|$ at 4 kHz. (a) Point-wise normalized $|p_{\mathrm{meas}}(f)|$ (FOA W-channel, 0–200 ms) mapped to the plan view; the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Normalized values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise normalized values projected onto the audience domain, with colors referenced to the color bar.
3.2. Directional Spatial Entropy Hn (0–200 ms): Angular Spread of Energy
Figure 13, Figure 14 and Figure 15 report the normalized directional spatial entropy $H_n$ over 0–200 ms at 125 Hz, 1 kHz, and 4 kHz. Across all three frequencies, $H_n$ is generally high, indicating that the angular energy distribution accumulated within 0–200 ms is broadly spread at most seats. However, consistent seat-to-seat differences remain, with several positions showing lower $H_n$, i.e., a more concentrated directional distribution.
At 125 Hz (Figure 13), the entropy field already exhibits spatial structure. At 1 kHz (Figure 14), the seat dependence becomes more distinct. At 4 kHz (Figure 15), the patterns highlight stronger directional constraints in geometrically shadowed locations. Compared with $|p_{\mathrm{meas}}(f)|$, $H_n$ provides complementary information by describing how energy is distributed across arrival directions rather than only indicating magnitude.
Figure 13. Spatial distribution of directional entropy $H_n$ at 125 Hz. (a) Point-wise normalized $H_n$ (0–200 ms) mapped to the plan view; the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Normalized values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise normalized values projected onto the audience domain, with colors referenced to the color bar.
Figure 14. Spatial distribution of directional entropy $H_n$ at 1 kHz. (a) Point-wise normalized $H_n$ (0–200 ms) mapped to the plan view; the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Normalized values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise normalized values projected onto the audience domain, with colors referenced to the color bar.
Figure 15. Spatial distribution of directional entropy $H_n$ at 4 kHz. (a) Point-wise normalized $H_n$ (0–200 ms) mapped to the plan view; the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Normalized values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise normalized values projected onto the audience domain, with colors referenced to the color bar.
3.3. Early–Late Entropy Discrepancy ΔHn: Temporal Evolution of Directional Diffuseness
Figure 16, Figure 17 and Figure 18 show the early–late entropy discrepancy $\Delta H_n = H_n^{\mathrm{early}} - H_n^{\mathrm{late}}$, using early 0–80 ms and late 80–200 ms windows, at 125 Hz, 1 kHz, and 4 kHz. Across the receiver set, $\Delta H_n$ is predominantly negative at all three frequencies. This indicates that the early field is more directionally concentrated than the late field, consistent with early arrivals being governed by fewer dominant reflection directions.
A systematic under-balcony tendency is observed: P7–P11 exhibit more negative $\Delta H_n$ than open seating at each of the three frequencies. This implies that the early directional structure beneath the balcony is more constrained relative to the late field, consistent with the overhang suppressing or weakening specific early reflection paths.
Figure 16. Early–late directional entropy discrepancy $\Delta H_n$ at 125 Hz. (a) Point-wise $\Delta H_n = H_n^{\mathrm{early}} - H_n^{\mathrm{late}}$ (early: 0–80 ms; late: 80–200 ms) mapped to the plan view (normalized signed values); the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Raw signed $\Delta H_n$ values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise values projected onto the audience domain, with colors referenced to the color bar.
Figure 17. Early–late directional entropy discrepancy $\Delta H_n$ at 1 kHz. (a) Point-wise $\Delta H_n = H_n^{\mathrm{early}} - H_n^{\mathrm{late}}$ (early: 0–80 ms; late: 80–200 ms) mapped to the plan view (normalized signed values); the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Raw signed $\Delta H_n$ values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise values projected onto the audience domain, with colors referenced to the color bar.
Figure 18. Early–late directional entropy discrepancy $\Delta H_n$ at 4 kHz. (a) Point-wise $\Delta H_n = H_n^{\mathrm{early}} - H_n^{\mathrm{late}}$ (early: 0–80 ms; late: 80–200 ms) mapped to the plan view (normalized signed values); the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Raw signed $\Delta H_n$ values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent schematic RBF contour lines, and the colored dots indicate the corresponding point-wise values projected onto the audience domain, with colors referenced to the color bar.
3.4. Early–Late Vertical Energy Discrepancy ΔRz: Selective Loss of Early Vertical Contribution Under the Balcony
Figure 19, Figure 20 and Figure 21 present the early–late discrepancy of the vertical energy ratio, $\Delta R_z = R_z^{\mathrm{early}} - R_z^{\mathrm{late}}$, at 125 Hz, 1 kHz, and 4 kHz. This metric isolates changes in the vertical contribution between early and late windows and therefore targets direction-specific imbalance rather than overall magnitude variation.
Open seating (P1–P6) shows $\Delta R_z$ values clustered near zero across the three frequencies. In contrast, the under-balcony group (P7–P11) shows consistently negative $\Delta R_z$, with the most negative values repeatedly occurring within the under-balcony set. This indicates a selective reduction in early vertical contribution beneath the balcony, consistent with weakened ceiling- and overhead-related early reflections, while later energy (80–200 ms) becomes more redistributed.
Figure 19. Early–late vertical energy discrepancy $\Delta R_z$ at 125 Hz. (a) Point-wise $\Delta R_z = R_z^{\mathrm{early}} - R_z^{\mathrm{late}}$ (early: 0–80 ms; late: 80–200 ms) mapped to the plan view (normalized signed values); the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Raw signed $\Delta R_z$ values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent the schematic contour lines of the interpolated field, and the colored dots indicate the corresponding point-wise values projected onto the audience domain, with colors referenced to the color bar.
Figure 20. Early–late vertical energy discrepancy $\Delta R_z$ at 1 kHz. (a) Point-wise $\Delta R_z = R_z^{\mathrm{early}} - R_z^{\mathrm{late}}$ (early: 0–80 ms; late: 80–200 ms) mapped to the plan view (normalized signed values); the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Raw signed $\Delta R_z$ values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent the schematic contour lines of the interpolated field, and the colored dots indicate the corresponding point-wise values projected onto the audience domain, with colors referenced to the color bar.
Figure 21. Early–late vertical energy discrepancy $\Delta R_z$ at 4 kHz. (a) Point-wise $\Delta R_z = R_z^{\mathrm{early}} - R_z^{\mathrm{late}}$ (early: 0–80 ms; late: 80–200 ms) mapped to the plan view (normalized signed values); the arrows indicate the intensity-based directional vectors at the receiver positions. (b) Raw signed $\Delta R_z$ values across P1–P11. (c) Gaussian RBF interpolation within Ω for visualization of spatial trends only; the black lines represent the schematic contour lines of the interpolated field, and the colored dots indicate the corresponding point-wise values projected onto the audience domain, with colors referenced to the color bar.
4. Discussion
4.1. Direction-Specific Mechanism of Under-Balcony Degradation
The present results indicate that under-balcony behavior cannot be adequately explained by a uniform attenuation mechanism. Pressure-proxy maps (Figure 10, Figure 11 and Figure 12) exhibit substantial seat-to-seat variability within P7–P11, indicating localized reinforcement and shadowing rather than a spatially homogeneous level loss. In contrast, the early–late vertical discrepancy $\Delta R_z$ (Figure 19, Figure 20 and Figure 21) provides a stable under-balcony signature across 125 Hz, 1 kHz, and 4 kHz: P7–P11 consistently show negative $\Delta R_z$, whereas open seating (P1–P6) clusters near zero. This pattern supports a direction-selective reduction in early vertical contribution (0–80 ms) relative to the late field (80–200 ms) in the shadowed zone, consistent with an overhang mechanism that weakens ceiling- and overhead-related early reflections.
4.2. Directional Entropy and Early–Late Evolution of Anisotropy
Directional spatial entropy (Figure 13, Figure 14 and Figure 15) remains generally high over 0–200 ms, indicating that the accumulated angular-energy distribution is broadly spread for most seats. However, the early–late entropy discrepancy (Figure 16, Figure 17 and Figure 18) is predominantly negative and becomes more negative under the balcony. This implies that the early field is more directionally concentrated than the late field, and that this early directional constraint is stronger in P7–P11. Such behavior is consistent with early arrivals being governed by a limited set of viable reflection paths, while later energy becomes progressively redistributed by higher-order reflections.
The qualitative directional views in Figure 6 are consistent with this interpretation. The early and late spherical projections show that the dominant directional regions differ between the two temporal windows, indicating that directional energy is redistributed rather than preserved as a stable angular pattern. This observation supports the interpretation of a negative early–late entropy discrepancy, namely, that the early field is constrained by fewer dominant arrival directions, whereas the late field becomes more directionally spread. In particular, the projection indicates a relatively constrained vertical spread, supporting the view that the under-balcony sound field is affected not only by seat-dependent magnitude variation but also by restricted vertical directional structure. The projection further suggests that much of the directional energy remains concentrated near the horizontal plane rather than being broadly distributed over elevation.
Taken together, the early–late entropy discrepancy and ΔRz separate temporal redistribution effects from direction-specific deficits. While the entropy discrepancy reflects how directional diffuseness evolves from the early to the late field, ΔRz directly captures whether the vertical energy contribution is selectively weakened in the early response. Considered jointly, these metrics indicate that the under-balcony condition is characterized by both stronger early anisotropy and limited vertical-energy development, rather than by a simple uniform attenuation of sound level.
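The relation between directional concentration and a negative early–late entropy discrepancy can be sketched as follows. The example assumes a normalized Shannon entropy computed over an energy-weighted azimuth–elevation histogram; the grid size (12 × 6), the equal-angle (not equal-area) binning, and the function name are illustrative assumptions rather than the binning adopted in the paper.

```python
import numpy as np

def directional_entropy(azimuth, elevation, weights, n_az=12, n_el=6):
    """Normalized Shannon entropy (0..1) of energy over an angular grid.

    Assumption: equal-angle bins in azimuth [-pi, pi] and elevation
    [-pi/2, pi/2]; an equal-area grid would weight the poles differently.
    """
    hist, _, _ = np.histogram2d(
        azimuth, elevation,
        bins=[n_az, n_el],
        range=[[-np.pi, np.pi], [-np.pi / 2, np.pi / 2]],
        weights=weights,
    )
    total = hist.sum()
    if total <= 0:
        return 0.0
    p = hist.ravel() / total
    p = p[p > 0]  # empty bins contribute nothing to the sum
    return float(-np.sum(p * np.log(p)) / np.log(n_az * n_el))

# Synthetic early field: all energy arrives from a single direction.
early_h = directional_entropy(
    np.full(50, 0.1), np.full(50, 0.05), np.ones(50)
)

# Synthetic late field: equal energy at the center of every angular bin.
az_edges = np.linspace(-np.pi, np.pi, 13)
el_edges = np.linspace(-np.pi / 2, np.pi / 2, 7)
az_c = 0.5 * (az_edges[:-1] + az_edges[1:])
el_c = 0.5 * (el_edges[:-1] + el_edges[1:])
az_grid, el_grid = np.meshgrid(az_c, el_c, indexing="ij")
late_h = directional_entropy(
    az_grid.ravel(), el_grid.ravel(), np.ones(az_grid.size)
)

entropy_discrepancy = early_h - late_h
```

The two synthetic fields are the limiting cases: a single arrival direction gives zero entropy, a uniformly filled grid gives unit entropy, and the early-minus-late difference is therefore negative. Measured fields fall between these extremes.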
4.3. Frequency Dependence Across 125 Hz, 1 kHz, and 4 kHz
Across the three representative frequencies, the observed patterns are consistent with a transition from diffraction-influenced behavior at low frequency to shadowing-dominated behavior at mid and high frequencies. At 125 Hz, spatial differences are present but are partially mitigated by long wavelengths. At 1 kHz and 4 kHz, the under-balcony region exhibits clearer directional constraints and stronger early vertical-energy loss, consistent with reduced access to specular overhead reflections and increased sensitivity to geometric occlusion. The persistence of negative ΔRz across frequencies further suggests that the under-balcony deficit is not a narrowband artifact but a robust directional feature of the early response.
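The wavelength argument can be made concrete with a short calculation. The sketch assumes a speed of sound of 343 m/s (air at roughly 20 °C), a value not stated in the paper.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C

def wavelength_m(frequency_hz, c=SPEED_OF_SOUND):
    """Acoustic wavelength in metres for a given frequency in hertz."""
    return c / frequency_hz

wavelengths = {f: wavelength_m(f) for f in (125.0, 1000.0, 4000.0)}
```

Under this assumption the wavelengths are about 2.74 m at 125 Hz, 0.34 m at 1 kHz, and 0.086 m at 4 kHz. Only the lowest band has a wavelength on the order of metres, which is consistent with diffraction partially filling the shadowed zone at 125 Hz while the higher bands behave more geometrically.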
4.4. Diagnostic Implications Beyond Scalar Metrics
Scalar descriptors can indicate seat-dependent variation but cannot directly reveal which directional components are selectively affected. The present results show that the pressure-proxy magnitude alone is insufficient to distinguish between uniform attenuation and direction-specific imbalance, particularly when within-zone variability is large. By contrast, the proposed set of direction-resolved metrics provides complementary diagnostic information: directional spatial entropy characterizes angular spread over 0–200 ms, the early–late entropy discrepancy captures early-to-late redistribution, and ΔRz directly exposes selective early vertical-energy deficiency. This separation improves interpretability and supports mechanism-driven diagnosis of under-balcony acoustic issues.
4.5. Limitations
This study is based on a sparse receiver set (11 positions). Spatial maps in panels (c) of Figures 10–21 are generated by Gaussian RBF interpolation for visualization only; therefore, quantitative interpretation should rely primarily on the point-wise comparisons in panels (b). The DoA proxy is intensity-based and evaluated per frequency using a narrowband representation; results can be affected by the choice of early/late windows and the temporal weighting window. Finally, the present evaluation focuses on three representative frequencies; extending the analysis to additional bands would further clarify frequency-dependent transitions.
In addition, the normalized directional entropy depends to some extent on the adopted angular binning. Although absolute values vary slightly with bin resolution, the principal cross-seat trends reported here, most notably the consistently negative early–late entropy discrepancy in the under-balcony group, remained stable under moderate changes in binning density.
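Why the interpolated maps should be read as visualization only can be seen in a minimal Gaussian RBF sketch. The kernel form, the length scale, the receiver coordinates, and the values below are hypothetical and are not the measured data or the interpolation settings of the paper.

```python
import numpy as np

def gaussian_rbf_fit(points, values, length_scale):
    """Solve for RBF weights so the interpolant matches the point values."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-d2 / (2.0 * length_scale ** 2))
    return np.linalg.solve(kernel, values)

def gaussian_rbf_eval(points, coeffs, query, length_scale):
    """Evaluate the interpolated field at the query coordinates."""
    d2 = np.sum((query[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2)) @ coeffs

# Hypothetical plan-view coordinates (m) and signed metric values.
receivers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 6.0], [4.0, 6.0]])
metric = np.array([0.0, -0.1, -0.3, -0.4])
scale = 3.0  # assumed kernel length scale in metres

coeffs = gaussian_rbf_fit(receivers, metric, scale)
at_receivers = gaussian_rbf_eval(receivers, coeffs, receivers, scale)
far_away = gaussian_rbf_eval(receivers, coeffs, np.array([[100.0, 100.0]]), scale)
```

The interpolant reproduces the values at the receivers exactly, but between and beyond them the field is determined by the kernel and its length scale rather than by measurement; far from any receiver a plain Gaussian RBF decays toward zero regardless of the true sound field.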
5. Conclusions
This study presented an FOA-based directional analysis framework for diagnosing spatial acoustic degradation in under-balcony seating areas. By combining intensity-based direction-of-arrival estimation, axis-resolved directional energy build-up, and directional spatial entropy analysis, the proposed approach enables mechanism-oriented interpretation of directional reflection structures that are not captured by conventional scalar metrics.
The experimental results lead to several key findings. First, the pressure-proxy magnitude exhibits strong seat-to-seat variability in the under-balcony region, indicating that the observed degradation cannot be explained by a spatially uniform attenuation mechanism. Second, directional spatial entropy analysis shows that the angular energy distribution remains broadly spread over the full 0–200 ms interval but exhibits localized reductions that suggest directional constraints in specific seating locations. Third, the early–late entropy discrepancy is predominantly negative across all analyzed frequencies, indicating stronger directional concentration in the early sound field. Finally, the early–late vertical energy discrepancy provides the most consistent under-balcony signature, with negative values observed at all shadowed positions, revealing a systematic reduction in early vertical reflection contributions.
The proposed framework therefore provides a physically interpretable method for identifying direction-specific early-reflection deficits in geometrically constrained listening zones. Beyond under-balcony scenarios, the same methodology can be applied to other architectural configurations where reflection paths may be partially blocked, such as deep overhangs, alcoves, or recessed seating areas.
Future work will extend the analysis to denser spatial sampling and broader frequency coverage to better characterize frequency-dependent transitions in directional behavior. In addition, intervention studies will investigate how architectural or electroacoustic modifications influence the early–late entropy discrepancy and ΔRz, enabling the proposed metrics to serve as design-feedback indicators for targeted acoustic improvements.
