Complex Block Floating-Point Format with Box Encoding For Wordlength Reduction in Communication Systems

Yeong Foong Choo; Brian L. Evans; Alan Gatherer

arXiv:1705.05217·cs.IT·October 26, 2017


TL;DR

This paper introduces a novel complex block floating-point format with box encoding to reduce wordlength and implementation complexity in communication systems, demonstrated through a QAM transceiver case study.

Contribution

The paper presents a new complex block floating-point format with box encoding, enabling reduced wordlength and complexity while maintaining signal quality.

Findings

01

Reduced quantization error with box encoding

02

Tradeoffs between signal quality and complexity quantified

03

Effective in a QAM transmitter and receiver scenario



Tables

Table 1. TABLE I: Definition & Bit Widths Under IEEE-754 Number Format [10]

Components	Definition	Bit Widths, $B$
Wordlength, $W$	$N_{w}$	$\{16, 32, 64\}$
Sign, $S$	$N_{s}$	$\{1\}$
Exponent, $E$	$N_{e}$	$\{5, 8, 11\}$
Mantissa, $M$	$N_{m}$	$\{10, 23, 52\}$

Table 2. TABLE II: Definition & Bit Widths Under Common Exponent Encoding [8]

Components	Definition	Bit Widths, $B$
Common Exponent, $E$	$N_{e}$	$\{5, 8, 11\}$
Real / Imaginary Sign, $\boldsymbol{S}$	$N_{s}^{R, I}$	$\{1\}$
Real / Imaginary Lead, $\boldsymbol{L}$	$N_{l}^{R, I}$	$\{1\}$
Real / Imaginary Mantissa, $\boldsymbol{M}$	$N_{m}^{R, I}$	$\{10, 23, 52\}$

Table 3. TABLE III: Definition & Bit Widths Under Exponent Box Encoding

Components	Definition	Bit Widths, $B$
Common Exponent, $E$	$N_{e}$	$\{5, 8, 11\}$
Real / Imaginary Sign, $\boldsymbol{S}$	$N_{s}^{R, I}$	$\{1\}$
Real / Imaginary Lead, $\boldsymbol{L}$	$N_{l}^{R, I}$	$\{1\}$
Real / Imaginary Box Shift, $\boldsymbol{X}$	$N_{x}^{R, I}$	$\{1\}$
Real / Imaginary Mantissa, $\boldsymbol{M}$	$N_{m}^{R, I}$	$\{10, 23, 52\}$

Table 4. TABLE IV: Wordlength Requirement by $N_{v}$ Complex-Valued Samples

Encoding	Bit Widths
Complex IEEE754	$2 N_{v} (B_{s} + B_{e} + B_{m})$
Common Exponent	$2 N_{v} (B_{s} + B_{l} + B_{m}) + B_{e}$
Exponent Box	$2 N_{v} (B_{s} + B_{l} + B_{x} + B_{m}) + B_{e}$

Table 5. TABLE V: Mantissas and Exponent Pre/Post Processing Complexity of Complex Block ALU

Block Addition	Mantissas Scaling	Exponents Arithmetic
Complex IEEE754	$4 * N$	$2 * N$
Common Exponent	$4 * N$	$2$
Exponent Box	$8 * N$	$4$
Block Multiplication	Mantissas Scaling	Exponents Arithmetic
Complex IEEE754	$8 * N$	$6 * N$
Common Exponent	$8 * N$	$2$
Exponent Box	$16 * N$	$5$
Convolution	Mantissas Scaling	Exponents Arithmetic
Complex IEEE754	$6 * N_{1} N_{2} + 4 * (N_{1} - 1) (N_{2} - 1)$	$6 * N_{1} N_{2} + 2 * (N_{1} - 1) (N_{2} - 1)$
Common Exponent	$6 * N_{1} N_{2} + 4 * (N_{1} - 1) (N_{2} - 1)$	$3 * (N_{1} + N_{2} - 1) + 1$
Exponent Box	$10 * N_{1} N_{2} + 8 * (N_{1} - 1) (N_{2} - 1)$	$3 * (N_{1} + N_{2} - 1) + 1$

Table 6. TABLE VI: QAM Transmitter, Receiver Specifications

QAM Parameters	Definition	Values / Types
Constellation Order	$M$	1024
Transceiver Parameters	Definition	Values / Types
Up-sample Factor	$L^{TX}, L^{RX}$	4
Symbol Rate (Hz)	$f_{sym}$	2400
Filter Order	$N^{TX}, N^{RX}$	32
Pulse Shape	$g^{TX}, g^{RX}$	Root-Raised Cosine
Excess Bandwidth Factor	$\alpha^{TX}, \alpha^{RX}$	0.2

Table 7. TABLE VII: Memory Input / Output and Computational Rates on Exponent Box Shifting Technique

Transmitter Chain	Memory Reads Rate (bits/sec)	Memory Writes Rate (bits/sec)	MACs / sec
Symbol Mapper	$J f_{sym}$	$2 f_{sym} (N_{w} + N_{l} + N_{b} - N_{e}) + N_{e}$	$0$
Upsampler	$2 f_{sym} (N_{w} + N_{l} + N_{b} - N_{e}) + N_{e}$	$2 L^{TX} f_{sym} (N_{w} + N_{l} + N_{b} - N_{e}) + N_{e}$	$0$
Pulse Shape Filter	$(3 L^{TX} N_{g}^{TX} + 1) (L^{TX} f_{sym}) (N_{w} + N_{l} + N_{b} - N_{e}) + 2 N_{e}$	$2 L^{TX} f_{sym} (N_{w} + N_{l} + N_{b} - N_{e}) + N_{e}$	$(L^{TX})^{2} N_{g}^{TX} f_{sym}$
Receiver Chain	Memory Reads Rate (bits/sec)	Memory Writes Rate (bits/sec)	MACs / sec
Matched Filter	$(3 L^{RX} N_{g}^{RX} + 1) (L^{RX} f_{sym}) (N_{w} + N_{l} + N_{b} - N_{e}) + 2 N_{e}$	$2 L^{RX} f_{sym} (N_{w} + N_{l} + N_{b} - N_{e}) + N_{e}$	$(L^{RX})^{2} N_{g}^{RX} f_{sym}$
Downsampler	$2 L^{RX} f_{sym} (N_{w} + N_{l} - N_{e}) + N_{e} + (N_{w} + N_{l} + N_{b})$	$2 f_{sym} (N_{w} + N_{l} + N_{b} - N_{e}) + N_{e}$	$0$
Symbol Demapper	$2 f_{sym} (N_{w} + N_{l} - N_{e}) + N_{e} + \frac{J}{2} (N_{w} + N_{l})$	$J f_{sym}$	$0$

Equations

$$\begin{aligned}
\Re\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}}\} + \Re\{\boldsymbol{X_{2}}\} \\
\Im\{\boldsymbol{Y}\} &= \Im\{\boldsymbol{X_{1}}\} + \Im\{\boldsymbol{X_{2}}\}
\end{aligned}$$

$$\begin{aligned}
\Re\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}}\} \bullet \Re\{\boldsymbol{X_{2}}\} - \Im\{\boldsymbol{X_{1}}\} \bullet \Im\{\boldsymbol{X_{2}}\} \\
\Im\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}}\} \bullet \Im\{\boldsymbol{X_{2}}\} + \Im\{\boldsymbol{X_{1}}\} \bullet \Re\{\boldsymbol{X_{2}}\}
\end{aligned}$$

$$\begin{aligned}
\Re\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}} \ast \boldsymbol{X_{2}}\} \\
\Im\{\boldsymbol{Y}\} &= \Im\{\boldsymbol{X_{1}} \ast \boldsymbol{X_{2}}\}
\end{aligned}$$

$$EVM = \frac{\parallel \boldsymbol{X} - \boldsymbol{\bar{X}} \parallel_{2}}{\parallel \boldsymbol{X} \parallel_{2}} * 100$$

$$\begin{aligned}
j &< i \\
(1.M\{j\} * 2^{E\{j\} - F(j)}) &< (1.M\{i\} * 2^{E\{i\} - F(i)}) \\
(1.M\{j\} * 2^{E\{j\}}) &< (1.M\{i\} * 2^{E\{i\}}) \\
(1.M\{j\} * 2^{E\{j\} - E\{i\} + E\{i\}}) &< (1.M\{i\} * 2^{E\{i\}}) \\
(1.M\{j\} * 2^{E\{j\} - E\{i\}}) &< (1.M\{i\}) \\
(1.M\{j\} * 2^{-E\{\Delta\}}) &< (1.M\{i\}) \\
(0.M\{j^{\prime}\}) &< (1.M\{i\})
\end{aligned}$$

where $M\{j^{\prime}\} = \frac{1.M\{j\}}{2^{E\{\Delta\}}}$

$$\begin{aligned}
& \frac{1}{2}(N_{1})(N_{1}+1) + (N_{2}-N_{1})(N_{1}) + \frac{1}{2}(N_{1}-1)(N_{1}) \\
&= \frac{1}{2}(N_{1}^{2}+N_{1}) + (N_{2}N_{1}-N_{1}^{2}) + \frac{1}{2}(N_{1}^{2}-N_{1}) \\
&= \frac{1}{2}(2N_{1}^{2}) + (N_{2}N_{1}-N_{1}^{2}) \\
&= N_{1}^{2} + (N_{2}N_{1}-N_{1}^{2}) \\
&= N_{2}N_{1}
\end{aligned}$$

$$\begin{aligned}
& \frac{1}{2}(N_{1}-1)(N_{1}) + (N_{2}-N_{1})(N_{1}-1) + \frac{1}{2}(N_{1}-2)(N_{1}-1) \\
&= \frac{1}{2}(N_{1}^{2}-N_{1}+N_{1}^{2}-3N_{1}+2) + (N_{2}-N_{1})(N_{1}-1) \\
&= (N_{1}^{2}-2N_{1}+1) + (N_{2}-N_{1})(N_{1}-1) \\
&= (N_{1}-1)(N_{1}-1) + (N_{2}-N_{1})(N_{1}-1) \\
&= (N_{1}-1)(N_{1}-1+N_{2}-N_{1}) \\
&= (N_{1}-1)(N_{2}-1)
\end{aligned}$$


Full text

Complex Block Floating-Point Format with Box Encoding For Wordlength Reduction in Communication Systems

Yeong Foong Choo (1), Brian L. Evans (1), and Alan Gatherer (2)

(1) Wireless Networking and Communications Group, The University of Texas at Austin, Austin, TX USA

(2) Wireless Access Laboratory, Huawei Technologies, Plano, TX USA

Abstract

We propose a new complex block floating-point format to reduce implementation complexity. The new format achieves wordlength reduction by sharing an exponent across the block of samples, and uses box encoding for the shared exponent to reduce quantization error. Arithmetic operations are performed on blocks of samples at a time, which can also reduce implementation complexity. For a case study of a baseband quadrature amplitude modulation (QAM) transmitter and receiver, we quantify the tradeoffs in signal quality vs. implementation complexity using the new approach to represent IQ samples. Signal quality is measured using error vector magnitude (EVM) in the receiver, and implementation complexity is measured in terms of arithmetic complexity as well as memory allocation and memory input/output rates. The primary contributions of this paper are (1) a complex block floating-point format with box encoding of the shared exponent to reduce quantization error, (2) arithmetic operations using the new complex block floating-point format, and (3) a QAM transceiver case study to quantify signal quality vs. implementation complexity tradeoffs using the new format and arithmetic operations.

Index Terms:

Complex block floating-point, discrete-time baseband QAM.

I Introduction

Energy-efficient data representation in application-specific baseband transceiver hardware is in demand because of the energy costs involved in baseband signal processing [1]. In macrocell base stations, digital signal processing (DSP) modules account for about ten percent of the energy cost, while power amplification and cooling consume more than 70% of total energy [2]. The energy consumption of DSP modules relative to power amplification and cooling will increase in future small cell designs because low-powered cellular radio access nodes cover a shorter radio range [2]. Energy-efficient number representation will therefore reduce overall energy consumption in base stations.

In related work, baseband signal compression techniques have been researched for both uplink and downlink. The methods in [3], [4], and [5] suggest resampling baseband signals to the Nyquist rate, block scaling, and non-linear quantization. All three papers report a transport data rate gain of 3x to 5x with less than 2% EVM loss. In [5], a cyclic prefix replacement technique is used to counter the effect of resampling, which adds processing overhead to the system. In [4] and [6], a noise shaping technique improves the in-band signal-to-noise ratio (SNR). In [7], a transform coding technique is suggested for block compression of baseband signals in a setting with multiple users and a multi-antenna base station. Transform coding reports a potential 8x transport data rate gain with less than 3% EVM loss. The above methods achieve end-to-end compression in a transport link and incur delay and energy cost for compression and decompression at the entry and exit points, respectively. The overall energy cost reduction is not well quantified. This motivates the design of energy-efficient data representation and hardware arithmetic units with low implementation complexity.

In [8], Common Exponent Encoding is proposed to represent 32-bit complex floating-point data with a 29-bit wordlength in hardware, saving 3 bits. The method in [8] shows a 10% reduction in registers and memory footprint with a tradeoff of a 10% increase in arithmetic units. In [9], exponential coefficient scaling is proposed to allocate 6 bits to represent real-valued floating-point data. The method in [9] achieves a 37x reduction in quantization errors, a 1.2x reduction in logic gates, and a 1.4x reduction in energy per cycle compared to 6-bit fixed-point representation. Both papers report less than 2 dB of signal-to-quantization-noise ratio (SQNR).

Contributions: Our method applies the Common Exponent Encoding proposed in [8] and adds a proposed Exponent Box Encoding to retain high magnitude-phase resolution. This paper identifies the computational complexity of complex block addition, multiplication, and convolution and computes a reference EVM on the arithmetic output. We apply the new complex block floating-point format to a case study of a baseband QAM transmitter chain and receiver chain. We also reduce implementation complexity in terms of memory read/write rates and multiply-accumulate operations. We base the signal quality of our method on the measurement of EVM at the receiver. Our method achieves end-to-end complex block floating-point representation.

II Methods

This section describes the data structure used in the new representation of complex block floating-point [8] and suggests a new mantissa scaling method to reduce quantization error. In IEEE 754 format, the exponents of complex-valued floating-point data are separately encoded. The Common Exponent Encoding technique [8] shares a common exponent but encodes phase with weak resolution.

II-A Common Exponent Encoding Technique

Table I summarizes the wordlength precision of real-valued floating-point data in IEEE-754 encoding [10]. We define $B_{w}$ bits as the wordlength of scalar floating-point data. Complex-valued floating-point data requires $2B_{w}$ bits, and a complex block floating-point of $N_{v}$ samples requires $2N_{v}B_{w}$ bits.

The method in [11] assumes only magnitude correlation in the oversampled complex block floating-point data. This assumption allows a common exponent to be jointly encoded across a complex block floating-point of $N_{v}$ samples, as defined in Table II. The implied leading bit of 1 of each floating-point value is first uncovered. The common exponent is selected as the largest unsigned exponent across the complex block. Each mantissa is then scaled down by the difference between the common exponent and its original exponent. Therefore, each floating-point value with a smaller exponent loses its leading bit of 1. The leading bit of the complex block floating-point is explicitly coded as $N_{l}$, using $B_{l}$ bits. The sign bits are left unchanged. A complex block floating-point of $N_{v}$ samples requires $\{2N_{v}(B_{s}+B_{l}+B_{m})+B_{e}\}$ bits.
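The encoding steps in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from [8] or [11]: the function names `encode_common_exponent` and `decode_common_exponent` are hypothetical, the exponent is kept as a plain signed integer rather than a biased field, and scaled-down mantissas are truncated.

```python
import math


def encode_common_exponent(block, mant_bits=23):
    """Encode complex samples with one shared exponent (illustrative sketch).

    Returns (common_exp, fields). fields holds one (sign, lead_and_mantissa)
    pair per real/imaginary component; lead_and_mantissa is an integer of
    1 + mant_bits bits (explicit leading bit followed by the mantissa).
    """
    comps = [c for z in block for c in (z.real, z.imag)]
    # frexp gives x = m * 2**e with 0.5 <= |m| < 1, so the 1.M form uses e - 1.
    exps = [math.frexp(c)[1] - 1 for c in comps if c != 0.0]
    common_exp = max(exps) if exps else 0
    fields = []
    for c in comps:
        sign = 1 if math.copysign(1.0, c) < 0 else 0
        if c == 0.0:
            fields.append((sign, 0))
            continue
        m, e = math.frexp(abs(c))
        q = int(m * 2 ** (mant_bits + 1))   # 1.M as an integer, leading bit uncovered
        shift = common_exp - (e - 1)        # difference to the common exponent
        fields.append((sign, q >> shift))   # scale down; low bits are truncated
    return common_exp, fields


def decode_common_exponent(common_exp, fields, mant_bits=23):
    """Rebuild complex samples from the shared exponent and per-component fields."""
    vals = []
    for sign, q in fields:
        v = q / 2 ** mant_bits * 2.0 ** common_exp
        vals.append(-v if sign else v)
    return [complex(vals[k], vals[k + 1]) for k in range(0, len(vals), 2)]


block = [complex(1.5, 0.75), complex(2.0 ** -30, 3.0)]
exp, fields = encode_common_exponent(block)
print(exp, decode_common_exponent(exp, fields))
```

In this example the component $2^{-30}$ sits 31 binary places below the common exponent, so its 24-bit lead-plus-mantissa field is shifted out entirely and it decodes to zero, while the other three components are recovered exactly.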

We derive the maximum allowed exponent difference under Common Exponent Encoding in Appendix A. Mantissa values could be reduced to zero as a result of a large phase difference. Figure 2 shows the Effective Encoding Region (EER) under the Common Exponent Encoding technique. Exponent pairs outside the EER will have their corresponding mantissa values reduced to zero.

II-B Exponent Box Encoding Technique

The Common Exponent Encoding technique suffers from high quantization and phase error in complex block floating-point data with high dynamic range. Exponent Box Encoding is suggested to reduce the quantization error of complex-valued floating-point pairs by allocating $2N_{v}$ bits per complex block. Figure 2 shows the Effective Encoding Region under the Exponent Box Encoding technique, which has four times the area of the EER under the Common Exponent Encoding technique.

Using 2 bits per complex sample replaces the mantissa rescaling operation with exponent addition/subtraction. We are able to preserve more leading bits of the mantissa values, which improves the accuracy of complex block multiplication and complex block convolution results. A complex block floating-point of $N_{v}$ samples requires $\{2N_{v}(B_{s}+B_{l}+B_{x}+B_{m})+B_{e}\}$ bits.
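As a quick worked check, the three wordlength expressions of Table IV can be evaluated side by side. The helper name `block_wordlength` is hypothetical, and the default bit widths assume the 32-bit row of Tables I to III ($B_{s}=1$, $B_{e}=8$, $B_{m}=23$, $B_{l}=1$, $B_{x}=1$).

```python
def block_wordlength(n_v, b_s=1, b_e=8, b_m=23, b_l=1, b_x=1):
    """Bits needed for n_v complex samples under each encoding of Table IV."""
    return {
        "complex_ieee754": 2 * n_v * (b_s + b_e + b_m),
        "common_exponent": 2 * n_v * (b_s + b_l + b_m) + b_e,
        "exponent_box": 2 * n_v * (b_s + b_l + b_x + b_m) + b_e,
    }


for n_v in (1, 16):
    print(n_v, block_wordlength(n_v))
```

For a single sample the three encodings need 64, 58, and 60 bits; for a block of 16 samples they need 1024, 808, and 840 bits, so the shared exponent is amortized across the block while the box shift bit adds 2 bits per complex sample.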

Arithmetic Logic Unit (ALU) hardware is designed to perform Single-Instruction Multiple-Data (SIMD) operation on complex block floating-point data. The Exponent Box Encoding is performed when converting to Exponent Box Encoding format. The Exponent Box Decoding is performed at the pre-processing of mantissas in Complex Block Addition and pre-processing of exponents in Complex Block Multiply.

Table IV summarizes the wordlength analysis required by complex block floating-point of $N_{v}$ samples. The Exponent Box Encoding and Exponent Box Decoding algorithms are described as follows:

III Arithmetic Unit

We identify the arithmetic units predominantly used on complex block floating-point data. Complex-valued multiplication and addition are the two primary ALU operations required in the convolution operation. This section identifies the complexity of pre-processing and post-processing mantissas and exponents in complex block addition, multiplication, and convolution arithmetic. Table V describes the worst-case complexity analysis of the complex block ALU on the encoding formats described in Section II.

III-A Complex Block Addition

Figure 3 shows simplified block diagram for Complex Block Addition. Let $\boldsymbol{X_{1},X_{2},Y}$ $\in\mathbb{C}^{1\times N}$ be complex-valued row vectors, such that,

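$$\begin{aligned}
\Re\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}}\} + \Re\{\boldsymbol{X_{2}}\} \\
\Im\{\boldsymbol{Y}\} &= \Im\{\boldsymbol{X_{1}}\} + \Im\{\boldsymbol{X_{2}}\}
\end{aligned}$$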

In IEEE-754 encoding format, complex block addition is implemented as two real-valued additions. There are four exponents for the two complex inputs and two exponents for the complex output. Each real-valued addition block requires one mantissa pre-scaling, one mantissa post-scaling, and one exponent arithmetic operation. Therefore, complex block addition requires two mantissa pre-scalings, two mantissa post-scalings, and two exponent arithmetic operations per sample.

In Common Exponent and Exponent Box Encoding, there are two shared exponents for the two complex block inputs and one shared exponent for the complex block output. The complexity of shared exponent arithmetic is $O(1)$. We pre-scale the mantissas corresponding to the smaller exponent and post-scale the mantissas of the complex block output. With Exponent Box Encoding in the worst case, we require two mantissa pre-scalings and one mantissa post-scaling.
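The pre-scale and post-scale steps for a shared exponent can be illustrated with a small Python sketch. It covers only the Common Exponent case and makes several assumptions that are not from the paper: each block is a shared integer exponent plus a list of signed integer mantissas (real and imaginary parts interleaved), every value equals `mant * 2**exp`, shifts truncate toward zero, and the names `shift_right` and `block_add` are hypothetical.

```python
def shift_right(m, s):
    """Right-shift a signed integer mantissa, truncating toward zero."""
    return (abs(m) >> s) if m >= 0 else -(abs(m) >> s)


def block_add(exp1, mants1, exp2, mants2, width=24):
    """Add two equal-length blocks whose values are mant * 2**exp."""
    exp = max(exp1, exp2)
    # Pre-scale: only the block with the smaller shared exponent is shifted.
    a = [shift_right(m, exp - exp1) for m in mants1]
    b = [shift_right(m, exp - exp2) for m in mants2]
    sums = [x + y for x, y in zip(a, b)]
    # Post-scale: renormalize the whole output block if any magnitude
    # no longer fits in `width` bits, and bump the shared exponent.
    while any(abs(s) >= (1 << width) for s in sums):
        sums = [shift_right(s, 1) for s in sums]
        exp += 1
    return exp, sums


# (1.0, -1.0) with exponent -23 plus (2.0, 2.0) with exponent -22.
exp, sums = block_add(-23, [1 << 23, -(1 << 23)], -22, [1 << 23, 1 << 23])
print([s * 2.0 ** exp for s in sums])
```

The single `max` and the single increment inside the loop are the only exponent operations, which is consistent with the $O(1)$ exponent cost stated above; the mantissa work still scales with the block length.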

III-B Complex Block Multiplication

Figure 4 shows simplified block diagram for Complex Block Multiplication. Let $\boldsymbol{X_{1},X_{2},Y}$ $\in\mathbb{C}^{1\times N}$ be complex-valued row vectors, where $\bullet$ denotes element-wise multiply, such that,

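$$\begin{aligned}
\Re\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}}\} \bullet \Re\{\boldsymbol{X_{2}}\} - \Im\{\boldsymbol{X_{1}}\} \bullet \Im\{\boldsymbol{X_{2}}\} \\
\Im\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}}\} \bullet \Im\{\boldsymbol{X_{2}}\} + \Im\{\boldsymbol{X_{1}}\} \bullet \Re\{\boldsymbol{X_{2}}\}
\end{aligned}$$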

In IEEE-754 encoding format, complex block multiplication is implemented as four real-valued multiplications and two real-valued additions. Each real-valued multiplication requires one mantissa post-scaling and one exponent arithmetic operation. Each real-valued addition requires one mantissa pre-scaling, one mantissa post-scaling, and one exponent arithmetic operation. Complex block multiplication requires two mantissa pre-scalings, six mantissa post-scalings, and six exponent arithmetic operations per sample.

In Common Exponent and Exponent Box Encoding, we need two exponent arithmetic operations for the multiplication and normalization of the complex block output. With Exponent Box Encoding in the worst case, we need eight more mantissa post-scalings. Also, the Shift Vectors allow for four possible intermediate exponent values instead of the one intermediate exponent value in Common Exponent Encoding.

III-C Complex Convolution

Let $\boldsymbol{X_{1}}$ $\in\mathbb{C}^{1\times N_{1}}$ , $\boldsymbol{X_{2}}$ $\in\mathbb{C}^{1\times N_{2}}$ , and $\boldsymbol{Y}$ $\in\mathbb{C}^{1\times(N_{1}+N_{2}-1)}$ be complex-valued row vectors, where $\ast$ denotes convolution, such that,

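$$\begin{aligned}
\Re\{\boldsymbol{Y}\} &= \Re\{\boldsymbol{X_{1}} \ast \boldsymbol{X_{2}}\} \\
\Im\{\boldsymbol{Y}\} &= \Im\{\boldsymbol{X_{1}} \ast \boldsymbol{X_{2}}\}
\end{aligned}$$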

We assume $N_{1}<N_{2}$ for the practical reason that the model of the channel impulse response is a shorter sequence than the discrete-time samples. Each term in the complex block output is a complex inner product of two complex block inputs of varying length between 1 and $\min\{N_{1},N_{2}\}$. Complex convolution is implemented as complex block multiplication and accumulation of intermediate results. We derive the processing complexity of mantissas and exponents in Appendix B.

IV System Model

We apply Exponent Box Encoding to represent IQ components in baseband QAM transmitter in Figure 5 and baseband QAM receiver in Figure 6. The simulated channel model is Additive White Gaussian Noise (AWGN). Table VI contains the parameter definitions and values used in MATLAB simulation and Table VII summarizes the memory input/output rates (bits/sec) and multiply-accumulate rates required by discrete-time complex QAM transmitter and receiver chains.
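To give a feel for the magnitudes in Table VII, the sketch below evaluates the filter rows with the Table VI values. Several inputs are assumptions rather than values stated in the paper: $N_{g}$ is taken to be the filter order 32, $N_{w}=32$ and $N_{e}=8$ follow the 32-bit format, $N_{l}=1$, and $N_{b}$ is assumed to be the 1-bit box shift field (called $N_{x}$ in Table III). The function names are hypothetical.

```python
def filter_mac_rate(upsample, n_g, f_sym):
    """MACs per second for the pulse shape or matched filter row of Table VII."""
    return upsample ** 2 * n_g * f_sym


def filter_write_rate(upsample, f_sym, n_w=32, n_l=1, n_b=1, n_e=8):
    """Memory write rate (bits/sec) for the filter rows of Table VII.

    All default bit widths are assumptions for the 32-bit case.
    """
    return 2 * upsample * f_sym * (n_w + n_l + n_b - n_e) + n_e


print(filter_mac_rate(4, 32, 2400))   # L = 4, N_g = 32 (assumed), f_sym = 2400 Hz
print(filter_write_rate(4, 2400))
```

Under these assumptions each filter performs 1,228,800 MACs per second and writes 499,208 bits per second; different choices for $N_{g}$ or $N_{b}$ change the numbers but not the structure of the expressions.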

IV-A Discrete-time Complex Baseband QAM Transmitter

We encode complex block IQ samples in Exponent Box Encoding and retain the floating-point resolution of 32-bit IEEE-754 precision in our model. For simplicity, we select the block size to be $N_{v}=L^{TX}f_{sym}$. The symbol mapper generates a complex block of $L^{TX}f_{sym}$ IQ samples that share a common exponent. The pulse shape filter is implemented as a Finite Impulse Response (FIR) filter of order $N^{TX}$ and requires complex convolution on the upsampled complex block IQ samples.

IV-B Discrete-time Complex Baseband QAM Receiver

In practice, channel effects such as fading give the received signals a larger span in magnitude-phase response. Common Exponent Encoding applied to sampled complex block IQ samples is limited to selecting a window size of minimum phase difference, and it must update its block size at the gain update rate of the Automatic Gain Control (AGC). Exponent Box Encoding lifts this constraint and allows a fixed block size, $N_{v}=L^{RX}f_{sym}$ in this simulation. We simulate a matched filter of order $N^{RX}$.

V Simulation Results

V-A Error Vector Magnitude on Complex Block (32-bit) ALU

Let $\boldsymbol{X},\boldsymbol{\bar{X}}\in\mathbb{C}^{1\times N}$ be complex-valued row vectors, such that $\boldsymbol{X}$ holds the reference results in IEEE-754 Encoding and $\boldsymbol{\bar{X}}$ holds the simulated results in Complex Block Encoding.

The signal quality is measured on the complex block arithmetic results. We truncate the arithmetic results to 32-bit precision to make a fair comparison. We use the Root-Mean-Squared (RMS) EVM measurement described below, with $\parallel\bullet\parallel_{2}$ as the Euclidean norm,

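$$EVM = \frac{\parallel \boldsymbol{X} - \boldsymbol{\bar{X}} \parallel_{2}}{\parallel \boldsymbol{X} \parallel_{2}} * 100$$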
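The RMS EVM defined here, the Euclidean norm of the error $\boldsymbol{X}-\boldsymbol{\bar{X}}$ divided by the Euclidean norm of the reference $\boldsymbol{X}$ and expressed in percent, can be computed directly. The following is a minimal standard-library sketch; the function name `evm_percent` is hypothetical.

```python
import math


def evm_percent(ref, test):
    """RMS EVM in percent between a reference block and a test block."""
    err = math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(ref, test)))
    norm = math.sqrt(sum(abs(x) ** 2 for x in ref))
    return err / norm * 100.0


ref = [complex(1.0, 0.0), complex(0.0, 1.0)]
print(evm_percent(ref, ref))                 # identical blocks give zero error
print(evm_percent([complex(3, 4)], [0j]))    # a fully lost sample gives 100 percent
```

An EVM of 100% corresponds to an error as large as the reference itself, which is what happens when an encoded mantissa is reduced to zero.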

Figure 7 shows the EVM of the complex block arithmetic in Section III for Inputs Ratio $\in(0,200)$ dB. In complex block addition, Exponent Box Encoding does not show a significant advantage over Common Exponent Encoding because mantissa addition emphasizes magnitude over phase. In complex block multiplication and convolution, Exponent Box Encoding achieves a significant reduction in encoding error over Common Exponent Encoding, particularly for Inputs Ratio $\in(70,140)$ dB, where the improvement is between $(0,99.999)\%$.

V-B Error Vector Magnitude on Single-Carrier Transceiver

Figure 8 shows the dynamic range of the Root-Raised Cosine (RRC) filter at the transmitter and receiver and the overall pulse shape response as a function of $\alpha$. Figure 9 shows the EVM introduced by Complex Block Encoding under the system model defined in Section IV. The EVM plot is indistinguishable between IEEE-754 Encoding and Complex Block Encoding. The reasons are the selection of the RRC roll-off factor and the energy-normalized constellation map.

VI Conclusion

Our work has identified the processing overhead of the mantissas and shared exponent in complex block floating-point arithmetic. The common exponent encoding would slightly lower the overhead in complex-valued arithmetic. The box encoding of the shared exponent gives the same quantization errors as common exponent encoding in our case study, which is a 32-bit complex baseband transmitter and receiver. Our work has also quantified memory read/write rates and multiply-accumulate rates in our case study. Future work could extend a similar approach to representing and processing IQ samples in multi-carrier and multi-antenna communication systems.

Appendix A Derivation of Maximum Exponent Difference Under Common Exponent Encoding Technique

Let $i,j$ be two bounded positive real numbers, representable in floating point precision. Assume that $i$ has larger magnitude than $j$ , $|j|<|i|$ . Define $E\{k\}$ as exponent and $M\{k\}$ as mantissa to $k$ , and $F(k)=2^{E\{k\}-1}-1$ as exponent offset, where $k=\{i,j\}$ . Let $E\{\Delta\}$ be the difference between two exponents, $(E\{i\}-E\{j\})>0$ .

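$$\begin{aligned}
j &< i \\
(1.M\{j\} * 2^{E\{j\} - F(j)}) &< (1.M\{i\} * 2^{E\{i\} - F(i)}) \\
(1.M\{j\} * 2^{E\{j\}}) &< (1.M\{i\} * 2^{E\{i\}}) \\
(1.M\{j\} * 2^{E\{j\} - E\{i\} + E\{i\}}) &< (1.M\{i\} * 2^{E\{i\}}) \\
(1.M\{j\} * 2^{E\{j\} - E\{i\}}) &< (1.M\{i\}) \\
(1.M\{j\} * 2^{-E\{\Delta\}}) &< (1.M\{i\}) \\
(0.M\{j^{\prime}\}) &< (1.M\{i\})
\end{aligned}$$

where $M\{j^{\prime}\} = \frac{1.M\{j\}}{2^{E\{\Delta\}}}$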

The mantissa bits in $M\{j^{\prime}\}$ are truncated in practice; therefore, $E\{\Delta\}$ must be less than $M\{j\}$. The quantization error is largest when $M\{j^{\prime}\}$ becomes zero while $M\{j\}$ is nonzero.

Appendix B Derivation of Pre / Post Processing Complexity of Complex-valued Convolution

Let $N_{mult}^{mant},N_{add}^{mant},N_{mult}^{exp},N_{add}^{exp}$ be processing complexity of mantissas and exponents determined in Section III.

The first and last $N_{1}$ terms of $\boldsymbol{Y}$ are computed by complex inner products of $i\in\{1,...,N_{1}\}$ input terms from $\boldsymbol{X_{1}},\boldsymbol{X_{2}}$ and require $\frac{(N_{1})(N_{1}+1)}{2}(N_{mult})$ and $\frac{(N_{1}-1)(N_{1})}{2}(N_{add})$. The central $N_{2}-N_{1}$ terms of $\boldsymbol{Y}$ are computed by complex inner products of $N_{1}$ input terms from $\boldsymbol{X_{1}},\boldsymbol{X_{2}}$ and require $(N_{2}-N_{1})((N_{1})(N_{mult})+(N_{1}-1)(N_{add}))$.

Overall Multiplication Requirement $(N_{mult})$ :

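$$\begin{aligned}
& \frac{1}{2}(N_{1})(N_{1}+1) + (N_{2}-N_{1})(N_{1}) + \frac{1}{2}(N_{1}-1)(N_{1}) \\
&= \frac{1}{2}(N_{1}^{2}+N_{1}) + (N_{2}N_{1}-N_{1}^{2}) + \frac{1}{2}(N_{1}^{2}-N_{1}) \\
&= \frac{1}{2}(2N_{1}^{2}) + (N_{2}N_{1}-N_{1}^{2}) \\
&= N_{1}^{2} + (N_{2}N_{1}-N_{1}^{2}) \\
&= N_{2}N_{1}
\end{aligned}$$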

Overall Addition Requirement $(N_{add})$ :

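$$\begin{aligned}
& \frac{1}{2}(N_{1}-1)(N_{1}) + (N_{2}-N_{1})(N_{1}-1) + \frac{1}{2}(N_{1}-2)(N_{1}-1) \\
&= \frac{1}{2}(N_{1}^{2}-N_{1}+N_{1}^{2}-3N_{1}+2) + (N_{2}-N_{1})(N_{1}-1) \\
&= (N_{1}^{2}-2N_{1}+1) + (N_{2}-N_{1})(N_{1}-1) \\
&= (N_{1}-1)(N_{1}-1) + (N_{2}-N_{1})(N_{1}-1) \\
&= (N_{1}-1)(N_{1}-1+N_{2}-N_{1}) \\
&= (N_{1}-1)(N_{2}-1)
\end{aligned}$$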

Mantissa processing requirement is $(N_{mult}^{mant})(N_{2}N_{1})+(N_{add}^{mant})(N_{1}-1)(N_{2}-1)$ and exponent processing requirement is $(N_{mult}^{exp})(N_{2}N_{1})+(N_{add}^{exp})(N_{1}-1)(N_{2}-1)$ .
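The closed forms $N_{2}N_{1}$ multiplications and $(N_{1}-1)(N_{2}-1)$ additions can be checked by counting the terms of a direct full convolution. The sketch below does this by brute force; the names `count_ops` and `processing_cost` are hypothetical, and the per-operation costs 6 and 4 used in the example are the mantissa coefficients of the Complex IEEE754 convolution row of Table V.

```python
def count_ops(n1, n2):
    """Count complex multiplies and adds in a direct full convolution."""
    mults = adds = 0
    for k in range(n1 + n2 - 1):              # one pass per output term of Y
        lo = max(0, k - (n2 - 1))             # first index of X1 that overlaps
        hi = min(k, n1 - 1)                   # last index of X1 that overlaps
        terms = hi - lo + 1                   # length of this inner product
        mults += terms
        adds += terms - 1
    return mults, adds


def processing_cost(n1, n2, per_mult, per_add):
    """Total pre/post processing cost given per-multiply and per-add costs."""
    mults, adds = count_ops(n1, n2)
    return per_mult * mults + per_add * adds


print(count_ops(3, 5))                 # compare with N1*N2 and (N1-1)*(N2-1)
print(processing_cost(3, 5, 6, 4))
```

For $N_{1}=3$ and $N_{2}=5$ the count gives 15 multiplications and 8 additions, matching $N_{2}N_{1}$ and $(N_{1}-1)(N_{2}-1)$, and a mantissa cost of $6\cdot 15+4\cdot 8=122$.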

Bibliography

[1] G. Fettweis and E. Zimmermann, “ICT energy consumption-trends and challenges,” in Proc. Int. Symposium on Wireless Personal Multimedia Communications, vol. 2, no. 4, 2008, p. 6.
[2] O. Blume, D. Zeller, and U. Barth, “Approaches to energy efficient wireless access networks,” in Int. Symposium on Communications, Control and Signal Processing, March 2010, pp. 1–5.
[3] D. Samardzija, J. Pastalan, M. Mac Donald, S. Walker, and R. Valenzuela, “Compressed Transport of Baseband Signals in Radio Access Networks,” IEEE Transactions on Wireless Communications, vol. 11, no. 9, pp. 3216–3225, September 2012.
[4] K. F. Nieman and B. L. Evans, “Time-domain compression of complex-baseband LTE signals for cloud radio access networks,” in Proc. IEEE Global Conference on Signal and Information Processing, Dec 2013, pp. 1198–1201.
[5] D. Peng-ren and Z. Can, “Compressed transport of baseband signals in cloud radio access networks,” in Proc. Int. Conf. Communications and Networking in China (CHINACOM), Aug 2014, pp. 484–489.
[6] L. S. Wong, G. E. Allen, and B. L. Evans, “Sonar data compression using non-uniform quantization and noise shaping,” in Asilomar Conference on Signals, Systems and Computers, Nov 2014, pp. 1895–1899.
[7] J. Choi, B. L. Evans, and A. Gatherer, “Space-time fronthaul compression of complex baseband uplink LTE signals,” in Proc. IEEE Int. Conference on Communications, May 2016, pp. 1–6.
[8] N. Cohen and S. Weiss, “Complex Floating Point A Novel Data Word Representation for DSP Processors,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 59, no. 10, pp. 2252–2262, Oct 2012.