The Combinatorics of Weighted Vector Compositions

Steffen Eger

arXiv:1704.04964·math.CO·August 28, 2018

TL;DR

This paper explores the combinatorial structure of weighted vector compositions, connecting them to probability, calculus, number theory, and asymptotic analysis, and proposing a new primality conjecture.

Contribution

It introduces the concept of weighted vector compositions, linking them to various mathematical areas and extending classical results with new conjectures.

Findings

01 Relations to sums of random vectors

02 Formulas for derivatives of composite functions

03 Congruence properties similar to binomial coefficients

Tables (1)

$𝐬$ -color	cross-and-dash	1-2 compositions
$(\begin{matrix} 2^{1} \\ 2^{1} \end{matrix}) + (\begin{matrix} 1^{1} \\ 1^{1} \end{matrix})$	$(\begin{matrix} \times, -, \times, \times \\ \times, -, \times, \times \end{matrix})$	$(\begin{matrix} 1 & 2 & 1 & 1 \\ 1 & 2 & 1 & 1 \end{matrix})$
$(\begin{matrix} 2^{1} \\ 2^{2} \end{matrix}) + (\begin{matrix} 1^{1} \\ 1^{1} \end{matrix})$	$(\begin{matrix} \times, -, \times, \times \\ -, \times, \times, \times \end{matrix})$	$(\begin{matrix} 1 & 2 & 1 & 1 \\ 2 & 1 & 1 & 1 \end{matrix})$
$(\begin{matrix} 2^{2} \\ 2^{1} \end{matrix}) + (\begin{matrix} 1^{1} \\ 1^{1} \end{matrix})$	$(\begin{matrix} -, \times, \times, \times \\ \times, -, \times, \times \end{matrix})$	$(\begin{matrix} 2 & 1 & 1 & 1 \\ 1 & 2 & 1 & 1 \end{matrix})$
$(\begin{matrix} 2^{2} \\ 2^{2} \end{matrix}) + (\begin{matrix} 1^{1} \\ 1^{1} \end{matrix})$	$(\begin{matrix} -, \times, \times, \times \\ -, \times, \times, \times \end{matrix})$	$(\begin{matrix} 2 & 1 & 1 & 1 \\ 2 & 1 & 1 & 1 \end{matrix})$
$(\begin{matrix} 2^{1} \\ 1^{1} \end{matrix}) + (\begin{matrix} 1^{1} \\ 2^{1} \end{matrix})$	$(\begin{matrix} \times, -, \times, \times \\ \times, \times, \times, - \end{matrix})$	$(\begin{matrix} 1 & 2 & 1 & 1 \\ 1 & 1 & 1 & 2 \end{matrix})$
$(\begin{matrix} 2^{2} \\ 1^{1} \end{matrix}) + (\begin{matrix} 1^{1} \\ 2^{1} \end{matrix})$	$(\begin{matrix} -, \times, \times, \times \\ \times, \times, \times, - \end{matrix})$	$(\begin{matrix} 2 & 1 & 1 & 1 \\ 1 & 1 & 1 & 2 \end{matrix})$
$(\begin{matrix} 2^{1} \\ 1^{1} \end{matrix}) + (\begin{matrix} 1^{1} \\ 2^{2} \end{matrix})$	$(\begin{matrix} \times, -, \times, \times \\ \times, \times, -, \times \end{matrix})$	$(\begin{matrix} 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 1 \end{matrix})$
$(\begin{matrix} 2^{2} \\ 1^{1} \end{matrix}) + (\begin{matrix} 1^{1} \\ 2^{2} \end{matrix})$	$(\begin{matrix} -, \times, \times, \times \\ \times, \times, -, \times \end{matrix})$	$(\begin{matrix} 2 & 1 & 1 & 1 \\ 1 & 1 & 2 & 1 \end{matrix})$

Equations (241)

$$f\bigl((1,1)\bigr)=2,\quad f\bigl((1,0)\bigr)=1,\quad f\bigl((0,1)\bigr)=1$$

$$\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}1\\1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix}^{\Diamond}\Bigr],\quad\Bigl[\begin{pmatrix}1\\1\end{pmatrix}^{\Diamond}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],$$

$$\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\Bigr]$$

$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}}f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k}).$$

$$S(\boldsymbol{\ell})=\{\mathbf{s}\in\mathbb{N}^{N}\,|\,\mathbf{s}\neq\mathbf{0},\ 0\leq s_{j}\leq\ell_{j},\ j=1,\ldots,N\}$$

$$\mathcal{P}(\boldsymbol{\ell};k)=\Bigl\{(r_{1},r_{2},\ldots)\,\Big|\,r_{i}\geq 0,\ \sum_{i\geq 1}r_{i}=k,\ \sum_{i\geq 1}r_{i}\mathbf{s}_{i}=\boldsymbol{\ell}\Bigr\}$$

$$\mathcal{P}(\boldsymbol{\ell})=\Bigl\{(r_{1},r_{2},\ldots)\,\Big|\,r_{i}\geq 0,\ \sum_{i\geq 1}r_{i}\mathbf{s}_{i}=\boldsymbol{\ell}\Bigr\}$$

$$P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]=\sum_{\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}}f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k})=\binom{k}{\boldsymbol{\ell}}_{f}.$$

$$P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]=\Bigl(\frac{1}{|S|}\Bigr)^{k}\binom{k}{\boldsymbol{\ell}}_{g_{S}}$$

$$(2\pi)^{-N/2}\,|\mathbf{\Sigma}_{k}|^{-1/2}\exp\Bigl(-\tfrac{1}{2}(\boldsymbol{\ell}-\boldsymbol{\mu}_{k})^{\intercal}\mathbf{\Sigma}_{k}^{-1}(\boldsymbol{\ell}-\boldsymbol{\mu}_{k})\Bigr)$$

$$|\mathbf{\Sigma}|=\prod_{j=1}^{N}\frac{(\nu_{j}+1)^{2}-1}{12}.$$

$$\binom{k}{k\boldsymbol{\mu}}_{g_{S}}\sim\frac{\bigl(\prod_{j=1}^{N}(\nu_{j}+1)\bigr)^{k}}{(2\pi)^{N/2}\,k^{N/2}\sqrt{|\mathbf{\Sigma}|}}.$$

$$P[x=0]=\tfrac{1}{3},\quad P[x=1]=\tfrac{2}{3},\quad P[xy=0]=\tfrac{2}{3},\quad P[xy=1]=\tfrac{1}{3}.$$

$$\mathbf{\Sigma}=\begin{pmatrix}\frac{2}{9}&-\frac{1}{9}\\-\frac{1}{9}&\frac{2}{9}\end{pmatrix},\quad\boldsymbol{\mu}=\bigl(\tfrac{2}{3},\tfrac{2}{3}\bigr).$$

$$\binom{k}{k\boldsymbol{\mu}}_{g_{S}}\sim\frac{3^{k}}{2\pi k\sqrt{\frac{1}{27}}}=\frac{3^{k+1}}{2\pi k\sqrt{\frac{1}{3}}}$$

$$\binom{15}{(10,10)}_{g_{S}}=756{,}756,$$

$$\binom{18}{(12,12)}_{g_{S}}=17{,}153{,}136,$$

$$F(\mathbf{x};k)=\Bigl(\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}\Bigr)^{k},$$

$$\sum_{\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}}f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k}),$$

$$\binom{k}{\boldsymbol{\ell}}_{f}$$

$$\binom{k}{\boldsymbol{\ell}}_{f}$$

$$\boldsymbol{\ell}\binom{k}{\boldsymbol{\ell}}_{f}$$

$$\binom{k}{\boldsymbol{\ell}}_{f}$$

$$\operatorname{E}[T_{i}\,|\,T_{k}=\boldsymbol{\ell}]=\frac{\boldsymbol{\ell}}{k}\,i,$$

$$\operatorname{E}[T_{i}\,|\,T_{k}=\boldsymbol{\ell}]=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{s}\,\frac{P[T_{i}=\mathbf{s},T_{k}=\boldsymbol{\ell}]}{P[T_{k}=\boldsymbol{\ell}]}=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{s}\,\frac{P[T_{i}=\mathbf{s}]\cdot P[T_{k-i}=\boldsymbol{\ell}-\mathbf{s}]}{P[T_{k}=\boldsymbol{\ell}]}.$$

$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})\binom{k-1}{\boldsymbol{\ell}-\mathbf{s}}_{f},$$

$$\binom{k}{\mathbf{0}}_{f}$$

$$\binom{1}{\mathbf{x}}_{f}$$

$$\binom{0}{\mathbf{x}}_{f}$$

$$[\mathbf{z}^{\mathbf{s}}](G\circ F)(\mathbf{z})=\sum_{n\geq 0}g_{n}\binom{n}{\mathbf{s}}_{f}$$

$$\begin{aligned}(G\circ F)(\mathbf{z})&=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\Bigl(\sum_{n\geq 0}g_{n}\binom{n}{\mathbf{s}}_{f}\Bigr)=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\sum_{n\geq 0}g_{n}\sum_{\pi\in\mathcal{C}(\mathbf{s};n)}f_{\mathbf{m}_{1}}\cdots f_{\mathbf{m}_{n}}\\&=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\sum_{n\geq 0}\sum_{\pi\in\mathcal{C}(\mathbf{s};n)}g_{n}f_{\mathbf{m}_{1}}\cdots f_{\mathbf{m}_{n}}=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\sum_{\pi\in\mathcal{C}(\mathbf{s})}g_{|\pi|}f_{\pi}\end{aligned}$$

$$\frac{1}{\boldsymbol{\ell}!}\frac{\partial^{||\boldsymbol{\ell}||}H(\mathbf{0})}{\partial\mathbf{z}^{\boldsymbol{\ell}}}=[\mathbf{z}^{\boldsymbol{\ell}}]H(\mathbf{z}),$$

Full text

The Combinatorics of Weighted Vector Compositions

Steffen Eger

Department of Computer Science

Technical University Darmstadt

Abstract

A vector composition of a vector $\boldsymbol{\ell}$ is a matrix $\mathbf{A}$ whose rows sum to $\boldsymbol{\ell}$. We define a weighted vector composition as a vector composition in which the column values of $\mathbf{A}$ may appear in different colors. We study vector compositions from different viewpoints: (1) We show how they are related to sums of random vectors and (2) how they allow us to derive formulas for partial derivatives of composite functions. (3) We study congruence properties of the number of weighted vector compositions, for a fixed and for an arbitrary number of parts, many of which are analogous to those of ordinary binomial coefficients and related quantities. Via the Central Limit Theorem and their multivariate generating functions, (4) we also investigate the asymptotic behavior of several special cases of numbers of weighted vector compositions. Finally, (5) we conjecture an extension of a primality criterion due to Mann and Shanks [28] in the context of weighted vector compositions.

1 Introduction

An integer composition (ordered partition) of a nonnegative integer $n$ is a tuple $(\pi_{1},\ldots,\pi_{k})$ of nonnegative integers whose sum is $n$ . The $\pi_{i}$ ’s are called the parts of the composition. For fixed number $k$ of parts, the number of $f$ -weighted integer compositions—also called $f$ -colored integer compositions in the literature—in which each part size $s$ may occur in $f(s)$ different colors, is given by the extended binomial coefficient $\binom{k}{n}_{f}$ [12].

We generalize here the notion of weighted integer compositions to weighted vector compositions. For a vector $\boldsymbol{\ell}\in\mathbb{N}^{N}$ , for $N\geq 1$ , a vector composition [4] of $\boldsymbol{\ell}$ with $k$ parts is a matrix $\mathbf{A}=[\mathbf{m}_{1},\ldots,\mathbf{m}_{k}]\in\mathbb{N}^{N\times k}$ such that $\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}$ . We call a vector composition $f$ -weighted, for a function $f:\mathbb{N}^{N}\rightarrow\mathbb{N}$ , when each part of ‘size’ $\mathbf{m}$ may occur in one of $f(\mathbf{m})$ different colors in the composition. For example, for $N=2$ and $f$ :

$$f\bigl((1,1)\bigr)=2,\quad f\bigl((1,0)\bigr)=1,\quad f\bigl((0,1)\bigr)=1,$$

and $f(\mathbf{x})=0$ for all other $\mathbf{x}\in\mathbb{N}^{2}$ , there are seven distinct $f$ -weighted vector compositions of $\boldsymbol{\ell}=(1,2)$ , namely:

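$$\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}1\\1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\1\end{pmatrix}^{\Diamond}\Bigr],\quad\Bigl[\begin{pmatrix}1\\1\end{pmatrix}^{\Diamond}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],$$

$$\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],\quad\Bigl[\begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}\Bigr],$$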

where $\Diamond$ distinguishes between the two values of $(1,1)$ . For fixed number $k\geq 0$ of parts, we denote the number of distinct $f$ -weighted vector compositions of $\boldsymbol{\ell}\in\mathbb{N}^{N}$ by $\binom{k}{\boldsymbol{\ell}}_{f}$ . Moreover, the number $c_{f}(\boldsymbol{\ell})$ of $f$ -weighted vector compositions with arbitrarily many parts is then given by $c_{f}(\boldsymbol{\ell})=\sum_{k\geq 0}\binom{k}{\boldsymbol{\ell}}_{f}$ .

The number of $f$ -weighted vector compositions with $k$ parts may be represented as

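$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}}f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k}).\tag{1}$$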

When the function $f$ takes values in $\mathbb{R}$ (or even in a commutative ring), the right-hand side of Eq. (1) gives the total weight of all vector compositions of $\boldsymbol{\ell}$ with $k$ parts, where we define the weight of a composition $[\mathbf{m}_{1},\ldots,\mathbf{m}_{k}]$ as $f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k})$.
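The definition above can be checked directly on the introductory example. The following is a minimal brute-force sketch, not taken from the paper: the function name `count_compositions` and the dictionary encoding of $f$ are illustrative assumptions. It enumerates all $k$-tuples of part sizes with non-zero weight and adds up the weights of those that sum to $\boldsymbol{\ell}$.

```python
from itertools import product
from math import prod

def count_compositions(f, ell, k):
    """Total weight of all k-part vector compositions of ell, by brute force.

    f is a dict mapping part sizes (tuples) to weights; missing keys mean weight 0.
    (Hypothetical helper, used here for illustration only.)
    """
    ell = tuple(ell)
    if k == 0:
        # The empty composition sums to the zero vector only.
        return 1 if not any(ell) else 0
    support = [s for s, w in f.items() if w != 0]
    total = 0
    for parts in product(support, repeat=k):
        # Sum the chosen parts coordinate by coordinate.
        if tuple(sum(col) for col in zip(*parts)) == ell:
            total += prod(f[m] for m in parts)
    return total

# Weights from the introductory example (N = 2).
f = {(1, 1): 2, (1, 0): 1, (0, 1): 1}
ell = (1, 2)

# Since f(0) = 0 here, no composition of ell has more than sum(ell) parts.
by_parts = {k: count_compositions(f, ell, k) for k in range(sum(ell) + 1)}
print(by_parts)                # {0: 0, 1: 0, 2: 4, 3: 3}
print(sum(by_parts.values()))  # 7
```

The four compositions with two parts and the three with three parts add up to the seven compositions listed in the example. The enumeration is exponential in $k$ and is only meant for small sanity checks.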

We study $f$ -weighted vector compositions from several viewpoints. Section 2 relates weighted vector compositions to sums of random vectors. Section 3 introduces basic identities for $\binom{k}{\boldsymbol{\ell}}_{f}$ which will be used in follow-up results. Section 4 derives a formula for partial derivatives of composite functions using these identities. Our formula generalizes the famous formula of Faà di Bruno (see [26]) for the higher order derivatives of a composite function. Section 5 gives divisibility properties of $\binom{k}{\boldsymbol{\ell}}_{f}$ and in Section 6, we derive congruences and identities for sums of $\binom{k}{\boldsymbol{\ell}}_{f}$ , including $c_{f}(\boldsymbol{\ell})$ . Our results in these two sections generalize corresponding results from [16, 41] for weighted integer compositions, and others for ordinary binomial coefficients. We also generalize here the notion of so-called $s$ -color compositions in which a part of size $s$ may occur in $s$ different colors in a composition [2]. We discuss asymptotics of weighted vector compositions in Section 7 and the primality criterion of Mann and Shanks [28] in the context of weighted vector compositions in Section 8.

In the rest of this work, we use the following notation and definitions. We write vectors and matrices in bold font ( $\mathbf{x},\boldsymbol{\ell},\ldots$ ) to distinguish them from ‘scalars’ ( $k,n,\ldots$ ). We write vectors as row vectors ( $(x,y,z),\ldots$ ). We write the components of a vector $\mathbf{x}$ as $x_{1},x_{2},\ldots$ and similarly for matrices. We use the standard notation, $\binom{k}{n}$ , for ordinary binomial coefficients, which are a special case of our setup. They are retrieved when $N=1$ and $f(x)$ is the indicator function on $\{0,1\}$ , that is, $f(x)=1$ for $x\in\{0,1\}$ and $f(x)=0$ for all other $x$ .

We let $\mathbb{N}=\{0,1,2,\ldots\}$ be the set of nonnegative integers. Let $k\geq 0,N\geq 1$ and let $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Let $\mathbf{0}=\mathbf{0}_{N}=(0,\ldots,0)\in\mathbb{N}^{N}$ and let $\mathbf{1}=\mathbf{1}_{N}=(1,\ldots,1)\in\mathbb{N}^{N}$ . Let

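$$S(\boldsymbol{\ell})=\{\mathbf{s}\in\mathbb{N}^{N}\,|\,\mathbf{s}\neq\mathbf{0},\ 0\leq s_{j}\leq\ell_{j},\ j=1,\ldots,N\}$$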

be the set of all non-zero part sizes in $\mathbb{N}^{N}$ ‘bounded from above’ by $\boldsymbol{\ell}$ , and let $S_{\mathbf{0}}(\boldsymbol{\ell})=S(\boldsymbol{\ell})\cup\{\mathbf{0}\}$ . Let the elements in $S(\boldsymbol{\ell})$ or $S_{\mathbf{0}}(\boldsymbol{\ell})$ be enumerated as $\mathbf{s}_{1},\mathbf{s}_{2},\ldots$ . We denote by $\mathcal{P}^{(\mathbf{S_{\mathbf{0}}(\boldsymbol{\ell})})}(\boldsymbol{\ell};k)=\mathcal{P}(\boldsymbol{\ell};k)$ the set

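$$\mathcal{P}(\boldsymbol{\ell};k)=\Bigl\{(r_{1},r_{2},\ldots)\,\Big|\,r_{i}\geq 0,\ \sum_{i\geq 1}r_{i}=k,\ \sum_{i\geq 1}r_{i}\mathbf{s}_{i}=\boldsymbol{\ell}\Bigr\}$$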

of vector partitions (unordered compositions) of $\boldsymbol{\ell}$ with $k$ parts, including part size $\mathbf{0}$ . Here, $r_{1},r_{2},\ldots$ are the multiplicities of the part sizes $\mathbf{s}_{1},\mathbf{s}_{2},\ldots$ . We similarly define $\mathcal{P}^{(\mathbf{S(\boldsymbol{\ell})})}(\boldsymbol{\ell};k)$ as the set of vector partitions of $\boldsymbol{\ell}$ with $k$ parts, excluding $\mathbf{0}$ . We write $\mathcal{P}^{(\mathbf{S_{\mathbf{0}}(\boldsymbol{\ell})})}(\boldsymbol{\ell})=\mathcal{P}(\boldsymbol{\ell})$ for the set of vector partitions, part size $\mathbf{0}$ included, of $\boldsymbol{\ell}$ with arbitrary number of parts:

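$$\mathcal{P}(\boldsymbol{\ell})=\Bigl\{(r_{1},r_{2},\ldots)\,\Big|\,r_{i}\geq 0,\ \sum_{i\geq 1}r_{i}\mathbf{s}_{i}=\boldsymbol{\ell}\Bigr\}$$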

and analogously for $\mathcal{P}^{(\mathbf{S(\boldsymbol{\ell})})}(\boldsymbol{\ell})$. (Footnote 1: When it is clear from context whether $\mathbf{0}$ is included or not, we may also write $\mathcal{P}(\boldsymbol{\ell})$ for both $\mathcal{P}^{(\mathbf{S(\boldsymbol{\ell})})}(\boldsymbol{\ell})$ and $\mathcal{P}^{(\mathbf{S_{\mathbf{0}}(\boldsymbol{\ell})})}(\boldsymbol{\ell})$, and similarly for related quantities.) For a scalar $a$ and a vector $\mathbf{b}$, we write $a|\mathbf{b}$ when $a|b_{i}$ for all components $b_{i}$ of $\mathbf{b}$.
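To make the sets $S(\boldsymbol{\ell})$ and $\mathcal{P}^{(\mathbf{S(\boldsymbol{\ell})})}(\boldsymbol{\ell};k)$ concrete, here is a small enumeration sketch. The helper names `part_sizes` and `vector_partitions` are assumptions made for this illustration, and partitions are returned as sorted tuples of parts rather than as the multiplicity vectors $(r_{1},r_{2},\ldots)$ used in the definition; the two encodings carry the same information.

```python
from itertools import combinations_with_replacement, product

def part_sizes(ell):
    """S(ell): all non-zero vectors s with 0 <= s_j <= ell_j for every j."""
    ranges = [range(x + 1) for x in ell]
    return [s for s in product(*ranges) if any(s)]

def vector_partitions(ell, k):
    """Unordered k-part partitions of ell into non-zero parts, as sorted tuples."""
    ell = tuple(ell)
    if k == 0:
        # Only the zero vector has a partition with no parts.
        return [()] if not any(ell) else []
    return [parts
            for parts in combinations_with_replacement(part_sizes(ell), k)
            if tuple(sum(col) for col in zip(*parts)) == ell]

print(part_sizes((1, 2)))
# [(0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(vector_partitions((1, 2), 2))
# [((0, 1), (1, 1)), ((0, 2), (1, 0))]
```

For $\boldsymbol{\ell}=(1,2)$ this yields five part sizes and, over all $k$, four vector partitions into non-zero parts.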

Background: Weighted vector compositions generalize the concept of vector compositions introduced in Andrews [4]. In fact, vector compositions are $f$ -weighted vector compositions for which $f=f_{S_{0}}$ is the indicator function on $S_{0}=\mathbb{N}^{N}-\{\mathbf{0}_{N}\}$ . For the same $f_{S_{0}}$ , Munarini et al. [31] introduce matrix compositions. These are matrices whose entries sum to a positive integer $n$ and whose columns are non-zero. We find that the number $c^{(N)}(n)$ of matrix compositions of $n$ for matrices with $N$ rows satisfies $c^{(N)}(n)=\sum_{\ell_{1}+\cdots+\ell_{N}=n}c_{f_{S_{0}}}(\ell_{1},\ldots,\ell_{N})$ .

Vector compositions are also closely related to lattice path combinatorics [44]. Lattice paths are paths from the origin $\mathbf{0}_{N}$ to some point $\boldsymbol{\ell}=(\ell_{1},\ldots,\ell_{N})\in\mathbb{N}^{N}$ where each step lies in some set $S$ . In our case, each coordinate of each step $\mathbf{s}\in S$ is nonnegative. Vector compositions also generalize the concept of alignments considered in computational biology and computational linguistics [20]. For example, the number of (standard) alignments of $N$ sequences of lengths $\boldsymbol{\ell}=(\ell_{1},\ldots,\ell_{N})$ is given by $c_{f_{S_{1}}}(\boldsymbol{\ell})$ , where $f_{S_{1}}$ is the indicator function on $S_{1}=\{(s_{1},\ldots,s_{N})\,|\,s_{i}\in\{0,1\}\}-\{\mathbf{0}_{N}\}$ . When $f_{S}$ is the indicator function on more ‘complex’ $S\subseteq\mathbb{N}^{N}$ , $c_{f_{S}}$ counts “many-to-many” alignments [15].

Weighted integer compositions, that is, the case when $N=1$ , go back to [29] and [46, 47]. Recently, they have attracted attention in the form of so-called $s$ -color compositions, for which $f$ is specified as identity function, that is, $f(s)=s$ [2, 21, 32, 39, 41]. More general $f$ have been considered in [1, 7, 12, 17, 16, 25, 40], to name just a few. Results on standard integer compositions, i.e., where $f$ is the indicator function on $\mathbb{N}-\{0\}$ or a subset thereof, are found in [23].

2 Relation to multivariate random variables

Let $X_{1},X_{2},\ldots$ be i.i.d. discrete random vectors with common distribution function $f(\mathbf{x})=P[X=\mathbf{x}]$ , for $\mathbf{x}\in\mathbb{N}^{N}$ . Then the distribution of the sum $X_{1}+X_{2}+\cdots+X_{k}$ is given by

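$$P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]=\sum_{\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}}f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k})=\binom{k}{\boldsymbol{\ell}}_{f}.$$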

Let $f$ be the discrete uniform measure on some $S\subseteq\mathbb{N}^{N}$ . Then

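$$P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]=\Bigl(\frac{1}{|S|}\Bigr)^{k}\binom{k}{\boldsymbol{\ell}}_{g_{S}},$$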

where $g_{S}$ is the indicator function on $S$ . Thus $\binom{k}{\boldsymbol{\ell}}_{g_{S}}=|S|^{k}P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]$ . Moreover, $P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]$ may be approximated by the multivariate normal distribution according to the multivariate Central Limit Theorem (CLT). That is, for large $k$ , $P[X_{1}+\cdots+X_{k}=\boldsymbol{\ell}]$ can be approximated by the density

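$$(2\pi)^{-N/2}\,|\mathbf{\Sigma}_{k}|^{-1/2}\exp\Bigl(-\tfrac{1}{2}(\boldsymbol{\ell}-\boldsymbol{\mu}_{k})^{\intercal}\mathbf{\Sigma}_{k}^{-1}(\boldsymbol{\ell}-\boldsymbol{\mu}_{k})\Bigr),$$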

where $\boldsymbol{\mu}_{k}=k\boldsymbol{\mu}$ and $\mathbf{\Sigma}_{k}=k\mathbf{\Sigma}$ are the mean vector and covariance matrix of $X_{1}+\cdots+X_{k}$, respectively. Here, $\boldsymbol{\mu}$ is the mean vector of each $X_{i}$, $\mathbf{\Sigma}$ is the covariance matrix among the components of $X_{i}$, and $|\mathbf{\Sigma}|$ denotes its determinant.

Example 2.1.

Let $S=\prod_{j=1}^{N}\{0,1,\ldots,\nu_{j}\}$, for integers $\nu_{j}>0$. Let $X_{i}$ be uniformly distributed on $S$, for all $i=1,\ldots,k$. We have $\boldsymbol{\mu}=\operatorname{E}[X_{i}]=(\nu_{1}/2,\ldots,\nu_{N}/2)$. Since the components of $X_{i}$ are independent of each other and since the variance of each component $j$ of $X_{i}$ is given by $\frac{(\nu_{j}+1)^{2}-1}{12}$ (the variance of a uniformly distributed random variable on $\{0,\ldots,\nu_{j}\}$), we find that

$$|\mathbf{\Sigma}|=\prod_{j=1}^{N}\frac{(\nu_{j}+1)^{2}-1}{12}.$$

This leads to the approximation

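$$\binom{k}{k\boldsymbol{\mu}}_{g_{S}}\sim\frac{\bigl(\prod_{j=1}^{N}(\nu_{j}+1)\bigr)^{k}}{(2\pi)^{N/2}\,k^{N/2}\sqrt{|\mathbf{\Sigma}|}}.$$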

When $N=1$ and $\nu_{1}=1$ we obtain the well-known approximation $\frac{2^{k+1}}{\sqrt{2\pi k}}$ for the central binomial coefficient $\binom{k}{k/2}$ .
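As a quick numerical sanity check of the case $N=1$, $\nu_{1}=1$, the sketch below compares the estimate $2^{k+1}/\sqrt{2\pi k}$ with the exact central binomial coefficient. The function name `central_binomial_estimate` is an illustrative assumption; for this case $|S|=2$ and $|\mathbf{\Sigma}|=\frac{1}{4}$, so the density at the mean is $(2\pi k)^{-1/2}\cdot 2$ and the estimate is $2^{k}$ times that value.

```python
from math import comb, pi, sqrt

def central_binomial_estimate(k):
    """CLT-based estimate 2^(k+1) / sqrt(2*pi*k) for binom(k, k/2), k even."""
    return 2 ** (k + 1) / sqrt(2 * pi * k)

for k in (10, 100, 1000):
    exact = comb(k, k // 2)
    # The ratio estimate / exact should approach 1 as k grows.
    print(k, central_binomial_estimate(k) / exact)
```

The printed ratios move toward 1 as $k$ increases, which is the behavior the approximation predicts.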

Example 2.2.

Let $S=\{(0,1),(1,0),(1,1)\}$ . Let $X_{i}=(x,y)$ be uniformly distributed on $S$ . We have

$$P[x=0]=\tfrac{1}{3},\quad P[x=1]=\tfrac{2}{3},\quad P[xy=0]=\tfrac{2}{3},\quad P[xy=1]=\tfrac{1}{3}.$$

Therefore $\operatorname{Cov}(x,y)=E[xy]-E[x]E[y]=1/3-(2/3)^{2}=3/9-4/9=-\frac{1}{9}$ . Moreover $\operatorname{Var}(x)=\frac{2}{9}$ and thus

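$$\mathbf{\Sigma}=\begin{pmatrix}\frac{2}{9}&-\frac{1}{9}\\-\frac{1}{9}&\frac{2}{9}\end{pmatrix},\quad\boldsymbol{\mu}=\bigl(\tfrac{2}{3},\tfrac{2}{3}\bigr).$$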

Hence:

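$$\binom{k}{k\boldsymbol{\mu}}_{g_{S}}\sim\frac{3^{k}}{2\pi k\sqrt{\frac{1}{27}}}=\frac{3^{k+1}}{2\pi k\sqrt{\frac{1}{3}}}.$$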

For example, we have

$$\binom{15}{(10,10)}_{g_{S}}=756{,}756,$$

while the approximation formula yields $791,096.70\ldots$ , which amounts to a relative error of less than 5%. Analogously,

$$\binom{18}{(12,12)}_{g_{S}}=17{,}153{,}136,$$

while the approximation formula yields $17,799,675.85\ldots$ , which amounts to a relative error of less than 4%.
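The numbers in Example 2.2 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: `exact_count` and `clt_estimate` are hypothetical names, and the exact count uses the observation that a composition of $(\ell_{1},\ell_{2})$ into $k$ parts from $S=\{(0,1),(1,0),(1,1)\}$ must use $(1,1)$ exactly $\ell_{1}+\ell_{2}-k$ times, $(1,0)$ exactly $k-\ell_{2}$ times, and $(0,1)$ exactly $k-\ell_{1}$ times, so the count is a single multinomial coefficient.

```python
from math import factorial, pi, sqrt

def exact_count(k, ell):
    """Number of k-part compositions of ell with parts in {(0,1), (1,0), (1,1)}."""
    l1, l2 = ell
    # Multiplicities of (1,1), (1,0) and (0,1), forced by ell and k.
    a, b, c = l1 + l2 - k, k - l2, k - l1
    if min(a, b, c) < 0:
        return 0
    return factorial(k) // (factorial(a) * factorial(b) * factorial(c))

def clt_estimate(k):
    """The estimate 3^(k+1) / (2*pi*k*sqrt(1/3)) at ell = k * (2/3, 2/3)."""
    return 3 ** (k + 1) / (2 * pi * k * sqrt(1 / 3))

for k, ell in ((15, (10, 10)), (18, (12, 12))):
    exact = exact_count(k, ell)
    approx = clt_estimate(k)
    print(k, exact, round(approx, 2), f"{(approx - exact) / exact:.2%}")
```

The exact values agree with $756{,}756$ and $17{,}153{,}136$, and the relative errors come out at roughly 4.5% and 3.8%, in line with the bounds stated in the example.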

The idea of deriving asymptotics of coefficients via the CLT, which underlies the approximations above, has been developed in different works such as [14, 49]; see [30] for a survey. While such results can also be obtained via singularity or saddle point analysis of the generating function for $\binom{k}{\boldsymbol{\ell}}_{f}$ [18, 36], using the CLT with suitably defined random variables is an alternative that may guarantee additional desirable properties such as uniform convergence [33].

3 Basic identities

In the sequel, we write $\mathbf{x}^{\mathbf{s}}$ for $x_{1}^{s_{1}}\cdots x_{N}^{s_{N}}$ where $\mathbf{x}=(x_{1},\ldots,x_{N})$ and $\mathbf{s}=(s_{1},\ldots,s_{N})$ .

For $k\geq 0$ and $\ell_{1},\ldots,\ell_{N}\geq 0$ and $f:\mathbb{N}^{N}\rightarrow\mathbb{R}$ , consider the coefficient of $\mathbf{x}^{\boldsymbol{\ell}}=x_{1}^{\ell_{1}}\cdots x_{N}^{\ell_{N}}$ of the power series $F$ in the variables $x_{1},\ldots,x_{N}$ , where:

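$$F(\mathbf{x};k)=\Bigl(\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}\Bigr)^{k},\tag{2}$$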

and denote it by $[\mathbf{x}^{\boldsymbol{\ell}}]F(\mathbf{x};k)$. Our first theorem states that $[\mathbf{x}^{\boldsymbol{\ell}}]F(\mathbf{x};k)$ equals the quantity we are investigating in this work, the number of $f$-weighted vector compositions of $\boldsymbol{\ell}$ (with a fixed number, $k$, of parts). Therefore $F(\mathbf{x};k)$ is the generating function for $\binom{k}{\boldsymbol{\ell}}_{f}$.

Theorem 3.1.

We have that $[\mathbf{x}^{\boldsymbol{\ell}}]F(\mathbf{x};k)=\binom{k}{\boldsymbol{\ell}}_{f}$ .

Proof.

Collecting terms in (2), we see that $[\mathbf{x}^{\boldsymbol{\ell}}]F(\mathbf{x};k)$ is given as

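$$\sum_{\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}}f(\mathbf{m}_{1})\cdots f(\mathbf{m}_{k}),\tag{3}$$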

where the sum is over all nonnegative vector solutions to $\mathbf{m}_{1}+\cdots+\mathbf{m}_{k}=\boldsymbol{\ell}$ . Using (1) proves the theorem. ∎
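Theorem 3.1 suggests a simple way to compute $\binom{k}{\boldsymbol{\ell}}_{f}$ for finitely supported $f$: raise the polynomial $\sum_{\mathbf{s}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}$ to the $k$-th power and read off a coefficient. The following sketch assumes a sparse dictionary representation of multivariate polynomials (exponent tuple mapped to coefficient); `poly_mul` and `poly_pow` are illustrative names, not part of the paper.

```python
from collections import defaultdict

def poly_mul(p, q):
    """Multiply two sparse multivariate polynomials {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for s, a in p.items():
        for t, b in q.items():
            out[tuple(x + y for x, y in zip(s, t))] += a * b
    return dict(out)

def poly_pow(p, k, n_vars):
    """k-th power of p by repeated multiplication, starting from the constant 1."""
    result = {(0,) * n_vars: 1}
    for _ in range(k):
        result = poly_mul(result, p)
    return result

# Generating polynomial of the introductory example: 2*x*y + x + y.
f = {(1, 1): 2, (1, 0): 1, (0, 1): 1}

print(poly_pow(f, 2, 2).get((1, 2), 0))  # 4
print(poly_pow(f, 3, 2).get((1, 2), 0))  # 3
```

The coefficients of $x_{1}x_{2}^{2}$ in the second and third powers are 4 and 3, matching the seven compositions of $(1,2)$ listed in the introduction.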

Next, we list four identities for $\binom{k}{\boldsymbol{\ell}}_{f}$ which we will make use of in the proofs of (divisibility) properties of the number of vector compositions later on.

Theorem 3.2.

Let $k\geq 0$ and $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Then, the following hold:

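$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{(r_{1},r_{2},\ldots)\in\mathcal{P}(\boldsymbol{\ell};k)}\binom{k}{r_{1},r_{2},\ldots}\prod_{i\geq 1}f(\mathbf{s}_{i})^{r_{i}}\tag{4}$$

$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{\mathbf{q}_{1}+\cdots+\mathbf{q}_{r}=\boldsymbol{\ell}}\binom{k_{1}}{\mathbf{q}_{1}}_{f}\cdots\binom{k_{r}}{\mathbf{q}_{r}}_{f}\tag{5}$$

$$\boldsymbol{\ell}\binom{k}{\boldsymbol{\ell}}_{f}=\frac{k}{i}\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{s}\binom{i}{\mathbf{s}}_{f}\binom{k-i}{\boldsymbol{\ell}-\mathbf{s}}_{f}\tag{6}$$

$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{i=0}^{k}f(\mathbf{m})^{i}\binom{k}{i}\binom{k-i}{\boldsymbol{\ell}-i\mathbf{m}}_{f_{|{f(\mathbf{m})=0}}}\tag{7}$$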

In (4), $\binom{k}{r_{1},r_{2},\ldots}=\frac{k!}{r_{1}!r_{2}!\cdots}$ denote the multinomial coefficients. In (5), which we will call Vandermonde convolution, the sum is over all solutions $\mathbf{q}_{1},\ldots,\mathbf{q}_{r}$ , $\mathbf{q}_{i}\in\mathbb{N}^{N}$ , of $\mathbf{q}_{1}+\cdots+\mathbf{q}_{r}=\boldsymbol{\ell}$ , and the relationship holds for any fixed composition $(k_{1},\ldots,k_{r})$ of $k$ , for $r\geq 1$ . In (6), $i$ is an integer satisfying $0<i\leq k$ . In (7), $\mathbf{m}\in\mathbb{N}^{N}$ and by $f_{|{f(\mathbf{m})=0}}$ we denote the function $g:\mathbb{N}^{N}\rightarrow\mathbb{N}$ for which $g(\mathbf{s})=f(\mathbf{s})$ , for all $\mathbf{s}\neq\mathbf{m}$ , and $g(\mathbf{m})=0$ .

Proof.

(4) follows from rewriting the sum in (3) as a summation over vector partitions rather than over vector compositions and then adjusting the factors in the sum. (5) follows because each vector composition of $\boldsymbol{\ell}$ with $k$ parts can be subdivided into a fixed number $r$ of ‘subcompositions’ with $k_{1},\ldots,k_{r}$ parts. These represent weighted vector compositions of vectors $\mathbf{q}_{i}$ with $k_{i}$ parts and the subcompositions are independent of each other, given that the $\mathbf{q}_{i}$’s sum to $\boldsymbol{\ell}$. In view of our previous discussions, we prove (6) for sums of random vectors. For $0<i\leq k$, let $T_{i}$ denote the partial sum $X_{1}+\cdots+X_{i}$ of i.i.d. random vectors $X_{1},\ldots,X_{i},\ldots,X_{k}$. Consider the conditional expectation $\operatorname{E}[T_{i}\,|\,T_{k}=\boldsymbol{\ell}]$, for which the relation

$$\operatorname{E}[T_{i}\,|\,T_{k}=\boldsymbol{\ell}]=\frac{\boldsymbol{\ell}}{k}\,i,$$

holds, because $X_{1},\ldots,X_{k}$ are independent and identically distributed. Moreover, by definition of conditional expectation, we have that

$$\operatorname{E}[T_{i}\,|\,T_{k}=\boldsymbol{\ell}]=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{s}\,\frac{P[T_{i}=\mathbf{s},T_{k}=\boldsymbol{\ell}]}{P[T_{k}=\boldsymbol{\ell}]}=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{s}\,\frac{P[T_{i}=\mathbf{s}]\cdot P[T_{k-i}=\boldsymbol{\ell}-\mathbf{s}]}{P[T_{k}=\boldsymbol{\ell}]}.$$

Combining the two identities for $\operatorname{E}[T_{i}\,|\,T_{k}=\boldsymbol{\ell}]$ and rearranging yields (6). To prove (7), let $\mathbf{m}\in\mathbb{N}^{N}$. The part value $\mathbf{m}$ may occur $i=0,\ldots,k$ times in a vector composition of $\boldsymbol{\ell}$ with $k$ parts. When it occurs exactly $i$ times we are left with a composition of $\boldsymbol{\ell}-i\mathbf{m}$ into $k-i$ parts in which $\mathbf{m}$ does not occur anymore. The factor $\binom{k}{i}$ distributes the $i$ parts with value $\mathbf{m}$ among $k$ parts and the $i$ parts may be colored independently into $f(\mathbf{m})$ colors. ∎
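The conditional-expectation argument can be checked numerically. Equating the two expressions for the conditional expectation and clearing denominators gives, entry by entry, $i\,\boldsymbol{\ell}\binom{k}{\boldsymbol{\ell}}_{f}=k\sum_{\mathbf{s}}\mathbf{s}\binom{i}{\mathbf{s}}_{f}\binom{k-i}{\boldsymbol{\ell}-\mathbf{s}}_{f}$, which involves integers only when $f$ is integer-valued. The sketch below tests this form on the introductory example; the helper names `coeff`, `minus`, and `identity_6_holds` are assumptions for this illustration, and `coeff` is computed by splitting off the first part of a composition.

```python
from functools import lru_cache
from itertools import product

# Weights of the introductory example; any integer-valued f would do.
F = {(1, 1): 2, (1, 0): 1, (0, 1): 1}

def minus(a, b):
    """Entrywise difference of two vectors given as tuples."""
    return tuple(x - y for x, y in zip(a, b))

@lru_cache(maxsize=None)
def coeff(k, ell):
    """Total weight of the k-part F-weighted vector compositions of ell."""
    if min(ell) < 0:
        return 0
    if k == 0:
        return 1 if not any(ell) else 0
    # Split off the first part s and count the remaining k - 1 parts.
    return sum(w * coeff(k - 1, minus(ell, s)) for s, w in F.items())

def identity_6_holds(k, i, ell):
    """Check i*ell*coeff(k, ell) == k * sum_s s*coeff(i, s)*coeff(k-i, ell-s)."""
    lhs = tuple(i * x * coeff(k, ell) for x in ell)
    rhs = [0] * len(ell)
    # Terms with s outside the box 0 <= s <= ell vanish, so the box suffices.
    for s in product(*(range(x + 1) for x in ell)):
        weight = coeff(i, s) * coeff(k - i, minus(ell, s))
        for j, s_j in enumerate(s):
            rhs[j] += k * s_j * weight
    return lhs == tuple(rhs)

print(all(identity_6_holds(k, i, ell)
          for k in range(1, 6)
          for i in range(1, k + 1)
          for ell in product(range(4), repeat=2)))  # True
```

The check passes even though this $f$ is not a probability distribution, since both sides scale in the same way when $f$ is multiplied by a constant.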

Remark 3.3.

Note the following important special case of (5) which results when we let $r=2$ and $k_{1}=1$ and $k_{2}=k-1$ ,

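$$\binom{k}{\boldsymbol{\ell}}_{f}=\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})\binom{k-1}{\boldsymbol{\ell}-\mathbf{s}}_{f},$$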

which establishes that the quantities $\binom{k}{\boldsymbol{\ell}}_{f}$ may be regarded as generating a “Pascal triangle”-like array in which entries in row $k$ are weighted sums of the entries in row $k-1$. However, note that the indices $\boldsymbol{\ell}$ of the entries in row $k$ themselves lie in an $N$-dimensional space.

We also note the following special cases of $\binom{k}{\boldsymbol{\ell}}_{f}$ .

Lemma 3.4.

For all $k\in\mathbb{N},\mathbf{x}\in\mathbb{N}^{N}$ , we have that:

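$$\binom{k}{\mathbf{0}}_{f}=f(\mathbf{0})^{k},\qquad\binom{1}{\mathbf{x}}_{f}=f(\mathbf{x}),\qquad\binom{0}{\mathbf{x}}_{f}=\begin{cases}1&\text{if }\mathbf{x}=\mathbf{0},\\0&\text{otherwise.}\end{cases}$$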

∎

4 Combinatorics of partial derivatives

The formula of Faà di Bruno (1825-1888) describes the higher-order derivatives of a composite function $G\circ F$ as a combinatorial sum of the derivatives of the individual functions $G$ and $F$ . Hardy [22] generalizes this formula to partial derivatives, arguing that treating variables in the derivatives as distinct is more natural. We provide an alternative derivation of the partial derivative formula which is based on interpreting $G\circ F$ as the generating function for weighted vector compositions. As a consequence, the formulas for partial derivatives of composite functions follow effortlessly from different identities for weighted vector compositions.

For two power series $G:\mathbb{R}\rightarrow\mathbb{R}$ and $F:\mathbb{R}^{N}\rightarrow\mathbb{R}$ with $G(z)=\sum_{n\geq 0}g_{n}z^{n}$ and $F(\mathbf{z})=\sum_{\mathbf{s}\in\mathbb{N}^{N}}f_{\mathbf{s}}\mathbf{z}^{\mathbf{s}}$ , we first ask for the power series representation of $G\circ{F}$ . We find that

$$[\mathbf{z}^{\mathbf{s}}](G\circ F)(\mathbf{z})=\sum_{n\geq 0}g_{n}\binom{n}{\mathbf{s}}_{f}$$

by Theorem 3.1. Hence, using (1), we obtain

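$$\begin{aligned}(G\circ F)(\mathbf{z})&=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\Bigl(\sum_{n\geq 0}g_{n}\binom{n}{\mathbf{s}}_{f}\Bigr)=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\sum_{n\geq 0}g_{n}\sum_{\pi\in\mathcal{C}(\mathbf{s};n)}f_{\mathbf{m}_{1}}\cdots f_{\mathbf{m}_{n}}\\&=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\sum_{n\geq 0}\sum_{\pi\in\mathcal{C}(\mathbf{s};n)}g_{n}f_{\mathbf{m}_{1}}\cdots f_{\mathbf{m}_{n}}=\sum_{\mathbf{s}\in\mathbb{N}^{N}}\mathbf{z}^{\mathbf{s}}\sum_{\pi\in\mathcal{C}(\mathbf{s})}g_{|\pi|}f_{\pi},\end{aligned}\tag{8}$$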

where we let $\mathcal{C}(\mathbf{s};n)$ stand for $\{\pi=(\mathbf{m}_{1},\ldots,\mathbf{m}_{n})\,|\,\mathbf{m}_{i}\in\mathbb{N}^{N},\sum_{i=1}^{n}\mathbf{m}_{i}=\mathbf{s}\}$ (vector compositions of $\mathbf{s}$ with fixed number $n$ of parts) and $\mathcal{C}(\mathbf{s})$ analogously represents the class of vector compositions of $\mathbf{s}$ with arbitrary number of parts. Moreover, we use the abbreviation $f_{\pi}=f_{\mathbf{m}_{1}}\cdots f_{\mathbf{m}_{n}}$ and $|\pi|$ denotes the number of parts in $\pi$ . Note that the above representation generalizes the analogous representation derived in Vignat and Wakhare [48] to the multivariate case.

Since

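$$\frac{1}{\boldsymbol{\ell}!}\frac{\partial^{||\boldsymbol{\ell}||}H(\mathbf{0})}{\partial\mathbf{z}^{\boldsymbol{\ell}}}=[\mathbf{z}^{\boldsymbol{\ell}}]H(\mathbf{z}),$$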

for any power series $H(\mathbf{z})$ , we immediately have several Faà di Bruno like representations of partial derivatives. Here, we write $\partial\mathbf{z}^{\boldsymbol{\ell}}$ for $\partial z_{1}^{\ell_{1}}\cdots\partial z_{N}^{\ell_{N}}$ , $||\boldsymbol{\ell}||$ for $\ell_{1}+\cdots+\ell_{N}$ and $\boldsymbol{\ell}!$ for $\ell_{1}!\cdots\ell_{N}!$ .
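A small worked instance may help to see how coefficients of $G\circ F$, and hence partial derivatives at $\mathbf{0}$, follow from weighted vector compositions. The sketch below makes two illustrative assumptions that are not in the text: it takes $G=\exp$ (so $g_{n}=1/n!$) and $F(z_{1},z_{2})=z_{1}+z_{2}$. Because $f_{\mathbf{0}}=0$ for this $F$, every part is non-zero and the sum over $n$ stops at $||\mathbf{s}||$. The names `coeff`, `composite_coefficient`, and `partial_derivative_at_zero` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial, prod

# Assumed inner function F(z1, z2) = z1 + z2, i.e. f_(1,0) = f_(0,1) = 1.
F = {(1, 0): 1, (0, 1): 1}

def g(n):
    """Assumed outer coefficients g_n = 1/n!, i.e. G = exp."""
    return Fraction(1, factorial(n))

@lru_cache(maxsize=None)
def coeff(n, s):
    """Total weight of the n-part F-weighted vector compositions of s."""
    if min(s) < 0:
        return 0
    if n == 0:
        return 1 if not any(s) else 0
    return sum(w * coeff(n - 1, tuple(a - b for a, b in zip(s, m)))
               for m, w in F.items())

def composite_coefficient(s):
    """[z^s](G o F)(z) as the sum over n of g_n times coeff(n, s)."""
    # f_0 = 0 forces non-zero parts, so at most sum(s) parts can occur.
    return sum(g(n) * coeff(n, s) for n in range(sum(s) + 1))

def partial_derivative_at_zero(ell):
    """ell! times the coefficient of z^ell, i.e. the mixed partial at 0."""
    return prod(factorial(x) for x in ell) * composite_coefficient(ell)

print(composite_coefficient((2, 3)))       # 1/12
print(partial_derivative_at_zero((2, 3)))  # 1
```

Here $G\circ F=\exp(z_{1}+z_{2})$, whose coefficient of $z_{1}^{2}z_{2}^{3}$ is $\frac{1}{2!\,3!}=\frac{1}{12}$ and whose partial derivatives at the origin all equal 1, so the output agrees with a direct computation.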

Theorem 4.1.

Let $G\circ F:\mathbb{R}^{N}\rightarrow\mathbb{R}$ , with $F:\mathbb{R}^{N}\rightarrow\mathbb{R}$ and $G:\mathbb{R}\rightarrow\mathbb{R}$ . Let $\boldsymbol{\ell}=(\ell_{1},\ldots,\ell_{N})\in\mathbb{N}^{N}$ and assume that $G$ and $F$ have a sufficient number of derivatives. Then

[TABLE]

∎

Note that in the theorem, terms $\frac{\partial^{||\mathbf{m}_{i}||}F(\mathbf{x})}{\partial\mathbf{z}^{\mathbf{m}_{i}}}$ with $\mathbf{m}_{i}=\mathbf{0}$ drop out, so we can regard the sum as being over non-zero parts $\mathbf{m}_{i}$.

Alternative representations of the partial derivative can be derived by considering different identities for $\binom{n}{\mathbf{s}}_{f}$ . For example, using (4), we obtain Theorem 4.2 below. Still other representations follow analogously from considering further identities of $\binom{n}{\mathbf{s}}_{f}$ , e.g., (7), plugged into the representation of $(G\circ F)(\mathbf{z})$ in (8) above.

Theorem 4.2.

Let $G\circ F:\mathbb{R}^{N}\rightarrow\mathbb{R}$ , with $F:\mathbb{R}^{N}\rightarrow\mathbb{R}$ and $G:\mathbb{R}\rightarrow\mathbb{R}$ . Let $\boldsymbol{\ell}=(\ell_{1},\ldots,\ell_{N})\in\mathbb{N}^{N}$ and assume that $G$ and $F$ have a sufficient number of derivatives. Then

[TABLE]

where $r=r_{1}+r_{2}+\cdots$ .∎

Example 4.3.

Let $\boldsymbol{\ell}=(1,2)$ . Then $S(\boldsymbol{\ell})=\{(0,1),(1,0),(1,1),(1,2),(0,2)\}$ . Moreover,

[TABLE]

Therefore, according to Theorem 4.2

[TABLE]

We now show that Theorem 4.1 (or equivalently Theorem 4.2) generalizes the main formula derived in [22]. Recall that a set partition of $[n]=\{1,\ldots,n\}$ is a set of disjoint, non-empty subsets of $[n]$ whose union is $[n]$ .

Lemma 4.4.

There is a bijection between the set of all vector partitions (unordered compositions) of the vector $\underbrace{(1,\ldots,1)}_{{n\text{ times}}}$ into $k$ non-zero parts and the set of all set partitions of $[n]$ into $k$ parts.

The proof of the lemma is straightforward. We can assign each set partition $a=\{a_{1},\ldots,a_{k}\}$ (where $a_{i}\subseteq[n]$ , $a_{i}\neq\emptyset$ , $\bigcup_{i}a_{i}=[n],a_{i}\cap a_{j}=\emptyset$ ) the vector partition $\mathbf{b}_{1}+\cdots+\mathbf{b}_{k}$ where $\mathbf{b}_{i}$ is a vector whose entries are $1$ for all indices in $a_{i}$ and zero otherwise (and vice versa). Due to the properties of $a$ , $\mathbf{b}_{1}+\cdots+\mathbf{b}_{k}$ yields $(1,\ldots,1)$ .

Further, since the parts of each vector partition of $\mathbf{1}=(1,\ldots,1)$ into $k$ non-zero parts are all distinct, we also have that $|\mathcal{C}(\mathbf{1};k)|=k!|\mathcal{P}(\mathbf{1};k)|$ .
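Lemma 4.4 and the relation $|\mathcal{C}(\mathbf{1};k)|=k!\,|\mathcal{P}(\mathbf{1};k)|$ can be tested for small $n$. The sketch below is an illustration under assumptions: it encodes a 0/1 vector as a bitmask, counts ordered $k$-tuples of non-zero 0/1 vectors that sum to $(1,\ldots,1)$, and compares the result with $k!$ times the Stirling number of the second kind, which counts set partitions of $[n]$ into $k$ blocks. The names `ordered_compositions` and `stirling2` are hypothetical.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def ordered_compositions(k, mask):
    """Ordered k-tuples of non-zero 0/1 vectors whose sum is the vector `mask`.

    Bit j of `mask` is entry j of the vector; a part is a non-empty submask.
    """
    if k == 0:
        return 1 if mask == 0 else 0
    total, sub = 0, mask
    while sub:
        # Use `sub` as the first part and compose the remaining entries.
        total += ordered_compositions(k - 1, mask ^ sub)
        sub = (sub - 1) & mask  # next non-empty submask of mask
    return total

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of set partitions of [n] into k non-empty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

n = 4
for k in range(1, n + 1):
    print(k, ordered_compositions(k, (1 << n) - 1), factorial(k) * stirling2(n, k))
```

For $n=4$ the two columns agree (1, 14, 36, 24), as the bijection with set partitions predicts.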

To derive the main result in [22], we now let $\boldsymbol{\ell}$ in Theorem 4.1 be $\mathbf{1}=(1,\ldots,1)$ (each of $N$ variables occurs exactly once). Then $||\boldsymbol{\ell}||=N$ and $\boldsymbol{\ell}!=1$ and $\mathbf{m}_{i}!=1$. Thus (Footnote 2: in the following equation, we regard $\mathcal{P}(\boldsymbol{\ell})$ as directly containing unordered vectors, rather than multiplicities as in our original definition),

[TABLE]

Interpreting the last quantity as a sum over set partitions, using Lemma 4.4, with the $\mathbf{m}_{i}$ as subsets of $[N]$ yields the formula (5) in [22].

Correspondingly, our representation in Theorem 4.2 is the direct analogue of the representation in [22] based on ‘multiset partitions’ (Corollary to Propositions 1 and 2 in [22] combined with Proposition 4 therein).

There has been some debate on the combinatorial nature of higher-order derivatives. While they may (thus) be regarded as set partitions [22, 26], Yang [50] finds that they are “essentially integer partitions”. Noting the relationships and equivalences between these concepts and based on our derivations, we may also claim that partial derivatives of composite functions are essentially vector compositions!

5 Congruences for $\binom{k}{\boldsymbol{\ell}}_{f}$

Theorem 5.1 (Parity of $\binom{k}{\boldsymbol{\ell}}_{f}$ ).

Let $k\geq 0$ and let $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Then

[TABLE]

Proof.

We distinguish three cases.

•

Case $1$: Let $k$ be even and let one entry of $\boldsymbol{\ell}$ be odd. Consider (6) in Theorem 3.2 with $i=1$. If $k$ is even, the right-hand side is a vector that is even in each entry. Thus, if $\boldsymbol{\ell}$ is odd in one entry, $\binom{k}{\boldsymbol{\ell}}_{f}$ must be even.

• Case $2$: Let $k$ be even and $\boldsymbol{\ell}$ be even in each entry. Consider the Vandermonde convolution in the case of $r=2$ and $k_{1}=k_{2}=k/2$. Then,

[TABLE]

All pairs $(\mathbf{a},\mathbf{b})$ for which $\mathbf{a}\neq\mathbf{b}$ occur exactly twice, so their sum contributes nothing modulo $2$ . The only term that does not occur twice is $\mathbf{a}=\mathbf{b}$ , for which $\mathbf{a}=\boldsymbol{\ell}/2$ . Hence,

[TABLE]

• Case $3$: Let $k$ be odd. Then $k-1$ is even. Thus, the Vandermonde convolution with $k_{1}=1$, $r=2$ implies

[TABLE]

where we use Case $1$ and Case $2$ in the last congruence.

∎

Example 5.2.

Let $f((0,1,0))=3$ and let $f(\mathbf{s})=1$ for all

$\mathbf{s}\in\{(1,0,0),(0,0,1),(1,1,0),(1,0,1),(0,1,1),(1,1,1)\}$ . Let $f(\mathbf{s})=0$ for all other $\mathbf{s}$ . Then, by Theorem 5.1,

[TABLE]

In fact, $\binom{21}{(20,19,18)}_{f}=7,301,700$ . In contrast,

[TABLE]

Indeed, $\binom{19}{(3,16,2)}_{f}=8,356,358,620,683$ .
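Cases 1 and 2 of the proof can be checked by brute force for small even $k$. The sketch below uses the weights of Example 5.2 and computes $\binom{k}{\boldsymbol{\ell}}_{f}$ as the coefficient of $\mathbf{x}^{\boldsymbol{\ell}}$ in $\bigl(\sum_{\mathbf{s}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}\bigr)^{k}$; the helper names `weighted_coeffs` and `parity_cases_hold` are illustrative assumptions, not notation from the paper.

```python
from collections import defaultdict

def weighted_coeffs(f, k):
    """Return {ell: binom(k, ell)_f}, the coefficients of (sum_s f(s) x^s)^k."""
    dim = len(next(iter(f)))
    poly = {(0,) * dim: 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

# Weights from Example 5.2: f((0,1,0)) = 3, and 1 on the other non-zero 0/1 vectors.
f = {(0, 1, 0): 3, (1, 0, 0): 1, (0, 0, 1): 1,
     (1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1, (1, 1, 1): 1}

def parity_cases_hold(f, k):
    """Check Cases 1 and 2 of the proof for an even k."""
    full, half = weighted_coeffs(f, k), weighted_coeffs(f, k // 2)
    for ell, c in full.items():
        if any(x % 2 for x in ell):
            if c % 2:        # Case 1: an odd entry in ell forces an even coefficient
                return False
        else:
            h = half.get(tuple(x // 2 for x in ell), 0)
            if (c - h) % 2:  # Case 2: same parity as binom(k/2, ell/2)_f
                return False
    return True

print([parity_cases_hold(f, k) for k in (2, 4, 6)])
```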

Theorem 5.3.

Let $p$ be prime, $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Then

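$$\binom{p}{\boldsymbol{\ell}}_{f}\equiv\begin{cases}f(\mathbf{m}),&\text{if }\boldsymbol{\ell}=p\mathbf{m}\text{ for some }\mathbf{m};\\ 0,&\text{else}\end{cases}\pmod{p}.$$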

We sketch three proofs of Theorem 5.3, a combinatorial proof and two proof sketches based on identities in Theorem 3.2. The first proof uses the following lemma (see [3]).

Lemma 5.4.

Let $S$ be a finite set, let $p$ be prime, and suppose $g:S\rightarrow S$ has the property that $g^{p}(x)=x$ for any $x$ in $S$ , where $g^{p}$ is the $p$ -fold composition of $g$ . Then $|{S}|\equiv|{F}|\pmod{p}$ , where $F$ is the set of fixed points of $g$ . ∎

Proof of Theorem 5.3, 1.

Let $g$ , a map from the set of $f$ -weighted vector compositions of $\boldsymbol{\ell}$ with $p$ parts to itself, be the operation that shifts all parts one to the right, modulo $p$ . In other words, $g$ maps (denoting colors by superscripts) $[\mathbf{m}_{1}^{\alpha_{1}},\mathbf{m}_{2}^{\alpha_{2}},\ldots,\mathbf{m}_{p-1}^{\alpha_{p-1}},\mathbf{m}_{p}^{\alpha_{p}}]$ to

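$$[\mathbf{m}_{p}^{\alpha_{p}},\mathbf{m}_{1}^{\alpha_{1}},\mathbf{m}_{2}^{\alpha_{2}},\ldots,\mathbf{m}_{p-1}^{\alpha_{p-1}}].$$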

Of course, applying $g$ $p$ times yields the original vector composition, that is, $g^{p}(x)=x$ for all $x$ . We may thus apply Lemma 5.4. If $\boldsymbol{\ell}$ allows a representation $\boldsymbol{\ell}=p\mathbf{m}$ for some suitable $\mathbf{m}$ , $g$ has exactly $f(\mathbf{m})$ fixed points, namely, all compositions $\underbrace{[\mathbf{m}^{1},\ldots,\mathbf{m}^{1}]}_{p\text{ times}}$ to $\underbrace{[\mathbf{m}^{f(\mathbf{m})},\ldots,\mathbf{m}^{f(\mathbf{m})}]}_{p\text{ times}}$ . Otherwise, if $\boldsymbol{\ell}$ has no such representation, $g$ has no fixed points. This proves the theorem. ∎
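The fixed-point argument can be made concrete by enumerating the colored compositions explicitly. The following sketch uses hypothetical weights (chosen for illustration only, not taken from the paper), represents a colored part as a pair `(vector, color)`, and compares $|S|$ with the number of fixed points of the cyclic shift modulo $p$.

```python
from itertools import product

def colored_compositions(f, ell, p):
    """All f-weighted compositions of ell with p parts; a part is (vector, color)."""
    parts = [(s, c) for s, w in f.items() for c in range(1, w + 1)]
    return [combo for combo in product(parts, repeat=p)
            if tuple(map(sum, zip(*(s for s, _ in combo)))) == ell]

def shift(comp):
    """Cyclic shift by one position: the last part moves to the front."""
    return comp[-1:] + comp[:-1]

# Hypothetical weights: (1,1) comes in two colors, all other parts in one.
f = {(1, 0): 1, (0, 1): 1, (1, 1): 2, (2, 1): 1, (1, 2): 1}
p = 3

summary = {}
for ell in [(3, 3), (3, 4)]:
    S = colored_compositions(f, ell, p)
    F = [x for x in S if shift(x) == x]   # fixed points: all p parts identical
    summary[ell] = (len(S), len(F))

print(summary)  # {(3, 3): (32, 2), (3, 4): (21, 0)}
```

For $\boldsymbol{\ell}=(3,3)=3\cdot(1,1)$ the two fixed points correspond to the two colors of $(1,1)$, and $32\equiv 2\pmod{3}$; for $\boldsymbol{\ell}=(3,4)$ there are no fixed points and $21\equiv 0\pmod{3}$.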

Proof of Theorem 5.3, 2.

We apply (7) in Theorem 3.2. Since for the ordinary binomial coefficients, the relation $\binom{p}{n}\equiv 0\pmod{p}$ holds for all $1\leq n\leq p-1$ and $\binom{p}{0}=\binom{p}{p}=1$ , we have

[TABLE]

for any $\mathbf{m}$ and where the last congruence is due to Fermat’s little theorem. Therefore, if $\boldsymbol{\ell}=\mathbf{m}p$ for some $\mathbf{m}$ , then $\binom{p}{\boldsymbol{\ell}}_{f}\equiv\binom{p}{\boldsymbol{\ell}}_{f_{|f(\mathbf{m})=0}}+f(\mathbf{m})\pmod{p}$ and otherwise $\binom{p}{\boldsymbol{\ell}}_{f}\equiv\binom{p}{\boldsymbol{\ell}}_{f_{|f(\mathbf{m})=0}}\pmod{p}$ for any $\mathbf{m}$ . Now, the theorem follows inductively. ∎

Proof of Theorem 5.3, 3.

We use (4) in Theorem 3.2 in conjunction with the following property of multinomial coefficients (see, e.g., [38]):

[TABLE]

Since the multiplicities $r_{1},r_{2},\ldots$ for $\binom{p}{\boldsymbol{\ell}}_{f}$ in (4) satisfy $r_{1}+r_{2}+\cdots=p$, we have $d=\gcd{(r_{1},r_{2},\ldots)}\in\{1,p\}$, since otherwise $p$ would be composite. Moreover, $d=p$ if and only if exactly one of the $r_{i}$ equals $p$ and all the others are zero. Hence, whenever $\boldsymbol{\ell}\neq p\mathbf{m}$ for any $\mathbf{m}$, we have $d=1$ for all $(r_{1},r_{2},\ldots)$ in the summation, for otherwise the condition $r_{1}\mathbf{s}_{1}+r_{2}\mathbf{s}_{2}+\cdots=\boldsymbol{\ell}$ would imply that $p\mathbf{s}_{i}=\boldsymbol{\ell}$, a contradiction. Therefore, $\binom{p}{\boldsymbol{\ell}}_{f}\equiv 0\pmod{p}$, since all terms in the summation in (4) are congruent to zero modulo $p$ by (9). Consider now the case $\boldsymbol{\ell}=p\mathbf{m}$ for some $\mathbf{m}$. Then $\mathbf{m}\in S(\boldsymbol{\ell})$, that is, $\mathbf{m}=\mathbf{s}_{i}$ for some $i$. Again, the only terms in the summation that contribute modulo $p$ are those for which $d=p$. Thus, there is exactly one term that contributes, namely, $(r_{1},r_{2},\ldots,r_{i},\ldots)=(0,0,\ldots,p,\ldots)$. Therefore, $\binom{p}{\boldsymbol{\ell}}_{f}\equiv\binom{p}{0,\ldots,0,p,0,\ldots}f(\mathbf{m})^{p}\equiv f(\mathbf{m})\pmod{p}$. ∎

We call the next congruence Babbage’s congruence, since Charles Babbage was apparently the first to assert the respective congruence in the case of ordinary binomial coefficients [5].

Theorem 5.5 (Babbage’s congruence).

Let $p$ be prime, let $n$ be a nonnegative integer, and let $\mathbf{m}\in\mathbb{N}^{N}$ . Then

[TABLE]

whereby $g$ is defined as $g(\mathbf{x})=\binom{p}{\mathbf{x}p}_{f}$ , for all $\mathbf{x}$ .

Proof.

By the Vandermonde convolution, we have

[TABLE]

Now, by Theorem 5.3, $p$ divides $\binom{p}{\mathbf{x}}_{f}$ whenever $\mathbf{x}$ is not of the form $\mathbf{x}=\mathbf{r}p$ . Hence, modulo $p^{2}$ , the only terms that contribute to the sum are those for which at least $n-1$ $\mathbf{k}_{i}$ ’s are of the form $\mathbf{k}_{i}=\mathbf{r}_{i}p$ . Since the $\mathbf{k}_{i}$ ’s must sum to $\mathbf{m}p$ , this implies that all $\mathbf{k}_{i}$ ’s are of the form $\mathbf{k}_{i}=\mathbf{r}_{i}p$ , for $i=1,\ldots,n$ . Hence, modulo $p^{2}$ , (10) becomes

[TABLE]

The last sum is precisely $\binom{n}{\mathbf{m}}_{g}$ . ∎

Example 5.6.

Let $f$ be the indicator function on the set

$\{(1,0),(0,1),(1,1),(2,1),(1,2)\}$ . Let $p=3$ , $n=2$ , and $\mathbf{m}=(1,2)$ . Enumeration shows that

[TABLE]

Moreover, $\binom{2}{(1,2)}_{g}$ can be determined by looking at the compositions of $(1,2)$ in two parts, which are $(1,2)=(0,1)+(1,1)=(1,1)+(0,1)$ . We have $g((1,1))=\binom{3}{(3,3)}_{f}=13$ and $g((0,1))=\binom{3}{(0,3)}_{f}=1$ . Hence, $\binom{2}{(1,2)}_{g}=26\equiv 8\equiv\binom{6}{(3,6)}_{f}\pmod{3^{2}}$ , as predicted.
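The numbers in this example can be reproduced by expanding the generating polynomial directly. This is a small sketch, with `weighted_coeffs` an illustrative helper that returns the coefficients of $\bigl(\sum_{\mathbf{s}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}\bigr)^{k}$.

```python
from collections import defaultdict

def weighted_coeffs(f, k):
    """Return {ell: binom(k, ell)_f}, the coefficients of (sum_s f(s) x^s)^k."""
    dim = len(next(iter(f)))
    poly = {(0,) * dim: 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

f = {s: 1 for s in [(1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]}
p, n, m = 3, 2, (1, 2)

# Left-hand side: binom(np, mp)_f = binom(6, (3,6))_f.
lhs = weighted_coeffs(f, n * p)[tuple(p * x for x in m)]

# g(x) = binom(p, x p)_f, read off from the p-th power of the polynomial.
g = {tuple(x // p for x in ell): c
     for ell, c in weighted_coeffs(f, p).items()
     if all(x % p == 0 for x in ell)}

# Right-hand side: binom(n, m)_g.
rhs = weighted_coeffs(g, n).get(m, 0)

print(lhs, g[(1, 1)], rhs, (lhs - rhs) % p**2)  # 170 13 26 0
```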

Since $g(\mathbf{m})\equiv f(\mathbf{m})\pmod{p}$ , by Theorem 5.3, we have the following theorem.

Theorem 5.7.

Let $p$ be prime, let $n$ be a nonnegative integer, and let $\mathbf{m}\in\mathbb{N}^{N}$ . Then

[TABLE]

∎

We use Theorem 5.7 to prove a stronger version of Theorem 5.3, namely:

Theorem 5.8.

Let $p$ be prime and let $m\geq 1$ , $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Then

[TABLE]

Proof.

Let $\boldsymbol{\ell}=p^{m}\mathbf{m}$ . Using Theorem 5.7 twice, we find for $m=2$

[TABLE]

Using this, we find that:

[TABLE]

and so on for any $m$ .

Consider now the case $\boldsymbol{\ell}\neq p^{m}\mathbf{m}$ for any $\mathbf{m}$. We use (7) from Theorem 3.2 together with the fact that $\binom{p^{m}}{n}\equiv 0\pmod{p}$ when $0<n<p^{m}$ and $\binom{p^{m}}{n}\equiv 1\pmod{p}$ when $n\in\{0,p^{m}\}$. From this it follows that

[TABLE]

for any $\mathbf{m}$ . We can successively set all arguments of $f$ to zero and note that hence $\binom{p^{m}}{\boldsymbol{\ell}}_{f}\equiv 0\pmod{p}$ . ∎

Now, we consider the case when $\boldsymbol{\ell}$ in $\binom{np}{\boldsymbol{\ell}}_{f}$ is not of the form $\mathbf{m}p$ for any $\mathbf{m}$ .

Theorem 5.9.

Let $p$ be prime and let $n$ be a nonnegative integer. Let $\boldsymbol{\ell}$ not be of the form $\boldsymbol{\ell}=p\mathbf{m}$ , for any $\mathbf{m}$ . Then

[TABLE]

where $g$ is as defined in Theorem 5.5.

Proof.

By the Vandermonde convolution, (5), we find that

[TABLE]

As in the proof of Theorem 5.5, at least $n-1$ factors $\binom{p}{\mathbf{k}_{j}}_{f}$ must be such that $\mathbf{k}_{j}=\mathbf{r}_{j}p$. Not all $n$ factors can be of the form $\mathbf{r}_{j}p$, since otherwise $\mathbf{k}_{1}+\cdots+\mathbf{k}_{n}=p(\mathbf{r}_{1}+\cdots+\mathbf{r}_{n})=\boldsymbol{\ell}$, a contradiction. Hence, exactly $n-1$ factors must be of the form $\mathbf{r}_{j}p$, and therefore,

[TABLE]

Now, the equation $p(\mathbf{r}_{2}+\cdots+\mathbf{r}_{n})=\boldsymbol{\ell}-\mathbf{k}$ has solutions if and only if $p\mid\boldsymbol{\ell}-\mathbf{k}$ , that is, when there exists $\mathbf{x}$ such that $\boldsymbol{\ell}-\mathbf{k}=\mathbf{x}p$ . ∎

Example 5.10.

Let $n=4$ , $p=3$ and $\boldsymbol{\ell}=(2,3)$ . In this situation, the only suitable $\mathbf{k}$ in the previous theorem is $\mathbf{k}=(2,3)$ to which corresponds $\mathbf{x}=(0,0)$ . The theorem thus implies that

[TABLE]

Let $f(\mathbf{s})=s_{1}+s_{2}+1$ for all $\mathbf{s}\in\{(0,0),(0,1),(1,0),(1,1)\}$ and $f(\mathbf{s})=0$ otherwise. Then $\binom{3}{(0,0)}_{g}=1$ since $g((0,0))=\binom{3}{(0,0)}_{f}=1$ . Moreover, $\binom{3}{(2,3)}_{f}=54$ . Therefore

[TABLE]

Indeed,

[TABLE]
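The example can be verified by direct expansion. The sketch below assumes, following the text of the example, that the congruence specializes to $\binom{12}{(2,3)}_{f}\equiv 4\cdot\binom{3}{(2,3)}_{f}\cdot\binom{3}{(0,0)}_{g}\pmod{9}$; the helper `weighted_coeffs` is illustrative.

```python
from collections import defaultdict

def weighted_coeffs(f, k):
    """Return {ell: binom(k, ell)_f}, the coefficients of (sum_s f(s) x^s)^k."""
    dim = len(next(iter(f)))
    poly = {(0,) * dim: 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

# f(s) = s_1 + s_2 + 1 on {0,1}^2, zero elsewhere.
f = {(a, b): a + b + 1 for a in (0, 1) for b in (0, 1)}
n, p, ell = 4, 3, (2, 3)

cube = weighted_coeffs(f, p)
lhs = weighted_coeffs(f, n * p)[ell]     # binom(12, (2,3))_f
inner = cube[ell]                        # binom(3, (2,3))_f
g_zero = cube[(0, 0)]                    # g((0,0)) = binom(3, (0,0))_f
rhs = n * inner * g_zero ** (n - 1)      # binom(3, (0,0))_g = g((0,0))^3

print(lhs, inner, rhs, (lhs - rhs) % p**2)  # 407880 54 216 0
```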

Theorem 5.11.

Let $k\geq 0$ , $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Let $d_{i}=\gcd(k,\ell_{i})$ and let $t_{i}=\frac{k}{d_{i}}$ . Then

[TABLE]

for all $i=1,\ldots,N$ . Equivalently,

[TABLE]

Here, $M$ is the number $M=p_{1}^{m_{1}}\cdots p_{R}^{m_{R}}$ , where the $t_{i}$ have prime factorization $t_{i}=p_{1}^{{(a_{i})}_{1}}\cdots p_{R}^{{(a_{i})}_{R}}$ and where $m_{j}=\max_{i}{(a_{i})}_{j}$ , for all $j=1,\ldots,R$ .

Proof.

From (6), with $i=1$ , write

[TABLE]

Now, for any $1\leq i\leq N$ , consider this equation at component $i$ , dividing by $d_{i}=\gcd(k,\ell_{i})$ :

[TABLE]

Since $\gcd(k/d_{i},\ell_{i}/d_{i})=1$ , this means that $\frac{k}{d_{i}}\mid\binom{k}{\boldsymbol{\ell}}_{f}$ for all $i=1,\ldots,N$ . ∎

Example 5.12.

Let $f$ be the indicator function on the set

$\{(1,0),(0,1),(1,1),(1,2),(2,1),(0,0)\}$ . Enumeration shows that

[TABLE]

We have $t_{1}=12/3=4$ and $t_{2}=12/4=3$ . Hence $4\cdot 3$ divides $\binom{12}{(9,8)}_{f}$ , and indeed, $44,742,060=12\cdot 3,728,505$ .
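The divisibility statement of Theorem 5.11 can be tested for this example without relying on the enumerated value. A minimal sketch, in which `weighted_coeffs` is an illustrative helper and $M$ is computed as the least common multiple of the $t_{i}$:

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def weighted_coeffs(f, k):
    """Return {ell: binom(k, ell)_f}, the coefficients of (sum_s f(s) x^s)^k."""
    dim = len(next(iter(f)))
    poly = {(0,) * dim: 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

f = {s: 1 for s in [(1, 0), (0, 1), (1, 1), (1, 2), (2, 1), (0, 0)]}
k, ell = 12, (9, 8)

value = weighted_coeffs(f, k)[ell]
t = [k // gcd(k, l) for l in ell]                  # t_i = k / gcd(k, ell_i)
M = reduce(lambda a, b: a * b // gcd(a, b), t)     # lcm of the t_i

print(t, M, value % M == 0)
```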

Theorem 5.13.

Let $p$ be prime, $n\geq 1$ arbitrary. Then,

[TABLE]

where $h(\mathbf{s})=\begin{cases}f(\mathbf{s})^{p},&\text{if }\mathbf{s}\in U;\\ 0,&\text{else};\end{cases}$ for $U=\{\mathbf{x}\neq\mathbf{0}\in\mathbb{N}^{N}\,|\,x_{i}\in\{0,1\}\}$ .

In the theorem, note that $\frac{(pn)!}{(pr_{1})!(pr_{2})!\cdots(p(n-k))!}=\frac{(pn)!}{(p!)^{k}(p(n-k))!}$ . Also note that the limit of the summation for $k$ is (more adequately described as) $\min\{n,N\}$ .

Proof.

From (4), $\binom{pn}{p\mathbf{1}}_{f}$ can be written as

[TABLE]

For a term in the sum, either $d=\gcd(r_{1},r_{2},\ldots)=1$ or $d=p$ , since otherwise, if $1<d<p$ , then, $d\cdot\sum_{\mathbf{s}_{i}\in S(p\mathbf{1})}\frac{r_{i}}{d}\mathbf{s}_{i}=p\mathbf{1}$ , whence $p$ is composite, a contradiction. Those terms on the RHS of (11) for which $d=1$ contribute nothing to the sum modulo $pn$ , by (9), so they can be ignored. But, from the equation $\sum_{\mathbf{s}_{i}\in S(p\mathbf{1})}r_{i}\mathbf{s}_{i}=p\mathbf{1}$ , the case $d=p$ happens precisely when:

• there are $k$ unit vectors $\mathbf{s}_{1},\ldots,\mathbf{s}_{k}\in U$, for $1\leq k\leq n$, each of whose associated multiplicity is $p$, as well as the zero vector $\mathbf{0}$, whose multiplicity is $p(n-k)$, such that $\mathbf{s}_{1}+\cdots+\mathbf{s}_{k}+\mathbf{0}=\mathbf{1}$.

∎

Example 5.14.

When $N=1$ , then $U=\{1\}$ . Hence,

$\binom{pn}{p}_{f}\equiv\binom{pn}{p}f(0)^{p(n-1)}f(1)^{p}\pmod{pn}$ because only the term $k=1$ leads to a valid solution, since $1$ cannot be the sum of two or more elements from $U$ . When $N=2$ , then $U=\{(0,1),(1,0),(1,1)\}$ and the relevant terms are $k=1,2$ . The formula becomes

[TABLE]
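The $N=1$ case stated at the start of this example lends itself to a quick numerical check. The sketch below uses hypothetical weights on the part sizes $0,1,2$ (an assumption made for illustration; the weight on $2$ is included to show that parts outside $U\cup\{0\}$ do not affect the residue) and compares both sides modulo $pn$.

```python
from collections import defaultdict
from math import comb

def weighted_coeffs(f, k):
    """Return {ell: binom(k, ell)_f}, the coefficients of (sum_s f(s) x^s)^k."""
    dim = len(next(iter(f)))
    poly = {(0,) * dim: 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

# Hypothetical weights for N = 1; keys are 1-tuples.
f = {(0,): 2, (1,): 3, (2,): 5}

def check_n1(f, p, n):
    """Compare binom(pn, p)_f with binom(pn, p) f(0)^(p(n-1)) f(1)^p modulo pn."""
    lhs = weighted_coeffs(f, p * n).get((p,), 0)
    rhs = comb(p * n, p) * f[(0,)] ** (p * (n - 1)) * f[(1,)] ** p
    return (lhs - rhs) % (p * n) == 0

results = {(p, n): check_n1(f, p, n) for p in (2, 3, 5) for n in (1, 2, 3)}
print(all(results.values()))
```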

Recall that the ordinary binomial coefficients satisfy Lucas’ theorem, namely,

$$\binom{n}{k}\equiv\prod_{i}\binom{n_{i}}{k_{i}}\pmod{p}$$

whenever $k=\sum k_{i}p^{i}$ and $n=\sum n_{i}p^{i}$ with $0\leq n_{i},k_{i}<p$ . Bollinger and Burchard [8] generalize this to extended binomial coefficients. We further generalize to weighted vector compositions.

Theorem 5.15 (Lucas’ theorem).

Let $p$ be prime and let $k=\sum_{j=0}^{r}k_{j}p^{j}$ , where $0\leq k_{j}<p$ for $j=0,\ldots,r$ . Let $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Then

[TABLE]

whereby the sum is over all $(\mathbf{m}_{0},\ldots,\mathbf{m}_{r})$ that satisfy $\mathbf{m}_{0}+\mathbf{m}_{1}p+\cdots+\mathbf{m}_{r}p^{r}=\boldsymbol{\ell}$ .

Proof.

We have

[TABLE]

where the fourth relation (congruence) follows from Theorem 5.8, and the theorem follows by comparing the coefficients of $\mathbf{x}^{\boldsymbol{\ell}}$ . ∎

Example 5.16.

For a similar situation as in Example 5.6, let $p=3$ and $k=5=2+1\cdot p$ . Thus, $(k_{0},k_{1})=(2,1)$ . For $\boldsymbol{\ell}=(3,6)$ , the relevant $(\mathbf{m}_{0},\mathbf{m}_{1})$ such that $\boldsymbol{\ell}=\mathbf{m}_{0}+p\mathbf{m}_{1}$ are:

[TABLE]

No other $\mathbf{m}_{1}$ need be considered, because $k_{1}=1$ and $\binom{1}{\mathbf{x}}_{f}=f(\mathbf{x})$, and the specified $f$ is zero outside $\{(0,1),(1,0),(1,1),(1,2),(2,1)\}$. Hence:

[TABLE]

by Theorem 5.15, which is true, since $\binom{5}{(3,6)}_{f}=80$ .
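The right-hand side of Theorem 5.15 can be computed as a product of "stretched" polynomials, one per base-$p$ digit of $k$. The sketch below assumes that the congruence reads $\binom{k}{\boldsymbol{\ell}}_{f}\equiv\sum\prod_{j}\binom{k_{j}}{\mathbf{m}_{j}}_{f}\pmod{p}$, as the description of the summation suggests; `multiply`, `power`, and `lucas_rhs` are illustrative names.

```python
from collections import defaultdict

def multiply(A, B):
    """Product of two multivariate polynomials stored as {exponent: coefficient}."""
    out = defaultdict(int)
    for u, a in A.items():
        for v, b in B.items():
            out[tuple(x + y for x, y in zip(u, v))] += a * b
    return dict(out)

def power(f, k):
    """Coefficients of (sum_s f(s) x^s)^k, i.e. {ell: binom(k, ell)_f}."""
    poly = {(0,) * len(next(iter(f))): 1}
    for _ in range(k):
        poly = multiply(poly, f)
    return poly

def lucas_rhs(f, k, p):
    """Coefficients of prod_j P^{k_j}(x^{p^j}) for k = sum_j k_j p^j."""
    result, scale = {(0,) * len(next(iter(f))): 1}, 1
    while k:
        k, digit = divmod(k, p)
        # Substitute x -> x^(p^j): multiply every exponent by the current scale.
        stretched = {tuple(scale * x for x in ell): c
                     for ell, c in power(f, digit).items()}
        result = multiply(result, stretched)
        scale *= p
    return result

f = {s: 1 for s in [(0, 1), (1, 0), (1, 1), (1, 2), (2, 1)]}
exact = power(f, 5)[(3, 6)]
reduced = lucas_rhs(f, 5, 3).get((3, 6), 0)
print(exact, reduced, (exact - reduced) % 3)  # 80 2 0
```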

Our final result in this section allows a fast computation of the coefficients $\binom{k}{\boldsymbol{\ell}}_{f}$ modulo a prime $p$ . See Granville [19] for the corresponding result for the special case of ordinary binomial coefficients.

Theorem 5.17.

Let $p$ be prime, $k\geq 0$ , $\boldsymbol{\ell}\in\mathbb{N}^{N}$ . Then,

[TABLE]

whereby $k=k_{0}+k_{1}p$ with $0\leq k_{0}<p$ , and $\boldsymbol{\ell}=\boldsymbol{\ell}_{0}+\mathbf{x}p$ , where each component $\ell$ of $\boldsymbol{\ell}_{0}$ satisfies $0\leq\ell<p$ .

Proof.

We have

[TABLE]

by Theorem 5.3 and therefore, with $k=k_{0}+k_{1}p$ , for $0\leq k_{0}<p$ ,

[TABLE]

Now, since $\binom{k}{\boldsymbol{\ell}}_{f}$ is the coefficient of $\mathbf{x}^{\boldsymbol{\ell}}$ of $\left(\sum_{\mathbf{s}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}\right)^{k_{0}+k_{1}p}$ , we have

[TABLE]

and the theorem follows after re-indexing the summation on the RHS. ∎

Example 5.18.

In the situation of Example 5.16, consider $p=3$ , $\boldsymbol{\ell}=(3,6)=(0,0)+(1,2)\cdot p$ and $k=5=2+1\cdot p$ . By Theorem 5.17, we have hence to consider sums of products of the form

[TABLE]

Since $f$ is zero outside of $\{(0,1),(1,0),(1,1),(1,2),(2,1)\}$ , $\mathbf{m}$ ranges over

$\{(0,0),(0,1),(1,1),(0,2)\}$ and the summation is the same as in Example 5.16.

For a more challenging example, let $p=7$ , $\boldsymbol{\ell}=(5,9)=(5,2)+(0,1)p$ and $k=8=1+1\cdot p$ . Here, we have to consider sums of products of the form

[TABLE]

Due to the specification of $f$, the only possible such term ($\mathbf{m}=(0,0)$) leads to the sum value of $0$. Indeed, $\binom{8}{(5,9)}_{f}=4368=7\cdot 2^{4}\cdot 39$.
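The reduction used in this example can be sketched in code. The function below assumes that Theorem 5.17 amounts to $\binom{k}{\boldsymbol{\ell}}_{f}\equiv\sum_{\mathbf{b}}\binom{k_{0}}{\boldsymbol{\ell}-p\mathbf{b}}_{f}\binom{k_{1}}{\mathbf{b}}_{f}\pmod{p}$, which is how the products in the example are formed; the names are illustrative, and the support of $f$ is the one used in Example 5.16.

```python
from collections import defaultdict

def power(f, k):
    """Coefficients of (sum_s f(s) x^s)^k, i.e. {ell: binom(k, ell)_f}."""
    poly = {(0,) * len(next(iter(f))): 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

def coeff_mod_p(f, k, ell, p):
    """binom(k, ell)_f modulo p via the split k = k0 + k1 * p."""
    k1, k0 = divmod(k, p)
    low, high = power(f, k0), power(f, k1)
    total = 0
    for b, cb in high.items():
        a = tuple(l - p * x for l, x in zip(ell, b))
        total += low.get(a, 0) * cb   # vectors with negative entries are absent
    return total % p

f = {s: 1 for s in [(0, 1), (1, 0), (1, 1), (1, 2), (2, 1)]}
exact = power(f, 8)[(5, 9)]
print(exact, exact % 7, coeff_mod_p(f, 8, (5, 9), 7))  # 4368 0 0
```

Only the small powers $k_{0}$ and $k_{1}$ are expanded here; for $k_{1}\geq p$ the same split could be applied recursively to the second factor.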

6 Congruences and identities for sums of $\binom{k}{\boldsymbol{\ell}}_{f}$

In this section, we consider divisibility properties and identities for sums of $\binom{k}{\boldsymbol{\ell}}_{f}$ . First, we focus on the number $c_{f}(\boldsymbol{\ell})=\sum_{k\geq 0}\binom{k}{\boldsymbol{\ell}}_{f}$ of vector compositions with arbitrary number of parts. In Theorems 6.8 and 6.10, we then investigate particular divisibility properties for the total number of all $f$ -weighted vector compositions of $\boldsymbol{\ell}$ where $\boldsymbol{\ell}$ ranges over particular sets $L$ and where the number of parts is fixed, that is, we evaluate divisibility of $\sum_{\boldsymbol{\ell}\in L}\binom{k}{\boldsymbol{\ell}}_{f}$ . We also generalize the notion of $s$ -color compositions [2] in this section and derive a corresponding identity.

First, we establish that $c_{f}(\boldsymbol{\ell})$ satisfies a weighted linear recurrence where the weights are given by $f$.

Theorem 6.1.

For $\boldsymbol{\ell}\in\mathbb{N}^{N}$ , ${\boldsymbol{\ell}}\neq\mathbf{0}$ , we have that

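$$c_{f}(\boldsymbol{\ell})=\sum_{\mathbf{m}\in\mathbb{N}^{N}}f(\mathbf{m})\,c_{f}(\boldsymbol{\ell}-\mathbf{m}),$$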

where we define $c_{f}(\mathbf{0})=1$ and $c_{f}(\boldsymbol{\ell})=0$ if any component $\ell$ of $\boldsymbol{\ell}$ is smaller than zero.

Proof.

An $f$ -weighted vector composition $[\mathbf{m}_{1},\ldots,\mathbf{m}_{k-1},\mathbf{m}_{k}]$ of $\boldsymbol{\ell}$ ends, in its last part, with exactly one of the values $\mathbf{m}=\mathbf{m}_{k}\in\mathbb{N}^{N}$ , and $\mathbf{m}$ may be colored in $f(\mathbf{m})$ different colors. Moreover, $[\mathbf{m}_{1},\ldots,\mathbf{m}_{k-1}]$ is a vector composition of $\boldsymbol{\ell}-\mathbf{m}$ . ∎
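The last-part argument translates directly into a memoized recursion. The following is a minimal sketch, assuming $f(\mathbf{0})=0$ so that the recursion terminates; `make_counter` is an illustrative name. It is tested on the indicator function of $\{(0,1),(1,0),(1,1)\}$, for which the values are the Delannoy numbers.

```python
from functools import lru_cache

def make_counter(f):
    """Return a function ell -> c_f(ell), built on the last-part recurrence."""
    parts = tuple(f.items())

    @lru_cache(maxsize=None)
    def c(ell):
        if any(x < 0 for x in ell):
            return 0          # no compositions of vectors with a negative entry
        if not any(ell):
            return 1          # c_f(0) = 1: the empty composition
        # Sum over the possible last parts m, each in f(m) colors.
        return sum(w * c(tuple(l - s for l, s in zip(ell, m))) for m, w in parts)

    return c

delannoy = make_counter({(0, 1): 1, (1, 0): 1, (1, 1): 1})
print([delannoy((n, n)) for n in range(5)])  # [1, 3, 13, 63, 321]
```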

Before investigating divisibility of $c_{f}(\boldsymbol{\ell})$ , we detail special cases of $c_{f}(\boldsymbol{\ell})$ that arise for particular $f$ .

Example 6.2.

When $f_{D}$ is the indicator function on the set

$D=\{(0,1),(1,0),(1,1)\}$ , then $c_{f_{D}}(\boldsymbol{\ell})=c_{f_{D}}(m,n)$ is the well-known Delannoy sequence [6], which counts the number of lattice paths from $(0,0)$ to $(m,n)$ with steps in $D$ (i.e., east, north, north-east). The underlying lattice paths are of interest in sequence alignment problems in computational biology and computational linguistics. They also appear in so-called edit distance problems [27] in which the minimal number of insertions and deletions is sought that transforms one sequence into another. Closed-form expressions for the Delannoy numbers are

[TABLE]

The weighted Delannoy numbers [37], for which $f_{\text{WD}}((1,0))=a$ ,

$f_{\text{WD}}((0,1))=b$ and $f_{\text{WD}}((1,1))=c$ , for integers $a,b,c\geq 1$ , have closed-form expression

[TABLE]

When $f_{W}$ is the indicator function on the set $W=\{(1,1),(1,2),(2,1),(2,2)\}$ , then $c_{f_{W}}(\boldsymbol{\ell})=c_{f_{W}}(m,n)$ are known as Whitney numbers [9]. The diagonals are listed as integer sequence A051286. A closed-form expression can be derived as

[TABLE]

The diagonals of $c_{f_{M}}$ , where $M=\{(1,1),(1,2),(2,1)\}$ , are listed as integer sequence A098479. The diagonals of $c_{f_{R}}$ , where $R=\{(x,y)\,|\,x\geq 1,y\geq 0\}$ , are listed as integer sequence A047781. The diagonals of $c_{f_{A}}$ , where $A=\{(x,y,z)\,|\,0\leq x,y,z\leq 1\}-\{\mathbf{0}_{3}\}$ , are listed as integer sequence A126086. They appear in alignment problems of multiple (in this case, three) sequences. The case of $c_{f_{S}}$ , for $S=\mathbb{N}^{N}-\{\mathbf{0}\}$ , counts the original vector compositions considered in [4]. A closed-form expression is given by

[TABLE]

It has been noted that $c_{f_{S}}(\ell,\ldots,\ell)=2^{\ell-1}c_{f_{U}}(\ell,\ldots,\ell)$ , where

$U=\{(s_{1},\ldots,s_{N})\,|\,s_{i}\in\{0,1\}\}-\{\mathbf{0}\}$ [11]. The latter numbers generalize the Delannoy numbers and admit the closed-form expression [42]

[TABLE]
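The relation $c_{f_{S}}(\ell,\ldots,\ell)=2^{\ell-1}c_{f_{U}}(\ell,\ldots,\ell)$ can be checked for $N=2$ and small $\ell$. In the sketch below, the infinite support $S$ is truncated to parts with entries at most $\ell$, which does not change $c_{f_{S}}(\ell,\ell)$; the helper names are illustrative.

```python
from functools import lru_cache
from itertools import product

def make_counter(f):
    """Return ell -> c_f(ell) via the last-part recurrence (requires f(0) = 0)."""
    parts = tuple(f.items())

    @lru_cache(maxsize=None)
    def c(ell):
        if any(x < 0 for x in ell):
            return 0
        if not any(ell):
            return 1
        return sum(w * c(tuple(l - s for l, s in zip(ell, m))) for m, w in parts)

    return c

def indicator(vectors):
    """Indicator weights on the given vectors, excluding the zero vector."""
    return {v: 1 for v in vectors if any(v)}

rows = []
for ell in (1, 2, 3):
    c_S = make_counter(indicator(product(range(ell + 1), repeat=2)))((ell, ell))
    c_U = make_counter(indicator(product(range(2), repeat=2)))((ell, ell))
    rows.append((ell, c_S, c_U))

print(rows)  # [(1, 3, 3), (2, 26, 13), (3, 252, 63)]
```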

Next, we generalize the concept of $s$ -color compositions for ordinary colored compositions, for which the weighting function is $f(s)=s$ for each part size $s$ , to weighted vector compositions. Of course, there are many possible extensions of the concept of $s$ -color integer compositions to vector compositions. The most natural is probably the following:

Definition 6.3.

We call an $f$ -weighted vector composition of $\boldsymbol{\ell}$ an $\mathbf{s}$ -color composition when

[TABLE]

for all $\mathbf{s}\in\mathbb{N}^{N}$ .

This definition inherently captures an independent labeling of the vector components $s_{1},\ldots,s_{N}$ into $s_{1}$ colors (for component $1$ ), $\ldots$ , $s_{N}$ colors (for component $N$ ). It is well-known that ordinary $s$ -color compositions [2] are closely related to “1-2-color compositions”, that is, integer compositions that only have part sizes in $\{1,2\}$ ; see, e.g., Shapcott [41]. The next theorem generalizes this relationship.

Theorem 6.4.

Let $f_{\text{prod}}(\mathbf{s})=s_{1}\cdots s_{N}$ for all $\mathbf{s}\in\mathbb{N}^{N}$ and let $g$ be the indicator function on $S=\{(s_{1},\ldots,s_{N})\,|\,s_{i}\in\{1,2\}\}$ . Then

[TABLE]

for all $\ell>0$ .

Proof sketch.

Let $(s_{1},\ldots,s_{N})^{1},\ldots,(s_{1},\ldots,s_{N})^{s_{1}\cdots s_{N}}$ be the $s_{1}\cdots s_{N}$ colorations of part size $(s_{1},\ldots,s_{N})$ . We bijectively re-write them to individual components $(s_{1}^{1},\ldots,s_{N}^{1}),\ldots,(s_{1}^{s_{1}},\ldots,s_{N}^{s_{N}})$ . Now, when we have a sum $\mathbf{s}^{r}+\mathbf{t}^{q}=\ell\mathbf{1}$ (and similarly for more than two terms) this reads in components

[TABLE]

where $r_{1},\ldots,r_{N}$ and $q_{1},\ldots,q_{N}$ denote the bijective re-writings. Consider this equation in each row, $s_{i}^{r_{i}}+t_{i}^{q_{i}}=\ell$ . Encode the integer composition $(s_{i}^{r_{i}},t_{i}^{q_{i}})$ of $\ell$ into the “cross-and-dash representation” of Shapcott [41] in which crosses separate parts and a part value of size $\pi$ with color $1\leq c\leq\pi$ is denoted by $\pi-1$ dashes and one cross in position $c$ . Then, as in Shapcott [41], Proposition 2, let crosses stand for 1s and dashes for 2s. This proves the bijection between $f_{\text{prod}}$ -weighted compositions and $g$ -weighted compositions.

The table below illustrates the $\mathbf{s}$ -color compositions of $(3,3)$ (into two parts) and the uniquely corresponding $g$ -weighted compositions (into four parts).

The table omits the further eight cases corresponding to $(1,1)+(2,2)$ and $(1,2)+(2,1)$ . ∎

Example 6.5.

The numbers of $f_{\text{prod}}$-weighted vector compositions of $(\ell,\ell)$ are given by the integer sequence

[TABLE]

for $\ell=1,2,3,\ldots$. The numbers of $g$-weighted vector compositions of $(\ell,\ell)$ are given by integer sequence A051286

[TABLE]
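The correspondence of Theorem 6.4 can be checked numerically for $N=2$. The sketch below assumes that the theorem reads $c_{f_{\text{prod}}}(\ell\mathbf{1})=c_{g}((2\ell-1)\mathbf{1})$, which is what the cross-and-dash encoding in the proof sketch yields (two parts summing to $3$ become four parts summing to $5$); the helper names are illustrative.

```python
from functools import lru_cache
from itertools import product

def make_counter(f):
    """Return ell -> c_f(ell) via the last-part recurrence (requires f(0) = 0)."""
    parts = tuple(f.items())

    @lru_cache(maxsize=None)
    def c(ell):
        if any(x < 0 for x in ell):
            return 0
        if not any(ell):
            return 1
        return sum(w * c(tuple(l - s for l, s in zip(ell, m))) for m, w in parts)

    return c

# g: indicator function on {1,2}^2.
g = {s: 1 for s in product((1, 2), repeat=2)}
count_g = make_counter(g)

s_color, one_two = [], []
for ell in (1, 2, 3, 4):
    # f_prod(s) = s_1 * s_2; parts with an entry above ell cannot occur.
    f_prod = {(a, b): a * b for a in range(1, ell + 1) for b in range(1, ell + 1)}
    s_color.append(make_counter(f_prod)((ell, ell)))
    one_two.append(count_g((2 * ell - 1, 2 * ell - 1)))

print(s_color, one_two)  # [1, 5, 26, 153] [1, 5, 26, 153]
```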

When $f$ is arbitrary but zero almost everywhere, that is, $f(\mathbf{x})\neq 0$ for only finitely many $\mathbf{x}$ , then $c_{f}(\boldsymbol{\ell})$ satisfies a linear recurrence by Theorem 6.1. When $N=1$ , that is, vectors $\boldsymbol{\ell}$ are one-dimensional, then $c_{f}$ satisfies an $m$ -th order linear recurrence of the form

[TABLE]

in this situation.

For such sequences, Somer [43] specifies varying congruence relationships, one of which translates to the following result in our context.

Theorem 6.6 ([16], Theorem 27).

Let $p$ be a prime and let $b$ be a nonnegative integer. Let $f:\mathbb{N}\rightarrow\mathbb{N}$ be zero almost everywhere, i.e., $f(x)=0$ for all $x>m$ for some positive $m$. Then

[TABLE]

∎

However, when $N>1$ , these results are not applicable. One possibility would be to project vectors in $\mathbb{N}^{N}$ onto $\mathbb{N}$ via a bijection $\tau:\mathbb{N}\rightarrow\mathbb{N}^{N}$ and then define new quantities $\tilde{c}_{f}$

[TABLE]

for which the findings of [43] and others might be applicable. The problem with such a specification is that the bijection does not lead, in general, to fixed order linear recurrences because $\tau$ can map different $n,n^{\prime}$ to ‘arbitrary’ points in $\mathbb{N}^{N}$ , so that e.g. $\tilde{c}_{f}(100)$ may be a function of $\tilde{c}_{f}(90)$ and $\tilde{c}_{f}(80)$ , but $\tilde{c}_{f}(1000)$ may be a function of $\tilde{c}_{f}(543)$ and $\tilde{c}_{f}(389)$ .

Another result, for $N=2$ , is the following. Consider the weighted Delannoy numbers for which $f_{\text{WD}}((1,0))=a$ , $f_{\text{WD}}((0,1))=b$ and $f_{\text{WD}}((1,1))=c$ as above. Razpet [37] shows that these numbers satisfy a ‘Lucas property’.

Theorem 6.7 ([37], Theorem 2).

Let $p$ be prime, $n\geq 1$ , and let integers $a_{k},b_{k}$ satisfy

[TABLE]

Then

[TABLE]

∎

Finally, we consider the number of $f$ -weighted vector compositions, with fixed number of parts, of all vectors $\boldsymbol{\ell}$ in some particular sets $L$ . Introduce the following notation:

[TABLE]

where $\mathcal{D}(N)$ is the set of $N\times N$ diagonal matrices with nonnegative integer entries. Note that ${k\brack\mathbf{r}}_{\mathbf{m},f}$ generalizes the binomial sum notation (cf. [45]). By the Vandermonde convolution, ${k\brack\mathbf{r}}_{\mathbf{m},f}$ satisfies

[TABLE]

Our first theorem in this context goes back to J. W. L. Glaisher, and its proof is inspired by the corresponding proof for binomial sums due to Sun (cf. [45], and references therein).

Theorem 6.8.

Let $\mathbf{m}=(m_{1},\ldots,m_{N})\in\mathbb{N}^{N}$ . For any prime $p\equiv 1\pmod{m_{i}}$ , for all $i=1,\ldots,N$ , and any $k\geq 1$ , $\mathbf{r}\in\mathbb{N}^{N}$ ,

[TABLE]

Proof.

For $k=1$ ,

[TABLE]

by Theorem 5.3. Now, in components, the equation $\mathbf{Am}+\mathbf{r}=p\mathbf{q}$ means that $a_{ii}m_{i}+r_{i}=pq_{i}$ . Since $p\equiv 1\pmod{m_{i}}$ , we have $q_{i}\equiv r_{i}\pmod{m_{i}}$ , i.e., $q_{i}=c_{i}m_{i}+r_{i}$ . In vector notation this means $\mathbf{q}=\mathbf{Cm}+\mathbf{r}$ for the diagonal matrix $\mathbf{C}$ with entries $C_{ii}=c_{i}$ . Therefore,

[TABLE]

using Theorem 5.7. The RHS is ${1\brack\mathbf{r}}_{\mathbf{m},f}$ . For $k>1$ , the result follows by induction using (12). ∎

Example 6.9.

Let $f$ be the indicator function on the set $\{(0,1),(1,0),(1,1)\}$. Let $p=5$, $k=2$, $\mathbf{m}=(4,1)$ and $\mathbf{r}=(1,0)$. To evaluate ${k\brack\mathbf{r}}_{\mathbf{m},f}$, we consider all matrices $\mathbf{A}\in\mathcal{D}(2)$ and all corresponding sums $\mathbf{Am}+\mathbf{r}$. Since it is impossible to write $m_{1}=4$ (or larger) as the sum of $k=2$ numbers in $\{0,1\}$, $a_{11}$ must be zero. The only suitable matrices are then

[TABLE]

The corresponding values $\mathbf{Am}+\mathbf{r}$ are

[TABLE]

We easily find that $\binom{k}{\boldsymbol{\ell}_{0}}_{f}=\binom{k}{\boldsymbol{\ell}_{1}}_{f}=2$ and therefore ${k\brack\mathbf{r}}_{\mathbf{m},f}=4$ . Similarly, for ${k+p-1\brack\mathbf{r}}_{\mathbf{m},f}$ , we have to evaluate matrices

[TABLE]

and correspondingly

[TABLE]

for $n=0,\ldots,6$ . Summing up yields ${k+p-1\brack\mathbf{r}}_{\mathbf{m},f}=204$ , which is indeed $\equiv 4\pmod{5}$ .
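The two sums in this example can be recomputed by brute force. The sketch below makes two assumptions that should be read as such: that ${k\brack\mathbf{r}}_{\mathbf{m},f}$ denotes $\sum_{\mathbf{A}\in\mathcal{D}(N)}\binom{k}{\mathbf{Am}+\mathbf{r}}_{f}$, as the evaluation above suggests, and that $f$ is the indicator function on $\{(0,1),(1,0),(1,1)\}$. The helper names are illustrative.

```python
from collections import defaultdict

def weighted_coeffs(f, k):
    """Return {ell: binom(k, ell)_f}, the coefficients of (sum_s f(s) x^s)^k."""
    dim = len(next(iter(f)))
    poly = {(0,) * dim: 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for ell, c in poly.items():
            for s, w in f.items():
                nxt[tuple(a + b for a, b in zip(ell, s))] += c * w
        poly = dict(nxt)
    return poly

def binomial_sum(f, k, r, m):
    """Sum of binom(k, A m + r)_f over diagonal A with nonnegative integer entries."""
    # ell = A m + r holds for some such A iff each ell_i - r_i is a
    # nonnegative multiple of m_i (all m_i are assumed to be positive).
    return sum(c for ell, c in weighted_coeffs(f, k).items()
               if all(l >= ri and (l - ri) % mi == 0
                      for l, ri, mi in zip(ell, r, m)))

f = {(0, 1): 1, (1, 0): 1, (1, 1): 1}   # assumed support, see the lead-in
p, k, m, r = 5, 2, (4, 1), (1, 0)

small = binomial_sum(f, k, r, m)
large = binomial_sum(f, k + p - 1, r, m)
print(small, large, (large - small) % p)  # 4 204 0
```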

Theorem 6.10.

Let $f(\mathbf{s})={0}$ for almost all $\mathbf{s}\in\mathbb{N}^{N}$ . Consider ${k\brack\mathbf{0}}_{\mathbf{1},f}$ , the row sum in row $k\geq 0$ , or, equivalently, the total number of $f$ -weighted vector compositions with $k$ parts. Let $M=\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})$ . Then

$${k\brack\mathbf{0}}_{\mathbf{1},f}=M^{k}$$

for all $k>0$ . This implies the congruences

[TABLE]

for any prime $p$ by Fermat’s little theorem, where $k=a_{0}+\cdots+a_{r}p^{r}$ , with $0\leq a_{i}<p$ for all $i=0,\ldots,r$ .

Proof.

Consider the equation $(\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}})^{k}=\sum_{\boldsymbol{\ell}\in\mathbb{N}^{N}}\binom{k}{\boldsymbol{\ell}}_{f}\mathbf{x}^{\boldsymbol{\ell}}$ . Plug in $\mathbf{x}=\mathbf{1}\in\mathbb{N}^{N}$ . ∎

Remark 6.11.

Note that the previous theorem generalizes the fact that the number of odd entries in row $k$ in Pascal’s triangle is a multiple of $2$ .

Example 6.12.

When $f$ is the indicator function on $\{(0,1),(1,0),(1,1)\}$, then $M=3$ and so the row sum in row $k>0$ is $3^{k}$ and thus always odd. To illustrate, for $k=1$, we have $\binom{k}{(0,1)}_{f}=\binom{k}{(1,0)}_{f}=\binom{k}{(1,1)}_{f}=1$, so their sum is $3$. For $k=2$, we have to consider all $\boldsymbol{\ell}=(x,y)$ with $x,y\leq 2$. We find for all $\boldsymbol{\ell}$ such that $\binom{2}{\boldsymbol{\ell}}_{f}$ is non-zero:

[TABLE]

Hence, their sum is indeed $9$ .

7 Asymptotics of $c_{f}(\boldsymbol{\ell})$

We can find asymptotics of $c_{f}(\boldsymbol{\ell})$ by looking at its multivariate generating function

[TABLE]

While methods for determining the asymptotic growth of the coefficients of a generating function in one variable are well-established [18], methods for generating functions of several variables are less ubiquitous. However, [35] and [34] discuss such cases. Particularly simple results obtain when $J(\mathbf{x}):=1-\sum_{\mathbf{s}\in\mathbb{N}^{N}}f(\mathbf{s})\mathbf{x}^{\mathbf{s}}$ is symmetric in $\mathbf{x}$ .

For instance, [35] discuss the case when $f$ is the indicator function on $\{(1,0),(0,1),(1,1)\}$, so that $J(x,y)=1-x-y-xy$. They determine the set of “critical points”, that is, the points $(x_{0},y_{0})$ in the positive orthant that satisfy $J(x_{0},y_{0})=0$ and $x_{0}\frac{\partial J(x_{0},y_{0})}{\partial x}=y_{0}\frac{\partial J(x_{0},y_{0})}{\partial y}$. They find that $(x_{0},y_{0})=(L-1,L-1)$, where $L=\sqrt{2}$, is the only solution, from which follows the asymptotic

[TABLE]

using their Theorems 3.2 and 3.3. More general cases such as when $f$ is the indicator function on $\{\mathbf{x}\neq\mathbf{0}\in\mathbb{N}^{N}\,|\,x_{i}\in\{0,1\}\}$ or on $\{\mathbf{x}\in\mathbb{N}^{N}\,|\,x_{i}\in\{1,2\}\}$ can be solved analogously, but require more work to find the critical points and the implied asymptotics.
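For the Delannoy case $J(x,y)=1-x-y-xy$, the critical point can be recovered by hand. The short derivation below is a sketch of that computation, with the critical-point condition taken as $x\,\partial J/\partial x=y\,\partial J/\partial y$ together with $J=0$.

```latex
\begin{align*}
x\,\frac{\partial J}{\partial x} &= -x(1+y), &
y\,\frac{\partial J}{\partial y} &= -y(1+x),\\
x\,\frac{\partial J}{\partial x} = y\,\frac{\partial J}{\partial y}
  &\iff x + xy = y + xy \iff x = y,\\
J(x,x) = 1 - 2x - x^{2} = 0 &\iff x = -1 \pm \sqrt{2}.
\end{align*}
```

The only root in the positive orthant is $x_{0}=y_{0}=\sqrt{2}-1$, in agreement with $(L-1,L-1)$ for $L=\sqrt{2}$.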

Theorem 7.1 ([20], Theorem 2).

Let $f$ be the indicator function on $S=\{(s_{1},\ldots,s_{N})\,|\,s_{i}\in\{0,1\}\}-\{\mathbf{0}\}$ . Then

[TABLE]

∎

Theorem 7.2 ([17], Theorem 4).

Let $f$ be the indicator function on $S=\{(s_{1},\ldots,s_{N})\,|\,s_{i}\in\{1,2\}\}$ . Moreover, let $\phi=\frac{\sqrt{5}-1}{2}$ and let $A=-\phi^{N-1}(1+\phi)^{N-1}(1+2\phi)$ . Define $h=N\Bigl{(}\frac{\phi}{1+3\phi+2\phi^{2}}\Bigr{)}^{N-1}$ and $b_{0}=\frac{1}{-\phi A\sqrt{(2\pi)^{N-1}h}}$ . Then

[TABLE]

∎

Example 7.3.

For the $f$ in the last theorem, the number $c_{f}(9,9,9)$ equals $17,899$ while the approximation formula has $18,955.30\ldots$ , which amounts to a relative error of less than $6\%$ .
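The exact value in this example can be recomputed from a simple closed form. A composition of $(\ell,\ldots,\ell)$ with $k$ parts in $\{1,2\}^{N}$ consists of $N$ integer compositions of $\ell$ into $k$ parts of size $1$ or $2$, and there are $\binom{k}{\ell-k}$ of these per row. The sketch below uses this observation; the function name is illustrative, and the approximate value is the one quoted in the text.

```python
from math import comb

def count_12_compositions(ell, dim):
    """Number of compositions of (ell, ..., ell) in N^dim with part entries in {1, 2}."""
    # With k parts, each row has (ell - k) twos among k positions.
    return sum(comb(k, ell - k) ** dim for k in range((ell + 1) // 2, ell + 1))

exact = count_12_compositions(9, 3)
approx = 18955.30                      # approximation reported in Example 7.3
rel_err = (approx - exact) / exact

print(exact, round(rel_err, 4))  # 17899 0.059
```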

Note that $c_{f}$ in the last theorem is closely related to $c_{f_{\text{prod}}}$ by Theorem 6.4, which immediately yields another asymptotic formula.

8 Prime criteria

Mann and Shanks’ [28] prime criterion states that an integer $q$ is prime if and only if $m$ divides the adjusted binomial coefficients $\binom{m}{q-2m}$ for all $m$ with $0\leq 2m\leq q$ . This criterion can be extended to $f$ -weighted integer compositions ( $N=1$ ) when $f$ takes on the value $1$ for all elements inside the ‘unit sphere’, that is, 0 and 1 [13, 17]. For $N\geq 1$ , it is tempting to conjecture as follows.

Conjecture 8.1.

Let $f(\mathbf{x})=1$ for all $\mathbf{x}\in U_{\mathbf{0}}=\{\mathbf{s}\in\mathbb{N}^{N}\,|\,s_{i}\in\{0,1\}\}$ . Then, an integer $q>1$ is prime if and only if $m$ divides $\binom{m}{q\mathbf{1}-2m\mathbf{1}}_{f}$ for all integers $m$ with $0\leq 2m\leq q$ .

If $q$ is prime, then indeed $\binom{m}{q\mathbf{1}-2m\mathbf{1}}_{f}\equiv 0\pmod{m}$ for all integers $m$ with $0\leq 2m\leq q$ . This is a simple consequence of Theorem 5.11. Conversely, when $q$ is not prime, then $q$ is odd or even. When $q$ is even, $m=q/2$ does not divide $\binom{m}{q\mathbf{1}-2m\mathbf{1}}_{f}=\binom{m}{\mathbf{0}}_{f}=f(\mathbf{0})^{q/2}=1$ . However, when $q$ is odd, the situation is more difficult. Mann and Shanks choose $m=(q-p)/2=pn$ , for a prime divisor $p$ of $q$ and a suitable $n$ . This choice is appropriate for $N=1$ and the stated requirements on $f$ . However, already for $N=2$ , we find a counter-example to this choice (when $f$ is the indicator function on $\{(0,0),(0,1),(1,0),(1,1)\}$ ). Namely, when $q=55$ , then $p=5$ is a prime divisor of $q$ and we have $m=pn$ where $n=5$ . Then $\binom{pn}{p\mathbf{1}}_{f}\equiv\binom{pn}{p}+\frac{(pn)!}{(p!)^{2}(p(n-2))!}\pmod{pn}$ by Theorem 5.13 and Example 5.14. Numerical evaluation shows that this sum is $\equiv 5+20\equiv 0\pmod{pn}$ . However, while this choice is not suitable, there are others for $q=55$ (namely $m=20,22$ ). We leave Conjecture 8.1 as an open problem.
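The case $q=55$, $N=2$ can be examined numerically. The sketch below evaluates $\binom{m}{(j,j)}_{f}$ for the indicator function on $\{0,1\}^{2}$ through its multinomial expansion and lists the values of $m$ that witness compositeness; the function name is illustrative, and $m=0$ is skipped since divisibility by $0$ is not meaningful.

```python
from math import factorial

def coeff_unit_square(m, j):
    """binom(m, (j, j))_f for f the indicator function on {0,1}^2."""
    total = 0
    for c in range(j + 1):        # c = number of parts equal to (1,1)
        zeros = m - 2 * j + c     # remaining parts equal to (0,0)
        if zeros >= 0:
            # j - c parts equal (1,0) and j - c parts equal (0,1).
            total += factorial(m) // (
                factorial(j - c) ** 2 * factorial(c) * factorial(zeros))
    return total

q = 55
# Values of m for which m does not divide binom(m, (q - 2m) 1)_f.
witnesses = [m for m in range(1, q // 2 + 1)
             if coeff_unit_square(m, q - 2 * m) % m]

print(coeff_unit_square(25, 5) % 25)    # 0: m = 25 is not a witness
print(coeff_unit_square(20, 15) % 20)   # 16: m = 20 is a witness
print(witnesses)
```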

9 Conclusion

Many extensions of our results are conceivable. We have shown that the basis for weighted vector compositions are sums of independent and identically distributed random vectors. Other types of compositions can be investigated in which part sizes are correlated [10]. The basis for this class of compositions would be sums of dependent random vectors. Many approximations both for dependent and independent sums of random variables are known, e.g., [24]. How do these translate to approximation results for weighted compositions? Finally, we have generalized weighted integer compositions to weighted vector compositions. One could further generalize to weighted matrix compositions or general weighted tensor compositions, that is, compositions of arbitrary multidimensional arrays.

Acknowledgements

The author wishes to thank the anonymous referee for helpful suggestions.

Bibliography


[1] M. Abrate, S. Barbero, U. Cerruti, and N. Murru. Colored compositions, invert operator and elegant compositions with the “black tie”. Discrete Math., 335:1–7, 2014.
[2] A. K. Agarwal. $n$-colour compositions. Indian J. Pure Appl. Math., 31:1421–1427, 2000.
[3] P. G. Anderson, A. T. Benjamin, and J. A. Rouse. Combinatorial proofs of Fermat’s, Lucas’s, and Wilson’s theorems. Amer. Math. Monthly, 112(3):266–268, 2005.
[4] G. E. Andrews. The theory of partitions, II: Simon Newcomb’s problem. Utilitas Math., 7:33–54, 1975.
[5] C. Babbage. Demonstration of a theorem relating to prime numbers. The Edinburgh Philosophical Journal, 1:46–49, 1819.
[6] C. Banderier and S. Schwer. Why Delannoy numbers? J. Statist. Plann. Inference, 135(1):40–54, 2005.
[7] D. Birmajer, J. Gil, and M. D. Weiner. Compositions colored by simplicial polytopic numbers. J. Comb., 9(2):221–232, 2018.
[8] R. Bollinger and C. Burchard. Lucas’s theorem and some related results for extended Pascal triangles. Amer. Math. Monthly, 97(3):198–204, Mar. 1990.
