Let D be a countable set of unordered colors, also called
“data”, and let (X,+,0) be a commutative monoid. An X-valued
data vector is a mapping D→X with finite support,
i.e. with finitely many colors mapping to nonzero values. For example,
consider D={r,g,b,…} and X=N under addition. Here, a data
vector is a finite multiset over D, e.g. 0 denotes
the data vector assigning 0 to every d∈D, and
{{r,r,b}} is
represented by the data vector v with v(r)=2, v(b)=1, and v(d)=0 for all d∈D∖{r,b}.
An X-valued abstract data vector, over a finite set of variables
Vars, is a mapping a:Vars→X. Informally, a represents the infinite (but orbit-finite)
family of data vectors where the variables are renamed by distinct
colors. For example, a={x↦2,y↦1}
represents
Given an injection ι:Vars→D, we
write aι for the data vector v defined by
v(d)=a(ι−1(d)) if d is in the image of ι,
and v(d)=0 otherwise. For example, a{x↦r,y↦b}={{r,r,b}}.
Given a set A of abstract data vectors, the set of data vectors
generated by A, denoted ⟨A⟩, is defined inductively
by v=0 or v=w+aι for some
w∈⟨A⟩, a∈A and injection ι:Vars→D.
For example, consider D={r,g,b,…}, Vars={x,y} and X=Z under addition. Let A={a,b} where
a={x↦2,y↦1} and b={x↦1,y↦−2}. We have v={r↦3,b↦2}∈⟨A⟩ since
We are interested in the expressibility problem which asks, given a
data vector v and a finite set A of abstract data vectors,
whether v∈⟨A⟩. This problem arises, e.g., in
the study of unordered data Petri nets. We will see how one can decide
the expressibility problem, by revisiting the work of Hofman, Leroux
and Totzke1.
Edge chromatic number
To tackle the expressibility problem, we briefly turn to graph
theory. An edge coloring of an undirected graph G=(V,E) is a
function f:E→N where f(e)=f(e′) implies that e and
e′ do not share any vertex. The size of f is the size of its
image. The edge chromatic number of G is the size of a minimal
edge coloring. For example, the edge chromatic number of this graph is
4:
König’s line coloring
theorem
states that the edge chromatic number of a bipartite graph is its
maximum degree. For example, the above bipartite graph has maximum
degree 2. We provide a proof2 of the theorem for the slightly more
general case of multigraphs, i.e. where parallel edges are allowed.
The edge chromatic number of a bipartite multigraph equals its maximum degree.
Proof.
Let G=(X∪Y,E) be a bipartite multigraph. We proceed by
induction on ∣E∣. If ∣E∣=0, then we are trivially done. Assume
that ∣E∣≥1. Let Δ denote the maximum degree of G. Let
e∈E be an arbitrary edge, and let x∈X and y∈Y be its
endpoints. Let G′=(X∪Y,E′) be the graph obtained by removing
e from G. As the maximum degree of G′ is at most Δ, by
induction hypothesis, there is an edge coloring f′:E′→[1..Δ] of G′.
Since degG′(x),degG′(y)<Δ, there
exist i,j∈[1..Δ] such that vertex x is not incident with
an i-edge in f′, and likewise for y and j. If i=j, then we
are done by coloring e with i. So, assume i=j. Without loss
of generality, we assume that x is incident with a j-edge in f′,
as otherwise we could have set i=j.
We construct a path. Let us start in x0=x and take a
j-edge e1∈E′ going to some vertex v1. If
e1 is incident with an i-edge e2∈E′, then we
take it and move to some vertex v2. We continue this process
alternating with j-edges and
i-edges. Since f′ is an edge coloring, a vertex v
cannot repeat along this path, as otherwise v would be incident with
two edges of the same color. Thus, we may obtain a maximal simple path
P of the form
v0e1v1e2v2e3⋯vn, where vi∈X iff i is even.
By assumption on x, we have n≥1. Let g′:E′→[1..Δ] be obtained from f′ by swapping i and
j on the edges of P. Note that g′ is an
edge coloring.
Click for a proof.
For the sake of contradiction, suppose that some vertex u is
incident to two edges of the same color k in g′. We must have u=vℓ for some ℓ∈[0..n] and k∈{i,j}. We cannot have ℓ=0, as v0=x is not
incident to an i-edge in f′. We cannot have 1≤ℓ<n, as we swapped the occurrences of the two colors. We cannot
have ℓ=n as otherwise we could have extended P to a longer
path.
Recall that y is not incident to a j-edge in
f′. Thus, if y appeared along P, we would have vn=y with
n even, which is impossible as G is bipartite. Thus, x is the
first vertex of P, and y does not appear along P. Consequently,
in the edge coloring g′, neither x nor y is incident to a
j-edge.
Let g:E→[1..Δ] be g′ extended with g(e)=j. We are done since g is an edge coloring of G.
Note that the proof of Theorem 1 is constructive. For example, the
edge coloring of the previous example was obtained as follows:
Perfect matchings on multisets
We now introduce a useful notion of matchings that can be
characterized thanks to Theorem 1. Let X be a finite
collection of finite multisets X1,…,Xk:D→N. A transversal of X is a tuple (d1,…,dk)∈Dk where di∈Xi and di=dj for all i=j. A perfect matching of X is a finite multiset
T of transversals such that Xi={{di:(d1,…,dk)∈T}} for each i∈[1..k].
For example, consider X1={{r,r,g}}, X2={{r,g,b}} and X3={{g,b,b}}. This is a perfect matching:
T={{(r,g,b),(r,b,g),(g,r,b)}}.
Perfect matchings do not always exist, e.g. for X1={{r,r}} and X2={{r,b}}.
There is a perfect matching iff ∣X1∣=⋯=∣Xk∣=n and
⋀d∈D∑1≤i≤kXi(d)≤n.
Proof.
⇒) Let n be the number of transversals of the perfect
matching. Since each element of Xi must belong to exactly one
transversal, we must have ∣X1∣=⋯=∣Xk∣=n. Let d∈D. Each transversal contains at most one occurrence of
d. As there are n transversals, this means that X1(d)+…+Xk(d)≤n.
⇐) Let U=[1..k], and let V⊆D be
the finite subset of colors occurring at least once in some Xi. Let
G=(U∪V,E) be the bipartite multigraph where there are
Xi(d) edges of the form {i,d}. By assumption, we have
deg(u)=n for each u∈U, and deg(v)≤n
for each v∈V. Thus, the maximum degree of G is n, and hence
its edge chromatic number is n by Theorem 1.
Let f:E→[1..n] be an edge coloring of G. For every i∈[1..n] and j∈[1..k], let ei,j be the unique edge
adjacent with vertex j∈U such that f(ei,j)=i. Let di,j∈V be the other vertex adjacent with ei,j. Since f is
an edge coloring, the tuple ti=(di,1,…,di,k) is a
transversal. Thus, T={{t1,…,tn}} is a perfect
matching.
Characterizing expressibility with perfect matchings
Recall that we consider X-valued data vectors where (X,+,0) is a
commutative monoid. For every c∈N and x∈X, let c⋅x denote x+…+x with c terms. We obtain the following
characterization:
Let a be an abstract data vectors over Vars={x1,…,xk}. It is the case that v∈⟨a⟩ iff there exist finite multisets X1,…,Xk:D→N such that
⇒) By assumption there exist injections ι1,…,ιn:Vars→D satisfying v=∑j∈[1..n]aιj. Let Xi:D→N be defined by Xi(d)=∣{{j∈[1..n]:ιj(xi)=d}}∣. For every d∈D, we have
v(d)=i∈[1..k]∑Xi(d)⋅a(xi).
For each i∈[1..n], let ti=(ιi(x1),…,ιi(xk)). By injectivity of ιi, the tuple ti is a
transversal. Thus, by definition, {{t1,…,tn}}
is a perfect matching of X1,…,Xk.
⇐) Let T be a perfect matching for X1,…,Xk. For each t∈T, let ιt(xi)=t[i]. Since t is a
transversal, ιt is an injection. For every d∈D,
we have
Recall that v(d)=∑i∈[1..k]Xi(d)⋅a(xi). Hence, we have v=∑t∈Taιt, and so v∈⟨a⟩.
Deciding expressibility
We now combine our characterizations and bound the number of colors
necessary to witness expressibility.
Let A={a1,…,am} be a finite set of abstract
data vectors over Vars={x1,…,xk}. It is the
case that v∈⟨A⟩ iff there exist n1,…,nm∈N, finite multisets X1,1,…,Xm,k:D→N and data vectors v1,…,vm such
that the following holds for each i∈[1..m]:
We have v∈⟨A⟩ iff v=v1+…+vm for some v1∈⟨a1⟩,…,vm∈⟨am⟩. Thus, by Proposition 3
and Lemma 2, we have v∈⟨A⟩ iff there exist
n1,…,nm∈N and finite multisets X1,1,…,Xm,k:D→N satisfying Item 1 of the claim.
It remains to prove Item 2, i.e. that the number of colors can be
bounded. Let us pick a solution minimizing the size of S=supp(X1,1+…+Xm,k). Let us show that ∣S∣≤m(2k−1)+∣supp(v)∣+1.
We say that d∈D is i-dominant if Xi,1(d)+…+Xi,k(d)>ni/2. Let Di denote the set of
i-dominant colors. We have ∣Di∣<2k, as otherwise we would
obtain this contradiction:
We say that a color is non-dominant if it is not i-dominant for
any i∈[1..m]. For the sake of contradiction, suppose that ∣S∣≥m(2k−1)+∣supp(v)∣+2. By the pigeonhole
principle, there exist distinct non-dominant colors e,e′∈S not
appearing in supp(v).
We merge color e′ into color e by defining
Xi,j′(e′)Xi,j′(e)Xi,j′(f)=0,=Xi,j(e)+Xi,j(e′),=Xi,j(f)for all f∈/{e,e′}.
Observe that X1,1′,…,Xm,k′ satisfy Item 1 of the claim,
namely:
⋀d∈Dv(d)=∑i∈[1..m],j∈[1..k]Xi,j′(d)⋅a(xj) holds since e,e′∈/supp(v);
∣Xi,j′∣=∣Xi,j∣=ni;
⋀i∈[1..m],d∈D∑j∈[1..k]Xi,j(d)≤ni holds since e and e′ are non-dominant.
This contradicts the minimality of S since S′=supp(X1,1′+…+Xm,k′)=S∖{e′}.
The expressibility problem is in NP for Zℓ-valued data vectors
under addition.
Proof.
Theorem 4 yields a system of linear inequalities for which at most polynomially many colors are needed. Thus, expressibility reduces to integer linear programming feasability.