Projecting onto rectangular matrices with prescribed row and column sums
Fixed Point Theory and Algorithms for Sciences and Engineering volume 2021, Article number: 23 (2021)
Abstract
In 1990, Romero presented a beautiful formula for the projection onto the set of rectangular matrices with prescribed row and column sums. Variants of Romero’s formula were rediscovered by Khoury and by Glunt, Hayden, and Reams for bistochastic (square) matrices in 1998. These results have found various generalizations and applications.
In this paper, we provide a formula for the more general problem of finding the projection onto the set of rectangular matrices with prescribed scaled row and column sums. Our approach is based on computing the Moore–Penrose inverse of a certain linear operator associated with the problem. In fact, our analysis holds even for Hilbert–Schmidt operators, and we do not have to assume consistency. We also perform numerical experiments featuring the new projection operator.
1 Motivation
A matrix in \(\mathbb{R}^{n\times n}\) is called bistochastic if all its entries are nonnegative and all its row and column sums equal 1. More generally, a matrix is generalized bistochastic if the nonnegativity requirement is dropped. The bistochastic matrices form a convex polytope B in \(\mathbb{R}^{n\times n}\), commonly called the Birkhoff polytope, whose extreme points are the permutation matrices (a seminal result due to Birkhoff and von Neumann). A lovely formula provided in 1998 by Khoury [8], and also by Glunt, Hayden, and Reams [5], gives the projection of any matrix onto G, the affine subspace of generalized bistochastic matrices (see Example 3.8). More generally, the set of nonnegative rectangular matrices with prescribed row and column sums is called a transportation polytope. If the nonnegativity assumption is dropped, then Romero had already provided an explicit formula in 1990 (see Remark 3.5), which even predates the square case! On the other hand, the projection onto the set of nonnegative matrices N is simple: just replace every negative entry with 0. No explicit formula is known for the projection onto the set of bistochastic matrices; however, because \(B=G\cap N\), one may apply algorithms such as Dykstra’s algorithm to iteratively approximate the projection onto B by using the projection operators \(P_{G}\) and \(P_{N}\) (see, e.g., Takouda [12] for details). For transportation polytopes, algorithms that even converge in finitely many steps were provided by Calvillo and Romero [4].
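For illustration, here is a minimal NumPy sketch (variable names are ours, not from the literature) of the projection \(P_{N}\) just mentioned: replacing every negative entry with 0 is an entrywise clipping.

```python
import numpy as np

def project_nonnegative(T):
    """Projection onto N, the set of entrywise nonnegative matrices:
    replace every negative entry with 0."""
    return np.maximum(T, 0.0)

T = np.array([[0.7, -0.2],
              [-1.3, 1.8]])
print(project_nonnegative(T))   # [[0.7 0. ]
                                #  [0.  1.8]]
```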
The goal of this paper is to provide explicit projection operators in more general settings. Specifically, we present a projection formula for finding the projection onto the set of rectangular matrices with prescribed scaled row and column sums. Such problems arise, e.g., in discrete tomography [13] and the study of transportation polytopes [4]. Our approach uses the Moore–Penrose inverse of a certain linear operator \(\mathcal{A}\). It turns out that our analysis works even for Hilbert–Schmidt operators because the range of \(\mathcal{A}\) can be determined and seen to be closed. Our main references are [3, 7] (for Hilbert–Schmidt operators) and [6] (for the Moore–Penrose inverse). We also note that consistency is not required.
The paper is organized as follows. After recording a useful result involving the Moore–Penrose inverse at the end of this section, we prove our main results in Sect. 2. These results are then specialized to rectangular matrices in Sect. 3. We then turn to numerical experiments in Sect. 4, where we compare the performance of three popular algorithms: Douglas–Rachford, the method of alternating projections, and Dykstra.
We conclude this introductory section with a result which we believe to be part of the folklore (although we were not able to pinpoint a crisp reference). It is formulated using the Moore–Penrose inverse of an operator—for the definition of the Moore–Penrose inverse and its basic properties, see [6] (and also [3, pages 57–59] for a crash course). The formula presented works even in the case when the problem is inconsistent and automatically provides a least squares solution.
Proposition 1.1
Let \(A\colon X\to Y\) be a continuous linear operator with closed range between two real Hilbert spaces. Let \(b\in Y\), set \(\bar{b}:= P_{\operatorname{ran}A}b\), and set \(C:= A^{-1}\bar{b}\). Then
\((\forall x\in X)\quad P_{C}x = x - A^{\dagger }(Ax-b),\)
where \(A^{\dagger }\) denotes the Moore–Penrose inverse of A.
Proof
Clearly, \(\bar{b}\in \operatorname{ran}A\); hence, \(C\neq \varnothing \). Let \(x\in X\). It is well known (see, e.g., [3, Example 29.17(ii)]) that
On the other hand,
using the fact that \(AA^{\dagger }= P_{\operatorname{ran}A}\) (see, e.g., [3, Proposition 3.30(ii)]) and \(A^{\dagger }A A^{\dagger }= A^{\dagger }\) (see, e.g., [6, Section II.2]). Altogether, \(P_{C}x=x-A^{\dagger }(Ax-b)\) as claimed. □
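In finite dimensions, Proposition 1.1 is easy to check numerically with `numpy.linalg.pinv`; the following sketch (with data chosen by us purely for illustration) also covers an inconsistent right-hand side b.

```python
import numpy as np

# Proposition 1.1: with C = A^{-1}(P_{ran A} b), the projection of x onto C
# is P_C(x) = x - A^+(A x - b), where A^+ is the Moore-Penrose inverse.
A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])   # rank one, so ran A is a line in R^2
b = np.array([1.0, 0.0])          # b is not in ran A: the system A x = b is inconsistent
x = np.array([3.0, -1.0, 5.0])

A_pinv = np.linalg.pinv(A)
p = x - A_pinv @ (A @ x - b)      # candidate for P_C(x)

# p satisfies the normal equations (so A p = P_{ran A} b), and x - p lies in
# ran A^* = (ker A)^perp; together these characterize the projection onto C.
print(np.allclose(A.T @ (A @ p), A.T @ b))          # True
print(np.allclose(A_pinv @ (A @ (x - p)), x - p))   # True
```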
2 Hilbert–Schmidt operators
From now on, we assume that
which in turn give rise to the real Hilbert space
Hilbert–Schmidt operators encompass rectangular matrices (even with infinitely many entries, as long as these are square summable) as well as certain integral operators. (We refer the reader to [7, Sect. 2.6] for basic results on Hilbert–Schmidt operators and also recommend [10, Section VI.6].) Moreover, \(\mathcal{H}\) contains (and is generated by) the rank-one operators of the form
where \((v,u)\in Y\times X\), and with adjoint
and
Moreover,
For the rest of the paper, we fix
\(e\in X\) and \(f\in Y\),
and set
\(\mathcal{A}\colon \mathcal{H}\to Y\times X\colon T\mapsto (Te,T^{*}f).\)
Proposition 2.1
\(\mathcal{A}\) is a continuous linear operator and \(\|\mathcal{A}\| = \sqrt{\|e\|^{2}+\|f\|^{2}}\).
Proof
Clearly, \(\mathcal{A}\) is a linear operator. Moreover, \((\forall T\in \mathcal{H})\) \(\|\mathcal{A}(T)\|^{2} = \|Te\|^{2} + \|T^{*}f\|^{2} \leq \|T\|_{ \mathsf{op}}^{2}\|e\|^{2}+ \|T^{*}\|_{\mathsf{op}}^{2}\|f\|^{2} \leq \|T\|^{2}(\|e\|^{2}+\|f\|^{2})\) because the Hilbert–Schmidt norm dominates the operator norm. It follows that \(\mathcal{A}\) is continuous and \(\|\mathcal{A}\|\leq \sqrt{\|e\|^{2}+\|f\|^{2}}\). On the other hand, if \(T = f\otimes e\), then \(\|T\| = \|e\|\|f\|\), \(\mathcal{A}(T) = (\|e\|^{2}f,\|f\|^{2}e)\) and hence \(\|\mathcal{A}(T)\|=\|T\|\sqrt{\|e\|^{2}+\|f\|^{2}}\). Thus \(\|\mathcal{A}\| \geq \sqrt{\|e\|^{2}+\|f\|^{2}}\). Combining these observations, we obtain altogether that \(\|\mathcal{A}\| = \sqrt{\|e\|^{2}+\|f\|^{2}}\). □
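In the matrix setting of Sect. 3, where \(\mathcal{A}(T)=(Te,T^{\intercal }f)\), Proposition 2.1 can be confirmed numerically by representing \(\mathcal{A}\) on the column-stacked vector \(\operatorname{vec}(T)\) via Kronecker products; a small sketch with randomly generated data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 5
e = rng.standard_normal(n)   # e in X = R^n
f = rng.standard_normal(m)   # f in Y = R^m

# Matrix representation of A: T |-> (T e, T^T f), acting on the column-stacked vec(T).
M = np.vstack([np.kron(e, np.eye(m)),      # block mapping vec(T) to T e
               np.kron(np.eye(n), f)])     # block mapping vec(T) to T^T f

operator_norm = np.linalg.svd(M, compute_uv=False)[0]   # largest singular value
print(np.isclose(operator_norm, np.sqrt(e @ e + f @ f)))   # True
```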
We now prove that \(\operatorname{ran}\mathcal{A}\) is always closed.
Proposition 2.2
(Range of \(\mathcal{A}\) is closed)
The following hold:
-
(i)
If \(e=0\) and \(f=0\), then \(\operatorname{ran}\mathcal{A}= \{0\}\times \{0\}\).
-
(ii)
If \(e=0\) and \(f\neq 0\), then \(\operatorname{ran}\mathcal{A}= \{0\}\times X\).
-
(iii)
If \(e\neq 0\) and \(f=0\), then \(\operatorname{ran}\mathcal{A}= Y\times \{0\}\).
-
(iv)
If \(e\neq 0\) and \(f\neq 0\), then \(\operatorname{ran}\mathcal{A}= \{(f,-e)\}^{\perp }\).
Consequently, \(\operatorname{ran}\mathcal{A}\) is always a closed linear subspace of \(Y\times X\).
Proof
(i): Clear.
(ii): Obviously, \(\operatorname{ran}\mathcal{A}\subseteq \{0\}\times X\). Conversely, let \(x\in X\) and set
Then \(Te=T0 = 0\) and
and thus \((0,x)=(Te,T^{*}f)=\mathcal{A}(T)\in \operatorname{ran}\mathcal{A}\).
(iii): Obviously, \(\operatorname{ran}\mathcal{A}\subseteq Y\times \{0\}\). Conversely, let \(y\in Y\) and set
Then \(T^{*}f=T^{*}0 = 0\) and
and thus \((y,0)=(Te,T^{*}f)=\mathcal{A}(T)\in \operatorname{ran}\mathcal{A}\).
(iv): If \((y,x)\in \operatorname{ran}\mathcal{A}\), say \((y,x)=\mathcal{A}(T)=(Te,T^{*}f)\) for some \(T\in \mathcal{H}\), then
i.e., \((y,x)\perp (f,-e)\). It follows that \(\operatorname{ran}\mathcal{A}\subseteq \{(f,-e)\}^{\perp }\).
Conversely, let \((y,x)\in \{(f,-e)\}^{\perp }\), i.e., \(\langle {e},{x} \rangle = \langle {f},{y} \rangle \).
Case 1: \(\langle {e},{x} \rangle = \langle {f},{y} \rangle \neq 0\).
Set
Note that
and
therefore, \((y,x)=(Te,T^{*}f)=\mathcal{A}(T)\in \operatorname{ran}\mathcal{A}\).
Case 2: \(\langle {e},{x} \rangle = \langle {f},{y} \rangle =0\).
Pick ξ and η in \(\mathbb{R}\) such that
and set
Then
and
Thus \((y,x)=(Te,T^{*}f)=\mathcal{A}(T)\in \operatorname{ran}\mathcal{A}\). □
We now turn to the adjoint of \(\mathcal{A}\).
Proposition 2.3
(Adjoint of \(\mathcal{A}\))
We have
Proof
Let \(T\in \mathcal{H}\) and \((y,x)\in Y\times X\). Let B be any orthonormal basis of X. Then
which proves the result. □
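In the matrix setting, the adjoint computed in Proposition 2.3 is presumably \(\mathcal{A}^{*}(y,x)=y\otimes e+f\otimes x=ye^{\intercal }+fx^{\intercal }\) (the display itself is not reproduced here); in any case, this formula satisfies the defining identity \(\langle \mathcal{A}(T),(y,x)\rangle =\langle T,\mathcal{A}^{*}(y,x)\rangle \), as the following quick check illustrates.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
e, x = rng.standard_normal(n), rng.standard_normal(n)   # vectors in X = R^n
f, y = rng.standard_normal(m), rng.standard_normal(m)   # vectors in Y = R^m
T = rng.standard_normal((m, n))                          # T in H = R^{m x n}

lhs = (T @ e) @ y + (T.T @ f) @ x                        # <A(T), (y, x)> in Y x X
rhs = np.sum(T * (np.outer(y, e) + np.outer(f, x)))      # <T, y e^T + f x^T>, trace inner product
print(np.isclose(lhs, rhs))                              # True
```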
We now have all the results in place to tackle the Moore–Penrose inverse of \(\mathcal{A}\).
Theorem 2.4
(Moore–Penrose inverse of \(\mathcal{A}\) part 1)
Suppose that \(e\neq 0\) and \(f\neq 0\). Let \((y,x)\in Y\times X\). Then
Proof
Set
Then
and similarly
Substituting (28a)–(28d) and (29) in (27) yields
Thus
Therefore, using (24), (30), (7), and (24) again, we obtain
To sum up, we found \(\mathcal{A}^{*}(v,u)\in \operatorname{ran}\mathcal{A}^{*} = (\ker \mathcal{A})^{\perp }\) such that \(\mathcal{A}^{*}\mathcal{A}\mathcal{A}^{*}(v,u) = \mathcal{A}^{*}(y,x)\). By [3, Proposition 3.30(i)], (30), and (24), we deduce that
which now results in (26) by using (28a)–(28d) and (29). □
Theorem 2.5
(Moore–Penrose inverse of \(\mathcal{A}\) part 2)
Let \((y,x)\in Y\times X\). Then the following hold:
-
(i)
If \(e=0\) and \(f\neq 0\), then \(\mathcal{A}^{\dagger }(y,x) = \frac{1}{\|f\|^{2}} f \otimes x\).
-
(ii)
If \(e\neq 0\) and \(f= 0\), then \(\mathcal{A}^{\dagger }(y,x) = \frac{1}{\|e\|^{2}} y \otimes e\).
-
(iii)
If \(e=0\) and \(f=0\), then \(\mathcal{A}^{\dagger }(y,x) = 0\in \mathcal{H}\).
Proof
Let \(T\in \mathcal{H}\).
(i): In this case, \(\mathcal{A}(T) = (0,T^{*}f)\) and \(\mathcal{A}^{*}(y,x) = f\otimes x\). Let us verify the Penrose conditions [6, p.48]. First, using (7),
and
which shows that \(\mathcal{A}\mathcal{A}^{\dagger }\) is indeed self-adjoint.
Secondly,
and if \(S\in \mathcal{H}\) and B is any orthonormal basis of X, then
which yields the symmetry of \(\mathcal{A}^{\dagger }\mathcal{A}\).
Thirdly, using (36) and the assumption that \(e=0\), we have
And finally, using (34a)–(34b), we have
(ii): This can be proved similarly to (i).
(iii): In this case, \(\mathcal{A}\) is the zero operator and hence the Desoer–Whalen conditions (see [6, page 51]) make it obvious that \(\mathcal{A}^{\dagger }\) is the zero operator as well. □
Let us define the auxiliary function
which allows us to combine the previous two results into one.
Corollary 2.6
Let \((y,x)\in Y\times X\). Then
We now turn to formulas for \(P_{\operatorname{ran}\mathcal{A}}\) and \(P_{\operatorname{ran}\mathcal{A^{*}}}\).
Corollary 2.7
(Projections onto \(\operatorname{ran}\mathcal{A}\) and \(\operatorname{ran}\mathcal{A}^{*}\))
Let \((y,x)\in {Y}\times {X}\) and let \(T\in \mathcal{H}\). If \(e\neq 0\) and \(f\neq 0\), then
and
Furthermore,
and
Proof
Using [3, Proposition 3.30(ii)] and (26), we obtain for \(e\neq 0\) and \(f\neq 0\)
which verifies (43).
Next, using [3, Proposition 3.30(v) and (vi)] and (26), we have
which establishes (44).
If \(e=0\) and \(f\neq 0\), then
and
The case when \(e\neq 0\) and \(f=0\) is treated similarly.
Finally, if \(e=0\) and \(f=0\), then \(\mathcal{A}^{\dagger }=0\) and the result follows. □
Theorem 2.8
(Main projection theorem)
Let \((s,r)\in Y\times X\) and set \((\bar{s},\bar{r})=P_{\operatorname{ran}\mathcal{A}}(s,r)\). Then
Let \(T\in \mathcal{H}\). If \(e\neq 0\) and \(f\neq 0\), then
Moreover,
Proof
Clearly, \(C\neq \varnothing \). Now Proposition 1.1 and (11) yield
Now we consider all possible cases. If \(e\neq 0\) and \(f\neq 0\), then, using (26),
as claimed.
Next, if \(e=0\) and \(f\neq 0\), then using Theorem 2.5(i) yields
Similarly, if \(e\neq 0\) and \(f= 0\), then using Theorem 2.5(ii) yields
And finally, if \(e=0\) and \(f=0\), then \(\mathcal{A}^{\dagger }=0\) and hence \(P_{C}(T) = T\). □
Remark 2.9
Consider Theorem 2.8 and its notation. If \((s,r)\in \operatorname{ran}\mathcal{A}\), then \((\bar{s},\bar{r})=(s,r)\) and hence \(C = \mathcal{A}^{-1}(s,r)\); this also covers the consistent case. Note that the auxiliary function defined in (40) allows us to combine all four cases into
The last two results in this section are inspired by [5, Theorem 2.1] and [8, Theorem on page 566], respectively. See also Corollary 3.6 and Example 3.8.
Corollary 2.10
Suppose that \(Y=X\), let \(e\in X\smallsetminus \{0\}\), let \(f\in X\smallsetminus \{0\}\), set
and let \(\gamma \in \mathbb{R}\). Then
and
Proof
The projection identities in (56) follow from (9). Note that \(\gamma \operatorname{Id}\in C\), and hence \(C\neq \varnothing \). We may and do assume without loss of generality that \(\|e\|=1=\|f\|\).
Now let \(T\in \mathcal{H}\). Applying (52a)–(52b) with \(r:=\gamma f\) and \(s:=\gamma e\), we deduce that
as claimed. □
We conclude this section with a beautiful projection formula that arises when the last result is specialized even further.
Corollary 2.11
Suppose that \(Y=X\), let \(e\in X\smallsetminus \{0\}\), and set
Then
and
Proof
Let \(T\in \mathcal{H}\). Applying Corollary 2.10 with \(f=e\) and \(\gamma =1\), we obtain
because \(\operatorname{Id}-E=P_{\{e\}^{\perp }}\) is idempotent. □
3 Rectangular matrices
In this section, we specialize the results of Sect. 2 to
which gives rise to
the space of real \(m\times n\) matrices. Given u and x in \(\mathbb{R}^{n}\), and v and y in \(\mathbb{R}^{m}\), we have \(v\otimes u = vu^{\intercal }\), \((v\otimes u)x=vu^{\intercal }x = (u^{\intercal }x)v\), and \((v\otimes u)^{*}y = (v^{\intercal }y) u\). Corresponding to (11), we have
The counterpart of (24) reads
Translated to the matrix setting, Theorem 2.4 and Theorem 2.5 turn into the following.
Theorem 3.1
Let \(x\in \mathbb{R}^{n}\) and \(y\in \mathbb{R}^{m}\). If \(e\neq 0\) and \(f\neq 0\), then
Furthermore,
In turn, Corollary 2.7 now states the following.
Corollary 3.2
Let \(x\in \mathbb{R}^{n}\), let \(y\in \mathbb{R}^{m}\), and let \(T\in \mathbb{R}^{m\times n}\). If \(e\neq 0\) and \(f\neq 0\), then
and
Furthermore,
and
Next, Theorem 2.8 turns into the following result.
Theorem 3.3
Let \(r\in \mathbb{R}^{n}\), let \(s\in \mathbb{R}^{m}\), and set \([\bar{s},\bar{r}]^{\intercal }= P_{\operatorname{ran}\mathcal{A}}[s,r]^{\intercal }\). Then
Now let \(T\in \mathbb{R}^{m\times n}\). If \(e\neq 0\) and \(f\neq 0\), then
Moreover,
Let us specialize Theorem 3.3 further to the following interesting case.
Corollary 3.4
(Projection onto matrices with prescribed row/column sums)
Suppose that \(e=[1,1,\ldots,1]^{\intercal }\in \mathbb{R}^{n}\) and that \(f=[1,1,\ldots,1]^{\intercal }\in \mathbb{R}^{m}\). Let \(r\in \mathbb{R}^{n}\), let \(s\in \mathbb{R}^{m}\), and set \([\bar{s},\bar{r}]^{\intercal }= P_{\operatorname{ran}\mathcal{A}}[s,r]^{\intercal }\). Then
and for every \(T\in \mathbb{R}^{m\times n}\),
Remark 3.5
(Romero; 1990)
Consider Corollary 3.4 and its notation. Assume that \([s,r]^{\intercal }\in \operatorname{ran}\mathcal{A}\), which is equivalent to requiring that \(\langle {e},{r} \rangle = \langle {f},{s} \rangle \) (a condition sometimes jokingly called the “Fundamental Theorem of Accounting”). Then one verifies that the entries of the matrix in (78a)–(78b) are also given by
for every \(i\in \{1,\ldots,m\}\) and \(j\in \{1,\ldots,n\}\). Formula (79) was proved by Romero (see [11, Corollary 2.1]) using Lagrange multipliers; he even obtained a K-dimensional extension (where (79) corresponds to \(K=2\)). We also refer the reader to [4], where (79) is used to compute the projection onto the transportation polytope.
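The consistent case is easy to verify numerically. The sketch below uses an entrywise formula that is presumably equivalent to (79) (the display is not reproduced here), namely \(T_{ij}-(\sigma _{i}-s_{i})/n-(\tau _{j}-r_{j})/m+\bigl(\sum _{k,l}T_{kl}-\sum _{i}s_{i}\bigr)/(mn)\), where \(\sigma _{i}\) and \(\tau _{j}\) denote the current row and column sums of T; it is cross-checked against the Moore–Penrose formula of Proposition 1.1.

```python
import numpy as np

def project_row_col_sums(T, s, r):
    """Candidate entrywise form of the projection onto the matrices with row
    sums s and column sums r (consistent case, all-ones e and f); presumably
    a rearrangement of Romero's formula (79)."""
    m, n = T.shape
    sigma, tau = T.sum(axis=1), T.sum(axis=0)        # current row and column sums
    return (T - (sigma - s)[:, None] / n
              - (tau - r)[None, :] / m
              + (T.sum() - s.sum()) / (m * n))

rng = np.random.default_rng(2)
m, n = 4, 5
s = np.array([32., 43., 33., 23.])        # prescribed row sums
r = np.array([24., 18., 37., 27., 25.])   # prescribed column sums (consistent: both sum to 131)
T = rng.standard_normal((m, n))
P = project_row_col_sums(T, s, r)

# Cross-check against Proposition 1.1: P_C(T) = T - A^+(A(T) - b), with A
# represented on the column-stacked vec(T) and e, f the all-ones vectors.
M = np.vstack([np.kron(np.ones(n), np.eye(m)),
               np.kron(np.eye(n), np.ones(m))])
b = np.concatenate([s, r])
vecT = T.flatten(order="F")
P_ref = (vecT - np.linalg.pinv(M) @ (M @ vecT - b)).reshape((m, n), order="F")

print(np.allclose(P, P_ref))                                          # True
print(np.allclose(P.sum(axis=1), s), np.allclose(P.sum(axis=0), r))   # True True
```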
Next, Corollary 2.10 turns into the following result.
Corollary 3.6
(Glunt–Hayden–Reams; 1998 [5, Theorem 2.1])
Suppose that e and f lie in \(\mathbb{R}^{n}\smallsetminus \{0\}\), set
and let \(\gamma \in \mathbb{R}\). Then
and
We conclude this section with a particularization of Corollary 2.11 which immediately follows when \(X=Y=\mathbb{R}^{n}\) and thus \(\mathcal{H}= \mathbb{R}^{n\times n}\):
Corollary 3.7
Suppose that \(e\in \mathbb{R}^{n}\smallsetminus \{0\}\), and set
Then
and
Example 3.8
(Projection formula for generalized bistochastic matrices; 1998; see [8, Theorem on page 566] and [5, Corollary 2.1])
Set
Then
Proof
Apply Corollary 3.7 with \(e=u\) for which \(\|e\|^{2}=n\). □
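Assuming the formula of Example 3.8 takes its usual closed form \(P_{G}(T)=(\operatorname{Id}-E)T(\operatorname{Id}-E)+E\) (the display itself is not reproduced above, so this is an assumption on our part), it can be cross-checked against the Moore–Penrose formula of Proposition 1.1:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
T = rng.standard_normal((n, n))
E = np.ones((n, n)) / n                     # E = u u^T / n with u the all-ones vector
I = np.eye(n)

P = (I - E) @ T @ (I - E) + E               # presumed closed form of Example 3.8

# Reference value via Proposition 1.1 with all prescribed row and column sums equal to 1.
M = np.vstack([np.kron(np.ones(n), np.eye(n)),
               np.kron(np.eye(n), np.ones(n))])
b = np.ones(2 * n)
vecT = T.flatten(order="F")
P_ref = (vecT - np.linalg.pinv(M) @ (M @ vecT - b)).reshape((n, n), order="F")

print(np.allclose(P, P_ref))                                           # True
print(np.allclose(P.sum(axis=0), 1), np.allclose(P.sum(axis=1), 1))    # True True
```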
Remark 3.9
For some applications of Example 3.8, we refer the reader to [12] and also to the recent preprint [2].
Remark 3.10
A reviewer pointed out that projection algorithms can also be employed to solve linear programming problems provided a strict complementarity condition holds (see Nurminski’s work [9]). This suggests a possibly interesting future project: to explore whether the projections in this paper are useful for solving linear programming problems on rectangular matrices with prescribed row and column sums.
4 Numerical experiments
We consider the problem of finding a rectangular matrix with prescribed row and column sums as well as some additional constraints on the entries of the matrix. To be specific, and inspired by [1], we seek a real matrix of size \(m\times n = 4\times 5\) such that its row and column sums are equal to \(\bar{s}:= \begin{bmatrix} 32,43,33,23\end{bmatrix} ^{\intercal }\) and \(\bar{r}:= \begin{bmatrix} 24,18,37,27,25\end{bmatrix} ^{\intercal }\), respectively. One solution to this problem, which in fact has nonnegative integer entries, is given by
Adopting the notation of Corollary 3.4, we see that the set
is an affine subspace of \(\mathbb{R}^{4\times 5}\) and that an explicit formula for \(P_{B}\) is available through Corollary 3.4. Next, we define the closed convex “hyper box”
For instance, the \((1,3)\)-entry of any nonnegative integer solution must lie between 0 and \(32=\min \{32,37\}\); thus \(A_{1,3} = [0,32]\). The projection of a real number ξ onto the interval \([0,\min ({\bar{s}}_{i},{\bar{r}}_{j})]\) is given by \(\max \{0,\min \{{\bar{s}}_{i},{\bar{r}}_{j},\xi \}\}\). Because A is the Cartesian product of such intervals, the projection operator \(P_{A}\) is nothing but the corresponding product of interval projection operators.
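In code, \(P_{A}\) is therefore just an entrywise clip; a short sketch using the row and column sums fixed above (variable names are ours):

```python
import numpy as np

s_bar = np.array([32., 43., 33., 23.])         # prescribed row sums
r_bar = np.array([24., 18., 37., 27., 25.])    # prescribed column sums
U = np.minimum.outer(s_bar, r_bar)             # upper bound min(s_i, r_j) for entry (i, j)

def project_box(T):
    """Entrywise projection onto the hyper box A = prod_{i,j} [0, min(s_i, r_j)]."""
    return np.clip(T, 0.0, U)

print(U[0, 2])   # 32.0, i.e. A_{1,3} = [0, 32] in the 1-based indexing of the text
```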
Our problem is thus to
We shall tackle (90) with three well-known algorithms: Douglas–Rachford (DR), method of alternating projections (MAP), and Dykstra (Dyk). Here is a quick review of how these methods operate for a given starting matrix \(T_{0}\in \mathbb{R}^{4\times 5}\) and a current matrix \(T_{k}\in \mathbb{R}^{4\times 5}\).
DR updates via
MAP updates via
and finally Dyk initializes also \(R_{0}=0\in \mathbb{R}^{4\times 5}\) and updates via
For all three algorithms, it is known that
in fact, Dyk even satisfies \(P_{A}(T_{k})\to P_{A\cap B}(T_{0})\) (see, e.g., [3, Corollary 28.3, Corollary 5.26, and Theorem 30.7]). Consequently, for each of the three algorithms, we will focus on the sequence
which obviously lies in A and which thus prompts the simple feasibility criterion given by
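For concreteness, here are the textbook forms of the three updates for generic projection operators proj_A and proj_B; the exact variant used in our experiments (e.g., the order of the two projections, or the handling of Dykstra’s correction term) may differ slightly from this sketch.

```python
import numpy as np

def map_step(T, proj_A, proj_B):
    # Method of alternating projections: T_{k+1} = P_B(P_A(T_k)).
    return proj_B(proj_A(T))

def dr_step(T, proj_A, proj_B):
    # Douglas-Rachford: T_{k+1} = T_k + P_B(2 P_A(T_k) - T_k) - P_A(T_k).
    PA = proj_A(T)
    return T + proj_B(2.0 * PA - T) - PA

def dykstra_step(T, R, proj_A, proj_B):
    # Dykstra with a single correction term R attached to the set A; in this
    # common variant no correction is kept for B because B is an affine subspace.
    A_k = proj_A(T + R)
    R_next = T + R - A_k
    return proj_B(A_k), R_next

# Typical driver (proj_A and proj_B are the projection operators P_A and P_B):
#   T, R = T0, np.zeros_like(T0)
#   for k in range(250):
#       T, R = dykstra_step(T, R, proj_A, proj_B)
```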
4.1 The convex case
Each algorithm is run for 250 iterations on 100,000 instances of \(T_{0}\) whose entries are generated uniformly in \([-100,100]\). Figure 1 shows the median value of \(\delta _{k}\) over the iterations; the shaded region around each line represents the range of values attained at that iteration. We consider an algorithm to have achieved feasibility when \(\delta _{k}=0\). While MAP and DR always achieve feasibility, as can be seen from the range of their values in Fig. 1, DR achieves it the fastest in most cases. To support this, we order the algorithms in Table 1 according to their performance. The first column reports the percentage of instances in which feasibility was achieved in the given order, and whether any of the algorithms did not converge. For example, the row labeled “DR<MAP” represents the cases where DR achieved feasibility the fastest, MAP was second, and Dyk did not converge. The second column reports the percentage of cases in which the first feasible matrices obtained were closest to the starting point \(T_{0}\) in the given order. This is measured by \(\lVert {T_{0}-T}\rVert \), where \(\lVert {\cdot}\rVert \) is the operator norm and T is the first feasible matrix obtained by the given algorithm (Dyk, DR, or MAP). We consider two algorithms tied if their distances to the starting point differ by at most \(10^{-15}\). As is evident, DR is in the lead for feasibility in a majority of the cases. However, the matrices it produces are not as close to \(T_{0}\) as those returned by MAP and Dyk when these are feasible. This is consistent with the fact that DR explores regions farther away from the starting point, whereas Dyk is designed to achieve the least distance. It is also worth noting that at least one of these algorithms converges in every instance. (Convergence of all three algorithms is guaranteed in theory.)
Last but not least, because our problem deals with unscaled row and column sums, we point out that the sought-after projection may also be computed by using the algorithm proposed by Calvillo and Romero [4] which even converges in finitely many steps!
4.2 The nonconvex case
We repeat the experiment of Sect. 4.1 exactly, the only difference being that the (new) set A of this section is the intersection of the (old) set A from the previous section (see (89)) with \(\mathbb{Z}^{4\times 5}\). This enforces nonnegative integer solutions. The new projection operator \(P_{A}\) is obtained by simply rounding after applying \(P_{A}\) from Sect. 4.1.
In this nonconvex case, MAP fails to converge in most instances, whereas DR and Dyk converge to solutions, as shown in Fig. 2. This is corroborated by Table 2, where the rows in which MAP converges correspond to only a quarter of the total cases. Again, DR achieves feasibility the fastest in more than half the cases, but Dykstra’s algorithm gives the solution closest to \(T_{0}\) among these, as shown in the second column of Table 2. In this nonconvex case, convergence is not guaranteed for any of the algorithms; in fact, there are several instances in which no solution is found. However, in the \(10^{5}\) runs considered, we did end up discovering several distinct solutions (see Table 3). It turned out that all solutions found were distinct, even across the three algorithms, resulting in 113,622 different nonnegative integer solutions in total.
Availability of data and materials
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
References
Aneikei (https://math.stackexchange.com/users/378887/aneikei): Find a matrix with given row and column sums (2016). https://math.stackexchange.com/questions/1969542/find-a-matrix-with-given-row-and-column-sums
Aragón Artacho, F.J., Campoy, R., Tam, M.K.: Strengthened splitting methods for computing resolvents (2020). https://arxiv.org/abs/2011.01796
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, Berlin (2017)
Calvillo, G., Romero, D.: On the closest point to the origin in transportation polytopes. Discrete Appl. Math. 210, 88–102 (2016). https://doi.org/10.1016/j.dam.2015.01.027
Glunt, W., Hayden, T.L., Reams, R.: The nearest “doubly stochastic” matrix to a real matrix with the same first moment. Numer. Linear Algebra Appl. 5, 475–482 (1998)
Groetsch, C.W.: Generalized Inverses of Linear Operators. Dekker, New York (1977)
Kadison, R.V., Ringrose, J.R.: Fundamentals of the Theory of Operator Algebras I: Elementary Theory. Am. Math. Soc., Providence (1997)
Khoury, R.N.: Closest matrices in the space of generalized doubly stochastic matrices. J. Math. Anal. Appl. 222, 561–568 (1998). https://doi.org/10.1006/jmaa.1998.5970
Nurminski, E.A.: Single-projection procedure for linear optimization. J. Glob. Optim. 66, 95–110 (2016). https://doi.org/10.1007/s10898-015-0337-9
Reed, M., Simon, B.: Methods of Modern Mathematical Physics I: Functional Analysis, revised and enlarged edn. Academic Press, San Diego (1980)
Romero, D.: Easy transportation-like problems on K-dimensional arrays. J. Optim. Theory Appl. 66, 137–147 (1990). https://doi.org/10.1007/BF00940537
Takouda, P.L.: Un problème d’approximation matricielle: quelle est la matrice bistochastique la plus proche d’une matrice donnée? RAIRO Oper. Res. 39, 35–54 (2005). https://doi.org/10.1051/ro:2005003
Wikipedia, Discrete tomography, https://en.wikipedia.org/wiki/Discrete_tomography, retrieved September 13, 2021
Acknowledgements
We thank the editor Aviv Gibali, three anonymous reviewers, and Matt Tam for constructive comments and several pointers to literature we were previously unaware of.
Funding
The research of HHB and XW was partially supported by Discovery Grants from the Natural Sciences and Engineering Research Council of Canada.
Author information
Contributions
All authors contributed equally in writing this article. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bauschke, H.H., Singh, S. & Wang, X. Projecting onto rectangular matrices with prescribed row and column sums. Fixed Point Theory Algorithms Sci Eng 2021, 23 (2021). https://doi.org/10.1186/s13663-021-00708-1