Convergence of proximal splitting algorithms in \(\operatorname{CAT}(\kappa)\) spaces and beyond
Fixed Point Theory and Algorithms for Sciences and Engineering volume 2021, Article number: 13 (2021)
Abstract
In the setting of \(\operatorname{CAT}(\kappa)\) spaces, common fixed point iterations built from prox mappings (e.g. prox-prox, Krasnoselsky–Mann relaxations, nonlinear projected-gradients) converge locally linearly under the assumption of linear metric subregularity. Linear metric subregularity is in any case necessary for linearly convergent fixed point sequences, so the result is tight. To show this, we develop a theory of fixed point mappings that violate the usual assumptions of nonexpansiveness and firm nonexpansiveness in p-uniformly convex spaces.
1 Fundamentals of nonlinear spaces
Following [1] we focus on p-uniformly convex spaces with parameter c [12]: for \(p\in (1,\infty )\), a metric space \((G, d)\) is p-uniformly convex with constant \(c>0\) whenever it is a geodesic space, and
$$\begin{aligned} d\bigl(z, tx\oplus (1-t)y\bigr)^{p}\leq t d(z,x)^{p}+(1-t)d(z,y)^{p}-\frac{c}{2}t(1-t)d(x,y)^{p}\quad \forall t\in [0,1], \forall x,y,z\in G. \end{aligned}$$(1)
Examples of p-uniformly convex spaces are \(L^{p}\) spaces, \(\operatorname{CAT}(0)\) spaces (\(p=c=2\)), Hadamard spaces (complete \(\operatorname{CAT}(0)\) spaces), and Hilbert spaces (linear Hadamard spaces). Of particular interest are \(\operatorname{CAT}(\kappa)\) spaces, since these serve as the model space for applications on manifolds with curvature bounded above.
Lemma 1
([13], Proposition 3.1)
A \(\operatorname{CAT}(\kappa )\) space is locally 2-uniformly convex with parameter \(c\nearrow 2\) as the diameter of the local neighborhood vanishes. In particular, for any \(\operatorname{CAT}(\kappa )\) space \((G, d)\) and any point \(\overline{x}\in G\), for all \(\delta \in (0, \pi /(4\sqrt{\kappa }))\), the subspace \((\mathbb{B}_{\delta }(\overline{x}), d \vert _{\mathbb{B}_{\delta }(\overline{x})})\) is a 2-uniformly convex space with constant \(c_{\delta }= 4\delta \sqrt{\kappa }\tan (\pi /2 - 2\delta \sqrt{ \kappa } )\).
Note the asymptotic behavior of the constants: as \(\delta \searrow 0\), the constant \(c\nearrow 2\).
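The asymptotic behavior of the constant in Lemma 1 can be checked numerically. The following is a minimal sketch; the function name `local_convexity_constant` is illustrative and not part of the original article.

```python
import math


def local_convexity_constant(delta: float, kappa: float) -> float:
    """Evaluate c_delta = 4*delta*sqrt(kappa)*tan(pi/2 - 2*delta*sqrt(kappa)).

    This is the 2-uniform convexity constant of Lemma 1 on a ball of
    radius delta in a CAT(kappa) space, valid for
    0 < delta < pi / (4*sqrt(kappa)).
    """
    s = math.sqrt(kappa)
    if not (0.0 < delta < math.pi / (4.0 * s)):
        raise ValueError("delta must lie in (0, pi/(4*sqrt(kappa)))")
    return 4.0 * delta * s * math.tan(math.pi / 2.0 - 2.0 * delta * s)


if __name__ == "__main__":
    # The constant increases toward 2 as the radius shrinks.
    for delta in (0.5, 0.1, 0.01, 0.001):
        print(delta, local_convexity_constant(delta, kappa=1.0))
```

Writing \(x=2\delta \sqrt{\kappa }\), the constant equals \(2x\cot x\), which is below 2 for \(x\in (0,\pi /2)\) and tends to 2 as \(x\searrow 0\); this is the limit recorded in the note above.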
Definition 2
Let \((G,d)\) be a geodesic space and γ and η be two geodesics through p. Then γ is said to be perpendicular to η at point p denoted by \(\gamma \perp _{p} \eta \) if
A space is said to be symmetric perpendicular if for all geodesics γ and η with common point p we have
Remark 3
Any \(\operatorname{CAT}(\kappa )\) space \((G,d)\) with \(\operatorname{diam}(G) < \frac{\pi }{2\sqrt{\kappa }} \) is symmetric perpendicular [9, Theorem 2.11].
In a complete p-uniformly convex space the p-proximal mapping of a proper and lower semicontinuous function f is defined by
The main dividend of this work is the following.
Theorem 4
(Convergence of proximal algorithms in \(\operatorname{CAT}(\kappa)\) spaces)
Let \((G, d)\) be a complete \(\operatorname{CAT}(\kappa )\) space with \(\kappa >0\), and for \(j=1,2,\dots,N\), let \(f_{j}\) and \(g\colon G \rightarrow \mathbb{R}\cup \{+\infty \}\) be proper, convex, and lower semicontinuous with \(\mathop {\operatorname {argmin}}g\cap (\bigcap_{j=1}^{N} \mathop {\operatorname {argmin}}f_{j} )\neq \emptyset \). Let \(T: D \to D \) where \(D\subset G\) denotes one of the following:

(i)
\(T:=\operatorname {prox}_{f_{N},\lambda _{N}}\circ \operatorname {prox}_{f_{N-1},\lambda _{N-1}} \circ \cdots \circ \operatorname {prox}_{f_{1},\lambda _{1}}\);

(ii)
\(T:=\beta \operatorname {prox}_{g,\lambda }\oplus (1-\beta )\operatorname {Id}\);

(iii)
\(T:=\operatorname {prox}_{f_{1},\lambda _{1}}\circ (\beta \operatorname {prox}_{g,\lambda _{2}} \oplus (1-\beta )\operatorname {Id})\);

(iv)
\(T:=P_{C}\circ (\beta \operatorname {prox}_{g,\lambda _{1}}\oplus (1-\beta ) \operatorname {Id})\),
where in the last case \(P_{C}\) is the metric projector onto the closed convex set \(C\subset G\) (in other words, this is the specialization of part (iii) to the case where \(f_{1}\) is the indicator function of a closed convex subset of G). If T satisfies \(\operatorname {Fix}T\neq \emptyset \) and
with constant μ, then the fixed point sequence initialized from any starting point close enough to \(\operatorname {Fix}T\) is at least linearly convergent to a point in \(\operatorname {Fix}T\).
A more precise statement of this theorem, with proof, is Theorem 25. The intervening sections prove the fundamental building blocks.
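To make case (i) of Theorem 4 concrete, the following sketch runs a prox-prox iteration in the flat Euclidean plane with \(p=2\). Everything here is an illustrative assumption rather than the article's setting: the functions \(f_{1}(x)=\frac{1}{2}x_{1}^{2}\) and \(f_{2}(x)=\frac{1}{2}x_{2}^{2}\), the common step \(\lambda \), and the usual Hilbert space prox with weight \(1/(2\lambda )\).

```python
import numpy as np


def prox_half_square_coord(x, lam, coord):
    """Prox of f(x) = 0.5 * x[coord]**2 with quadratic weight 1/(2*lam).

    Solving y_c + (y_c - x_c)/lam = 0 gives y_c = x_c / (1 + lam); the
    other coordinate is left unchanged.
    """
    out = np.array(x, dtype=float)
    out[coord] = out[coord] / (1.0 + lam)
    return out


def prox_prox_step(x, lam):
    """One application of T = prox_{f_2} o prox_{f_1} (case (i), N = 2)."""
    return prox_half_square_coord(prox_half_square_coord(x, lam, 0), lam, 1)


def run(x0, lam=1.0, iters=20):
    """Iterate x^{k+1} = T x^k and record d(x^k, Fix T) = ||x^k||."""
    x = np.array(x0, dtype=float)
    dists = [float(np.linalg.norm(x))]  # Fix T = {0} in this example
    for _ in range(iters):
        x = prox_prox_step(x, lam)
        dists.append(float(np.linalg.norm(x)))
    return x, dists


if __name__ == "__main__":
    _, dists = run([3.0, -4.0], lam=1.0, iters=10)
    print(dists)
```

In this toy problem \(Tx=x/(1+\lambda )\), so \(d(x,\operatorname {Fix}T)=\frac{1+\lambda }{\lambda }d(x,Tx)\) holds with equality and the distance to the fixed point set contracts by the factor \(1/(1+\lambda )\) at every step.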
2 Almost α-firmly nonexpansive mappings
The regularity of a mapping \(T: G \to G \) is characterized by the behavior of the images of pairs of points under T. A key tool is what has been called the transport discrepancy in [4]:
Definition 5
Let \((G, d)\) be a p-uniformly convex metric space with constant c.

(i)
The mapping \(T:G\to G\) is pointwise almost nonexpansive at \(y\in D\subset G\) on D with violation \(\epsilon \geq 0\) whenever
$$\begin{aligned} \exists \epsilon \geq 0:\quad d(Tx,Ty)^{p}\leq (1+\epsilon )d(x,y)^{p}\quad \forall x\in D. \end{aligned}$$(4)
The smallest ϵ for which (4) holds is called the violation. If (4) holds with \(\epsilon =0\), then T is pointwise nonexpansive at \(y\in D\subset G\) on D. If (4) holds at all \(y\in D\), then T is said to be (almost) nonexpansive on D. If \(D=G\), then the mapping T is simply said to be (almost) nonexpansive. If \(D\supset \operatorname {Fix}T\neq \emptyset \) and (4) holds at all \(y\in \operatorname {Fix}T\) with the same violation, then T is said to be almost quasi nonexpansive.

(ii)
The mapping T is said to be pointwise asymptotically nonexpansive at y whenever
$$\begin{aligned} \forall \epsilon >0, \exists D_{\epsilon }(y)\subset G:\quad d(Tx,Ty)^{p} \leq (1+\epsilon )d(x,y)^{p}\quad \forall x\in D_{\epsilon }(y), \end{aligned}$$(5)
where \(D_{\epsilon }(y)\) is a neighborhood of y in D.

(iii)
\(T: G \to G \) is said to be quasi strictly nonexpansive whenever
$$\begin{aligned} d(Tx,\overline{x})< d(x,\bar{x})\quad \forall x \in G\setminus \operatorname {Fix}T, \forall \overline{x}\in \operatorname {Fix}T. \end{aligned}$$(6) 
(iv)
The operator \(T:G\to G\) is pointwise almost α-firmly nonexpansive at \(y\in D\subset G\) on D with violation at most \(\epsilon >0\) whenever
$$\begin{aligned} \exists \alpha \in (0,1), \epsilon \geq 0:\quad d(Tx,Ty)^{p} \leq (1+ \epsilon )d(x,y)^{p}-\frac{1-\alpha }{\alpha }\psi ^{(p,c)}_{T}(x,y). \end{aligned}$$(7)
If (7) holds with \(\epsilon =0\), then T is pointwise α-firmly nonexpansive at \(y\in D\subset G\) on D. If (7) holds at all \(y\in D\) with the same constant α, then T is said to be (almost) α-firmly nonexpansive on D. If \(D=G\), then the mapping T is simply said to be (almost) α-firmly nonexpansive. If \(D\supset \operatorname {Fix}T\neq \emptyset \) and (7) holds at all \(y\in \operatorname {Fix}T\) with the same constant α, then T is said to be almost quasi α-firmly nonexpansive.

(v)
The mapping T is said to be pointwise asymptotically α-firmly nonexpansive at y with constant \(\alpha <1\) whenever
$$\begin{aligned} &\forall \epsilon >0, \exists D_{\epsilon }(y)\subset G: \\ &\quad d(Tx,Ty)^{p} \leq (1+\epsilon )d(x,y)^{p}-\frac{1-\alpha }{\alpha } \psi ^{(p,c)}_{T}(x,y)\quad \forall x\in D_{\epsilon }(y), \end{aligned}$$(8)
where \(D_{\epsilon }(y)\) is a neighborhood of y in D.
Proposition 6
(Characterizations)
Let \((G, d)\) be a p-uniformly convex space with constant \(c>0\), and let \(T:D\to G\) for \(D\subset G\).

(i)
$$\begin{aligned} \psi _{T}^{(p,c)}(x, y)=\frac{c}{2}d(Tx, x)^{p} \quad\textit{whenever } y\in \operatorname {Fix}T. \end{aligned}$$(9)
For fixed \(y\in \operatorname {Fix}T\), the function \(\psi _{T}^{(p,c)}(x,y)\geq 0\) for all \(x \in D\) and \(\psi _{T}^{(p,c)}(x,y)= 0\) only when \(x\in \operatorname {Fix}T\).

(ii)
Let \(y\in \operatorname {Fix}T\). T is pointwise almost α-firmly nonexpansive at y on D with violation at most \(\epsilon >0\) if and only if
$$\begin{aligned} \exists \alpha \in [0,1): \quad d(Tx,y)^{p}\leq (1+\epsilon )d(x,y)^{p} - \frac{1-\alpha }{\alpha }\frac{c}{2}d(Tx,x)^{p}\quad \forall x\in D. \end{aligned}$$(10)
In particular, T is almost quasi α-firmly nonexpansive on D whenever T possesses fixed points and (10) holds at all \(y\in \operatorname {Fix}T\) with the same constant \(\alpha \in [0,1)\) and violation at most ϵ.

(iii)
If T is pointwise almost α-firmly nonexpansive at \(y\in \operatorname {Fix}T\) on D with constant \(\underline{\alpha }\in [0,1)\) and violation at most ϵ, then it is pointwise almost α-firmly nonexpansive at y with the same upper bound on the violation on D for all \(\alpha \in [\underline{\alpha },1]\). In particular, if T is pointwise almost α-firmly nonexpansive at \(y\in \operatorname {Fix}T\) on D, then it is pointwise almost nonexpansive at y on D.
Proof
This is a slight extension of [4, Proposition 4], which was for pointwise α-firmly nonexpansive mappings. The proof for pointwise almost α-firmly nonexpansive mappings is the same. □
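The identity (9) can be verified by direct substitution. The sketch below assumes that the transport discrepancy (3), whose display is not reproduced in this section, has the six-term form used in [4]; if the definition differs, the computation has to be adapted accordingly.

$$\begin{aligned} \psi _{T}^{(p,c)}(x,y) &= \frac{c}{2} \bigl( d(Tx,x)^{p}+d(Ty,y)^{p}+d(Tx,Ty)^{p}+d(x,y)^{p}-d(Tx,y)^{p}-d(x,Ty)^{p} \bigr) \quad \text{(assumed form of (3))}, \\ Ty=y\quad \Longrightarrow \quad \psi _{T}^{(p,c)}(x,y) &= \frac{c}{2} \bigl( d(Tx,x)^{p}+0+d(Tx,y)^{p}+d(x,y)^{p}-d(Tx,y)^{p}-d(x,y)^{p} \bigr) = \frac{c}{2}d(Tx,x)^{p}. \end{aligned}$$

Under this assumption, substituting the last expression into (7) with \(Ty=y\) gives exactly the inequality (10) of part (ii).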
2.1 Composition of operators
Before continuing with pointwise almost α-firmly nonexpansive mappings, we make a brief but important observation about fixed points of compositions of quasi strictly nonexpansive mappings (6).
Lemma 7
Let \(T_{1}\) and \(T_{2}\) be quasi strictly nonexpansive on \((G,d)\) with \(\operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2} \neq \emptyset \). Then \(\operatorname {Fix}T_{1} \circ T_{2} = \operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2}\).
Proof
The inclusion \(\operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2} \subset \operatorname {Fix}(T_{1} \circ T_{2} )\) is clear. Assume that there exists \(y \in \operatorname {Fix}( T_{1} \circ T_{2}) \setminus (\operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2})\) and choose \(x \in \operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2}\). Then
with either \(d(T_{2}y,x) < d(y,x)\) or \(d(T_{1} \circ T_{2} y,x) < d(T_{2} y,x)\) as \(y \notin \operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2}\). This is a contradiction, so \(\operatorname {Fix}T_{1} \circ T_{2} = \operatorname {Fix}T_{1} \cap \operatorname {Fix}T_{2}\). □
Remark 8
The sufficiency of strict quasi nonexpansivity for the analogous identity for convex combinations of mappings in a Hadamard space was recognized in [3, Remark 7.11].
Lemma 9
Let \((G, d)\) be a p-uniformly convex space with constant \(c>0\), and let \(D\subset G\). Let \({T_{0}}:D\to G\) be pointwise almost α-firmly nonexpansive at y on D with constant \(\alpha _{0}\) and violation \(\epsilon _{0}\), and let \({T_{1}}:{T_{0}}(D)\to G\) be pointwise almost α-firmly nonexpansive at \({T_{0}} y\) on \({T_{0}}(D)\) with constant \(\alpha _{1}\) and violation \(\epsilon _{1}\). Then the composition \(\overline{T}:={T_{1}}\circ {T_{0}}\) is pointwise almost α-firmly nonexpansive at y with constant \(\overline{\alpha }\in (0,1)\) and violation at most \({\overline{\epsilon }}=\epsilon _{0}+\epsilon _{1} + \epsilon _{0} \epsilon _{1}\) on D whenever
Proof
The proof is a slight extension of the same result for compositions of α-firmly nonexpansive mappings in [4, Lemma 10]. Since \({T_{1}}\) is pointwise almost α-firmly nonexpansive at \({T_{0}} y\) with violation \(\epsilon _{1}\) and constant \(\alpha _{1}\) on \({T_{0}}(D)\), we have
where \(\psi ^{(p,c)}_{{T_{1}}}\) is defined by (3). Since \({T_{0}}\) is pointwise almost α-firmly nonexpansive at y with constant \(\alpha _{0}\) and violation \(\epsilon _{0}\) on D, we have
for all \(x \in D\). Whenever (11) holds, we conclude that
where \({\overline{\epsilon }}= \epsilon _{0}+\epsilon _{1}+\epsilon _{0} \epsilon _{1}\). □
Proposition 10
(Compositions of pointwise almost α-firmly nonexpansive mappings)
Let \((G, d)\) be a p-uniformly convex space with constant \(c>0\), and let \(D\subset G\). Let \(T_{0}:D\to G\) be pointwise almost α-firmly nonexpansive at y on D with constant \(\alpha _{0}\) and violation \(\epsilon _{0}\), and let \(T_{1}:T_{0}(D)\to G\) be pointwise almost α-firmly nonexpansive at y on \(T_{0}(D)\) with constant \(\alpha _{1}\) and violation \(\epsilon _{1}\). Let \(y \in \operatorname {Fix}T_{0} \cap \operatorname {Fix}{T_{1}}\). Then the composite operator \(\overline{T}={T_{1}} \circ T_{0}\) is pointwise almost α-firmly nonexpansive at y on D with violation at most \({\overline{\epsilon }}=\epsilon _{0}+\epsilon _{1} + \epsilon _{0} \epsilon _{1}\) and constant
Proof
This is a minor extension of [4, Theorem 11]. By Lemma 9, it suffices to show (11) at all points \(y\in \operatorname {Fix}{T_{1}}\cap \operatorname {Fix}{T_{0}}\). First, note that \(\operatorname {Fix}\overline{T}\supset \operatorname {Fix}{T_{1}}\cap \operatorname {Fix}{T_{0}}\), so by (9) we have \(\psi ^{(p,c)}_{T_{0}}(x,y)=\frac{c}{2}d(x,{T_{0}}x)^{p}\), \(\psi ^{(p,c)}_{T_{1}}({T_{0}}x,{T_{0}}y)= \frac{c}{2}d({T_{0}}x, \overline{T}x)^{p}\), and \(\psi ^{(p,c)}_{\overline{T}}(x,y)=\frac{c}{2}d(x,\overline{T}x)^{p}\) whenever \(y\in \operatorname {Fix}{T_{1}}\cap \operatorname {Fix}{T_{0}}\). Inequality (11) in this case simplifies to
where \(\kappa _{0}:=\frac{1-\alpha _{0}}{\alpha _{0}}\), \(\kappa _{1}:=\frac{1-\alpha _{1}}{\alpha _{1}}\), and \(\overline{\kappa }:= \frac{1-\overline{\alpha }}{\overline{\alpha }}\) with \(\overline{\alpha }\in (0,1)\). By (1), we have
Letting \(t=\frac{\kappa _{1}}{\kappa _{0}+\kappa _{1}}\) yields \((1-t)=\frac{\kappa _{0}}{\kappa _{0}+\kappa _{1}}\), so that (14) becomes
It follows that (13) holds for any \(\overline{\kappa }\in (0, \frac{c\kappa _{0}\kappa _{1}}{2(\kappa _{0}+\kappa _{1})} ]\). We conclude that the composition \(\overline{T}\) is quasi α-firmly nonexpansive with constant
A short calculation shows that this is the same as (12), which completes the proof. □
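The short calculation can be spelled out as follows. This is a sketch that uses only the relation \(\overline{\kappa }=(1-\overline{\alpha })/\overline{\alpha }\) and the largest admissible value \(\overline{\kappa }=\frac{c\kappa _{0}\kappa _{1}}{2(\kappa _{0}+\kappa _{1})}\) from the proof; the resulting expression should be compared against (12) as printed in the original article.

$$\begin{aligned} \overline{\alpha } &= \frac{1}{1+\overline{\kappa }} = \frac{\kappa _{0}+\kappa _{1}}{\kappa _{0}+\kappa _{1}+\frac{c}{2}\kappa _{0}\kappa _{1}}, \\ \kappa _{0}+\kappa _{1} &= \frac{\alpha _{0}+\alpha _{1}-2\alpha _{0}\alpha _{1}}{\alpha _{0}\alpha _{1}}, \qquad \kappa _{0}\kappa _{1} = \frac{(1-\alpha _{0})(1-\alpha _{1})}{\alpha _{0}\alpha _{1}}, \\ \overline{\alpha } &= \frac{\alpha _{0}+\alpha _{1}-2\alpha _{0}\alpha _{1}}{\frac{c}{2}(1-\alpha _{0})(1-\alpha _{1})+\alpha _{0}+\alpha _{1}-2\alpha _{0}\alpha _{1}}. \end{aligned}$$

For \(c=2\) the denominator collapses to \(1-\alpha _{0}\alpha _{1}\), which gives \(\overline{\alpha }=(\alpha _{0}+\alpha _{1}-2\alpha _{0}\alpha _{1})/(1-\alpha _{0}\alpha _{1})\).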
2.2 Averages of operators
Let \(\mathcal{B}(G)\) be the Borel algebra on \((G,d)\), \(\mathcal{P}\) be the family of probability measures on \((G,\mathcal{B}(G))\), and \(\mathcal{P}^{p}(G)\) be the family of probability measures on G such that the p-th moment exists, i.e.
For \(\nu \in \mathcal{P}^{p}(G)\), the minimizer of
is called the p-barycenter of ν and denoted by \(b_{p}(\nu )\) if it exists. The p-barycenter of ν always exists if \((G,d)\) is a proper geodesic space and \(\nu \in \mathcal{P}^{p}(G)\) [9, Proposition 3.3].
Let \(T_{i} \colon G \rightarrow G, i\in I\) be a collection of mappings where I is an index space. Assume that \((I,\mathcal{I})\) is a measurable space and \((x,i) \mapsto T_{i}x\) is measurable. Let η be a probability measure on I and define \(b_{p}(T_{i},\eta )\colon G \rightarrow G\) by
We use the notation \(T_{i}x_{*}\eta \) for the push forward of η with respect to the mapping \(i \mapsto T_{i} x\) for fixed x i.e.
for \(A \in \mathcal{B}(G)\). Then, by definition, \(b_{p}(T_{i},\eta )(x) = b_{p}(T_{i} x _{*}\eta )\).
Theorem 11
Let G be a proper, symmetric perpendicular, p-uniformly convex space with constant \(c>0\), and let \(T_{i}\), \(i \in I\) be a family of almost quasi α-firmly nonexpansive operators with violation \(\epsilon _{i}\) and constant \(\alpha _{i}\) respectively on G. Let η be a probability measure on I such that \(T_{i}x_{*}\eta \in \mathcal{P}^{p}(G) \) for all \(x\in G\). Then \(\mathscr{T}=b_{p}(T_{i},\eta )\) is a pointwise almost α-firm operator at any \(y \in \bigcap_{i\in I} \operatorname {Fix}T_{i}\) on G with constant \(\overline{\alpha }=\sup_{i\in I} \alpha _{i}\) and violation at most \({\overline{\epsilon }}= \sup_{i\in I} \epsilon _{i}\).
Proof
Let \(x \in G\) be arbitrary, \(y \in \bigcap_{i\in I} \operatorname {Fix}T_{i} \subset \operatorname {Fix}\mathscr{T} \), \(\nu = T_{i} x _{*}\eta \) as defined in (16) and \((d(\cdot,y)^{p}) _{*} \nu \) the push forward of ν. Then
by Jensen’s inequality [9, Theorem 4.1] since \(d(\cdot,y)^{p}\) is a convex function and the fact that \(\mathbb{R}\) is a p-uniformly convex space with constant c. Now
And again Jensen’s inequality completes the proof
□
For a finite index set I, without loss of generality \(I=\{1,\ldots,n\}\) and a probability measure η on I, we can define \(\omega _{i}:= \eta (i)\) for all i. Then
In case of a Hilbert space G and \(p=c=2\) this reduces further to \(b_{p}(T_{i},\eta )(x) = \sum_{i=1}^{n} \omega _{i} T_{i} x\).
If the support of the measure ν consists of two discrete points \(x_{1}\) and \(x_{2}\), i.e. \(\nu = \omega \delta _{x_{1}}+(1-\omega )\delta _{x_{2}}\) for \(\omega \in [0,1]\), then \(b_{p}(\nu )\) can be calculated explicitly. It is obvious that \(b_{p}(\nu )\) has to lie on the geodesic connecting \(x_{1}\) and \(x_{2}\). Hence \(b_{p}(\nu ) = \overline{t}x_{1} \oplus (1-\overline{t}) x_{2}\) for some \(\overline{t}\in [0,1]\). Minimizing the function \(t\mapsto \omega d( t x_{1} \oplus (1-t) x_{2}, x_{1})^{p}+ (1- \omega ) d( t x_{1} \oplus (1-t) x_{2}, x_{2})^{p}\) leads to \(\overline{t}= \frac{1}{\sqrt[p-1]{\frac{1-\omega }{\omega }}+1}\). If \(I=\{1,2\}\), \(T_{1}=T\) and \(T_{2}=\operatorname{Id}\), \(\eta =\omega \delta _{1}(\cdot )+ (1-\omega ) \delta _{2}(\cdot )\) for \(\omega \in [0,1]\), then \(T_{\beta }=b_{p}(T_{i},\eta )\) is the Krasnoselsky–Mann relaxation of T
The next result shows that the convex combination of an almost nonexpansive mapping with the identity mapping can be made arbitrarily close to α-firmly nonexpansive (no violation) by choosing the averaging constant small enough; this can be interpreted as choosing an appropriately small step size.
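The closed-form weight \(\overline{t}\) of the two-point p-barycenter can be checked numerically. The sketch below assumes the convention that \(t x_{1}\oplus (1-t)x_{2}\) is the point on the geodesic at distance \((1-t)d(x_{1},x_{2})\) from \(x_{1}\), and normalizes \(d(x_{1},x_{2})=1\); the function names are illustrative.

```python
import numpy as np


def two_point_weight(omega: float, p: float) -> float:
    """Closed form t_bar = 1 / (((1-omega)/omega)**(1/(p-1)) + 1).

    Requires 0 < omega < 1 and p > 1.
    """
    return 1.0 / (((1.0 - omega) / omega) ** (1.0 / (p - 1.0)) + 1.0)


def objective(t, omega, p):
    """omega*d(z_t, x1)^p + (1-omega)*d(z_t, x2)^p with d(x1, x2) = 1.

    Under the assumed convention, d(z_t, x1) = 1 - t and d(z_t, x2) = t.
    """
    return omega * (1.0 - t) ** p + (1.0 - omega) * t ** p


def grid_minimizer(omega, p, n=200001):
    """Brute-force minimizer of the objective on a uniform grid of [0, 1]."""
    ts = np.linspace(0.0, 1.0, n)
    return float(ts[np.argmin(objective(ts, omega, p))])


if __name__ == "__main__":
    for omega, p in [(0.3, 2.0), (0.7, 3.0), (0.5, 1.5)]:
        print(omega, p, two_point_weight(omega, p), grid_minimizer(omega, p))
```

For \(p=2\) the formula reduces to \(\overline{t}=\omega \), the usual convex-combination weight.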
Proposition 12
(Krasnoselsky–Mann relaxations)
Let \((G,d)\) be a p-uniformly convex space and \(T\colon G \rightarrow G\) be pointwise almost nonexpansive at all \(y \in \operatorname {Fix}T\) with violation ϵ. Then \(T_{\beta }:= \beta T \oplus (1-\beta ) \operatorname{Id}\) is pointwise almost α-firmly nonexpansive at all \(y \in \operatorname {Fix}T\) with constant
and violation at most \(\epsilon _{\beta }:=\epsilon \beta \).
Proof
Clearly \(\operatorname {Fix}T = \operatorname {Fix}T_{\beta }\) and \(d(x,T_{\beta }x)^{p}=\beta ^{p} d(x,Tx)^{p}\). Let \(y \in \operatorname {Fix}T_{\beta }\); then
Setting
and solving for \(\alpha _{\beta }\) yield the result. □
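The two facts used at the start of the proof, \(\operatorname {Fix}T=\operatorname {Fix}T_{\beta }\) and \(d(x,T_{\beta }x)=\beta d(x,Tx)\), are easy to observe in a Hilbert space, where \(\beta T\oplus (1-\beta )\operatorname{Id}\) is the ordinary convex combination. The following sketch uses a rotation of the plane by 90 degrees as an illustrative nonexpansive map (violation \(\epsilon =0\)); this choice is an assumption made for the example and does not come from the article.

```python
import numpy as np

# Rotation by 90 degrees: an isometry of the plane with Fix T = {0}.
ROT90 = np.array([[0.0, -1.0], [1.0, 0.0]])


def T(x):
    """Illustrative nonexpansive map: rotate x by 90 degrees."""
    return ROT90 @ np.asarray(x, dtype=float)


def km_relaxation(x, beta):
    """T_beta x = beta*T x + (1-beta)*x, the Hilbert space form of
    beta T (+) (1-beta) Id."""
    x = np.asarray(x, dtype=float)
    return beta * T(x) + (1.0 - beta) * x


def iterate(x0, beta, iters):
    """Run x^{k+1} = T_beta x^k for a fixed number of steps."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = km_relaxation(x, beta)
    return x


if __name__ == "__main__":
    x0 = [1.0, 2.0]
    print(np.linalg.norm(iterate(x0, beta=1.0, iters=60)))  # unrelaxed T
    print(np.linalg.norm(iterate(x0, beta=0.5, iters=60)))  # relaxed T_beta
```

The unrelaxed iteration only rotates the starting point and never approaches the fixed point, whereas with \(\beta =1/2\) each step shrinks the norm by the factor \(\sqrt{1/2}\), since \(\Vert x+Tx\Vert ^{2}=2\Vert x\Vert ^{2}\) for this rotation.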
2.3 Metric subregularity
Recall that \(\mu:[0,\infty ) \to [0,\infty )\) is a gauge function if μ is continuous, strictly increasing with \(\mu (0)=0\), and \(\lim_{t\to \infty }\mu (t)=\infty \).
Definition 13
(Metric regularity on a set)
Let \((G_{1}, d_{1})\) and \((G_{2}, d_{2})\) be metric spaces, and let \(\Phi: G_{1}\rightrightarrows G_{2} \), \(U\subset G_{1}\), \(V\subset G_{2}\). The mapping Φ is called metrically regular with gauge μ on \(U\times V\) relative to \(\Lambda \subset G_{1}\) if
When the set V consists of a single point, \(V=\{{\overline{y}}\}\), then Φ is said to be metrically subregular for \({\overline{y}}\) on U with gauge μ relative to \(\Lambda \subset G_{1}\).
When μ is a linear function (that is, \(\mu (t)=\kappa t, \forall t\in [0,\infty )\)), this special case is distinguished as linear metric (sub)regularity with constant κ. When \(\Lambda =G_{1}\), the qualifier “relative to” is dropped. When μ is linear, the infimum of all constants κ for which (17) holds is called the modulus of metric regularity.
The next statement is obvious from the definition.
Proposition 14
Let \((G_{1}, d_{1})\) and \((G_{2}, d_{2})\) be metric spaces, and let \(\Phi: G_{1}\rightrightarrows G_{2} \), \(U\subset G_{1}\), \(V\subset G_{2}\). If Φ is metrically subregular with gauge μ at y on U relative to \(\Lambda \subset G_{1}\), then Φ is metrically subregular with the same gauge μ at y on all subsets \(U'\subset U\) relative to \(\Lambda \subset G_{1}\).
3 Quantitative convergence
To obtain convergence of fixed point iterations under the assumption of metric subregularity, the gauge of metric subregularity μ is constructed implicitly from another nonnegative function \(\theta: [0,\infty ) \to [0,\infty ) \) satisfying
For a puniformly convex space the operative gauge satisfies
for \(\tau >0\) fixed and θ satisfying (18).
In the case of linear metric subregularity on a \(\operatorname{CAT}(\kappa)\) space this becomes
If (17) is satisfied for some \(\mu '>0\), then the condition \(\mu \geq \sqrt{\frac{\tau }{(1+\epsilon )}}\) is satisfied for all \(\mu \geq \mu '\) large enough. The conditions in (18) in this case simplify to \(\theta (t)=\gamma t\), where
The next definition characterizes the quantitative convergence of sequences in terms of gauge functions.
Definition 15
(Gauge monotonicity [10])
Let \((G,d)\) be a metric space, let \((x_{k})_{k\in \mathbb{N}}\) be a sequence on G, let \(D\subset G\) be nonempty, and let the continuous mapping \(\mu: \mathbb{R}_{+} \to \mathbb{R}_{+} \) satisfy \(\mu (0)=0\) and

(i)
\((x_{k})_{k\in \mathbb{N}}\) is said to be gauge monotone with respect to D with rate μ whenever
$$\begin{aligned} d(x_{k+1}, D)\leq \mu \bigl(d(x_{k}, D) \bigr)\quad \forall k\in \mathbb{N}. \end{aligned}$$(21) 
(ii)
\((x_{k})_{k\in \mathbb{N}}\) is said to be linearly monotone with respect to D with rate c if (21) is satisfied for \(\mu (t)=c\cdot t\) for all \(t\in \mathbb{R}_{+}\) and some constant \(c\in [0,1]\).
A sequence \((x_{k})_{k\in \mathbb{N}}\) is said to converge gauge monotonically to some element \(x^{*}\in G\) with rate \(s_{k}(t):=\sum_{j=k}^{\infty }\mu ^{(j)}(t)\) whenever it is gauge monotone with gauge μ satisfying \(\sum_{j=1}^{\infty }\mu ^{(j)}(t)<\infty \ \forall t\geq 0\), and there exists a constant \(a>0\) such that \(d(x_{k},x^{*})\leq a s_{k}(t)\) for all \(k\in \mathbb{N}\).
All Fejér monotone sequences are linearly monotone (with constant \(c=1\)) but the converse does not hold (see Proposition 1 and Example 1 of [10]). Gauge-monotonic convergence for a linear gauge in the definition above is just R-linear convergence.
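For a linear gauge \(\mu (t)=\gamma t\) with \(\gamma \in [0,1)\), the j-fold composition is \(\mu ^{(j)}(t)=\gamma ^{j}t\), so the rate \(s_{k}(t)\) is a geometric tail. A minimal sketch comparing the closed form with a truncated sum (function names are illustrative):

```python
def tail_sum_closed_form(gamma: float, t: float, k: int) -> float:
    """s_k(t) = sum_{j>=k} gamma**j * t = gamma**k * t / (1 - gamma)."""
    if not (0.0 <= gamma < 1.0):
        raise ValueError("gamma must lie in [0, 1)")
    return gamma ** k * t / (1.0 - gamma)


def tail_sum_truncated(gamma: float, t: float, k: int, terms: int = 200) -> float:
    """Approximate s_k(t) by summing the first `terms` terms of the tail."""
    return sum(gamma ** j * t for j in range(k, k + terms))


if __name__ == "__main__":
    gamma, t = 0.5, 3.0
    for k in range(5):
        print(k, tail_sum_closed_form(gamma, t, k), tail_sum_truncated(gamma, t, k))
```

Since \(s_{k+1}(t)=\gamma s_{k}(t)\), the bound \(d(x_{k},x^{*})\leq a s_{k}(t)\) decays like \(\gamma ^{k}\), which is the R-linear convergence mentioned above.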
Metric subregularity and pointwise (almost) nonexpansiveness are fundamentally connected through the surrogate mapping \(\mathcal{T}_{S}: G \to \mathbb{R}_{+} \cup \{+\infty \}\) defined by
where \(\psi ^{(p,c)}_{T}\) is defined by (3) and \(S\subset G\). If \(S=\emptyset \) then, by definition, \(\mathcal{T}_{S}(x):=+\infty \) for all x. Hence, \(\mathcal{T}_{S}\) is proper when S is nonempty. For our purposes, \(S\subseteq \operatorname {Fix}T\), in which case by Proposition 6(i) we have \(\psi ^{(p,c)}_{T}(x,y)\geq 0\) for all \(x\in D\) and all \(y\in S\) and \(\psi ^{(p,c)}_{T}(x,y)= 0\) only when both \(x,y\in \operatorname {Fix}T\). Hence \(\mathcal{T}_{S}\) is nonnegative, takes the value 0 only on \(\operatorname {Fix}T\), and has the simple representation
Theorem 16
(Necessary and sufficient conditions for convergence rates)
Let \((G,d)\) be a complete p-uniformly convex space with constant c; let \(D\subset G\) with \((D, d)\), let \(T: D \to D \) with \((T(D), d)\) compact on bounded subsets, and let \(S:=\operatorname {Fix}T\cap D\) be nonempty. Assume further that T is pointwise almost α-firmly nonexpansive at all points \(y\in S\) with the same constant \(\overline{\alpha }\) and violation at most ϵ on D.

(a)
(necessity) Suppose that all sequences \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}=Tx^{k}\) and initialized in D are gauge monotone relative to S with rate θ satisfying (18), and \((\operatorname {Id}- \theta )^{-1}(\cdot )\) is continuous on \(\mathbb{R}_{+}\), strictly increasing, and \((\operatorname {Id}- \theta )^{-1}(0)=0\). Then all sequences initialized on D converge gauge monotonically to some \(\overline{x}\in S\) with rate \(O(s_{k}(t_{0}))\) where \(s_{k}(t):=\sum_{j=k}^{\infty }\theta ^{(j)}(t)\) and \(t_{0}:=d(x^{0},\operatorname {Fix}T\cap D)\). Moreover, \(\mathcal{T}_{S}\) defined by (22) is metrically subregular for 0 relative to D on D with gauge \(\mu (\cdot )=(\operatorname {Id}-\theta )^{-1}(\cdot )\).

(b)
(sufficiency) Let T satisfy
$$\begin{aligned} d(x,\operatorname {Fix}T\cap D)\leq \mu \bigl(d(x,Tx)\bigr), \quad\forall x\in D, \end{aligned}$$(24)
with gauge μ given implicitly by (19) with θ satisfying (18) for \(\tau =(1-\overline{\alpha })/\overline{\alpha }\) and \(\epsilon \geq 0\) an upper bound on the violation of pointwise α-firmness of T on D. Then, for any \(x^{0}\in D\), the sequence \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}= T x^{k}\) satisfies
$$\begin{aligned} d \bigl(x^{k+1},\operatorname {Fix}T\cap D \bigr) \leq \theta \bigl(d \bigl(x^{k}, \operatorname {Fix}T\cap D \bigr) \bigr) \quad\forall k \in \mathbb{N}. \end{aligned}$$(25)
Moreover, the sequence \((x^{k})_{k\in \mathbb{N}}\) converges gauge monotonically to some \(x^{*}\in \operatorname {Fix}T\cap D\) with rate \(O(s_{k}(t_{0}))\) where \(s_{k}(t):=\sum_{j=k}^{\infty }\theta ^{(j)}(t)\) and \(t_{0}:=d(x^{0},\operatorname {Fix}T\cap D)\).
Before proving this theorem, we collect some intermediate results.
Lemma 17
(Gauge monotonicity and almost quasi α-firmness imply convergence to fixed points)
Let \((G, d)\) be a complete p-uniformly convex space with constant c. Let \(T: G \to G \) with \(T(D)\subseteq D\subseteq G\) boundedly compact. Suppose that \(\operatorname {Fix}T\cap D\) is nonempty and that T is pointwise almost α-firmly nonexpansive at all \(y\in \operatorname {Fix}T\cap D\) with the same constant \(\overline{\alpha }\) and violation at most ϵ on D. If the sequence \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}= Tx^{k}\) and initialized in D is gauge monotone relative to \(\operatorname {Fix}T\cap D\) with rate θ satisfying (18), then \((x^{k})_{k\in \mathbb{N}}\) converges gauge monotonically to some \(x^{*}\in \operatorname {Fix}T\cap D\) with rate \(O(s_{k}(t_{0}))\) where \(s_{k}(t):=\sum_{j=k}^{\infty }\theta ^{(j)}(t)\) and \(t_{0}:=d(x^{0},\operatorname {Fix}T\cap D)\).
Proof
The assumption that T is pointwise almost α-firmly nonexpansive at all \(y\in \operatorname {Fix}T\cap D\) with constant \(\overline{\alpha }\) and violation at most ϵ on D yields
Let \(x^{0}\in D\) and define the sequence \(x^{k+1}= Tx^{k}\) for all \(k\in \mathbb{N}\). Since T is pointwise almost α-firmly nonexpansive at all points in \(\operatorname {Fix}T\cap D\) on D, \(\operatorname {Fix}T\cap D\) is closed and \(P_{\operatorname {Fix}T\cap D}x^{k}\) is nonempty (though possibly set-valued) for all k. Denote any selection by \(\bar{x}^{k}\in P_{\operatorname {Fix}T\cap D}x^{k}\) for each \(k\in \mathbb{N}\). Then
which implies that
On the other hand \(d(x^{k}, \bar{x}^{k})= d(x^{k}, \operatorname {Fix}T\cap D) \leq \theta (d(x^{k-1}, \operatorname {Fix}T\cap D) )\) since \((x^{k})_{k\in \mathbb{N}}\) is gauge monotone relative to \(\operatorname {Fix}T\cap D\) with rate θ. Therefore an iterative application of gauge monotonicity yields
Let \(t_{0}=d(x^{0}, \operatorname {Fix}T\cap D)\). For any given natural numbers \(k,l\) with \(k< l\), an iterative application of the triangle inequality yields the upper estimate
where \(s_{k}(t_{0}):=\sum_{j=k}^{\infty }\theta ^{(j)}(t_{0})<\infty \) for θ satisfying (18). Since \((\theta ^{(k)}(t_{0}))_{k\in \mathbb{N}}\) is a summable sequence of nonnegative numbers, the sequence of partial sums \(s_{k}(t_{0})\) converges to zero monotonically as \(k\to \infty \), and hence \((x^{k})_{k\in \mathbb{N}}\) is a Cauchy sequence and \(x^{k}\to x^{*}\) for some \(x^{*}\in G\). Letting \(l\to +\infty \) yields
Therefore \((x^{k})_{k\in \mathbb{N}}\) converges gauge monotonically to \(x^{*}\) with rate \(O(s_{k}(t_{0}))\).
It remains to show that \(x^{*}\in \operatorname {Fix}T\cap D\). Note that for each \(k\in \mathbb{N}\)
which yields \(\lim_{k} d(x^{k}, \bar{x}^{k})=0\). But by the triangle inequality
so \(\lim_{k}d(\bar{x}^{k}, x^{*})=0\). By construction \((\bar{x}^{k})_{k\in \mathbb{N}}\subseteq \operatorname {Fix}T\cap D\) and \(\operatorname {Fix}T\cap D\) is closed, hence \(x^{*}\in \operatorname {Fix}T\cap D\). □
Proposition 18
([4], Theorem 32)
Let \((G, d)\) be a p-uniformly convex metric space with constant c. Let \(T:D\to D\) with \(D\subseteq G\). Suppose that \(S:=\operatorname {Fix}T\cap D\) is nonempty. Suppose that all sequences \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}=Tx^{k}\) and initialized in D are gauge monotone relative to S with rate θ satisfying (18). Suppose, in addition, that \((\operatorname {Id}- \theta )^{-1}(\cdot )\) is continuous on \(\mathbb{R}_{+}\), strictly increasing, and \((\operatorname {Id}- \theta )^{-1}(0)=0\). Then \(\mathcal{T}_{S}\) defined by (22) is metrically subregular for 0 relative to D on D with gauge \(\mu (\cdot )=(\operatorname {Id}-\theta )^{-1}(\cdot )\).
Proof of Theorem 16
Part (a). This follows immediately from Lemma 17 and Proposition 18.
Part (b). Our pattern of proof follows the same logic as the analogous result for set-valued mappings in a Euclidean space setting [11, Theorem 2.2]. Since \(S = \operatorname {Fix}T\cap D\), Proposition 6(i) establishes that \(\psi (x,y)=\frac{c}{2}d(x, Tx)^{p}\) for all \(y\in \operatorname {Fix}T\), so \(\mathcal{T}_{S}(x)=\frac{c}{2}d(x, Tx)\). Also by Proposition 6(i) \(\mathcal{T}_{S}\) takes the value 0 only on \(\operatorname {Fix}T\), that is, \(\mathcal{T}_{S}^{-1}(0)=\operatorname {Fix}{T}\). So by assumption that \(\mathcal{T}_{S}\) satisfies (24) with gauge μ given by (19) for \(\tau =(1-\overline{\alpha })/\overline{\alpha }\), together with the definition of metric subregularity (Definition 13), this yields
In other words,
On the other hand, by the assumption that T is pointwise almost α-firmly nonexpansive at all points \(y\in S\) with the same constant \(\overline{\alpha }\) and violation at most ϵ on D we have
Incorporating (26) into (27) and rearranging the inequality yields
Since this holds at any \(x\in D\), it certainly holds at the iterates \(x^{k}\) with initial point \(x^{0}\in D\) since T is a self-mapping on D. Therefore for all \(k\in \mathbb{N}\)
Equation (28) simplifies as follows. Since the space \((T(D), d)\) is boundedly compact and \(\operatorname {Fix}T\) is closed by continuity, for every \(k\in \mathbb{N}\), the distance \(d(x^{k}, \operatorname {Fix}T\cap D)\) is attained at some \(y^{k}\in \operatorname {Fix}T\cap D\). This yields
Taking the p-th root and recalling (19) yields (25).
This establishes also that the sequence \((x^{k})_{k\in \mathbb{N}}\) is gauge monotone relative to \(\operatorname {Fix}T\cap D\) with rate θ satisfying Eq. (18). By Lemma 17 it follows that the sequence \((x^{k})_{k\in \mathbb{N}}\) converges gauge monotonically to \(x^{*}\in \operatorname {Fix}T\cap D\) with the rate \(O(s_{k}(d(x^{0},\operatorname {Fix}T\cap D)))\) where \(s_{k}(t):=\sum_{j=k}^{\infty }\theta ^{(j)}(t)\). □
Corollary 19
(Linear convergence)

(a)
(necessity) In the setting of Theorem 16(a), if all sequences \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}=Tx^{k}\) and initialized in D are linearly monotone relative to S with rate \(\gamma <1\), then all sequences initialized on D converge R-linearly to some \(\overline{x}\in S\) with rate \(O(\gamma ^{k})\). Moreover, \(\mathcal{T}_{S}\) defined by (22) is linearly metrically subregular for 0 relative to D on D with gauge \(\mu (t)=(1-\gamma )^{-1}t\).

(b)
(sufficiency) In the setting of Theorem 16(b) suppose that T satisfies
$$\begin{aligned} d(x,\operatorname {Fix}T\cap D)\leq \mu d(x,Tx),\quad \forall x\in D, \end{aligned}$$(30)
with the scalar μ satisfying
$$\begin{aligned} \sqrt[p]{\frac{1-\overline{\alpha }}{\overline{\alpha }(1+\epsilon )}}< \mu < \sqrt[p]{\frac{1-\overline{\alpha }}{\overline{\alpha }\epsilon }}. \end{aligned}$$
Then, for any \(x^{0}\in D\), the sequence \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}= T x^{k}\) is R-linearly convergent to a point in \(\operatorname {Fix}T\cap D\) with rate \(\gamma = \sqrt[p]{1+\epsilon -\frac{1-\overline{\alpha }}{\overline{\alpha }\mu ^{p}}}\).
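The interplay between the admissible range for μ and the resulting rate γ in Corollary 19(b) can be made explicit in a few lines. This is a minimal sketch; the function name `linear_rate` and the convention that the upper bound is infinite when \(\epsilon =0\) are assumptions made for the illustration.

```python
import math


def linear_rate(alpha_bar: float, eps: float, mu: float, p: float = 2.0) -> float:
    """Rate gamma = (1 + eps - tau / mu**p)**(1/p), tau = (1-alpha_bar)/alpha_bar.

    The constant mu must satisfy
        (tau / (1 + eps))**(1/p) < mu < (tau / eps)**(1/p),
    where the upper bound is read as +infinity when eps == 0 (assumption).
    The lower bound keeps the radicand nonnegative; the upper bound
    guarantees gamma < 1.
    """
    if not (0.0 < alpha_bar < 1.0) or eps < 0.0 or p <= 1.0:
        raise ValueError("need 0 < alpha_bar < 1, eps >= 0 and p > 1")
    tau = (1.0 - alpha_bar) / alpha_bar
    lower = (tau / (1.0 + eps)) ** (1.0 / p)
    upper = math.inf if eps == 0.0 else (tau / eps) ** (1.0 / p)
    if not (lower < mu < upper):
        raise ValueError("mu is outside the admissible interval")
    return (1.0 + eps - tau / mu ** p) ** (1.0 / p)


if __name__ == "__main__":
    print(linear_rate(alpha_bar=0.5, eps=0.0, mu=2.0))
    print(linear_rate(alpha_bar=0.5, eps=0.1, mu=2.0))
```

A larger violation ϵ narrows the admissible interval for μ from above and worsens the rate, which is the coupling between violation and subregularity constant discussed in the text.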
In the statements above, the upper bound on the violation of α-firm nonexpansiveness ϵ has to be compensated for by an equally strong gauge of metric subregularity with this value of ϵ explicitly accounted for in the gauge. The next result shows that these can be decoupled if T is pointwise asymptotically α-firmly nonexpansive at fixed points. In particular, if T is pointwise almost α-firmly nonexpansive at \(y\in \operatorname {Fix}T\) with arbitrarily small violation ϵ, then whenever T is (gauge) metrically subregular at y, there is a neighborhood of y on which convergence of the fixed point iteration can be quantified by the said gauge. In this situation it suffices to qualitatively determine metric subregularity; the exact value of the constants in relation to the violation of α-firmness is not needed in order to determine local convergence on the order of the gauge.
Proposition 20
Let \((G,d)\) be a complete p-uniformly convex space with constant c; let \(D\subset G\), let \(T: D \to D \) with \(T(D)\) boundedly compact, and let \(S:=\operatorname {Fix}T\cap D\) be nonempty. Assume that T is a self-mapping on sufficiently small balls around points in S restricted to D, and that T is pointwise asymptotically α-firmly nonexpansive at all points \(y\in S\) with constant \(\overline{\alpha }\in (0,1)\). Suppose further that T satisfies
$$\begin{aligned} d(x,\operatorname {Fix}T\cap D)\leq \mu \bigl(d(x,Tx)\bigr),\quad \forall x\in D, \end{aligned}$$ (31)
with gauge μ given by (19) and \(\tau =(1-\overline{\alpha })/\overline{\alpha }\). Then, for any \(x^{0}\) close enough to S, the sequence \((x^{k})_{k\in \mathbb{N}}\) defined by \(x^{k+1}= T x^{k}\) converges gauge monotonically to some \(x^{*}\in \operatorname {Fix}T\cap D\) with rate \(O(s_{k}(t_{0}))\) where \(s_{k}(t):=\sum_{j=k}^{\infty }\theta ^{(j)}(t)\) and \(t_{0}:=d(x^{0},\operatorname {Fix}T)\) for θ given implicitly by (19) satisfying (18).
Proof
Since T is a self-mapping on \(\mathbb{B}_{\delta }(S)\cap D\) for δ small enough, and T is pointwise asymptotically α-firmly nonexpansive with constant \(\overline{\alpha }\in (0,1)\), the result follows immediately from Proposition 14 and Theorem 16 when the domain D is restricted to \(\mathbb{B}_{\delta }(S)\) for δ sufficiently small. □
4 Proximal mappings
We return now to the prox mapping (2). It was shown in [8, Proposition 2.7] that the argmin in (2) exists and is unique if f is proper, lsc, and convex. In general the prox mapping of a convex function is not α-firmly nonexpansive. However, it was shown in [4, Corollary 23] that it is almost α-firmly nonexpansive. This and other properties of prox mappings are collected in the following result.
Theorem 21
Let \((G,d)\) be a p-uniformly convex metric space with constant \(c\in (0,2]\), and let \(f\colon G \rightarrow \mathbb{R}\) be proper, convex, and lower semicontinuous.

(i)
If \((G,d)\) is symmetric perpendicular, then \(\operatorname {prox}^{p}_{f,\lambda }\) is quasi strictly nonexpansive, that is,
$$\begin{aligned} d\bigl(\operatorname {prox}^{p}_{f,\lambda }(x),\overline{x}\bigr)< d(x,\bar{x})\quad \forall x \in G, \forall \overline{x}\in \mathop {\operatorname {argmin}}f = \operatorname {Fix}\operatorname {prox}^{p}_{f, \lambda }. \end{aligned}$$ 
(ii)
The prox mapping \(\operatorname {prox}_{f,\lambda }^{p}\) is pointwise almost α-firmly nonexpansive at all \(y\in \mathop {\operatorname {argmin}}f\) on G with constant
$$\begin{aligned} \alpha _{c} = \frac{c(c-1)}{c(c-1)+2} \quad\textit{and violation bounded above by}\quad \epsilon _{c}= \frac{2-c}{c-1}. \end{aligned}$$ (32)
(iii)
If \((G,d)\) is a \(\operatorname{CAT}(\kappa )\) space, then \(\operatorname {prox}_{f,\lambda }^{p}\) is pointwise asymptotically α-firmly nonexpansive at all \(y\in \mathop {\operatorname {argmin}}f\) with constant \(\overline{\alpha }= 1/2\).
Proof
(i). Let \(x \in G\) be arbitrary and \(y:= \operatorname {prox}^{p}_{f,\lambda }(x)\). We prove by contradiction that the projection of x onto the geodesic \([\overline{x},y]\) connecting \(\overline{x}\) and y is y, i.e., \(P_{[\overline{x},y]}(x)= y\). Assume to the contrary that \(P_{[\overline{x},y]}(x)\neq y\), i.e., \(P_{[\overline{x},y]}(x)= (1-t)y \oplus t \overline{x}\) for some \(t \in (0,1]\). Then \(f((1-t)y \oplus t \overline{x}) \leq (1-t) f(y) + t f(\overline{x}) \leq f(y)\) and \(d((1-t)y \oplus t \overline{x},x) < d(y,x)\). Together, these two inequalities show that the point \((1-t)y \oplus t \overline{x}\) attains a strictly smaller value of the objective in (2) than y does, which
contradicts \(y= \operatorname {prox}^{p}_{f,\lambda }(x)\). Hence our assumption must be discarded and \(P_{[\overline{x},y]}(x) = y\). In particular \([\overline{x},y] \perp _{y} [x,y]\), and hence by the symmetric perpendicular property \([x,y] \perp _{y} [\overline{x},y]\). Now \([x,y] \perp _{y} [\overline{x},y]\) in turn yields the claim \(d(y,\overline{x})\leq d(x,\overline{x})\).
If in addition \((G,d)\) is a p-uniformly convex space
This is only possible if either \(x=y\) or \(d(\overline{x},y) < d(\overline{x},x)\). In both cases \(\operatorname {prox}^{p}_{f,\lambda }\) is quasi strictly nonexpansive.
(ii). This is [4, Corollary 23].
(iii). Let \(\epsilon >0\), \(y\in \mathop {\operatorname {argmin}}f\), and \(c=\frac{2+\epsilon }{1+\epsilon }\). Then \(c \in (1,2)\) and there is a unique solution \(t \in (0, \pi /2)\) to \(\frac{c}{2} = t \tan (\pi /2-t) \). Set \(\delta =t/(2 \sqrt{\kappa })\). Then \(\delta \in (0,\pi /(4 \sqrt{\kappa }))\) and \((\mathbb{B}_{\delta }(y), d \vert _{\mathbb{B}_{\delta }(y)})\) is a p-uniformly convex space (with \(p=2\)) with constant c by Lemma 1. Part (i) ensures that \(\operatorname {prox}^{p}_{f,\lambda }\) is a self-mapping on \(\mathbb{B}_{\delta }(y)\), so its restriction to this ball is well defined,
and we will use the same notation \(\operatorname {prox}^{p}_{f,\lambda }\) for the operator restricted to the subspace \((\mathbb{B}_{\delta }(y), d \vert _{\mathbb{B}_{\delta }(y)})\). The operator \(\operatorname {prox}^{p}_{f,\lambda }\) is pointwise almost α-firmly nonexpansive with constant \(\alpha _{c}=\frac{c(c-1)}{c(c-1)+2}\) and violation bounded above by \(\epsilon _{c} = \frac{2-c}{c-1} =\epsilon \) on \((\mathbb{B}_{\delta }(y), d \vert _{\mathbb{B}_{\delta }(y)})\) by part (ii). Note that \(\alpha _{c}\nearrow 1/2\) as \(c\nearrow 2\) and by Proposition 6(iii) \(\operatorname {prox}^{p}_{f,\lambda }\) is pointwise almost α-firmly nonexpansive with constant \(\overline{\alpha }=1/2\) for all \(c\in (0,2)\). Since \(\epsilon _{c}\searrow 0\) as \(c\nearrow 2\), this implies that \(\operatorname {prox}^{p}_{f,\lambda }\) is pointwise asymptotically α-firmly nonexpansive at \(y\in \mathop {\operatorname {argmin}}f\) with constant \(\overline{\alpha }= 1/2\) and the neighborhood \(\mathbb{B}_{\delta }(y)\) of y. □
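The construction in part (iii) of the proof is explicit enough to compute. The following Python sketch is an illustration only; the function name `radius_for_violation` and the use of bisection are assumptions, not part of the source. For a target violation ϵ and curvature bound κ it computes \(c=\frac{2+\epsilon}{1+\epsilon}\), solves \(\frac{c}{2}=t\tan(\pi/2-t)\) on \((0,\pi/2)\), and returns the radius \(\delta=t/(2\sqrt{\kappa})\).

```python
import math


def radius_for_violation(eps, kappa, iterations=200):
    """Return (c, t, delta) for a target violation eps > 0 and curvature kappa > 0.

    Uses the identity t * tan(pi/2 - t) = t * cos(t) / sin(t), which decreases
    from 1 to 0 on (0, pi/2), so bisection finds the unique root of
    t * cos(t) / sin(t) = c / 2.
    """
    c = (2.0 + eps) / (1.0 + eps)
    target = c / 2.0
    lo, hi = 0.0, math.pi / 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.cos(mid) / math.sin(mid) > target:
            lo = mid  # function value still too large: root lies to the right
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    delta = t / (2.0 * math.sqrt(kappa))
    return c, t, delta


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01):
        c, t, delta = radius_for_violation(eps, kappa=1.0)
        print(f"eps={eps:5.2f}  c={c:.4f}  t={t:.4f}  delta={delta:.4f}")
```

Smaller target violations force c closer to 2 and hence a smaller radius δ, which is the quantitative content of the asymptotic statement.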
Remark 22
In part (i) of Theorem 21, quasi nonexpansiveness comes from the symmetric perpendicular property. Strict quasi nonexpansiveness comes from the property of p-uniformly convex spaces.
The next result gathers properties of mappings built from prox mappings.
Proposition 23
Let \((G,d)\) be a p-uniformly convex space with constant \(c\in (1,2]\), and let \(f,g\colon G \rightarrow \mathbb{R}\cup \{+\infty \}\) be convex, proper, and lower semicontinuous. Assume that \(\mathop {\operatorname {argmin}}f \cap \mathop {\operatorname {argmin}}g \neq \emptyset \).

(i)
The operator \(T:= \operatorname {prox}_{f,\lambda _{2}}^{p} \circ \operatorname {prox}_{g,\lambda _{1}}^{p}\) (\(\lambda _{1}, \lambda _{2}>0\)) is pointwise almost α-firmly nonexpansive at all points \(y\in \Omega:=\mathop {\operatorname {argmin}}f\cap \mathop {\operatorname {argmin}}g\) on G with constant \(\alpha _{\circ }= \frac{2 (c-1)}{2c-1}\) and violation bounded above by \(\epsilon _{\circ }=\frac{1}{(c-1)^{2}}-1\). If \((G, d)\) is a \(\operatorname{CAT}( \kappa)\) space, then T is pointwise asymptotically α-firmly nonexpansive at all \(y\in \Omega \) with constant \(\overline{\alpha }_{\circ }= 2/3\).

(ii)
Let \(T_{0}: G \to G \) be pointwise asymptotically α-firmly nonexpansive with constant \(\overline{\alpha }_{0}\) at all \(y\in \Omega:=\mathop {\operatorname {argmin}}f\cap \operatorname {Fix}T_{0}\). The operator \(T:= \operatorname {prox}_{f,\lambda }^{p} \circ T_{0}\) (\(\lambda >0\)) is pointwise almost α-firmly nonexpansive at all points \(y\in \Omega \) on G with constant and violation bounded above by
$$\begin{aligned} \alpha = \frac{\overline{\alpha }_{0}+\alpha _{c}-2\overline{\alpha }_{0}\alpha _{c}}{\frac{c}{2} (1-\overline{\alpha }_{0}-\alpha _{c}+\overline{\alpha }_{0}\alpha _{c} )+ \overline{\alpha }_{0}+\alpha _{c}-2\overline{\alpha }_{0}\alpha _{c}}\quad \textit{and}\quad \epsilon = \epsilon _{0}+\epsilon _{c}+\epsilon _{0} \epsilon _{c}, \end{aligned}$$ (33)
where \(\epsilon _{0}\) is the violation of α-firm nonexpansiveness of \(T_{0}\) on some neighborhood small enough. If \((G, d)\) is a \(\operatorname{CAT}( \kappa)\) space, then T is pointwise asymptotically α-firmly nonexpansive at all \(y\in \Omega \) with constant
$$\begin{aligned} \overline{\alpha }:=\frac{1}{2-\overline{\alpha }_{0}}. \end{aligned}$$ (34)
(iii)
The Krasnoselsky–Mann relaxation \(T:=\beta \operatorname {prox}_{f,\lambda }^{p}\oplus (1-\beta )\operatorname {Id}\) is pointwise almost α-firmly nonexpansive at all points \(y\in \Omega:=\mathop {\operatorname {argmin}}f\) on G with constant
$$\begin{aligned} \alpha _{\beta }= \frac{\alpha _{c}\beta ^{p-1}}{\alpha _{c}\beta ^{p-1} - \alpha _{c}\beta +1} \end{aligned}$$
and violation bounded above by \(\epsilon _{\beta }=\beta \epsilon _{c}\), where \(\alpha _{c}\) and \(\epsilon _{c}\) are given by (32). If \((G, d)\) is a \(\operatorname{CAT}(\kappa)\) space, then T is pointwise asymptotically α-firmly nonexpansive at all \(y\in \Omega \) with constant
$$\begin{aligned} \overline{\alpha }_{\beta }:=\frac{\beta ^{p-1}}{\beta ^{p-1}-\beta +2}. \end{aligned}$$
(iv)
The composition \(T:=\operatorname {prox}_{f,\lambda _{2}}^{p}\circ (\beta \operatorname {prox}_{g,\lambda _{1}}^{p} \oplus (1-\beta )\operatorname {Id})\) is pointwise almost α-firmly nonexpansive at all points \(y\in \Omega:=\mathop {\operatorname {argmin}}f\cap \mathop {\operatorname {argmin}}g\) on G with constant
$$\begin{aligned} \begin{aligned}&\widehat{\alpha }= \frac{\alpha _{\beta }+\alpha _{c}-2\alpha _{\beta }\alpha _{c}}{\frac{c}{2} (1-\alpha _{\beta }-\alpha _{c}+\alpha _{\beta }\alpha _{c} )+ \alpha _{\beta }+\alpha _{c}-2\alpha _{\beta }\alpha _{c}}\\ &\quad \textit{and violation bounded above by}\\ & \widehat{\epsilon }= (1+\beta )\epsilon _{c} + \beta \epsilon _{c}^{2}, \end{aligned} \end{aligned}$$ (35)
where \(\alpha _{\beta }\), \(\alpha _{c}\), and \(\epsilon _{c}\) are the constants in part (iii) and Theorem 21(ii) respectively. If \((G, d)\) is a \(\operatorname{CAT}(\kappa)\) space, then T is pointwise asymptotically α-firmly nonexpansive at all \(y\in \Omega \) with constant
$$\begin{aligned} \widehat{\alpha }= \frac{1}{2 - \overline{\alpha }_{\beta }}, \end{aligned}$$
where \(\overline{\alpha }_{\beta }\) is the constant in part (iii).

(v)
If \((G, d)\) is symmetric perpendicular, the projected gradient operator
$$\begin{aligned} T:=P_{C}\circ \bigl(\beta \operatorname {prox}_{g,\lambda }^{p}\oplus (1-\beta ) \operatorname {Id}\bigr) \end{aligned}$$
is pointwise almost α-firmly nonexpansive at all points \(y\in \Omega:=C\cap \mathop {\operatorname {argmin}}g\) on G with
$$\begin{aligned} \alpha _{PG}= \frac{1}{\frac{c}{2} (1-\alpha _{\beta } )+ 1}\quad \textit{and violation bounded above by}\quad \epsilon _{PG} = \epsilon _{c}\beta. \end{aligned}$$ (36)
If \((G, d)\) is a \(\operatorname{CAT}(\kappa)\) space, then T is pointwise asymptotically α-firmly nonexpansive at all \(y\in \Omega \) with constant
$$\begin{aligned} \alpha _{PG}= \frac{1}{2 - \overline{\alpha }_{\beta }}, \end{aligned}$$
where \(\overline{\alpha }_{\beta }\) is the constant in part (iii).
Proof
(i). By Theorem 21(ii), the operators \(\operatorname {prox}_{f,\lambda _{2}}^{p}\) and \(\operatorname {prox}_{g,\lambda _{1}}^{p}\) are almost quasi α-firmly nonexpansive with constant \(\alpha _{c}=\frac{c (c-1)}{c (c-1)+2}\) and violation bounded above by \(\epsilon _{c}=\frac{2-c}{c-1}\). The operator T is almost α-firmly nonexpansive at all points \(y\in \Omega \) on G with
$$\begin{aligned} \alpha _{\circ }= \frac{2\alpha _{c}-2\alpha _{c}^{2}}{\frac{c}{2} (1-2\alpha _{c}+\alpha _{c}^{2} )+ 2\alpha _{c}-2\alpha _{c}^{2}} \end{aligned}$$
and violation satisfying \(1+\epsilon _{\circ }=(1+\epsilon _{c})^{2}\) by Proposition 10. A short calculation yields
$$\begin{aligned} \alpha _{\circ }= \frac{2(c-1)}{2c-1} \quad \textit{and}\quad \epsilon _{\circ }= \frac{1}{(c-1)^{2}}-1. \end{aligned}$$
Taking the limit as \(c\to 2\) from below yields the constant \(\overline{\alpha }_{\circ }= 2/3\) with limiting violation \({\overline{\epsilon }}_{\circ }= 0\). The same argument as in Theorem 21(iii) then shows that, when \((G, d)\) is a \(\operatorname{CAT}(\kappa )\) space, the composition of two prox mappings is pointwise asymptotically α-firmly nonexpansive at points in Ω with constant \(2/3\).
(ii). Theorem 21(ii) and Proposition 10 yield pointwise almost α-firm nonexpansiveness of T at \(y\in \Omega \) with constant and violation characterized by (33), where \(\alpha _{c}\) and \(\epsilon _{c}\) are given by (32), \(\overline{\alpha }_{0}\) is the asymptotic constant of α-firm nonexpansiveness of \(T_{0}\), and \(\epsilon _{0}\) is the upper bound of the violation on some neighborhood. (By Proposition 6(iii), if \(T_{0}\) is pointwise almost α-firmly nonexpansive with constant \(\alpha _{0}<\overline{\alpha }_{0}\), then \(T_{0}\) is also pointwise almost α-firmly nonexpansive with constant \(\overline{\alpha }_{0}\).) By Theorem 21(iii) and the assumption that \(T_{0}\) is pointwise asymptotically α-firmly nonexpansive at \(\operatorname {Fix}T_{0}\) with constant \(\overline{\alpha }_{0}\), the same argument as in Theorem 21(iii) establishes that, when \((G, d)\) is a \(\operatorname{CAT}(\kappa )\) space, T is pointwise asymptotically α-firmly nonexpansive at points in Ω with constant \(\overline{\alpha }\) given by (34).
(iii). The first statement is an immediate application of Proposition 12 and Theorem 21(ii). When \((G, d)\) is a \(\operatorname{CAT}(\kappa )\) space, the same argument as in Theorem 21(iii) establishes that Krasnoselsky–Mann relaxations of prox mappings are pointwise asymptotically α-firmly nonexpansive with constant \(\overline{\alpha }_{\beta }\) as claimed.
(iv). This is an application of part (ii) to part (iii).
(v). This is a specialization of part (iv) when f is the indicator function of a convex set C and follows from the fact that, on symmetric perpendicular p-uniformly convex spaces, the projector is pointwise α-firmly nonexpansive at all points in C with constant \(\alpha =1/2\) (no violation) as shown in [4, Proposition 25]. □
Remark 24
Part (i) of Proposition 23 coincides with \(\alpha = \frac{2}{3}\) and \(\epsilon =0\) in the classic setting with \(c=2\). In particular, the composition \(P_{A} \circ P_{B}\) of two projections \(P_{A}\) and \(P_{B}\) onto convex sets A and B with \(A\cap B \neq \emptyset \) is α-firmly nonexpansive at all \(y\in A\cap B\) on G with \(\alpha =\frac{2}{3}\) and violation \(\epsilon =0\). However, this result does not apply if the problem is infeasible, i.e., \(A \cap B= \emptyset \).
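The constants in Theorem 21(ii) and Proposition 23 are simple rational expressions in c and β, so their limiting behavior as \(c\nearrow 2\) is easy to tabulate. The following Python sketch is an illustration only; the helper names `alpha_c`, `eps_c`, `compose`, and `relax` are hypothetical labels for (32), the constant in (33), and the Krasnoselsky–Mann constant \(\alpha_{\beta}\), respectively.

```python
def alpha_c(c):
    """Constant of the prox mapping, equation (32)."""
    return c * (c - 1.0) / (c * (c - 1.0) + 2.0)


def eps_c(c):
    """Upper bound on the violation of the prox mapping, equation (32)."""
    return (2.0 - c) / (c - 1.0)


def compose(alpha_1, alpha_2, c):
    """Constant of a composition of two mappings, as in equation (33)."""
    s = alpha_1 + alpha_2 - 2.0 * alpha_1 * alpha_2
    return s / (0.5 * c * (1.0 - alpha_1 - alpha_2 + alpha_1 * alpha_2) + s)


def relax(alpha, beta, p=2.0):
    """Constant of the Krasnoselsky-Mann relaxation with parameter beta."""
    return alpha * beta ** (p - 1.0) / (alpha * beta ** (p - 1.0) - alpha * beta + 1.0)


if __name__ == "__main__":
    beta = 0.5
    for c in (1.5, 1.9, 1.99, 2.0):
        a = alpha_c(c)
        prox_prox = compose(a, a, c)             # Proposition 23(i)
        proj_grad = compose(0.5, relax(a, beta), c)  # Proposition 23(v)
        print(f"c={c:4.2f}  alpha_c={a:.4f}  eps_c={eps_c(c):.4f}  "
              f"prox-prox={prox_prox:.4f}  projected={proj_grad:.4f}")
```

At \(c=2\) the printed values reduce to \(\alpha_{c}=1/2\), \(\epsilon_{c}=0\), and \(2/3\) for the composition of two prox mappings, in agreement with the remark above.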
These properties allow us to prove the following fundamental result.
Theorem 25
(Convergence of proximal algorithms in \(\operatorname{CAT}(\kappa)\) spaces)
Let \((G, d)\) be a complete \(\operatorname{CAT}(\kappa )\) space with \(\kappa >0\) and \(f,g\colon G \rightarrow \mathbb{R}\cup \{+\infty \}\) be proper, convex, and lower semicontinuous with \(\mathop {\operatorname {argmin}}f\cap \mathop {\operatorname {argmin}}g \neq \emptyset \). Let T denote any of the mappings in Proposition 23. If T satisfies
$$\begin{aligned} d(x,\operatorname {Fix}T\cap D)\leq \mu d(x,Tx),\quad \forall x\in D, \end{aligned}$$ (37)
with constant μ, then the fixed point sequence initialized from any starting point close enough to \(\operatorname {Fix}T\cap D\) is at least linearly convergent to a point in \(\operatorname {Fix}T\cap D\) with rate \(\gamma = \sqrt{1+\epsilon - \frac{1-\alpha }{\alpha \mu ^{2}}}<1\), where α and ϵ are the respective constant and violation of pointwise α-firm nonexpansiveness of the fixed point mapping T as given in Proposition 23. The asymptotic rate of convergence is \(\overline{\gamma }= \sqrt{1- \frac{1-\overline{\alpha }}{\overline{\alpha }\mu ^{2}}}<1\), where \(\overline{\alpha }\) is the respective constant of pointwise asymptotic α-firm nonexpansiveness of the fixed point mapping T.
Proof
As established in Theorem 21 and Proposition 23, all of the mappings covered in those results are pointwise asymptotically α-firmly nonexpansive at points in Ω with constants \(\overline{\alpha }<1\), where Ω is one of the following subsets corresponding to the respective mappings (i)–(v) in Proposition 23: (i) \(\Omega = \mathop {\operatorname {argmin}}f\cap \mathop {\operatorname {argmin}}g\); (ii) \(\Omega \subset \operatorname {Fix}T_{0}\cap \mathop {\operatorname {argmin}}f\); (iii) \(\Omega = \mathop {\operatorname {argmin}}f\); (iv) \(\Omega = \mathop {\operatorname {argmin}}f\cap \mathop {\operatorname {argmin}}g\); (v) \(\Omega = C\cap \mathop {\operatorname {argmin}}g\).
As noted in Remark 3, any \(\operatorname{CAT}(\kappa)\) space is symmetric perpendicular locally, so by Theorem 21(i) and Lemma 7, in every case \(\operatorname {Fix}T = \Omega \). Now, if T satisfies (37) at all points in \(\operatorname {Fix}T\) on G with constant μ, it follows immediately from Proposition 20 that the fixed point iteration converges linearly to a point in \(\operatorname {Fix}T\cap D\) with the given rate for all starting points close enough to \(\operatorname {Fix}T\cap D\), as claimed. □
5 Proximal splitting methods
The concrete examples provided here have been well studied for p-uniformly convex spaces with \(p=c=2\), i.e., \(\operatorname{CAT}(0)\) spaces. The tools established in the previous sections open the door to applying these methods in \(\operatorname{CAT}( \kappa )\) spaces, which is new. Since \(\operatorname{CAT}(\kappa )\) spaces are p-uniformly convex with \(p=2\), to avoid confusion, we revert to the usual notation for proximal operators in this setting, namely \(\operatorname {prox}_{f,\lambda }\), omitting the exponent.
Let \((G,d)\) be a \(\operatorname{CAT}(\kappa)\) space, and let \(f_{i}\colon G\to \mathbb{R}\cup \{+\infty \}\) be proper lsc convex functions for \(i=1,2,\dots ,N\). Consider the problem
Applying backward-backward splitting to this problem yields Algorithm 1. Local linear convergence follows from Theorem 25 and the extension of Proposition 23(i) via part (ii) of the same proposition and induction, under the assumption that \(\mathcal{T}_{\Omega }\) defined by (22), which simplifies to (23), is linearly metrically subregular for 0 on G with constant μ and \(\Omega:=\bigcap_{j}\mathop {\operatorname {argmin}}f_{j}\neq \emptyset \). By Proposition 6(i), linear metric subregularity simplifies to
Recall, in a p-uniformly convex space with \(p=2\), the Moreau–Yosida envelope of f is defined by
The analogue to the direction of steepest descent for the Moreau–Yosida envelope in p-uniformly convex settings is
Specializing problem (38) to the case \(N=2\) and \(f_{2}=\iota _{C}\), the indicator function of some closed convex set \(C\subset G\) yields Algorithm 2, the analog to projected gradients in \(\operatorname{CAT}(\kappa)\) space, which is the projected resolvent/projected prox iteration. Local linear convergence follows immediately from Theorem 25 under the assumption that T satisfies (30) and \(\Omega:=\mathop {\operatorname {argmin}}f\cap C\neq \emptyset \).
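To see the projected prox iteration at work in the simplest possible setting, the following Python sketch runs it in the Euclidean plane. This is an illustration under stated assumptions and not the paper's Algorithm 2: the space is \(\mathbb{R}^{2}\) (a \(\operatorname{CAT}(0)\) space rather than a genuinely curved one), the prox is the standard Euclidean one with weight \(1/(2\lambda)\), f is taken to be \(\frac{1}{2}\|z-a\|^{2}\), C is the closed unit ball, and all function names are hypothetical.

```python
import math


def prox_quadratic(x, a, lam):
    """Euclidean prox of z -> 0.5 * ||z - a||^2 with parameter lam (closed form)."""
    return [(xi + lam * ai) / (1.0 + lam) for xi, ai in zip(x, a)]


def project_ball(x, radius=1.0):
    """Metric projection onto the closed ball of the given radius centred at 0."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= radius:
        return list(x)
    return [radius * xi / norm for xi in x]


def projected_prox_step(x, a, lam, beta):
    """One step of T = P_C o (beta * prox + (1 - beta) * Id) in Euclidean space."""
    px = prox_quadratic(x, a, lam)
    relaxed = [beta * pi + (1.0 - beta) * xi for pi, xi in zip(px, x)]
    return project_ball(relaxed)


def run(x0, a, lam=1.0, beta=0.5, iters=60):
    """Iterate T and record the distance to the minimizer a before each step."""
    x, dists = list(x0), []
    for _ in range(iters):
        dists.append(math.dist(x, a))
        x = projected_prox_step(x, a, lam, beta)
    return x, dists


if __name__ == "__main__":
    a = [0.3, 0.4]  # minimizer of f, chosen inside C so that the problem is consistent
    x, dists = run([3.0, -2.0], a)
    print(x, dists[-1])
```

Because the minimizer a lies in C, the set Ω is the singleton {a}, and the recorded distances decrease geometrically, which is the behavior that Theorem 25 predicts locally in the curved setting.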
Compositions of projectors in \(\operatorname{CAT}(\kappa )\) spaces have been studied in [4] and [1]. We consider Algorithm 1 when the functions \(f_{i}:=\iota _{C_{i}}\), the indicator functions of closed convex sets \(C_{i}\subset G\), where \((G, d)\) is a complete, symmetric perpendicular p-uniformly convex space with constant c. The p-proximal mapping of the indicator function is the metric projector, and so by [4, Proposition 25] these are pointwise α-firmly nonexpansive at all points in \(\bigcap_{i} C_{i}\) (assuming, of course, that this is nonempty) and by [4, Lemma 10] the cyclic projections mapping
is pointwise α-firmly nonexpansive at all points in \(\bigcap_{i} C_{i}=\operatorname {Fix}T_{CP}\), when the intersection is nonempty, with constant \(\overline{\alpha }_{N} = \frac{N-1}{N}\) on G. Δ-convergence or weak convergence (no rate) to a point in \(\bigcap_{i} C_{i}\) follows from [4, Theorem 27], with strong convergence whenever one of the sets is compact. If in addition \(d(x,\bigcap_{i} C_{i})\leq \mu d(T_{CP}x, x)\) for all \(x\in G\), where \(\mu >0\) is the constant of linear metric subregularity, then, by Theorem 25, the sequence \((x_{k})\) initialized anywhere in G converges linearly to some \(x^{*}\in \bigcap_{i} C_{i}\).
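A minimal instance of cyclic projections with an observable linear rate is two lines through the origin in the Euclidean plane. The following Python sketch is an illustration only: the sets (the horizontal axis and the diagonal), the order in which the projectors are composed, and the function names are all assumptions made for the example, and the flat space \(\mathbb{R}^{2}\) stands in for the general setting.

```python
import math


def project_x_axis(pt):
    """Projection onto C_1 = {(u, 0)}."""
    return (pt[0], 0.0)


def project_diagonal(pt):
    """Projection onto C_2 = {(u, u)}."""
    m = 0.5 * (pt[0] + pt[1])
    return (m, m)


def cyclic_projections(pt):
    """One sweep: project onto C_1 first, then onto C_2."""
    return project_diagonal(project_x_axis(pt))


def iterate(pt, n):
    """Apply n sweeps and record the distance to C_1 and C_2's intersection {0}."""
    dists = []
    for _ in range(n):
        pt = cyclic_projections(pt)
        dists.append(math.hypot(pt[0], pt[1]))
    return pt, dists


if __name__ == "__main__":
    pt, dists = iterate((1.0, 3.0), 10)
    print(pt)
    print([dists[k + 1] / dists[k] for k in range(len(dists) - 1)])
```

Each sweep halves the distance to the intersection in this example, so the iterates converge linearly to the unique common point, here the origin.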
6 Open problems
There are two obvious next steps for this work. First and foremost is to determine the requirements for quantitative convergence of proximal splitting methods for the case when the individual prox mappings do not have common fixed points (the so-called inconsistent case), since it is too limiting to require that the fixed points of the constituent elements of splitting methods coincide. The second item to explore is qualitative settings in which metric subregularity comes “for free”. In linear settings, polyhedrality and isolated fixed points suffice to guarantee metric subregularity [7, Propositions 3I.1 and 3I.2], and this was successfully used to prove local linear convergence of the ADMM/Douglas–Rachford algorithms in a convex setting [2, Theorem 2.7]. In more general settings, the Kurdyka–Łojasiewicz (KL) property, which implies metric subregularity [6, Corollary 4 and Remark 5], is satisfied by semialgebraic functions [5]. Analogues to these properties for p-uniformly convex spaces would be very useful.
Availability of data and materials
Not applicable.
References
1. Ariza-Ruiz, D., López-Acedo, G., Nicolae, A.: The asymptotic behavior of the composition of firmly nonexpansive mappings. J. Optim. Theory Appl. 167, 409–429 (2015)
2. Aspelmeier, T., Charitha, C., Luke, D.R.: Local linear convergence of the ADMM/Douglas–Rachford algorithms without strong convexity and application to statistical imaging. SIAM J. Imaging Sci. 9(2), 842–868 (2016)
3. Bërdëllima, A.: Investigations in Hadamard Spaces. PhD thesis, Georg-August Universität Göttingen, Göttingen (2020)
4. Bërdëllima, A., Lauster, F., Luke, D.R.: α-firmly nonexpansive operators on metric spaces. arXiv:2104.11302 (2021)
5. Bolte, J., Daniilidis, A., Lewis, A.: The Lojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17, 1205–1223 (2006)
6. Bolte, J., Daniilidis, A., Ley, O., Mazet, L.: Characterizations of Lojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Am. Math. Soc. 362(6), 3319–3363 (2010)
7. Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings, 2nd edn. Springer, Dordrecht (2014)
8. Izuchukwu, C., Ugwunnadi, G.C., Mewomo, O.T., Khan, A.R., Abbas, M.: Proximal-type algorithms for split minimization problem in p-uniformly convex metric spaces. Numer. Algorithms 82(3), 909–935 (2019)
9. Kuwae, K.: Jensen’s inequality on convex spaces. Calc. Var. Partial Differ. Equ. 49(3–4), 1359–1378 (2014)
10. Luke, D.R., Teboulle, M., Thao, N.H.: Necessary conditions for linear convergence of iterated expansive, set-valued mappings. Math. Program. 180, 1–31 (2018)
11. Luke, D.R., Thao, N.H., Tam, M.K.: Quantitative convergence analysis of iterated expansive, set-valued mappings. Math. Oper. Res. 43(4), 1143–1176 (2018)
12. Naor, A., Silberman, L.: Poincaré inequalities, embeddings, and wild groups. Compos. Math. 147(5), 1546–1572 (2011)
13. Ohta, S.: Convexities of metric spaces. Geom. Dedic. 125, 225–250 (2007)
Acknowledgements
Not applicable.
Funding
FL was supported in part by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID LU 1702/11. DRL was supported in part by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID LU 1702/11, and in part by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID 432680300, SFB 1456. Open Access funding enabled and organized by Projekt DEAL.
Contributions
The authors contributed equally to all results. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lauster, F., Luke, D.R. Convergence of proximal splitting algorithms in \(\operatorname{CAT}(\kappa)\) spaces and beyond. Fixed Point Theory Algorithms Sci Eng 2021, 13 (2021). https://doi.org/10.1186/s13663-021-00698-0