Asymptotic behavior of Newton-like inertial dynamics involving the sum of potential and nonpotential terms
Fixed Point Theory and Algorithms for Sciences and Engineering volume 2021, Article number: 17 (2021)
Abstract
In a Hilbert space \(\mathcal{H}\), we study a dynamic inertial Newton method which aims to solve additively structured monotone equations involving the sum of potential and nonpotential terms. Precisely, we are looking for the zeros of an operator \(A= \nabla f +B \), where ∇f is the gradient of a continuously differentiable convex function f and B is a nonpotential monotone and cocoercive operator. Besides a viscous friction term, the dynamic involves geometric damping terms which are controlled respectively by the Hessian of the potential f and by a Newton-type correction term attached to B. Based on a fixed point argument, we show the well-posedness of the Cauchy problem. Then we show the weak convergence as \(t\to +\infty \) of the generated trajectories towards the zeros of \(\nabla f +B\). The convergence analysis is based on the appropriate setting of the viscous and geometric damping parameters. The introduction of these geometric dampings makes it possible to control and attenuate the known oscillations for the viscous damping of inertial methods. Rewriting the second-order evolution equation as a first-order dynamical system enables us to extend the convergence analysis to nonsmooth convex potentials. These results open the door to the design of new first-order accelerated algorithms in optimization taking into account the specific properties of potential and nonpotential terms. The proofs and techniques are original and differ from the classical ones due to the presence of the nonpotential term.
1 Introduction and preliminary results
Let \(\mathcal{H}\) be a real Hilbert space endowed with the scalar product \(\langle \cdot ,\cdot \rangle \) and the associated norm \(\Vert \cdot \Vert \). Many situations coming from physics, biology, and the human sciences involve equations containing both potential and nonpotential terms. In the human sciences, this comes from the presence of both cooperative and noncooperative aspects. In physics, this comes from the joint presence of terms of diffusion and convection. To describe such situations we will focus on solving additively structured monotone equations of the type

$$ \text{find } x \in \mathcal{H} \text{ such that } \nabla f(x) + B(x) = 0. \quad (1.1) $$
In the above equation, ∇f is the gradient of a convex continuously differentiable function \(f: \mathcal{H}\to \mathbb{R}\) (this is the potential part), and \(B: \mathcal{H}\to \mathcal{H}\) is a nonpotential operator which is supposed to be monotone and cocoercive. To solve (1.1), we will consider continuous inertial dynamics whose solution trajectories converge as \(t \to +\infty \) to solutions of (1.1). Our study is part of the active research stream that studies the close relationship between continuous dissipative dynamical systems and optimization algorithms which are obtained by their temporal discretization. To avoid lengthening the paper, we limit our study to the analysis of the continuous dynamic. The analysis of the algorithmic part and its link with first-order numerical optimization will be carried out in a second companion paper. From this perspective, damped inertial dynamics offer a natural way to accelerate these systems. As the main feature of our study, we will introduce dynamic geometric dampings which are respectively driven by the Hessian for the potential part and by the corresponding Newton term for the nonpotential part. In addition to improving the convergence rate, this will considerably reduce the oscillatory behavior of the trajectories. We will pay particular attention to the minimal assumptions which guarantee convergence of the trajectories, and which highlight the asymmetric role played by the two operators involved in the dynamic. We will see that many results can be extended to the case where \(f: \mathcal{H}\to \mathbb{R}\cup \{+\infty \}\) is a convex lower semicontinuous proper function, which makes it possible to broaden the field of applications.
1.1 Dynamical inertial Newton method for additively structured monotone problems
For \(t\geq t_{0}\), let us introduce the following second-order differential equation which will form the basis of our analysis:

$$ \ddot{x}(t) + \gamma \dot{x}(t) + \beta _{f} \nabla ^{2} f\bigl(x(t)\bigr)\dot{x}(t) + \beta _{b} B'\bigl(x(t)\bigr)\dot{x}(t) + \nabla f\bigl(x(t)\bigr) + B\bigl(x(t)\bigr) = 0. \quad (\mathrm{DINAM}) $$
We use (DINAM) as an abbreviation for dynamical inertial Newton method for additively structured monotone problems. We call \(t_{0}\in \mathbb{R}\) the origin of time. Since we are considering autonomous systems, we can take any arbitrary real number for \(t_{0}\). For simplicity, we set \(t_{0}=0\). When considering the corresponding Cauchy problem, we add the initial conditions: \(x(0)=x_{0}\in \mathcal{H}\) and \(\dot{x}(0)=x_{1}\in \mathcal{H}\). The term \(B'(x(t))\dot{x}(t)\) is interpreted as \(\frac{d}{dt} (B(x(t)) )\) taken in the distribution sense. Likewise the term \(\nabla ^{2} f(x(t)) \dot{x}(t)\) is interpreted as \(\frac{d}{dt} ( \nabla f(x(t)) )\) taken also in the distribution sense. Because of the assumptions made below, these terms are indeed measurable functions which are bounded on the bounded time intervals. So, we will consider strong solutions of the above equation (DINAM).
Throughout the paper we make the following standing assumptions:
We emphasize the fact that we do not assume the gradient of f to be globally Lipschitz continuous. Developing our analysis without using any bound on the gradient of f is a key to further extend the theory to the nonsmooth case. As a specific property, the inertial system (DINAM) combines two different types of driving forces associated respectively with the potential operator ∇f and the nonpotential operator B. It also involves three different types of friction:

(a) The term \(\gamma \dot{x}(t)\) models viscous damping with a positive coefficient \(\gamma >0\).

(b) The term \(\beta _{f} \nabla ^{2} f(x(t)) \dot{x}(t)\) is the so-called Hessian-driven damping, which makes it possible to attenuate the oscillations that naturally occur with inertial gradient dynamics.

(c) The term \(\beta _{b} B'(x(t))\dot{x}(t) \) is the nonpotential version of the Hessian-driven damping. It can be interpreted as a Newton-type correction term.
Note that each driving force term enters (DINAM) with its temporal derivative. In fact, we have

$$ \beta _{f} \nabla ^{2} f\bigl(x(t)\bigr)\dot{x}(t) + \beta _{b} B'\bigl(x(t)\bigr)\dot{x}(t) = \frac{d}{dt} \bigl( \beta _{f}\nabla f\bigl(x(t)\bigr) + \beta _{b} B\bigl(x(t)\bigr) \bigr). $$
This is a crucial observation which makes (DINAM) equivalent to a first-order system in time and space, and makes the corresponding Cauchy problem well posed. This will be proved later (see Sect. 2.1 for more details). The cocoercivity assumption on the operator B plays an important role in the analysis of (DINAM), not only to ensure the existence of solutions, but also to analyze their asymptotic behavior as time \(t\to +\infty \).
Recall that the operator \(B: \mathcal{H}\to \mathcal{H}\) is said to be λ-cocoercive for some \(\lambda > 0\) if

$$ \bigl\langle B(x)-B(y), x-y \bigr\rangle \geq \lambda \bigl\Vert B(x)-B(y) \bigr\Vert ^{2} \quad \text{for all } x, y \in \mathcal{H}. $$
Note that B being λ-cocoercive is equivalent to \(B^{-1}\) being λ-strongly monotone; that is, cocoercivity is the notion dual to strong monotonicity. It is easy to check that a λ-cocoercive operator B is \(1/\lambda \)-Lipschitz continuous. The reverse implication holds true when the operator is the gradient of a convex differentiable function. Indeed, according to the Baillon–Haddad theorem [17], if ∇f is L-Lipschitz continuous, then ∇f is a \(1/L\)-cocoercive operator (we refer to [18, Corollary 18.16] for more details).
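The Baillon–Haddad implication can be illustrated numerically. The sketch below is our own illustration (the quadratic f, the names `Q`, `grad_f`, and all parameter values are ours, not from the paper): it builds a convex quadratic whose gradient is L-Lipschitz with \(L=\lambda _{\max }(Q)\) and checks the \(1/L\)-cocoercivity inequality on random samples.

```python
import numpy as np

# Numerical illustration (ours) of the Baillon-Haddad theorem: for a convex
# quadratic f(x) = 0.5 x^T Q x, the gradient x -> Q x is L-Lipschitz with
# L = lambda_max(Q), and should therefore be (1/L)-cocoercive.
rng = np.random.default_rng(0)
n = 5
C = rng.standard_normal((n, n))
Q = C.T @ C                          # symmetric positive semidefinite
L = np.linalg.eigvalsh(Q).max()      # Lipschitz constant of grad f

def grad_f(x):
    return Q @ x

for _ in range(1000):
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    g = grad_f(x) - grad_f(y)
    # <grad f(x) - grad f(y), x - y> >= (1/L) ||grad f(x) - grad f(y)||^2
    assert np.dot(g, x - y) >= np.dot(g, g) / L - 1e-10
print("(1/L)-cocoercivity verified on random samples")
```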
1.2 Historical aspects of the inertial systems with Hessian-driven damping
The following inertial system with Hessian-driven damping

$$ \ddot{x}(t) + \gamma \dot{x}(t) + \beta \nabla ^{2} f\bigl(x(t)\bigr)\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0 $$
was first considered by Alvarez, Attouch, Bolte, and Redont in [6]. Then, following the continuous interpretation by Su, Boyd, and Candès [28] of the accelerated gradient method of Nesterov, Attouch, Peypouquet, and Redont [14] replaced the fixed viscous damping parameter γ with an asymptotically vanishing damping parameter \(\frac{\alpha }{t}\), with \(\alpha >0\). At first glance, the presence of the Hessian may seem to entail numerical difficulties. However, this is not the case, as the Hessian intervenes in the above ODE in the form \(\nabla ^{2} f (x(t)) \dot{x} (t)\), which is nothing but the derivative with respect to time of \(\nabla f (x(t))\). So, the temporal discretization of these dynamics provides first-order algorithms of the form
As a specific feature, and by comparison with the classical accelerated gradient methods, these algorithms contain a correction term which is equal to the difference of the gradients at two consecutive steps. While preserving the convergence properties of the accelerated gradient method, they provide fast convergence to zero of the gradients and reduce the oscillatory aspects. Several recent studies have been devoted to this subject, see Attouch, Chbani, Fadili, and Riahi [7], Boţ, Csetnek, and László [20], Kim [24], Lin and Jordan [25], Shi, Du, Jordan, and Su [27], and Alecsa, László, and Pința [4] for an implicit version of the Hessian-driven damping. Application to deep learning has been recently developed by Castera, Bolte, Févotte, and Pauwels [23]. In [3], Adly and Attouch studied the finite convergence of proximal-gradient inertial algorithms combining dry friction with Hessian-driven damping.
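To make the gradient-correction idea concrete, here is a minimal sketch (ours, with illustrative step sizes; it is not the exact algorithm of any cited paper) of an inertial gradient iteration in which the difference of gradients at two consecutive iterates plays the role of the discretized Hessian-driven damping:

```python
import numpy as np

# A minimal sketch (ours; illustrative step sizes, not a cited algorithm)
# of an inertial gradient method with a gradient-correction term: the
# difference grad_f(x_k) - grad_f(x_{k-1}) acts as a discretized
# Hessian-driven damping.
def inertial_gradient_correction(grad_f, x0, s=0.03, alpha=0.8, beta=0.02,
                                 iters=300):
    x_prev, x = x0.copy(), x0.copy()
    g_prev = grad_f(x0)
    for _ in range(iters):
        g = grad_f(x)
        x_next = x + alpha * (x - x_prev) - s * g - beta * (g - g_prev)
        x_prev, x, g_prev = x, x_next, g
    return x

# Ill-conditioned quadratic f(x) = 0.5 x^T Q x; the correction term damps
# the oscillations of the plain heavy-ball iteration.
Q = np.diag([1.0, 25.0])
x_star = inertial_gradient_correction(lambda x: Q @ x, np.array([1.0, 1.0]))
print(np.linalg.norm(x_star))  # close to 0
```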
1.3 Inertial dynamics involving cocoercive operators
Let us come to the transposition of these techniques to the case of maximally monotone operators. Álvarez and Attouch [5] and Attouch and Maingé [10] studied the equation

$$ \ddot{x}(t) + \gamma \dot{x}(t) + A\bigl(x(t)\bigr) = 0 \quad (1.2) $$
when \(A:\mathcal{H}\to \mathcal{H}\) is a cocoercive (and hence maximally monotone) operator (see also [19]). The cocoercivity assumption plays an important role in the study of (1.2), not only to ensure the existence of solutions, but also to analyze their long-term behavior. Assuming that the cocoercivity parameter λ and the damping coefficient γ satisfy the inequality \(\lambda \gamma ^{2} >1\), Attouch and Maingé [10] showed that each trajectory of (1.2) converges weakly to a zero of A, i.e., \(x(t)\rightharpoonup x_{\infty }\in A^{-1}(0)\) as \(t\to +\infty \). Moreover, the condition \(\lambda \gamma ^{2} >1\) is sharp.
For general maximally monotone operators, this property has been further exploited by Attouch and Peypouquet [13] and by Attouch and László [8, 9]. The key property is that, for \(\lambda >0\), the Yosida approximation \(A_{\lambda }\) of A is λ-cocoercive and \(A_{\lambda }^{-1} (0) = A^{-1}(0)\). So the idea is to replace the operator A with its Yosida approximation and adjust the Yosida regularization parameter. Another related work has been done by Attouch and Maingé [10], who first considered the asymptotic behavior of the second-order dissipative evolution equation, with \(f:\mathcal{H}\to \mathbb{R}\) convex and \(B:\mathcal{H}\to \mathcal{H}\) cocoercive,

$$ \ddot{x}(t) + \gamma \dot{x}(t) + \nabla f\bigl(x(t)\bigr) + B\bigl(x(t)\bigr) = 0, \quad (1.3) $$
combining potential with nonpotential effects. Our study will therefore consist initially in introducing the Hessian term and the Newtontype correcting term into this dynamic.
1.4 Link with Newton-like methods for solving monotone inclusions
Let us specify the link between our study and Newton's method for solving (1.1). To overcome the ill-posed character of the continuous Newton method for a general maximally monotone operator A, the following first-order evolution system was studied by Attouch and Svaiter [16]:
This system can be considered as a continuous version of the Levenberg–Marquardt method, which acts as a regularization of the Newton method. Remarkably, under a fairly general assumption on the regularization parameter \(\gamma (t)\), this system is well posed and generates trajectories that converge weakly to equilibria (zeroes of A). Parallel results have been obtained for the associated proximal algorithms obtained by implicit temporal discretization, see [2, 12, 15]. Formally, this system is written as
Thus (DINAM) can be considered as an inertial version of this dynamical system for the structured monotone operator \(A= \nabla f + B\). Our study is also linked to the recent works by Attouch and László [8, 9] who considered the general case of monotone equations. By contrast with [8, 9], owing to the cocoercivity of B, we do not use the Yosida regularization and exhibit minimal assumptions involving only the nonpotential component.
1.5 Contents
The paper is organized as follows. Section 1 introduces (DINAM) with some historical perspective. In Sect. 2, based on the first-order equivalent formulation of (DINAM), we show that the Cauchy problem is well posed (in the sense of existence and uniqueness of solutions). In Sect. 3, we analyze the asymptotic convergence properties of the trajectories generated by (DINAM). Using appropriate Lyapunov functions, we show that any trajectory of (DINAM) converges weakly as \(t\to +\infty \), and that its limit belongs to \(S=(\nabla f +B)^{-1}(0)\). The interplay between the damping parameters \(\beta _{f}\), \(\beta _{b}\), γ and the cocoercivity parameter λ will play an important role in our Lyapunov analysis. In Sect. 4, we perform numerical experiments showing that the well-known oscillations in the case of the heavy ball with friction are damped by the introduction of the geometric (Hessian-like) damping terms. An application to the LASSO problem with a nonpotential operator, as well as a coupled system in dynamical games, are considered. Section 5 deals with the extension of the study to the nonsmooth convex case. Section 6 contains some concluding remarks and perspectives.
2 Well-posedness of the Cauchy–Lipschitz problem
We first show the existence and the uniqueness of the solution trajectory for the Cauchy problem associated with (DINAM) for any given initial condition data \((x_{0},x_{1})\in \mathcal{H}\times \mathcal{H}\).
2.1 First-order in time and space equivalent formulation
The following first-order equivalent formulation of (DINAM) was first considered by Alvarez, Attouch, Bolte, and Redont [6] and Attouch, Peypouquet, and Redont [14] in the framework of convex minimization. Specifically, in our context, we have the following equivalence, which follows from a simple differential and algebraic calculation.
Proposition 2.1
Suppose that \(\beta _{f} >0\). Then the following problems are equivalent: \((\mathrm{i}) \Longleftrightarrow (\mathrm{ii})\)
Proof
\((\mathrm{i})\Longrightarrow (\mathrm{ii})\). For \(t\ge 0\), set
which gives the first equation of (ii). By differentiating \(y(\cdot )\) and using (i), we get
By combining (2.1) and (2.2), we obtain
This gives the second equation of (ii).
\((\mathrm{ii})\Longrightarrow (\mathrm{i})\). By differentiating the first equation of (ii), we obtain
Let us eliminate y from this equation to obtain an equation involving only x. For this, we successively use the second equation in (ii), then the first equation in (ii) to obtain
Therefore,
From (2.4) and (2.5), we obtain (i). □
2.2 Well-posedness of the evolution equation (DINAM)
In the following theorem, we show the well-posedness of the Cauchy problem for the evolution equation (DINAM).
Theorem 2.1
Suppose that \(\beta _{f}>0 \) and \(\beta _{b}\geq 0\). Then, for any \((x_{0}, x_{1}) \in \mathcal{H}\times \mathcal{H}\), there exists a unique strong global solution \(x:[0, +\infty [ \, \to \mathcal{H}\) of the continuous dynamic (DINAM) which satisfies the Cauchy data \(x(0) =x_{0}\), \(\dot{x}(0) =x_{1}\).
Proof
System (ii) in Proposition 2.1 can be written equivalently as
where \(Z(t) = (x(t), y(t)) \in \mathcal{H}\times \mathcal{H}\) and
Therefore, \(F = \nabla \Phi + G\), where \(\Phi : \mathcal{H}\times \mathcal{H}\to \mathbb{R}\) is the convex differentiable function
and \(G: \mathcal{H}\times \mathcal{H}\to \mathcal{H}\times \mathcal{H}\)
is a Lipschitz continuous map. Indeed, the Lipschitz continuity of G is a direct consequence of the Lipschitz continuity of B. The existence of a classical solution to
follows from Brézis [21, Proposition 3.12]. In fact, the proof of this result relies on a fixed point argument. It consists in finding a fixed point of the mapping \(u \in \mathcal{C} ([0,T], \mathcal{H}) \mapsto K(u) \in \mathcal{C} ([0,T], \mathcal{H})\), where \(K(u)=w\) is the solution of
It is proved that the sequence of iterates \((w_{n})\) generated by the corresponding Picard iteration
converges uniformly on \([0,T]\) to a fixed point of K. When returning to (DINAM), that is, equation (i) of Proposition 2.1, we recover a strong solution. Precisely, ẋ is Lipschitz continuous on the bounded time intervals, and ẍ taken in the distribution sense is locally essentially bounded. □
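The Picard iteration at the heart of this fixed-point argument can be illustrated on a toy scalar ODE. The discretization below, the test equation \(\dot{z} = -z\), and all names are our own illustration, not part of the proof:

```python
import numpy as np

# Picard iteration w_{n+1}(t) = z0 + \int_0^t F(w_n(s)) ds underlying the
# fixed-point argument, discretized on a grid (our toy illustration on the
# scalar test equation z' = -z).
def picard(F, z0, T=1.0, n_grid=1000, n_iter=30):
    t = np.linspace(0.0, T, n_grid)
    w = np.full(n_grid, z0, dtype=float)          # w_0: constant guess
    for _ in range(n_iter):
        integrand = F(w)
        # cumulative trapezoidal integral of F(w_n) from 0 to t
        integral = np.concatenate(([0.0], np.cumsum(
            0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))))
        w = z0 + integral
    return t, w

t, w = picard(lambda z: -z, z0=1.0)
# the iterates converge uniformly to the exact solution exp(-t)
print(np.max(np.abs(w - np.exp(-t))))
```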
Remark 2.1
Note that when ∇f is supposed to be globally Lipschitz continuous, the above proof can be notably simplified by just applying the classical Cauchy–Lipschitz theorem.
3 Asymptotic convergence properties of (DINAM)
In this section, we study the asymptotic behavior of the solution trajectories of (DINAM). For each solution trajectory \(t\mapsto x(t)\) of (DINAM), we show that the weak limit \(\mathrm{w}\text{-}\lim_{t\to +\infty }x(t)=x_{\infty }\) exists and satisfies \(x_{\infty }\in S\), where

$$ S := (\nabla f + B)^{-1}(0). $$
Before stating our main result, notice that \(B(p)\) is uniquely defined for \(p\in S\).
Lemma 3.1
\(B(p)\) is uniquely defined for \(p\in S\), i.e., \(B(p_{1}) = B(p_{2})\) for all \(p_{1}, p_{2} \in S\).
Proof
Since \(p_{1}\in S\), \(p_{2} \in S \), we have

$$ \nabla f (p_{1}) + B(p_{1}) = 0 \quad \text{and} \quad \nabla f (p_{2}) + B(p_{2}) = 0. $$

By the monotonicity of ∇f, we have

$$ \bigl\langle \nabla f (p_{1}) - \nabla f (p_{2}), p_{1}-p_{2} \bigr\rangle \geq 0. $$

Replacing \(\nabla f (p_{1})\) with \(-B(p_{1})\) and \(\nabla f (p_{2})\) with \(-B(p_{2})\), we get

$$ \bigl\langle B(p_{1}) - B(p_{2}), p_{1}-p_{2} \bigr\rangle \leq 0, $$

which by cocoercivity of B gives \(\lambda \Vert B(p_{2})-B(p_{1})\Vert ^{2} \leq 0\), and hence \(B(p_{2})=B(p_{1})\). □
3.1 General case
The general line of the proof is close to that given by Attouch and László in [8, 9]. The first major difference with the approach developed in [8, 9] is that in our context, thanks to the hypothesis of cocoercivity on the nonpotential part, we do not need to go through the Yosida regularization of the operators. The second difference is that we treat the potential and nonpotential operators in a differentiated way. These points are crucial for applications to numerical algorithms, because the computation of the Yosida regularization of the sum of the two operators is often out of reach numerically.
The following theorem states the asymptotic convergence properties of (DINAM).
Theorem 3.1
Let \(B: \mathcal{H} \to \mathcal{H}\) be a λ-cocoercive operator and \(f: \mathcal{H}\to \mathbb{R}\) be a \(\mathcal{C}^{1}\) convex function whose gradient is Lipschitz continuous on the bounded sets. Suppose that \(S= (\nabla f +B)^{-1} (0)\neq \emptyset \), and that the parameters involved in the evolution equation (DINAM) satisfy the following conditions: \(\beta _{f} >0\) and
Then, for any solution trajectory \(x:[0,+\infty [\, \to \mathcal{H}\) of (DINAM), the following properties are satisfied:

(i) (convergence) \(x(t)\) converges weakly, as \(t\to +\infty \), to an element of S.

(ii) (integral estimates) Set \(A:=B+\nabla f\) and \(p\in S\). Then

$$\begin{aligned}& \int _{0}^{+\infty } \bigl\Vert \dot{x}(t) \bigr\Vert ^{2}\,dt< +\infty ,\qquad \int _{0}^{+ \infty } \bigl\Vert \ddot{x}(t) \bigr\Vert ^{2}\,dt< +\infty , \\& \int _{0}^{+\infty } \bigl\Vert B\bigl(x(t) \bigr)-B(p) \bigr\Vert ^{2}\,dt< +\infty ,\qquad \int _{0}^{+ \infty } \biggl\Vert \frac{d}{dt}B\bigl(x(t)\bigr) \biggr\Vert ^{2}\,dt< +\infty , \\& \int _{0}^{+\infty } \bigl\Vert A \bigl(x(t)\bigr) \bigr\Vert ^{2}\,dt< +\infty , \quad \textit{and}\quad \int _{0}^{+ \infty } \biggl\Vert \frac{d}{dt}A \bigl(x(t)\bigr) \biggr\Vert ^{2}\,dt< +\infty . \end{aligned}$$

(iii) (pointwise estimates)

$$ \lim_{t\to +\infty } \bigl\Vert \dot{x}(t) \bigr\Vert =0,\qquad \lim _{t\to +\infty } \bigl\Vert B\bigl(x(t)\bigr)-B(p) \bigr\Vert =0,\qquad \lim_{t\to +\infty } \bigl\Vert A\bigl(x(t)\bigr) \bigr\Vert =0, $$where \(B(p)\) is uniquely defined for \(p\in S\).
Proof
Lyapunov analysis. Set \(A:=B+\nabla f\) and \(A_{\beta }:=\beta _{b}B+\beta _{f}\nabla f\). Take \(p\in S\). Consider the function \(t\in [0, +\infty [\, \mapsto \mathcal{V}_{p}(t) \in \mathbb{R}_{+}\) defined by
where c and δ are coefficients to adjust. Using the differentiation chain rule for absolutely continuous functions (see [22, Corollary VIII.10]) and (DINAM), we get
Setting \(\delta :=c\gamma -1>0\), from (3.3) we obtain
We have
Using the fact that \(p\in S \), ∇f is monotone, and B is λcocoercive, we have
From (3.4)–(3.6), we deduce that
Let \(\Gamma : [0,+\infty [ \, \to \mathbb{R}\) be the function defined by
and \(\mathcal{{E}}_{p}: [0,+\infty [ \, \to \mathbb{R}\) be the energy function given by
Since f is convex, we have \(\Gamma (t) \ge 0\) for all \(t\ge 0\). This implies \(\mathcal{{E}}_{p}(t)\ge 0\) for all \(t\ge 0\) as well.
We have
By using (3.8) and (3.9), equation (3.7) can be rewritten as
Let us eliminate the term \(\nabla f(x(t))\nabla f(p)\) from this relation by using the elementary algebraic inequality
We obtain
Equivalently
where
Set \(X(t)= \dot{x}(t)\) and \(Y(t) = B(x(t))-B(p)\). We have \(\mathcal{S}(t)= q(X(t),Y(t)) \), where \(q: \mathcal{H}\times \mathcal{H}\to \mathbb{R}\) is the quadratic form

$$ q(X, Y) = a \Vert X \Vert ^{2} + b \langle X, Y \rangle + g \Vert Y \Vert ^{2}, $$

with \(a= \delta \), \(b= \delta \beta _{b}+c\), and \(g= c\beta _{b}+\lambda - \frac{ c(\beta _{b}+\beta _{f})^{2}}{4\beta _{f}} = \lambda - \frac{ c(\beta _{b}-\beta _{f})^{2}}{4\beta _{f}}\).
According to Lemma A.3, and since \(a=\delta = c\gamma -1 >0\), we have that q is positive definite if and only if \(4ag - b^{2} > 0\). Equivalently

Our aim is to find c such that \(c\gamma -1 >0\) and such that (3.12) is satisfied. Take \(\delta :=c\gamma -1>0\) as a new variable. Equivalently, we must find \(\delta >0\) such that
After development and simplification we obtain
Therefore, we just need to assume that
Elementary optimization argument gives that
Therefore we end up with the condition
Equivalently
When \(\beta _{b}=\beta _{f}=\beta \), we recover the condition

$$ \lambda \gamma ^{2} > 1 + \beta \gamma . $$
Note that \(c\gamma =1+\delta \) and \(\delta >0\) imply \(c>0\). According to (3.11), \(\mathcal{S}(t)= q(X(t),Y(t)) \), and since q is positive definite, we deduce that there exist positive real numbers c, μ such that
Estimates. Let us start from (3.14) that we integrate on \([0,t]\), \(t\ge 0\). We obtain
From (3.15) and the definition of \(\mathcal{E}_{p}\), we immediately deduce
Let us return to (3.10). We recall that
After integration on \([0,t]\), and by using the integral estimates \(\int _{0}^{+\infty }\Vert \dot{x}(t)\Vert ^{2}\,dt<+\infty \) and \(\int _{0}^{+\infty } \Vert B(x(t))-B(p)\Vert ^{2} \,dt<+\infty \) obtained in (3.18) and (3.19), we get the existence of a constant \(C>0\) such that
Therefore, for any \(\epsilon >0\), we have
By taking \(\epsilon >0\) such that \(\beta _{f}> \epsilon (\beta _{b}+\beta _{f}) \), which is always possible since \(\beta _{f} >0\), we conclude
Combining this with \(\int _{0}^{+\infty } \Vert B(x(t))-B(p)\Vert ^{2} \,dt<+\infty \), it follows immediately that
Moreover, we also have
According to (3.16), the trajectory \(x(\cdot )\) is bounded. Set \(R:= \sup_{t\geq 0} \Vert x(t)\Vert \). By assumption, ∇f is Lipschitz continuous on the bounded sets. Let \(L_{R} <+\infty \) be the Lipschitz constant of ∇f on \(B(0,R)\). Since B is λ-cocoercive, it is \(\frac{1}{\lambda }\)-Lipschitz continuous. Therefore A is L-Lipschitz continuous on the trajectory with \(L:=L_{R} + \frac{1}{\lambda } \). Therefore
Using (3.21) and (3.23), we deduce that \(u(t):= \Vert A(x(t))\Vert \) satisfies the condition of Lemma A.2 (with \(p=2\) and \(r=2\)). Therefore,
Likewise, according to (3.22), we have
By using the same argument as in (3.23), we obtain that \(\frac{d}{dt}A_{\beta }(x(t))\) is bounded. From (3.23) we also get that
Similarly, we also have
By using (DINAM), we have
Since the second member of the above equality belongs to \(L^{2} (0, +\infty ; \mathcal{H})\), we finally get
Combining this property with (3.18) and using Lemma A.2, we deduce that
The limit. To prove the existence of the weak limit of \(x(t)\), we use Opial's lemma (see [26] for more details). Given \(p\in S \), let us consider the anchor function defined by, for every \(t \in [0,+\infty [\),

$$ q_{p}(t) := \frac{1}{2} \bigl\Vert x(t)-p \bigr\Vert ^{2}. $$
From \(\dot{q}_{p}(t)=\langle \dot{x}(t), x(t)-p\rangle \) and \(\ddot{q}_{p}(t)=\Vert \dot{x}(t)\Vert ^{2}+ \langle \ddot{x}(t), x(t)-p \rangle \), we obtain
Equivalently,
According to the differentiation formula for a product, we can rewrite (3.27) as follows:
By the Cauchy–Schwarz inequality, we get
Then note that the second member of (3.28)
is nonnegative and belongs to \(L^{1} (0,+\infty )\). Indeed, we have
Using (3.18) and (3.22), we deduce that
Note that the left member of (3.28) can be rewritten as a derivative of a function, precisely
with
So we have
Let us prove that the function h given in (3.29) is bounded from below by some constant. Indeed, since the terms \(q_{p}(t)\) and \(\langle A_{\beta }(x(t))-A_{\beta }(p), x(t)-p \rangle \) are nonnegative, we have
According to the boundedness of \(x(\cdot )\) and \(\dot{x}(\cdot )\) (see (3.16) and (3.26)), we deduce that there exists \(m\in \mathbb{R}\) such that
Let us introduce the real-valued function \(\varphi : \mathbb{R}_{+}\to \mathbb{R}\), \(t\mapsto \varphi (t)\) defined by

$$ \varphi (t) := h(t) - \int _{0}^{t} g(s)\,ds. $$

We have \(\varphi '(t)=\dot{h}(t)-g(t)\leq 0\). Hence, the function φ is nonincreasing on \([0,+\infty [\). Since φ is also bounded from below, the limit of φ exists as \(t\to +\infty \). Since \(g\in L^{1}(0,+\infty )\), we deduce that \(\lim_{t\to +\infty } h(t)\) exists.
Using the fact that \(\langle A_{\beta }(x(t))-A_{\beta }(p), x(t)-p \rangle \) tends to zero as \(t\to +\infty \) (a consequence of (3.25) and \(x(\cdot )\) bounded), we obtain
where the limit of \(\theta (t)\) exists as \(t\to +\infty \). The existence of the limit of \(q_{p}\) then follows from a classical general result concerning the convergence of evolution equations governed by strongly monotone operators (here \(\gamma \operatorname{Id}\); see Theorem 3.9, p. 88 in [21]). This means that, for all \(p\in S\),
To complete the proof via Opial's lemma, we need to show that every weak sequential cluster point of \(x(t)\) belongs to S. Let \(t_{n} \to +\infty \) be such that \(x(t_{n}) \rightharpoonup x^{*}\) as \(n\to +\infty \). By the pointwise estimate (iii), we also have \(A(x(t_{n})) \to 0\) strongly.
From the closedness property of the graph of the maximally monotone operator A in \(w\text{-}\mathcal{H}\times s\text{-}\mathcal{H}\), we deduce that \(A(x^{*})=0\), that is, \(x^{*} \in S\).
Consequently, \(x(t)\) converges weakly to an element of S as t goes to +∞. The proof of Theorem 3.1 is thereby completed. □
Remark 3.1
In the statement of Theorem 3.1, the parameters have to satisfy a certain compatibility condition. If the other parameters are fixed, then the set of values of λ that fulfill the inequality can easily be found. Likewise, the feasible set of values of γ, when the other parameters are fixed, can be determined explicitly.
In fact, let us rewrite condition (3.1) as follows:
Equivalently,
Thanks to
we immediately deduce that
Therefore (3.30) is equivalent to
This in turn is equivalent to
From the first inequality of (3.31), we deduce that

From the second inequality of (3.31), we deduce that
Therefore,
Since (3.33) implies (3.32), we obtain that the feasible set of γs is defined by
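The feasibility discussion above can also be explored numerically. The sketch below is ours: it reconstructs the quantities a, b, g from the Lyapunov analysis (under our reading of the proof) and scans over δ > 0; as a sanity check, when \(\beta _{b}=\beta _{f}=\beta \) one can verify by hand that feasibility reduces to \(\lambda \gamma ^{2} > 1+\beta \gamma \), and the scan is tested against this closed form.

```python
import numpy as np

# Numerical scan (ours) for the feasibility of the Lyapunov condition:
# we look for delta > 0 (with c = (1 + delta)/gamma) making the quadratic
# form q positive definite, i.e. 4*a*g - b**2 > 0 with, as in the proof,
#   a = delta,  b = delta*beta_b + c,
#   g = lam - c*(beta_b - beta_f)**2 / (4*beta_f).
def feasible(lam, gamma, beta_b, beta_f,
             deltas=np.linspace(1e-4, 50.0, 200000)):
    c = (1.0 + deltas) / gamma
    a, b = deltas, deltas * beta_b + c
    g = lam - c * (beta_b - beta_f) ** 2 / (4.0 * beta_f)
    return bool(np.any(4.0 * a * g - b ** 2 > 0.0))

# Sanity check: for beta_b = beta_f = beta, optimizing over delta by hand
# shows that feasibility reduces to lam * gamma**2 > 1 + beta * gamma.
lam, beta = 2.0, 0.5
for gamma in [0.5, 0.8, 1.0, 1.5, 3.0]:
    assert feasible(lam, gamma, beta, beta) == (lam * gamma**2 > 1 + beta * gamma)
print("scan agrees with the closed-form condition")
```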
3.2 Case \(\beta _{b} =\beta _{f}\)
Let us specialize the previous results in the case \(\beta _{b}=\beta _{f}\). We set \(\beta _{b}= \beta _{f}:=\beta > 0\) and \(A:= \nabla f + B\). We thus consider the evolution system
The existence of strong global solutions to this system is guaranteed by Theorem 2.1. The convergence properties as \(t \to +\infty \) of the solution trajectories generated by this system are a consequence of Theorem 3.1 and are given below.
Corollary 3.1
Let \(B: \mathcal{H} \to \mathcal{H}\) be a λ-cocoercive operator and \(f: \mathcal{H}\to \mathbb{R}\) be a \(\mathcal{C}^{1}\) convex function whose gradient is Lipschitz continuous on the bounded sets. Suppose that the solution set \(S= (\nabla f+B)^{-1} (0)\neq \emptyset \). Consider the evolution equation (DINAM), where \(A= \nabla f +B\), \(\beta _{b}=\beta _{f}:= \beta > 0\), and where the involved parameters satisfy the condition

$$ \lambda \gamma ^{2} > 1 + \beta \gamma . \quad (3.34) $$
Then, for any solution trajectory \(x:[0,+\infty [\,\to \mathcal{H}\) of (DINAM), the following properties are satisfied:

(i) (convergence) The trajectory \(x(\cdot )\) is bounded and \(x(t)\) converges weakly, as \(t\to +\infty \), to an element \(x^{*}\in S\).

(ii) (integral estimate)

$$\begin{aligned}& \int _{0}^{+\infty } \bigl\Vert \dot{x}(t) \bigr\Vert ^{2}\,dt< +\infty ,\qquad \int _{0}^{+ \infty } \bigl\Vert \ddot{x}(t) \bigr\Vert ^{2}\,dt< +\infty , \\& \int _{0}^{+\infty } \bigl\Vert A\bigl(x(t)\bigr) \bigr\Vert ^{2}\,dt< +\infty ,\quad \textit{and}\quad \int _{0}^{+ \infty } \biggl\Vert \frac{d}{dt}A\bigl(x(t)\bigr) \biggr\Vert ^{2}\,dt< +\infty . \end{aligned}$$

(iii) (pointwise estimate)

$$ \lim_{t\to +\infty } \bigl\Vert \dot{x}(t) \bigr\Vert =0, \quad \textit{and}\quad \lim_{t\to + \infty } \bigl\Vert A\bigl(x(t)\bigr) \bigr\Vert =0. $$
Remark 3.2
It is worth stating the result of Corollary 3.1 separately because it covers an important case. It also makes it possible to compare this result with the existing literature on second-order dissipative evolution systems involving cocoercive operators. Indeed, letting β go to zero in (3.34) gives the condition

$$ \lambda \gamma ^{2} > 1 $$
introduced by Attouch and Maingé in [10] to study the secondorder dynamic (1.3) without geometric damping. With respect to [10], the introduction of the geometric damping, i.e., taking \(\beta >0\), provides some useful additional estimates.
4 Numerical illustrations
In this section, we give some numerical illustrations of (DINAM).
4.1 From continuous dynamic to algorithms
Let us first give some indications concerning the algorithms obtained by temporal discretization of the continuous dynamic (DINAM). Their convergence analysis will be postponed to another research investigation. Let us recall the condensed formulation of (DINAM)

$$ \ddot{x}(t) + \gamma \dot{x}(t) + \frac{d}{dt} \bigl( A_{\beta }\bigl(x(t)\bigr) \bigr) + A\bigl(x(t)\bigr) = 0, $$
where \(A:=\nabla f+B\) and \(A_{\beta }:=\beta _{b}B+\beta _{f}\nabla f\). Take a fixed time step \(h>0\), and consider the following finite-difference scheme for (DINAM):
This scheme is implicit with respect to the nonpotential operator B and explicit with respect to the potential operator ∇f. The temporal discretization of the Hessian-driven damping \(\beta _{f} \nabla ^{2} f(x(t)) \dot{x}(t)\) is taken equal to \(\frac{\beta _{f}}{h}(\nabla f(x_{k})-\nabla f(x_{k-1}))\). After expanding (4.1), we obtain
Set \(s:=\frac{h}{1+\gamma h}\) and \(\alpha :=\frac{1}{1+\gamma h}\). So we have
where \(\mathcal{B}_{h}=(h+\beta _{b})B\), and
From (4.3) we get
By combining (4.4) and (4.5), we obtain the following algorithm, called (DINAAM). It is a splitting algorithm which involves the operators ∇f and B separately.
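Since the expanded update formulas are not reproduced above, here is a hedged sketch (ours, not the paper's exact scheme) of such a semi-implicit discretization: implicit in the linear cocoercive operator B, explicit in ∇f, with the two difference-of-operators correction terms. The test data (f(x) = ½‖x‖², the 2×2 matrix B, and all parameter values) are our own choices.

```python
import numpy as np

# A hedged sketch (not the paper's exact scheme) of a semi-implicit
# discretization of (DINAM): implicit in the linear cocoercive operator B,
# explicit in grad f, with the two difference-of-operators corrections.
# Test data are ours: f(x) = 0.5*||x||^2 and a 1/2-cocoercive matrix B.
gamma, beta_f, beta_b, h = 3.0, 0.5, 0.5, 0.01
B = np.array([[1.0, -1.0], [1.0, 1.0]])
grad_f = lambda x: x                      # gradient of 0.5*||x||^2

# Collecting the x_{k+1} terms of the scheme gives the linear system
#   [(1 + gamma*h) I + h*(beta_b + h) B] x_{k+1} = rhs(x_k, x_{k-1}).
M = (1 + gamma * h) * np.eye(2) + h * (beta_b + h) * B
x_prev = x = np.array([1.0, 1.0])
for _ in range(20000):
    rhs = ((2 + gamma * h) * x - x_prev
           - h * beta_f * (grad_f(x) - grad_f(x_prev))
           + h * beta_b * (B @ x)
           - h ** 2 * grad_f(x))
    x_prev, x = x, np.linalg.solve(M, rhs)

# The iterates approach the unique zero of grad f + B, namely x = 0.
print(np.linalg.norm(x))
```

Note that the implicit matrix involves \((h+\beta _{b})B\), in line with the operator \(\mathcal{B}_{h}=(h+\beta _{b})B\) introduced above.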
4.2 Numerical experiments for the continuous dynamics (DINAM)
A general method to generate monotone cocoercive operators which are not gradients of convex functions is to start from a linear skew-symmetric operator A and then take its Yosida approximation \(A_{\lambda }\). As a model situation, take \(\mathcal{H}= \mathbb{R}^{2}\) and start from A equal to the rotation of angle \(\frac{\pi }{2}\). We have

$$ A= \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} . $$

An elementary computation gives that, for any \(\lambda >0\),

$$ A_{\lambda }= \frac{1}{1+\lambda ^{2}} \begin{pmatrix} \lambda & -1 \\ 1 & \lambda \end{pmatrix} , $$

which is therefore λ-cocoercive. As a consequence, for \(\lambda =1\), we obtain that the matrix

$$ B= \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} $$

is \(\frac{1}{2}\)-cocoercive. With these basic blocks, one can easily construct many other cocoercive operators which are not potential operators. For that, use Lemma A.1, which gives that the sum of two cocoercive operators is still cocoercive, and therefore the set of cocoercive operators is a convex cone.
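This construction can be checked numerically. The sketch below (ours) forms the Yosida approximation \(A_{\lambda }=\frac{1}{\lambda }(I-(I+\lambda A)^{-1})\) of the rotation and tests the λ-cocoercivity inequality on random points:

```python
import numpy as np

# Numerical check (ours) of the construction above: A is the rotation by
# pi/2, and its Yosida approximation A_lam = (I - (I + lam*A)^{-1})/lam
# should be lam-cocoercive: <A_lam z, z> >= lam * ||A_lam z||^2.
A = np.array([[0.0, -1.0], [1.0, 0.0]])        # skew-symmetric, monotone
rng = np.random.default_rng(1)

for lam in [0.5, 1.0, 5.0]:
    A_lam = (np.eye(2) - np.linalg.inv(np.eye(2) + lam * A)) / lam
    for _ in range(1000):
        z = rng.standard_normal(2)
        Az = A_lam @ z
        assert np.dot(Az, z) >= lam * np.dot(Az, Az) - 1e-12
print("A_lam is lam-cocoercive on all sampled points")
```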
Example 4.1
Let us start this section with a simple illustrative example in \(\mathbb{R}^{2}\). We take \(\mathcal{H}= \mathbb{R}^{2}\) equipped with the usual Euclidean structure. Let us consider B as a linear operator whose matrix in the canonical basis of \(\mathbb{R}^{2}\) is defined by \(B=A_{\lambda }\) for \(\lambda =5\). According to the above remark, we can check that B is λ-cocoercive with \(\lambda =5\) and that B is a nonpotential operator. To observe the classical oscillations of the heavy ball with friction, we take \(f: \mathbb{R}^{2} \to \mathbb{R}\) defined by
We set \(\gamma =0.9\). It is clear that f is convex but not strongly convex. We study three cases: (1) \(\beta _{b}=1\), \(\beta _{f}=0.5\); (2) \(\beta _{b}=0.5\), \(\beta _{f}=1\); and (3) \(\beta _{b}=\beta _{f}=0.5\). As a direct application of Theorem 3.1, we obtain that the trajectory \(x(t)\) generated by (DINAM) converges to \(x_{\infty }\), where \(x_{\infty }\in S=(B+\nabla f)^{-1}(0)=\{ 0\}\). The trajectory obtained by using Matlab is depicted in Fig. 1, where we represent the components \(x_{1}(t)\) and \(x_{2}(t)\) in red and blue respectively.
Now we study the behavior of the trajectories for further values of \(\beta _{b}\) and \(\beta _{f}\). We study four cases in Fig. 2. The plots of the second component of the solutions are depicted in Fig. 2(a), while Fig. 2(b) plots \(\|B(x_{k})+\nabla f(x_{k})\|\) versus the number of iterations k. Through Figs. 1 and 2, we can conclude that introducing the Hessian damping (\(\beta _{f}>0\)) attenuates the oscillations of the trajectories in Fig. 2. The oscillations of the solutions reappear as \(\beta _{f}\) goes to 0.
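Since the quadratic function f used in this example is not reproduced in this excerpt, the following Python sketch uses a hypothetical stand-in \(f(x)=\frac{1}{2}(x_{1}^{2}+10x_{2}^{2})\), together with \(B=A_{\lambda }\), \(\lambda =5\), and \(\gamma =0.9\) as in the text. It assumes that, for quadratic f and linear B, (DINAM) reduces to the linear second-order system \(\ddot{x}+(\gamma I+\beta _{f}H+\beta _{b}B)\dot{x}+Hx+Bx=0\) with \(H=\nabla ^{2}f\) constant; under these assumptions the trajectories converge to the unique zero of \(\nabla f+B\), here the origin:

```python
import numpy as np

# B = A_lambda for lambda = 5 (Yosida approximation of the pi/2 rotation).
lam = 5.0
B = np.array([[lam, -1.0], [1.0, lam]]) / (1.0 + lam**2)

# Hypothetical stand-in for f (the paper's quadratic is not reproduced here):
# f(x) = (1/2)(x1^2 + 10*x2^2), so H = grad^2 f is constant and
# the unique zero of grad f + B is the origin.
H = np.diag([1.0, 10.0])

def run_dinam(beta_f, beta_b, gamma=0.9, T=60.0, dt=1e-3):
    """Explicit Euler on the assumed second-order form of (DINAM):
       x'' + (gamma*I + beta_f*H + beta_b*B) x' + H x + B x = 0,
       valid for quadratic f and linear B, where d/dt B(x(t)) = B x'(t)."""
    D = gamma * np.eye(2) + beta_f * H + beta_b * B   # total damping matrix
    x, v = np.array([1.0, 1.0]), np.array([0.0, 0.0])
    for _ in range(int(T / dt)):
        a = -D @ v - H @ x - B @ x
        x, v = x + dt * v, v + dt * a
    return x

# The three parameter cases all converge to x_inf = 0.
for beta_f, beta_b in [(0.5, 1.0), (1.0, 0.5), (0.5, 0.5)]:
    assert np.allclose(run_dinam(beta_f, beta_b), np.zeros(2), atol=1e-3)
```

In this stand-in model, setting \(\beta _{f}=0\) leaves the stiff \(x_{2}\) direction underdamped and the trajectory visibly oscillatory, while \(\beta _{f}>0\) adds damping proportional to the Hessian, in line with the attenuation observed in Fig. 2.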
Example 4.2
Now we look at another, higher-dimensional example. Let us consider \(f: \mathbb{R}^{n}\to \mathbb{R}\) given by \(f(x)=\frac{1}{2}\|Mx-b\|^{2}\), where \(M\in \mathbb{R}^{m\times n}\) and \(b\in \mathbb{R}^{m}\). We have
Since \(M^{\top }M\) is positive semidefinite for any matrix M, the quadratic function f is convex. Furthermore, if M has full column rank, i.e., \(\operatorname{rank}(M)=n\), then \(M^{\top }M\) is positive definite, and therefore f is strongly convex. Take
Then B is cocoercive. Indeed, for any \(x,y\in \mathbb{R}^{n}\),
If the matrix M does not have full column rank but \(M^{\top }M+B\) is nonsingular, then
In our experiment, we pick M a random \(10\times 100\) matrix, which does not have full column rank. Set \(\gamma =3\), \(\beta _{b}=1\), \(\beta _{f}=1\), and take the operator B as presented above. Thanks to Corollary 3.1, we conclude that the trajectory \(x(t)\) generated by the system (DINAM) converges to \(x_{\infty }=(M^{\top }M+B)^{-1}M^{\top }b\). Implementing the algorithm (DINAAM) in Matlab, we obtain the plot of \(\|B(x_{k})+\nabla f(x_{k})\|\) versus k. Similarly, we study several cases by varying the parameters \(\beta _{b}\), \(\beta _{f}\). This is depicted in Fig. 3.
Before ending this part, we discuss an application of our model to dynamical games.
The following example is taken from Attouch and Maingé [10] and adapted to our context.
Example 4.3
We make the following standing assumptions:

(i)
\(\mathcal{H}=\mathcal{X}_{1}\times \mathcal{X}_{2}\) is the Cartesian product of two Hilbert spaces equipped with the norms \(\|\cdot \|_{\mathcal{X}_{1}}\) and \(\|\cdot \|_{\mathcal{X}_{2}}\) respectively. Here \(x=(x_{1},x_{2})\), with \(x_{1}\in \mathcal{X}_{1}\) and \(x_{2}\in \mathcal{X}_{2}\), stands for a generic element of \(\mathcal{H}\);

(ii)
\(f: \mathcal{X}_{1}\times \mathcal{X}_{2} \to \mathbb{R}\) is a convex function whose gradient is Lipschitz continuous on bounded sets;

(iii)
\(B=(\nabla _{x_{1}}\mathcal{L},-\nabla _{x_{2}}\mathcal{L})\) is the maximally monotone operator attached to a smooth convex-concave function \(\mathcal{L}: \mathcal{X}_{1}\times \mathcal{X}_{2}\to \mathbb{R}\). The operator B is assumed to be λ-cocoercive with \(\lambda >0\).
In our setting, with \(x(t)=(x_{1}(t),x_{2}(t))\) the system (DINAM) is written
According to Theorem 3.1, \(x(t) \rightharpoonup x_{\infty }=(x_{1,\infty },x_{2,\infty })\) weakly in \(\mathcal{H}\), where \((x_{1,\infty },x_{2,\infty })\) is a solution of
Structured systems such as (4.8) contain both potential and nonpotential terms which are often present in decision sciences and physics. In game theory, (4.8) describes Nash equilibria of the normal form game with two players 1, 2 whose static loss functions are respectively given by
\(f(\cdot ,\cdot )\) is their joint convex payoff, and \(\mathcal{L}\) is a convex-concave payoff with zero-sum rule. For more details, we refer the reader to [10]. As an example, take \(\mathcal{X}_{1}=\mathcal{X}_{2}=\mathbb{R}\) and \(\mathcal{L}: \mathbb{R}^{2}\to \mathbb{R}\) given by \(\mathcal{L}(x)=\frac{1}{2}(x_{1}^{2}-2x_{1}x_{2}-x_{2}^{2})\). Then
$$B=(\nabla _{x_{1}}\mathcal{L},-\nabla _{x_{2}}\mathcal{L})=\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.$$
Pick \(f(x)=\frac{1}{2}(3x_{1}^{2}-2x_{1}x_{2}+x_{2}^{2})-x_{1}-2x_{2}\). The Nash equilibria described in (4.8) can be computed by using (DINAM). Take \(\gamma =3\), \(\beta _{b}=0.5\), \(\beta _{f}=0.5\) and \(x_{0}=(1,1)\), \(\dot{x}_{0}=(10,10)\) as initial conditions; then the numerical solution of (DINAM) converges to \(x_{\infty }=(\frac{3}{4},1)\), which is the solution of (4.8) as well. The numerical trajectories and the phase portrait of our model applied to dynamical games are depicted in Fig. 4.
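All the data of this game example are explicit, so the convergence to \(x_{\infty }=(\frac{3}{4},1)\) can be checked with a few lines of code. The following Python sketch assumes that, for the quadratic f and linear B above, (DINAM) reduces to \(\ddot{x}+(\gamma I+\beta _{f}H+\beta _{b}B)\dot{x}+(Hx-c)+Bx=0\), where \(\nabla f(x)=Hx-c\) and \(\frac{d}{dt}B(x(t))=B\dot{x}(t)\); a crude explicit Euler discretization is enough to observe the convergence:

```python
import numpy as np

# Data of the game example: f(x) = (1/2)(3x1^2 - 2x1x2 + x2^2) - x1 - 2x2,
# so grad f(x) = H x - c, and B = (d_{x1} L, -d_{x2} L) is linear.
H = np.array([[3.0, -1.0], [-1.0, 1.0]])
c = np.array([1.0, 2.0])
B = np.array([[1.0, -1.0], [1.0, 1.0]])
gamma, beta_f, beta_b = 3.0, 0.5, 0.5

# Assumed second-order form of (DINAM); for quadratic f and linear B,
# the Hessian damping and the Newton-type correction are constant matrices.
D = gamma * np.eye(2) + beta_f * H + beta_b * B   # total damping
x, v = np.array([1.0, 1.0]), np.array([10.0, 10.0])
dt = 1e-3
for _ in range(int(40.0 / dt)):                   # integrate on [0, 40]
    a = -D @ v - (H @ x - c) - B @ x
    x, v = x + dt * v, v + dt * a

# The trajectory converges to the Nash equilibrium x_inf = (3/4, 1),
# the unique solution of grad f(x) + B x = 0, i.e. (H + B) x = c.
assert np.allclose(x, [0.75, 1.0], atol=1e-3)
assert np.allclose(H @ x - c + B @ x, np.zeros(2), atol=1e-2)
```

Note that \((H+B)x=c\) gives \(4x_{1}-2x_{2}=1\) and \(2x_{2}=2\), hence \(x_{\infty }=(\frac{3}{4},1)\), in agreement with the simulation.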
5 The nonsmooth case
The equivalence obtained in Proposition 2.1 between (DINAM) and a firstorder evolution system in time and space allows a natural extension of both our theoretical and numerical results to the case of a convex, lower semicontinuous and proper function \(f:\mathcal{H}\to \mathbb{R}\cup \{+\infty \}\). It suffices to replace the gradient of f with the convex subdifferential ∂f. We recall that the subdifferential of f at \(x\in \mathcal{H}\) is defined by
and the domain of f is \(\operatorname{dom}f= \{ x\in \mathcal{H}: f(x) < +\infty \}\). This leads us to consider the system
The prefix g in front of (DINAM) stands for generalized. Note that the first equation of (gDINAM) is now a differential inclusion, because \(\partial f(x(t))\) may be multivalued. By taking \(f= f_{0} + \delta _{C}\), where \(\delta _{C}\) is the indicator function of a constraint set C, the system (gDINAM) makes it possible to model damped inelastic shocks in mechanics and decision sciences, see [11]. The original aspect comes from the fact that (gDINAM) now involves both potential driving forces (attached to \(f_{0}\)) and nonpotential driving forces (attached to B). As we will see, taking into account shocks created by nonpotential driving forces is a source of difficulties.
Let us first establish the existence and uniqueness of the solution trajectory of the Cauchy problem.
Theorem 5.1
Let \(f:\mathcal{H}\to \mathbb{R}\cup \{+\infty \}\) be a convex, lower semicontinuous, and proper function. Suppose that \(\beta _{f}>0 \) and \(\beta _{b}\geq 0\). Then, for any \((x_{0}, y_{0}) \in \operatorname{dom}f \times \mathcal{H}\), there exists a unique strong global solution \((x,y):[0, +\infty [ \, \to \mathcal{H}\times \mathcal{H}\) of (gDINAM) which satisfies the Cauchy data \(x(0) =x_{0}\), \(y(0) =y_{0}\).
Proof
The proof is parallel to that of Theorem 2.1. The system (gDINAM) can be equivalently written as
where \(Z:= (x,y)\), and the function \(\Phi (Z)= \Phi (x,y) := \beta _{f} f(x) \) is now convex lower semicontinuous and proper on \(\mathcal{H}\times \mathcal{H}\). The operator G is unchanged and is globally Lipschitz continuous. The above equation falls under the setting of the Lipschitz perturbation of an evolution system governed by the subdifferential of a convex lower semicontinuous and proper function. The existence and uniqueness of the strong solution to (5.1) follows from Brézis [21, Proposition 3.12] and the fact that \((x_{0}, y_{0})\in \operatorname{dom}\Phi \). Recall that strong solution means that \(x(\cdot )\) and \(y(\cdot )\) are locally absolutely continuous functions whose distributional derivatives ẋ and ẏ belong to \(L^{2} (0,T, \mathcal{H})\) for any \(T>0\). □
Remark 5.1
As a consequence of the general theory developed above, the system (gDINAM) has a regularizing effect on the initial condition. Precisely, given \((x_{0}, y_{0}) \in \overline{\operatorname{dom}f} \times \mathcal{H}\), there still exists a unique strong solution to the corresponding Cauchy problem, but now with \(\sqrt{t}\dot{x}(t) \in L^{2} (0,T, \mathcal{H})\) and \(\sqrt{t}\dot{y}(t) \in L^{2} (0,T, \mathcal{H})\) for any \(T>0\).
The solution set S is now defined by
Before stating our main result, notice that \(B(p)\) is uniquely defined for \(p\in S\).
Lemma 5.1
\(B(p)\) is uniquely defined for \(p\in S\), i.e.,
Proof
The proof is similar to that of Lemma 3.1. It is based on the monotonicity of the subdifferential of f and the cocoercivity of the operator B. □
For the sake of simplicity, we give a detailed proof of the convergence analysis in the case \(\beta _{f}=\beta _{b}=\beta >0\). The system (gDINAM) takes the simpler form:
To formulate the convergence results and the corresponding estimates, we write the first equation of (gDINAM) as follows:
where \(\xi (t) \in \partial f(x(t))\), and we set \(A(x(t))= \xi (t) + B(x(t))\).
Theorem 5.2
Let \(B: \mathcal{H} \to \mathcal{H}\) be a λcocoercive operator. Let \(f:\mathcal{H}\to \mathbb{R}\cup \{+\infty \}\) be a convex, lower semicontinuous, proper function. Suppose that \(S= \{p\in \mathcal{H}: 0\in \partial f(p)+B(p) \}\neq \emptyset \). Consider the evolution equation (gDINAM) where the parameters satisfy the conditions: \(\beta _{f}=\beta _{b}=\beta >0\) and
Then, for any solution trajectory \(x:[0,+\infty [\,\to \mathcal{H}\) of (gDINAM), the following properties are satisfied:

(i)
(integral estimates) Set \(A (x(t)):=\xi (t) +B(x(t))\) with \(\xi (t) \in \partial f(x(t))\) as defined in (5.2) and \(p\in S\). Then
$$\begin{aligned}& \int _{0}^{+\infty } \bigl\Vert \dot{x}(t) \bigr\Vert ^{2}\,dt< +\infty ,\qquad \int _{0}^{+ \infty } \bigl\Vert B\bigl(x(t) \bigr)B(p) \bigr\Vert ^{2}\,dt< +\infty , \\& \int _{0}^{+\infty } \bigl\Vert A \bigl(x(t)\bigr) \bigr\Vert ^{2}\,dt< +\infty ,\qquad \int _{0}^{\infty } \bigl\langle A\bigl(x(t)\bigr), x(t)p \bigr\rangle \,dt < +\infty . \end{aligned}$$ 
(ii)
(convergence) For any \(p\in S\),

1.
\(\lim_{t\to +\infty }\|x(t)-p\|\) exists.

2.
\(\lim_{t\to +\infty }\|B(x(t))-B(p)\|=0\), where \(B(p)\) is uniquely defined for \(p\in S\).

Proof
Let us adapt the Lyapunov analysis developed in the previous sections to the case where f is nonsmooth. We have to pay attention to the following points. First, we must invoke the (generalized) chain rule for derivatives over curves (see [21, Lemma 3.3]), that is, for a.e. \(t\geq 0\),
The second ingredient is the validity of the subdifferential inequality for convex functions.
As a Lyapunov function, let us consider the function \(t\in [0, +\infty [\, \mapsto \mathcal{E}_{p}(t) \in \mathbb{R}_{+}\) defined by
where we recall that \(A (x(t)):=\xi (t) +B(x(t))\) with \(\xi (t) \in \partial f(x(t))\) as defined in (5.2) and \(p\in S\). To differentiate \(\mathcal{E}_{p}(t)\), we use the formulation (gDINAM)
Since x and y are locally absolutely continuous functions, this allows us to differentiate \(\dot{x}(t)+ \beta A(x (t)) \) and to obtain formulas similar to those of the smooth case. A close examination of the Lyapunov analysis then shows that we can obtain the additional estimate
Take \(p\in S\), so that \(0\in \partial f(p) + B(p)\). To obtain (5.5), we return to (3.6) and consider the following minorization, which we split into a sum with coefficients \(\epsilon '\) and \(1-\epsilon '\) (where \(\epsilon ' >0\) will be taken small enough). According to the monotonicity of ∂f and the definition of \(A (x(t))\), we have
So the proof continues with λ replaced by \((1-\epsilon ')\lambda \). This does not change the conditions on the parameters: since in our assumptions the inequality \(\lambda \gamma >\beta +\frac{1}{\gamma }\) is strict, it is still satisfied with λ replaced by \((1-\epsilon ') \lambda \) when \(\epsilon '\) is taken small enough. So, after integrating the resulting strict Lyapunov inequality, we obtain the supplementary property (5.5). Up to (3.22), the proof is essentially the same as in the case of a smooth function f. We obtain the integral estimates
But then we can no longer invoke the Lipschitz continuity of ∇f on bounded sets. To overcome this difficulty, we modify the end of the proof as follows. Recall that, given \(p\in S \), the anchor function is defined, for every \(t \in [0,+\infty [\), by
and that we need to prove that the limit of the anchor functions exists as \(t \to +\infty \). The idea is to exploit the fact that we have at hand a whole collection of Lyapunov functions, parametrized by the coefficient c. Recall that we have obtained that the limit of \(\mathcal{E}_{p}(t)\) exists as \(t\to +\infty \), and that this holds for the whole interval of values of c. So, for such c, the limit of \(W_{c} (t):=\frac{1}{c\delta \beta +c^{2}} \mathcal{E}_{p}(t)\) as \(t\to +\infty \) exists, where
Take two such values of c, say \(c_{1}\) and \(c_{2}\), and take the difference (recall that \(\delta = c\gamma -1\)). We obtain
where
So, we obtain the existence of the limit as \(t\to +\infty \) of \(W(t)\). Then note that \(W(t)= \gamma q_{p}(t) + \frac{d}{dt}w(t) \) where
Reformulate \(W(t)\) in terms of \(w(t)\) as follows:
As a consequence of (5.5) and of the previous estimates, we have that the limit of the two above integrals exists as \(t \to +\infty \). Therefore, according to the convergence of \(W(t)\), we obtain that
The existence of the limit of w follows from a classical general result concerning the convergence of evolution equations governed by strongly monotone operators (here γId, see Theorem 3.9, p. 88 in [21]). In turn, using the same argument as above, we obtain that, for all \(p\in S\),
As in the smooth case, the strong convergence of \(B(x(t))\) to \(B(p)\) is a direct consequence of the integral estimates \(\int _{0}^{+\infty }\|B(x(t))-B(p)\|^{2}\,dt<+\infty \), \(\int _{0}^{+\infty }\|\dot{x}(t)\|^{2}\,dt<+\infty \) and of the fact that B is Lipschitz continuous. The proof of Theorem 5.2 is thereby completed. □
Remark 5.2

(i)
A natural question is whether the weak limit of the trajectory exists. We are not far from this result, since \(\int _{0}^{+\infty }\|A(x(t))\|^{2}\,dt<+\infty \), which implies that \(A(x(t))\) converges strongly to zero in an “essential” way. If this could be upgraded to the convergence of \(A(x(t))\) to zero, Opial’s lemma would allow us to complete the convergence proof as in the smooth case. This is a seemingly difficult question, to be examined in the future.

(ii)
A particular situation is the case \(\gamma =\frac{1}{\beta }\), in which the system (gDINAM) can be written equivalently as
$$ \dot{u}(t) + \gamma u(t)=0, $$where
$$ \dot{x}(t) + \beta A\bigl(x(t)\bigr)\ni u(t). $$The convergence of the trajectory \(t\mapsto x(t)\) is then a consequence of the convergence of the semigroup generated by the sum of a cocoercive operator and the subdifferential of a convex, lower semicontinuous, and proper function, see Abbas and Attouch [1]. Note that in this case the condition for the convergence of the trajectories generated by (gDINAM) no longer depends on the cocoercivity parameter λ.
6 Conclusion, perspectives
In this paper, in a general real Hilbert space setting, we investigated a dynamic inertial Newton method for solving additively structured monotone problems. The dynamic is driven by the sum of two monotone operators with distinct properties: the potential component is the gradient of a continuously differentiable convex function f, and the nonpotential one is a monotone and cocoercive operator B. The geometric damping is controlled by the Hessian of the potential f and by a Newton-type correction term attached to B. The well-posedness of the Cauchy problem is established, as well as the asymptotic convergence properties of the trajectories generated by the continuous dynamic. The convergence analysis is carried out through the parameters \(\beta _{f}\) and \(\beta _{b}\) attached to the geometric dampings, as well as the parameters γ and λ (the viscous damping coefficient and the cocoercivity coefficient, respectively). The introduction of geometric damping makes it possible to control and attenuate the oscillations known for viscous damping of inertial systems, giving rise to faster numerical methods. It would be interesting to extend the analysis, for both the continuous dynamic and its discretization, to the case of an asymptotically vanishing damping \(\gamma (t)=\frac{\alpha }{t}\) with \(\alpha >0\), as in [28]. This is a decisive step towards proposing faster algorithms for solving structured monotone inclusions, which are connected to the accelerated gradient method of Nesterov. The study of the corresponding splitting methods is also an important topic which needs further investigation. In fact, by replacing ∇f with a general maximally monotone operator A whose resolvent can be easily computed, it would be interesting to study a forward–backward inertial algorithm with Hessian-driven damping for solving structured monotone inclusions of the form \(Ax+Bx\ni 0\). These are interesting topics for future research.
Notes
i.e., B is not assumed to be the gradient of a given function.
References
Abbas, B., Attouch, H.: Dynamical systems and forward–backward algorithms associated with the sum of a convex subdifferential and a monotone cocoercive operator. Optimization 64(10), 2223–2252 (2015)
Abbas, B., Attouch, H., Svaiter, B.F.: Newton-like dynamics and forward–backward methods for structured monotone inclusions in Hilbert spaces. J. Optim. Theory Appl. 161(2), 331–360 (2014)
Adly, S., Attouch, H.: Finite convergence of proximal-gradient inertial algorithms combining dry friction with Hessian-driven damping. SIAM J. Optim. 30(3), 2134–2162 (2020)
Alecsa, C.D., László, S., Pinta, T.: An extension of the second order dynamical system that models Nesterov’s convex gradient method. Appl. Math. Optim. 84, 1687–1716 (2021)
Alvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9(1–2), 3–11 (2001)
Alvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics. J. Math. Pures Appl. 81(8), 747–779 (2002)
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order algorithms via inertial systems with Hessian driven damping. Math. Program. (2020). https://doi.org/10.1007/s10107-020-01591-1
Attouch, H., László, S.C.: Continuous Newton-like inertial dynamics for monotone inclusions. Set-Valued Var. Anal. (2020). https://doi.org/10.1007/s11228-020-00564-y
Attouch, H., László, S.C.: Newton-like inertial dynamics and proximal algorithms governed by maximally monotone operators. SIAM J. Optim. 30(4), 3252–3283 (2020)
Attouch, H., Maingé, P.E.: Asymptotic behavior of second order dissipative evolution equations combining potential with nonpotential effects. ESAIM Control Optim. Calc. Var. 17(3), 836–857 (2011)
Attouch, H., Maingé, P.E., Redont, P.: A second-order differential system with Hessian-driven damping; application to nonelastic shock laws. Differ. Equ. Appl. 4(1), 27–65 (2012)
Attouch, H., Marques Alves, M., Svaiter, B.F.: A dynamic approach to a proximal-Newton method for monotone inclusions in Hilbert spaces, with complexity \(\mathcal{O}(1/n^{2})\). J. Convex Anal. 23(1), 139–180 (2016)
Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximal monotone operators. Math. Program. 174(1–2), 391–432 (2019)
Attouch, H., Peypouquet, J., Redont, P.: Fast convex minimization via inertial dynamics with Hessian driven damping. J. Differ. Equ. 261(10), 5734–5783 (2016)
Attouch, H., Redont, P., Svaiter, B.F.: Global convergence of a closed-loop regularized Newton method for solving monotone inclusions in Hilbert spaces. J. Optim. Theory Appl. 157(3), 624–650 (2013)
Attouch, H., Svaiter, B.F.: A continuous dynamical Newton-like approach to solving monotone inclusions. SIAM J. Control Optim. 49(2), 574–598 (2011)
Baillon, J.B., Haddad, G.: Quelques propriétés des opérateurs angles-bornés et n-cycliquement monotones. Isr. J. Math. 26, 137–150 (1977)
Bauschke, H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, Berlin (2011)
Boţ, R.I., Csetnek, E.R.: Second order forward–backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54, 1423–1443 (2016)
Boţ, R.I., Csetnek, E.R., László, S.C.: Tikhonov regularization of a second order dynamical system with Hessian damping. Math. Program. (2020). https://doi.org/10.1007/s10107-020-01528-8
Brézis, H.: Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution. Lecture Notes, vol. 5. NorthHolland, Amsterdam (1972)
Brézis, H.: Analyse fonctionnelle. Collection Mathématiques Appliquées pour la Maîtrise. Masson, Paris (1983)
Castera, C., Bolte, J., Févotte, C., Pauwels, E.: An inertial Newton algorithm for deep learning (2019). HAL02140748
Kim, D.: Accelerated proximal point method for maximally monotone operators. Preprint (2020). arXiv:1905.05149v3
Lin, T., Jordan, M.I.: A control-theoretic perspective on optimal high-order optimization. Preprint (2019). arXiv:1912.07168v1
Peypouquet, J., Sorin, S.: Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. J. Convex Anal. 17(3–4), 1113–1163 (2010)
Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations. Preprint (2018). arXiv:1810.08907 [math.OC]
Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method. J. Mach. Learn. Res. 17, 1–43 (2016)
Author information
Authors and Affiliations
Contributions
All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Appendix
1.1 A.1 Technical lemmas
Let us show that the sum of two cocoercive operators is still cocoercive. For further properties concerning cocoercive operators see [18].
Lemma A.1
Let \(T_{1},T_{2}: \mathcal{H} \to \mathcal{H}\) be two cocoercive operators with respective cocoercivity coefficients \(\lambda _{1},\lambda _{2}>0\). Then \(T:=T_{1}+T_{2}: \mathcal{H} \to \mathcal{H}\) is λ-cocoercive with \(\lambda = \frac{\lambda _{1}\lambda _{2}}{\lambda _{1}+\lambda _{2}}\).
Proof
According to the cocoercivity assumptions of \(T_{1}\) and \(T_{2}\), we have
Let us show that the sum \(T=T_{1}+T_{2}\) is still cocoercive. Using elementary computation in Hilbert spaces, for all \(x,y\in \mathcal{H}\), we have
Since \(T_{1}\), \(T_{2}\) are cocoercive, we deduce that
Equivalently,
So, T is still λcocoercive with \(\lambda = \frac{\lambda _{1}\lambda _{2}}{\lambda _{1}+\lambda _{2}} >0\).
Let us show that this estimate is sharp. Take \(T_{1}: \mathcal{H}\to \mathcal{H}\), \(x\mapsto \lambda _{1}^{-1}x\) and \(T_{2}: \mathcal{H}\to \mathcal{H}\), \(x\mapsto \lambda _{2}^{-1}x\). It is easy to check that \(T_{1}\), \(T_{2}\) are two cocoercive operators with cocoercivity coefficients \(\lambda _{1}\), \(\lambda _{2}\) respectively. Then their sum is equal to \(Tx= ( \lambda _{1}^{-1} + \lambda _{2}^{-1} ) x = \lambda ^{-1} x \) with \(\lambda = \frac{\lambda _{1}\lambda _{2}}{\lambda _{1}+\lambda _{2}} \), and hence T is exactly λ-cocoercive. This shows that we cannot obtain a better estimate. □
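The constant \(\lambda =\frac{\lambda _{1}\lambda _{2}}{\lambda _{1}+\lambda _{2}}\) and its sharpness can be checked numerically. The following Python sketch tests the cocoercivity inequality for the sum of two Yosida approximations of the rotation (the building blocks used in Sect. 4), and verifies that the scalar example above attains the bound with equality:

```python
import numpy as np

def yosida_rotation(lam):
    """Yosida approximation of the pi/2 rotation; it is lam-cocoercive."""
    return np.array([[lam, -1.0], [1.0, lam]]) / (1.0 + lam**2)

# Two cocoercive building blocks with lambda_1 = 1, lambda_2 = 5.
l1, l2 = 1.0, 5.0
T1, T2 = yosida_rotation(l1), yosida_rotation(l2)
T = T1 + T2
lam = l1 * l2 / (l1 + l2)            # Lemma A.1 predicts lambda = 5/6

# Check <T d, d> >= lam * ||T d||^2 on random d = x - y (T is linear).
rng = np.random.default_rng(1)
for _ in range(1000):
    d = rng.standard_normal(2)
    Td = T @ d
    assert d @ Td >= lam * (Td @ Td) - 1e-12

# Sharpness: T1 x = x / l1 and T2 x = x / l2 sum to x / lam,
# for which the cocoercivity inequality holds with equality.
d = np.array([1.0, 2.0])
Td = (1.0 / l1 + 1.0 / l2) * d
assert np.isclose(d @ Td, lam * (Td @ Td))
```

The scalar pair attains equality, which matches the claim that the constant \(\frac{\lambda _{1}\lambda _{2}}{\lambda _{1}+\lambda _{2}}\) cannot be improved.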
The next lemma is a classical result in integration theory.
Lemma A.2
Let \(1\leq p<\infty \) and \(1\leq r\leq \infty \). Suppose that \(u\in L^{p}([0,\infty [; \mathbb{R})\) is a locally absolutely continuous nonnegative function, \(g\in L^{r}([0,\infty [; \mathbb{R})\), and \(\dot{u}(t)\leq g(t)\)
for almost every \(t>0\). Then \(\lim_{t\to \infty }u(t)=0\).
In the proof of Theorem 3.1, we use the following elementary result concerning positive quadratic forms.
Lemma A.3
Let a, b, c be three real numbers. The quadratic form \(q: \mathcal{H}\times \mathcal{H}\to \mathbb{R}\)
is positive definite if and only if \(ac - b^{2} > 0\) and \(a >0 \). Moreover,
where the positive real number \(\mu := \frac{1}{2} ( a+c - \sqrt{(a-c)^{2} +4b^{2}} ) \) is the smallest eigenvalue of the positive symmetric matrix associated with q.
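Assuming q has the standard form \(q(X,Y)=a\|X\|^{2}+2b\langle X,Y\rangle +c\|Y\|^{2}\) (consistent with the stated criterion, with associated symmetric matrix having entries a, b, b, c), the eigenvalue formula and the positivity criterion can be checked numerically:

```python
import numpy as np

# mu = (a + c - sqrt((a - c)^2 + 4 b^2)) / 2 is the smallest eigenvalue of
# the symmetric matrix [[a, b], [b, c]] associated with the quadratic form q
# (assumed form: q(X, Y) = a||X||^2 + 2b<X, Y> + c||Y||^2).
rng = np.random.default_rng(2)
for _ in range(100):
    a, b, c = rng.uniform(-3, 3, size=3)
    mu = 0.5 * (a + c - np.sqrt((a - c)**2 + 4 * b**2))
    eigs = np.linalg.eigvalsh(np.array([[a, b], [b, c]]))
    assert np.isclose(mu, eigs.min())
    # Positive definiteness criterion of Lemma A.3: a > 0 and ac - b^2 > 0.
    assert (mu > 0) == (a > 0 and a * c - b * b > 0)
```

This also confirms the equivalence between \(\mu >0\) and the pair of conditions \(a>0\), \(ac-b^{2}>0\) used in the proof of Theorem 3.1.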
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Adly, S., Attouch, H. & Vo, V.N. Asymptotic behavior of Newton-like inertial dynamics involving the sum of potential and nonpotential terms. Fixed Point Theory Algorithms Sci Eng 2021, 17 (2021). https://doi.org/10.1186/s13663-021-00702-7
DOI: https://doi.org/10.1186/s13663-021-00702-7