Baixe o app para aproveitar ainda mais
Prévia do material em texto
Lecture Notes in Chemistry Edited by: Prof. Dr. Gaston Berthier Universite de Paris Prof. Dr. Hanns Fischer Universitat ZUrich Prof. Dr. Kenichi Fukui Kyoto University Prof. Dr. George G. Hall University of Nottingham Prof. Dr. JUrgen Hinze Universitat Bielefeld Prof. Dr. Joshua Jortner Tel-Aviv University Prof. Dr. Werner Kutzelnigg Universitat Bochum Prof. Dr. Klaus Ruedenberg Iowa State University Prof Dr. Jacopo Tomasi Universita di Pisa 74 Springer-Verlag Berlin Heidelberg GmbH M. Defranceschi C. Le Bris Mathematical Models and Methods for Ab Initio Quantum Chemistry Springer Authors Dr. Mireille Defranceschi CEA-Saclay, Bât. 125 DPE/SPCP/LEPCA, 91191 Gif-sur-Yvette Cedex, France E-mail: defrancesc@camac.cea.fr Prof. Claude Le Bris C.E.R.M.I.C.S. Ecole Nationale des Ponts et Chaussees 6&8 Avenue Blaise Pascal Cite Descartes, Champs sur Mame 77455 Mame La VaII ee Cedex 2, France E-mail: lebris@cermics.enpc.fr Cataloging-in-Publication Data applied for Die Deutsche Bibliothek - CIP-Einheitsaufnahme Mathematical models and methods for ab initio quantum chemistry I Mireille Defranceschi ; Claude LeBris. - Berlin; Heidelberg ; New York; Barcelona; Hong Kong; London; Milan; Paris; Singapore; Tokyo: Springer, 2000 (Lecture notes in chemistry ; 74) ISBN 978-3-540-67631-7 ISBN 978-3-642-57237-1 (eBook) DOI 10.1007/978-3-642-57237-1 ISSN 0342-4901 ISBN 978-3-540-67631-7 This work is subject to copyright. AII rights are reserved, whether the whole or part of the material is concemed, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Ver1ag. Violations are liable for prosecution under the German Copyright Law. © Springer-Verlag Berlin Heidelberg 2000 Originally published by Springer-Verlag Berlin Heidelberg New York in 2000 Softcover reprint ofthe hardcover Ist edition 2000 The use of general descriptive names, registered names, trademarks, etc .. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera ready by author Printed on acid-free paper SPIN: 10771679 51/3142 - 543210 Contents Foreword, by Mireille Defranceschi & Claude Le Bris ix I General topics 1 1 Is a molecule in chemistry explicable as a broken symmetry in quantum mechanics?, by B. Sutcliffe 3 1 Introduction.................... 4 2 The equations of motion for the molecule . . . . 5 3 The Eckart choice of a frame fixed in the body . 7 3.1 The effects of nuclear permutations on the Eckart coordinates 9 3.2 Discussion and conclusions. . . . . . . . . . . . . . . . . . 14 2 SCF algorithms for HF electronic calculations, by E. Cance.s 17 1 Introduction.................... 18 2 A brief presentation of the Hartree-Fock model. 19 3 General remarks on SCF algorithms. . . . . . . 24 4 The Roothaan algorithm: why and how it fails. 27 5 Level-shifting........... 31 6 The Optimal Damping Algorithm . . . . . . . . 33 3 A pedagogical introduction to Quantum Monte-Carlo, by M. Caffarel & R. Assam! 1 Introduction....... 2 QMC for a 2 x 2 matrix .. . 3 2.1 The 2 x 2 matrix .. . 2.2 QMC is a power method 2.3 Introducing a stochastic matrix 2.4 Pure Diffusion Monte Carlo (PDMC) 2.5 Diffusion Monte Carlo (DMC) .... 2.6 Stochastic reconfiguration Monte Carlo (SRMC) 2.7 Accelerating QMC .... QMC beyond the 2 x 2 matrix .. 3.1 QMC for a N x N matrix 3.2 QMC in a continuum ... 45 46 49 49 50 51 55 56 58 61 66 66 66 VI 4 On the controllability of bilinear quantum systems, by G. Turinici 75 1 Introduction . . . . . . . . . . . . . 76 2 Infinite dimensional controllability. . . . 3 Finite dimensional controllability .... 4 Transfer graph and necessary conditions 5 Controllability results. . . . . . . 5.1 Local exact controllability 5.2 Global controllability ... 6 Application to a five level system 7 Conclusions 8 Appendix ............ . II Condensed phases 76 78 79 80 80 83 86 88 88 93 5 Recent mathematical results on the quantum modeling of crystals, by 1. Catto, C. Le Bris & P-L. Lions 95 1 Introduction................ 96 2 Models from Density Functional Theory 98 3 Hartree type models ........... . 102 4 Hartree-Fock type models ........ . 104 5 The optimal periodic configuration for the nuclei. . 110 6 Extensions and perspectives . . . . . . . . . . . . . 113 6 Local density approximations for the energy of a periodic Coulomb model, by O. Bokanowski, B. Grebert & N. J. Mauser 121 1 Introduction................ . 122 2 Deformations and the Jacobian problem .......... . 123 3 Kinetic energy estimates . . . . . . . . . . . . . . . . . . . . 124 4 Justification of the X a method and exchange energy approximation . 126 5 Global energy bound ........................... 127 7 A mathematical insight into ab initio simulations of the solid phase, by X. Blanc 133 1 Introduction..... · 134 2 Crystalline structure . · 134 3 Band theory . . . . . . · 136 3.1 Band structure · 136 3.2 Density of states · 138 4 Electronic calculations: standard approximations · 140 4.1 Hartree-Fock formalism ... . · 140 4.2 Density functional theory .. . · 142 4.3 The use of pseudo-potentials . . · 143 4.4 Self-consistent field algorithms. · 144 5 Numerical schemes · 145 5.1 Basis sets . . . . . . . . . . . . · 145 6 5.2 5.3 Reciprocal space integration and special point technics Ewald sums Conclusion. . . ..................... . vii 147 · 149 · 152 8 Examples of hidden numerical tricks in a solid state determination of electronic structure., by M. Defranceschi f3 V. Louis-Achille 159 1 Introduction...................... . 160 2 General aspects of the Density Functional Theory (DFT) . 160 3 Choice of the method ......... 162 4 5 3.1 Representation of the function . 163 3.2 Use of pseudo-potentials . . 164 Choice of the number of k-points . 165 Other numerical tricks · 165 5.1 Results .......... . · 166 9 Quantum mechanical models for systems in solution, by B. Mennucci 171 1 Introduction.. ................. . · 172 2 The Effective Hamiltonian and the Free Energy · 172 3 Intermolecular Forces . . . . . . . · 175 3.1 Repulsion and Dispersion ....... . · 176 3.2 Electrostatic contribution ........ . 178 4 Quantum Mechanical Theory of Solvation: the ASC approach . 185 5 The solute cavity and its tessellation . 187 6 Derivatives and Molecular Properties . 189 6.1 Free Energy Surfaces . 189 6.2 External Properties. . . . . . 192 7 An example of applications. . . . . . . 195 7.1 Structures and (hyper)polarizabilities of push-pull systems in solution . 196 8 Conclusions................................ 202 III Relativistic models 209 10 Variational methods in relativistic quantum mechanics: new ap- proach to the computation of Dirac eigenvalues, by J. Dolbeault, M. J. Esteban f3 E. Sere 211 1 Introduction............... . . . . . . 212 2 Min-max approaches. . . . . . . . . . . . 214 3 4 Minimization method and corresponding min-max approaches. 3.1 A constrained minimization method. . 3.2 Relationship with Talman's min-max .. Some related Ilumerical computations. · 216 · 216 .217· 218 Vlll 11 Quaternion symmetry of the Dirac equation, by H. J. Aa. Jensen 1 Introduction . . . . . . . . 2 3 4 5 The Dirac equation . . . . The full symmetry group . 3.1 Spatial symmetry. 3.2 Time reversal symmetry The quat ern ion Dirac equation. Discussion and conclusion . . . T. Saue fj 227 · 228 .228 · 231 · 231 · 236 .240 · 243 Foreword On the occasion of the fourth International Conference on Industrial and Applied Mathematics!, we decided to organize a sequence of 4 minisymposia devoted to the mathematical aspects and the numerical aspects of Quantum Chemistry. Our goal was to bring together scientists from different communities, namely mathematicians, experts at numerical analysis and computer science, chemists, just to see whether this heterogeneous set of lecturers can produce a rather homogeneous presentation of the domain to an uninitiated audience. To the best of our knowledgde, nothing of this kind had never been tempted so far. It seemed to us that it was the good time for doing it, both .because the interest of applied mathematicians into the world of computational chemistry has exponentially increased in the past few years, and because the community of chemists feels more and more concerned with the numerical issues. Indeed, in the early years of Quantum Chemistry, the pioneers (Coulson, Mac Weeny, just to quote two of them) used to solve fundamental equations modelling toy systems which could be simply numerically handled in view of their very limited size. The true difficulty arose with the need to model larger systems while possibly taking into account their interaction with their environment. Hand calculations were no longer possible, and computing science came into the picture. Today, the challenge is twofold: improving the formalism (both from a physical viewpoint and from the mathematical viewpoint), and speeding up rigorously founded numerical methods. From this originates the revival of interest in the communities of chemists and applied mathematicians. From what we heard from the audience of our series of minisymposia, the result of our enterprise was beyond our hope. So the idea carne out to translate that in a written manner. We therefore suggested that each of the nineteen lecturers should give a written account on his view on the subject. The topic of the written contribution need not be exactly the same as that of the talk, but. it. has to be related with the interplay between Mathematics, Numerical Analysis and Quantum Chemistry. The volume the two of us planned to edit was not a proceedings volume. It was rather an outgrowth of the series of talks, and an attempt to bring together this heterogeneous population this time in the written mode. The only constraint we imposed to the contributors was the following rule of the game: the mathematicians had to write in a language understandable by chemists, and vice-veT'.'ia. Thirteen lecturers out of ninet.een played the game until the end. The result is in the reader's 1 ICIAM99, held in Edinburgh, Scotland, July 5-9th, 1999, x hands. It consists of eleven chapters devoted to various aspects of Computational Chemistry. This volume is divided into three parts. The first part deals with topics of general interest, the second one is devoted to the modelling of the condensed phases, the third one focuses on the relativistic aspects. The book opens with a contribution by Brian Sutcliffe on questions of symmetry in Quantum Mechanics. We find it symbolic that the first chapter of such a volume is written by such an eminent chemist, who is known to have had a constant interest into the mathematical aspects. The first part continues with a chapter by Eric Cances on the numerical analysis of SCF algorithms for HF calculations. Eric Cances is one of the representatives of this new generation of applied mathematicians who are very much involved in Computational Chemistry. He presents a rigorous analysis of the existing SCF algorithms and introduces new ones. Chapter 3 is written together by a chemist and a mathematician, Michel Caffarel and Roland Assaraf. It is devoted to Quantum Monte-Carlo methods. Their con- tribution, that we consider in some sense as an instance of what should be done in order to enhance the links between the two communities, has been deliberately writ- ten on a pedagogical tone. They are to be thanked for that (unfortunately unusual but so much useful) intention. The last chapter of the first part is due to Gabriel Turinici, who introduces the reader to the very important issue of exact control of quantum systems. As announced above, the second part of this volume deals with the modelling of the condensed phases. In Chapters 5 to 8, the crystalline solid phase is of concern. Chapter 9 is devoted to the liquid phase. Chapters 5 and 6 rather stand on the theoretical side. The former, due to Isabelle Catto, Pierre-Louis Lions and one of us (CLB), reports on a series of works devoted to the rigorous derivation of the models of the solid phase, a topic not so often addressed in the physical literature. The latter, written by Olivier Bokanowski, Benoit Grebert and Norbert Mauser, follows the same vein. It discusses the rigorous foundations of some well accepted approximations of Quantum Chemistry, such as the Xet method. In Chapter 7, Xavier Blanc, a young mathematician, presents his personal view on the models in use for the simulation of the crystalline phase. It may be interesting for a chemist expert at this topic to see the presentation of it by someone of the "outer world" . Chapter 8, from Vanina Louis-Achille and one of us (MD), is the logical sequel of Chapter 7; it develops the practical numerical aspects encountered in solid quantum chemistry calculations which are hidden in usual scientific papers. Chapter 9 by Benedetta Mennucci is an overview of liquid state methods of calculations. It provides the theoretical framework for the various methods currently used in Quantum Chemistry codes. The third part, devoted to the relativistic models, features two contributions. Xl Chapter 10 is due to Jean Dolbeault, Maria Esteban and Eric Sere. It reports on the works of the authors that give a sound mathematical ground to the Dirac-Fock model. By their contribution to the field, they bring the mathematical understand- ing of the Dirac-Fock model to the level to that of the Hartree-Fock model. Trond Saue and H. J. Aa. Jensen introduce in Chapter 11 the various concepts of quaternion symmetry appearing in the Dirac Equations as they are implemented in the 4-component relativistic molecular calculations (DIRAC code). Let us end this foreword by emphasizing we strongly believe that the two com- munities of applied mathematicians and chemists can take benefit of an interaction. At least we hope that our endeavour towards the challenge to bring together applied mathematicians and chemists will stimulate other ones. As we are tenacious, we are currently working on another project in the same vein. Needless to say, we encourage every interested reader to contact us. M. D. & C. L. B. Paris, April 2000. Part I General topics Chapter 1 Is a molecule in chemistry explicable as a broken symmetry in quantum mechanics? Brian Sutcliffe Theoretical Division 1, Institute for Molecular Science, Myodaiji, Okazaki 444-8585, Japan, and present address Department de Chimie Physique Moleculaire, Universite Libre de Bruxelles, B-1050 Bruxelles, Belgium. bsutclif@ulb.ac.be Abstract : It is argued that incorporating the molecular geometry into the transformation specifications for deriving the standard (Eckart) Hamiltonian used to describe molecular spectra, cannot generallybe accomplished without breaking the permutational symmetry requirements on identical nuclei. M. Defranceschi et al., Mathematical Models and Methods for Ab Initio Quantum Chemistry © Springer-Verlag Berlin Heidelberg 2000 4 Brian Sutcliffe 1 Introduction The idea that the proper way to treat molecules in quantum mechanics is to treat their nuclear motions as essentially classical with the electrons treated quantum mechanically, dates from the very earliest days of the subject. The genesis of the idea is usually attributed to Born and Oppenheimer [1], but it is an idea that was in the air at the time, for the earliest papers in which the idea is used, predate the publication of their paper. The physical picture that informs the attempted separation is one well known and widely used even in classical mechanics, namely division of the problem into a set of rapidly moving particles, here electrons, and a much more slowly moving set, here the nuclei. Experience is that it is wise to try and separate such incommensurate motions both to calculate efficiently and to get a useful physical picture. The object of the separation in the molecular case is to get an electronic motion problem in which the nuclear positions can be treated as parameters and whose solutions can be used to solve the nuclear motion problem. The insights arising from classical chemistry seem to predicate that, for lowish energies, the nuclear motion function should be strongly peaked at a nuclear geometry that corresponds to the traditional molecular geometry. A function of this kind would allow a good account of the electronic structure of a molecule to be given in terms of a single choice for the nuclear geometry about which a molecule could be considered as performing small vibrations and undergoing essentially rigid rotations. Carl Eckart [2] was among the first to discuss how this picture might be supported. He did so in a context that assumed the separation of electronic and nuclear motions. In his approach, the electrons are regarded simply as providing a potential. This potential is invariant under all uniform translations and rigid rotations of the nuclei that form the molecule. It is usually referred to as a potential energy surface and the nuclei are said to move on this surface. Eckart actually treats the nuclear motions by classical rather than quantum mechanics but it is his approach and developments from it that have dominated the interpretation of molecular spectra since 1936. Because the approach is essentially classical, it is perfectly proper to treat identical particles as distinguishable, by virtue of their positions on the potential surface. This identification allows the assignment of a molecule to a point group, that of its equilibrium geometry, and such assignments have proved extremely fruitful in interpreting molecular vibration-rotation spectra. In what follows we shall not concern ourselves with the details of the separation of electronic from nuclear motion but rather with the consequences of trying to treat the nuclear motion quantum mechanically rather than classically and to allow for the permutational symmetry of identical nuclei and for the overall translation and rotation symmetries. Is a molecule in chemistry explicable as a broken symmetry... 5 2 The equations of motion for the molecule Schrodinger's Hamiltonian describing the molecule as a system of N charged parti- cles in a coordinate frame fixed in the laboratory is A !i2 N 1 2 e2 N I Z,Zj H(x)=--L-V (Xi)+- L- 2-1 mi 8HEa ·-1 Xi]· t- Z,)- (1) where the separation between particles is defined by X7j = L(Xoj - Xoi)2 (2) o It is convenient to regard Xi as a column matrix of three cartesian components Xoi , a = X, y, z and to regard Xi collectively as the 3 by N matrix x. Each of the particles has mass mi and charge Zie. The charge-numbers Zi are positive for a nucleus and minus one for an electron. In a neutral system the charge-numbers sum to zero. To distinguish between electrons and nuclei, the variables are split up into two sets, one set consisting of L variables, xi, describing the electrons and the other set of H variables, xi, describing the nuclei and N = L + H. When it is necessary to emphasise this split, (1) will be denoted if (xn, xe). If the full problem has eigenstates that are square-integrable so that (3) then an eigenstate can be written approximately as a product of the required form The function, <I>(xn) (chosen to be non-trivial) is determined as a solution of an effec- tive nuclear motion equation, and traditionally, a good guess for the electronic wave function is supposed to be provided by a solution of the damped nudei electronic Hamiltonian This Hamiltonian is obtained from the original one (1) by assigning the values ai to the nuclear variables x;', hence the designation clamped nuclei for this form. Within the electronic problem each nuclear position ai is treated as a parameter. For solution of the entire problem, the electronic wave function must be available for all values of these parameters. The energy obtained from the solution of this problem depends on the nuclear parameters and is commonly called the electronic energy. It is usual to think of the potential energy surface, used in the Eckart approach, as formed by adding the electronic and the classical nuclear repulsion energy. But it is not actually possible to use this approach directly to approximate so- lutions, because the Hamiltonian (1) is invariant under uniform translations in the 6 Brian Sutcliffe frame fixed in the laboratory. This means that the centre of molecular mass moves through space like a free particle and the states of a free particle are not quan- tised and eigenfunctions are not square integrable. Before we consider this further let us note another two symmetries of the equation. The molecular Hamiltonian is invariant under all orthogonal transformations (rotation-reflections) of the par- ticle variables in the frame fixed in the laboratory and is also invariant under the permutation of the variable sets of all identical particles. The way in which translational motion can be removed from the problem is well understood from classical mechanics. But it involves an essentially arbitrary choice of translation ally invariant coordinates. If we want to maintain a distinction between electronic and nuclear variables, it seems sensible to define L translation- ally invariant electronic coordinates such that they transform into one another in the standard way under electronic permutations and are unaffected by nuclear permuta- tions. The rotational motion should then be described in a translationally invariant way, entirely in terms of the original nuclear coordinates. This process consists es- sentially of the choice of an orthogonal matrix C fixing a frame in the body of the molecule along with the choice of 3H - 6 internal coordinates qk which describe the changes in nuclear geometry and are invariant under any orthogonal transformation of the original coordinate set. The matrix C can be parameterised by three Euler angles which may be used to define the angular momentum of the system while the sign of detlCl specifies the handedness of the chosen frame. The coordinate transformation from the frame fixed in the laboratory to one fixed in the molecule is clearly a non-linear one and so must fail for some coordinate specification where the jacobian vanishes. So, for example, there will always be a molecular geometry at which, for any particular choice of embedding rule, it is not possible to define three Euler angles and here the transformation will fail. But there will always be a domain of validity and we shall confine attention to that domain. Within that domain the wave function forthe problem has the form +J T(XT ) L <I>~(q,z)IJMk > (5) k=-J where X T denotes the centre of mass coordinate. and q and z the internal nuclear and electronic coordinates respectively while IJ Mk > is an angular momentum eigenfunction depending on the three Euler angles. The translational motion can be completely separated off but the rotational motion can be separated only approx- imately. If the separation between electronic and nuclear motion is valid then the internal motion function can be written in good approximation as (6) for any electronic state p and it is perfectly reasonable now to regard the electronic wave function as a solution of the clamped nuclei problem, a particular choice of the ai yielding a particular choice of the qk. A permutation of the variables describing a set of identical nuclei will naturally leave the centre of mass coordinate unchanged and therefore will not affect the re- moval of translational motion. It will also leave the centre of nuclear mass coordinate Is a molecule in chemistry explicable as a broken symmetry... 7 unchanged. But it is perfectly possible that such a permutation will induce changes in the orientation variables and in the internal coordinates and it might well be the case that it actually mixes the orientation variables and the internal coordinates. 3 The Eckart choice of a frame fixed in the body In the Eckart approach to a fixing a frame in a molecule, the electrons are considered simply as providing a potential for nuclear motion and so in discussing this embed- ding we shall ignore them. Implicitly however, the electronic coordinates are treated relative to the centre of nuclear mass so that the translationally invariant electronic coordinates are just like the original electronic coordinates in that there are just L of them, they are invariant under any permutations of the nuclei and transform in the standard way under electronic permutations. It is perhaps sufficiently clear from this that they cause no deep complications. The frame C fixed in the body is defined entirely in terms of the original nuclear coordinates and is such as to define a redundant set of cartesian coordinates z? according to x7 - X = Cz7 where X is the centre of nuclear mass coordinate. Thus H Lmiz7 = 0 (7) i=1 and the matrix C is chosen as to define a set of cartesians in the frame fixed in the body such that the reference structure is specified by where the ai are constant matrices and, by definition, H Lmiai = 0 i=1 The reference structure is, of course, usually chosen to be the classical molecular geometry. It is assumed chosen to reflect the equilibrium geometry of the molecule as it would be at the minimum of the potential. However for the purposes of carrying through the derivations, there is no need to specify it more closely than as a reference configuration. Defining the displacement coordinates as the specification of the matrix C is completed by requiring that H Lmiai x Pi = (5 (8) i=1 8 Brian Sutcliffe provided that the ai do not define a line. If they do, then they provide only one rather than the required three constraints and the Euler angles of the Eckart frame cannot then be defined. Even when they do not define a line, it is possible that the displacement vectors might, and if they do, the definition of the Eckart frame again will fail. But we assume that this happens only for very large displacements from the reference molecular geometry and thus in a region where the vibrational wave function vanishes. These constraint conditions are on the components of the mass weighted sum over all the vectors, of the vector products of the reference geometry vectors with the displacement vectors. In classical mechanics the vanishing of these components would be interpreted as the system having no internal angular momentum at the reference geometry. The two conditions (7) and (8) defining the embedded frame are often called the Eckart conditions. The standard account of the Eckart formulation in quantum mechanical form is provided by Watson [3] and, for comparison, here zi is used for Watson's ri and ai for his r~. The zi are completely expressible in terms of a set of 3H - 6 internal coordi- nates qk. The internal coordinates are expressed in terms of displacements from the reference geometry by H H qk = L L bexikPexi == L b~Pi' k = 1,2, ... 3H - 6 (9) i=l 0: i=1 where the elements bexik are simply constants which may be regarded as components of a column matrix bik . The range of these coordinates is (-00, (0). Internal coordinates must be linearly independent and so the bik must be linearly independent and in order for an inverse transformation to exist between the internal coordinates and the coordinates in the frame fixed in the laboratory, it is [4] required that i=1 Using the two Eckart conditions given above it follows that (10) where H B = L mi(xi' - X)a; (11) i=1 The 3 by 3 matrix BTB is symmetric and therefore diagonalisable, so functions of it may be properly defined in terms of its eigenvalues. There are in principle, eight (23) possible distinct square root matrices, all such that their square yields the original matrix product. Consistency (see [5]) requires, however, that the positive square roots of each of the eigenvalues be chosen. The eigenvalues cannot be negative but one or more might be zero and if this is so then the Eckart frame cannot be properly defined This is a matter to which we shall return later. Is a molecule in chemistry explicable as a broken symmetry... 9 If we denote the 3 by H matrix of the xi - X as wand the similar collection of all the zi and thus the ai as zn and a then, from (11), B may be written as where m is an H by H diagonal matrix with the nuclear masses along the diagonal. The second Eckart condition (8) can be manipulated to show that it is equivalent to the requirement that the matrix H A = L miziaf == znmaT (12) i=1 is symmetric. Thus and so from (10) Whatever the precise specification made of the inverse square root, if A is singular then C is undefined and the Eckart specification fails. Thus, as mentioned above, if the ai together specify a line, for example if all the ayi and an vanish then A is clearly singular. But there is a special case. If the ai together specify a planar figure, for example if all the azi vanish, then A is again singular but in this case the Eckart conditions may be satisfied by requiring that the Z~i vanish too. The last row and column of A are null and there is a 2 by 2 non-vanishing block and, provided that this block is non-singular, the problem remains well defined. As a matter of fact there is no need in practice to treat this special case explicitly. The second Eckart condition implicitly orients the third axis to be perpendicular to the defined plane and, as long as the two by two block of A is non-singular, the planarity constraints are subsumed. It is possible to write the cartesians expressed in the Eckart frame directly in terms of the coordinates of the frame fixed in the laboratory as: and since the ij-th element of w T w is a scalar product, (xi - xyr (xj - X), the inter- nal cartesians are clearly invariant under any orthogonal transformation, including inversion, of the original nuclear coordinates. 3.1 The effects of nuclear permutations on the Eckart coor- dinates If we represent a particular permutation of identical nuclei by P, where this is a standard orthogonal permutation matrix, then we can express any permutation of nuclei by xn ---t xnp 10 Brian Sutcliffe and we see that this permutation induces in B the change B ---+ wpT maT = wm(aPf = Cznm(ap)T = CAP because the permutation is ofidentical nuclei, and so its representative matrix must commute with the mass matrix. Thus the permutation seems to act as if it changed the reference vectors but these are, of course, quantities fixed at the time of problem specification and hence unchanging parameters once a choice is made. From (10) it follows that under this permutation C ---+ CAP (APT Apr~ == CU If the matrix A P is singular then U will not exist, the change in C will be undefined, and hence the Eckart frame embedding will fail. So no such permutation can be realised in the Eckart formulation and a fundamental symmetry of of the problem is broken. If the permutation is such is that A P is non-singular and symmetric then it can be diagonalised by an orthogonal matrix V with eigenvector columns Vi and eigenvalues a;. In dealing with the square root, consistency again requires the choice of the positive value and U can therefore be written as 3 U = L sgn(af)v;v; ;=1 If all the a; are positive then U is just the unit matrix and if all negative then U is the negative unit matrix and represents an inversion. The form that it actually takes depends on the precise nature of the permutation. If U is a constant matrix, separation between angular and internal coordinates is essentially preserved. The most that can happen is that the internal coordinates go into linear combinations of themselves and the angular momentum components go into linear combinations of themselves. A special case of such a constant matrix is if the permutation is such that (13) then (14) The matrix S could arise as the orthogonal representation matrix of a point group operation on the molecular framework, for the point group of the molecu- lar framework is isomorphic with a subgroup of the full permutation group of the molecule. The subgroup might be the full group, but that is only rarely the case. It is also the case a relation like (13) might be possible for a permutation or set of permutations that are unconnected with any point group operations. When the permutations can be realised by proper or improper rotation matrices, they are often called perrotations, a name introduced by Gilles and Philippot [6J. These matters and many other related ones, are discussed in more detail in Chapters 2 and 3 of Ezra's monograph [5]. In general the permutation will yield an unsymmetric A P but provided it is non-singular then U will be well defined and orthogonal. Is a molecule in chemistry explicable as a broken symmetry... 11 An example As an example we shall choose ethene, which also illustrates the planar special case too: HI H4 \ / CI C2 / \ H2 H3 with the following choice of ai a, ~ ( -n a, ~ ( =n ~ ~ ( -J ) a, ~ ( n ~~ cn~~ 0) Where al ... a4 represent the reference positions of the protons, each with mass mp and a5 and a6 the reference positions of the carbons, each with mass me. The system is specified to lie in the x - y plane and we shall be concerned only with the 2 by 2 non-vanishing sub matrix of A. The point group of the reference figure is D 2h , using the Schonfliess notation which is standard in molecular spectroscopy. The full permutation group is S4 x S2 assuming the nuclei to be identical isotopes of hydrogen and of carbon respectively. The point group contains 8 operations while the permutation group contains 48 operations. It is convenient to rewrite A from (12) as where the split has been made into the proton and the carbon parts of the problem and permutations will occur only within each part. The carbon part of the problem is always of the form ( ±mC(z~6 - z~5)d 0) = ±A o 0 c where the plus sign represents the identity permutation and the mmus SIgn the transposition of the carbons. If Ap is non-singular then and even though Ae is singular there is no reason to expect the second term on the right to be singular, except perhaps for particular values of the coordinates and we can assume that such regions of coordinate space may be treated as unvisited for our present purposes. It is possible that even if Ap is singular (Ap ± Ac) might not be, but this again will happen only in rather special regions. What has be(~n said 12 Brian Sutcliffe in relation to Ap here, will hold equally for any A:. We would anticipate therefore that that any permutation of the protons that yields a non-singular A~' would yield a non-singular A P when taken with either of the possibilities for Ac. If the permuted form is symmetric, then A P will be symmetric, though there is no reason to expect this to be generally the case. With these matters in mind we shall concentrate on the proton permutations in what follows. Looking at the effect of the point group operations on the framework vectors it is easy to see that C2x == (jzx == (12)(34)(5)(6), C2y == (jyz == (14)(23)(56), C2z == i == (13)(24)(56) while (jxy is equivalent to the identity operation in the point group and the identity permutation. Thus the effect of these permutations on Ap is A~A(1 0) p p 0 -1 c ( -1 0 ) Ap :2f Ap 0-1 Adjoining the relevant transformed Ac to each of these matrices allows the constant matrix to be separated as S and we obtain precisely the results for U anticipated from (13) above. The proton permutations above are the complete set of double transpositions for the protons, and we would expect each of them, even when joined with the alternative carbon permutations to those of the point group, to yield non-singular A P . Hence the definition of the Eckart frame would be maintained even for such non-point group permutations. However one would not in general be able to factor out a common constant matrix for such permutations as one can for perrotations and one would expect the relevant U to be functions of the internal coordinates. It is difficult to analyse the problem further at this level of generality. However if we choose the reference values for the zi we can go a little further and if, with this special choice, the Eckart frame definition fails, then the frame certainly cannot be defined for small displacements, which form the most important region for the Eckart approach. For present purposes, therefore, we shall regard A as if composed amaT and the transformation as With these restrictions Ap = (4mt2 4~pb2) and Ac = (2mod2 ~) Looking at the six proton transpositions, four yield singular A: but two, (13) and (24), do not. The permutation (12) gives Is a molecule in chemistry explicable as a broken symmetry... 13 and the matrix for (34) is the same. In neither case can the singularity be removed with the aid of Ac. The permutation (14) gives and the matrix for (23) is the same. In both cases the singularity can be removed with the aid of Ac. The permutation (24) yields and (13) is the same but with positive off-diagonal elements. Both matrices are non-singular and the Eckart frame can certainly be well defined in the case of either of these permutations associated with either of the carbon permutations. The eight permutations involving just three protons all lead to singular A; of the form so for none of these permutations can a satisfactory Eckart embedding be defined even when Ac is considered. Of the six permutations involving all four protons two, (1324) and (1423), yield singular A; of the form and two, (1243) and (1342), yield singular A; like This matrix can be rendered non-singular with the aid of Ac. The permutation (1234) yields the non-singular but unsymmetric A; and (4321) yields the negative of this matrix. Thus in the region of the reference geometry, of the 24 possible proton permu- tations, 16 lead to singular A; and of these only 4 can lead to non-singular A P The remaining 8 proton permutations form a group and all lead tonon-singular A P combined with either of the possible carbon permutations. This group has itself a sub-group of order 4 which gives rise to the perrotations discussed above. 14 Brian Sutcliffe 3.2 Discussion and conclusions What emerges from what has been said above is precisely how the imposition of a classical molecular geometry as a reference structure on a system in order to sepa- rate its internal motions from its rotations, breaks the permutational symmetry of the problem. It does so in such a way as to preclude a proper separation of rota- tions and internal motions if full permutational symmetrisation is attempted. That a molecular structure seems plausible within a quantum mechanical context, is un- doubtedly due to the role played by the solutions of the clamped nuclei Hamiltonian in considering the separation of electronic and nuclear motion. Indeed just this point was made many years ago in the first section of a very interesting paper by Berry [7]. But it is perhaps worthwhile pointing out that, providing that the electronic coordinates are chosen relative to the centre of nuclear mass, there is absolutely no reason why the Eckart conditions should not be imposed on the nuclear part of the full problem, without any explicit separation of electronic motion. It is our classical preconceptions about a molecule that cause the trouble, rather than anything to do with the perceived virtues or limitations of that separation usually attributed to Born and Oppenheimer. This observation, if correct, must limit too the validity (though not, of course the utility) of the notion of feasible permutations, introduced by Longuet-Higgins [8] to deal with this problem and so much developed by later workers (see, for example [9]). Since the problem is entirely of our own making it would seem foolish to persist in our error, particularly when there is an obvious way in which it can be avoided. If we choose the matrix C defining the rotational variables to be such as to diagonalise the instantaneous inertia tensor then all particles with the same mass will enter into the definition equivalently and permutational equivalence of the rotational variables will follow automatically. And indeed two of the first attempts to separate rotational from vibrational motion two, [10] and [11], adopted precisely this approach. It turns out that in this approach the definition of the frame embedded in the system depends upon the reciprocals of differences between instantaneous moments of inertia and therefore fails whenever two moments are equal. Thus the Hamiltonian could not be used to describe any symmetrical top molecule, for example ammonia. Indeed it turns out more generally, inspite of some heroic efforts by van Vleck [12], that the Hamiltonian so derived is largely ineffective in describing molecules in terms of their traditional geometrical structures and so it has found no use in the elucidation of molecular spectra. It seems therefore that at present molecular structure has to be incorporated into the Hamiltonian in order to describe spectra effectively but that the price for this incorporation is that of a fundamental symmetry breaking. Bibliography [1] M. Born and J.R. Oppenheimer, Ann.der Phys., 192784,457. [2] C. Eckart, Phys. Rev., 1935,47,552. [3] J.K.G. Watson, Mol. Phys., 1968, 15,479. [4] R. J. Malhiot and S. M. Ferigle, J.Chem. Phys, 22, 717-719, (1954). [5] G. Ezra, Symmetry properties of molecules, Lecture Notes in Chemistry 28, Springer-Verlag, Berlin, 1982. [6] J. M. F. Gilles and J. Philippot, Int. journ. quant. chem., 1972, 6, 225. [7] R. S. Berry, Rev. Mod. Phys., 1960, 32, 447. [8] H. C. Longuet-Higgins, Molec. Phys., 1963, 6, 445. [9] P. R. Bunker, Molecular Symmetry and Spectroscopy, Academic Press, Lon- don, 1979. [10] C. Eckart, Phys. Rev., 1934., 46, 384. [11] J. O. Hirschfelder and E. Wigner, Proc. Nat. Acad. Sci., 1935,21, 113. [12] J. H. van Vleck, Phy". Rev., 1935,47,487. Chapter 2 SCF algorithms for HF electronic calculations Eric Cand~s CERMICS, Ecole Nationale des Pants et Chaussees, 6 fj 8, avenue Blaise Pascal, Cite Descartes, F-77455 Marne-La- Vallee Cedex, France cances@cermics.enpc.fr Abstract: This paper presents some mathematical results on SCF algorithms for solving the Hartree-Fock problem. In the first part of the article the focus is on two classical SCF procedures, namely the Roothaan algorithm and the level-shifting algorithm. It is demonstrated that the Roothaan algorithm either converges towards a solution to the Hartree-Fock equations or oscillates between two states which are not solution to the Hartree-Fock equations, any other behavior (oscillations between more than two states, "chaotic" behavior, ... ) being excluded. The level-shifting algorithm is then proved to converge for large enough shift parameter, whatever the initial guess. The second part of the article details the convergence properties of a new algorithm recently introduced by Le Bris and the author, the so-called Optimal Damping Algorithm (ODA). Basic numerical simulations pointing out the principal features of the various algorithms under study are also provided. M. Defranceschi et al., Mathematical Models and Methods for Ab Initio Quantum Chemistry © Springer-Verlag Berlin Heidelberg 2000 18 Eric Cances 1 Introduction The Hartree-Fock (HF) model is a standard tool for computing an approximation of the ground state of a molecular system within the Born-Oppenheimer setting. From a mathematical viewpoint, the HF model gives rise to a nonquadratic constrained minimization problem for the numerical solution of which iterative procedures are needed; such procedures are referred to as Self-Consistent Field (SCF) algorithms. The solution to the HF problem can be obtained either by directly minimizing the HF energy functional [6, 11, 17, 25] or by solving the associated Euler-Lagrange equations, the so-called Hartree-Fock equations [20, 21, 22]. SCF algorithms for solving the HF equations are in general much more efficient than direct energy minimization techniques. However, these algorithms do not a priori ensure the decrease of the energy and they may lead to convergence problems [23]. For instance, the famous Roothaan algorithm (see [21] and section 4) is known to sometimes lead to stable oscillations between two states, none of t.hem being a solution to the HF problem. This situation may occur even for simple chemical systems (see section 4). Many articles have been devoted to the important issue of the SCF convergence. The behavior of the Roothaan algorithm is notably investigated in [2, 12] and in [26, 27]. In [2, 12] convergence difficulties are demonstrated for elementary two- dimensional models; in [26, 27], a stability condition of the Roothaan algorithm in the neighbourhood of a minimum of the HF energy is given for closed-shell sys- tems. More sophisticated SCF algorithms for solving the HF equations have also been proposed to improve the convergence using various techniques like for instance damping [10, 28] or level-shifting [22]. Damping (as implemented in [28]) cures some convergence problems but many other remain. Numerical tests <confirm that the level-shifting algorithm converges towards a solution to the HF equations for large enough shift parameters; a perturbation argument is provided in [22] to prove this convergence in the neighborhood of a stationary point. Unfortunately there is no guarantee that the so-obtained critical point of the HF energy functional is actually a minimum (even local); in addition, the level-shifting algorithm is known to only offer a slow speed of convergence. In practice, the most commonly used SCF algorithm is at the present time the Direct Inversion in the Iteration Space (DIIS)algorithm [20]. Numerical tests show that this algorithm is very efficient in most cases, but that it sometimes fails. The present article belongs to a series of articles [3, 4] devoted to the SCF algorithms. Our first purpose here is to report on recent mathematical results on the conver- gence properties of the Roothaan and of the level-shifting algorithms. Section 4 concerns the Roothaan algorithm, which is the most "natural" algorithm for solving the HF equations. Its is demonstrated that the Roothaan algorithm either converges towards a solution to the HF equations or oscillates between two states which are not solution to the HF equations, any other behavior being excluded. This theo- retical result is in accordance with the numerical experiments. It is then explained in Section 5 why the introduction of a "level-shift" makes the algorithm converge. SCF algorithms for HF electronic calculations 19 The mathematical proofs are presented in the context of the finite dimension ap- proximations of the HF problem obtained by a Galerkin method with a finite basis of atomic orbitals or plane waves, typically. They are consequently much simpler from a technical viewpoint than the proofs detailed in [3] which concern the original infinite dimension HF problem. Recently, new SCF algorithms have been introduced in [4] by Le Bris and the author. They seem to exhibit good convergence properties at least for the chemical systems computed so far. These algorit.hms have been called Relaxed Constraints Algorithms (RCA) for they can be interpreted as direct minimization procedure of the HF energy which do not care about satisfying at each it.eration the nonlinear constraints D2 = D that characterize admissible density matrices. The second purpose of this article (Section 6) is to detail the mathematical proof of the convergence of the basic RCA, namely the Opt.imal Damping Algorithm (ODA). Section 6 also contains some comments on the connections between RCA and other algorithms like the level- shifting and the DIIS algorithms. Before coming up to our main topic, we devot.e Sect.ion 2 t.o a brief presentation of the HF model for readers (especially mathematicians) who are not familiar with Quantum Chemistry. Section 3 collects various general comments that apply to all the SCF algorithms considered in the sequel. 2 A brief presentation of the Hartree-Fock model The problem under consideration consists in computing ab initio, that is to say without using any empirical parameter, the ground state energy of a molecular system made of M nuclei and N electrons. Tackling directly the M + N-body Schrodinger equation is today, and will probably remain, out of the scope of brute force numerical methods. Various approximations are therefore to be resorted to. The first approximation that is common to most models of Quantum Chemistry is the so-called Born-Oppenheimer approximation. To make short, it consists in considering the nuclei as classical point particles. The Born-Oppenheimer approx- imation, which has been mathematically founded by Combes and al. [5], lays on the fact that nuclei are much heavier than electrons. The Born-Oppenheimer ap- proximation is almost always valid in Chemistry (except for instance for studying specifically quantum phenomena involving nuclei as proton transfer by tunnel effect) and is therefore almost always used. Within the Born-Oppenheimer approximation, the searching for the ground state takes the form of two nested minimization problems: (1) with 20 Eric Cances N 1 N M Zk 1 H{xd = - L -~Xi - L L - + L i=l 2 i=l k=l IXi - Xk I l<'<J<N lx, - xJ I N 1i = 1\ L 2 (R3 X {I+), I-)} , C) i=l In the above expressions, Xk denotes the current position in R3 of the k-th nucleus and Zk its charge. The N electrons are described by a wave function l/J( Xl, 0"1; ... ; X N, O"N), where Xi and O"i are respectively the position in 1R3 and the spin coordinate of the i-th electron. Each spin coordinate O"i can take two values here denoted by 1+) (spin up) and 1-) (spin down). The wave function l/J is a normalized vector of the fermionic Hilbert space 1i, and therefore satisfies on the one hand the antisymmetry condition l/J(Xp(l) , O"p(l);···; Xp(N), O"p(N)) = (-1)'(P)l/J(X1, 0"1;···; XN, O"N) for any permutation p of 1[1, N]I (c(p) denoting the signature of p), and on the other hand the normalization condition The operator H{xd is the so-called electronic hamiltonian. It acts on 1i; the Xk play the role of parameters. It is made of three terms, the first term accounting for the kinetic energy of the electrons, the second and the third terms accounting for nuclei-electrons and electrons-electrons interactions respectively. All physical quantities are expressed in atomic units [18]. Searching for the ground-state of the molecular system thus consists in minimizing the potential energy W (Xl, ... , X M) by solving the so-called geometry optimization problem (1). From the mathematical point of view, problem (1) is an unconstrained minimization problem of finite dimension. We refer the reader to [19, 24] for an overview of the various numerical methods dedicated to geometry optimization. The specificity of problem (1) is that the function to be minimized, namely the potential energy W, is itself the result (up to the inter-nuclear repulsion term L zkzL/lxk - xt!) of the minimization problem (2) which is usually referred to as the electronic problem. We face this time a constrained minimization problem on the infinite dimension space 1i. In the sequel, we focus on the electronic problem, which is rewritten (in order to simplify the notations) inf{ (l/J, Hl/J), l/J E 1i, Ill/JII = 1} (3) with N 1i = /\ L 2 (R 3 X {I + ), 1-) }, C ) i=l SCF algorithms for HF electronic calculations 21 N 1 N 1 H = - L -6Xi + LV(Xi) + L ;=1 2 i=1 1 <i<j<N IXi - Xj I M Zk V(x) = - L I - I k=1 X - Xk the Xk being now fixed parameters in lR3 . The Hartree-Fock approximation is of variational nature. It consists in restricting the set {1/! E 11., II1/! II = I} on which the energy functional (1/!, H1/!) is minimized to the set of the Slater determinants, i.e. to the set of the wave functions 1/! of the form (4) where the rPi, which are called molecular orbitals, satisfy the orthonormality condi- tions A classical calculation (see [18] for instance) gives for any 1/) of the form (4) with N T<I>(X, 0"; x', 0"') = L rPi (x, 0") rPi(X', 0"')*, i=1 N P<I>(x) = L L IrPi(X, O"W· i=l a The HF problem thus reads The mathematical properties of the HF problem have been studied by Lieb and Simon [14] and by P.-L. Lions [15]. The existence of a HF electronic ground state is guaranteed for positive ions (Z := L~I Zk > N) and neutral systems (Z = N). We are not aware of any general existence result for negative ions (the available existence proofs only work for N < Z + 1). On the other hand, there is a non-existence results for negative ions such that N > 2Z + M [13] (this inequality holds for instance for 22 Eric CancEls the ion H2-). As far as we know, uniqueness (of the density p at least) is an open problem, probably of outstanding difficulty. The last step of the approximation procedure consists in approaching the infinite dimensional HF problem by a finite dimensional HF problem by means of a Galerkin approximation: the HF energy is minimized over the set of molecular orbitals that can be expanded on a given finite basis {xp} 1 < <n: -p- n rPi = LCkiXk. k=1 Denoting by S = [Sktl with the so-called overlap matrix, the constraints L k3 rPirP; = 6ij read cr or in matrix form C*SC=IN , where IN denotes the identity matrix of rank N. In addition, t G k31VrPd2 + k3 VlrPil2) ~ (~ k31v ~ CikXkl2+ k3 V I~ CikX{) N n n L L L hklCliCki i=1 k=1 1=1 Tr (hCC*) where h denotes the matrix of the core hamiltonian -~Ll + V in the basis {Xk}: Lastly, denoting by the inter-electronic repulsion term reads r r p.p(x) p.p(x') , r r IT.p(x, x')12 , n N * * if 3 if 3 I _ 'I dx dx - if 3 if 3 I _ 'I dx dx = L L AijklCioCje,ck/3Cl/3· R R X X R R X X i,j,k,I=10,/3=1 SCF algorithms for HF electronic calculations 23 The above expressions incite one to introduce the so-called density matrix D = CC*, which permits to write the HF energy under the compact form EHF(D) = Tr (hD) + ~Tr (G(D)D), where G(D) denotes the contracted product of the 4-index tensor A by D: G(D)ij = (A : D)ij = L AijklDkl. kl It is easy to see that the matrices D which read D = CC* with C E M(n, N) and C*SC = IN are those which satisfy Tr (SD) = Nand DSD = D. The so-obtained finite dimension HF problem then reads inf {EHF(D), DE M(n, n), D* = D, Tr (SD) = N, DSD = D}. For the sake of simplicity, we assume in the sequel that the overlap matrix S equals identity, that is to say that the basis {XpL<p<n is orthonormal. The general case is recovered by the transformation rules D -+- 51/ 2 DSI/2, h -+ S-1/2hS- l/2, G(D) -+ S-1/2G(Sl/2DS1/2)S-1/2. The HF problem then reads: (5) with p = {D E M(n, n), D* = D, Tr D = N, D2 = D}. The following lemma provides a characterization of the critical points of the HF minimization problem (5). Lemma 1. For any D, let us denote by F(D) = h+G(D) the Fock matrix associated with D. 1. A density matrix DE P is a critical point of the HF problem (5) if and only if { F(D)C = CE C'C = IN D=CC* (6) where E = Diag( 1'1,1'2, ... , EN) is a N x N diagonal matrix collecting N eigen- values of the linear eigenvalue problem F(D)· ¢ = E¢. and where C is a n x N matrix containing N orthonor'mal eigenvectors asso- ciated with 1'1, E2, ... , EN· The condition (6) is equivalent to the condition [F(D), D] = 0, where [".J denotes the matrix commutator defined for any A and Bin M(n, 71) by [A,B] = AB - BA. 24 Eric Cand~s 2. For D E P being a local minimum of the HF problem, it is necessary that (1, E2, ... , EN are the smallest N eigenvalues of F(D) including multiplicity. This result is classical; its proof can be read in any textbook of Quantum Chemistry (see [18] for instance). Remark. The model described above is the so-called General Hartree-Fock (GHF) model. Most often in practice, Quantum Chemistry calculations are performed with spin constraints models like the Restricted Hartree-Fock (RHF), the Unrestricted Hartree-Fock (UHF) or the Restricted Open-shell Hartree-Fock (ROHF) models. The convergence results stated below can be adapted without difficulties to these models. (; 3 General remarks on SCF algorithms Various SCF algorithms are studied in the following three sections. All of them consist in generating a sequence (Dk ) defined by { FkCk+1 = Ck+1 Ek+l Ck+1Ck+1 = IN Dk+l = Ck+l Ck+1 (7) h E D· (k+l k+l) k+l < k+1 < < k+1 b . h· I were k+l = lag E1 , ... ,EN ,E1 _ E2 _ ... _ En emg t e e1genva ues of the linear eigenvalue problem and where Ck+1 collects N orthonormal eigenvectors associated with f~+l, E~+l, ... , Et+1. The expression of the current Fock matrix Fk characterizes the algorithm. We have for instance • Fk = F(Dk ) for the Roothaan algorithm; • Fk = F(Dk) -bDk where b is a positive constant for the level-shifting algorithm; • Fk = F(15k) for the ODA, where 15k is a pseudo-density matrix which satisfies the relaxed constraints 15~ ::::; Dk and is defined so that the HF energy EHF (15k) decreases at each iteration (see section 6). The procedure consisting in assembling the matrix D k+1 E P by populating the N molecular orbitals of lowest energies of the current Fock matrix Fk is referred to as the aufbau principle. It is justified by the results stated in Lemma 1. For the matrix Dk+1 being defined in a unique way, it suffices that Et+1 < Et+;l. Degeneracies in the spectrum are in general related to the symmetries of the system: in the cases when the system does not exhibit any symmetry, numerical experiments show that the eigenvalues of Fk are generically non-degenerate for any k, whereas it may not be the case when the system does exhibit symmetries (consider for instance the spherical SCF algorithms for HF electronic calculations 25 symmetry of the hamiltonian in the atomic case). Degeneracies create technical difficulties which complicate the theoretical studies on SCF convergence. For the sake of simplicity, we therefore assume from now on that the uniform well-posedness (UWP) property introduced in [3] is satisfied: UWP property: a SCF algorithm of the form (7) with initial guess Do will be said to be uniformly well-posed if there exists some positive con8tant I such that The consequences of the UWP assumption which will be useful below have been collected in the following lemma, whose proof is postponed until the end of the present section. Lemma 2. Let us consider a SCF algorithm of the form (7) with initial guess Do which satisfies the UWP property. Then 1. The updated density matrix Dk +1 is defined in a unique way at each iteration; this matrix can be characterized as the minimizer of the variational problem 2. For any D E M(n,n) 8uch that D = D*, Tr (D) = Nand D2 S D, - - I 2 Tr (FkD) ::: Tr (FkDk+d + 211D - Dk+111 , 11·11 denoting the Hilbert-Schmidt norm defined for any A E M(n, n) by IIAII = Tr (AA*)1/2. In the sequel, we denote by arg inf MP the minimizer of the minimization problem MP. We can therefore write Dk+l = arg inf {Tr ChD), DE p}. Remark. Let us point out that some convergence results can be obtained without resorting to the UWP assumption. In particular, it turns out that the level-shifting algorithm is automatically UWP as soon as the shift parameter is large enough (see [3] for details). It can also be proved that the ODA numerically converges towards an aufbau solution to the HF equations within the GHF setting and provided the basis is "large enough". We do not detail here the rather technical proof of this assertion. Let us just mention that it is based on a mathematical result by P.-L. Lions [16] related to finite-temperature HF models. Unfortunately, so far as we know, the arguments used in [16] cannot be extended to the RHF, L"HF or ROHF models. <> Before turning to the study of SCF algorithms, the notion of convergence has to be made precise. We are in fact not able to prove mathematical convergence results of 26 Eric Cances the form "the sequence (Dk) converges towards a minimizer D of the HF problem (5)" for at least two reasons. First, we are solving the Euler-Lagrange equations associated with the HF minimization problem (5), namely the HF equations (6); even in case of convergence we have no argument to conclude that the so-obtained critical point is actually a minimum (even local) of the HF energy. Second, we have no precise description of the topology of the set of the critical points of (5); this lack of information prevents us from proving the convergence of the whole sequence (Dk ) towards a solution D to the HF equations. We can at best obtain that Dk+l - Dk goes to zero, and that for "large" k, Dk is "close to" a solution to the HF equations (6) satisfying the aufbau principle. For instance, it may happen that the HF problem admits a connected manifold of minima; this phenomenon is observed in particular for open-shell atoms because the spherical symmetry of the problem is broken by the HF approximation (this can be related to a mathematical result by Bach, Lieb, Loss and Solovej [1] stating that "there are no unfilled shell" in the HF ground states). We cannot then discriminate between the casewhen the sequence (Dk ) converges towards a point of the manifold and the case when the sequence (Dk ) is attracted by the manifold together with a slow drift parallel to the manifold. We shall consequently adopt here the following two convergence criteria, which are sufficient in practice. We shall say that a SCF algorithm of the form (7) numerically converges towards a solution to the HF equations if the sequence (Dk) satisfies and that it numerically converges towards an aufbau solution to the HF equations if the sequence (Dk) satisfies As all norms are equivalent in finite dimension, we do not need to specify the ma- trix norm in which the variations are evaluated. Let us remark that the latter convergence criterion is stronger than the former one for Let us conclude this section with the Proof of Lemma 2. Let us denote by D a current matrix such that D = D*, Tr (D) = N, D2 ::; D and by Dij its coefficients in an orthonormal basis in which Fk = Diag( f~+l, f~+l, ... , f~+l) with f~+l ::; f~+l ::; '" ::; f~+l. In such a basis Dk+1 = Diag(l, .. ·,1,0, ... ,0). As in addition Tr (D) = I:?=1 Dii = N, we get first Tr ((Dk+1 - D)· (Dk+1 - D)) Tr (m+l) + Tr (D2) - 2 Tr (DDk+1) SCF algorithms for HF electronic calculations 27 < Tr (Dk+l ) + Tr (D) - 2Tr (DDk+d N 2N - 2LDii i=1 n = 2 L D ii · i=N+l 0:::; Dii :::; 1, for any 1 :::; i :::; n for D2 :::; D = D* implies IDid2 + LJt'i IDiJI2 :::; Dii . Putting together the above results, we obtain n L E7+1Dii i=1 N n > L E7+1Dii+ L (Et+I+,)Dii i=1 i=N+I N n n L E7+ 1 Dii + ft+ 1 L Dii +, L D'i i=1 i=N+l i=N+1 N N n L E7+ 1 Dii + Et+I(N - L Dii ) +, L Dii i=1 i=1 i=N+l N N n L E7+l + L(Et+ 1 - E7+ 1)(1- D ii ) +, L D i ,. i=1 i=1 As for any 1 :::; i :::; N, 0 :::; Dii :::; 1 and Et+1 2: E7+ 1, we finally obtain - - , 2 Tr (FkD) 2: Tr (FkDk+d + 2" liD - Dk+1 11 . The two statements of Lemma 2 follow. <) 4 The Roothaan algorithm: why and how it fails The Roothaan algorithm (also called simple SCF or pur'e SCF or conventional SCF in the literature) is the simplest fixed point procedure associated with the nonlinear eigenvalue problem (6). It consists in generating a sequence (D:th) in P satisfying { F(D:th)Ck+l = Ck+1Ek+1 Ck+I Ck+1 = IN DRth C C* k+l = k+l k+l h E D· (k+l k+l) k+l < k+l < < k+l b' h M II were k+l = lag El ,"',EN , EI _ E2 _'" _ EN elllg t e jV sma est eigenvalues of the linear eigenvalue problem 28 Eric Cand~s and where the n x N matrix CHI collects N orthonormal eigenvectors of F(Dfth) associated with (~+I, (~+1, ... , (~+1. The iteration procedure of the Roothaan algo- rithm can therefore be summarized by the diagram DRth k+l· The convergence properties of the Roothaan algorithm are not satisfactory: although the Roothaan algorithm sometimes numerically converges towards a solution to the HF equations, it frequently numerically oscillates between two states, none of them being solution to the HF equations. Numerical oscillation between two states means here that DRth _ DRth --+ 0 k+2 k , but DRth _ DRth ~ 0 k+1 k -,r . The behavior of the Roothaan algorithm can be explained by introducing the aux- iliary function E(D, D') = Tr (hD) + Tr (hD') + Tr (G(D) D'), which is symmetric since Tr (G(D) D') = Tr (G(D') D), and which satisfies E(D, D) = 2 EHF(D). Let us indeed minimize E alternatively with respect to each of the two arguments D and D': DI=arginf{E(Do,D), DEP}, D2 = arg inf{E(D, DI), DE P}, D3 = arg inf {E(D2' D), DE P}, This minimization procedure is usually called relaxation in the mathematical liter- ature. For the first two steps, we obtain Dl arg inf {E(Do, D), DE P} arg inf {Tr (hDo) + Tr (hD) + Tr (G(Do)D), DE P} arg inf {Tr (F(Do)D), D E P} DRth 1 , and, since E is symmetric on P x P, D2 arg inf {E(D, Drth ), DE p} arg inf {E(Drth, D), D E p} arg in£{ Tr (hD) + Tr (hDfth) + Tr (G(Drth)D), DE p} arg in£{ Tr (F ( Dfth ) D) , D E p} Dfth. SCF algorithms for HF electronic calculations 29 It follows by induction that the sequences generated by the relaxation algorithm on the one hand, and by the Roothaan algorithm on the other hand, are the same. The functional E, which decreases at each iteration of the relaxation procedure can therefore be interpreted as a Lyapunov functional of the Roothaan algorithm. This basic remark is the foundation of the proof of the following result. Theorem 1. Let Do E P such that the Roothaan algorithm with initial guess Do is UWP. Then the sequence (Dfth) generated by the Roothaan algorithm satisfies one of the following two properties • either (Dfth) numerically converges towards an aufbau solution to the HF equa- tions • or (Dfth) numerically oscillates between two states, none of them being an aufbau solution to the HF equations. Proof. For any kEN, we deduce from Lemma 2 that Tr (F(DRth)DRth) + 211DRth - DRthl12 < Tr (F(DRth)DRth) k+1 k+2 2 k+2 k - k+l k . Adding Tr (hDt:!1) to both terms of the above inequality, we obtain E(DRth DRth) + 211DRth _ DRthl12 < E(DRth DRth) k+1' k+2 2 k+2 k - k' k+l' We then sum up the above inequalities for kEN and we get LkEN IIDf~1- Dfth 112 < +00, which involves in particular that DRth _ DRth --+ 0 k+2 k . Now, either Df~1 - Dfth converges to zero or it does not. In the former case, we deduce from the characterization of Df~1 by that Tr (F(Dfth)Dfth) - inf {Tr (F(Dfth)D), DE p} --+ O. Convergence towards an aufbau solution to the HF equations is thus established. In the latter case Tr (F(D:~h)D:~h) - inf {Tr (F(D:~h)D), DE p} = Tr (F(D:~h)D:~h) - Tr (F(D:~h)D:~~l) > 211DRth _ DRth 112_Jc O. - 2 2k 2k+l--r' Convergence of (D2k) towards an aufbau solution to the HF equations is therefore excluded; the same argument holds for (D2k+d· <> 30 Eric Cances Mimicking the proof of Theorem 2 (see section 5), it is easy to establish in addition that (D2k> D2k+1) converges up to an extraction to a critical point (D, D') E P x P of the functional E which satisfies F(D')C = CE C'C = IN D=CC' F(D)C' = C'E' C"C' = IN D' = C'C'* where E and E' are diagonal matrices collecting the smallest N eigenvalues of F(D') and F(D) respectively. Besides, as E(D2k' D2k+1) is decreasing, the whole sequence (D2k' D2k+1) converges to (D, D') if this critical point is a strict (local) minimum. In this case, the alternatives are • either (D, D') is on the diagonal of P x P (i.e. D = D') and (Dfth) converges towards an au/bau solution to the HF equations; • or (D, D') is not on the diagonal of P x P (i.e. D #- D') and (Dfth) oscillates between two states which are not au/bau solutions to the HF equations. Both situations are represented on Figure 2.1. 0' 0' , ......-::--:::-:---.F----- ~ D __________ ~--~~__.~ D Figure 2.1: Minimization of E by relaxation: convergence towards a strict local minimum located on (resp. off) the "diagonal" leads to the convergence (resp. oscillations) of the Roothaan algorithm. Oscillations can be observed even for simple chemical systems. As a matter of ex- ample, we have tested the Roothaan algorithm in the UHF setting for the atoms of the periodic table and for two sets of atomic orbitals, namely the Gaussian ba- sis sets 3-21G and 6-311++G(3df,3pd) (see [9]). The initial guess is obtained by diagonalization of the core hamiltonian. Calculations have been performed with Gaussian 98 [8]. The results are reported in Figure 2.2; they indicate that SCF algorithms for HF electronic calculations 31 1. Both alternatives (convergence vs oscillation) are met in practice. 2. Convergence towards a critical point of the HF problem which is not a global minimum can sometimes be observed. 3. For the same system, we can getconvergence for one basis set and oscillation for another basis set. 111111111111111 111111111111111 111111111111111 111111111111111 ......................................... • ._ ........ __ 10_ .. ',- ---- o __ ..-IoC-,. Figure 2.2: Searching the ground state of atoms with the Roothaan algorithm. Results are shown on the periodic table of the elements. 5 Level-shifting The analysis developed in the previous section suggests to add to the functional E a penalty term Ep of the off-diagonal pairs (D, D') with D # D' in order to enforce the critical points of the functional E + Ep to lie on the diagonal of P x P, which should ensure convergence towards a critical point of (5). A simple penalty fonctional reads Ep = biiD - D'11 2 , where b is a positive constant and where II . II denotes as above the Hilbert-Schmidt norm. Let us therefore set Eb(D, D') = Tr (hD) + Tr (hD') + Tr (G(D) D') + b liD - D'I1 2 The relaxation algorithm associated with the minimization problem in£{ Eb(D, D') , (D , D') E P x p} 32 Eric Cances generates the sequence (D~) defined by aufbau Db ---+ k+1' The sequence (Dn can be identified with the sequence generated by the so-called level-shifting algorithm [22] with level-shift parameter b. The convergence of the level-shifting algorithm towards a (non necessarily aufbau) solution to the HF equa- tions is mathematically guaranteed: Theorem 2. There exists a positive constant bo such that for any Do E P and for any level-shift parameter b 2: bOl 1. The sequence of the energies EH F (D~) decreases towards some stationary value E of EHF. 2. The sequence (D~) numerically converges towards a solution to the HF equa- tions. Proof. Let bo be a positive constant such that V(D, D') E M(n, n) x M(n, n), Tr (G(D - D') . (D - D')) :::; bollD - D'11 2 . (8) Such a bo exists since (d, d') f-7 Tr (G (d)d') is a bilinear form on M (n, n). As Eb is symmetric on P x P, we have for any k E IN, Eb(DZ, DZ+1) in£{ Eb(DZ, D), DE p} < Eb(D%, DZ). A simple calculation shows that this inequality can be rewritten as Therefore, for any b 2: bo, (9) It follows that (EHF(D~)) is a decreasing sequence and that +00 L IID%+l - DtW < +00. k=O The latter statement, which has been obtained by summing the inequalities (9) for k 2: 0, implies in particular that As for any k E IN, SCF algorithms for HF electronic calculations 33 it follows that This concludes the proof of statement 2. As P is compact, we can extract from (D%)kEN a subsequence (Dt)IEN which converges towards some D E P, such that EHF(D%) -I- EHF(D) and [F(D),D] = 0; E. = limEHF(Dk) = EHF(D) is therefore a stationary value of the HF energy. <) The level-shift parameter bo implicitly defined by (8) is far from being optimal. Ex- plicit and more refined estimates of shift parameters that guarantee convergence are given in [3]. From a numerical viewpoint, it is important to choose not too large a shift parameter; otherwise the speed of convergence is very slow and the risk of con- verging towards a critical point whose energy is above that of the HF ground state is enhanced. This point, on which we will come back in the course of section 6, is illus- trated by the numerical example reported on Figure 2.3 in which the level-shifting algorithm with various shift parameters has been used to compute the (doublet) UHF ground state of the Bromine atom in the Gaussian basis 6-311 ++G(3df,3pd) (see [9]). In each case, the initial guess is computed by diagonalizing the core hamil- tonian. Calculations have been performed with Gaussian 98 [8]. The algorithm oscillates for small shift parameters. For larger shift parameters, damped oscilla- tions leading to convergence are observed. For very large shift parameter, the energy decreases at each iteration. Too large shift parameters have however to be excluded because they slow down the convergence (for b = 30.0 Ha, convergence towards the ground state is obtained after more than 200 iterations). 6 The Optimal Damping Algorithm The present section is devoted to the mathematical study of the Optimal Damp- ing Algorithm (ODA) which is the simplest representative of the class of Relaxed Constraints Algorithms (RCA) introduced in [4]. The ODA is defined by the following two-step iteration procedure l. Diagonalize the current Fock matrix Fk = F(i5k) and assemble the matrix Dk +1 E P by the aufbau principle; denotes the line segment linking Dk and Dk +1 . The procedure is initialized with Do = Do, Do E P being a given initial guess. The ODA thus generates two sequences of matrices: 34 Eric Cances .- - L .. r • , .- .. .. .. .. .. .. .. .. - 1- :;::.:; ~:.~ 1 r-· ::=:;!!.--l .... - 1- ~ 1- _\ - \ - c .. .. .. .. .. - I .. :=:;!:.. ] ,- I_ I • • .- .. .. .. .. .. .. .. • .. - Figure 2.3: Calculation of the UHF ground state of the Bromine atom . • The principal sequence of density matrices (Dk)kEN which will be proved to numerically converge towards an aufbau solution to the HF equations; • A secondary sequence (Dkh?l of pseudo-density matrices which belong to the set P = {15 E M(n,n), Tr (D) = N, obtained from P by relaxing the nonlinear constraints D2 = D. The latter statement is a direct consequence of Lemma 3. The set P is convex, whose proof is postponed until the end of the present section. Indeed, Do = Do E PcP and, by ind~lction, if 15k E P then by convexity Dk+1 E Seg[Dk, D k+1] C P since D k+1 E PCP. SCF algorithms for HF electronic calculations 35 The properties of the ODA are put together in the following theorem. Theorem 3. For any initial guess Do for which the aDA is UWP, 1. The sequence E(i5k ) decreases towards a stationar'y value of the HF energy. 2. The sequence (DkhEN converges towards an aufbau solution to the HF equa- tions. As a first step towards the understanding of the ODA, let us consider 15k E 15 and D' E P, and let us compute the variation of the HF energy on the line segment Seg[15k , D'] = {(I - A)15k + AD', A E [0, I]}. We obtain for any A E [0,1]' EHF((l - A)15k + AD') = EHF (15k) + ATr (F(15k) . (D' - 15k)) + ~2 Tr (G(D' - 15k) . (D' - 15k)) . The "steepest descent" direction, i.e. the density matrix D for which the slope Sjjk-+ D = Tr (F(15k) . (D - 15k)) is minimum, is given by the solution to the mini- mization problem which also reads D = arg inf {Tr (F(Dk) . D'), D' E.P} . This is precisely the direction Dk+1 obtained by the aufbau principle. The ODA can therefore be interpreted as a steepest descent algorithm in 15. The practical implementation of the ODA is detailed in [4] for the RHF setting. The cost of one ODA iteration is approximatively the same as the cost of one iteration of the Roothaan algorithm (see [4] for details). Figure 2.4 reports on a comparison between the ODA and the DIIS approaches for the calculation of the RHF ground state of the E form of n-methyl-2-nitrovinylamine (CH3-NH-CH=CH-N02 ) in the basis 6-31G(d) (see [7]). The speed of convergence is estimated by computing the logarithm of the difference between the HF energy of the current density matrix and the (presumed) HF ground state energy. Calculations have been performed within Gaussian 98 [8]. The graph on the left hand side corresponds to an initial guess computed by a semi-empirical method. In this case, both algorithms converge but the speed of convergence of the DIIS algorithm is higher. From a general viewpoint, numerical tests performed until now demonstrate that the ODA is efficient for performing the early iterations of the SCF procedure; when the sequence (Dk ) has reached a neighbourhood of a critical point of the HF problem, convergence can be accelerated either by resorting
Compartilhar