PHYSICAL PRINCIPLES OF CONSTRUCTING HYBRID QM/MM PROCEDURES A.L. TCHOUGRE EFF

Karpov Institute of Physical Chemistry 10 Vorontsovo pole, 105064, Moscow, Russia

[email protected]

A.M. TOKMACHEV

Karpov Institute of Physical Chemistry 10 Vorontsovo pole, 105064, Moscow, Russia [email protected]

Abstract Hybrid QM/MM methods are widely used for describing dierent as-

pects of behavior of complex molecular systems. The key problem when applying the QM/MM methodology is a substantiated construction/selection of the junction between the parts of the system described at the QM and MM levels, respectively. It is especially important in the case of covalently bound QM and MM subsystems. We pursue here a general approach based on a sequential separation of electronic variables in order to develop a fundamental form of the intersubsystem junction. Special attention is given to construction of frontier oneelectron states and renormalization of QM Hamiltonian parameters and MM force elds. From this point of view we consider a series of the junction forms present in the literature and in some cases suggest theoretically more reliable alternatives. General theoretical conclusions are supported by data of numerical experiments.

Keywords: QM/MM methods, junction, physical principles, APSLG scheme, eec-

1.

tive Hamiltonian

Introduction

Modern computational quantum chemistry tends to cover with acceptable precision molecular systems of real interest. In the framework of ab initio methodology achieving a good quality results is usually concerned with extending basis sets of one-electron states and with explicit taking into account a great deal of electron correlation. It leads to ex-

1

2 tremely high computational costs for medium and large size systems due to unpleasant scalability of computational resources like N 4 N 7 (where N is the dimension of a space spanned by one-electron basis functions). Even in the case of semiempirical methods the computational costs grow as N 3 with growth of system size. This is a serious problem for application of quantum chemical procedures for calculating properties of many practically important molecular systems and especially of their chemical transformations [1]. The problem of high computational costs is especially actual when large systems of biological importance are considered or massive calculations of potential energy surfaces (PESs) are necessary, for example, in the framework of molecular dynamics simulations. In the literature dierent means are proposed to signi cantly reduce computational costs without deteriorating the quality of obtained results. The rst way is to smooth the dependence of computational costs on the size of the system. This type approaches usually exploit the localization of electronic degree of freedom, based on the "principle of nearsightedness" [2], or the exponential decay of the one-electron density matrix elements [3]. The almost linear scalability is not impossible and, for example, authors of Ref. [4] have shown that the scalability of the order N 1:3 can be achieved. There is a reasonable number of successful attempts to achieve optimal N -scalability properties for the whole spectrum of quantum chemical schemes { ab initio, DFT, and semiempirical. The non-variational schemes are based on the "divideand-conquer" methodology [5, 6, 7], Fermi operator expansion [8, 9], energy renormalization group [10] and recursion technique [11]. The variational approaches [12, 13, 14, 15] are based on the substitution of energy minimization by the grand canonical potential minimization [16]. These techniques are thoroughly reviewed in Refs. [17, 18]. At the same time these methods are mostly oriented on the tight-binding models of solids or solid clusters. The density matrix renormalization group method [19] is quite important achievement since it competes in quality with most elaborated methods of conventional quantum chemistry [20]. We should stress that the use of local one-electron basis states can signi cantly reduce computational costs [21]. In this context direct determination of localized Hartree-Fock orbitals can be especially useful and workable [22]. Signi cant acceleration of computations can be achieved by using pseudodiagonalization procedures [23] or by special choice of the trial electronic wave function [24, 25, 26, 27] alternative to the standard SCF form. The application of the methods with good scalabilty properties to enzyme reactions, structure and properties of proteins and DNA is straightforward [28, 29].

PHYSICAL PRINCIPLES OF QM/MM

3

The molecular mechanics (MM) [30] is another good candidate for using in calculations of properties of large molecular systems. Currently it is extensively used in the eld of biochemical simulations. At the same time there are signi cant limitations for its use. Generally, they are consequences of classical nature of the MM force elds [31]. For example, the application of MM schemes to chemical reactions including the processes of bond formation or cleaveage, transition states, or to highly correlated compounds like transition metal complexes is impossible or, at least, questionable since the main prerequisite for the validity of MM procedures is its application to the ground electronic state of a molecule without low-lying excitations . Moreover, the MM schemes usually require rather complex parameterization procedure. The main advantages of the MM schemes are their low cost and high eciency in the prediction of molecular geometry for organic compounds without signi cant electron correlation. The principal advantage of the QM procedures is a wide range of problems they potentially apply. The concert expoitation of advantages of the both (quantum and classical) approaches can be achieved by the incorporation of quantum mechanical (QM) description to the MM framework. This methodology, alternative to construction of the QM schemes with the O(N ) scalability, leads to hybrid quantum mechanical/molecular mechanical (QM/MM) schemes allowing another way to bypass the bottleneck of N n -scalability. The QM/MM schemes describe some relatively small part of the system where chemical transformations take place by an appropriate quantum chemical method while the rest (relatively inert environment) is covered by classical force elds (molecular mechanics). The practical usefulness and general validity of these approaches is based on the chemically motivated observation: chemical transformations usually aect only a small part of the whole system (reaction center) while the r^ole of surrounding groups and molecules is reduced to modi cation of the PES due for example to some polarization or steric strains. This situation is characteristic for chemical reactions of biological interest (especially, for catalysis by enzymes). Since the seminal work by Warshell and Levitt [32] the hybrid QM/MM schemes of calculating large molecular systems acquired an increasing popularity. There is a big variety of dierent hybrid approaches described in the literature [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]. Evenmore, numerous cases of separating electronic variables like -electron models [44] or even taking into account explicitly only valence electrons in semiempirical methods [45] can be considered as special cases of hybrid schemes since they also bear the family marks of the QM/MM approach, namely, (i) the separation of the system into parts, and (ii)

4 treating these parts on quantum or classical levels, respectively. Very close to the QM/MM modeling is a problem of constructing eective fragment (group) potentials which can aect the quantum chemically described part of molecule modeling the in uence of environment [46, 47, 48, 49, 50, 52, 53, 54, 51]. In such a broader sense several other problems in the area of computational chemistry seem to be related to the QM/MM context: these are the problem of embedding in the cluster calculations on solids and their surfaces with special attention to adsorption and catalysis problems; the problem of description of solute/solvent eects for reactions in condensed media; the pseudopotential descriptions of of molecular and atomic electronic structure to list several most important. Usually all these areas are considered separately without an attempt of establishing relations among them. Also a great variety of dierent speci c schemes referred to as "protocols" implemented in dierent packages are normaly considered only from the point of view of their practical feasibility and their t for particular applied purpose, rather than in a context of their exact placement among other approximate methods and of evaluation of relative precision of that or another approximation. The QM/MM modeling can be placed in the general context of combination of dierent level descriptions. We should mention that the hybrid QM/QM schemes are well known in the literature. Some schemes by construction are far enough from the QM/MM construction since the separation on parts is performed not on the geometrical principle but on the principle of necessity of correlated description to some delocalized orbitals. Well known examples of such an approach is the complete active spase self consistent eld (CASSCF) scheme [55] and GVBCAS method [56] combining generalized valence bond and CASSCF approaches which can be considered as three-level scheme. The eective crystal eld method developed for calculation of d-d spectra [57] and its extension designed to treat catalytical systems [58] can be considered as taking an intermediate position in this ierarchy since they use some strictly localized one-electron states (for example, d-orbitals of one transition metal ion) as a high-level subsystem treating it on the full con guration interaction level with delocalization corrections taken into account perturbatively. The QM/QM schemes similar by construction to QM/MM ones (geometric separation on parts) are also quite popular and should be specially mentioned since they have many common problems . For example, the IMOMO method [59, 60, 61, 62] integrating molecular orbital approximations of dierent level is very similar in the way of construction, main principles and physical contributions taken into account to the QM/MM IMOMM method [38]. Another example

PHYSICAL PRINCIPLES OF QM/MM

5

of such a construction is the QM/DIM (diatomics-in-molecules) scheme [63] which takes the interactions between subsystems in the rst order of perturbation theory. This scheme can be brought to a type of QM/MM one by relevant signi cant simpli cations [63, 64]: by ignoring, rst, the dierences in potential curves of diatomic fragments due to angular and spin momentum, second, the couplings between the states of the same symmetry, and multicenter electrostatic contributions. The scheme proposed by Naray-Szabo et al [65, 66] is especially interesting here. It uses strictly localized bond orbitals (SLBOs) for the environment and the SCF procedure for the fragment of particular interest. Note that in the case of large systems the calculation of SLBOs can be very timeconsuming process, so the transition to the MM scheme for the environment is an important and almost unavoidable step for calculations of biopolymers. It should be noted that the hybrid quantum/classical schemes apply not only for determination of geometries, energies, and reaction mechanisms. The Monte Carlo [67, 68] and molecular dynamics (MD) [69, 70, 71, 72] simulations are quite popular as frameworks for which various QM/MM procedures serve as "subroutines". Before employing hybrid schemes the large-scale MD simulations were performed only with low-level approximations for force elds. The use of hybrid schemes extends signi cantly the scope of their application, improve precision of the results that allows to improve the understanding of statistical properties and dynamical processes in liquids and biopolymers. The area of the QM/MM modeling is not a steady one since there are many unsolved problems mostly caused by ad hoc character of construction of junction between subsystems. Despite a general claim of being stipulated by "speci c practical needs" the ad hoc junction constructs present in the literature frequently lead to more or less serious technical ("practical") problems in construction of hybrid schemes. This point is stressed in Refs. [73, 74]. The most direct application of hybrid schemes is in the case of solute/solvent interactions since the border between quantum and classical parts of the whole system in this case is naturally de ned by the division of the whole system into individual molecules or ions. In this case the environment (classical) variables can be chosen in many dierent ways, ranging from continual models with special properties [72, 75, 76, 77, 78] to the discrete ones explicitly employing the information on the structure of solvent molecules [79, 80, 81, 82] through some intermediate schemes adopting advantages of both extrema [83, 84] are also possible. Nevertheless, even in such a transparent case the problems occur when the formation of complexes between solute metal ions and solvent molecules has to be considered. In this case some solvent

6 molecules may have to be considered either as part of QM system when form a close contact with the metal ion or as a part of the MM system when they are absorbed by the solvent bulk. Approaches to constructing relevant force elds in this situation are discussed in Ref. [85]. Another important area of applications is solid state and surface chemistry [37, 39, 86, 87] with catalysis and adsorption by oxide systems as the most popular objects. In this case the subsystems are very polar and usually charged, so a special attention should be paid to electrostatic interactions between subsystems. The most important problem in the eld of embedding is the unphysical polarization of the QM subsystem (cluster) and the related necessity to adjust eective charges on the ions to obtain sensible results. It should be noted that the most dicult and not trivial case of separating quantum and classical subsystems is that of covalently bound systems since the electrons in chemical bonds connecting two regions are highly correlated. At the same time this case seems to be very actual (especially for biological applications) and requires thorough consideration. In this paper we consider the hybrid QM/MM schemes with covalent linkage between regions in more details, give account of the "state-of-art" in this eld, and show how the junction between quantum and classical subsystems can be constructed in a consistent (not ad hoc) way by deriving it from an underlying QM description with an emphasis on the choice and re nement of one-electron states related to the intersubsystem boundary and the related parameterization. We consider the possibility of subsequent derivation (i.e. proof) of the model is an argument in favor it (in mathematical sense) while the absence of derivation (or its impossibility) as a contra argument. The paper is organized as follows. In the next Section we formulate the physical principles of constructing hybrid QM/MM schemes and illustrate their application by separation of electronic variables in the framework of the eective Hamiltonian technique and by derivation of renormalization of QM and MM parameters in the framework of the deductive MM formalism. In the third Section the outline of existing approaches to QM/MM intersubsystem junction is given with numerical examples con rming theoretical conclusions and analysis from the general principles given in the second Section is performed. Finally, a brief conclusion is given.

2.

Physical principals of constructing QM/MM schemes

The structure of the area of the QM/MM modeling is obscured by a great deal of the recipes proposed and by the lack of sequential deriva-

PHYSICAL PRINCIPLES OF QM/MM

7

tions of them. The situation is partly caused by the structure of the MM itself { the MM scheme is not derived but is taken on the ground of some intuitive concepts of what classical terms must enter the energy energy with adjusted parameters for the latter. We extract here a minimal set of the most essential facts (physical principles) which are necessary prerequisites for constructing hybrid schemes: 1 Chemical transformations are local and the perturbation caused by the environment is small. This principle is con rmed by all the experimental chemistry { the concept of the reaction center and mechanisms of chemical reactions aecting only several chemical bonds. The very concept of chemical bond is mostly due to the local (in many cases one bond) nature of chemistry. 2 The system can be divided on parts and dierent level approximations (quantum or classical) can be applied to dierent parts of the system. The size of the quantum subsystem should be determined by the locality of transformations and not by the intention to move the boundary far from the reactive center to mask the errors in the junction construction. In the case of three-dimensional structures signi cant shift of boundary from the reactive center is problematic since the size of the QM system may become very large to be covered by the high-level quantum chemistry. 3 Correct form of intersubsystem junction corresponds to sequential separation of variables. This principle simply states that the construction should be made not in the ad hoc manner. 4 Separation of system into parts should be performed in a manner in which uctuations of electronic density between subsystems are small, i.e. chemical bonds are not to be cut. This requirement seems to be quite natural since both QM and MM schemes work properly only in the case of systems with well de ned numbers of electrons. The developed QM schemes operate with the manyelectron states which are eigenstates of the number of particles operator. Even if the bond to be cut is not polar and the diagonal one-electron densities can be set equal to integers and thus the average numbers of electrons in subsystems are also integer the situation does not change since according to Ref. [88] the oneelectron density is separable only for the wave functions where the intergroup electron transfers are projected out. It is de nitely not a good approximation for two ends of a chemical bond which can exist only in the case of a nontrivial QM superposition of the states with dierent number of electrons on its ends. The MM description

8 for a system with uctuating number of electrons is not de ned as well. 5 There exists a QM description underlying the MM one. It means that the form of junction is not arbitrary but essentially de ned by one-electron states arising from the underlying QM description. These points are very general. Now we consider in more details the practical implementation of these principles.

2.1

Eective Hamiltonian technique

As a rst step for derivation of junction we consider a general formulation of separation of electronic variables of quantum subsystem from those describing electrons in the classical (MM) subsystem which provide a consistent form of intersubsystem junction. This separation is based on the Lowdin partition technique [89] and the McWeeny group function formalism [90]. Generally, we construct an eective Hamiltonian for the QM subsystem in the presence of classical environment and the PES of combined system. This strategy was proposed and developed in Refs. [73, 74, 91]. Practically it is based on the assumption of existence of a QM scheme underlying the MM one. We consider two subsystems R- (reactive, quantum) and M - (inert, classical). The electronic Hamiltonian for the whole system is a sum of subsystem Hamiltonians and of their interaction which is taken to comprise the terms of two types { the Coulomb V c(q) and the resonance (electron transfer) V r (q) interactions:

H = H R (q) + H M (q) + V c(q) + V r (q): (1) The Hamiltonian for the M -subsystem is a sum of the free M -subsystem Hamiltonian H0M (q) and of the attraction of electrons in the M -subsystem to the cores of the R-subsystem V R (q). Analogous subdividing is true for the R-subsystem. The "exact" wave function of the system can be represented by generalized group function with number of electrons in subsystems not xed: k =

X X

fn g fi g

^

Cfki g(fn g) i (n );

(2)

where i (n ) is the i-th n -electron wave function of the -th group and electron distributions fn g satisfy the equation: X

n = Ne; 8; n 0:

(3)

9

PHYSICAL PRINCIPLES OF QM/MM

The coecients Cfki g (fn g) should be determined on the ground of variational principle. The function Eq. (2) is very general and does not correspond to the assumed wave function of the hybrid QM/MM method. First of all the numbers of electrons in the subsystems must be xed to apply computational schemes to separate parts on the legal ground. Second, we assume that the M -subsystem is treated with use of the MM, i.e. the PES of the M -subsystem is evaluated without explicit invocaion of its wave function. The parameters of the M -subsystem must be transferable, i.e. applicable to combination with any R-subsystem and even in the absence of it. For this purpose we should use the wave function of the ground state of the eective Hamiltonian for the M -subsystem since it is in a certain sense close the wave function calculated without any R-subsystem [73]. Thus the required wave function is represented by the antisymmetrized product of electronic wavefunction for the R-subsystem and that of the ground state for the free M -subsystem: k = Rk ^ M 00 :

(4)

It is obtained from the exact wavefunction by two sequential Lowdin projection procedures: the rst one to the subspace of the states with xed number of electrons in the subsystems (projection operator P and its complementary projection operator Q = 1 P ) and the second one { to the states with the ground state wavefunction of the free M -subsystem as the multiplier (projection operator P and its complementary projection operator Q = 1 P ). After the rst partition we obtain M (q)P + PV c(q)P + Heff (q; E ) = PH R(q)P +2 PH P +PV rr (q; E )P + e2 ZARZBM RAB1 ; A6=B

(5)

where the second order resolvent contribution coupling two one-electron transfers between the subsystems is:

V rr (q; E ) = V r (q)Q(E QHQ) 1QV r (q) = V r (q)QR(q; E )QV r (q):

(6) The Hamiltonian of the type Eq. (5) is typical for the ECF method [57], where projector P corresponds to the SCF wave function of the ligands in transition metal complex. The second projection and subsequent averaging over the ground state of the M -subsystem gives the eective

10 Hamiltonian for the R-subsystem: R (q; E; !) = H R (q) + V M (q)+ Heff 0 DD EE + hhPV rr (q; E )P iiM + P V R (q)P M + P + hhP W (q; E )P QR(!)QP W (q; E )P iiM + e22 ZARZBM RAB1 ;

(7)

A6=B

where the symbol hh:::iiM corresponds to the averaging over the ground D E M M state of the M -subsystem 00 ::: 00 , the averaged operator

V M (q) = V M (q) + hhPV c(q)P iiM (8) is close to the zero for quite typical case of M -subsystems with zero

overall charge and P W (q; E )P = PV c (q)P + PV rr (q; E )P + P V R (q)P: (9) To perform the averaging explicitly a form of the wavefunction for the M -subsystem should be speci ed. At the same time some simpli cations of the expression Eq. (7) can be made on quite a general level. For example, the second order contribution from the resolvent of rst partition expressed through one-electron states of subsystems is: hhPV rr (q; E )P iiM = P0 P0 vrm(q)vr0 m0 (q) rr 2R mm 2M P adv) (I E )+ f r+ ji hj r0G(mm 0 (10) 2ImOR (NR 1) P

2ImOR (NR +1)

ret) (A E )g; r ji hj r0+G(mm 0

where G(ret) () and G(adv) () are the retarded and advanced one-electron Green's functions for the M -subsystem, respectively. The renormalization of the bare R-subsystem Hamiltonian leads to addition of some interaction terms to the sum of energies of free subsystems. The PESs for the whole molecule are obtained by formula: Ek = EkR + E00M ; (11) where EkR is the k-th eigenvalue of the eective Hamiltonian of the Rsubsystem: D E R R (q) EkR = Rk (q) Heff (12) k M is the energy of the M -subsystem which is parameterized in the and E00 MM form. The detailed analysis [91] of the eective R-subsystem energy EkR allows to present it in the form of a sum of contributions of dierent

PHYSICAL PRINCIPLES OF QM/MM

11

physical nature. By construction the general consideration of junction is close to that of intermolecular forces. At the same time there are some important dierences: the hybrid schemes require consideration of boundary atoms; the special form of the wave function of inert subsystem is required; the resonance terms cannot be neglected. The correction to the k-th eigenvalue of the bare Hamiltonian (i.e. the dierence between EkR with and without environment aecting the reaction center) is

Ek Ekelst + Ekrr + Ekdisp + Ekcoul + Ekcrr + EkRrr ; (13) where the leading contribution Ekelst is of the rst order with respect to the perturbation operator and originates from the electrostatic interaction between the subsystems, it can be expressed through the charges on the atoms; the next three contributions are of the second order and follow from the coupling of one-electron transfers between subsystems, of Coulomb electron-electron interactions between them, of interactions between electrons of the M -subsystem and cores of the R-subsystem; and of interaction of the Coulomb electron-electron interaction with the interaction of the electrons of the M -subsystem and cores of the Rsubsystem; the second term can be regarded as dispersion interaction between subsystems. Its general form is: P P (rrjmm)(r0 r0 jm0 m0 ) E disp = 0 rr12R mm0 2M (14) M 0 (i!); R d!rrR 0 (i!)mm 0

M 0 (i!) are the reduced polarization propagators where rrR 0 (i!) and mm for the R- and M -subsystems. The sum of the contributions Ekdisp and Ekcoul has a physical meaning of the second order interaction between the electronic polarization in the M -subsystem and the polarized Rsubsystem. The last two contributions to Eq. (13) correspond to the third order in the interaction between the subsystems originating from the coupling two projection procedures. Physically it corresponds to interaction between two one-electron transfers and Coulomb interaction between electrons of the M -subsystem with electrons and cores of the R-subsystem. The explicit form of these contributions is given in Ref. [91].

2.2

Deductive MM formulation and boundary one-electron states

2.2.1 Semiempirical APSLG-MINDO/3 approach. General physical principles of constructing hybrid QM/MM approaches state

12 that the subsequent derivation of junction between quantum and classical subsystems requires a QM wave function underlying the MM description of PESs. This QM method is necessary, for example, to perform averaging of the eective Hamiltonian in Eq. (7). At the same time more important that this method should produce in a consistent manner one-electron states necessary for explicit formation of boundary and its response on the changes in molecular geometry of fragments and/or electronic structure of the R-subsystem. There are some general criteria for the QM method appropriate for underlying the MM description: this method should express molecular electronic structure and electronic energy in relevant local terms (i.e. to distinguish bonded and nonbonded contributions) and reproduce molecular properties with sucient accuracy. The parameters characterizing the wave function of this method (electronic structure parameters or ESPs) have to be transferable in a broad sense of the term "transferability", i.e. the form of any bond-related functions (e.g. the bond energy dependence on interatomic separation) must be also transferable. As an appropriate method satisfying these conditions we take the trial wave function in the form of antisymmetrized product of strictly localized geminals (APSLG) [92] implemented with slightly modi ed semiempirical MINDO/3 Hamiltonian [26, 27]. The APSLG wave function is constructed from two-electron building blocks { geminals: Y ji = gm+ j0i : (15) m

Each geminal corresponds to a chemical bond or to a lone pair and it is expressed as a linear combination of singlet two-electron con gurations constructed from two (in the case of chemical bond) or one (in the case of lone pair) hybrid orbitals (HOs) assigned to the geminal at hand: + r + + v l + l + + w (r + l + + l + r + ) gm+ = um rm m m m m m m m m m (16) u2m + vm2 + 2wm2 = 1: The rst two con gurations are ionic with two electrons at the end of bond while the last is the covalent (Heitler-London) one. In the case of lone pair only rst con guration survives with the coecient um = 1. The HOs jrm i and jlm i correspond to the right and left ends of the bond and are formed by 44 orthogonal (SO(4)) transformations of atomic orbitals basis sets for each "heavy" (non-hydrogen) atom. The amplitudes of the geminals Eq. (16) and parameters of the SO(4) transformations are determined by optimizing the energy. The variational character of the one-electron states is an essential factor for use of the APSLG-MINDO/3 scheme in the construction of boundary between Rand M -subsystems.

13

PHYSICAL PRINCIPLES OF QM/MM

The important feature of the APSLG-MINDO/3 energy is that it is a sum of increments of dierent form for bonded and nonbonded atoms: EA = P [2Umt Pmtt + (tm tmjtm tm)Tm ttm ]+ tm 2A P +2 0 gtTkkt0m Pktt Pmt0 t0 ; (17) tk

Karpov Institute of Physical Chemistry 10 Vorontsovo pole, 105064, Moscow, Russia

[email protected]

A.M. TOKMACHEV

Karpov Institute of Physical Chemistry 10 Vorontsovo pole, 105064, Moscow, Russia [email protected]

Abstract Hybrid QM/MM methods are widely used for describing dierent as-

pects of behavior of complex molecular systems. The key problem when applying the QM/MM methodology is a substantiated construction/selection of the junction between the parts of the system described at the QM and MM levels, respectively. It is especially important in the case of covalently bound QM and MM subsystems. We pursue here a general approach based on a sequential separation of electronic variables in order to develop a fundamental form of the intersubsystem junction. Special attention is given to construction of frontier oneelectron states and renormalization of QM Hamiltonian parameters and MM force elds. From this point of view we consider a series of the junction forms present in the literature and in some cases suggest theoretically more reliable alternatives. General theoretical conclusions are supported by data of numerical experiments.

Keywords: QM/MM methods, junction, physical principles, APSLG scheme, eec-

1.

tive Hamiltonian

Introduction

Modern computational quantum chemistry tends to cover with acceptable precision molecular systems of real interest. In the framework of ab initio methodology achieving a good quality results is usually concerned with extending basis sets of one-electron states and with explicit taking into account a great deal of electron correlation. It leads to ex-

1

2 tremely high computational costs for medium and large size systems due to unpleasant scalability of computational resources like N 4 N 7 (where N is the dimension of a space spanned by one-electron basis functions). Even in the case of semiempirical methods the computational costs grow as N 3 with growth of system size. This is a serious problem for application of quantum chemical procedures for calculating properties of many practically important molecular systems and especially of their chemical transformations [1]. The problem of high computational costs is especially actual when large systems of biological importance are considered or massive calculations of potential energy surfaces (PESs) are necessary, for example, in the framework of molecular dynamics simulations. In the literature dierent means are proposed to signi cantly reduce computational costs without deteriorating the quality of obtained results. The rst way is to smooth the dependence of computational costs on the size of the system. This type approaches usually exploit the localization of electronic degree of freedom, based on the "principle of nearsightedness" [2], or the exponential decay of the one-electron density matrix elements [3]. The almost linear scalability is not impossible and, for example, authors of Ref. [4] have shown that the scalability of the order N 1:3 can be achieved. There is a reasonable number of successful attempts to achieve optimal N -scalability properties for the whole spectrum of quantum chemical schemes { ab initio, DFT, and semiempirical. The non-variational schemes are based on the "divideand-conquer" methodology [5, 6, 7], Fermi operator expansion [8, 9], energy renormalization group [10] and recursion technique [11]. The variational approaches [12, 13, 14, 15] are based on the substitution of energy minimization by the grand canonical potential minimization [16]. These techniques are thoroughly reviewed in Refs. [17, 18]. At the same time these methods are mostly oriented on the tight-binding models of solids or solid clusters. The density matrix renormalization group method [19] is quite important achievement since it competes in quality with most elaborated methods of conventional quantum chemistry [20]. We should stress that the use of local one-electron basis states can signi cantly reduce computational costs [21]. In this context direct determination of localized Hartree-Fock orbitals can be especially useful and workable [22]. Signi cant acceleration of computations can be achieved by using pseudodiagonalization procedures [23] or by special choice of the trial electronic wave function [24, 25, 26, 27] alternative to the standard SCF form. The application of the methods with good scalabilty properties to enzyme reactions, structure and properties of proteins and DNA is straightforward [28, 29].

PHYSICAL PRINCIPLES OF QM/MM

3

The molecular mechanics (MM) [30] is another good candidate for using in calculations of properties of large molecular systems. Currently it is extensively used in the eld of biochemical simulations. At the same time there are signi cant limitations for its use. Generally, they are consequences of classical nature of the MM force elds [31]. For example, the application of MM schemes to chemical reactions including the processes of bond formation or cleaveage, transition states, or to highly correlated compounds like transition metal complexes is impossible or, at least, questionable since the main prerequisite for the validity of MM procedures is its application to the ground electronic state of a molecule without low-lying excitations . Moreover, the MM schemes usually require rather complex parameterization procedure. The main advantages of the MM schemes are their low cost and high eciency in the prediction of molecular geometry for organic compounds without signi cant electron correlation. The principal advantage of the QM procedures is a wide range of problems they potentially apply. The concert expoitation of advantages of the both (quantum and classical) approaches can be achieved by the incorporation of quantum mechanical (QM) description to the MM framework. This methodology, alternative to construction of the QM schemes with the O(N ) scalability, leads to hybrid quantum mechanical/molecular mechanical (QM/MM) schemes allowing another way to bypass the bottleneck of N n -scalability. The QM/MM schemes describe some relatively small part of the system where chemical transformations take place by an appropriate quantum chemical method while the rest (relatively inert environment) is covered by classical force elds (molecular mechanics). The practical usefulness and general validity of these approaches is based on the chemically motivated observation: chemical transformations usually aect only a small part of the whole system (reaction center) while the r^ole of surrounding groups and molecules is reduced to modi cation of the PES due for example to some polarization or steric strains. This situation is characteristic for chemical reactions of biological interest (especially, for catalysis by enzymes). Since the seminal work by Warshell and Levitt [32] the hybrid QM/MM schemes of calculating large molecular systems acquired an increasing popularity. There is a big variety of dierent hybrid approaches described in the literature [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]. Evenmore, numerous cases of separating electronic variables like -electron models [44] or even taking into account explicitly only valence electrons in semiempirical methods [45] can be considered as special cases of hybrid schemes since they also bear the family marks of the QM/MM approach, namely, (i) the separation of the system into parts, and (ii)

4 treating these parts on quantum or classical levels, respectively. Very close to the QM/MM modeling is a problem of constructing eective fragment (group) potentials which can aect the quantum chemically described part of molecule modeling the in uence of environment [46, 47, 48, 49, 50, 52, 53, 54, 51]. In such a broader sense several other problems in the area of computational chemistry seem to be related to the QM/MM context: these are the problem of embedding in the cluster calculations on solids and their surfaces with special attention to adsorption and catalysis problems; the problem of description of solute/solvent eects for reactions in condensed media; the pseudopotential descriptions of of molecular and atomic electronic structure to list several most important. Usually all these areas are considered separately without an attempt of establishing relations among them. Also a great variety of dierent speci c schemes referred to as "protocols" implemented in dierent packages are normaly considered only from the point of view of their practical feasibility and their t for particular applied purpose, rather than in a context of their exact placement among other approximate methods and of evaluation of relative precision of that or another approximation. The QM/MM modeling can be placed in the general context of combination of dierent level descriptions. We should mention that the hybrid QM/QM schemes are well known in the literature. Some schemes by construction are far enough from the QM/MM construction since the separation on parts is performed not on the geometrical principle but on the principle of necessity of correlated description to some delocalized orbitals. Well known examples of such an approach is the complete active spase self consistent eld (CASSCF) scheme [55] and GVBCAS method [56] combining generalized valence bond and CASSCF approaches which can be considered as three-level scheme. The eective crystal eld method developed for calculation of d-d spectra [57] and its extension designed to treat catalytical systems [58] can be considered as taking an intermediate position in this ierarchy since they use some strictly localized one-electron states (for example, d-orbitals of one transition metal ion) as a high-level subsystem treating it on the full con guration interaction level with delocalization corrections taken into account perturbatively. The QM/QM schemes similar by construction to QM/MM ones (geometric separation on parts) are also quite popular and should be specially mentioned since they have many common problems . For example, the IMOMO method [59, 60, 61, 62] integrating molecular orbital approximations of dierent level is very similar in the way of construction, main principles and physical contributions taken into account to the QM/MM IMOMM method [38]. Another example

PHYSICAL PRINCIPLES OF QM/MM

5

of such a construction is the QM/DIM (diatomics-in-molecules) scheme [63] which takes the interactions between subsystems in the rst order of perturbation theory. This scheme can be brought to a type of QM/MM one by relevant signi cant simpli cations [63, 64]: by ignoring, rst, the dierences in potential curves of diatomic fragments due to angular and spin momentum, second, the couplings between the states of the same symmetry, and multicenter electrostatic contributions. The scheme proposed by Naray-Szabo et al [65, 66] is especially interesting here. It uses strictly localized bond orbitals (SLBOs) for the environment and the SCF procedure for the fragment of particular interest. Note that in the case of large systems the calculation of SLBOs can be very timeconsuming process, so the transition to the MM scheme for the environment is an important and almost unavoidable step for calculations of biopolymers. It should be noted that the hybrid quantum/classical schemes apply not only for determination of geometries, energies, and reaction mechanisms. The Monte Carlo [67, 68] and molecular dynamics (MD) [69, 70, 71, 72] simulations are quite popular as frameworks for which various QM/MM procedures serve as "subroutines". Before employing hybrid schemes the large-scale MD simulations were performed only with low-level approximations for force elds. The use of hybrid schemes extends signi cantly the scope of their application, improve precision of the results that allows to improve the understanding of statistical properties and dynamical processes in liquids and biopolymers. The area of the QM/MM modeling is not a steady one since there are many unsolved problems mostly caused by ad hoc character of construction of junction between subsystems. Despite a general claim of being stipulated by "speci c practical needs" the ad hoc junction constructs present in the literature frequently lead to more or less serious technical ("practical") problems in construction of hybrid schemes. This point is stressed in Refs. [73, 74]. The most direct application of hybrid schemes is in the case of solute/solvent interactions since the border between quantum and classical parts of the whole system in this case is naturally de ned by the division of the whole system into individual molecules or ions. In this case the environment (classical) variables can be chosen in many dierent ways, ranging from continual models with special properties [72, 75, 76, 77, 78] to the discrete ones explicitly employing the information on the structure of solvent molecules [79, 80, 81, 82] through some intermediate schemes adopting advantages of both extrema [83, 84] are also possible. Nevertheless, even in such a transparent case the problems occur when the formation of complexes between solute metal ions and solvent molecules has to be considered. In this case some solvent

6 molecules may have to be considered either as part of QM system when form a close contact with the metal ion or as a part of the MM system when they are absorbed by the solvent bulk. Approaches to constructing relevant force elds in this situation are discussed in Ref. [85]. Another important area of applications is solid state and surface chemistry [37, 39, 86, 87] with catalysis and adsorption by oxide systems as the most popular objects. In this case the subsystems are very polar and usually charged, so a special attention should be paid to electrostatic interactions between subsystems. The most important problem in the eld of embedding is the unphysical polarization of the QM subsystem (cluster) and the related necessity to adjust eective charges on the ions to obtain sensible results. It should be noted that the most dicult and not trivial case of separating quantum and classical subsystems is that of covalently bound systems since the electrons in chemical bonds connecting two regions are highly correlated. At the same time this case seems to be very actual (especially for biological applications) and requires thorough consideration. In this paper we consider the hybrid QM/MM schemes with covalent linkage between regions in more details, give account of the "state-of-art" in this eld, and show how the junction between quantum and classical subsystems can be constructed in a consistent (not ad hoc) way by deriving it from an underlying QM description with an emphasis on the choice and re nement of one-electron states related to the intersubsystem boundary and the related parameterization. We consider the possibility of subsequent derivation (i.e. proof) of the model is an argument in favor it (in mathematical sense) while the absence of derivation (or its impossibility) as a contra argument. The paper is organized as follows. In the next Section we formulate the physical principles of constructing hybrid QM/MM schemes and illustrate their application by separation of electronic variables in the framework of the eective Hamiltonian technique and by derivation of renormalization of QM and MM parameters in the framework of the deductive MM formalism. In the third Section the outline of existing approaches to QM/MM intersubsystem junction is given with numerical examples con rming theoretical conclusions and analysis from the general principles given in the second Section is performed. Finally, a brief conclusion is given.

2.

Physical principals of constructing QM/MM schemes

The structure of the area of the QM/MM modeling is obscured by a great deal of the recipes proposed and by the lack of sequential deriva-

PHYSICAL PRINCIPLES OF QM/MM

7

tions of them. The situation is partly caused by the structure of the MM itself { the MM scheme is not derived but is taken on the ground of some intuitive concepts of what classical terms must enter the energy energy with adjusted parameters for the latter. We extract here a minimal set of the most essential facts (physical principles) which are necessary prerequisites for constructing hybrid schemes: 1 Chemical transformations are local and the perturbation caused by the environment is small. This principle is con rmed by all the experimental chemistry { the concept of the reaction center and mechanisms of chemical reactions aecting only several chemical bonds. The very concept of chemical bond is mostly due to the local (in many cases one bond) nature of chemistry. 2 The system can be divided on parts and dierent level approximations (quantum or classical) can be applied to dierent parts of the system. The size of the quantum subsystem should be determined by the locality of transformations and not by the intention to move the boundary far from the reactive center to mask the errors in the junction construction. In the case of three-dimensional structures signi cant shift of boundary from the reactive center is problematic since the size of the QM system may become very large to be covered by the high-level quantum chemistry. 3 Correct form of intersubsystem junction corresponds to sequential separation of variables. This principle simply states that the construction should be made not in the ad hoc manner. 4 Separation of system into parts should be performed in a manner in which uctuations of electronic density between subsystems are small, i.e. chemical bonds are not to be cut. This requirement seems to be quite natural since both QM and MM schemes work properly only in the case of systems with well de ned numbers of electrons. The developed QM schemes operate with the manyelectron states which are eigenstates of the number of particles operator. Even if the bond to be cut is not polar and the diagonal one-electron densities can be set equal to integers and thus the average numbers of electrons in subsystems are also integer the situation does not change since according to Ref. [88] the oneelectron density is separable only for the wave functions where the intergroup electron transfers are projected out. It is de nitely not a good approximation for two ends of a chemical bond which can exist only in the case of a nontrivial QM superposition of the states with dierent number of electrons on its ends. The MM description

8 for a system with uctuating number of electrons is not de ned as well. 5 There exists a QM description underlying the MM one. It means that the form of junction is not arbitrary but essentially de ned by one-electron states arising from the underlying QM description. These points are very general. Now we consider in more details the practical implementation of these principles.

2.1

Eective Hamiltonian technique

As a rst step for derivation of junction we consider a general formulation of separation of electronic variables of quantum subsystem from those describing electrons in the classical (MM) subsystem which provide a consistent form of intersubsystem junction. This separation is based on the Lowdin partition technique [89] and the McWeeny group function formalism [90]. Generally, we construct an eective Hamiltonian for the QM subsystem in the presence of classical environment and the PES of combined system. This strategy was proposed and developed in Refs. [73, 74, 91]. Practically it is based on the assumption of existence of a QM scheme underlying the MM one. We consider two subsystems R- (reactive, quantum) and M - (inert, classical). The electronic Hamiltonian for the whole system is a sum of subsystem Hamiltonians and of their interaction which is taken to comprise the terms of two types { the Coulomb V c(q) and the resonance (electron transfer) V r (q) interactions:

H = H R (q) + H M (q) + V c(q) + V r (q): (1) The Hamiltonian for the M -subsystem is a sum of the free M -subsystem Hamiltonian H0M (q) and of the attraction of electrons in the M -subsystem to the cores of the R-subsystem V R (q). Analogous subdividing is true for the R-subsystem. The "exact" wave function of the system can be represented by generalized group function with number of electrons in subsystems not xed: k =

X X

fn g fi g

^

Cfki g(fn g) i (n );

(2)

where i (n ) is the i-th n -electron wave function of the -th group and electron distributions fn g satisfy the equation: X

n = Ne; 8; n 0:

(3)

9

PHYSICAL PRINCIPLES OF QM/MM

The coecients Cfki g (fn g) should be determined on the ground of variational principle. The function Eq. (2) is very general and does not correspond to the assumed wave function of the hybrid QM/MM method. First of all the numbers of electrons in the subsystems must be xed to apply computational schemes to separate parts on the legal ground. Second, we assume that the M -subsystem is treated with use of the MM, i.e. the PES of the M -subsystem is evaluated without explicit invocaion of its wave function. The parameters of the M -subsystem must be transferable, i.e. applicable to combination with any R-subsystem and even in the absence of it. For this purpose we should use the wave function of the ground state of the eective Hamiltonian for the M -subsystem since it is in a certain sense close the wave function calculated without any R-subsystem [73]. Thus the required wave function is represented by the antisymmetrized product of electronic wavefunction for the R-subsystem and that of the ground state for the free M -subsystem: k = Rk ^ M 00 :

(4)

It is obtained from the exact wavefunction by two sequential Lowdin projection procedures: the rst one to the subspace of the states with xed number of electrons in the subsystems (projection operator P and its complementary projection operator Q = 1 P ) and the second one { to the states with the ground state wavefunction of the free M -subsystem as the multiplier (projection operator P and its complementary projection operator Q = 1 P ). After the rst partition we obtain M (q)P + PV c(q)P + Heff (q; E ) = PH R(q)P +2 PH P +PV rr (q; E )P + e2 ZARZBM RAB1 ; A6=B

(5)

where the second order resolvent contribution coupling two one-electron transfers between the subsystems is:

V rr (q; E ) = V r (q)Q(E QHQ) 1QV r (q) = V r (q)QR(q; E )QV r (q):

(6) The Hamiltonian of the type Eq. (5) is typical for the ECF method [57], where projector P corresponds to the SCF wave function of the ligands in transition metal complex. The second projection and subsequent averaging over the ground state of the M -subsystem gives the eective

10 Hamiltonian for the R-subsystem: R (q; E; !) = H R (q) + V M (q)+ Heff 0 DD EE + hhPV rr (q; E )P iiM + P V R (q)P M + P + hhP W (q; E )P QR(!)QP W (q; E )P iiM + e22 ZARZBM RAB1 ;

(7)

A6=B

where the symbol hh:::iiM corresponds to the averaging over the ground D E M M state of the M -subsystem 00 ::: 00 , the averaged operator

V M (q) = V M (q) + hhPV c(q)P iiM (8) is close to the zero for quite typical case of M -subsystems with zero

overall charge and P W (q; E )P = PV c (q)P + PV rr (q; E )P + P V R (q)P: (9) To perform the averaging explicitly a form of the wavefunction for the M -subsystem should be speci ed. At the same time some simpli cations of the expression Eq. (7) can be made on quite a general level. For example, the second order contribution from the resolvent of rst partition expressed through one-electron states of subsystems is: hhPV rr (q; E )P iiM = P0 P0 vrm(q)vr0 m0 (q) rr 2R mm 2M P adv) (I E )+ f r+ ji hj r0G(mm 0 (10) 2ImOR (NR 1) P

2ImOR (NR +1)

ret) (A E )g; r ji hj r0+G(mm 0

where G(ret) () and G(adv) () are the retarded and advanced one-electron Green's functions for the M -subsystem, respectively. The renormalization of the bare R-subsystem Hamiltonian leads to addition of some interaction terms to the sum of energies of free subsystems. The PESs for the whole molecule are obtained by formula: Ek = EkR + E00M ; (11) where EkR is the k-th eigenvalue of the eective Hamiltonian of the Rsubsystem: D E R R (q) EkR = Rk (q) Heff (12) k M is the energy of the M -subsystem which is parameterized in the and E00 MM form. The detailed analysis [91] of the eective R-subsystem energy EkR allows to present it in the form of a sum of contributions of dierent

PHYSICAL PRINCIPLES OF QM/MM

11

physical nature. By construction the general consideration of junction is close to that of intermolecular forces. At the same time there are some important dierences: the hybrid schemes require consideration of boundary atoms; the special form of the wave function of inert subsystem is required; the resonance terms cannot be neglected. The correction to the k-th eigenvalue of the bare Hamiltonian (i.e. the dierence between EkR with and without environment aecting the reaction center) is

Ek Ekelst + Ekrr + Ekdisp + Ekcoul + Ekcrr + EkRrr ; (13) where the leading contribution Ekelst is of the rst order with respect to the perturbation operator and originates from the electrostatic interaction between the subsystems, it can be expressed through the charges on the atoms; the next three contributions are of the second order and follow from the coupling of one-electron transfers between subsystems, of Coulomb electron-electron interactions between them, of interactions between electrons of the M -subsystem and cores of the R-subsystem; and of interaction of the Coulomb electron-electron interaction with the interaction of the electrons of the M -subsystem and cores of the Rsubsystem; the second term can be regarded as dispersion interaction between subsystems. Its general form is: P P (rrjmm)(r0 r0 jm0 m0 ) E disp = 0 rr12R mm0 2M (14) M 0 (i!); R d!rrR 0 (i!)mm 0

M 0 (i!) are the reduced polarization propagators where rrR 0 (i!) and mm for the R- and M -subsystems. The sum of the contributions Ekdisp and Ekcoul has a physical meaning of the second order interaction between the electronic polarization in the M -subsystem and the polarized Rsubsystem. The last two contributions to Eq. (13) correspond to the third order in the interaction between the subsystems originating from the coupling two projection procedures. Physically it corresponds to interaction between two one-electron transfers and Coulomb interaction between electrons of the M -subsystem with electrons and cores of the R-subsystem. The explicit form of these contributions is given in Ref. [91].

2.2

Deductive MM formulation and boundary one-electron states

2.2.1 Semiempirical APSLG-MINDO/3 approach. General physical principles of constructing hybrid QM/MM approaches state

12 that the subsequent derivation of junction between quantum and classical subsystems requires a QM wave function underlying the MM description of PESs. This QM method is necessary, for example, to perform averaging of the eective Hamiltonian in Eq. (7). At the same time more important that this method should produce in a consistent manner one-electron states necessary for explicit formation of boundary and its response on the changes in molecular geometry of fragments and/or electronic structure of the R-subsystem. There are some general criteria for the QM method appropriate for underlying the MM description: this method should express molecular electronic structure and electronic energy in relevant local terms (i.e. to distinguish bonded and nonbonded contributions) and reproduce molecular properties with sucient accuracy. The parameters characterizing the wave function of this method (electronic structure parameters or ESPs) have to be transferable in a broad sense of the term "transferability", i.e. the form of any bond-related functions (e.g. the bond energy dependence on interatomic separation) must be also transferable. As an appropriate method satisfying these conditions we take the trial wave function in the form of antisymmetrized product of strictly localized geminals (APSLG) [92] implemented with slightly modi ed semiempirical MINDO/3 Hamiltonian [26, 27]. The APSLG wave function is constructed from two-electron building blocks { geminals: Y ji = gm+ j0i : (15) m

Each geminal corresponds to a chemical bond or to a lone pair and it is expressed as a linear combination of singlet two-electron con gurations constructed from two (in the case of chemical bond) or one (in the case of lone pair) hybrid orbitals (HOs) assigned to the geminal at hand: + r + + v l + l + + w (r + l + + l + r + ) gm+ = um rm m m m m m m m m m (16) u2m + vm2 + 2wm2 = 1: The rst two con gurations are ionic with two electrons at the end of bond while the last is the covalent (Heitler-London) one. In the case of lone pair only rst con guration survives with the coecient um = 1. The HOs jrm i and jlm i correspond to the right and left ends of the bond and are formed by 44 orthogonal (SO(4)) transformations of atomic orbitals basis sets for each "heavy" (non-hydrogen) atom. The amplitudes of the geminals Eq. (16) and parameters of the SO(4) transformations are determined by optimizing the energy. The variational character of the one-electron states is an essential factor for use of the APSLG-MINDO/3 scheme in the construction of boundary between Rand M -subsystems.

13

PHYSICAL PRINCIPLES OF QM/MM

The important feature of the APSLG-MINDO/3 energy is that it is a sum of increments of dierent form for bonded and nonbonded atoms: EA = P [2Umt Pmtt + (tm tmjtm tm)Tm ttm ]+ tm 2A P +2 0 gtTkkt0m Pktt Pmt0 t0 ; (17) tk