Octopus
exponential.F90
Go to the documentation of this file.
1!! Copyright (C) 2002-2006 M. Marques, A. Castro, A. Rubio, G. Bertsch
2!! Copyright (C) 2019 M. Oliveira
3!!
4!! This program is free software; you can redistribute it and/or modify
5!! it under the terms of the GNU General Public License as published by
6!! the Free Software Foundation; either version 2, or (at your option)
7!! any later version.
8!!
9!! This program is distributed in the hope that it will be useful,
10!! but WITHOUT ANY WARRANTY; without even the implied warranty of
11!! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12!! GNU General Public License for more details.
13!!
14!! You should have received a copy of the GNU General Public License
15!! along with this program; if not, write to the Free Software
16!! Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
17!! 02110-1301, USA.
18!!
19
20#include "global.h"
21
23 use accel_oct_m
24 use batch_oct_m
26 use blas_oct_m
28 use debug_oct_m
29 use global_oct_m
30 use grid_oct_m
34 use, intrinsic :: iso_fortran_env
38 use math_oct_m
39 use mesh_oct_m
44 use parser_oct_m
48 use types_oct_m
50 use xc_oct_m
51
52 implicit none
53
54 private
55 public :: &
61
62 integer, public, parameter :: &
63 EXP_LANCZOS = 2, &
64 exp_taylor = 3, &
66
67 type exponential_t
68 private
69 integer, public :: exp_method
70 real(real64) :: lanczos_tol
71 real(real64) :: chebyshev_tol
72 integer, public :: exp_order
73 integer :: arnoldi_gs
74 logical, public :: full_batch = .false.
75 contains
76 procedure :: apply_batch => exponential_apply_batch
77 procedure :: apply_single => exponential_apply_single
78 procedure :: apply_phi_batch => exponential_apply_phi_batch
79 end type exponential_t
80
81contains
82
83 ! ---------------------------------------------------------
84 subroutine exponential_init(te, namespace, full_batch)
85 type(exponential_t), intent(out) :: te
86 type(namespace_t), intent(in) :: namespace
87 logical, optional, intent(in) :: full_batch
88
89 push_sub(exponential_init)
90
91 !%Variable TDExponentialMethod
92 !%Type integer
93 !%Default taylor
94 !%Section Time-Dependent::Propagation
95 !%Description
96 !% Method used to numerically calculate the exponential of the Hamiltonian,
97 !% a core part of the full algorithm used to approximate the evolution
98 !% operator, specified through the variable <tt>TDPropagator</tt>.
99 !% In the case of using the Magnus method, described below, the action of the exponential
100 !% of the Magnus operator is also calculated through the algorithm specified
101 !% by this variable.
102 !%Option lanczos 2
103 !% Allows for larger time-steps.
104 !% However, the larger the time-step, the longer the computational time per time-step.
105 !% In certain cases, if the time-step is too large, the code will emit a warning
106 !% whenever it considers that the evolution may not be properly proceeding --
107 !% the Lanczos process did not converge. The method consists in a Krylov
108 !% subspace approximation of the action of the exponential
109 !% (see M. Hochbruck and C. Lubich, <i>SIAM J. Numer. Anal.</i> <b>34</b>, 1911 (1997) for details).
110 !% The performance of the method is controlled by the tolerance (controlled by <tt>TDLanczosTol</tt>).
111 !% The smaller the tolerance, the more precisely the exponential
112 !% is calculated, but also the larger the dimension of the Arnoldi
113 !% subspace. If the maximum dimension (currently 200) is not enough to meet the criterion,
114 !% the above-mentioned warning is emitted.
115 !% Be aware that the larger the required dimension of the Krylov subspace, the larger
116 !% the memory required for this method. So if you run out of memory, try to reduce
117 !% the time step.
118 !%Option taylor 3
119 !% This method amounts to a straightforward application of the definition of
120 !% the exponential of an operator, in terms of its Taylor expansion.
121 !%
122 !% <math>\exp_{\rm STD} (-i\delta t H) = \sum_{i=0}^{k} {(-i\delta t)^i\over{i!}} H^i.</math>
123 !%
124 !% The order <i>k</i> is determined by variable <tt>TDExpOrder</tt>.
125 !% Some numerical considerations from <a href=http://www.phys.washington.edu/~bertsch/num3.ps>
126 !% Jeff Giansiracusa and George F. Bertsch</a>
127 !% suggest the 4th order as especially suitable and stable.
128 !%Option chebyshev 4
129 !% In principle, the Chebyshev expansion
130 !% of the exponential represents it more accurately than the canonical or standard expansion.
131 !% <tt>TDChebyshevTol</tt> determines the tolerance to which the expansion is computed.
132 !%
133 !% There exists a closed analytic form for the coefficients of the exponential in terms
134 !% of Chebyshev polynomials:
135 !%
136 !% <math>\exp_{\rm CHEB} \left( -i\delta t H \right) = \sum_{k=0}^{\infty} (2-\delta_{k0})(-i)^{k}J_k(\delta t) T_k(H),</math>
137 !%
138 !% where <math>J_k</math> are the Bessel functions of the first kind, and H has to be previously
139 !% scaled to <math>[-1,1]</math>.
140 !% See H. Tal-Ezer and R. Kosloff, <i>J. Chem. Phys.</i> <b>81</b>,
141 !% 3967 (1984); R. Kosloff, <i>Annu. Rev. Phys. Chem.</i> <b>45</b>, 145 (1994);
142 !% C. W. Clenshaw, <i>MTAC</i> <b>9</b>, 118 (1955).
143 !%End
144 call parse_variable(namespace, 'TDExponentialMethod', exp_taylor, te%exp_method)
145
146 select case (te%exp_method)
147 case (exp_taylor)
148 case (exp_chebyshev)
149 !%Variable TDChebyshevTol
150 !%Type float
151 !%Default 1e-10
152 !%Section Time-Dependent::Propagation
153 !%Description
154 !% An internal tolerance variable for the Chebyshev method. The smaller, the more
155 !% precisely the exponential is calculated and the more iterations are needed, i.e.,
156 !% it becomes more expensive. The expansion is terminated once the error estimate
157 !% is below this tolerance.
158 !%End
159 call parse_variable(namespace, 'TDChebyshevTol', 1e-10_real64, te%chebyshev_tol)
160 if (te%chebyshev_tol <= m_zero) call messages_input_error(namespace, 'TDChebyshevTol')
161 case (exp_lanczos)
162 !%Variable TDLanczosTol
163 !%Type float
164 !%Default 1e-6
165 !%Section Time-Dependent::Propagation
166 !%Description
167 !% An internal tolerance variable for the Lanczos method. The smaller, the more
168 !% precisely the exponential is calculated, and also the bigger the dimension
169 !% of the Krylov subspace needed to perform the algorithm. One should carefully
170 !% make sure that this value is not too big, or else the evolution will be
171 !% wrong.
172 !%End
173 call parse_variable(namespace, 'TDLanczosTol', 1e-6_real64, te%lanczos_tol)
174 if (te%lanczos_tol <= m_zero) call messages_input_error(namespace, 'TDLanczosTol')
175
176 case default
177 call messages_input_error(namespace, 'TDExponentialMethod')
178 end select
179 call messages_print_var_option('TDExponentialMethod', te%exp_method, namespace=namespace)
180
181 if (te%exp_method == exp_taylor) then
182 !%Variable TDExpOrder
183 !%Type integer
184 !%Default 4
185 !%Section Time-Dependent::Propagation
186 !%Description
187 !% For <tt>TDExponentialMethod</tt> = <tt>taylor</tt>,
188 !% the order to which the exponential is expanded.
189 !%End
190 call parse_variable(namespace, 'TDExpOrder', default__tdexporder, te%exp_order)
191 if (te%exp_order < 2) call messages_input_error(namespace, 'TDExpOrder')
192 else
193 if (parse_is_defined(namespace, 'TDExpOrder')) then
194 message(1) = "TDExpOrder is only relevant for TDExponentialMethod = taylor"
195 call messages_fatal(1, namespace=namespace)
196 end if
197 end if
198
199 te%arnoldi_gs = option__arnoldiorthogonalization__cgs
200 if (te%exp_method == exp_lanczos) then
201 !%Variable ArnoldiOrthogonalization
202 !%Type integer
203 !%Section Time-Dependent::Propagation
204 !%Description
205 !% The orthogonalization method used for the Arnoldi procedure.
206 !% Only for TDExponentialMethod = lanczos.
207 !%Option cgs 3
208 !% Classical Gram-Schmidt (CGS) orthogonalization.
209 !% The algorithm is defined in Giraud et al., Computers and Mathematics with Applications 50, 1069 (2005).
210 !%Option drcgs 5
211 !% Classical Gram-Schmidt orthogonalization with double-step reorthogonalization.
212 !% The algorithm is taken from Giraud et al., Computers and Mathematics with Applications 50, 1069 (2005).
213 !% According to this reference, this is much more precise than CGS or MGS algorithms.
214 !%End
215 call parse_variable(namespace, 'ArnoldiOrthogonalization', option__arnoldiorthogonalization__cgs, &
216 te%arnoldi_gs)
217 end if
218
219 ! do lanczos expansion for full batch?
220 te%full_batch = optional_default(full_batch, te%full_batch)
221
222 pop_sub(exponential_init)
223 end subroutine exponential_init
224
225 ! ---------------------------------------------------------
226 subroutine exponential_copy(teo, tei)
227 type(exponential_t), intent(inout) :: teo
228 type(exponential_t), intent(in) :: tei
229
230 push_sub(exponential_copy)
231
232 teo%exp_method = tei%exp_method
233 teo%lanczos_tol = tei%lanczos_tol
234 teo%exp_order = tei%exp_order
235 teo%arnoldi_gs = tei%arnoldi_gs
236
237 pop_sub(exponential_copy)
238 end subroutine exponential_copy
239
240 ! ---------------------------------------------------------
242 subroutine exponential_apply_single(te, namespace, mesh, hm, zpsi, ist, ik, deltat, vmagnus, imag_time)
243 class(exponential_t), intent(inout) :: te
244 type(namespace_t), intent(in) :: namespace
245 class(mesh_t), intent(in) :: mesh
246 type(hamiltonian_elec_t), intent(inout) :: hm
247 integer, intent(in) :: ist
248 integer, intent(in) :: ik
249 complex(real64), contiguous, intent(inout) :: zpsi(:, :)
250 real(real64), intent(in) :: deltat
251 real(real64), optional, intent(in) :: vmagnus(mesh%np, hm%d%nspin, 2)
252 logical, optional, intent(in) :: imag_time
253
254 type(wfs_elec_t) :: psib, inh_psib
255 complex(real64), allocatable :: zpsi_inh(:, :)
256
258
259 !We apply the phase only to np points, and the phase for the np+1 to np_part points
260 !will be treated as a phase correction in the Hamiltonian
261 if (hm%phase%is_allocated()) then
262 call hm%phase%apply_to_single(zpsi, mesh%np, hm%d%dim, ik, .false.)
263 end if
264
265 call wfs_elec_init(psib, hm%d%dim, ist, ist, zpsi, ik)
266
267 if (hamiltonian_elec_inh_term(hm)) then
268 safe_allocate(zpsi_inh(1:mesh%np_part, 1:hm%d%dim))
269 call states_elec_get_state(hm%inh_st, mesh, ist, ik, zpsi_inh(:, :))
270 call wfs_elec_init(inh_psib, hm%d%dim, ist, ist, zpsi_inh, ik)
271 call te%apply_batch(namespace, mesh, hm, psib, deltat, &
272 vmagnus=vmagnus, imag_time=imag_time, inh_psib=inh_psib)
273 call inh_psib%end()
274 safe_deallocate_a(zpsi_inh)
275 else
276 call te%apply_batch(namespace, mesh, hm, psib, deltat, &
277 vmagnus=vmagnus, imag_time=imag_time)
278 end if
279
280 call psib%end()
281
282 if (hm%phase%is_allocated()) then
283 call hm%phase%apply_to_single(zpsi, mesh%np, hm%d%dim, ik, .true.)
284 end if
285
287 end subroutine exponential_apply_single
288
289 ! ---------------------------------------------------------
306 ! ---------------------------------------------------------
307 subroutine exponential_apply_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, imag_time, inh_psib)
308 class(exponential_t), intent(inout) :: te
309 type(namespace_t), intent(in) :: namespace
310 class(mesh_t), intent(in) :: mesh
311 class(hamiltonian_abst_t), intent(inout) :: hm
312 class(batch_t), intent(inout) :: psib
313 real(real64), intent(in) :: deltat
314 class(batch_t), optional, intent(inout) :: psib2
315 real(real64), optional, intent(in) :: deltat2
316 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
317 logical, optional, intent(in) :: imag_time
318 class(batch_t), optional, intent(inout) :: inh_psib
320 complex(real64) :: deltat_, deltat2_
321 class(chebyshev_function_t), pointer :: chebyshev_function
322 logical :: imag_time_
323
325 call profiling_in("EXPONENTIAL_BATCH")
326
327 assert(psib%type() == type_cmplx)
328
329 assert(present(psib2) .eqv. present(deltat2))
330 if (present(inh_psib)) then
331 assert(inh_psib%nst == psib%nst)
332 end if
333
334 if (present(vmagnus)) then
335 select type(hm)
336 class is (hamiltonian_elec_t)
337 assert(size(vmagnus, 1) >= mesh%np)
338 assert(size(vmagnus, 2) == hm%d%nspin)
339 assert(size(vmagnus, 3) == 2)
340 class default
341 write(message(1), '(a)') 'Magnus operators only implemented for electrons at the moment'
342 call messages_fatal(1, namespace=namespace)
343 end select
344 end if
345
346 deltat2_ = cmplx(optional_default(deltat2, m_zero), m_zero, real64)
347
348 imag_time_ = optional_default(imag_time, .false.)
349 if (imag_time_) then
350 deltat_ = -m_zi*deltat
351 if (present(deltat2)) deltat2_ = m_zi*deltat2
352 else
353 deltat_ = cmplx(deltat, m_zero, real64)
354 if (present(deltat2)) deltat2_ = cmplx(deltat2, m_zero, real64)
355 end if
356
357 if (.not. hm%is_hermitian() .and. te%exp_method == exp_chebyshev) then
358 write(message(1), '(a)') 'The Chebyshev expansion cannot be used for non-Hermitian operators.'
359 write(message(2), '(a)') 'Please use the Lanczos exponentiation scheme ("TDExponentialMethod = lanczos")'
360 write(message(3), '(a)') 'or the Taylor expansion ("TDExponentialMethod = taylor") method.'
361 call messages_fatal(3, namespace=namespace)
362 end if
363
364 select case (te%exp_method)
365 case (exp_taylor)
366 ! Note that delttat2_ is only initialized if deltat2 is present.
367 if (present(deltat2)) then
368 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
369 psib2, deltat2_, vmagnus)
370 else
371 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
372 vmagnus=vmagnus)
373 end if
374 if (present(inh_psib)) then
375 if (present(deltat2)) then
376 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
377 psib2, deltat2_, vmagnus, inh_psib)
378 else
379 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
380 vmagnus=vmagnus, inh_psib=inh_psib)
381 end if
382 end if
383
384 case (exp_lanczos)
385 if (present(psib2)) call psib%copy_data_to(mesh%np, psib2)
386 call exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat_, vmagnus)
387 if (present(inh_psib)) then
388 call exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat_, vmagnus, inh_psib)
389 end if
390 if (present(psib2)) then
391 call exponential_lanczos_batch(te, namespace, mesh, hm, psib2, deltat2_, vmagnus)
392 if (present(inh_psib)) then
393 call exponential_lanczos_batch(te, namespace, mesh, hm, psib2, deltat2_, vmagnus, inh_psib)
394 end if
395 end if
396
397 case (exp_chebyshev)
398 if (present(inh_psib)) then
399 write(message(1), '(a)') 'Chebyshev exponential ("TDExponentialMethod = chebyshev")'
400 write(message(2), '(a)') 'with inhomogeneous term is not implemented'
401 call messages_fatal(2, namespace=namespace)
402 end if
403 ! initialize classes for computing coefficients
404 if (imag_time_) then
405 chebyshev_function => chebyshev_exp_imagtime_t(hm%spectral_half_span, hm%spectral_middle_point, deltat)
406 else
407 chebyshev_function => chebyshev_exp_t(hm%spectral_half_span, hm%spectral_middle_point, deltat)
408 end if
409 if (present(psib2)) call psib%copy_data_to(mesh%np, psib2)
410 call exponential_cheby_batch(te, namespace, mesh, hm, psib, deltat, chebyshev_function, vmagnus)
411 if (present(psib2)) then
412 call exponential_cheby_batch(te, namespace, mesh, hm, psib2, deltat2, chebyshev_function, vmagnus)
413 end if
414 deallocate(chebyshev_function)
415
416 end select
417
418 call profiling_out("EXPONENTIAL_BATCH")
420 end subroutine exponential_apply_batch
421
422 ! ---------------------------------------------------------
423 subroutine exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, inh_psib, phik_shift)
424 type(exponential_t), intent(inout) :: te
425 type(namespace_t), intent(in) :: namespace
426 class(mesh_t), intent(in) :: mesh
427 class(hamiltonian_abst_t), intent(inout) :: hm
428 class(batch_t), intent(inout) :: psib
429 complex(real64), intent(in) :: deltat
430 class(batch_t), optional, intent(inout) :: psib2
431 complex(real64), optional, intent(in) :: deltat2
432 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
433 class(batch_t), optional, intent(inout) :: inh_psib
434 integer, optional, intent(in) :: phik_shift
435
436 complex(real64) :: zfact, zfact2
437 integer :: iter, denom, phik_shift_
438 logical :: zfact_is_real
439 class(batch_t), allocatable :: psi1b, hpsi1b
440
442 call profiling_in("EXP_TAYLOR_BATCH")
443
444 call psib%clone_to(psi1b)
445 call psib%clone_to(hpsi1b)
446
447 zfact = m_z1
448 zfact2 = m_z1
449 zfact_is_real = abs(deltat-real(deltat)) < m_epsilon
450
451 if (present(psib2)) call psib%copy_data_to(mesh%np, psib2)
452
453 if (present(inh_psib)) then
454 zfact = zfact*deltat
455 call batch_axpy(mesh%np, real(zfact), inh_psib, psib)
456
457 if (present(psib2)) then
458 zfact2 = zfact2*deltat2
459 call batch_axpy(mesh%np, real(zfact2), inh_psib, psib2)
460 end if
461 end if
462
463 ! shift the denominator by this shift for the phi_k functions
464 phik_shift_ = optional_default(phik_shift, 0)
465
466 do iter = 1, te%exp_order
467 denom = iter+phik_shift_
468 if (present(inh_psib)) denom = denom + 1
469 zfact = zfact*(-m_zi*deltat)/denom
470 if (present(deltat2)) zfact2 = zfact2*(-m_zi*deltat2)/denom
471 zfact_is_real = .not. zfact_is_real
472 ! FIXME: need a test here for runaway exponential, e.g. for too large dt.
473 ! in runaway case the problem is really hard to trace back: the positions
474 ! go haywire on the first step of dynamics (often NaN) and with debugging options
475 ! the code stops in ZAXPY below without saying why.
476
477 if (iter /= 1) then
478 call operate_batch(hm, namespace, mesh, psi1b, hpsi1b, vmagnus)
479 else
480 if (present(inh_psib)) then
481 call operate_batch(hm, namespace, mesh, inh_psib, hpsi1b, vmagnus)
482 else
483 call operate_batch(hm, namespace, mesh, psib, hpsi1b, vmagnus)
484 end if
485 end if
486
487 if (zfact_is_real) then
488 call batch_axpy(mesh%np, real(zfact), hpsi1b, psib)
489 if (present(psib2)) call batch_axpy(mesh%np, real(zfact2), hpsi1b, psib2)
490 else
491 call batch_axpy(mesh%np, zfact, hpsi1b, psib)
492 if (present(psib2)) call batch_axpy(mesh%np, zfact2, hpsi1b, psib2)
493 end if
494
495 if (iter /= te%exp_order) call hpsi1b%copy_data_to(mesh%np, psi1b)
496
497 end do
498
499 call psi1b%end()
500 call hpsi1b%end()
501 safe_deallocate_a(psi1b)
502 safe_deallocate_a(hpsi1b)
503
504 call profiling_out("EXP_TAYLOR_BATCH")
507
508 subroutine exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat, vmagnus, inh_psib)
509 type(exponential_t), intent(inout) :: te
510 type(namespace_t), intent(in) :: namespace
511 class(mesh_t), intent(in) :: mesh
512 class(hamiltonian_abst_t), intent(inout) :: hm
513 class(batch_t), intent(inout) :: psib
514 complex(real64), intent(in) :: deltat
515 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
516 class(batch_t), optional, intent(in) :: inh_psib
517
518 class(batch_t), allocatable :: tmpb
519
521
522 if (present(inh_psib)) then
523 call inh_psib%clone_to(tmpb, copy_data=.true.)
524 ! psib = psib + deltat * phi1(-i*deltat*H) inh_psib
525 call exponential_lanczos_function_batch(te, namespace, mesh, hm, tmpb, deltat, phi1, vmagnus)
526 call batch_axpy(mesh%np, real(deltat, real64), tmpb, psib)
527 call tmpb%end()
528 else
529 call exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat, exponential, vmagnus)
530 end if
531
533 end subroutine exponential_lanczos_batch
534
535 ! ---------------------------------------------------------
547 subroutine exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat, fun, vmagnus)
548 type(exponential_t), intent(inout) :: te
549 type(namespace_t), intent(in) :: namespace
550 class(mesh_t), intent(in) :: mesh
551 class(hamiltonian_abst_t), intent(inout) :: hm
552 class(batch_t), intent(inout) :: psib
553 complex(real64), intent(in) :: deltat
554 interface
555 complex(real64) function fun(z)
556 import real64
557 complex(real64), intent(in) :: z
558 end
559 end interface
560 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
561
562 integer :: iter, l, ii, ist, max_initialized
563 complex(real64), allocatable :: hamilt(:,:,:), expo(:,:,:)
564 real(real64), allocatable :: beta(:), res(:), norm(:)
565 integer, parameter :: max_order = 200
566 type(batch_p_t), allocatable :: vb(:) ! Krylov subspace vectors
567
569 call profiling_in("EXP_LANCZOS_FUN_BATCH")
570
571 if (te%exp_method /= exp_lanczos) then
572 message(1) = "The exponential method needs to be set to Lanzcos (TDExponentialMethod=lanczos)."
573 call messages_fatal(1)
574 end if
575
576 safe_allocate(beta(1:psib%nst))
577 safe_allocate(res(1:psib%nst))
578 safe_allocate(norm(1:psib%nst))
579 safe_allocate(vb(1:max_order))
580 call psib%clone_to(vb(1)%p)
581 max_initialized = 1
582
583 call psib%copy_data_to(mesh%np, vb(1)%p, async=.true.)
584 call mesh_batch_nrm2(mesh, vb(1)%p, beta)
585
586 if (te%full_batch) beta = norm2(beta)
587
588 ! If we have a null vector, no need to compute the action of the exponential.
589 if (all(abs(beta) <= 1.0e-12_real64)) then
590 safe_deallocate_a(beta)
591 safe_deallocate_a(res)
592 safe_deallocate_a(norm)
593 call vb(1)%p%end()
594 safe_deallocate_a(vb)
595 call profiling_out("EXP_LANCZOS_FUN_BATCH")
597 return
598 end if
599
600 call batch_scal(mesh%np, m_one/beta, vb(1)%p, a_full = .false.)
602 safe_allocate(hamilt(1:max_order+1, 1:max_order+1, 1:psib%nst))
603 safe_allocate( expo(1:max_order+1, 1:max_order+1, 1:psib%nst))
604
605 ! This is the Lanczos loop...
606 do iter = 1, max_order
607 call psib%clone_to(vb(iter + 1)%p)
608 max_initialized = iter + 1
609
610 ! to apply the Hamiltonian
611 call operate_batch(hm, namespace, mesh, vb(iter)%p, vb(iter+1)%p, vmagnus)
612
613 ! We use either the Lanczos method (Hermitian case) or the Arnoldi method
614 if (hm%is_hermitian()) then
615 l = max(1, iter - 1)
616 hamilt(1:max(l-1, 1), iter, 1:psib%nst) = m_zero
617 else
618 l = 1
619 if (iter > 2) then
620 hamilt(iter, 1:iter-2, 1:psib%nst) = m_zero
621 end if
622 end if
623
624 ! Orthogonalize against previous vectors
625 call zmesh_batch_orthogonalization(mesh, iter - l + 1, vb(l:iter), vb(iter+1)%p, &
626 normalize = .false., overlap = hamilt(l:iter, iter, 1:psib%nst), norm = hamilt(iter + 1, iter, 1:psib%nst), &
627 gs_scheme = te%arnoldi_gs, full_batch = te%full_batch)
628
629 ! We now need to compute exp(Hm), where Hm is the projection of the linear transformation
630 ! of the Hamiltonian onto the Krylov subspace Km
631 ! See Eq. 4
632 !
633 ! Note that in the Hermitian case, we use the Lanczos algorithm that requires
634 ! only a tridiagonal matrix. Else we have an upper Hessenberg matrix.
635 do ii = 1, psib%nst
636 call zlalg_matrix_function(iter, -m_zi*deltat, hamilt(:,:,ii), expo(:,:,ii), fun, &
637 hm%is_hermitian(), tridiagonal=hm%is_hermitian())
638 res(ii) = abs(hamilt(iter + 1, iter, ii) * abs(expo(iter, 1, ii)))
639 end do !ii
641 ! We now estimate the error we made. This is given by the formula denoted Er2 in Sec. 5.2
642 if (all(abs(hamilt(iter + 1, iter, :)) < 1.0e4_real64*m_epsilon)) exit ! "Happy breakdown"
643 ! We normalize only if the norm is non-zero
644 ! see http://www.netlib.org/utk/people/JackDongarra/etemplates/node216.html#alg:arn0
645 norm = m_one
646 do ist = 1, psib%nst
647 if (abs(hamilt(iter + 1, iter, ist)) >= 1.0e4_real64 * m_epsilon) then
648 norm(ist) = m_one / abs(hamilt(iter + 1, iter, ist))
649 end if
650 end do
651
652 call batch_scal(mesh%np, norm, vb(iter+1)%p, a_full = .false.)
653
654 if (iter > 3 .and. all(res < te%lanczos_tol)) exit
655
656 end do !iter
657
658 if (iter == max_order) then ! Here one should consider the possibility of the happy breakdown.
659 write(message(1),'(a,i5,a,es9.2)') 'Lanczos exponential expansion did not converge after ', iter, &
660 ' iterations. Residual: ', maxval(res)
661 call messages_warning(1, namespace=namespace)
662 else
663 write(message(1),'(a,i5)') 'Debug: Lanczos exponential iterations: ', iter
664 call messages_info(1, namespace=namespace, debug_only=.true.)
665 end if
666
667 ! See Eq. 4 for the expression here
668 ! zpsi = nrm * V * expo(1:iter, 1) = nrm * V * expo * V^(T) * zpsi
669 call batch_scal(mesh%np, expo(1,1,1:psib%nst), psib, a_full = .false.)
670 ! TODO: We should have a routine batch_gemv for improved performance (see #1070 on gitlab)
671 do ii = 2, iter
672 call batch_axpy(mesh%np, beta(1:psib%nst)*expo(ii,1,1:psib%nst), vb(ii)%p, psib, a_full = .false.)
673 ! In order to apply the two exponentials, we must store the eigenvalues and eigenvectors given by zlalg_exp
674 ! And to recontruct here the exp(i*dt*H) for deltat2
675 end do
676
677 do ii = 1, max_initialized
678 call vb(ii)%p%end()
679 end do
680
681 safe_deallocate_a(vb)
682 safe_deallocate_a(hamilt)
683 safe_deallocate_a(expo)
684 safe_deallocate_a(beta)
685 safe_deallocate_a(res)
686 safe_deallocate_a(norm)
687
688 call accel_finish()
689
690 call profiling_out("EXP_LANCZOS_FUN_BATCH")
691
694
695
696 ! ---------------------------------------------------------
708 subroutine exponential_cheby_batch(te, namespace, mesh, hm, psib, deltat, chebyshev_function, vmagnus)
709 type(exponential_t), intent(inout) :: te
710 type(namespace_t), intent(in) :: namespace
711 class(mesh_t), intent(in) :: mesh
712 class(hamiltonian_abst_t), intent(inout) :: hm
713 class(batch_t), intent(inout) :: psib
714 real(real64), intent(in) :: deltat
715 class(chebyshev_function_t), intent(in) :: chebyshev_function
716 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
717
718 integer :: j, order_needed
719 complex(real64) :: coefficient
720 complex(real64), allocatable :: coefficients(:)
721 real(real64) :: error
722 class(batch_t), allocatable, target :: psi0, psi1, psi2
723 class(batch_t), pointer :: psi_n, psi_n1, psi_n2
724 integer, parameter :: max_order = 200
725
727 call profiling_in("EXP_CHEBY_BATCH")
728
729 call psib%clone_to(psi0)
730 call psib%clone_to(psi1)
731 call psib%clone_to(psi2)
732 call psib%copy_data_to(mesh%np, psi0)
733
734 order_needed = max_order
735 do j = 1, max_order
736 error = chebyshev_function%get_error(j)
737 if (error > m_zero .and. error < te%chebyshev_tol) then
738 order_needed = j
739 exit
740 end if
741 end do
742
743 call chebyshev_function%get_coefficients(j, coefficients)
744
745 ! zero-order term
746 call batch_scal(mesh%np, coefficients(0), psib)
747 ! first-order term
748 ! shifted Hamiltonian
749 call operate_batch(hm, namespace, mesh, psi0, psi1, vmagnus)
750 call batch_axpy(mesh%np, -hm%spectral_middle_point, psi0, psi1)
751 call batch_scal(mesh%np, m_one/hm%spectral_half_span, psi1)
752 ! accumulate result
753 call batch_axpy(mesh%np, coefficients(1), psi1, psib)
754
755 ! use pointers to avoid copies
756 psi_n => psi2
757 psi_n1 => psi1
758 psi_n2 => psi0
759
760 do j = 2, order_needed
761 ! compute shifted Hamiltonian and Chebyshev recurrence formula
762 call operate_batch(hm, namespace, mesh, psi_n1, psi_n, vmagnus)
763 call batch_axpy(mesh%np, -hm%spectral_middle_point, psi_n1, psi_n)
764 call batch_xpay(mesh%np, psi_n2, -m_two/hm%spectral_half_span, psi_n)
765 call batch_scal(mesh%np, -m_one, psi_n)
766
767 ! accumulate result
768 call batch_axpy(mesh%np, coefficients(j), psi_n, psib)
769
770 ! shift pointers for the three-term recurrence, this avoids copies
771 if (mod(j, 3) == 2) then
772 psi_n => psi0
773 psi_n1 => psi2
774 psi_n2 => psi1
775 else if (mod(j, 3) == 1) then
776 psi_n => psi2
777 psi_n1 => psi1
778 psi_n2 => psi0
779 else
780 psi_n => psi1
781 psi_n1 => psi0
782 psi_n2 => psi2
783 end if
784 end do
785
786 if (order_needed == max_order) then
787 write(message(1),'(a,i5,a,es9.2)') 'Chebyshev exponential expansion did not converge after ', j, &
788 ' iterations. Coefficient: ', coefficient
789 call messages_warning(1, namespace=namespace)
790 else
791 write(message(1),'(a,i5)') 'Debug: Chebyshev exponential iterations: ', j
792 call messages_info(1, namespace=namespace, debug_only=.true.)
793 end if
794
795 call psi0%end()
796 call psi1%end()
797 call psi2%end()
798 safe_deallocate_a(psi0)
799 safe_deallocate_a(psi1)
800 safe_deallocate_a(psi2)
802 safe_deallocate_a(coefficients)
803
804 call profiling_out("EXP_CHEBY_BATCH")
805
807 end subroutine exponential_cheby_batch
808
809
810 subroutine operate_batch(hm, namespace, mesh, psib, hpsib, vmagnus)
811 class(hamiltonian_abst_t), intent(inout) :: hm
812 type(namespace_t), intent(in) :: namespace
813 class(mesh_t), intent(in) :: mesh
814 class(batch_t), intent(inout) :: psib
815 class(batch_t), intent(inout) :: hpsib
816 real(real64), optional, intent(in) :: vmagnus(:, :, :)
817
818 push_sub(operate_batch)
819
820 if (present(vmagnus)) then
821 call hm%zmagnus_apply(namespace, mesh, psib, hpsib, vmagnus)
822 else
823 call hm%zapply(namespace, mesh, psib, hpsib)
824 end if
825
826 pop_sub(operate_batch)
827 end subroutine operate_batch
828
829 ! ---------------------------------------------------------
833 subroutine exponential_apply_all(te, namespace, gr, hm, st, deltat, order)
834 type(exponential_t), intent(inout) :: te
835 type(namespace_t), intent(in) :: namespace
836 type(grid_t), intent(inout) :: gr
837 type(hamiltonian_elec_t), intent(inout) :: hm
838 type(states_elec_t), intent(inout) :: st
839 real(real64), intent(in) :: deltat
840 integer, optional, intent(inout) :: order
841
842 integer :: ik, ib, i
843 real(real64) :: zfact
844
845 type(states_elec_t) :: st1, hst1
846
847 push_sub(exponential_apply_all)
848
849 assert(te%exp_method == exp_taylor)
850
851 call states_elec_copy(st1, st)
852 call states_elec_copy(hst1, st)
853
854 zfact = m_one
855 do i = 1, te%exp_order
856 zfact = zfact * deltat / i
857
858 if (i == 1) then
859 call zhamiltonian_elec_apply_all(hm, namespace, gr, st, hst1)
860 else
861 call zhamiltonian_elec_apply_all(hm, namespace, gr, st1, hst1)
862 end if
863
864 do ik = st%d%kpt%start, st%d%kpt%end
865 do ib = st%group%block_start, st%group%block_end
866 call batch_set_zero(st1%group%psib(ib, ik))
867 call batch_axpy(gr%np, -m_zi, hst1%group%psib(ib, ik), st1%group%psib(ib, ik))
868 call batch_axpy(gr%np, zfact, st1%group%psib(ib, ik), st%group%psib(ib, ik))
869 end do
870 end do
871
872 end do
873 ! End of Taylor expansion loop.
874
875 call states_elec_end(st1)
876 call states_elec_end(hst1)
877
878 ! We now add the inhomogeneous part, if present.
879 if (hamiltonian_elec_inh_term(hm)) then
880 !write(*, *) 'Now we apply the inhomogeneous term...'
881
882 call states_elec_copy(st1, hm%inh_st)
883 call states_elec_copy(hst1, hm%inh_st)
884
885
886 do ik = st%d%kpt%start, st%d%kpt%end
887 do ib = st%group%block_start, st%group%block_end
888 call batch_axpy(gr%np, deltat, st1%group%psib(ib, ik), st%group%psib(ib, ik))
889 end do
890 end do
891
892 zfact = m_one
893 do i = 1, te%exp_order
894 zfact = zfact * deltat / (i+1)
895
896 if (i == 1) then
897 call zhamiltonian_elec_apply_all(hm, namespace, gr, hm%inh_st, hst1)
898 else
899 call zhamiltonian_elec_apply_all(hm, namespace, gr, st1, hst1)
900 end if
901
902 do ik = st%d%kpt%start, st%d%kpt%end
903 do ib = st%group%block_start, st%group%block_end
904 call batch_set_zero(st1%group%psib(ib, ik))
905 call batch_axpy(gr%np, -m_zi, hst1%group%psib(ib, ik), st1%group%psib(ib, ik))
906 call batch_axpy(gr%np, deltat * zfact, st1%group%psib(ib, ik), st%group%psib(ib, ik))
907 end do
908 end do
909
910 end do
911
912 call states_elec_end(st1)
913 call states_elec_end(hst1)
914
915 end if
916
917 if (present(order)) order = te%exp_order*st%nik*st%nst ! This should be the correct number
918
919 pop_sub(exponential_apply_all)
920 end subroutine exponential_apply_all
921
922 subroutine exponential_apply_phi_batch(te, namespace, mesh, hm, psib, deltat, k, vmagnus)
923 class(exponential_t), intent(inout) :: te
924 type(namespace_t), intent(in) :: namespace
925 class(mesh_t), intent(in) :: mesh
926 class(hamiltonian_abst_t), intent(inout) :: hm
927 class(batch_t), intent(inout) :: psib
928 real(real64), intent(in) :: deltat
929 integer, intent(in) :: k
930 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
931
932 class(chebyshev_function_t), pointer :: chebyshev_function
933 complex(real64) :: deltat_
934
935 push_sub_with_profile(exponential_apply_phi_batch)
936
937 assert(psib%type() == type_cmplx)
938 if (present(vmagnus)) then
939 select type(hm)
940 class is (hamiltonian_elec_t)
941 assert(size(vmagnus, 1) >= mesh%np)
942 assert(size(vmagnus, 2) == hm%d%nspin)
943 assert(size(vmagnus, 3) == 2)
944 class default
945 write(message(1), '(a)') 'Magnus operators only implemented for electrons at the moment'
946 call messages_fatal(1, namespace=namespace)
947 end select
948 end if
949
950 if (.not. hm%is_hermitian() .and. te%exp_method == exp_chebyshev) then
951 write(message(1), '(a)') 'The Chebyshev expansion for the exponential will only converge if the imaginary'
952 write(message(2), '(a)') 'eigenvalues are small enough compared to the span of the real eigenvalues,'
953 write(message(3), '(a)') 'i.e., for ratios smaller than about 1e-3.'
954 write(message(4), '(a)') 'The Lanczos method ("TDExponentialMethod = lanczos") is guaranteed to'
955 write(message(5), '(a)') 'always converge in this case.'
956 call messages_warning(5, namespace=namespace)
957 end if
958
959 deltat_ = cmplx(deltat, m_zero, real64)
960
961 select case (te%exp_method)
962 case (exp_taylor)
963 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, vmagnus=vmagnus, phik_shift=k)
964
965 case (exp_lanczos)
966 if (k == 1) then
967 call exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat_, phi1, vmagnus)
968 else if (k == 2) then
969 call exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat_, phi2, vmagnus)
970 else
971 write(message(1), '(a)') 'Lanczos expansion not implemented for phi_k, k > 2'
972 call messages_fatal(1, namespace=namespace)
973 end if
974
975 case (exp_chebyshev)
976 if (k == 1) then
977 chebyshev_function => chebyshev_numerical_t(hm%spectral_half_span, hm%spectral_middle_point, deltat, phi1)
978 else if (k == 2) then
979 chebyshev_function => chebyshev_numerical_t(hm%spectral_half_span, hm%spectral_middle_point, deltat, phi2)
980 else
981 write(message(1), '(a)') 'Chebyshev expansion not implemented for phi_k, k > 2'
982 call messages_fatal(1, namespace=namespace)
983 end if
984 call exponential_cheby_batch(te, namespace, mesh, hm, psib, deltat, chebyshev_function, vmagnus)
985 deallocate(chebyshev_function)
986
987 end select
988 pop_sub_with_profile(exponential_apply_phi_batch)
989 end subroutine exponential_apply_phi_batch
990
991end module exponential_oct_m
992
993!! Local Variables:
994!! mode: f90
995!! coding: utf-8
996!! End:
batchified version of the BLAS axpy routine:
Definition: batch_ops.F90:154
scale a batch by a constant or vector
Definition: batch_ops.F90:162
batchified version of
Definition: batch_ops.F90:170
subroutine, public accel_finish()
Definition: accel.F90:1407
This module implements batches of mesh functions.
Definition: batch.F90:133
This module implements common operations on batches of mesh functions.
Definition: batch_ops.F90:116
subroutine, public batch_set_zero(this, np, async)
fill all mesh functions of the batch with zero
Definition: batch_ops.F90:242
This module contains interfaces for BLAS routines You should not use these routines directly....
Definition: blas.F90:118
subroutine, public exponential_copy(teo, tei)
subroutine, public exponential_apply_all(te, namespace, gr, hm, st, deltat, order)
Note that this routine not only computes the exponential, but also an extra term if there is a inhomo...
subroutine operate_batch(hm, namespace, mesh, psib, hpsib, vmagnus)
subroutine exponential_apply_phi_batch(te, namespace, mesh, hm, psib, deltat, k, vmagnus)
subroutine, public exponential_init(te, namespace, full_batch)
integer, parameter, public exp_taylor
subroutine exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat, vmagnus, inh_psib)
subroutine exponential_apply_single(te, namespace, mesh, hm, zpsi, ist, ik, deltat, vmagnus, imag_time)
Wrapper to batchified routine for applying exponential to an array.
subroutine exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, inh_psib, phik_shift)
subroutine exponential_apply_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, imag_time, inh_psib)
This routine performs the operation:
subroutine, public exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat, fun, vmagnus)
Compute fun(H) psib, i.e. the application of a function of the Hamiltonian to a batch.
integer, parameter, public exp_chebyshev
subroutine exponential_cheby_batch(te, namespace, mesh, hm, psib, deltat, chebyshev_function, vmagnus)
Calculates the exponential of the Hamiltonian through an expansion in Chebyshev polynomials.
real(real64), parameter, public m_two
Definition: global.F90:190
real(real64), parameter, public m_zero
Definition: global.F90:188
complex(real64), parameter, public m_zi
Definition: global.F90:202
real(real64), parameter, public m_epsilon
Definition: global.F90:204
complex(real64), parameter, public m_z1
Definition: global.F90:199
real(real64), parameter, public m_one
Definition: global.F90:189
This module implements the underlying real-space grid.
Definition: grid.F90:117
This module defines an abstract class for Hamiltonians.
subroutine, public zhamiltonian_elec_apply_all(hm, namespace, gr, st, hst)
pure logical function, public hamiltonian_elec_inh_term(hm)
subroutine, public zlalg_matrix_function(n, factor, a, fun_a, fun, hermitian, tridiagonal)
This routine calculates a function of a matrix by using an eigenvalue decomposition.
Definition: lalg_adv.F90:2262
This module is intended to contain "only mathematical" functions and procedures.
Definition: math.F90:115
complex(real64) pure function, public phi2(z)
Compute phi2(z) = (phi1(z)-1)/z = (exp(z) - z - 1)/z^2.
Definition: math.F90:975
complex(real64) pure function, public exponential(z)
Wrapper for exponential.
Definition: math.F90:944
complex(real64) pure function, public phi1(z)
Compute phi1(z) = (exp(z)-1)/z.
Definition: math.F90:956
This module defines functions over batches of mesh functions.
Definition: mesh_batch.F90:116
subroutine, public mesh_batch_nrm2(mesh, aa, nrm2, reduce)
Calculate the norms (norm2, not the square!) of a batch of mesh functions.
Definition: mesh_batch.F90:177
subroutine, public zmesh_batch_orthogonalization(mesh, nst, psib, phib, normalize, overlap, norm, gs_scheme, full_batch)
Orthonormalizes states of phib to the orbitals of nst batches of psi.
This module defines various routines, operating on mesh functions.
This module defines the meshes, which are used in Octopus.
Definition: mesh.F90:118
subroutine, public messages_warning(no_lines, all_nodes, namespace)
Definition: messages.F90:537
character(len=256), dimension(max_lines), public message
to be output by fatal, warning
Definition: messages.F90:160
subroutine, public messages_fatal(no_lines, only_root_writes, namespace)
Definition: messages.F90:414
subroutine, public messages_input_error(namespace, var, details, row, column)
Definition: messages.F90:713
subroutine, public messages_info(no_lines, iunit, debug_only, stress, all_nodes, namespace)
Definition: messages.F90:616
logical function, public parse_is_defined(namespace, name)
Definition: parser.F90:502
subroutine, public profiling_out(label)
Increment out counter and sum up difference between entry and exit time.
Definition: profiling.F90:623
subroutine, public profiling_in(label, exclude)
Increment in counter and save entry time.
Definition: profiling.F90:552
subroutine, public states_elec_end(st)
finalize the states_elec_t object
subroutine, public states_elec_copy(stout, stin, exclude_wfns, exclude_eigenval, special)
make a (selective) copy of a states_elec_t object
type(type_t), public type_cmplx
Definition: types.F90:134
Definition: xc.F90:114
Class defining batches of mesh functions.
Definition: batch.F90:159
Description of the grid, containing information on derivatives, stencil, and symmetries.
Definition: grid.F90:169
The abstract Hamiltonian class defines a skeleton for specific implementations.
Describes mesh distribution to nodes.
Definition: mesh.F90:186
The states_elec_t class contains all electronic wave functions.
batches of electronic states
Definition: wfs_elec.F90:139
int true(void)