4 Giantomassi GWR Abinit
M. Giantomassi
Université Catholique de Louvain
Louvain-la-Neuve, Belgium
In brief:
• Work with the analytic continuation of Hedin’s equations on the imaginary axis: (t → iτ, ω → iω)
• Avoid convolutions by working in the most natural space, e.g.:
$$\chi_{\mathbf{k}}(g, g', i\omega_k) = \sum_{j=1}^{N} \gamma_{kj} \cos(\omega_k \tau_j)\, \chi_{\mathbf{k}}(g, g', i\tau_j)$$
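As a rough illustration of the cosine transform above, the sketch below transforms a single-pole model χ(iτ) = exp(-Δ|τ|) to imaginary frequency and compares with the analytic result 2Δ/(Δ² + ω²). The uniform τ mesh with trapezoidal weights is only a stand-in for the minimax mesh and the weights γ_kj used in GWR; `gap`, `trapezoid_mesh` and `cosine_transform` are illustrative names, not ABINIT routines.

```python
import math

def trapezoid_mesh(tau_max, npts):
    """Uniform tau mesh on [0, tau_max] with trapezoidal weights (assumption:
    a stand-in for the minimax mesh, which needs far fewer points)."""
    h = tau_max / (npts - 1)
    taus = [j * h for j in range(npts)]
    weights = [h] * npts
    weights[0] = weights[-1] = h / 2
    return taus, weights

def cosine_transform(chi_tau, taus, weights, omega):
    """chi(i omega) = sum_j gamma_j cos(omega tau_j) chi(i tau_j).
    The factor 2 accounts for chi being even in tau."""
    return sum(2.0 * w * math.cos(omega * t) * c
               for t, w, c in zip(taus, weights, chi_tau))

gap = 0.5                                   # toy excitation energy (Ha)
taus, weights = trapezoid_mesh(40.0, 4001)
chi_tau = [math.exp(-gap * t) for t in taus]
omegas = [0.0, 0.3, 1.0]
chi_omega = [cosine_transform(chi_tau, taus, weights, w) for w in omegas]
exact = [2 * gap / (gap**2 + w**2) for w in omegas]
```

The brute-force mesh needs thousands of points for this accuracy, which is why a minimax mesh with ~20 points is attractive.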








GWR code in a nutshell
• optdriver 6 to activate the GWR driver
• gwr_task specifies the task to perform:
‣ "HDIAGO" for direct diagonalization with ScaLAPACK followed by WFK output
‣ "G0W0" for the one-shot method
‣ "EGEW", "EGW0", "G0EW" for eigenvalue-only self-consistency
‣ "RPA_ENERGY" for the correlation energy Ec with automatic extrapolation for npweps → ∞
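$$G(\mathbf{r}, \mathbf{r}', \omega) = \sum_n \frac{\psi_n(\mathbf{r})\, \psi_n^*(\mathbf{r}')}{\omega - \varepsilon_n + i\delta^{+}\, \mathrm{sign}(\varepsilon_n)}$$
$$G(\mathbf{r}, \mathbf{r}', i\tau) = \Theta(\tau)\, \overline{G}(\mathbf{r}, \mathbf{r}', i\tau) + \Theta(-\tau)\, \underline{G}(\mathbf{r}, \mathbf{r}', i\tau)$$
$$\overline{G}(\mathbf{r}, \mathbf{r}', i\tau) = -\sum_n^{\mathrm{unocc}} \psi_n(\mathbf{r})\, \psi_n^*(\mathbf{r}')\, e^{-\varepsilon_n \tau} \qquad (\tau > 0)$$
$$\underline{G}(\mathbf{r}, \mathbf{r}', i\tau) = \sum_n^{\mathrm{occ}} \psi_n(\mathbf{r})\, \psi_n^*(\mathbf{r}')\, e^{-\varepsilon_n \tau} \qquad (\tau < 0)$$
Both exponentials are bounded (energies ε_n are measured from the Fermi level).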
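A minimal sketch of the two branches of G(iτ), assuming a toy model with a handful of levels measured from the Fermi level and made-up weights |ψ_n(r)|². It only illustrates why both exponentials stay bounded; `green_tau` is a hypothetical helper, not part of ABINIT.

```python
import math

def green_tau(energies, weights, tau):
    """Diagonal element G(r, r, i*tau) of a toy system.

    energies: levels relative to the Fermi level (negative = occupied)
    weights:  |psi_n(r)|^2 for each level (illustrative numbers)
    tau > 0 uses the unoccupied branch, tau <= 0 the occupied one.
    """
    if tau > 0:
        return -sum(w * math.exp(-e * tau)
                    for e, w in zip(energies, weights) if e > 0)
    return sum(w * math.exp(-e * tau)
               for e, w in zip(energies, weights) if e < 0)

energies = [-0.4, -0.1, 0.2, 0.9]
weights = [0.3, 0.2, 0.4, 0.1]
g_plus = green_tau(energies, weights, 5.0)     # unoccupied states, decays with tau
g_minus = green_tau(energies, weights, -5.0)   # occupied states, decays with |tau|
```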
GWR algorithm
P. Liu et al., Phys. Rev. B 94, 165109 (2016)
WFK generation with direct diagonalization
optdriver 6                 # enter GWR code
gwr_task "HDIAGO"           # direct diago
getden_filepath "GS_DEN"    # read GS density to build H
nband 1200                  # occ + empty states
[Figure: ScaLAPACK diago vs iterative eigensolvers]
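With ℛ the rotation matrix and t the fractional translation:
$$[H, \{\mathcal{R}, \mathbf{t}\}] = 0, \qquad \epsilon_{\mathcal{R}\mathbf{k}} = \epsilon_{\mathbf{k}}$$
$$u_{\mathcal{R}\mathbf{k}}(\mathbf{r}) = e^{-i\mathcal{R}\mathbf{k}\cdot\mathbf{t}}\, u_{\mathbf{k}}(\mathcal{R}^{-1}(\mathbf{r} - \mathbf{t}))$$
$$u_{\mathcal{R}\mathbf{k}}(\mathbf{G}) = e^{-i(\mathcal{R}\mathbf{k}+\mathbf{G})\cdot\mathbf{t}}\, u_{\mathbf{k}}(\mathcal{R}^{-1}\mathbf{G})$$
Take-home message:
‣ Bloch states are computed in the IBZ and then reconstructed in the BZ at runtime
‣ G(k) and χ⁰(q) are computed and stored only for k/q in the IBZ
‣ BZ integrals depending on an external q can be restricted to the IBZq defined by the little group of q
‣ Significant speedup and memory saving in high-symmetry systems. Time-reversal can be easily included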
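To make the IBZ/BZ bookkeeping concrete, here is a small sketch for a 2D square lattice with C4v point symmetry and a 4x4 Γ-centered mesh (both chosen only for illustration, with no fractional translations). k-points are stored as integers in units of 1/n so that the folding back into the BZ is exact.

```python
from itertools import product

# The 8 point-group operations of C4v as 2x2 integer matrices (assumed toy symmetry).
C4V = [((1, 0), (0, 1)), ((-1, 0), (0, -1)), ((0, -1), (1, 0)), ((0, 1), (-1, 0)),
       ((1, 0), (0, -1)), ((-1, 0), (0, 1)), ((0, 1), (1, 0)), ((0, -1), (-1, 0))]

def rotate(op, k, n):
    """Apply op to k (reduced coordinates in units of 1/n) and fold back into the mesh."""
    return tuple((op[i][0] * k[0] + op[i][1] * k[1]) % n for i in range(2))

def irreducible_wedge(n, ops):
    """Map each irreducible k-point to its star (all symmetry-equivalent points)."""
    stars, seen = {}, set()
    for k in product(range(n), repeat=2):
        if k in seen:
            continue
        star = sorted({rotate(op, k, n) for op in ops})
        stars[k] = star
        seen.update(star)
    return stars

stars = irreducible_wedge(4, C4V)
n_ibz = len(stars)                              # points that must be stored
n_bz = sum(len(star) for star in stars.values())  # points reconstructed at runtime
```

In this toy case only 6 of the 16 mesh points need to be computed and stored.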
MPI distribution of G, χ, W in GWR
• 4D MPI grid to distribute memory and operations over $G^{\sigma}_{\mathbf{k}}(g, g', \pm i\tau)$:
- collinear spins σ inside spin_comm (trivial algo.)
- IBZ k-points inside kpt_comm
- g' components inside g_comm (PBLAS distribution of the (g, g') matrix)
- iτ/iω points inside tau_comm (almost trivial algo.)
• spin_comm and tau_comm levels are very efficient (few MPI communications)
• kpt_comm and g_comm are network intensive but crucial to keep memory at bay
• Indeed, to go to the supercell we need to pre-compute and store in memory, for each k ∈ BZ:
$$G_{\mathbf{k}}(r, g') = \sum_{g} e^{i(\mathbf{k}+g) r}\, G_{\mathbf{k}}(g, g'), \qquad \text{memory} \propto \frac{N_{BZ}}{np_k} \times \frac{n_{fft}}{np_g} \times npw$$
• For optimal performance, the number of MPI procs should be a multiple of gwr_ntau × nsppol, but mind the memory for G!
• Matrices are stored in single precision by default (configure with --enable-gw-dpc="yes" to use double precision)
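The scaling law above can be turned into a back-of-the-envelope estimate. The sketch below assumes complex numbers of 8 bytes (single precision) or 16 bytes (double precision) and ignores every other array and prefactor, so it is an order-of-magnitude guide rather than ABINIT's actual memory report. Function and argument names are made up for the example.

```python
def green_workspace_bytes(nbz, nfft, npw, np_k, np_g, double_precision=False):
    """Per-process workspace for G_k(r, g'), following
    memory ∝ (N_BZ / np_k) × (n_fft / np_g) × npw."""
    bytes_per_element = 16 if double_precision else 8   # complex double vs complex single
    nk_local = -(-nbz // np_k)      # ceiling division: k-points per kpt_comm rank
    nfft_local = -(-nfft // np_g)   # ceiling division: share per g_comm rank
    return nk_local * nfft_local * npw * bytes_per_element

def is_balanced(nprocs, gwr_ntau, nsppol):
    """True if nprocs is a multiple of gwr_ntau × nsppol, as recommended above."""
    return nprocs % (gwr_ntau * nsppol) == 0

mem_sp = green_workspace_bytes(nbz=64, nfft=1000, npw=100, np_k=4, np_g=2)
mem_dp = green_workspace_bytes(nbz=64, nfft=1000, npw=100, np_k=4, np_g=2,
                               double_precision=True)
```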
From G to χ in iτ space (step 1)
NB: the loop over τ-points is external. At each iteration, we have to consider ±τ_i (not always shown in the equations)
For each k in the BZ, do:
➡ Use symmetries to build G_k in the BZ:
$$G_{\mathbf{k}}^{IBZ}(g, g') \xrightarrow{\text{symm}} G_{\mathbf{k}}(g, g')$$
➡ FFT along the g index (local to each MPI proc; g' is MPI-distributed inside g_comm):
$$G_{\mathbf{k}}(r, g') = \sum_{g} e^{i(\mathbf{k}+g) r}\, G_{\mathbf{k}}(g, g')$$
Cons:
‣ Workspace memory ∝ $\frac{N_{BZ}}{np_k} \times \frac{n_{fft}}{np_g} \times npw$
‣ Lots of calls to PTRANS: $2 N_{BZ} / np_k$
‣ Memory increases with N_r (ecut and gwr_boxcutratio)
Pros:
‣ Linear scaling in N_BZ
‣ Scales well with np_k (fewer PTRANS calls)
From G to χ in iτ space (step 2)
Step 1. For each r in the unit cell, use G̃ to compute:
$$\chi(r, R', i\tau) = G(r, R', i\tau)\, G^*(R', r, -i\tau)$$
Only χ_r(R′) is stored at fixed r. Transform immediately to G′-space (k + g′) and store the results in a temporary PBLAS matrix χ̃:
$$\chi(r, G') = \sum_{R' \in S} \chi(r, R')\, e^{iG'R'}$$
Step 2. Once all r have been computed, MPI-transpose χ and perform the FFT along the r-axis:
$$\chi_{\mathbf{k}}(g, g') = \sum_{r \in C} e^{-i(\mathbf{k}+g) r}\, \chi(r, \mathbf{k} + g')$$
Only k-points in the IBZ are stored. Matrices are PBLAS-distributed.
Cons:
‣ k-parallelism requires nfft communications
‣ We lose part of the speedup gained in step 1
Pros:
‣ Tons of FFTs in batch mode (blocking over r)
‣ Ideal scenario for OpenMP/GPUs
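The real-space product of step 1 can be checked on a toy model. The sketch below assumes three plane-wave-like orbitals on a 4-point grid with made-up energies, builds one row χ(r, R′, iτ) at fixed r from the two Green's function branches, and compares it with the explicit sum over occupied/unoccupied pairs. All names are illustrative.

```python
import cmath
import math

def green(orbitals, energies, r1, r2, tau):
    """G(r1, r2, i*tau): unoccupied branch for tau > 0, occupied branch otherwise."""
    if tau > 0:
        return -sum(psi[r1] * psi[r2].conjugate() * math.exp(-e * tau)
                    for psi, e in zip(orbitals, energies) if e > 0)
    return sum(psi[r1] * psi[r2].conjugate() * math.exp(-e * tau)
               for psi, e in zip(orbitals, energies) if e < 0)

def chi_row(orbitals, energies, r, tau, npoints):
    """chi(r, R', i*tau) = G(r, R', i*tau) * conj(G(R', r, -i*tau)) for all R' at fixed r."""
    return [green(orbitals, energies, r, rp, tau)
            * green(orbitals, energies, rp, r, -tau).conjugate()
            for rp in range(npoints)]

def chi_pair_sum(orbitals, energies, r, rp, tau):
    """Same quantity written as an explicit sum over (occupied i, unoccupied a) pairs."""
    total = 0j
    for psi_i, e_i in zip(orbitals, energies):
        if e_i >= 0:
            continue
        for psi_a, e_a in zip(orbitals, energies):
            if e_a <= 0:
                continue
            total -= (psi_a[r] * psi_a[rp].conjugate()
                      * psi_i[r] * psi_i[rp].conjugate()
                      * math.exp(-(e_a - e_i) * tau))
    return total

npoints = 4
energies = [-0.5, 0.3, 0.8]   # relative to the Fermi level (toy values)
orbitals = [[cmath.exp(2j * math.pi * n * r / npoints) / 2 for r in range(npoints)]
            for n in range(3)]
row = chi_row(orbitals, energies, r=1, tau=2.0, npoints=npoints)
```

The pair sum costs one term per (i, a) pair, whereas the product form needs only the two Green's functions, which is where the cubic scaling comes from.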
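Computing W from χ
Step 1. Cosine transform (iτ → iω). Requires communication inside tau_comm:
$$\chi_{\mathbf{k}}(g, g', i\omega_k) = \sum_{j=1}^{N} \gamma_{kj} \cos(\omega_k \tau_j)\, \chi_{\mathbf{k}}(g, g', i\tau_j)$$
Then, for each iω, build W and its correlation part W̃ (matrix inversion with ScaLAPACK/ELPA):
$$W_{\mathbf{k}}(g, g', i\omega) = v_{\mathbf{k}}(g, g')\, \epsilon_{\mathbf{k}}^{-1}(g, g', i\omega), \qquad \tilde{W}_{\mathbf{k}}(g, g', i\omega) = W_{\mathbf{k}}(g, g', i\omega) - v_{\mathbf{k}}(g, g')$$
Finally, cosine transform back (iω → iτ). Requires communication inside tau_comm:
$$\tilde{W}_{\mathbf{k}}(g, g', i\tau_k) = \sum_{j=1}^{N} \xi_{kj} \cos(\omega_j \tau_k)\, \tilde{W}_{\mathbf{k}}(g, g', i\omega_j)$$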
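A minimal sketch of the middle step for a single k-point and frequency, using hand-written 2x2 algebra in place of the distributed ScaLAPACK/ELPA inversion. The matrices `v` and `eps` are made-up inputs; the code takes ε as given and only illustrates W = v ε⁻¹ and W̃ = W − v.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Inverse of a 2x2 matrix (raises ZeroDivisionError if singular)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def screened_interaction(v, eps):
    """Return (W, W_tilde) with W = v eps^{-1} and W_tilde = W - v."""
    w = matmul2(v, inverse2(eps))
    w_tilde = [[w[i][j] - v[i][j] for j in range(2)] for i in range(2)]
    return w, w_tilde

v = [[1.0, 0.0], [0.0, 0.5]]      # toy Coulomb matrix in the (g, g') basis
eps = [[2.0, 0.0], [0.0, 4.0]]    # toy dielectric matrix at one i*omega
w, w_tilde = screened_interaction(v, eps)
```

With no screening (ε equal to the identity) W̃ vanishes, so only the correlation part is carried back to iτ; the bare v enters through the exchange term.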
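Computing Σnq(ω)
Step 1. FFTs in the unit cell:
$$G_{\mathbf{k}}(g, g', i\tau) \xrightarrow{\text{FFT}} G_{\mathbf{k}}(r, g', i\tau), \qquad \tilde{W}_{\mathbf{k}}(g, g', i\tau) \xrightarrow{\text{FFT}} \tilde{W}_{\mathbf{k}}(r, g', i\tau)$$
Avoid storing the full Σ(r, R′) in memory:
$$\Sigma(r, R', i\tau) = -G(r, R', i\tau)\, W(r, R', i\tau)$$
Compute the partial contribution to Σnq and accumulate:
$$\Sigma_{nq}(i\tau) = \Sigma_{nq}(i\tau) + \sum_{R' \in S} \psi^*_{nq}(r)\, \Sigma(r, R', i\tau)\, \psi_{nq}(R')$$
Split into cosine (C) and sine (S) parts, apply the cosine and sine transforms (CT+ST), then the analytic continuation (AC):
$$\Sigma_{nq}(i\tau) = \Sigma^{C}_{nq}(i\tau) + \Sigma^{S}_{nq}(i\tau) \xrightarrow{\text{CT+ST}} \Sigma_{nq}(i\omega) \xrightarrow{\text{AC}} \Sigma_{nq}(\omega)$$
Step 4. Add exchange part (sum over occ states directly). Finally, solve the linearized QP equation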


















Validation: χ(ω) with GWR and Adler-Wiser
• Silicon with 4x4x4 Γ-centered k-mesh
• gwr_ntau = 12
• nband = 100 and inclvkb 2 to compute head and wings
QP direct gaps with GWR and quartic GW
• 4x4x4 Γ-centered k-mesh
• nband = 100 × nocc, ecuteps = 14 Ha
• gwr_ntau = 20 in GWR, nfreqre = 50, freqremax=1.5 Ha, nfreqim 10 for CD
Benchmark results for ZnO (8 nodes on Lumi, 2 GB per core; ecut 40.0, ecuteps 12), wall-time (s):
nband    Quartic GW    GWR
1000     3023          1947
2000     MEM_FAIL      2145
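$$\Sigma_{\mathbf{k}}(r, r') \approx \sum_{\mathbf{q}}^{\mathcal{L}_{\mathbf{k}}} G_{\mathbf{k}+\mathbf{q}}(r, r')\, W_{\mathbf{q}}(r, r')$$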
‣ Computing Σ in the supercell is the recommended approach if one needs Σnk for all k in the IBZ, e.g.:
- band structure interpolation of G0W0 results
- self-consistency (requires off-diagonal matrix elements for which symmetries are not easy to exploit)



Pros and cons of GWR code
Pros:
‣ Cubic scaling in natom
‣ Linear scaling with Nk in the full BZ
‣ Fast convergence with minimax mesh (~20 points)
‣ GW beyond PPA: Σ(ω) and A(ω) at reasonable cost
‣ Computing off-diagonal $\Sigma^{\mathbf{k}}_{mn}$ for all k-points in the IBZ is not as expensive as in the legacy code
Cons:
‣ Symmetries are more difficult to exploit, especially in the supercell
‣ Requires Padé to go back to the real axis: Σ(iω) → Σ(ω)
‣ Much more memory-demanding than conventional GW algorithm
‣ Requires different MPI levels and PBLAS distribution of G, χ, W to make memory scale
‣ Needs precomputed minimax meshes (solved thanks to Green-X library)
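For readers unfamiliar with the Padé step mentioned in the cons, here is a generic sketch based on Thiele's continued-fraction recursion. It is not the GWR implementation: the two-pole model `sigma_model`, the four sampling points and the function names are assumptions chosen so that the continuation is exact up to round-off.

```python
def thiele_coefficients(zs, us):
    """Continued-fraction coefficients a_p = g_p(z_p) from samples us = f(zs)."""
    g = list(us)
    coeffs = [g[0]]
    for p in range(1, len(zs)):
        # g_p(z_i) = (g_{p-1}(z_{p-1}) - g_{p-1}(z_i)) / ((z_i - z_{p-1}) g_{p-1}(z_i)), i >= p
        g = [(coeffs[p - 1] - g[i]) / ((zs[i] - zs[p - 1]) * g[i]) if i >= p else 0j
             for i in range(len(zs))]
        coeffs.append(g[p])
    return coeffs

def pade_eval(z, zs, coeffs):
    """Evaluate a_1 / (1 + a_2 (z - z_1) / (1 + a_3 (z - z_2) / (1 + ...)))."""
    tail = 1.0
    for p in range(len(coeffs) - 1, 0, -1):
        tail = 1.0 + coeffs[p] * (z - zs[p - 1]) / tail
    return coeffs[0] / tail

def sigma_model(z):
    """Toy self-energy with two poles on the real axis (illustrative only)."""
    return 0.6 / (z + 0.3) + 0.4 / (z - 0.5)

iw_mesh = [0.1j, 0.5j, 1.0j, 2.0j]               # samples on the imaginary axis
samples = [sigma_model(z) for z in iw_mesh]
coeffs = thiele_coefficients(iw_mesh, samples)
z_real = 0.2 + 0.1j                              # point just above the real axis
sigma_continued = pade_eval(z_real, iw_mesh, coeffs)
```

For realistic self-energies the continuation is only approximate and can be sensitive to noise in the imaginary-axis data, which is why it is listed among the cons.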
Supplemental material
Why do we need GW?
• ... exchange effects beyond Kohn-Sham (KS) theory
• G0W0 still undershoots gaps: lots of discussion on starting point, vertex corrections, self-consistency, e-ph interaction, etc.
• For accurate optical properties, we need to go beyond GW and include the e-h interaction via e.g. the Bethe-Salpeter equation (BSE)
[Figure: calculated gap (eV) vs experimental gap (eV), both from 0 to 8 eV, LDA and GW(LDA) results for InSb, P, InAs, HgTe, InN, Ge, GaSb, CdO, Si, InP, GaAs, CdTe, AlSb, Se, Cu2O, AlAs, GaP, SiC, AlP, CdS, ZnSe, CuBr, ZnO, GaN, ZnS, diamond, SrO, AlN, MgO, CaO]
• The GW equations are much “easier” to solve as we completely bypass the vertex equation
• Several technical aspects to be considered:
- representation: r-space vs G-space, frequency-space vs (imaginary) time, etc.
- basis set expansion, integration techniques
- self-consistency: one-shot, partial/full GW self-consistency or effective QP Hamiltonian
Spatial dependence: unit cell and BvK supercell
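Unit cell C with lattice $\mathcal{L}_C$ and BZ $C^*$
• k: wave-vector in $C^*$
• g: vector of the reciprocal lattice $\mathcal{L}^*_C$
‣ Fourier series for functions fulfilling BvK conditions:
$$F(\mathbf{r}, \mathbf{r}') = \sum_{\mathbf{G}\mathbf{G}'} e^{i\mathbf{G}\mathbf{r}}\, F(\mathbf{G}, \mathbf{G}')\, e^{-i\mathbf{G}'\mathbf{r}'}$$
‣ If $F(\mathbf{r}, \mathbf{r}') = F(\mathbf{r} + \mathbf{a}, \mathbf{r}' + \mathbf{a})\ \forall\, \mathbf{a} \in \mathcal{L}_C$ then $F(\mathbf{k} + \mathbf{g}, \mathbf{k}' + \mathbf{g}') = \delta_{\mathbf{k}\mathbf{k}'} F_{\mathbf{k}}(\mathbf{g}, \mathbf{g}')$ (block-diagonal matrix)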
[Figure: Σx (eV) for band 4 and band 6, comparing spherical integration, Carrier's auxiliary function and spherical cutoff in vc(r)]
• approximate the mini-box around Γ with a sphere
• perform the integration analytically (gw_icutcoul 3)
• The integrand Σx(q, …) − f(q) is smooth and can be integrated with a coarse q-mesh
[Rangel2020]
Long wavelength limit
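• In semiconductors, the head and the wings of $\tilde{\chi}^0_{G,G'}$ go to zero for |q| → 0
• At the level of ε, this leads to a 0/0 form for |q| → 0
• The limit is finite but the value depends on the direction q̂
$$\langle b_1, \mathbf{k} - \mathbf{q} | e^{-i\mathbf{q}\cdot\mathbf{r}} | b_2, \mathbf{k} \rangle = \delta_{b_1 b_2} - i\mathbf{q} \cdot \langle b_1, \mathbf{k} | \mathbf{r} | b_2, \mathbf{k} \rangle + O(q^2)$$
The matrix element of r is ill defined in periodic systems, so one uses:
$$\langle b_1, \mathbf{k} - \mathbf{q} | e^{-i\mathbf{q}\cdot\mathbf{r}} | b_2, \mathbf{k} \rangle \underset{q \to 0}{\approx} -i\mathbf{q} \cdot \frac{\langle b_1, \mathbf{k} | \nabla - [V_{NL}, \mathbf{r}] | b_2, \mathbf{k} \rangle}{\varepsilon_{b_2 \mathbf{k}} - \varepsilon_{b_1 \mathbf{k}}} \qquad \text{for } b_1 \neq b_2$$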
Take-home message:
• The commutator [VNL, r] is important for optical properties or for GW calculations in large cells.
Less critical for GW in bulk systems
• This term is included by default (inclvkb 2). Use 0 to ignore it
• Heads and wings converge fast with nband and slowly with the number of k-points
• Randomly shifted k-meshes are usually used to converge the macroscopic dielectric function with/without local-field (LF) effects
• In GW, one usually uses the same high-symmetry k-mesh both in screening and sigma to reduce nkpt
GW method in Fourier space
in a nutshell
Plane-wave expansion of Bloch orbitals
• The basis set is truncated such that $|\mathbf{k} + \mathbf{G}|^2 / 2 < E_{cut}$
Plane-wave expansion of two-point functions
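• Infinite system simulated with Born-von-Karman (BvK) periodic boundary conditions, i.e. (N1, N2, N3) supercell of volume V = NΩ with N = N1N2N3
$$f(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{V} \sum_{\mathbf{q}} \sum_{\mathbf{G}_1 \mathbf{G}_2} e^{i(\mathbf{q}+\mathbf{G}_1)\cdot\mathbf{r}_1}\, f_{\mathbf{G}_1 \mathbf{G}_2}(\mathbf{q})\, e^{-i(\mathbf{q}+\mathbf{G}_2)\cdot\mathbf{r}_2}$$
$$f_{\mathbf{G}_1 \mathbf{G}_2}(\mathbf{q}) = \frac{1}{V} \iint_V e^{-i(\mathbf{q}+\mathbf{G}_1)\cdot\mathbf{r}_1}\, f(\mathbf{r}_1, \mathbf{r}_2)\, e^{i(\mathbf{q}+\mathbf{G}_2)\cdot\mathbf{r}_2}\, d\mathbf{r}_1\, d\mathbf{r}_2$$
where the q-points belong to the BZ mesh dual to the BvK supercell: $(\frac{1}{N_1}, \frac{1}{N_2}, \frac{1}{N_3})$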
RPA polarizability in the ω-domain
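• Use the Lehmann representation of the time-ordered G:
$$G(\mathbf{r}_1, \mathbf{r}_2; \omega) = \sum_i \frac{\Psi_i(\mathbf{r}_1)\, \Psi_i^\dagger(\mathbf{r}_2)}{\omega - \epsilon_i + i\eta\, \mathrm{sign}(\epsilon_i - \mu)}, \qquad \eta \to 0^+$$
$$\Sigma(\mathbf{r}_1, \mathbf{r}_2; \omega) = \frac{i}{2\pi} \int G(\mathbf{r}_1, \mathbf{r}_2; \omega + \omega')\, W(\mathbf{r}_1, \mathbf{r}_2; \omega')\, e^{i\omega'\delta^+}\, d\omega'$$
• Using $W = v + (\varepsilon^{-1} - 1)v$, we rewrite Σ as exchange (x) + correlation (c)
• ABINIT computes $\Sigma^{\mathbf{k}}_{nm}$ in the KS basis. G0W0 energies are then obtained via the linearized QP equation, with the renormalization factor:
$$Z \equiv \left[ 1 - \langle \Psi^{KS} | \frac{\partial \Sigma(\epsilon)}{\partial \epsilon^{KS}} | \Psi^{KS} \rangle \right]^{-1}$$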
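A small numerical sketch of the linearized QP equation, assuming its usual form E_QP = ε_KS + Z ⟨Σ(ε_KS) − v_xc⟩ with real matrix elements. The linear model for Σ(ω) and the value of v_xc are made up; for a linear self-energy the linearized solution coincides with the exact one, which gives a simple check.

```python
def renormalization_factor(sigma, e_ks, step=1e-4):
    """Z = 1 / (1 - dSigma/domega) evaluated at the KS energy (central difference)."""
    dsigma = (sigma(e_ks + step) - sigma(e_ks - step)) / (2 * step)
    return 1.0 / (1.0 - dsigma)

def linearized_qp_energy(sigma, vxc, e_ks):
    """E_QP = e_KS + Z * (Sigma(e_KS) - v_xc), all quantities as diagonal matrix elements."""
    z = renormalization_factor(sigma, e_ks)
    return e_ks + z * (sigma(e_ks) - vxc)

def sigma_model(omega):
    """Toy diagonal matrix element <n|Sigma(omega)|n> in Ha (illustrative numbers)."""
    return -1.0 - 0.25 * omega

e_ks, vxc = 0.4, -1.2
z = renormalization_factor(sigma_model, e_ks)
e_qp = linearized_qp_energy(sigma_model, vxc, e_ks)
```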









Exchange part
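• Fock operator in real space:
$$\Sigma_x(\mathbf{r}_1, \mathbf{r}_2) = -\sum_{\mathbf{k}}^{BZ} \sum_{\nu}^{occ} \Psi_{\nu\mathbf{k}}(\mathbf{r}_1)\, \Psi^\dagger_{\nu\mathbf{k}}(\mathbf{r}_2)\, v(\mathbf{r}_1, \mathbf{r}_2)$$
where the expression for J depends on the integration technique (CD, PPM, AC)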
$$\Sigma_c(\omega) = \frac{i}{2\pi} \left\{ 2\pi i \sum_{z_p} \lim_{z \to z_p} G(z)\, W^c(z)\, (z - z_p) - \int_{-\infty}^{+\infty} G(\omega + i\omega')\, W^c(i\omega')\, d(i\omega') \right\}$$
First term: contribution from the poles located inside the contour. Usually ~50 ω-points. Need to interpolate W(ω′).
Second term: integration along the imaginary axis (smooth integrand, usually ~10 points are enough).
• Accurate but expensive, especially at the level of memory since G vectors and ω′-points are not MPI-distributed
• CD is required for more advanced treatments e.g. e-e lifetimes τnk, spectral ...
[Figure: Re Σ (eV), with PPM and without PPM]
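$$\Sigma_c(\mathbf{r}_1, \mathbf{r}_2; \omega) = \frac{i}{2\pi} \int G(\mathbf{r}_1, \mathbf{r}_2; \omega + \omega')\, W^c(\mathbf{r}_1, \mathbf{r}_2; \omega')\, e^{i\omega'\delta^+}\, d\omega'$$
Analytic model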




Plasmon-pole models
• Main idea: approximate the imaginary part of ε⁻¹(ω) with a delta-peak (plasmon resonance)
$$\mathrm{Im}\, \epsilon^{-1}_{G_1 G_2}(\mathbf{q}, \omega) = A_{G_1 G_2}(\mathbf{q}) \left[ \delta(\omega - \tilde{\omega}_{G_1 G_2}(\mathbf{q})) - \delta(\omega + \tilde{\omega}_{G_1 G_2}(\mathbf{q})) \right]$$
and, via Kramers-Kronig:
$$\mathrm{Re}\, \epsilon^{-1}_{G_1 G_2}(\mathbf{q}, \omega) = \delta_{G_1 G_2} + \frac{\Omega^2_{G_1 G_2}(\mathbf{q})}{\omega^2 - \tilde{\omega}^2_{G_1 G_2}(\mathbf{q})}$$
Here A is the amplitude of the peak and ω̃ the plasmon frequency.
• The two parameters are fitted so as to reproduce ab-initio results at selected frequencies (ppmodel=1,2)
• Other models: ppmodel 3, spectral decomposition of ε⁻¹, PRB 47, 15931 (1993)
• Very efficient both in terms of CPU and memory as the convolution integral has the analytical expression:
$$J^{s}_{G_1 G_2}(\mathbf{q}, \omega) = \Omega^2_{G_1 G_2}(\mathbf{q}) \int d\omega' \frac{e^{i\omega'\delta}}{\left[ \omega + \omega' - \epsilon_s + i\eta\, \mathrm{sign}(\epsilon_s - \mu) \right] \left[ \omega'^2 - (\tilde{\omega}_{G_1 G_2}(\mathbf{q}) - i\eta)^2 \right]}$$
• Questionable in systems with d- or f-electrons as one usually finds multiple peaks in Im ε⁻¹
[Figure: Re Σ (eV) vs ω (eV) from -40 to 40 eV, with PPM and without PPM]
WARNING
Avoid PPMs for computing band widths in metals
Avoid PPMs for self-consistent GW calculations, especially when updating the wavefunctions
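The two-parameter fit can be sketched for a single diagonal matrix element. The example assumes that the two selected frequencies are ω = 0 and a purely imaginary frequency iω_p (the Godby-Needs choice); the input values are generated from the model itself, so the fit must return the parameters used to build them. Function names are illustrative.

```python
def ppm_epsm1(omega_squared, big_omega2, omega_tilde2):
    """Diagonal plasmon-pole model: eps^{-1}(omega) = 1 + Omega^2 / (omega^2 - omega_tilde^2).
    Pass omega_squared = -omega_p**2 for the imaginary frequency i*omega_p."""
    return 1.0 + big_omega2 / (omega_squared - omega_tilde2)

def fit_plasmon_pole(epsm1_static, epsm1_imag, omega_p):
    """Fit (Omega^2, omega_tilde^2) from eps^{-1}(0) and eps^{-1}(i*omega_p)."""
    a = 1.0 - epsm1_static          # = Omega^2 / omega_tilde^2
    b = 1.0 - epsm1_imag            # = Omega^2 / (omega_p^2 + omega_tilde^2)
    omega_tilde2 = b * omega_p**2 / (a - b)
    big_omega2 = a * omega_tilde2
    return big_omega2, omega_tilde2

# Synthetic "ab-initio" input generated from known parameters (toy numbers, in Ha).
true_big_omega2, true_omega_tilde2, omega_p = 0.8, 1.44, 1.0
epsm1_0 = ppm_epsm1(0.0, true_big_omega2, true_omega_tilde2)
epsm1_iwp = ppm_epsm1(-omega_p**2, true_big_omega2, true_omega_tilde2)
fitted_big_omega2, fitted_omega_tilde2 = fit_plasmon_pole(epsm1_0, epsm1_iwp, omega_p)
```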
GW band structures
https://docs.abinit.org/tutorial/gw1/#7-how-to-compute-gw-band-structures
• GW corrections can be computed only for the k-points in the WFK file (k-mesh)
• GW band structures (k-path) require some sort of interpolation technique
• Three methods available:
1. Energy-dependent scissors operator
- Fit QP corrections as a function of the KS eigenvalues: ϵ QP = ϵ KS + Δ(ϵ KS)
- Easy to implement but rather crude, see this AbiPy example
2. Wannier interpolation
- Accurate but much more complex (requires wannier90 and maximally-localized Wannier functions)
- See tests/wannier90/t03.in and Phys. Rev. B 79, 045109 (2009)
3. Star-function interpolation
- Less accurate than Wannier but much easier to use (same method as in BoltzTraP)
- Possible instabilities in the presence of band crossings
• In all methods, QP corrections for all k-points in the IBZ and all the relevant bands are needed.
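The energy-dependent scissors idea can be sketched in a few lines. This is not the AbiPy implementation: it assumes a separate straight-line fit of Δ(ϵ KS) for states below and above a reference energy, and the KS/QP energies are synthetic numbers generated from known lines so that the fit can be checked.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def make_scissors(e_ks, e_qp, e_fermi):
    """Fit Delta(e_KS) = e_QP - e_KS separately for valence and conduction states
    and return a function mapping any KS energy to an approximate QP energy."""
    fits = {}
    for label, keep in (("valence", lambda e: e <= e_fermi),
                        ("conduction", lambda e: e > e_fermi)):
        xs = [x for x in e_ks if keep(x)]
        ys = [q - x for x, q in zip(e_ks, e_qp) if keep(x)]
        fits[label] = fit_line(xs, ys)

    def apply(e):
        slope, intercept = fits["valence" if e <= e_fermi else "conduction"]
        return e + slope * e + intercept
    return apply

# Synthetic data (eV): corrections follow -0.1 + 0.05 e (valence), 0.6 + 0.1 e (conduction).
e_ks = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
e_qp = [e + (-0.1 + 0.05 * e if e <= 0.5 else 0.6 + 0.1 * e) for e in e_ks]
scissors = make_scissors(e_ks, e_qp, e_fermi=0.5)
e_qp_on_path = [scissors(e) for e in (-1.5, 2.5)]   # KS energies along a k-path
```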
Star-function interpolation with AbiPy
• Good compromise between accuracy and ease of use
• Can interpolate either QP energies or QP corrections (recommended). Corrections are smooth hence easier to interpolate
• In brief:
1. Compute KS band structure along a high-symmetry k-path
2. Run GW for all k-points in the IBZ and the relevant bands
r.qp_ebands_kpath.plot(with_gaps=True)
[Figure: EQP-EKS (eV), PPMODEL 1 vs Contour Deformation]
GW with pseudopotentials
Pseudopotentials in a nutshell
• A pseudopotential (PP) mimics the interaction seen by valence electrons due to the core electrons and
the nucleus
• By construction, the PP reproduces the atomic energies and the valence wavefunctions of the all-electron
(AE) atom outside a certain radius
• AE valence wavefunctions are replaced by pseudized orbitals that are easier to describe in Fourier space
• Advantages of PPs:
- Much smaller cutoff energy
- Fewer electrons involved in the calculation (frozen core approximation)
• Drawback of PPs:
- Cannot reproduce the nodal shape of AE orbitals
- Cannot account for core relaxation effects
• In GW codes based on PPs, we only compute the valence part of the self-energy. Many-body effects due to core electrons are treated at the KS level and imported from the atomic environment
GW with pseudopotentials
Important things to know when using pseudos for GW:
• The matrix elements of Σx are sensitive to the nodal shape of the orbitals
[Figure: bad logder at high energy]
The PseudoDojo project
http://www.pseudo-dojo.org/
Since $e^{i\mathbf{G}\cdot\mathbf{R}} = 1$, the periodic part of the Bloch function can be written:
$$u_{n\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} u_{n\mathbf{k}}(\mathbf{G})\, e^{i\mathbf{G}\cdot\mathbf{r}}$$
• The basis set is truncated such that $|\mathbf{k} + \mathbf{G}|^2 / 2 < E_{cut}$
Convolution theorem
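• Density associated with one eigenfunction: $n_{b\mathbf{k}}(\mathbf{r}) = u^*_{b\mathbf{k}}(\mathbf{r})\, u_{b\mathbf{k}}(\mathbf{r})$
• In Fourier space:
$$n_{b\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} u^*_{b\mathbf{k}}(\mathbf{G})\, e^{-i\mathbf{G}\mathbf{r}} \sum_{\mathbf{G}'} u_{b\mathbf{k}}(\mathbf{G}')\, e^{i\mathbf{G}'\mathbf{r}} = \sum_{\mathbf{G}\mathbf{G}'} \left[ u^*_{b\mathbf{k}}(\mathbf{G})\, u_{b\mathbf{k}}(\mathbf{G}') \right] e^{i(\mathbf{G}' - \mathbf{G})\mathbf{r}}$$
The term in brackets gives a convolution in G-space.
• The radius of the G-sphere for n(G) is twice the radius of the G-sphere used for the wavefunctions
[Figure: FFT box and G-sphere for n]
Take-home message:
- Product in r-space → convolution in G-space
- The FFT mesh should enclose the sphere of radius 2 * Gmax to treat the convolution exactly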
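The take-home message can be verified on a 1D toy orbital. The sketch below assumes integer G vectors with |G| ≤ Gmax = 2 and made-up coefficients; it computes n(Q) by explicit convolution and compares with a discrete Fourier transform of the real-space product on a mesh that does (9 points) or does not (5 points) enclose the sphere of radius 2 * Gmax.

```python
import cmath

def density_coefficients(u):
    """n(Q) = sum_G conj(u(G)) u(G + Q): convolution in G-space."""
    n = {}
    for g1, c1 in u.items():
        for g2, c2 in u.items():
            n[g2 - g1] = n.get(g2 - g1, 0j) + c1.conjugate() * c2
    return n

def density_on_mesh(u, nfft):
    """n(x_j) = |u(x_j)|^2 on a uniform mesh of nfft points in [0, 1)."""
    rho = []
    for j in range(nfft):
        psi = sum(c * cmath.exp(2j * cmath.pi * g * j / nfft) for g, c in u.items())
        rho.append((psi.conjugate() * psi).real)
    return rho

def dft_coefficient(rho, q):
    """Fourier coefficient of the sampled density for the integer wavevector q."""
    nfft = len(rho)
    return sum(r * cmath.exp(-2j * cmath.pi * q * j / nfft)
               for j, r in enumerate(rho)) / nfft

u = {-2: 0.1 + 0j, -1: 0.3j, 0: 0.8 + 0j, 1: 0.4 + 0j, 2: 0.2 + 0j}  # Gmax = 2
n_exact = density_coefficients(u)
good = density_on_mesh(u, 9)       # 9 >= 4 * Gmax + 1: no aliasing
small = density_on_mesh(u, 5)      # only 2 * Gmax + 1 points: aliased
```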
Plane-wave expansion of two-point functions
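• Infinite system simulated with Born-von-Karman (BvK) periodic boundary conditions, i.e. (N1, N2, N3) supercell of volume V = NΩ with N = N1N2N3
• G, χ̃, W are invariant if we translate both r and r’ by R, that is: G(r1, r2) = G(r1 + R, r2 + R)
$$f(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{V} \sum_{\mathbf{q}} \sum_{\mathbf{G}_1 \mathbf{G}_2} e^{i(\mathbf{q}+\mathbf{G}_1)\cdot\mathbf{r}_1}\, f_{\mathbf{G}_1 \mathbf{G}_2}(\mathbf{q})\, e^{-i(\mathbf{q}+\mathbf{G}_2)\cdot\mathbf{r}_2}$$
$$f_{\mathbf{G}_1 \mathbf{G}_2}(\mathbf{q}) = \frac{1}{V} \iint_V e^{-i(\mathbf{q}+\mathbf{G}_1)\cdot\mathbf{r}_1}\, f(\mathbf{r}_1, \mathbf{r}_2)\, e^{i(\mathbf{q}+\mathbf{G}_2)\cdot\mathbf{r}_2}\, d\mathbf{r}_1\, d\mathbf{r}_2$$
where the q-points belong to the BZ mesh that is dual to the BvK supercell: $(\frac{1}{N_1}, \frac{1}{N_2}, \frac{1}{N_3})$
• If G is expanded with cutoff energy ecut then χ̃ = GG will have components up to 4 ecut!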
Crystal symmetries and k-points
• Wavefunctions and eigenvalues in the full Brillouin zone (BZ)
can be reconstructed from the irreducible wedge (IBZ)
• S = Rotation, t = fractional translation
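Spatial symmetries:
$$\epsilon_{S\mathbf{k}} = \epsilon_{\mathbf{k}}$$
$$u_{S\mathbf{k}}(\mathbf{r}) = e^{-iS\mathbf{k}\cdot\mathbf{t}}\, u_{\mathbf{k}}(S^{-1}(\mathbf{r} - \mathbf{t}))$$
$$u_{S\mathbf{k}}(\mathbf{G}) = e^{-i(S\mathbf{k}+\mathbf{G})\cdot\mathbf{t}}\, u_{\mathbf{k}}(S^{-1}\mathbf{G})$$
Time-reversal symmetry:
$$\epsilon_{\mathbf{k}} = \epsilon_{-\mathbf{k}}, \qquad u_{n\mathbf{k}}(\mathbf{r}) = u^{\dagger}_{n-\mathbf{k}}(\mathbf{r}), \qquad u_{n\mathbf{k}}(\mathbf{G}) = u^{\dagger}_{n-\mathbf{k}}(-\mathbf{G})$$
[Figure: irreducible wedge]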
Take-home message:
- unk(G) are computed and stored only for nkpt k-points in the IBZ
- The higher the number of symmetries nsym, the faster the calculation
- Space group is automatically detected, all symmetries are used by default (kptopt 1)
- ABINIT may find fewer symmetries than expected if lattice and positions are not given with enough digits
Exchange + correlated self-energy
• ABINIT computes Σnmk(ω) = ⟨nk | Σ(ω) | mk⟩ for a subset of KS states
• We never compute Σ(r1, r2; ω)
• Diagonal terms Σnk(ω) are enough for G0W0
• Computing GW band structures is not an easy task!
Generation of the WFK file with empty bands
• Expensive step so we recommend to:
- start immediately with reasonably large nband (>> 10 nband_occ)
- perform initial convergence studies
- generate new WFK if nband is not enough
• Use nbdbuf to save a lot of time as high-energy states converge slowly
• The WFK defines the BZ sampling and the list of k-points where QP corrections can be computed
getden_filepath "prefix_DEN"
iscf -2        # NSCF
tolwfr 1e-18
nband 1200     # occ + empty
nbdbuf 120     # ~10% of nband
nstep 100      # default is too small
NB: only the first nband - nbdbuf states are converged within tolwfr
Best practices:
• First compute the KS bands to locate the CBM/VBM then select the k-mesh (ngkpt, nshiftk, shiftk) accordingly
• Don’t use datasets to run GW in a single run. Split everything and optimize the number of MPI procs for each step
• The default eigensolver (conjugate gradient, CG) cannot use more than nkpt * nsppol MPI cores
• Use paral_kgb 1 (LOBPCG solver) if ncores > nkpt * nsppol …
LOBPCG eigensolver
• More scalable than CG: parallelized over k-points, bands, FFT, spins
• More difficult to configure (npkpt, npband, npfft, bandpp)
• Good news: memory scales with all MPI levels
• Use autoparal 1 to let ABINIT find a good configuration for a given number of MPI cores
getden_filepath "prefix_DEN"
iscf -2        # NSCF
tolwfr 1e-18
nband 1200     # occ + empty
nbdbuf 120     # ~10% of nband
nstep 100      # default is too small
paral_kgb 1
autoparal 1    # Only for GS!
https://docs.abinit.org/tutorial/paral_gspw/
Best practices:
• Use an even number of MPI processors when nsppol 2
• Ideally, the number of MPI cores should be proportional to nkpt * nsppol
• Avoid prime numbers for nband as npband must divide nband
• npband and bandpp can affect the SCF convergence. Increasing bandpp usually makes the algorithm more stable
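The divisibility rules in the best practices can be illustrated with a small helper. It is a simplified sketch, not a replacement for autoparal 1: it assumes npfft = 1, ignores bandpp, and treats the k-point level as covering nkpt * nsppol units. `candidate_grids` is a hypothetical name.

```python
def divisors(n):
    """All positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def candidate_grids(ncores, nkpt, nsppol, nband):
    """List (npkpt, npband) pairs with npkpt * npband == ncores such that
    npkpt does not exceed nkpt * nsppol and npband divides nband."""
    grids = []
    for npkpt in divisors(ncores):
        npband = ncores // npkpt
        if npkpt <= nkpt * nsppol and nband % npband == 0:
            grids.append((npkpt, npband))
    return grids

grids_ok = candidate_grids(ncores=48, nkpt=8, nsppol=1, nband=1200)
grids_prime = candidate_grids(ncores=48, nkpt=8, nsppol=1, nband=1201)  # prime nband
```

With a prime nband no band-level distribution fits on 48 cores, which is the reason for the advice above.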
Self-consistency methods
Self-consistency in the GW approximation
• If the initial DFT band structure is not adequate, one should update QP energies or wavefunctions in the self-consistent cycle
• Fully self-consistent GW calculations do not improve over G0W0. Moreover, Σ is non-Hermitian and energy dependent
[Figure: calculated gap (eV) vs experimental gap (eV), both from 0 to 8 eV, for MgO, AlN, ZnO, GaN, CaO, ZnSe, CuBr, InP, GaAs, CdTe, SrO, ZnTe, CdS, ZnS, diamond, Cu2O, InN, GaSb, InSb, InAs, AlAs, GaP, SiC, AlP, HgTe, AlSb, Se, Si, Ge, CdO, P, Te]
[adapted from van Schilfgaarde et al., PRL 96, 226402 (2006)]
Quasi-particle SCGW (II)
Left: LDA and G0W0 results for the band gap. Right: QPSGW results
Much more CPU demanding than one-shot GW due to the off-diagonal terms ⟨i|Σ|j⟩
The QP corrections must be calculated for all k-points and all occupied states
Check whether the chosen KS basis set is flexible enough
Quasi-particle SCGW (I)
PRL 96 226402 (2006)
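$$\tilde{\Sigma}^{\mathbf{k}}_{ij} \equiv \frac{1}{2}\, \mathrm{Herm} \left[ \Sigma^{\mathbf{k}}_{ij}(\epsilon_{i\mathbf{k}}) + \Sigma^{\mathbf{k}}_{ij}(\epsilon_{j\mathbf{k}}) \right]$$
Equations are solved self-consistently:
$$\hat{h}^{QP} \rightarrow \{\Psi^{QP}, E^{QP}\} \xrightarrow{GW} \{n^{QP}, \tilde{\Sigma}\} \rightarrow \hat{h}^{QP}$$
QP states are expanded in terms of KS states (QPS file):
$$|\Psi^{QP}_{m\mathbf{k}}\rangle = \sum_{n} U^{\mathbf{k}}_{nm}\, |\Psi^{KS}_{n\mathbf{k}}\rangle$$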
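A minimal sketch of the Hermitianized self-energy for one k-point, assuming Herm(A) = (A + A†)/2 and a made-up two-band, energy-dependent and non-Hermitian Σ(ω) in the KS basis. It only illustrates the symmetrization; diagonalizing the resulting QP Hamiltonian to obtain U is not shown.

```python
def herm(m):
    """Hermitian part (A + A^dagger) / 2 of a square matrix stored as nested lists."""
    n = len(m)
    return [[0.5 * (m[i][j] + m[j][i].conjugate()) for j in range(n)]
            for i in range(n)]

def qpscgw_sigma(sigma, energies):
    """Sigma_tilde_ij = Herm{ [Sigma_ij(e_i) + Sigma_ij(e_j)] / 2 }."""
    n = len(energies)
    mats = [sigma(e) for e in energies]          # Sigma evaluated at each QP energy
    avg = [[0.5 * (mats[i][i][j] + mats[j][i][j]) for j in range(n)]
           for i in range(n)]
    return herm(avg)

def sigma_model(omega):
    """Toy 2x2 self-energy matrix (Ha): energy dependent and not Hermitian."""
    return [[complex(-1.0 + 0.1 * omega, -0.02), complex(0.2 + 0.05 * omega, 0.0)],
            [complex(0.1 - 0.03 * omega, 0.0), complex(-0.5 + 0.2 * omega, 0.01)]]

energies = [-0.3, 0.7]
sigma_tilde = qpscgw_sigma(sigma_model, energies)
```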