
Implementation of Multi-Expansion Point Model Order Reduction For Coupled PEEC-Semiconductor Simulations

Valon Blakaj, Bawar Jalal, Paul L. Evans
PEMC Research Group, University of Nottingham, Nottingham, UK

2022 IEEE Design Methodologies Conference (DMC) | 978-1-6654-7999-8/22/$31.00 ©2022 IEEE | DOI: 10.1109/DMC55175.2022.9906539

Abstract—An algorithm for the generation of reduced order PEEC models is presented. The algorithm addresses the additional complexity resulting from the use of multiple expansion points. Nested iterative solvers for the L and P PEEC sub-matrices are used, and these solvers are accelerated using multipole expansions. The reduced order models are validated in the frequency domain against commercial finite element software, and time-domain co-simulation with accurate semiconductor models is then demonstrated. It is shown that a coupled, 3D PCB mounted inductor and semiconductor co-simulation with 100,000 time-steps can be completed in 23 minutes including reduced order model generation.

Index Terms—Power Electronics, Reduced Order Modelling, PEEC

I. INTRODUCTION

Understanding the impact of 3D design geometry on power electronic semiconductor switching behaviour is essential to allow the optimal design of wide-bandgap semiconductor based systems. Time-domain simulations that capture the interaction of accurate semiconductor models with 3D models of electrical conductor networks and integrated filter components such as inductors are needed. The Partial Element Equivalent Circuit (PEEC) method has proven to be effective for the creation of 3D electromagnetic models that can be easily coupled with equivalent circuit models of semiconductors and other components.

The main challenge with this approach is minimising simulation time. PEEC models of complex geometries result in large, typically $O(10^4)$, simulation matrices with an extremely dense structure due to the integral formulation of the numerical method. For time-domain simulations containing switching semiconductor models, these equations must be solved at many time-steps to capture semiconductor behaviour. This simulation speed challenge can be resolved using reduced order surrogate models (ROM) for the time-domain simulation, however the original, large model must be solved in order to create these reduced order simulation models. In this work we show that the time taken to create these reduced order models is strongly dependent on the required model bandwidth and propose a methodology to minimise model generation time.

This work was supported by the Engineering and Physical Sciences Research Council, through grant EP/R513283/1 under grant EP/R004390/1.

A. Modelling Environment and Demonstration Case

An in-house software application developed for virtual prototyping of power electronic systems [1] was used for this work. It has an implementation of the PEEC method for electromagnetic simulation that can account for resistive, inductive and capacitive parasitics. It has the ability to generate reduced order surrogate models for time-domain simulation, and can couple these models with SPICE models of semiconductor components for time-domain simulation. It is also possible to simulate the PEEC models in the frequency domain for ease of comparison with commercial tools.

A PCB integrated inductor structure, shown in Fig. 1, is used for demonstration and validation. It consists of a winding mounted on an Insulated Metal Substrate (IMS) PCB, encapsulated in potting compound, and is designed as a high-frequency EMI filter for GaN devices. The capabilities of the reduced order modelling methods were demonstrated in [2], however this work presents methods for reducing the time taken to generate these models.

B. PEEC Matrix Formulation

A PEEC model of the inductor winding is generated by first discretising the geometry to obtain a series of mesh cells containing surface panels and internal conductors. The PEEC method is then applied to determine relationships between the charge at each of the surface panels, rate-of-change of current in each of the conductors, and the potential at a set of solution nodes. These relationships are represented as a sub-matrix of partial potential coefficients (inverse of capacitance, P), a sub-matrix of partial inductance coefficients (L), conductor resistances (R), and a connectivity sub-matrix (G). These sub-matrices form an equivalent LCR circuit that approximates the electromagnetic behaviour of the geometry, and which can be coupled to other equivalent circuit models through a set
of defined input voltages ($v_{in}$) and output currents ($i_{out}$) at a small number of terminals; D and C are sub-matrices that couple the inputs and outputs to the model (1). These equations can be solved for voltage and current distribution in the model.

Fig. 1: PEEC model of inductor winding. (a) PEEC mesh of PCB integrated inductor component with $n_c$ inductive conductor cells and $n_s$ capacitive surface panels. (b) Equivalent PEEC circuit for part of mesh.

\[
\begin{bmatrix} P^{-1} & 0 \\ 0 & L \end{bmatrix}
\begin{bmatrix} \dot{v} \\ \dot{i} \end{bmatrix}
=
\begin{bmatrix} 0 & G \\ -G^{T} & R \end{bmatrix}
\begin{bmatrix} v \\ i \end{bmatrix}
+
\begin{bmatrix} D \\ 0 \end{bmatrix} v_{in},
\qquad
i_{out} = \begin{bmatrix} 0 & C \end{bmatrix} \begin{bmatrix} v \\ i \end{bmatrix}
\tag{1}
\]

Some of the sub-matrices in this system have a size and structure that make the system difficult to solve efficiently.
• The L sub-matrix is dense and $n_c \times n_c$ in size, where $n_c$ is the number of current carrying conductors and typically $O(10^4)$.
• The P sub-matrix is dense and $n_s \times n_s$ in size, where $n_s$ is the number of surface panels and also typically $O(10^4)$. This sub-matrix cannot be used directly in the PEEC equivalent circuit, but must be inverted to obtain a capacitance sub-matrix.
• The G and R sub-matrices combine to form a sparse $n \times n$ matrix (where $n = n_c + n_s$).
• The D sub-matrix is $n \times n_{in}$ and sparse.
• The C sub-matrix is $n_{out} \times n$ and sparse.

Methods for computation of the L and P matrix coefficients are described elsewhere [3], but it is important to note that the L sub-matrix is typically around 50% dense for an orthogonal mesh (50% of its entries are non-zero) and the P sub-matrix is 100% dense. Computing these entries is extremely computationally intensive. As an example, for a model with $n_c = 10,000$ and $n_s = 10,000$, $50 \times 10^6$ L entries and $100 \times 10^6$ P entries must be computed using analytical or numerical integration routines.

Coupling these equations directly with semiconductor circuit models is impractical as the large, dense L and $P^{-1}$ sub-matrices must be solved at each time-step and the P matrix must be explicitly inverted during initialisation or implicitly solved at each time-step. Combining this model with non-linear semiconductor models directly results in equations that cannot be solved efficiently.

C. Model Order Reduction Methods

One solution to this is the use of model order reduction (MOR) algorithms that are applied to the PEEC matrices to produce a smaller, reduced order model (ROM) that can be coupled with the semiconductor models for the time-domain simulation. This front-loads the computation effort involved in solving the PEEC method to allow efficient time-domain simulation with many time-steps whilst preserving the 3D current density prediction methods of the PEEC method [1]. For this work, Krylov Subspace MOR methods are used, and these algorithms produce an orthogonal transform matrix V ($n \times m$), where m is the reduced order model size, typically $O(10)$. This matrix can be used to transform the original matrices (shown in simplified notation in (2)) to produce the reduced order model (4) according to (3). The ROM can be solved in the time or frequency domain and the reduced order solution expanded using V to obtain the original 3D solution for quantities such as current density.

\[
[M]\,\dot{x} = [K]\,x + [F]\,v_{in}, \qquad i_{out} = [O]\,x \tag{2}
\]

\[
[M_r] = [V]^{T}[M][V]; \quad [K_r] = [V]^{T}[K][V]; \quad [F_r] = [V]^{T}[F]; \quad [O_r] = [O][V] \tag{3}
\]

\[
[M_r]\,\dot{x}_r = [K_r]\,x_r + [F_r]\,v_{in}, \qquad i_{out} = [O_r]\,x_r, \qquad \begin{bmatrix} v \\ i \end{bmatrix} = [V]\,x_r \tag{4}
\]

Details of Krylov Subspace based MOR algorithms are not provided here (but can be found in [2, 4, 5]), however it is important to note that the V transformation matrix is constructed column-wise by a series of iterations which involve a matrix-vector product on the dense matrix M and a matrix solve on the sparse matrix K (5). The second, sparse operation can be implemented efficiently using easily available sparse matrix solvers. The first, dense matrix-vector product operation will dominate the model generation time.

\[
[K]\,V_{i+1} = [M]\,V_i \tag{5}
\]
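The column-wise construction in (5) and the projection in (3) can be illustrated on a very small system. The Python sketch below is illustrative only and is not the implementation used in this work: the 5-state matrices, the choice of $K^{-1}F$ as the start vector, and all function names are assumptions, and a dense Gaussian elimination stands in for the sparse solver. It builds a two-column V, forms the reduced matrices, and compares the DC response of the full and reduced models (obtained by setting the derivative in (2) to zero).

```python
"""Toy single-expansion-point Krylov projection for M x' = K x + F v_in, i_out = O x."""
import math


def mat_vec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def krylov_basis(m_mat, k_mat, start, size, tol=1e-12):
    """Columns of V: orthonormalised start vector, then K v_next = M v as in (5)."""
    basis = []
    v = start
    for _ in range(size):
        for u in basis:                      # Gram-Schmidt against earlier columns
            eps = dot(u, v)
            v = [vi - eps * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(dot(v, v))
        if norm < tol:
            break
        v = [vi / norm for vi in v]
        basis.append(v)
        v = solve(k_mat, mat_vec(m_mat, v))  # next candidate column
    return basis


def project(a, basis):
    """V^T A V, with V stored as a list of columns."""
    return [[dot(vi, mat_vec(a, vj)) for vj in basis] for vi in basis]


# Hypothetical 5-state test system (not a PEEC model)
n = 5
M = [[(1.0 + i % 2) if i == j else 0.0 for j in range(n)] for i in range(n)]
K = [[-2.0 if i == j else (1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
F = [1.0, 0.0, 0.0, 0.0, 0.0]
O = [0.0, 0.0, 0.0, 0.0, 1.0]

V = krylov_basis(M, K, solve(K, F), size=2)
M_r, K_r = project(M, V), project(K, V)
F_r = [dot(v, F) for v in V]
O_r = [dot(O, v) for v in V]

# DC response with v_in = 1: 0 = K x + F, so x = -K^{-1} F
dc_full = dot(O, solve(K, [-f for f in F]))
dc_rom = dot(O_r, solve(K_r, [-f for f in F_r]))
print(len(K_r), dc_full, dc_rom)
```

Because the assumed start vector is the DC solution direction, the 2×2 reduced model reproduces the DC response of the 5×5 model to rounding error; higher frequency behaviour depends on how many columns are added.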

D. Multiple expansion point MOR

Recently it has been shown that standard MOR methods can fail to accurately capture the full frequency range of the original PEEC model and that multiple, shifted applications of the MOR algorithm can be used to increase the reduced order model bandwidth [2]. Shifted MOR applications move the MOR expansion point by an offset of σ to allow the capture of higher frequency effects. The operations involved at each MOR iteration are now given by (6).

\[
[K + \sigma M]\,V_{i+1} = [M]\,V_i \tag{6}
\]

The critical difference is that a matrix solve is now required on a matrix involving the dense matrix, (K + σM), which significantly increases the computational effort required to generate the model. The challenges associated with this operation, and a potential solution, are the subject of this paper.

Fig. 2: Mutual coupling between n conductors or surface panels using FMM methods: Level 0 is direct evaluation - n relationships, Level 1 evaluates via 2 cell midpoints - 2 × (2n) relationships. Level 2 evaluates via 2 cell midpoints - 2 × (n) relationships. Panels: (a) Level 0, (b) Level 1, (c) Level 2.
II. FAST MULTIPOLE METHOD EXPANSION FOR L AND P MATRIX-VECTOR PRODUCTS

The following sections explain the multiple-expansion point MOR method implementation, however first a brief overview of the Fast Multipole Method (FMM) expansion of the L and P matrices is needed.

Evaluating the L and P matrices directly involves computing $n_c^2$ and $n_s^2$ values respectively, leading to large memory requirements and computation time. An FMM expansion involves decomposing the relationships between any two conductors or surface panels using Taylor Series about a series of expansion points. A high level overview is given in Fig. 2. Rather than evaluating all inter-conductor relationships directly, the conductors are grouped and relationships evaluated via the group midpoints, which reduces the number of PEEC integral evaluations from being $\propto n^2$ to $\propto n$ for a fixed number of groups. The grouping is evaluated dynamically based on required accuracy, and the implementation used effectively converts the original dense PEEC matrix to two smaller sparse matrices that relate conductors to midpoints, and midpoints to conductors. These two sparse matrices can be used to approximate the result of a matrix-vector operation on the original dense matrix. This significantly reduces computational effort and memory requirements for algorithms where matrix-vector product operations are required.

III. MATRIX PARTITIONING AND OPTIMAL SOLVER SELECTION

Whilst the MOR methods accelerate the time-domain simulation due to the use of a ROM, generating this ROM is computationally expensive as it involves solving (5) if a DC expansion point is used, and (6) for all other expansion points.

A. The M matrix-vector product

In both the shifted and non-shifted cases, the M matrix product required in the MOR algorithm (right-hand side of (6)) can be decomposed as two sub-matrix operations on P and L (7):

\[
y = \begin{bmatrix} P^{-1} & 0 \\ 0 & L \end{bmatrix} V_i \tag{7}
\]

A matrix-vector product on the L sub-matrix and a solve on the P sub-matrix must be performed. The L matrix-vector product can be evaluated using an FMM expansion. The $P^{-1}$ solve can be evaluated using the GMRES algorithm [6]. GMRES is an iterative matrix solver that uses an orthogonal Krylov Subspace projection to solve $P y = V_i$ using a series of iterations involving a matrix-vector product on the matrix to be solved, P. The algorithm effectively converts a matrix solve into a series of matrix-vector products which can be evaluated using an FMM expansion.

B. Preconditioning the $P^{-1}$ matrix solve

In general, a preconditioner is required to control convergence of GMRES. A preconditioner is a matrix whose inverse approximates that of the matrix to be solved (P in this case), but that can be solved efficiently. The inverse of the preconditioner is evaluated at each iteration to accelerate convergence using a direct matrix solver, from the Lapack library in this case. In this work, a block diagonal preconditioner with blocks derived from the FMM expansion point partitioning (Fig. 2) is used. The blocks on the diagonal are formulated by directly evaluating the relationship between all surface panels within a cell grouping. For higher level preconditioners the number of panels in a group increases and therefore so does the size of
the diagonal blocks. The preconditioner can be solved easily by solving each diagonal block in turn. If the diagonal blocks are small, this operation is computationally efficient.

C. The $(K + \sigma M)^{-1}$ matrix solve

This solve is the left-hand side of (6). For the case where σ = 0 (the single expansion point case, where (6) reduces to (5)), this solve is trivial and a direct sparse matrix solver can be used to solve the K sub-matrix. However, for the multiple expansion point MOR algorithm, which is required for high bandwidth models, σ is non-zero for all but the first expansion point. In this case, the solve can be evaluated using GMRES, which converts the matrix solve to a series of matrix-vector products. These matrix-vector products consist of a sparse matrix-vector product on the K matrix and an FMM accelerated matrix-vector product on the σM matrix, which itself contains a GMRES solve on the P sub-matrix, as described in the previous section.

D. Preconditioning the K + σM matrix solve

For small values of σ, the K matrix dominates K + σM. Since K is sparse, it can easily be factored once using a direct sparse LU factorisation algorithm and then used as the preconditioner for the iterative K + σM solve. This preconditioner is the exact inverse of K + σM for a DC expansion point and so GMRES will converge on the first iteration. For shifted expansion points, convergence performance deteriorates with increasing σ, however performance was found to be acceptable for the values of σ used in this work.

E. Summary and algorithm

    // m = index of next column in V
    m = 0;
    // For each expansion point chosen (σ_k)
    for (k = 0; k < nexp; k++) {
        // Compute vectors at expansion point
        for (i = 0; i < nbvec; i++) {
            // Compute new column (solve (6))
            y = [M] V_{m-1};
            V_m = [K + σ_k M]^{-1} y;
            // Orthogonalise new column
            for (j = 0; j < m; j++) {
                ε = V_j^T V_m;
                V_m = V_m − ε V_j;
            }
            // Check for convergence
            if (|V_m| < tol) break;
            // Normalise
            V_m = V_m / |V_m|;
            m++; // increase column index
        }
    }
    V = [V_0 V_1 V_2 V_3 ... V_m]

Fig. 3: Pseudo-code for critical operations in MOR algorithm. The matrix operations dominate computation time. The matrix-vector product requires iterations to compute $P^{-1} V_{m-1}$ since M is known in terms of $P^{-1}$ and L (7). The matrix solve requires iterations to compute $[K + \sigma_k M]^{-1} y$ for cases where the shift, $\sigma_k$, is non-zero. Each of these iterations also requires internal iterations to evaluate $P^{-1} y$.
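The loop structure of Fig. 3 can be written as a short runnable sketch. The Python below is an illustration under stated assumptions rather than the implementation used in this work: the 6-state test matrices, the start vector and the shift values are arbitrary and hypothetical, the initialisation steps are replaced by a single normalised start column, and the nested GMRES and FMM operations are replaced by a dense direct solve so that only the expansion-point and orthogonalisation logic is shown.

```python
"""Runnable sketch of the Fig. 3 loop using dense direct solves on toy matrices."""
import math


def mat_vec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def solve(a, b):
    """Gaussian elimination with partial pivoting (stand-in for the iterative solve)."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def multipoint_basis(m_mat, k_mat, start, shifts, n_vec, tol=1e-10):
    """Return orthonormal columns of V built at each expansion point in `shifts`."""
    n = len(start)
    norm = math.sqrt(dot(start, start))
    basis = [[s / norm for s in start]]       # assumed initialisation (not in Fig. 3)
    for sigma in shifts:                       # for each expansion point sigma_k
        shifted = [[k_mat[i][j] + sigma * m_mat[i][j] for j in range(n)]
                   for i in range(n)]
        for _ in range(n_vec):                 # vectors at this expansion point
            y = mat_vec(m_mat, basis[-1])      # y = M V_{m-1}
            v = solve(shifted, y)              # V_m = (K + sigma_k M)^{-1} y
            for u in basis:                    # orthogonalise new column
                eps = dot(u, v)
                v = [vi - eps * ui for vi, ui in zip(v, u)]
            norm = math.sqrt(dot(v, v))
            if norm < tol:                     # check for convergence
                break
            basis.append([vi / norm for vi in v])  # normalise and store
    return basis


# Hypothetical test matrices, chosen so that K + sigma*M is non-singular
n = 6
M = [[float(i + 1) if i == j else 0.0 for j in range(n)] for i in range(n)]
K = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
start = [1.0] + [0.0] * (n - 1)

V = multipoint_basis(M, K, start, shifts=[0.0, 0.5], n_vec=2)
worst = max(abs(dot(V[i], V[j]) - (1.0 if i == j else 0.0))
            for i in range(len(V)) for j in range(len(V)))
print(len(V), worst)
```

In this sketch the first shift is zero, so the first pass corresponds to (5) and the second to (6); `worst` reports the largest deviation of $V^T V$ from the identity.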
The solves described in the previous sections result in a Krylov Subspace projection algorithm with nested iterative matrix solves: an outer K + σM matrix solve, with a P sub-matrix solve within each outer iteration. This allows the entire MOR process to proceed utilising matrix-vector product operations on the dense L and P sub-matrices, which can be efficiently evaluated using FMM expansions, even when non-zero expansion point shifts are used. This avoids the need to explicitly form the dense sub-matrices and results in a MOR algorithm that is efficient in both computational effort and memory usage. The key steps in each iteration are shown in Fig. 3. Initialisation steps are not shown but more detail is available in [2].

IV. RESULTS

The PEEC method and solvers are implemented in our own power electronic virtual prototyping software that allows coupled simulation with SPICE models. A volume mesh containing $n_c = 31,504$ conductors and a surface mesh containing $n_s = 24,944$ surface panels is used to generate the PEEC model of the component in Fig. 1. The potting compound and IMS dielectric layer are modelled using a global $\epsilon_r = 4$ to avoid the need to explicitly model dielectric regions. ROMs were generated for a single DC expansion point, and for dual expansion points with the first at DC and a subsequent point at 10kHz, 30kHz, 50kHz or 75kHz. The ROMs consist of matrices with dimensions < 10, compared with over 50,000 for the original PEEC model. The exact ROM size depends on the point at which the convergence tolerance was met.

A. Frequency domain validation

The ROMs can be simulated in the time or frequency domain. For initial validation, a frequency domain simulation was performed to extract the impedance of the inductor. A current source is used to drive current through the winding, with one end of the winding structure and the IMS PCB back-plane both grounded. The ROMs are compared with a Z-parameter result obtained using the full-wave solver in Ansys HFSS (Fig. 4). The single, DC expansion point ROM is unable to capture high frequency effects. Dual expansion point models can capture effects up to the second resonant point, however accuracy is sensitive to the choice of expansion point. The use of additional expansion points can reduce the influence of expansion point location on model accuracy, although this isn't addressed in this work.
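A frequency-domain solve of the reduced system (4) amounts to replacing the time derivative with $s = j 2 \pi f$ and solving $(s M_r - K_r) x_r = F_r v_{in}$ at each frequency. The Python sketch below shows this on a hypothetical two-state series RLC circuit standing in for a ROM. The component values, the voltage-driven formulation (the validation above uses a current source), the sign convention for the resistance term and the function names are all assumptions made for illustration; the impedance is taken as $v_{in} / i_{out}$.

```python
"""Frequency-domain solve of a small model M_r x' = K_r x + F_r v_in, i_out = O_r x."""
import math


def solve(a, b):
    """Gaussian elimination with partial pivoting; also works for complex entries."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def rom_impedance(m_r, k_r, f_r, o_r, freq_hz):
    """Return v_in / i_out at one frequency from (s M_r - K_r) x_r = F_r (v_in = 1)."""
    s = 2j * math.pi * freq_hz
    n = len(f_r)
    lhs = [[s * m_r[i][j] - k_r[i][j] for j in range(n)] for i in range(n)]
    x_r = solve(lhs, f_r)
    i_out = sum(o * x for o, x in zip(o_r, x_r))
    return 1.0 / i_out


# Hypothetical series RLC "ROM": states are capacitor voltage and loop current
R, L, C = 2.0, 1e-6, 1e-9
M_r = [[C, 0.0], [0.0, L]]
K_r = [[0.0, 1.0], [-1.0, -R]]
F_r = [0.0, 1.0]
O_r = [0.0, 1.0]

f0 = 1.0 / (2.0 * math.pi * math.sqrt(L * C))   # series resonance frequency
z_res = rom_impedance(M_r, K_r, F_r, O_r, f0)
z_low = rom_impedance(M_r, K_r, F_r, O_r, f0 / 100.0)
print(f0, abs(z_res), abs(z_low))
```

At the resonant frequency of this toy circuit the computed impedance reduces to the resistance, and well below resonance it is dominated by the capacitive term, which gives a simple check that the matrix form has been assembled consistently.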

Fig. 4: Impedance validation of single point and multi-point ROM against CST Studio full-wave solution.

3D field plots can also be extracted from the small ROM. Plots for volume current density are shown in Fig. 5. Accuracy is acceptable over all frequencies shown (100Hz to 100MHz), and capacitively coupled currents in the grounded back-plane can be seen at higher frequencies. The error observed can be further reduced with mesh refinement, however the meshing code implementation available for this work made further refinement difficult and software improvements are needed.

Fig. 5: Volume current density (A/m^2) - Ansys HFSS (Left) vs frequency domain solve of reduced order model (Right). Panels: (a) 100Hz, (b) 10kHz, (c) 1MHz, (d) 100MHz.

B. Reduced order model generation times

1) Convergence of outer matrix solve: The outer matrix solve involves iterations that evaluate the product of $(K + \sigma M)^{-1}$ with a vector, with each iteration computing the product of (K + σM) with a vector. Evaluating the product of σM with a vector requires inner iterations to solve $P^{-1}$. The number of iterations required for this outer iteration, for different values of σ, is shown in Table I. The preconditioner used is an exact inverse of K + σM for the case σ = 0 and so the GMRES solve requires a single iteration. For larger shifts, the preconditioner is only an approximation to the inverse and so the number of iterations required increases. For all cases, the number of iterations required was no more than 17.

The time taken to perform this solve is dependent on the number of iterations needed to evaluate the inner $P^{-1}$ matrix solve.

TABLE I: Number of outer iterations for convergence of $(K + \sigma M)^{-1}$ solve

Expansion point     0Hz    10kHz    30kHz    50kHz    75kHz
Outer iterations    1      5        12       15       17

2) Convergence of inner, P, matrix solve and resulting model generation times: Inner iterations are required within the $(K + \sigma M)^{-1}$ matrix solve to evaluate a $P^{-1}$ solve, which is also preconditioned. The choice of preconditioner level affects the $P^{-1}$ solve time, which in turn affects the ROM generation time. Table II shows a summary of these times, including: time taken to initialise the FMM solvers for P and L; time taken to factor the K matrix (this also acts as the outer solve preconditioner); time taken to generate and factor the $P^{-1}$ solve preconditioner; number of iterations needed for the $P^{-1}$ solve; time taken to perform the $P^{-1}$ solve; time for a single outer $(K + \sigma M)^{-1}$ solve for the case where σ = 30kHz; and total ROM generation time for a dual DC+30kHz expansion point ROM including the time taken to apply the transform matrix and generate the ROM matrices.

Increasing the preconditioner level increases the effectiveness of the preconditioner, but also the size of the blocks in its block-diagonal structure. This increases the time taken to compute the blocks and also the time taken to factor them. The number of iterations required for the $P^{-1}$ solve convergence decreases with increasing level, however the solve time decreases initially and then increases again. This is due to the computational effort required to evaluate the product of the preconditioner and a vector at each iteration, which counteracts the benefit of reduced iterations.

Fig. 6: Current density distribution and terminal current predicted by the coupled simulation.

The minimum model generation time for this problem was 1343 seconds (approximately 22.4 minutes) for the Level 2 case. The reduced order model matrices had dimensions 7x7, resulting from 3 initialisation iterations, 2 outer iterations at the DC expansion point, and 2 outer iterations at the 30kHz expansion point. All times were obtained using an AMD Ryzen Pro7 laptop.
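The trend described for the preconditioner level can be checked with simple arithmetic on the values reported in Tables I and II. The short Python sketch below divides the inner solve time by the inner iteration count to obtain an average cost per inner iteration, and compares the reported outer solve time with 12 inner solves (the outer iteration count at 30kHz in Table I). The variable names, and the reading of the outer solve as roughly one inner solve per outer iteration, are assumptions made here for illustration.

```python
"""Arithmetic on the timings reported in Tables I and II (values copied from the tables)."""
levels = [1, 2, 3, 4]
inner_iterations = [650, 320, 245, 137]
inner_solve_s = [74.0, 27.0, 30.6, 68.5]
outer_solve_s = [927.0, 362.0, 399.0, 890.0]
total_s = [2520.0, 1343.0, 1479.0, 2662.0]
outer_iterations_30khz = 12   # Table I, 30kHz column

# Average time per inner iteration at each preconditioner level
per_iteration_s = [t / n for t, n in zip(inner_solve_s, inner_iterations)]
# Assumed estimate: one inner solve per outer iteration
outer_estimate_s = [outer_iterations_30khz * t for t in inner_solve_s]
best_level = levels[total_s.index(min(total_s))]

for row in zip(levels, per_iteration_s, outer_estimate_s, outer_solve_s):
    print("Level %d: %.3f s per inner iteration, "
          "12 inner solves = %.1f s, reported outer solve = %.0f s" % row)
print("Fastest total: Level %d, %.1f minutes" % (best_level, min(total_s) / 60.0))
```

With these numbers the average time per inner iteration is lowest at Level 2 (about 0.08 s) and highest at Level 4 (0.5 s), and the reported outer solve times are slightly above 12 inner solves at every level, which appears consistent with the inner solve dominating the outer solve cost.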

TABLE II: Iterations required for inner, $P^{-1}$ solve convergence for different preconditioners and resulting solve times.

                       Level 1   Level 2   Level 3   Level 4
FMM setup (s)          509       509       509       509
Factor K matrix (s)    4         4         4         4
P prec. gen. (s)       0.08      40.5      99.5      292
N inner iterations     650       320       245       137
Inner solve (s)        74        27        30.6      68.5
Outer solve (s)        927       362       399       890
Total (s)              2520      1343      1479      2662

C. Time-domain simulation with coupled semiconductor models

To demonstrate the capability of the ROM for coupled time-domain simulation, the inductor is connected in series with a 30Ω load resistor and driven by a half-bridge consisting of two GaN HEMT models [7] switching at 1MHz, to a final 10µs end time, with 0.1ns fixed time-steps. The ROM with expansion points at DC and 30kHz was used and 100,000 time-steps were evaluated. The total simulation time was 1347s. This time is dominated by the model generation time of 1343s, with only 4 seconds required for time-stepping. Providing the 3D geometry does not change, the ROM can be reused for further time-domain simulations without needing to be regenerated. All circuit waveforms and 3D results such as current density are available at all time-steps if required. Example time domain waveforms for the induced winding voltage are shown in Fig. 6. Experimental validation is needed to verify the accuracy of the combined models, however the results are sufficient to illustrate the computational efficiency of the methods described for coupled time-domain simulation.

V. CONCLUSIONS

An algorithm for generation of multiple expansion point reduced order models using nested, iterative solvers and accelerated using multi-pole expansions has been presented and evaluated. It is shown that using this algorithm, a 100,000 time-step simulation of a coupled 3D inductor and switching semiconductor model can be evaluated in less than 23 minutes including reduced order model generation, and subsequent time-domain simulations that can reuse the reduced order models can be evaluated in less than 4 seconds.

REFERENCES

[1] P. L. Evans, A. Castellazzi, and C. M. Johnson, “Design tools for rapid multidomain virtual prototyping of power electronic systems,” IEEE Transactions on Power Electronics, vol. 31, no. 3, pp. 2443–2455, 2016.
[2] X. Gao, P. Evans, M. Johnson, and K. Li, “Multi expansion point reduced order modelling for electromagnetic design of power electronics,” in 2021 IEEE Design Methodologies Conference (DMC), 2021, pp. 1–6.
[3] A. Ruehli, G. Antonini, and L. Jiang, Circuit Oriented Electromagnetic Modeling Using the PEEC Techniques. IEEE, 2017.
[4] T. Bechtold, E. B. Rudnyi, and J. G. Korvink, Fast Simulation of Electro-Thermal MEMS. Freiburg: Springer, 2006.
[5] A. Odabasioglu, M. Celik, and L. T. Pileggi, “PRIMA: passive reduced-order interconnect macromodeling algorithm,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, no. 8, pp. 645–654, 1998.
[6] Y. Saad and M. H. Schultz, “GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems,” SIAM Journal on Scientific and Statistical Computing, vol. 7, no. 3, pp. 856–869, 1986. [Online]. Available: https://doi.org/10.1137/0907058
[7] K. Li, P. L. Evans, C. M. Johnson, A. Videt, and N. Idir, “A GaN-HEMT compact model including dynamic RDSon effect for power electronics converters,” Energies, vol. 14, no. 8, 2021. [Online]. Available: https://www.mdpi.com/1996-1073/14/8/2092
