This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3277625

A Metaheuristics-Based Hyperparameter
Optimization Approach to Beamforming Design
KIEU-XUAN THUC (MEMBER, IEEE), HOANG MANH KHA (MEMBER, IEEE),
NGUYEN VAN CUONG, AND TONG VAN LUYEN (MEMBER, IEEE)
Hanoi University of Industry, Hanoi 100000, Vietnam

Corresponding author: Tong Van Luyen ([email protected]).

ABSTRACT The paradigm shift from “connected things” to “connected intelligence” is anticipated to be made possible by sixth-generation wireless systems, which typically use millimeter wave beamforming to mitigate the significant propagation loss. However, beamforming design in millimeter wave communications poses many challenges owing to large antenna arrays with a limited number of radio frequency chains and analog beamforming architectures. To address these challenges, deep learning models have recently been utilized as a disruptive method for solving difficult optimization problems in sixth-generation mobile systems, such as maximizing spectral efficiency. However, it is still unclear how to produce high-performance deep learning models, which requires choosing appropriate hyperparameters. This study proposes a metaheuristics-based approach for optimizing the hyperparameters used to build deep learning models that maximize spectral efficiency. The results demonstrate that models built with the proposed approach achieve higher spectral efficiency than models built with a state-of-the-art approach and than the reference model, whose hyperparameters are based on empirical trials.

INDEX TERMS Hyperparameter optimization, beamforming, metaheuristics, millimeter wave, large-scale antenna arrays.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4

I. INTRODUCTION
Since the first generation of mobile telecommunications was introduced in the 1970s, wireless communication technology has advanced incredibly quickly. By 2030, newly developed data-hungry applications and a greatly expanded wireless network are expected to require sixth-generation (6G) communication, which is a significant improvement over previous network generations and might cover nearly the entire surface of the earth as well as near space. In addition, as the number of wireless consumer devices and the Internet of Things grows rapidly, the amount of mobile data transfer nearly doubles every year, surpassing that of cable communication. Therefore, in the future 6G network, millimeter wave (mmWave) technology will play a significant part in attaining the anticipated network performance and communication requirements with greater speed and reliability than previous generations of networks [1]. Millimeter-wave communication with gigahertz or tens of gigahertz bandwidths is also viewed as a possible technology for 6G wireless systems. Communication in these bandwidths will ease the spectrum shortage and capacity constraints of existing wireless systems [2].

Large phased arrays are typically used in mmWave communication to mitigate the significant propagation loss through mmWave beamforming, which takes the form of analog or hybrid beamforming when one or several radio frequency (RF) chains are present, respectively. Analog and hybrid beamforming are bound by the constant modulus constraint since only phase shifters are used to adjust the excited antenna weights [2]. Fully digital beamforming systems are impractical at mmWave/sub-THz frequencies because each antenna element requires a dedicated RF transceiver chain, which is neither cost-effective nor energy-efficient to construct for large-scale arrays and bandwidths [3]. In addition, analog beamforming, in which phase shifting is accomplished in the analog domain, has been frequently used owing to its affordable price and ease of implementation [1].

The requirements of 6G systems have necessitated the granular optimization of radio resources and the efficient acquisition of network-related data [4]. Due to the huge size, high density, varied quality of service, and integrated multi-functional cross-layer architecture, 6G optimization problems might be exceedingly complex and time-sensitive, posing many challenges to the development of effective

optimization algorithms. Deep learning (DL) has recently been utilized as a disruptive method to solve difficult optimization problems in 6G and to support a number of artificial intelligence services and Internet of Everything applications [4]. It has also been proven to be a useful tool for dealing with difficult non-convex problems and high computational demands owing to its excellent recognition and representation capabilities [5]. To enable a paradigm shift from traditional optimization theory-based approaches to more promising DL architectures, DL-based optimization algorithm design aims to achieve near-optimal performance with excellent computing efficiency for challenging large-scale optimization problems in 6G systems [4]. In particular, superior performance, scalability and generalizability, computational efficiency, and robustness are some benefits of using DL for large-scale optimization.

The performance of a DL approach, however, depends strongly on its hyperparameters. The values of these parameters must be carefully chosen to get the best performance because they typically have a significant impact on the learner’s complexity, behavior, speed, and other aspects. Human trial-and-error selection of these values is time-consuming, prone to error, frequently biased, and computationally irreproducible. As the mathematical formulation of hyperparameter optimization (HPO) is basically black-box optimization in higher-dimensional spaces, it is preferable to transfer this task to suitable algorithms in order to improve efficiency and ensure reproducibility [6]. Over the past 20 to 30 years, numerous HPO strategies have been developed to facilitate and automate the search for hyperparameter combinations with optimal performance. However, more advanced HPO techniques are not utilized as frequently as they could (or should) be. This may be due to a combination of the following reasons [6]: (i) a lack of understanding of HPO techniques by prospective users, who could consider them as complicated “black boxes”; (ii) low confidence among prospective users in the superiority of HPO processes over rudimentary methods, resulting in doubt over the anticipated return on investment (time); (iii) the absence of guidance on the selection and configuration of pertinent HPO approaches for the issue at hand; (iv) difficulty in accurately defining the search space of HPO approaches. The primary objective of HPO is to automate the process of searching hyperparameters and enable users to utilize optimized DL models for real-world problems. A DL model’s optimal model architecture is expected to be attained using an HPO procedure. To effectively utilize HPO approaches, it is essential to choose an appropriate optimization strategy to identify optimal hyperparameters. Numerous HPO problems are non-convex or non-differentiable optimization problems. Therefore, traditional optimization approaches dealing with these HPO problems might lead to a local solution rather than a global solution [7].

Though traditional optimization algorithms can be effective for local search, metaheuristic algorithms, also known as metaheuristics, have significant advantages for global optimization because they typically treat the problem as a black box and are therefore flexible and easy to implement. In addition, these optimizers impose no stringent mathematical requirements (e.g., differentiability, smoothness), making them suitable for problems with various features, such as discontinuities and nonlinearity [8]. A metaheuristic is considered a potential solution to optimization problems if it can strike a tradeoff between exploration (diversification) and exploitation (intensification). Exploration is required to find regions of the search space that contain solutions of high quality. Exploitation is necessary to intensify the search in promising regions based on gathered search knowledge [9], [10]. Metaheuristics are aimed at obtaining acceptable solutions in a realistic running time and providing practical solutions to a variety of problems [11], [12]. Metaheuristics have also gained appeal over exact methods for addressing optimization issues due to the ease and robustness of the solutions they give in a range of sectors, including engineering, business, transportation, and even the social sciences. The metaheuristic community has also conducted substantial research, which includes the development of novel methods, applications, and performance evaluations [13], [14].

It can be seen that DL and metaheuristics both provide their own distinct advantages, but what is missing from past studies is a comprehensive approach to utilizing DL in the context of beamforming design. Our study contributes to finding solutions for beamforming design based on the combination of metaheuristics and DL in a manner that facilitates synergy between the two approaches. Specifically, we propose an HPO approach utilizing metaheuristics for designing beamforming in mmWave communication systems. By applying this approach, the obtained hyperparameters can be used to build DL models with high performance. The model built with the proposed approach is shown to outperform the model built with a state-of-the-art approach [15] and the reference model in [16] with respect to spectral efficiency, convergence characteristics, and computational time.

The structure of this study is as follows. Section II discusses related studies on HPO and DL-based beamforming design for mmWave systems. Section III sheds light on DL-based beamforming design in mmWave communication systems. Section IV formulates the HPO problem based on metaheuristics and introduces an algorithm for optimizing hyperparameters. Results and comparative analysis are shown in Section V, and the discussion is presented in Section VI, followed by the conclusion in Section VII.

II. RELATED WORK
Recently, some HPO techniques have been developed, each with its own merits and demerits. Grid search (GS) is a straightforward approach, but it suffers from the curse of dimensionality and takes a long time [17], [18]. In comparison to GS, random search (RS) is more effective and supports all kinds of hyperparameters. In real-world

applications, evaluating hyperparameter values chosen at random with RS enables analysts to search a wide area. However, as RS does not take the outcomes of earlier tests into account, it may include numerous pointless evaluations, which reduces its effectiveness [7], [18]. The iterative Bayesian optimization (BO) algorithm is a popular solution to HPO problems. In contrast to GS and RS, BO bases the next hyperparameter value on the outcomes of prior evaluations in order to cut down on pointless assessments and increase efficiency. As a result, BO needs fewer iterations to find the ideal set of hyperparameters than GS and RS. However, it is challenging to parallelize BO models since they operate sequentially to balance the search for unexplored areas and the utilization of currently tested regions [7]. Although GS, RS, and BO are frequently used to configure hyperparameters, they are unworkable when the complexity of the problem and the number of parameters are high. Both Hyperband and RS support parallel execution, but Hyperband can be considered an enhanced form of RS. Hyperband is more effective than RS, especially when time and resources are at a premium. It balances model performance with resource utilization. GS, RS, BO, and Hyperband treat each hyperparameter independently and do not take into account hyperparameter correlations. This is a significant limitation for any of these approaches. They will therefore be ineffective for logistic regression, support vector machines, and density-based spatial clustering of applications with noise, which are all machine learning algorithms [7].

In addition, to automate the search for DL architectures and settings, researchers have also presented new studies based on metaheuristic optimization techniques. The differential evolution approach was used in [17] to provide a framework for automating the search for long short-term memory hyperparameters, such as the number of hidden neurons and batch size. The experimental findings demonstrated that the system’s average accuracy, which was based on a long short-term memory network optimized using differential evolution and particle swarm optimization algorithms, improved dramatically over time. Besides, the work in [19] trained a DL model by adjusting its parameters for a vehicle logo recognition system. The learning rate, the number of filters, and the size of the filters in each convolutional layer were all optimized hyperparameters. The authors claimed that, compared to existing manual feature extraction techniques, the DL framework optimized by particle swarm optimization achieved higher accuracy. A hyper-heuristic parameter optimization approach was proposed in [20] for configuring deep belief network parameters. On the MNIST, CalTech 101 Silhouettes, and Semeion datasets, this approach was compared with various metaheuristic algorithms such as particle swarm optimization. On almost all datasets, the hyper-heuristic parameter optimization had the lowest test mean square error.

In the context of mmWave communication systems, the application of DL research advancements has also enhanced solutions for these systems [21]. There are several productive applications, namely designing beamforming for weighted sum-rate maximization [22], predicting the optimal transmit/receive beam pairs by using DL models in the role of hybrid precoding [23], using an autoencoder DL model to improve hybrid precoding [24], and leveraging deep reinforcement learning for beamforming [25]. A technique based on convolutional neural networks for joint antenna selection and beamforming is proposed in [26]. The works [16], [27] have shown that, in comparison to conventional approaches, DL approaches are computationally more efficient in their search for optimum beamformers and tolerant of imperfect channel inputs. Based on compressive channel data learned by deep auto-encoders, the work [23] has designed beamformer vectors. Base stations (BSs) that collect the mobile user’s omni-beampatterns for codebook-based beamforming have been taken into account for the DL-based wideband beamforming in [28]. Moreover, assuming perfect channel covariance matrix knowledge at the transmitter, DL-based statistical hybrid beamforming is studied in [29]. As evidenced by these works, mmWave multiple-input multiple-output systems can considerably benefit from the application of DL approaches to their essential components. However, hyperparameters in these DL models are all determined experimentally or not based on any principles at all. Therefore, HPO approaches for DL models in mmWave communication problems are imperative.

III. DL-BASED BEAMFORMING
A. SYSTEM MODEL
The downlink of a narrowband multiple-input single-output mmWave system using the analog beamforming architecture in Fig. 1 is studied, in which a base station with a single RF chain and $N_t$ antennas transmits one data stream to a user equipped with a single antenna [16]. Let $s$ represent the transmitted symbol with normalized average symbol energy. The symbol is multiplied by a scalar digital precoder $D$ ($D$ is a scalar because there is only one RF chain) before being multiplied by an $N_t \times 1$ analog precoder vector $\mathbf{v}_{\mathrm{RF}}$ that is implemented by phase shifters. The final signal after precoding is $\mathbf{x} = \mathbf{v}_{\mathrm{RF}} D s$.

The received signal through the mmWave channel is denoted as $r = \mathbf{h}^{\dagger}_{\mathrm{channel}} \mathbf{v}_{\mathrm{RF}} D s + n$, where $n$ is additive white Gaussian noise following the circularly-symmetric complex normal distribution with zero mean and variance $\sigma^2$, $\mathbf{h}^{\dagger}_{\mathrm{channel}}$ is the mmWave channel vector between the base station and the user, and $\dagger$ denotes the Hermitian transpose. With one line-of-sight path and $L-1$ non-line-of-sight paths, the widely employed Saleh-Valenzuela mmWave channel is expressed as [30]:
$$\mathbf{h}^{\dagger}_{\mathrm{channel}} = \sqrt{\frac{N_t}{L}} \sum_{\ell=1}^{L} \alpha_{\ell}\, \mathbf{a}_t^{\dagger}\!\left(\phi_t^{\ell}\right). \tag{1}$$
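As an illustration, one realization of the channel in (1) can be drawn with a few lines of plain Python. This is a minimal sketch under assumptions that are not specified in this paper: a uniform linear array with half-wavelength spacing, path gains drawn from a unit-variance complex Gaussian distribution, and departure angles uniform in $[-\pi/2, \pi/2]$. The function names `array_response` and `sv_channel` are hypothetical.

```python
import cmath
import math
import random
from typing import List


def array_response(phi: float, n_t: int) -> List[complex]:
    """Unit-norm response of a half-wavelength ULA (assumed geometry)."""
    return [cmath.exp(1j * math.pi * n * math.sin(phi)) / math.sqrt(n_t)
            for n in range(n_t)]


def sv_channel(n_t: int, n_paths: int, rng: random.Random) -> List[complex]:
    """Draw one realization of h^dagger in (1) as a list of n_t complex entries."""
    h_dag = [0j] * n_t
    for _ in range(n_paths):
        # Assumed statistics: gain ~ CN(0, 1), angle uniform in [-pi/2, pi/2].
        alpha = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
        phi = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
        for n, a_n in enumerate(array_response(phi, n_t)):
            h_dag[n] += alpha * a_n.conjugate()  # entry of alpha * a_t^dagger(phi)
    scale = math.sqrt(n_t / n_paths)  # the sqrt(N_t / L) factor in (1)
    return [scale * entry for entry in h_dag]


# Example with the array size and path count used in this study (N_t = 64, L = 3).
h_dag = sv_channel(n_t=64, n_paths=3, rng=random.Random(0))
```

Seeding the generator makes the draw repeatable, which is convenient when the same channel samples must be reused across several trained models.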

In (1), $\alpha_{\ell}$ represents the complex gain of the $\ell$th path, $\phi_t^{\ell}$ is the azimuth angle of departure of the $\ell$th path, and $\mathbf{a}_t^{\dagger}(\phi_t^{\ell})$ is the antenna array response vector at the base station. The term with $\ell = 1$ corresponds to the line-of-sight path in $\mathbf{h}^{\dagger}_{\mathrm{channel}}$.

FIGURE 1. The diagram of a multiple-input single-output mmWave system using one RF chain [16].

The optimization objective of the problem is the spectral efficiency, which is widely used in current beamforming design works. This function is given as [16]:
$$R = \log_2\!\left(1 + \frac{\gamma}{N_t}\left\|\mathbf{h}^{\dagger}_{\mathrm{channel}}\mathbf{v}_{\mathrm{RF}}\right\|^2\right), \tag{2}$$
where $\gamma$ represents the signal-to-noise ratio (SNR). The beamformer aims to generate the optimized analog beamforming vectors $\mathbf{v}_{\mathrm{RF}}$ so that the spectral efficiency is maximized. Then, the beamforming optimization problem with the constant modulus constraint on $\mathbf{v}_{\mathrm{RF}}$ can be given by [16]:
$$\begin{aligned} \underset{\mathbf{v}_{\mathrm{RF}}}{\text{minimize}} \quad & -\log_2\!\left(1 + \frac{\gamma}{N_t}\left\|\mathbf{h}^{\dagger}_{\mathrm{channel}}\mathbf{v}_{\mathrm{RF}}\right\|^2\right) \\ \text{subject to} \quad & \left|\left[\mathbf{v}_{\mathrm{RF}}\right]_{n_t}\right|^2 = 1, \quad \text{for } n_t = 1, \ldots, N_t. \end{aligned} \tag{3}$$
As the SNR is often regarded as being more accurately measured than the channel, the SNR $\gamma$ and the estimated SNR $\gamma_{\mathrm{est}}$ are assumed to be equal, i.e., $\gamma_{\mathrm{est}} = \gamma$.

B. DL-BASED BEAMFORMING DESIGN
In this study, we take the DL-based beamformer designed in [16] as the reference to verify our proposed approach. This beamformer, whose two stages are illustrated in Fig. 2, directly outputs $\mathbf{v}_{\mathrm{RF}}$ to solve (3). During the offline training stage, random channel samples are generated via simulation of the system model. The base station then applies a practical channel estimator to obtain partial channel state information. The mmWave channel estimator in [30] is adopted, where the mmWave channels are estimated by sending pilot symbols in a hierarchical codebook and then receiving the user’s decision feedback based on the received signal $r_p$. The estimated channel $\mathbf{h}^{\dagger}_{\mathrm{channel\_est}}$ and the estimated SNR $\gamma_{\mathrm{est}}$ are inputs for the DL-based beamformer with $\gamma_{\mathrm{est}} = \gamma$. By minimizing a loss function, the beamformer can then generate optimized beamforming vectors $\mathbf{v}_{\mathrm{RF}}$. As the SNR values and channel samples (called generated channels in this study) are produced randomly by the simulation, they can be used directly in the loss computation. By utilizing the estimated channels as the input and the generated channels in the loss function, the beamformer can be trained to get as close as possible to the ideal spectral efficiency with the estimated channels and to become robust to channel estimation errors. During the online deployment phase, the base station uses the same mmWave channel estimator. The estimated channels are then fed to the trained beamformer to obtain optimized beamforming vectors for maximizing spectral efficiency. It is important to note that generated channels are only necessary during the offline training stage to compute the loss. This is because all the parameters of the trained beamformer have already been fixed, and the trained beamformer is ready to accept practical mmWave channels as inputs to directly output beamforming vectors. Multiple offline training samples are necessary to ensure the generalizability of DL models, so 1e5 training samples and 5e3 test samples are used in this study.

FIGURE 2. The illustration of offline and online stages for DL-based beamformer [16].

FIGURE 3. The architecture of the reference DL model. [Layer labels in the figure: Input Layer (128), FC Layer 1 (256), FC Layer 2 (128), FC Layer 3 (64), Lambda Layer 1 (64), Lambda Layer 2 (1); two ReLU labels.]

The architecture of the reference model for the beamformer in [16] in the offline stage consists of six main layers, which are shown in Fig. 3. The inputs are the generated channels $\mathbf{h}_{\mathrm{channel}}$, the estimated SNRs $\gamma_{\mathrm{est}}$ (random in the training stage), and the estimated channels $\mathbf{h}_{\mathrm{channel\_est}}$, where the complex-valued $\mathbf{h}_{\mathrm{channel\_est}}$ with the size of $N_t = 64$ is separated into real and imaginary parts, and then these parts are concatenated into a vector with the size of 128.
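The objective in (2), the constant modulus constraint in (3), and the input encoding just described can be sketched in plain Python as follows. This is an illustrative sketch rather than the implementation of [16]: the function names are hypothetical, the SNR is assumed to be passed as a linear value (not in dB), and placing the real parts before the imaginary parts in the input vector is an assumption.

```python
import cmath
import math
from typing import List


def phases_to_beamformer(phases: List[float]) -> List[complex]:
    """Unit-modulus weights e^{j*theta}, so every entry satisfies the constraint in (3)."""
    return [cmath.exp(1j * theta) for theta in phases]


def spectral_efficiency(h_dag: List[complex], v_rf: List[complex], snr: float) -> float:
    """R in (2) for one channel h^dagger, one beamformer v_RF, and a linear SNR."""
    n_t = len(v_rf)
    inner = sum(h * v for h, v in zip(h_dag, v_rf))  # scalar h^dagger v_RF
    return math.log2(1.0 + snr / n_t * abs(inner) ** 2)


def encode_channel(h_est: List[complex]) -> List[float]:
    """Concatenate real and imaginary parts: N_t complex values -> 2 * N_t real inputs."""
    return [z.real for z in h_est] + [z.imag for z in h_est]


# Toy single-path channel with N_t = 64 and the phase-matched beamformer.
n_t = 64
h_dag = [cmath.exp(-1j * 0.3 * n) for n in range(n_t)]
v_matched = phases_to_beamformer([0.3 * n for n in range(n_t)])
snr_linear = 10.0 ** (0.0 / 10.0)  # 0 dB expressed as a linear ratio
rate = spectral_efficiency(h_dag, v_matched, snr_linear)
inputs = encode_channel(h_dag)
```

For this toy channel the matched phases align all 64 terms, so the inner product has magnitude 64 and the rate equals $\log_2(1 + 64\gamma)$, which is the largest value any unit-modulus beamformer can reach here. Working with phases rather than complex weights is what makes the constraint in (3) hold by construction.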

The output is the optimized analog beamforming vectors $\mathbf{v}_{\mathrm{RF}}$ that are applied to analog phase shifters. Besides, Lambda layer 1 is added to compute the complex-valued vectors $\mathbf{v}_{\mathrm{RF}}$ from $N_t$ real-valued phases so that the constant modulus constraint is satisfied. With $\mathbf{v}_{\mathrm{RF}}$, $\gamma_{\mathrm{est}}$, and $\mathbf{h}_{\mathrm{channel}}$ as inputs, Lambda layer 2 is used to compute the loss function, which is defined as:
$$\mathrm{Loss} = -\frac{1}{N_{\mathrm{sam}}} \sum_{n_s=1}^{N_{\mathrm{sam}}} \log_2\!\left(1 + \frac{\gamma_{n_s}}{N_t}\left\|\mathbf{h}^{\dagger}_{\mathrm{channel},n_s}\mathbf{v}_{\mathrm{RF},n_s}\right\|^2\right). \tag{4}$$
The loss function has a direct relationship with the objective function in (3) over $N_{\mathrm{sam}}$ training samples, where $\mathbf{v}_{\mathrm{RF},n_s}$, $\gamma_{n_s}$, and $\mathbf{h}^{\dagger}_{\mathrm{channel},n_s}$ represent the optimized analog beamforming vector, the SNR, and the generated channel associated with the $n_s$th sample. Note that a reduction in the loss corresponds exactly to an increase in the average spectral efficiency. The fully connected (FC) layers 1, 2, and 3 include 256, 128, and 64 neurons, respectively, with the corresponding activation functions shown in Fig. 3, and batch normalization layers precede these FC layers. The Adam optimizer is adopted with a learning rate of 0.001, and the channel samples are associated with different random SNRs between −20 dB and 20 dB.

IV. PROPOSED APPROACH
HPO approaches aim to improve DL architectures by identifying the best combinations of hyperparameters [7]. As shown in Fig. 4, our proposed approach is adopted to optimize hyperparameters for the DL model described in the previous section. The key ideas of the HPO problem and the algorithm are described in this section.

FIGURE 4. The utilization of the proposed approach for hyperparameter optimization.

A. FORMULATION OF HPO PROBLEM
The process of searching hyperparameter combinations involves four main parts [7]: an estimator (a classifier or regressor) with its fitness function, a search space (configuration space), an optimization or search method, and an evaluation function to evaluate how well various hyperparameter configurations perform. A hyperparameter’s domain can be categorical (e.g., type of optimizer), binary (e.g., whether to apply early stopping), discrete (e.g., number of clusters), or continuous (e.g., learning rate). For an HPO problem, in general, the aim is to obtain:
$$\mathbf{h}^{*} = \arg\min_{\mathbf{h} \in H} f(\mathbf{h}), \tag{5}$$
where $f(\mathbf{h})$ is the fitness function to be minimized, $\mathbf{h}^{*}$ is the hyperparameter vector that yields the optimum value of $f(\mathbf{h})$, and a hyperparameter vector $\mathbf{h}$ can take any value in the search space $H$. The goal of HPO is to tune hyperparameters within allowed budgets to produce optimal or nearly optimal model performance. Metrics such as accuracy, or a loss such as the root mean square error, can be used to evaluate the performance of the model. DL models are retrained whenever a new hyperparameter set is evaluated, and the validation set should be processed to produce a score that measures the model’s performance [7].

For DL models, the search space $H$ can include the number of filters, the size of the filters in convolutional layers, the dimensionality of the output in long short-term memory layers, the number of neurons in fully connected layers, activation functions, optimizers, and the learning rate. Assuming that a DL model requires optimizing $m$ different hyperparameters and that the domains of these hyperparameters are categorical and discrete, each hyperparameter has $n_i$ choices in the $i$-th search space $H_i$ for $i = 1, 2, \ldots, m$. Hence, the search space can be expressed as:
$$H = \begin{bmatrix} H_{1,1} & H_{1,2} & \cdots & H_{1,n_1} \\ H_{2,1} & H_{2,2} & \cdots & H_{2,n_2} \\ \vdots & \vdots & & \vdots \\ H_{m,1} & H_{m,2} & \cdots & H_{m,n_m} \end{bmatrix}. \tag{6}$$
The vector $\mathbf{h}^{*} = [h_1, h_2, \ldots, h_m]^{T}$ consists of $m$ optimal hyperparameters. To determine this vector, the index vector $\mathbf{k} = [k_1, k_2, \ldots, k_m]^{T}$, which includes $m$ values mapping to $H$, should be optimized. Each value $k_i$ in the index vector is less than or equal to the number of choices $n_i$ in $H_i$. For example, if the first search space $H_1$ has $n_1$ choices, $k_1$ is less than or equal to $n_1$, and the first optimal hyperparameter $h_1$ is $H_{1,k_1}$. Therefore, it is necessary to apply proper optimization methods to HPO problems to determine the index vector and then identify optimal hyperparameter configurations for DL models.

B. PROPOSED ALGORITHM
The proposed algorithm is developed based on the Binary Bat Algorithm (BBA) [31], which is one of the best metaheuristics for solving problems with discrete binary search spaces, to identify the optimal hyperparameter vector $\mathbf{h}^{*}$. However, other metaheuristics can be used in place of BBA within the proposed algorithm. The pseudocode is given in Algorithm 1, and its main steps are described below.
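Before the formal description, the overall search can be previewed with a compact Python sketch that follows the outline of Algorithm 1 and the update rules in (7) and (9)-(12). Several parts are assumptions made for illustration and are not taken from this paper: the small search space and its names, the toy fitness function that stands in for training and testing a DL model, the 0-based index rescaling in `decode` (used here instead of the exact scaling in (8) so that every bit pattern maps to a valid choice), the way bits are copied from Gbest in the pulse-rate step, and keeping the best solution found so far.

```python
import math
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical search space H: one row of choices per hyperparameter, as in (6).
SEARCH_SPACE: Dict[str, Sequence] = {
    "fc1_units": [64, 128, 256, 512],
    "activation": ["relu", "tanh", "sigmoid"],
    "learning_rate": [1e-4, 5e-4, 1e-3, 5e-3],
}


def bits_per_param(space: Dict[str, Sequence]) -> List[int]:
    """ceil(log2 n_i) bits per hyperparameter; their sum is d in (7)."""
    return [max(1, math.ceil(math.log2(len(c)))) for c in space.values()]


def decode(sol: List[int], space: Dict[str, Sequence]) -> Dict[str, object]:
    """Map a bit vector to one choice per hyperparameter."""
    h, start = {}, 0
    for (name, choices), b in zip(space.items(), bits_per_param(space)):
        value = int("".join(map(str, sol[start:start + b])), 2)
        # Assumption: rescale to a 0-based index so that every code is valid.
        k = round((len(choices) - 1) / (2 ** b - 1) * value)
        h[name] = choices[k]
        start += b
    return h


def bba_search(fitness: Callable[[Dict[str, object]], float],
               space: Dict[str, Sequence], num_pop: int = 20, iters: int = 15,
               q_min: float = 0.0, q_max: float = 2.0,
               pulse_rate: float = 0.5, seed: int = 0):
    """Binary-bat-style search that minimizes `fitness` over `space`."""
    rng = random.Random(seed)
    d = sum(bits_per_param(space))
    sols = [[rng.randint(0, 1) for _ in range(d)] for _ in range(num_pop)]
    vels = [[0.0] * d for _ in range(num_pop)]
    scores = [fitness(decode(s, space)) for s in sols]
    best = min(range(num_pop), key=lambda p: scores[p])
    g_best, g_score = sols[best][:], scores[best]
    for _ in range(iters):
        for p in range(num_pop):
            q = q_min + (q_max - q_min) * rng.random()               # (9)
            for j in range(d):
                vels[p][j] += (sols[p][j] - g_best[j]) * q             # (10)
                f_transfer = abs(2 / math.pi
                                 * math.atan(math.pi / 2 * vels[p][j]))  # (11)
                if rng.random() < f_transfer:                           # (12)
                    sols[p][j] = 1 - sols[p][j]
            if rng.random() > pulse_rate:
                # Assumed reading of the pulse-rate step: copy a random subset of Gbest bits.
                for j in rng.sample(range(d), rng.randint(1, d)):
                    sols[p][j] = g_best[j]
            scores[p] = fitness(decode(sols[p], space))
        best = min(range(num_pop), key=lambda p: scores[p])
        if scores[best] < g_score:  # keep the best-so-far solution (assumed)
            g_best, g_score = sols[best][:], scores[best]
    return decode(g_best, space), g_score


def toy_fitness(h: Dict[str, object]) -> float:
    """Stand-in for 'train, test, and score a DL model'; lower is better."""
    return (abs(math.log10(h["learning_rate"]) + 3.0)
            + abs(h["fc1_units"] - 256) / 256
            + (0.0 if h["activation"] == "relu" else 1.0))


best_h, best_score = bba_search(toy_fitness, SEARCH_SPACE)
```

In an actual run, `toy_fitness` would be replaced by a function that builds, trains, and tests the DL model for the decoded hyperparameters, which is by far the most expensive part of each iteration.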

Algorithm 1 The proposed algorithm for HPO.
Determine: datasets; the search space $H$; performance metrics; fitness function Fitness; number of populations (numPop); number of iterations; and dimension of solutions.
Initialize populations and then obtain $\mathbf{h}$ from solutions in initialized populations; train and test DL models with $\mathbf{h}$; and find the current best solution (the current $\mathbf{h}^{*}$).
Repeat
    Adjust frequency and update velocities; compute transfer function; and then update positions.
    if rand > pulse rate then
        Select randomly binary values among the best solutions (Gbest).
        Change binary values in sol with the selected binary values in Gbest.
    end if
    Obtain new $\mathbf{h}$ from current solutions; train and test DL models with new $\mathbf{h}$.
    Compute the fitness function; rank the bats and determine the current Gbest.
Until termination conditions are satisfied.
Obtain $\mathbf{h}^{*}$ from the final Gbest.
Train DL models with $\mathbf{h}^{*}$ and then use trained models to directly output beamforming vectors.

Initialization: First, the type of learning (supervised versus unsupervised) and datasets should be determined. Next, the search space $H$, such as the number of neurons in FC layers, activation functions, the number of choices or upper and lower limits for each hyperparameter, and whether to apply early stopping or not, should be defined. Because the goal of the optimization problem is to minimize the fitness function, this function is determined according to performance metrics such as the spectral efficiency on test datasets. After that, the number of populations and iterations are initialized, and the dimension of solutions of BBA ($d$) is calculated based on $H$ as follows:
$$d = \sum_{i=1}^{m} \left\lceil \log_2 n_i \right\rceil, \tag{7}$$
where $\lceil \cdot \rceil$ denotes rounding up to the nearest integer. The bats’ solutions, which are binary numbers, are initialized randomly. The solutions or bats’ positions, sol, are a binary number vector, so they should be converted to a decimal number vector, that is, $\mathbf{k}$. For $i = 1, 2, \ldots, m$, the element $k_i$ in $\mathbf{k}$ is determined as follows:
$$k_i = \left\lfloor \frac{n_i}{2^{\lceil \log_2 n_i \rceil} - 1}\, \mathrm{int}(sol) \right\rceil, \tag{8}$$
where $\lfloor \cdot \rceil$ and $\mathrm{int}(\cdot)$ denote rounding to the nearest integer and converting to an integer, respectively. Next, the hyperparameter vector $\mathbf{h}$ can be obtained by mapping $\mathbf{k}$ into $H$. At this point, DL models can be built with $\mathbf{h}$, then trained and tested to find the current best hyperparameter vector based on performance metrics.

Finding the best hyperparameters: The search operation of BBA is implemented. For the $p$-th bat with $p = 1, 2, \ldots, numPop$, the frequency $Q_p$ and the velocity $V_p^{iter}$ at the $iter$-th iteration are updated as follows:
$$Q_p = Q_{\min} + \left(Q_{\max} - Q_{\min}\right) rand, \tag{9}$$
$$V_p^{iter} = V_p^{iter-1} + \left(sol_p^{iter-1} - Gbest\right) Q_p, \tag{10}$$
where $Q_{\min}$, $Q_{\max}$, $Gbest$, and $rand$ are the minimum frequency, the maximum frequency, the current best solution, and a random value drawn from the uniform distribution on $(0, 1)$, respectively. To map velocity values to binary values for updating the positions, that is, to force bats to move in a binary space, the following V-shaped transfer function is used to update the position of the $p$-th bat:
$$F_{\mathrm{transfer}}\!\left(V_p^{iter}\right) = \left| \frac{2}{\pi} \arctan\!\left( \frac{\pi}{2} V_p^{iter} \right) \right|, \tag{11}$$
$$sol_p^{iter} = \begin{cases} \left(sol_p^{iter-1}\right)^{-1} & \text{if } rand < F_{\mathrm{transfer}}\!\left(V_p^{iter}\right) \\ sol_p^{iter-1} & \text{if } rand \ge F_{\mathrm{transfer}}\!\left(V_p^{iter}\right) \end{cases}, \tag{12}$$
where $(\cdot)^{-1}$ indicates the complement of binary numbers. If $rand$ is greater than pulse rate, binary numbers in sol are replaced with randomly selected binary values in Gbest so that the local solution, sol, moves towards the current best solution Gbest, where pulse rate represents the pulse emission rate of bats. At the step of obtaining new $\mathbf{h}$ from current solutions, $\mathbf{h}$ is derived in the same manner as explained above. The operation is finished when the termination conditions are satisfied. In this paper, the optimization process is terminated after 15 iterations, a number chosen based on experiments.

Building, training, and employing DL models with optimized hyperparameters: From the best solution (binary numbers), the best hyperparameter vector $\mathbf{h}^{*}$ can be obtained. Next, the optimal DL model is built and trained. Finally, the trained model is used to output beamforming vectors.

V. RESULTS AND COMPARATIVE ANALYSIS
The efficiency of the proposed approach is evaluated in this section. Firstly, the reference model’s parameters, BBA’s parameters, and $H$ are described. Next, the

convergence ability is demonstrated. Finally, the proposed approach-based model is compared to the reference model and the Hyperband approach-based model in terms of maximizing spectral efficiency. In all figures, reference, Hyperband, and proposal refer to the reference model, the Hyperband approach-based model, and the proposed approach-based model, respectively.

A. PARAMETER SETUP
This study focuses on verifying the proposed approach, so we use the same datasets as used for the reference model. Datasets, source code, and trained weights for the reference model are publicly provided by the authors in [16]. The number of total paths ($L$) is 3, and the channel samples are estimated with a pilot-to-noise power ratio of 20 dB.

BBA is a metaheuristic, and the maximum number of iterations and the population size are two factors that have a close relationship with a metaheuristic's performance [8]. Based on experiments, we have determined that the population size and the maximum number of iterations should be 20 and 15, respectively, for this problem. The termination condition is that all iterations have been completed. Other parameters are set as suggested by [31]: pulse rate = 0.5; $Q_{min} = 0$; $Q_{max} = 2$. The illustrated results are the average value of 20 independent runs.

This study verifies the proposed approach by optimizing hyperparameters of DL models that have six main layers, the same as the model of the reference beamformer. The search space $H$, which is expressed in (13), includes the number of neurons in the first two FC layers (corresponding to the first two rows), activation functions after the first two FC layers (the fourth row), optimizers (the fifth row), and the initial learning rate (the last row). The order of hyperparameters in the search space is not required to follow the order of the layers in the reference DL model. These hyperparameters are determined by empirical trials in [16], so they will be optimized by our proposed approach to approach the ideal spectral efficiency. Assuming that each hyperparameter has 4 choices, the dimension of one solution $d$ is 12, calculated by (7).

$$H = \begin{bmatrix} 128 & 192 & 256 & 320 \\ 64 & 96 & 128 & 192 \\ \text{ELU} & \text{ReLU} & \text{Sigmoid} & \text{Tanh} \\ \text{AdaMax} & \text{Adam} & \text{RMSprop} & \text{Nadam} \\ 10^{-4} & 5 \times 10^{-4} & 10^{-3} & 5 \times 10^{-3} \end{bmatrix}. \tag{13}$$

The network complexity of DL models increases proportionally with the number of neurons, so the search space for the number of neurons in the first two FC layers is set to values in a range that includes the number of neurons set in the reference model. Exponential Linear Unit (ELU), Sigmoid, Rectified Linear Unit (ReLU), and Tanh are the most prevalent non-linearity layers and are proven to be effective solutions to non-zero mean and zero gradient problems, as well as the accuracy versus training time tradeoff [32]. AdaMax, Adam, RMSProp, and Nadam are the most efficient and widely used optimization algorithms in DL [33]-[35]. A large learning rate helps the model to learn quicker at the expense of arriving at a suboptimal final set of weights. A smaller learning rate may enable the model to acquire a more optimum or even globally optimal set of weights, but it may require much more time to train [36]. The learning rate range taken into consideration is from 1e−4 to 5e−3, including the learning rate which is set in the reference model.

The fitness function for the proposed algorithm is built based on the spectral efficiency function in (2) as follows:

$$\text{Fitness} = \frac{1}{\sum_{snr=-20}^{20} R_{snr}}. \tag{14}$$

The spectral efficiency is evaluated on test datasets with SNRs from −20 dB to 20 dB in steps of 5 dB. Note that the spectral efficiency increases as the fitness function decreases.

B. CONVERGENCE CHARACTERISTICS
In this subsection, the convergence ability and the training loss produced by DL models on test datasets are evaluated. The values of the fitness function in Fig. 5 indicate that the proposed approach nearly converges after the 6th iteration with the value of −16.728 dB and decreases only insignificantly from the 7th iteration onwards. This means that at the 6th iteration, the proposed approach can figure out the optimized hyperparameters that are listed in Table 1. Fig. 6 compares the training loss between the reference model, the proposed approach-based model, and the Hyperband approach-based model, where the optimized hyperparameters of these DL models are in Table 1. Both HPO approach-based models achieve lower loss values and converge faster than the reference model even though both have more trainable parameters. However, the proposed approach-based model achieves −5.302, while the Hyperband approach-based model achieves −5.252 and the reference model −5.136.

C. SPECTRAL EFFICIENCY CHARACTERISTICS
This subsection compares the achievable spectral efficiency between the reference model, the Hyperband approach-based model, and the proposed approach-based model. The spectral efficiency versus SNR performance in Fig. 7 shows that the proposed approach-based model produces higher spectral efficiency than the reference model. To obtain 9.72 bits/s/Hz, for example, the optimized model achieves a gain of around 1 dB in SNR over the reference model. Besides, the proposed approach-based model is also slightly better than the Hyperband approach-based model in terms of spectral efficiency.

There are estimation errors in estimating $L$ in practical systems. Owing to the estimation complexity and the sparsity of mmWave channels, the estimated number of channel paths should be set to a small value [30]. Moreover, $L$ in practice often differs from that in training, so the consideration of the mismatch between training and

deployment plays an important role. Assume that the online deployment stage's channel model has three paths ($L = 3$), but the DL-based models are trained with $L_{Tr}$ paths. The impact of the channel model's mismatch between training and deployment stages is depicted in Fig. 8. This figure demonstrates the achievable spectral efficiency with the output of the DL-based models which have been trained with $L_{Tr} = 2, 3$, respectively. Even though there is a model mismatch when $L_{Tr} \neq L$, the performance loss between the training and deployment stages is limited, which indicates the robustness and generalizability of DL-based models to the model mismatch issue. In these models, the proposed approach-based model produces higher spectral efficiency than the reference model by about 0.041 to 0.304 bits/s/Hz.

FIGURE 5. The fitness function over 15 iterations (fitness function in dB versus iteration).

FIGURE 6. The training loss versus epochs (reference, Hyperband, and proposed models).

FIGURE 7. The spectral efficiency versus SNR (spectral efficiency in bits/s/Hz versus SNR in dB; reference, Hyperband, and proposed models).

FIGURE 8. The impact of the channel model's mismatch (spectral efficiency in bits/s/Hz versus SNR in dB; proposed and reference models, $L_{Tr} = 2, 3$).

FIGURE 9. The distribution of the spectral efficiency (spectral efficiency in bits/s/Hz for the proposed, Hyperband, and reference models; 25%~75%, range within 1.5 IQR, and median).

TABLE 1. Hyperparameters for three DL models.

| Hyperparameter | Reference model | Hyperband-based model | Proposed-based model |
|---|---|---|---|
| Number of neurons in 1st FC layer | 256 | 320 | 320 |
| Number of neurons in 2nd FC layer | 128 | 192 | 192 |
| Activation function | ReLU | Sigmoid | Sigmoid |
| Optimizer | Adam | RMSprop | Nadam |
| Initial learning rate | 1e−3 | 1e−3 | 5e−3 |
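The encoding in (7) and the fitness in (14) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the decoder assumes every hyperparameter has a power-of-two number of choices (so each bit group indexes a choice directly and the scaling step of (8) is not reproduced), the spectral efficiency values are made up for illustration, and the dB conversion assumes $10\log_{10}$.

```python
import math


def solution_dimension(n_choices):
    """Eq. (7): number of bits needed to encode one choice per hyperparameter."""
    return sum(math.ceil(math.log2(n)) for n in n_choices)


def decode_indices(bits, n_choices):
    """Read each bit group of a binary solution as a zero-based choice index.

    Assumption: each n_i is a power of two, so no rescaling is needed.
    """
    if len(bits) != solution_dimension(n_choices):
        raise ValueError("solution length does not match the search space")
    indices, pos = [], 0
    for n in n_choices:
        width = math.ceil(math.log2(n))
        group = bits[pos:pos + width]
        value = int("".join(str(b) for b in group), 2) if width else 0
        if value >= n:
            raise ValueError("bit group does not map to a valid choice")
        indices.append(value)
        pos += width
    return indices


def fitness(spectral_efficiencies):
    """Eq. (14): reciprocal of the spectral efficiency summed over the test SNRs."""
    return 1.0 / sum(spectral_efficiencies)


def to_db(value):
    """Assumed conversion of the fitness value to dB."""
    return 10.0 * math.log10(value)


# Six hyperparameters with four choices each, as implied by d = 12 in the text.
n_choices = [4] * 6
print(solution_dimension(n_choices))

# An arbitrary 12-bit solution, decoded into one index per hyperparameter.
bits = [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0]
print(decode_indices(bits, n_choices))

# Made-up spectral efficiencies at SNR = -20, -15, ..., 20 dB (nine points).
rates = [0.5, 1.0, 2.0, 3.0, 4.5, 6.0, 8.0, 10.0, 11.0]
print(fitness(rates), to_db(fitness(rates)))
```

Because the fitness is the reciprocal of a sum of rates, any hyperparameter choice that raises the spectral efficiency at some SNR lowers the fitness, which is why the search minimizes it.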

Fig. 9 shows violin plots, and Table 2 shows the median and the first and third quartiles of the distribution of the spectral efficiency at SNR = 5 dB for the three DL models. The median and the first and third quartiles of the proposed approach-based model are 7.616, 5.538, and 7.663, respectively, which are higher than those of both the reference model and the Hyperband-based model. Moreover, the shape of the distribution from the minimum value to the first quartile in Fig. 9 indicates that the distribution of the spectral efficiency of the proposed approach-based model is thinner than those of the other two models and is more highly concentrated around the median than that of the reference model.

TABLE 2. The median, the first and the third quartiles produced by three DL models.

| Parameter | Reference model | Hyperband-based model | Proposed-based model |
|---|---|---|---|
| Median | 7.340 | 7.593 | 7.616 |
| The first quartile | 4.826 | 5.442 | 5.538 |
| The third quartile | 7.596 | 7.661 | 7.663 |

In practice, the resolution of phase shifters is limited. When the beamforming coefficients, or the outputs of the DL models, are quantized with $q$ bits, the spectral efficiency performance as a function of the number of bits is considered, as shown in Fig. 10. As $q$ increases, the performance loss lessens, and it is negligible when $q \geq 4$. For SNR = 5 dB, moreover, the proposed approach-based model is better than both the Hyperband approach-based model and the reference model in terms of spectral efficiency. With $q = 3$, for example, the proposed approach-based model achieves 6.412 bits/s/Hz, while the Hyperband approach-based model and the reference model only achieve 6.365 and 6.104 bits/s/Hz, respectively.

FIGURE 10. The spectral efficiency performance versus the resolution of phase shifters (spectral efficiency in bits/s/Hz versus number of quantization bits, $q$; reference, Hyperband, and proposed models, each with $q = \infty$ and quantized).

Once the DL model is trained in the offline stage, this model will be adopted to output beamforming vectors. Therefore, the computational time for yielding these vectors should be carefully considered in the online stage. Table 3 shows the average computational time in milliseconds to output beamforming vectors on 5000 test samples over 1000 independent runs on computers equipped with an NVIDIA T4 Tensor Core GPU. The proposed approach-based model not only takes less time than the other two models but also achieves higher spectral efficiency.

TABLE 3. Computational time (in milliseconds) to output beamforming vectors.

| | Reference model | Hyperband-based model | Proposed-based model |
|---|---|---|---|
| Time | 631.187 | 628.080 | 626.956 |

VI. DISCUSSION
This study has combined metaheuristics and DL in a manner that facilitates synergy between these two approaches to propose an HPO approach. This combination solves not only the HPO problem but also the following problems [37]: training DL models, architecture optimization (architecture search), and optimization at feature representation levels. Interestingly, these types of optimization problems are amenable to solutions via metaheuristic algorithms. Based on the knowledge of solutions, the selection operators of metaheuristic algorithms direct the search towards promising regions in the search space, making them efficient approaches for solving challenging problems.

Besides, in recent years, there has been considerable interest in DL due to its ability to develop intelligent systems that can make effective decisions and accurate predictions. DL approaches help significantly enhance efficiency compared to conventional communication systems [28]. Therefore, the proposed approach can be considered a premise for optimizing hyperparameters for various DL-based problems in general, not just problems for mmWave communication systems.

Although the proposed approach is specifically verified in this study by optimizing main hyperparameters such as the number of neurons in FC layers and activation functions, it can have good generality for more complex models and problems. For instance, the proposed approach can be used to optimize hyperparameters in convolutional and long short-term memory neural networks. In [38], for example, the following hyperparameters can be optimized: the number of filters, the size of pooling windows in convolutional neural network modules, and the output size of long short-term memory modules. Eventually, DL models built with optimized hyperparameters will output predictive beamforming matrices. These matrices are utilized to approach achievable sum rates of the upper bound method for vehicular networks with the integration of sensing and communication.
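Returning to the finite-resolution phase shifters considered in Fig. 10, the following Python sketch hints at why the loss shrinks as $q$ grows. It assumes uniform quantization of each phase to the nearest of $2^q$ levels on $[0, 2\pi)$; the paper does not spell out its quantizer, so this scheme, the function names, and the array size of 64 are illustrative assumptions. Under this assumption the worst-case phase error is bounded by $\pi / 2^q$, so it halves with every extra bit.

```python
import math
import random


def quantize_phase(theta, q):
    """Round a phase in radians to the nearest of 2**q uniform levels (assumed scheme)."""
    step = 2.0 * math.pi / (2 ** q)
    return (round(theta / step) * step) % (2.0 * math.pi)


def angular_error(a, b):
    """Smallest absolute angular distance between two phases, accounting for wrap-around."""
    diff = (a - b) % (2.0 * math.pi)
    return min(diff, 2.0 * math.pi - diff)


def max_quantization_error(phases, q):
    """Largest phase error introduced by q-bit quantization over a set of phases."""
    return max(angular_error(p, quantize_phase(p, q)) for p in phases)


random.seed(0)
# Random phases standing in for the phase-shifter settings of one beamforming vector.
phases = [random.uniform(0.0, 2.0 * math.pi) for _ in range(64)]

for q in range(1, 9):
    bound = math.pi / (2 ** q)
    print(q, max_quantization_error(phases, q), bound)
```

The sketch only tracks phase error; mapping that error to spectral efficiency would require the channel model and the rate expression in (2), which are not reproduced here.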

VII. CONCLUSION
This study has proposed an HPO approach based on metaheuristics for DL models. The proposed approach was applied to optimizing hyperparameters in DL models that aim to output optimized beamforming coefficients to approach the ideal spectral efficiency in mmWave communication systems with large-scale antenna arrays. Results have shown the ability to optimize hyperparameters and provided an insightful solution to forthcoming HPO problems. Comparative analysis has also indicated that the proposed approach-based models can produce higher spectral efficiency than the Hyperband approach-based models and the reference model. As for future work, it would be interesting to apply the proposed approach to more complex DL models and beamforming problems using hybrid beamforming architectures for reconfigurable intelligent surfaces, and integrated sensing and communication in 6G wireless communication systems.

REFERENCES
[1] W. Hong et al., “The role of millimeter-wave technologies in 5G/6G wireless communications,” IEEE Journal of Microwaves, vol. 1, no. 1, pp. 101–122, 2021.
[2] L. Zhu et al., “Millimeter-wave communications with non-orthogonal multiple access for B5G/6G,” IEEE Access, vol. 7, pp. 116123–116132, 2019.
[3] M.Y. Javed et al., “Wideband inter-beam interference cancellation for mmW/Sub-THz phased arrays with squint,” IEEE Transactions on Vehicular Technology, pp. 1–13, 2023.
[4] Y. Shi et al., “Deep learning for large-scale optimization in 6G wireless networks,” arXiv, 2023. doi: 10.48550/arXiv.2301.03377.
[5] H. Huang et al., “Deep-learning-based millimeter-wave massive MIMO for hybrid precoding,” IEEE Transactions on Vehicular Technology, vol. 68, no. 3, pp. 3027–3032, 2019.
[6] B. Bischl et al., “Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges,” WIREs Data Mining and Knowledge Discovery, 2023.
[7] L. Yang and A. Shami, “On hyperparameter optimization of deep learning algorithms: Theory and practice,” Neurocomputing, vol. 415, pp. 295–316, 2020.
[8] Q. Li et al., “Influence of initialization on the performance of metaheuristic optimizers,” Applied Soft Computing, vol. 91, p. 106193, 2020.
[9] I. Boussaïd, J. Lepagnot, and P. Siarry, “A survey on optimization metaheuristics,” Information Sciences, vol. 237, pp. 82–117, Jul. 2013. doi: 10.1016/j.ins.2013.02.041.
[10] T.V. Luyen et al., “Null-steering beamformers for suppressing unknown direction interferences in sidelobes,” Journal of Communications, pp. 600–607, 2022.
[11] T. Dokeroglu et al., “A survey on new generation metaheuristic algorithms,” Computers and Industrial Engineering, vol. 137, p. 106040, 2019.
[12] T.V. Luyen and N.V. Cuong, “An effective beamformer for interference suppression without knowing the direction,” Journal of Electrical and Computer Engineering, vol. 13, pp. 601–610, 2022.
[13] K. Hussain et al., “Metaheuristic research: a comprehensive survey,” Artificial Intelligence Review, vol. 52, no. 4, pp. 2191–2233, 2018.
[14] H.M. Kha et al., “A null synthesis technique-based beamformer for uniform rectangular arrays,” 2022 International Conference on Advanced Technologies for Communications, pp. 13–17, 2022.
[15] L. Li et al., “Hyperband: A novel bandit-based approach to hyperparameter optimization,” Journal of Machine Learning Research, vol. 18, pp. 1–52, 2018.
[16] T. Lin and Y. Zhu, “Beamforming design for large-scale antenna arrays using deep learning,” IEEE Wireless Communications Letters, vol. 9, no. 1, pp. 103–107, 2020.
[17] Nakisa et al., “Long short term memory hyperparameter optimization for a neural network based emotion recognition framework,” IEEE Access, vol. 6, pp. 49325–49338, 2018.
[18] N. Bacanin et al., “Optimizing convolutional neural network hyperparameters by enhanced swarm intelligence metaheuristics,” Algorithms, vol. 13, no. 3, p. 67, 2020.
[19] F.C. Soon et al., “Hyper-parameters optimisation of deep CNN architecture for vehicle logo recognition,” IET Intelligent Transport Systems, vol. 12, no. 8, pp. 939–946, 2018.
[20] N.R. Sabar et al., “An evolutionary hyper-heuristic to optimise deep belief networks for image reconstruction,” Applied Soft Computing, vol. 97, p. 105510, 2020.
[21] V. Raj, N. Nayak, and S. Kalyani, “Deep reinforcement learning based blind mmWave MIMO beam alignment,” IEEE Transactions on Wireless Communications, vol. 21, no. 10, pp. 8772–8785, 2022.
[22] H. Huang et al., “Unsupervised learning-based fast beamforming design for downlink MIMO,” IEEE Access, pp. 7599–7605, 2019.
[23] X. Li and A. Alkhateeb, “Deep learning for direct hybrid precoding in millimeter wave massive MIMO systems,” 2019 53rd Asilomar Conference on Signals, Systems, and Computers, 2019.
[24] H. Huang et al., “Deep-learning-based millimeter-wave massive MIMO for hybrid precoding,” IEEE Transactions on Vehicular Technology, vol. 68, no. 3, pp. 3027–3032, 2019.
[25] Q. Wang et al., “PrecoderNet: Hybrid beamforming for millimeter wave systems with deep reinforcement learning,” IEEE Wireless Communications Letters, vol. 9, no. 10, pp. 1677–1681, 2020.
[26] A.M. Elbir and K.V. Mishra, “Joint antenna selection and hybrid beamformer design using unquantized and quantized deep learning networks,” IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp. 1677–1688, 2020.
[27] P. Dong et al., “Deep CNN-based channel estimation for mmWave massive MIMO systems,” IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 5, pp. 989–1000, 2019.
[28] A. Alkhateeb et al., “Deep learning coordinated beamforming for highly-mobile millimeter wave systems,” IEEE Access, pp. 37328–37348, 2018.
[29] A.M. Elbir, “A deep learning framework for hybrid beamforming without instantaneous CSI feedback,” IEEE Transactions on Vehicular Technology, vol. 69, no. 10, pp. 11743–11755, 2020.
[30] A. Alkhateeb et al., “Channel estimation and hybrid precoding for millimeter wave cellular systems,” IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 5, pp. 831–846, 2014.
[31] S. Mirjalili et al., “Binary bat algorithm,” Neural Computing and Applications, vol. 25, no. 3, pp. 663–681, 2014.
[32] S.R. Dubey, S.K. Singh, and B.B. Chaudhuri, “Activation functions in deep learning: A comprehensive survey and benchmark,” Neurocomputing, vol. 503, pp. 92–108, 2022.
[33] T. Dozat, “Incorporating Nesterov momentum into Adam,” International Conference on Learning Representations, 2016.
[34] D. Soydaner, “A comparison of optimization algorithms for deep learning,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 34, no. 13, p. 2052013, 2020.
[35] D.P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv, 2014. doi: 10.48550/arXiv.1412.6980.
[36] M.D. Zeiler, “Adadelta: An adaptive learning rate method,” arXiv, 2012. doi: 10.48550/arXiv.1212.5701.
[37] B. Akay, D. Karaboga, and R. Akay, “A comprehensive survey on optimizing deep learning models by metaheuristics,” Artificial Intelligence Review, vol. 55, no. 2, pp. 829–894, 2022.
[38] C. Liu et al., “Predictive beamforming for integrated sensing and communication in vehicular networks: A deep learning approach,” IEEE International Conference on Communications, pp. 1948–1954, 2022.

KIEU-XUAN THUC (Member, IEEE) received the B.E. and M.S. degrees from Hanoi University of Technology, Viet Nam, in 1999 and 2004, respectively. In 1999, he joined Hanoi University of Industry. He received his Ph.D. degree from the University of Ulsan in February 2012. His research interests include intelligent signal processing algorithms and next-generation wireless communication systems.

HOANG MANH KHA (Member, IEEE) received the B.E. and M.E. degrees in Electronics and Telecommunications Engineering, both from Hanoi University of Science and Technology, in 2002 and 2004, respectively. He obtained his Ph.D. degree in Communications Engineering from the University of Paderborn, Germany, in 2016. His research interests include digital signal processing, wireless communication, positioning engineering, deep learning, pattern classification, and metaheuristics.

NGUYEN VAN CUONG received the engineer's degree and master's degree, both from Hanoi University of Industry, in 2020 and 2022, respectively. His research interests include beamforming for antenna arrays, smart antennas, nature-inspired optimization algorithms, deep learning, and convex optimization.

TONG VAN LUYEN (Member, IEEE) received the B.S. and M.S. degrees from the Hanoi University of Science and Technology, in 2002 and 2004, respectively, and the Ph.D. degree from VNU University of Engineering and Technology in 2019. His research interests are in beamforming and beam-steering for antenna arrays, smart antennas, optimum array processing, nature-inspired optimization algorithms, and artificial intelligence.
