[Aide300][Group 5] Final Report
Peer Evaluation
No. Full Name ID
(0% - 100%)
LIST OF FIGURES
Figure 1. Result of checking for missing values
Figure 2. Result of checking for outliers
Figure 3. The open prices of Amazon stock from 2020 to 2025
Figure 4. The high prices of Amazon stock from 2020 to 2025
Figure 5. The low prices of Amazon stock from 2020 to 2025
Figure 6. The close prices of Amazon stock from 2020 to 2025
Figure 7. Amazon’s stock exchange volume from 2020 to 2025
Figure 8. Amazon's stock close distribution
Figure 9. Amazon's distribution of volume of stock trading
Figure 10. Correlation matrix of features
LIST OF TABLES
Table 1. Predicted results
ABSTRACT
This report forecasts Amazon's (AMZN) stock price with a deep learning model, the
Bidirectional Long Short-Term Memory (Bi-LSTM) network. Traditional econometric
techniques such as ARIMA and Ordinary Least Squares (OLS) are compared with Bi-LSTM
to show their respective advantages and drawbacks. The model uses historical stock data
and integrates features such as moving averages, volatility metrics, and market sentiment
indicators to improve the accuracy of short-term price predictions. This research
contributes to the rapidly growing field of stock price prediction by employing deep
learning techniques to identify complex, non-linear relationships in stock price
fluctuations that conventional models fail to handle. Performance evaluation criteria,
including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R-squared
(R²), demonstrate that the Bi-LSTM model outperforms the ARIMA and OLS models in
stock price prediction. The report also indicates that Bi-LSTM provides a strong
framework for predicting stock price movements, especially in volatile markets, and
recommends adding external data sources, such as sentiment analysis and macroeconomic
factors, to further enhance prediction accuracy. This methodology offers useful
information for investors and analysts seeking more effective stock price prediction
tools.
CHAPTER 1: INTRODUCTION
1.2. Overview of Amazon (AMZN)
Amazon (AMZN) is one of the largest and most influential companies in the
world, dominating many industries, including e-commerce, cloud computing, artificial
intelligence, and digital advertising. As a NASDAQ-listed company, Amazon's stock
price is closely followed by institutional investors, hedge funds, and retail traders.
Predicting the Amazon (AMZN) stock price is not only important for investors but also
provides insights into the broader technology and consumer markets, as Amazon plays a
significant role in shaping global economic trends (Gupta & Chen, 2020).
Amazon’s stock price is influenced by several factors, including:
● Quarterly earnings reports: Amazon’s financial performance in key areas such as
Amazon Web Services (AWS), Prime membership, and advertising significantly
impacts the company’s stock price (Zhang, 2024).
● Macroeconomic conditions: Changes in interest rates set by the Federal Reserve
(FED) and general market sentiment affect Amazon’s valuation (Fama, 1970).
● Competitive landscape: Amazon faces competition from Walmart in e-commerce,
Microsoft in cloud computing, and Google in digital advertising, which could
impact investor confidence and stock performance (Ma et Y., 2024).
● Technology development: Investments in artificial intelligence (AI), logistics, and
automation can lead to large swings in the stock price (Bernard & Obinna, 2024).
Over the years, Amazon has diversified its revenue streams, making its stock an
interesting case for deep learning-based forecasting. The stock is known for its high
volatility, often experiencing sharp price swings following earnings reports, policy
changes, or macroeconomic events. Therefore, predicting AMZN stock prices using
advanced AI techniques such as Bi-LSTM poses a significant challenge while providing
valuable insights into the performance of state-of-the-art financial forecasting models
(Selvin et al., 2017).
This study applies a Bi-LSTM deep learning model to forecast Amazon stock prices
over the next five days using historical stock data from Yahoo Finance, aiming to
improve the accuracy of short-term predictions in technology-driven financial markets.
1.3. Previous studies
Despite considerable advances in stock price prediction, a gap remains in the
literature regarding the application of deep learning techniques, especially Bi-LSTM
networks, to short-term stock price movements. Most studies have focused on long-term
predictions or used more traditional models such as linear regression or ARIMA
(Kuang, 2023). However, deep learning models, including Bi-LSTM, have shown great
promise in capturing the temporal dependencies and nonlinear patterns in stock price
data that traditional models often overlook (Chong et al., 2017; Kim & Shin, 2019).
In addition, while several studies have focused on predicting stock prices for large
companies such as Apple or Tesla, few studies have specifically addressed Amazon
(AMZN) stock in the context of deep learning and short-term price forecasting. This
study aims to fill this gap by focusing on Amazon stock prices, using historical price data
and sentiment analysis to improve predictions (Nelson et al., 2017).
1.4. Project overview
The main objective of this paper is to develop a deep-learning model that can
predict the daily stock price of Amazon (AMZN) for the next five days using historical
stock data obtained from Yahoo Finance. This paper will:
● Apply advanced deep learning techniques, especially Bi-LSTM networks, to
forecast the short-term stock price of AMZN.
● Compare the performance of Bi-LSTM models with traditional econometric
models such as ARIMA and Linear Regression.
● Provide detailed information on the performance of the developed models and
evaluate their accuracy through performance metrics such as Mean Absolute Error
(MAE), Root Mean Square Error (RMSE), and R².
● Contribute to the understanding of how deep learning can be used to predict stock
prices more effectively, especially in volatile markets, thus potentially benefiting
investors and analysts.
By the end of the project, we hope to provide a robust predictive model for
Amazon stock, generating predictions that can aid decision-making for investors and
financial professionals.
1.5. Disclaimer
The code used in this report was generated with the help of AI models, including
ChatGPT-4o and DeepSeek. The AI-generated code was used to develop a model to
predict Amazon's (AMZN) stock price using advanced deep learning techniques.
The AI prompts used to generate the code included:
● Creating a Python script to forecast stock prices on Google Colab using the best
performing model.
● Implementing key steps such as feature engineering, data preprocessing, model
training, and evaluating deep learning techniques such as Bi-LSTM, along with
traditional models such as ARIMA and Linear Regression.
● Retrieving Amazon stock data from Yahoo Finance and calculating technical
indicators, including moving averages, MACD, Bollinger Bands, and ATR.
● Using performance metrics such as Mean Absolute Error (MAE), Root Mean Square
Error (RMSE), and R-squared (R²) to evaluate different models and select the
most accurate one.
● Performing correlation analysis to identify the most relevant features and prevent
multicollinearity.
● Using the selected model to forecast Amazon's stock price for the next five days and
visualizing the results using Matplotlib and Seaborn.
While the AI-generated code is used as a foundation, additional modifications and
optimizations have been made to ensure accuracy and efficiency. The predictions and
analysis in this report are for research purposes only and should not be considered
financial advice. Investors and analysts should conduct their own independent
evaluations before making any financial decisions.
CHAPTER 2: PREDICTIVE MODEL BUILDING
We use the head() and tail() functions to print out the first and last 5 lines of the
data, which helps to confirm that the data has been collected correctly and is in the right
structure. Here is how to do it:
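A minimal sketch of this check, assuming the downloaded prices are held in a pandas DataFrame named data. The small frame built here is a hypothetical stand-in for the Yahoo Finance download, and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the downloaded AMZN data (values are made up).
dates = pd.bdate_range("2020-03-16", periods=10)
data = pd.DataFrame(
    {
        "Open": [float(90 + i) for i in range(10)],
        "High": [float(92 + i) for i in range(10)],
        "Low": [float(89 + i) for i in range(10)],
        "Close": [float(91 + i) for i in range(10)],
        "Volume": [1_000_000 + 10_000 * i for i in range(10)],
    },
    index=dates,
)

first_rows = data.head()  # first 5 rows by default
last_rows = data.tail()   # last 5 rows by default
print(first_rows)
print(last_rows)
```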
The results show that the obtained data is in the correct format and can be used for
further analysis.
We continue to check the data type of the columns in the dataset using the info()
method. This helps to confirm that the columns have the correct data type, for example,
the time column (index) must be in datetime format.
After checking the basic structure of the data, we proceed to check for missing
values in the dataset using the isnull().sum() method. If there are missing values, we will
handle them, possibly by deleting the missing rows or filling in replacement values.
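The missing-value check and the two handling options mentioned above could look roughly as follows. The frame is a hypothetical example with two gaps, and forward filling is only one possible choice of replacement value, assumed here for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two gaps, standing in for the downloaded data.
data = pd.DataFrame(
    {
        "Close": [100.0, np.nan, 102.0, 103.0],
        "Volume": [1000.0, 1100.0, np.nan, 1300.0],
    },
    index=pd.bdate_range("2020-03-16", periods=4),
)

# Count missing values in each column.
missing_per_column = data.isnull().sum()
print("Missing values per column:\n", missing_per_column)

# Option 1: drop every row that contains a missing value.
dropped = data.dropna()

# Option 2: carry the last known value forward (an assumed fill strategy).
filled = data.ffill()
```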
def outlier_detection(data, columns):
    result_df = pd.DataFrame(index=data.describe().index)
    for col in columns:
        IQR = data[col].quantile(0.75) - data[col].quantile(0.25)
        lower_bound = data[col].quantile(0.25) - 1.5 * IQR
        upper_bound = data[col].quantile(0.75) + 1.5 * IQR
        # Find the outliers
It can be seen from the output of the code that the data obtained is in the correct
format and can be used for analysis.
Figure 1. Result of checking for missing values
Step 4: Check for duplicate values
To ensure the reliability of the data, we called the pandas method
"data.duplicated" to check for duplicate rows and included code to remove any that
were found. The check returned no duplicates in our data set, so no rows needed to be
removed.
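A small sketch of such a duplicate check is shown here on a hypothetical frame that contains one repeated row. Note that duplicated() compares column values only, so this sketch moves the date index into a column first; whether the original code did the same is not stated and is an assumption:

```python
import pandas as pd

# Hypothetical data with one repeated trading day.
idx = pd.to_datetime(["2020-03-16", "2020-03-17", "2020-03-17", "2020-03-18"])
data = pd.DataFrame({"Close": [100.0, 101.0, 101.0, 102.0]}, index=idx)

# Include the date in the comparison by turning the index into a column.
dup_mask = data.reset_index().duplicated()
n_duplicates = int(dup_mask.sum())
print("Number of duplicate rows:", n_duplicates)

# Keep only the first occurrence of each row.
deduplicated = data[~dup_mask.values]
```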
# Check the data types (info() prints its report directly and returns None)
data.info()
data.index = pd.to_datetime(data.index)
Finally, we also check the number of unique values in each column to ensure the
validity of the data. If any column contains too few unique values, we will reconsider the
validity of that column in the model.
# Check the number of unique values in each column
print("Number of unique values per column:\n", data.nunique())
Another important step is to check for outliers that may affect the quality of the
model. We test for outliers using the interquartile range (IQR) method: a value is
flagged as an outlier if it falls below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. The
following code identifies outliers for the key columns Open, High, Low, Close, and
Volume:
        outlier_condition = (data[col] < lower_bound) | (data[col] > upper_bound)
        outliers = data.loc[outlier_condition.values, col].describe()
        # Store the outlier statistics
        result_df[f'outlier_{col}'] = outliers
    return result_df
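For reference, the IQR check can be assembled into one self-contained function and tried on toy data. This is a sketch that follows the fragments shown in this chapter; the sample values are hypothetical and chosen so that exactly one Close value lies outside the IQR bounds:

```python
import pandas as pd


def outlier_detection(data: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Summarize the IQR outliers of each column with describe() statistics."""
    result_df = pd.DataFrame(index=data.describe().index)
    for col in columns:
        q1 = data[col].quantile(0.25)
        q3 = data[col].quantile(0.75)
        iqr = q3 - q1
        lower_bound = q1 - 1.5 * iqr
        upper_bound = q3 + 1.5 * iqr
        # Flag values outside [lower_bound, upper_bound] as outliers.
        outlier_condition = (data[col] < lower_bound) | (data[col] > upper_bound)
        result_df[f"outlier_{col}"] = data.loc[outlier_condition, col].describe()
    return result_df


# Hypothetical toy data: 100.0 is far above the other Close values.
data = pd.DataFrame(
    {
        "Close": [10.0, 11.0, 12.0, 13.0, 14.0, 100.0],
        "Volume": [5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    }
)
summary = outlier_detection(data, ["Close", "Volume"])
print(summary)
```

A count of zero in a column of the summary means that no value of that feature fell outside the bounds.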
information on a wide range of stocks). The information includes Date, Open, High, Low,
Close, and Volume.
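import yfinance as yf
import pandas as pd
# Define the date range
start_date = "2020-03-16"
end_date = "2025-03-17"
# Download daily AMZN prices from Yahoo Finance (the end date is exclusive)
data = yf.download("AMZN", start=start_date, end=end_date)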
2.2.1.3. Data analysis
The opening price of Amazon (AMZN) shows volatility from 2020 to 2025, with
prices remaining relatively stable until a sharp increase around 2022. From 2022 to 2023,
the stock shows periods of volatility with significant increases and decreases, which may
correlate with external events, market conditions, or internal company strategies. The
price gradually declines by the end of 2024, reflecting the potential impact of external
factors such as a market downturn or economic uncertainty.
Figure 4. The high prices of Amazon stock from 2020 to 2025
The low price shows a steady increase from 2020 to 2025, reflecting overall
market optimism for AMZN, with a dip occurring in 2022 and 2023.
The steady increase from 2023 onwards could be related to increasing demand for
Amazon’s services, possibly due to the growth of e-commerce or expansion of cloud
services.
Figure 7. Amazon’s stock exchange volume from 2020 to 2025
The distribution of closing prices for Amazon (AMZN) shows a fairly
symmetrical distribution, with a peak around $160, indicating that AMZN’s closing
prices have mostly fluctuated within this range. Most closing prices are concentrated
around the median, but there is some volatility between $80 and $240, with a few
instances of prices reaching above $200. This suggests that while AMZN can reach
higher prices during positive market conditions, these higher prices are relatively rare.
The presence of higher price points suggests that AMZN has the potential to generate
high returns during favorable market periods, but overall the stock price has remained
stable and has not deviated much in either direction.
Figure 8. Amazon's stock close distribution
Finally, from the chart of the distribution of trading volume for Amazon stock, we
can see that the distribution is skewed to the right, indicating that the majority of trading
volume falls in the lower range, with a few instances of very high volume. The peak of
the distribution is between 0.5 and 1.0 million shares traded, indicating that most of the
time, trading volume is in this range. The tail on the right represents occasional spikes in
trading activity, where large volumes of shares are traded. These spikes can be due to
important events such as earnings reports, market reactions to news, or other external
factors that influence investor behavior. The overall trend shows that while high volume
days do occur, they are relatively rare and Amazon stock typically has moderate trading
volume.
Figure 9. Amazon's distribution of volume of stock trading
2.2.2. Feature selection
Before training the model, we examined the fundamentals of stock prices from a
practical viewpoint in order to add new features aimed at improving the performance of
our predictive model. Technical indicators based on past stock prices and volume data
are crucial for predicting stock trends and price fluctuations (Shynkevich et al.,
2017), as these indicators reflect market dynamics, investor sentiment, and volatility,
offering essential data for predictive models.
Essential price-based features include Close, High, Low, and Open prices, which
indicate fluctuations in markets and contribute to the calculation of technical indicators.
Volume measures, including Volume, Volume Change, and On-Balance Volume (OBV),
enable the evaluation of market activity and investor sentiment (Bao et al., 2017; Oak et
al., 2024). Trend-following indicators, including Moving Averages (MA) and
Exponential Moving Averages (EMA), mitigate price volatility and assist in trend
identification (Selvin et al., 2017). Momentum indicators, such as Momentum, Returns,
and Volatility, measure price acceleration and reversals, thereby enhancing short-term
prediction precision (Fischer & Krauss, 2018). Oscillators such as RSI, Williams %R,
and CCI assess price momentum and detect overbought or oversold levels (Selvin et al.,
2017; Oak et al., 2024). Bollinger Bands, ATR, and ADX measure market volatility and
trend strength, differentiating between stable and volatile phases (Bao et al., 2017;
Fischer & Krauss, 2018). The MACD and Signal Line are extensively utilized to identify
trend reversals and momentum fluctuations (Shynkevich et al., 2017). Researchers have
found that using important indicators like MA, EMA, MACD, RSI, Bollinger Bands,
ATR, and OBV can help make predictions more accurate and reduce overfitting (Oak et
al., 2024). With these indicators, we will have 25 features in total, which are listed
below:
Price-based features
● Close, High, Low, Open
● Return: The percentage changes in the closing price compared to the previous
day's closing price. It is used to measure the daily return on the stock.
● Volatility_10: The standard deviation of price fluctuations over the previous 10
days. It measures the fluctuations in the stock's price, with increased volatility
indicating greater risk.
Moving averages
● MA_5: The 5-day simple moving average of the stock price, determined by
averaging the closing prices from the previous 5 days.
● MA_20: The 20-day simple moving average of the stock price.
● EMA_5: The 5-day EMA of the stock price, which provides greater significance
to recent prices to react more swiftly to price fluctuations.
● EMA_20: The 20-day EMA, which responds more swiftly to market fluctuations
than a standard moving average.
Volatility and trend indicators
● Bollinger_Upper: indicates the stock's overbought status, determined by adding
twice the standard deviation of the closing price to the 20-day moving average.
● Bollinger_Lower: indicates the stock's oversold status, determined by deducting
twice the standard deviation from the 20-day moving average.
● RSI_14: evaluates overbought or oversold conditions by calculating the
magnitude of recent price changes over 14 days. A value above 70 indicates
overbought conditions, whereas a value below 30 indicates oversold conditions.
● Williams %R: compares the current closing price against the high-low range
over a given period (usually 14 days).
● CCI: measures the deviation of the stock price from its mean price over a
specified duration. Values above +100 are commonly read as overbought, whilst
values below -100 are commonly read as oversold.
● ADX: measures the strength of a trend, whether ascending or descending, with
values exceeding 25 indicating a robust trend.
● ATR: computes the average true range over a given period, where the true range
covers the high-low range and any gap from the previous close.
Volume-related features
● Volume
● Volume Change: The percentage change in volume compared to the prior day.
● Volume_Spike: measures volume in relation to the 10-day moving average of
volume.
Technical indicators
● MACD: measures the difference between two exponential moving averages
(typically the 12-day and 26-day EMAs).
● Signal_Line: the 9-day EMA of the MACD. Crossovers between the MACD line and
the signal line are used to identify buy or sell signals.
● Momentum_5: The difference between the current day's closing price and the
closing price from five days earlier. This assesses the stock's price momentum
over a short duration.
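Several of the features listed above can be derived directly with pandas. The sketch below is illustrative only: the function name add_basic_features and the synthetic random-walk prices are hypothetical, Volatility_10 is assumed to be the rolling standard deviation of daily returns, and the Bollinger Bands are assumed to use the 20-day rolling standard deviation of the close.

```python
import numpy as np
import pandas as pd


def add_basic_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a subset of the listed features to a frame with Close and Volume."""
    out = df.copy()
    close = out["Close"]
    out["Return"] = close.pct_change()
    out["Volatility_10"] = out["Return"].rolling(10).std()  # assumed definition
    out["MA_5"] = close.rolling(5).mean()
    out["MA_20"] = close.rolling(20).mean()
    out["EMA_5"] = close.ewm(span=5, adjust=False).mean()
    out["EMA_20"] = close.ewm(span=20, adjust=False).mean()
    std_20 = close.rolling(20).std()
    out["Bollinger_Upper"] = out["MA_20"] + 2 * std_20
    out["Bollinger_Lower"] = out["MA_20"] - 2 * std_20
    ema_12 = close.ewm(span=12, adjust=False).mean()
    ema_26 = close.ewm(span=26, adjust=False).mean()
    out["MACD"] = ema_12 - ema_26
    out["Signal_Line"] = out["MACD"].ewm(span=9, adjust=False).mean()
    out["Momentum_5"] = close - close.shift(5)
    out["Volume_Change"] = out["Volume"].pct_change()
    out["Volume_Spike"] = out["Volume"] / out["Volume"].rolling(10).mean()
    return out


# Synthetic random-walk prices, used only to exercise the function.
rng = np.random.default_rng(0)
n = 60
prices = pd.DataFrame(
    {
        "Close": 150 + np.cumsum(rng.normal(0, 1, n)),
        "Volume": rng.integers(30_000_000, 80_000_000, n).astype(float),
    },
    index=pd.bdate_range("2024-01-01", periods=n),
)
# The first 19 rows lack a full 20-day window and are dropped.
features = add_basic_features(prices).dropna()
print(features.tail())
```

Indicators such as RSI, Williams %R, CCI, ADX, and ATR additionally need the High and Low columns and are left out of this sketch.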
2.2.3. Testing feature significance
As with the raw price data, we cleaned the technical indicator data so that later
processing would run without problems. We then continued to optimize the model by
identifying the variables that matter most to the prediction, analyzing the pairwise
correlations among the variables with a correlation matrix. The purpose of this phase
is to determine which features contribute most and least to the model’s prediction
accuracy.
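A correlation matrix of this kind can be computed with DataFrame.corr(). The sketch below uses synthetic data and a hypothetical cut-off of 0.9 for flagging highly correlated pairs; neither is taken from the report:

```python
import numpy as np
import pandas as pd

# Synthetic features: High tracks Close closely, Volume is unrelated noise.
rng = np.random.default_rng(1)
n = 200
close = 150 + np.cumsum(rng.normal(0, 1, n))
features = pd.DataFrame(
    {
        "Close": close,
        "High": close + rng.uniform(0.5, 1.5, n),
        "Volume": rng.uniform(3e7, 8e7, n),
    }
)

corr = features.corr()  # Pearson correlation by default
print(corr.round(3))

# List feature pairs whose absolute correlation exceeds the assumed cut-off.
threshold = 0.9
pairs = [
    (a, b, float(corr.loc[a, b]))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]
print(pairs)
```

Pairs flagged this way carry largely the same information, which is the redundancy that the correlation analysis is meant to expose.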
Among the technical indicators, EMA_5 (importance score 0.0233) was recognized as a
crucial element, demonstrating its effectiveness in identifying short-term trends. To
refine feature selection, a threshold of 50% of the average feature importance score
was applied, so that only the most significant signals were retained. This procedure
highlighted High, Low, and EMA_5 as the essential features, enhancing model efficiency
and prediction accuracy while minimizing redundancy.
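The thresholding rule can be illustrated in a few lines. Only the EMA_5 score of 0.0233 comes from the text; the scores for High and Low and the 20 placeholder features are hypothetical values chosen so that the scores sum to 1:

```python
import pandas as pd

# EMA_5 is taken from the text; every other score is a hypothetical placeholder.
scores = {"High": 0.52, "Low": 0.43, "EMA_5": 0.0233}
scores.update({f"feature_{i}": 0.0267 / 20 for i in range(20)})
importance = pd.Series(scores)

# Keep features whose importance is at least 50% of the mean importance.
threshold = 0.5 * importance.mean()
selected = importance[importance >= threshold].index.tolist()
print(f"threshold = {threshold:.4f}")
print("selected features:", selected)
```

With 23 features whose scores sum to 1, the mean is 1/23 and the threshold is about 0.0217, so a score of 0.0233 clears it only narrowly.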
2.3. Model analysis
2.3.1. Model training
This code details the process of dividing the dataset into training and test subsets,
which is essential for evaluating the generalization capability of the predictive model. In
this implementation, the dataset is split such that 80% of the data is allocated to the
training set, while 20% is reserved for testing.
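A sketch of an 80/20 split on windowed sequences is given below. It assumes a chronological split without shuffling, a 60-day lookback window, and that the first column of the feature array is the prediction target; the helper make_windows and the toy array are hypothetical:

```python
import numpy as np


def make_windows(values: np.ndarray, lookback: int):
    """Turn a (time, features) array into (samples, lookback, features) inputs."""
    X, y = [], []
    for end in range(lookback, len(values)):
        X.append(values[end - lookback:end])  # the previous `lookback` days
        y.append(values[end, 0])              # next-day target (column 0, assumed)
    return np.array(X), np.array(y)


# Toy data: 100 time steps with 2 features each.
values = np.arange(200, dtype=float).reshape(100, 2)
X, y = make_windows(values, lookback=60)

# Chronological split: the earliest 80% for training, the latest 20% for testing.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
print(X_train.shape, X_test.shape)
```

Keeping the test set at the end of the series avoids training on observations that come after the ones being predicted.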
to its capacity to acquire and integrate input bidirectionally, thereby reducing information
loss during training (Fischer & Krauss, 2018; Selvin et al., 2017).
In addition, Bao et al. (2017) showed that combining Bi-LSTM with optimization
methods such as an attention mechanism and dropout improves the accuracy of forecasts
while keeping the model stable. Dao et al. (2024) evaluated the LSTM model in
predicting the volatility of the VNIndex, demonstrating that it achieves high accuracy
and effectively captures market movements. Tran (2024) evaluated an LSTM-GRU
ensemble model in predicting stock indices, revealing that it enhances forecasting
efficiency relative to conventional methods. These results further support the use of
Bi-LSTM in forecasting financial time series, particularly for intricate data
influenced by numerous external variables.
Finally, we run and train the Bidirectional LSTM (Bi-LSTM) model by fitting it to
the training data (X_train, y_train) over 20 epochs with a batch size of 32.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Bidirectional
# Build the Bi-LSTM model
model = Sequential()
# Add the bidirectional LSTM layer
model.add(Bidirectional(LSTM(units=50, return_sequences=True),
                        input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(LSTM(units=50, return_sequences=False))  # Final LSTM layer
model.add(Dense(units=1))  # Predict a single value for Close
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
# Display the model summary
model.summary()
# Train the model
history = model.fit(X_train, y_train, epochs=20, batch_size=32,
                    validation_data=(X_test, y_test))
2.3.3. Model testing results
The accuracy of the Bi-LSTM model in forecasting stock prices can be evaluated
by four principal metrics: Mean Squared Error (MSE), Root Mean Squared Error
(RMSE), Mean Absolute Error (MAE), and R-squared (R²).
● The MSE of 0.0008 reflects a small average squared error, suggesting that the
model's predictions closely track the actual values.
● The RMSE of 0.0284 expresses the error in the same unit as the target variable.
Because this value is far below the share price, it appears to be measured on the
scaled target, in which case predictions deviate from actual values by roughly
2.84% of the scaled range on average.
● The MAE of 0.0226 supports the model's accuracy by measuring the average
absolute deviation between predicted and actual values.
● The R-squared (R²) score of 0.9474 indicates that the model accounts for 94.74%
of the variation in stock price movements, reflecting robust predictive capability.
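For clarity, the four metrics can be computed from their definitions as in the following sketch. The helper regression_metrics and the toy numbers are hypothetical and unrelated to the reported results:

```python
import numpy as np


def regression_metrics(y_true, y_pred) -> dict:
    """Compute MSE, RMSE, MAE, and R-squared from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mse = float(np.mean(errors ** 2))
    rmse = float(np.sqrt(mse))
    mae = float(np.mean(np.abs(errors)))
    ss_res = float(np.sum(errors ** 2))                     # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# Toy example with made-up values.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```

Since RMSE is the square root of MSE, the reported RMSE of 0.0284 is consistent with the reported MSE of 0.0008 up to rounding.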
The results indicate that the Bi-LSTM model accurately captures stock price trends.
The forecasting code then runs a prediction loop that iterates five times, once for
each of the next five trading days. At each iteration, the Bi-LSTM model generates a
one-day-ahead stock price prediction using model.predict(current_input), and the
result is appended to forecast_scaled.
current_input = np.concatenate([current_input[:, 1:, :], next_day_reshaped], axis=1)
# Get the next business day
current_date += BDay(1)  # Moves to the next business day
predicted_dates.append(current_date)
Table 1. Predicted results
Date          Predicted Close
2025-03-19    197.080310
2025-03-20    196.621833
2025-03-21    196.171110
2025-03-24    195.717799
2025-03-25    195.262849
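The recursive five-day loop described in this section can be sketched end to end as follows. This is an illustration under stated assumptions, not the report's code: LastValueModel is a hypothetical stand-in for the trained Bi-LSTM, column 0 is assumed to hold Close, and the way the next input row is built (copying the latest features and overwriting Close with the prediction) is an assumption, since the construction of next_day_reshaped is not shown.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import BDay


class LastValueModel:
    """Hypothetical stand-in for the trained Bi-LSTM: repeats the last Close."""

    def predict(self, batch: np.ndarray) -> np.ndarray:
        # batch has shape (1, lookback, n_features); column 0 is assumed to be Close.
        return batch[:, -1, :1]


def forecast_next_days(model, last_window: np.ndarray, start_date, n_days: int = 5):
    """Feed each prediction back into the input window, one business day at a time."""
    current_input = last_window[np.newaxis, :, :].copy()
    current_date = pd.Timestamp(start_date)
    forecast_scaled, predicted_dates = [], []
    for _ in range(n_days):
        next_close = float(model.predict(current_input)[0, 0])
        forecast_scaled.append(next_close)
        # Assumed update: reuse the latest feature row and swap in the predicted Close.
        next_day_reshaped = current_input[:, -1:, :].copy()
        next_day_reshaped[0, 0, 0] = next_close
        current_input = np.concatenate(
            [current_input[:, 1:, :], next_day_reshaped], axis=1
        )
        current_date += BDay(1)  # move to the next business day
        predicted_dates.append(current_date)
    return pd.Series(forecast_scaled, index=predicted_dates)


# Toy window: 60 days of 3 scaled features.
last_window = np.linspace(0.0, 1.0, 60 * 3).reshape(60, 3)
forecast = forecast_next_days(LastValueModel(), last_window, "2025-03-18")
print(forecast)
```

Because each prediction becomes part of the next input, errors can accumulate across the five steps, which is one reason multi-day forecasts from such a loop should be read with caution.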
Technical factors
Amazon's moving average behavior, particularly the 20-day and 50-day EMAs,
has recently shown a bearish crossover, in line with the declining trend.
Historically, Amazon's stock has responded strongly to its 50-day EMA, and past
breaches have often led to larger drops. Recently, after falling below the 50-day
EMA at $221.40 (Barchart, 2025), the stock kept going down, following the same pattern
seen in mid-2024, when a similar breach led to a further 2% drop within two weeks
(Money Morning, 2025).
Fundamental factors
The latest earnings reports and revenue projections from Amazon have a strong
influence on investor sentiment. If Amazon Web Services growth slows below its normal
yearly rate of about 30%, or if e-commerce sales drop because consumers are spending
less, the stock may come under even more pressure. For Q1 2025, the company has
projected revenues between $151 billion and $155.5 billion, below analysts'
expectations of $158.6 billion (Bensinger & Sophia, 2025). Furthermore, higher
operating expenses such as wages and shipping costs could shrink the profit margin,
which is another reason why prices are expected to go down.
Macroeconomic factors
Amazon's stock is also affected by macroeconomic variables such as interest rates
and inflation patterns. In early 2025, the Federal Reserve reduced interest rates yet
anticipated only gradual further reductions due to ongoing inflation, affecting market
volatility and Amazon's performance (ft.com). Moreover, broader market patterns, such
as sector rotations from technology companies to defensive sectors, might influence
Amazon's stock price. In early 2025, concerns regarding inflation and trade policy led
to a significant sell-off in technology stocks, with Amazon experiencing considerable
declines.
Institutional activity and volume trends
Amazon's shares dropped 4.1% to $229.15 on February 7, 2025, with a trading
volume of 77.3 million, much more than its 50-day average volume of 34.9 million.
Stock performance may be impacted by these deviations, which may signal changes in
institutional purchasing or selling activity. Negative sentiment may also be strengthened
by a decline in institutional buying activity, which is indicated by reduced trading
volumes during price declines. This could indicate a decline in confidence in Amazon's
short-term potential. On the other hand, consistently high trading volumes during price
drops could indicate that investors are hesitant to "buy the dip," which would further
affect stock dynamics.
2.3.5. AI model evaluation and comparison with traditional econometric models
In econometrics, several models are used for time-series forecasting, with
ARIMA, OLS, and Bi-LSTM frequently applied in the analysis of stock price
predictions. Each model applied to Amazon’s stock price exhibits distinct strengths and
weaknesses, dependent on the underlying data and market conditions.
The ARIMA model uses autoregressive (AR), differencing (I), and moving
average (MA) components to forecast stock prices. According to Tsay (2010), ARIMA
models work best when the market is stable because they assume stationary data.
In the analysis of Amazon, the ARIMA model forecasted a minor decrease of 0.11% over
a five-day period, with an MAE of 2.6177, an RMSE of 3.4479, and an R² value of
0.9684; its errors are larger than those of the Bi-LSTM model. ARIMA cannot
incorporate external events or technical indicators, such as AWS growth or e-commerce
expansion, which limits its performance during volatile market conditions, including
earnings announcements or geopolitical changes.
On the other hand, Ordinary Least Squares (OLS) regression analyzes the
relationship between a dependent variable (stock price) and independent variables (such
as economic indicators and technical indicators like EMA). OLS models are not adept at
handling non-linear relationships and frequently neglect volatility or technical
factors. In the case of Amazon, OLS demonstrated superior performance compared to
ARIMA, as indicated by an R² value of 0.9961, which suggests a better fit to the data.
However, the MAE of 0.9215 and RMSE of 1.2166 for OLS indicated larger errors than
those observed with Bi-LSTM. Furthermore, because OLS depends on linear relationships,
it represents short-term trends and volatility less well than Bi-LSTM, which utilizes
historical price patterns and market signals.
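As a point of reference for the OLS baseline, a least-squares fit and its R² can be computed with NumPy alone. The sketch below uses synthetic data with a single hypothetical regressor named ema_5 and an assumed linear relation; it is not the regression specification used in this report:

```python
import numpy as np

# Synthetic data: a random-walk regressor and a noisy linear response (assumed).
rng = np.random.default_rng(42)
n = 300
ema_5 = 150 + np.cumsum(rng.normal(0, 1, n))
close = 2.0 + 0.98 * ema_5 + rng.normal(0, 0.5, n)

# Design matrix with an intercept column, solved by least squares.
X = np.column_stack([np.ones(n), ema_5])
coef, *_ = np.linalg.lstsq(X, close, rcond=None)
fitted = X @ coef

# R-squared of the fit.
ss_res = np.sum((close - fitted) ** 2)
ss_tot = np.sum((close - close.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print("intercept, slope:", coef, "R2:", round(float(r2), 4))
```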
In conclusion, ARIMA demonstrates effectiveness during periods of market
stability, while OLS is suitable for linear relationships and economic variables.
Conversely, Bi-LSTM is more adept at predicting Amazon’s stock price due to its
capacity to capture complex, non-linear interactions and market volatility. This
underscores the benefit of Bi-LSTM in stock forecasting, particularly where technical
indicators and market sentiment are essential.
CHAPTER 3: CONCLUSION
3.1. Findings
Predicting the stock price of Amazon (AMZN) for the next five days using a
Bi-LSTM (Bidirectional Long Short-Term Memory) deep learning model has shown
significant promise in forecasting short-term market trends. By leveraging five years
of historical stock data from Yahoo Finance, the model successfully identified
patterns in Amazon stock movements. Key features such as Exponential Moving
Averages (EMA_5, EMA_20), Relative Strength Index (RSI), Moving Average
Convergence Divergence (MACD), and Average True Range (ATR) were used to capture
both short-term trends and market volatility.
The accuracy of the model was evaluated using performance metrics, including Mean
Absolute Error (MAE), Root Mean Square Error (RMSE), and R² values, demonstrating
that Bi-LSTM outperforms traditional models such as ARIMA and Linear Regression in
predicting Amazon stock prices over the next 5 days. For example, the MAE is relatively
low, indicating that the Bi-LSTM model's predictions are close to the actual value, while
the R² value indicates that the model can explain a significant portion of the variation in
stock prices.
3.2. Limitations
Although the Bi-LSTM model is effective, some limitations were identified during
the analysis. A significant challenge is that the model is primarily trained using historical
stock price data and does not account for external variables such as macroeconomic
indicators, industry news, or global events (e.g., the COVID-19 pandemic or regulatory
changes) that can cause sudden stock price fluctuations. Ignoring these factors means that
the model may not be able to predict large market changes influenced by unpredictable,
often non-linear global events.
In addition, while the Bi-LSTM model is capable of capturing non-linear
relationships in the data, it still faces challenges when dealing with rare market
phenomena or sudden shocks that can cause significant price fluctuations in very short
periods of time. The model also relies on feature engineering and lag-based input data
(e.g., the previous 60 days), which limits its ability to predict price movements based on
external factors or real-time sentiment changes.
Furthermore, while the Bi-LSTM model provides accurate predictions for the
short term (5 days), its generalizability over longer time frames (e.g., weeks or months)
remains uncertain. A more robust model would need to take longer time frames and
external influences into account, potentially incorporating both historical data and real-
time input data.
3.3. Recommendations
To address these limitations and improve prediction accuracy, future research
should focus on integrating external data sources, such as sentiment analysis of financial
news, social media, or macroeconomic indicators. For example, using natural language
processing (NLP) techniques to analyze news sentiment or investor sentiment on
platforms like Twitter could provide valuable insights into how Amazon’s stock price is
affected. Additionally, including broader economic factors like GDP growth, inflation,
and interest rates would allow the model to better account for market conditions that
affect stock prices.
Furthermore, experimenting with other machine learning algorithms, such as Support
Vector Machines (SVM), Random Forest, or Gradient Boosting Machines (GBM), could
show whether they handle certain non-linear relationships or complex data patterns
better than Bi-LSTM. These methods, used as benchmarks or combined with deep learning
in an ensemble, could enhance the model’s ability to handle diverse data and improve
prediction reliability.
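The comparison suggested above could be set up roughly as follows. This is a sketch under stated assumptions: it uses scikit-learn, a synthetic price-like series, 10 lagged values as features, and default or arbitrary hyperparameters; none of these choices come from the report.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.svm import SVR

# Synthetic trend-plus-cycle series with noise (assumption: not AMZN data).
rng = np.random.default_rng(42)
t = np.arange(400)
series = 100 + 0.05 * t + 5 * np.sin(t / 10) + rng.normal(0, 0.5, t.size)

# Lag features: predict the next value from the previous `lags` values.
lags = 10
X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
y = series[lags:]

# Chronological split: the test set always lies after the training set.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "svr": SVR(C=10.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = mean_absolute_error(y_test, model.predict(X_test))

for name, mae in results.items():
    print(f"{name}: MAE = {mae:.3f}")
```

Tree-based models cannot predict values outside the range seen during training, so on a trending price series they are usually applied to returns or differenced prices rather than raw price levels.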
For practical use, investors should combine the model’s stock predictions with
fundamental analysis and regular market monitoring. Interpreting the model’s output
alongside scheduled and unscheduled events (such as earnings reports, product launches,
and major market moves) can improve decision-making and provide a more comprehensive
approach to stock price forecasting. Furthermore, retraining the model regularly on the
latest data will help keep its predictions relevant and timely.
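Continuous retraining can be evaluated with a walk-forward loop, sketched below. To keep the example short and dependency-free, a least-squares autoregression stands in for the Bi-LSTM; the functions fit_linear_ar and walk_forward, the lag count, and the synthetic series are all hypothetical.

```python
import numpy as np

def fit_linear_ar(history, lags):
    """Fit a least-squares autoregression; a lightweight stand-in for the real model."""
    X = np.stack([history[i:i + lags] for i in range(len(history) - lags)])
    y = history[lags:]
    X = np.column_stack([X, np.ones(len(X))])  # intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def walk_forward(series, lags=5, initial_train=50):
    """Retrain on all data seen so far, predict the next day, then advance one day."""
    series = np.asarray(series, dtype=float)
    preds, actuals = [], []
    for t in range(initial_train, len(series)):
        coef = fit_linear_ar(series[:t], lags)       # uses only data before day t
        x_next = np.append(series[t - lags:t], 1.0)  # latest lags plus intercept term
        preds.append(float(x_next @ coef))
        actuals.append(series[t])
    return np.array(preds), np.array(actuals)

# Synthetic random walk (assumption: not AMZN data).
rng = np.random.default_rng(0)
series = 100 + np.cumsum(rng.normal(0, 1, 120))
preds, actuals = walk_forward(series)
mae = float(np.mean(np.abs(preds - actuals)))
print(f"walk-forward MAE: {mae:.3f}")
```

Each prediction uses only data available before the predicted day, which mirrors how a regularly retrained model would behave in live use. With a neural network, full retraining at every step may be too costly, so retraining on a weekly or monthly schedule is a common compromise.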
Finally, sentiment analysis tools can add value by showing how the market reacts to
news or events that are not immediately reflected in stock data. This multifaceted
approach can lead to more informed investment strategies and better decision-making.
REFERENCES
Bao, W., Yue, J. and Rao, Y. (2017). A deep learning framework for financial time series
using stacked autoencoders and long-short term memory. PLOS ONE, 12(7),
p.e0180944. doi:https://doi.org/10.1371/journal.pone.0180944.
Barchart (n.d.). AMZN - Amazon.com Stock Price. [online] Barchart.com. Available at:
https://www.barchart.com/stocks/quotes/AMZN/overview.
Bensinger, G. and Sophia, D.M. (2025). Amazon shares drop as cloud growth, sales
forecast lag. Reuters. [online] 7 Feb. Available at:
https://www.reuters.com/technology/amazon-beats-quarterly-revenue-estimates-2025-02-06/.
Chong, E., Han, C. and Park, F.C. (2017). Deep learning networks for stock market
analysis and prediction: Methodology, data representations, and case studies. Expert
Systems with Applications, 83, pp.187–205.
doi:https://doi.org/10.1016/j.eswa.2017.04.030.
Dao, O. and Nguyen, C. (2024). Dự báo chỉ số chứng khoán bằng học máy: Bằng chứng
thực nghiệm từ thị trường chứng khoán Việt Nam. [online] Philarchive.org. Available at:
https://philarchive.org/rec/LKIDBC [Accessed 18 Mar. 2025].
Fama, E. (1970). Efficient capital markets: A review of theory and empirical work. The
Journal of Finance, 25(2), pp.383–417. doi:https://doi.org/10.2307/2325486.
Fischer, T. and Krauss, C. (2018). Deep learning with long short-term memory networks
for financial market predictions. European Journal of Operational Research, 270(2),
pp.654–669.
Gupta, R. and Chen, M. (2020). Sentiment Analysis for Stock Price Prediction. [online]
IEEE Xplore. doi:https://doi.org/10.1109/MIPR49039.2020.00051.
Hughes, J., Alim, A.N. and Smith, I. (2025). Is the Federal Reserve’s preferred measure
of inflation set to fall? [online] @FinancialTimes. Available at:
https://www.ft.com/content/2a85a487-c881-4f9f-b287-87617b6673d3.
Kim, T. and Kim, H.Y. (2019). Forecasting stock prices with a feature fusion LSTM-
CNN model using different representations of the same data. PLOS ONE, [online] 14(2),
p.e0212320. doi:https://doi.org/10.1371/journal.pone.0212320.
Kuang, S. (2023). A Comparison of Linear Regression, LSTM model and ARIMA model
in Predicting Stock Price A Case Study: HSBC’s Stock Price. BCP Business &
Management, 44, pp.478–488. doi:https://doi.org/10.54691/bcpbm.v44i.4858.
Nelson, D.M.Q., Pereira, A.C.M. and de Oliveira, R.A. (2017). Stock market’s price
movement prediction with LSTM neural networks. 2017 International Joint Conference
on Neural Networks (IJCNN). doi:https://doi.org/10.1109/ijcnn.2017.7966019.
Oak, O., Nazre, R., Budke, R. and Mahatekar, Y. (2024). A Novel Multivariate Bi-LSTM
model for Short-Term Equity Price Forecasting. [online] arXiv.org. Available at:
https://arxiv.org/abs/2409.14693.
Onyenahazi, O.B. and Antwi, B.O. (2024). The Role of Artificial Intelligence in
Investment Decision-Making: Opportunities and Risks for Financial Institutions.
International Journal of Research Publication and Reviews, 5(10), pp.70–85.
doi:https://doi.org/10.55248/gengpi.5.1024.2701.
Schuster, M. and Paliwal, K.K. (1997). Bidirectional recurrent neural networks. IEEE
Transactions on Signal Processing, 45(11), pp.2673–2681.
doi:https://doi.org/10.1109/78.650093.
Selvin, S., Vinayakumar, R., Gopalakrishnan, E.A., Menon, V.K. and Soman, K.P.
(2017). Stock price prediction using LSTM, RNN and CNN-sliding window model. 2017
International Conference on Advances in Computing, Communications and Informatics
(ICACCI). doi:https://doi.org/10.1109/icacci.2017.8126078.
Shynkevich, Y., McGinnity, T.M., Coleman, S.A., Belatreche, A. and Li, Y. (2017).
Forecasting price movements using technical indicators: Investigating the impact of
varying input window length. Neurocomputing, 264, pp.71–88.
doi:https://doi.org/10.1016/j.neucom.2016.11.095.
Tran, D.T. (2024). Đánh giá hiệu suất mô hình phức hợp LSTM-GRU: nghiên cứu điển
hình về dự báo chỉ số đo lường xu hướng biến động giá cổ phiếu trên sàn giao dịch chứng
khoán Hồ Chí Minh. CTU Journal of Science, [online] 60(1).
doi:https://doi.org/10.22144/ctujos.2023.232.
Tsay, R.S. (2010). Analysis of Financial Time Series. [online] Wiley Series in Probability
and Statistics. Hoboken, NJ, USA: John Wiley & Sons, Inc.
doi:https://doi.org/10.1002/9780470644560.