big data
Higher Nationals
Internal verification of assessment decisions – BTEC (RQF)
INTERNAL VERIFICATION – ASSESSMENT DECISIONS
Programme title BTEC Higher National Diploma in Computing
Mr. Lasitha
Assessor Internal Verifier
Unit 16: Computing Research Project (Pearson Set)
Unit(s)
Research Proposal – Big Data
Assignment title
Mursheed Muzammil
Student’s name
List which assessment criteria Pass Merit Distinction
the Assessor has awarded.
Give details:
Internal Verifier
Date
signature
Programme Leader
Date
signature (if required)
LO1 Examine appropriate research methodologies and approaches as part of the research process
Resubmission Feedback:
Assignment Feedback
Formative Feedback: Assessor to Student
Action Plan
Summative feedback
Assessor Date
signature
[email protected] 9/7/2024
Student Date
signature
Pearson
Higher Nationals in
Computing
Unit 16: Computing Research Project
(Pearson Set)
Research Project Proposal
General Guidelines
1. A Cover page or title page – You should always attach a title page to your assignment. Use
previous page as your cover sheet and make sure all the details are accurately filled.
2. Attach this brief as the first section of your assignment.
3. All the assignments should be prepared using a word processing software.
4. All the assignments should be printed on A4 sized papers. Use single side printing.
5. Allow 1” for top, bottom, right margins and 1.25” for the left margin of each page.
6. The font size should be 12 point and should be in the style of Times New Roman.
7. Use 1.5 line spacing. Left justify all paragraphs.
8. Ensure that all the headings are consistent in terms of the font size and font style.
9. Use the footer function in the word processor to insert Your Name, Subject, Assignment No,
and Page Number on each page. This is useful if individual sheets become detached for any
reason.
10. Use the word processing application's spell check and grammar check functions to help edit
your assignment.
Important Points:
1. It is strictly prohibited to use text boxes to add text in the assignments, except for
compulsory information (e.g. figures, tables of comparison). Adding text boxes in the body,
except for the aforementioned compulsory information, will result in rejection of your
work.
2. Carefully check the hand in date and the instructions given in the assignment. Late
submissions will not be accepted.
3. Ensure that you give yourself enough time to complete the assignment by the due date.
4. Excuses of any nature will not be accepted for failure to hand in the work on time.
5. You must take responsibility for managing your own time effectively.
6. If you are unable to hand in your assignment on time and have valid reasons such as illness,
you may apply (in writing) for an extension.
7. Failure to achieve at least PASS criteria will result in a REFERRAL grade.
8. Non-submission of work without valid reasons will lead to an automatic REFERRAL. You will
then be asked to complete an alternative assignment.
9. If you use other people’s work or ideas in your assignment, reference them properly using
HARVARD referencing system to avoid plagiarism. You have to provide both in-text citation
and a reference list.
10. If you are proven to be guilty of plagiarism or any academic misconduct, your grade could be
reduced to a REFERRAL or, at worst, you could be expelled from the course.
Student Declaration
I hereby declare that I know what plagiarism entails, namely to use another’s work and to present
it as my own without attributing the sources in the correct way. I further understand what it means
to copy another’s work.
Assignment Brief
Student Name /ID Number Mursheed Muzammil / E181516
Unit Number and Title Unit 16: Computing Research Project (Pearson Set)
Academic Year
Unit Tutor
Issue Date
Submission Format:
Research Project Proposal
The submission is in the form of an individual written report.
This should be written in a concise, formal business style using single spacing and font size 12.
You are required to make use of headings, paragraphs and subsections as appropriate, and all
work must be supported with research.
Reference using the Harvard referencing system.
Please provide a referencing list using the Harvard referencing system.
The recommended word limit is minimum 2000 words.
Big data is a term that has become more and more common over the last decade. It was originally
defined as data that is generated in incredibly large volumes, such as internet search queries,
data from weather sensors or information posted on social media. Today big data has also come
to represent large amounts of information generated from multiple sources that cannot be
processed in a conventional way and that cannot be processed by humans without some form of
computational intervention.
Big data can be stored in several ways: Structured, whereby the data is organised into some form
of relational format, unstructured, where data is held as raw, unorganised data prior to turning
into a structured form, or semi-structured where the data will have some key definitions or
structural form but is still held in a format that does not conform to standard data storage
models.
Many systems and organisations now generate massive quantities of big data on a daily basis,
with some of this data being made publicly available to other systems for analysis and processing.
The generation of such large amounts of data has necessitated the development of machine
learning systems that can sift through the data to rapidly identify patterns, to answer questions or
to solve problems. As these new systems continue to be developed and refined, a new discipline
of data science analytics has evolved to help design, build and test these new machine learning
and artificial intelligence systems.
Utilising Big Data requires a range of knowledge and skills across a broad spectrum of areas and
consequently opens opportunities to organisations that were not previously accessible. The
ability to store and process large quantities of data from multiple sources has meant that
organisations and businesses are able to get a larger overall picture of the pattern of global trends
in the data to allow them to make more accurate and up to date decisions. Such data can be used
to identify potential business risks earlier and to make sure that costs are minimised without
compromising on innovation.
However, the rapid application and use of Big Data has raised several concerns. The storage of
such large amounts of data means that security concerns need to be addressed in case the data is
compromised or altered in such a way to make the interpretation erroneous. In addition, the
ethical issues of the storage of personal data from multiple sources have yet to be addressed, as
well as any sustainability concerns in the energy requirements of large data warehouses and
lakes.
The theme will enable students to explore some of the topics concerned with Big Data from the
standpoint of a prospective computing professional or data scientist. It will provide the
opportunity for students to investigate the applications, benefits and limitations of Big Data while
exploring the responsibilities and solutions to the problems it is being used to solve.
Choosing a research objective/question
Students are to choose their own research topic for this unit. Strong research projects are those
with clear, well focused and defined objectives. A central skill in selecting a research objective is
the ability to select a suitable and focused research objective. One of the best ways to do this is to
put it in the form of a question. Students should be encouraged by tutors to discuss a variety of
topics related to the theme to generate ideas for a good research objective.
The range of topics discussed on Big Data could cover the following areas:
Storage models
Cyber security risks
Future developments and driving innovation.
Legal and ethical trade-offs
objectives, or hypothesis)
2. Provide a literature review giving the background and conceptualisation of the proposed area
of study. (This would provide existing knowledge and benchmarks by which the data can be
judged)
3. Examine and critically evaluate research methodologies and research processes available.
Select the most suitable methodologies and the process and justify your choice based on
theoretical/philosophical frameworks. Demonstrate understanding of the pitfalls and
limitations of the methods chosen and ethical issues that might arise.
4. Draw points (1–3, above) together into a research proposal by getting agreement with your
tutor.
Useful links
Useful resources for underlying principles, examples of articles and webinars on the theme:
Resource Number | Type of Resource | Resource Titles | Links
strategy-and-big-data-analytics
14 | Book | Principles and Practice of Big | https://www.sciencedirect.com/book/9780128156094/principles-and-practice-of-big-data
24 | Journal | Critical analysis of Big Data challenges and analytical methods | https://www.sciencedirect.com/science/article/pii/S014829631630488X
25 | Journal | Big Data Security Issues and Challenges | https://tinyurl.com/wabx7zya
26 | Journal | IoT Big Data Security and Privacy Versus Innovation | https://ieeexplore.ieee.org/abstract/document/8643026
27 | Journal | Big Data Security and Privacy Protection | https://www.atlantis-press.com/proceedings/icmcs18/25904185
28 | Journal | Big data analytics in Cloud computing: an overview | https://journalofcloudcomputing.springeropen.com/articles/10.1186/s13677-022-00301-w
Grading Rubric
Centre Name
Unit Unit 16: Computing Research Project (Pearson Set)
Tutor
Proposed title
Title or working title of research project (in the form of a question, objective or hypothesis):
Research project objectives (e.g. what is the question you want to answer? What do you want to
learn how to do? What do you want to find out?): Introduction, Objective, Sub Objective(s),
Research Questions and/or Hypothesis
I confirm that the project is not work which has been or will be submitted for another qualification
and is appropriate.
Interviews:
Questionnaires:
Observations: ✘
Data Analysis: ✘
Action Research: ✘
Focus Groups:
Other (please specify): ...........................................................
Section 3: Participants
Please answer the following questions, giving full details where necessary.
Will your research involve human participants?
Describe the processes you will use to inform participants about what you are doing:
Will participants be given the option of omitting questions they do not wish to answer?
Yes ✘ No
If “NO” please explain why below and ensure that you cover any ethical issues arising from this.
Confirm whether participants will be asked for their informed consent to be observed.
Yes ✘ No
Will you debrief participants at the end of their participation (i.e. give them a brief explanation of
the study)?
Yes ✘ No
Will participants be given information about the findings of your study? (This could be a brief
summary of your findings in general)
Yes ✘ No
Confirm that all personal data will be stored and processed in compliance with the Data
Protection Act (1998)
Yes No
Who will have access to the data and personal information?
How long will the data and records be kept for and in what format?
Section 6: Declaration
I have read, understood and will abide by the institution’s Research and Ethics Policy:
Yes ✘ No
I have discussed the ethical issues relating to my research with my Unit Tutor:
Yes ✘ No
I confirm that to the best of my knowledge:
The above information is correct and that this is a full description of the ethics issues that may
arise in the course of my research.
Date:
Please submit your completed form to: ESOFT Learning Management System (ELMS)
THE RESEARCH PROPOSAL
Cost-Effective Energy Solutions for Big Data Storage.
By
Mursheed Muzammil
E181516
Research Proposal Submitted in accordance with the requirements for the
COMPUTING RESEARCH PROJECT MODULE OF
PEARSON’S HND IN COMPUTING PROGRAMME
at the
ESOFT METRO CAMPUS
ACKNOWLEDGMENT
This research was completed with the help of many people, both known and unknown to me.
I extend my heartfelt gratitude to each and every one of them. In particular, I would like
to thank my parents and my friends for encouraging and supporting me throughout this
research.
EXECUTIVE SUMMARY
In the age of expanding digital information, there is a critical need for sustainable and
effective data storage techniques. This research project seeks to fulfill this need by examining
affordable energy alternatives for Big Data storage. The exponential growth of Big Data has
presented enterprises with a number of issues, including operational expenses, environmental
impact, and energy consumption. The aim of this project is to investigate novel approaches
and technological advancements that can optimize Big Data storage performance while
reducing its financial and environmental footprint.
The main goals of this study are to identify trends in energy use, appraise current energy
solutions, investigate emerging technologies, evaluate economic impacts, and address
scalability challenges. Through a combination of literature review, data analysis, economic
assessment, and scalability evaluation, the study aims to provide comprehensive insights into
affordable energy solutions specifically designed for Big Data storage environments.
CONTENTS
ACKNOWLEDGMENT
EXECUTIVE SUMMARY
CONTENTS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
1.1. Introduction
1.2. Purpose of research
1.3. Significance of the Research
1.4. Research objectives
1.5. Research Sub objectives
1.6. Research questions
1.7. Hypothesis
LITERATURE REVIEW
2.1. Literature Review
2.2. Conceptual framework
METHODOLOGY
3.1. Research philosophy
3.2. Research approach
3.3. Research strategy
3.4. Research Choice
3.5. Time frame
3.6. Data collection procedures
3.6.1. Type of Data
3.6.2. Data Collection Method
3.6.3. Data Collection and Analysis Tools
3.7. Sampling
3.7.1. Sampling Strategy
3.7.2. Sample Size
3.8. The selection of participants
REFERENCES
INTRODUCTION
Introduction
There is no hard and fast rule about exactly what size a database needs to be for the data
inside of it to be considered "big." Instead, what typically defines big data is the need for
new techniques and tools to be able to process it. In order to use big data, you need
programs that span multiple physical and/or virtual machines working together in concert to
process all of the data in a reasonable span of time.
Getting programs on multiple machines to work together in an efficient way so that each
program knows which components of the data to process, and then being able to put the
results from all the machines together to make sense of a large pool of data, takes special
programming techniques. Since it is typically much faster for programs to access data stored
locally instead of over a network, the distribution of data across a cluster and how those
machines are networked together are also important considerations when thinking about big
data problems.
This research seeks to uncover feasible solutions for minimizing the energy footprint of
large-scale data storage operations by exploring upcoming technologies such as energy-
efficient hardware designs, enhanced cooling systems, and intelligent data management
algorithms. Furthermore, the study will examine the economic consequences of
implementing these solutions, weighing the trade-offs between initial investment costs and
long-term energy savings.
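The trade-off between initial investment and long-term energy savings can be illustrated with a short calculation. The following is a minimal sketch, not the study's actual economic model: the function names, the cost and saving figures, and the discount rate are all hypothetical values chosen only to show how a simple payback period and a net present value (NPV) might be computed.

```python
def simple_payback_years(initial_cost, annual_saving):
    """Years needed for cumulative savings to cover the upfront cost."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return initial_cost / annual_saving


def net_present_value(initial_cost, annual_saving, years, discount_rate):
    """Discounted savings over `years`, minus the upfront investment."""
    discounted = sum(
        annual_saving / (1 + discount_rate) ** t for t in range(1, years + 1)
    )
    return discounted - initial_cost


# Hypothetical figures for illustration only (not taken from the study).
upgrade_cost = 100_000.0   # upfront cost of an energy-efficient upgrade
yearly_saving = 25_000.0   # estimated annual energy-cost saving
assumed_rate = 0.08        # assumed annual discount rate

payback = simple_payback_years(upgrade_cost, yearly_saving)
npv = net_present_value(upgrade_cost, yearly_saving, years=5,
                        discount_rate=assumed_rate)
print(f"Simple payback: {payback:.1f} years")
print(f"NPV over 5 years: {npv:.2f}")
```

The simple payback ignores the time value of money, whereas the NPV discounts future savings, so the two measures can rank the same investment differently.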
By conducting a thorough examination of current literature, case studies, and empirical
analysis, this study aims to provide significant insights for organizations looking to optimize
their Big Data storage infrastructure in an ecologically friendly and fiscally sensible
manner. By bridging the gap between theoretical research and real-world application, this
study hopes to contribute to the advancement of cost-effective energy solutions in the field
of Big Data storage, ultimately fostering greater efficiency, scalability, and environmental
responsibility in data management practice.
The research on cost-efficient energy solutions for Big Data storage has two main goals:
addressing environmental challenges and improving economic viability. It aims to support
sustainability in the IT sector by reducing the environmental impact of energy-intensive
storage activities. At the same time, it seeks to lower the financial burden caused by high
energy usage, thereby enhancing the economic sustainability of data storage practices.
By exploring innovative technologies and methodologies, the research intends to drive Big
Data storage innovation, creating energy-optimized solutions that align with data growth.
The study aims to support scalability and growth, ensuring the long-term relevance and
effectiveness of cost-effective energy solutions in the future digital environment.
Additionally, the study contributes to existing knowledge and best practices in Big Data
storage. It uses a methodological approach that includes empirical research, literature
review, and analysis of practical case studies. The goal is to provide useful information and
practical recommendations to organizations looking to improve their data storage
infrastructure. The purpose of the study is to bridge the gap between theory and practice,
enabling decision-makers to implement cost-effective energy solutions that positively
impact their data management practices.
and regulations. Additionally, by pinpointing and adopting low-cost energy alternatives,
entities will have an opportunity to achieve significant savings, with the added benefit
of financial stability and competitiveness. In addition, the study bears a great
The holistic nature of this research provides a comprehensive response to Big Data storage
issues by considering economic, environmental, and technological aspects. Through such
critical research, the initiative aims for positive transformation, benefiting organizations,
society, and the environment.
1.7. Hypothesis
For the topic “Cost-Effective Energy Solutions for Big Data Storage,” the need for
cost-effective energy solutions grows as the demand for large-scale data storage increases.
Several hypotheses can be proposed to address this challenge:
By testing these hypotheses, the research aims to uncover practical and innovative solutions
to reduce the energy consumption and costs of Big Data storage, thereby supporting both
economic and environmental sustainability.
LITERATURE REVIEW
This literature analysis delves into the present state of research on energy-efficient Big Data
storage systems, highlighting emerging trends, technologies, and industry problems. It
addresses the growing need for data storage brought on by the exponential expansion of
digital information as well as the substantial energy usage involved in operating massive data
centers. In order to lower energy consumption and operating costs, the analysis covers
numerous energy-efficient hardware and cooling technologies, integration of renewable
energy sources, and intelligent data management algorithms that have been suggested or put
into practice.
The conceptual framework outlines the relationship between independent variables (energy-
saving strategies) and the dependent variable (energy consumption and costs in Big Data
storage). Below is a visual representation of this framework.
Independent Variables:
1. Energy-Efficient Hardware
o Use of Solid-State Drives (SSDs)
o Low-Power Processors
Dependent Variable:
METHODOLOGY
1. Pragmatism:
Outcome-Oriented:
Quantitative Analysis:
o Using historical data to analyze energy consumption patterns and assess the
effectiveness of existing energy-saving technologies.
Qualitative Analysis:
o Conducting case studies and expert interviews to gain insights into innovative
technologies and practical applications.
Mixed-Methods Approach:
o Combining both quantitative and qualitative methods to provide a holistic
view of energy solutions and their impact on Big Data storage.
By adopting a pragmatic research philosophy, the study aims to deliver practical, evidence-
based solutions that address the complex challenges of energy consumption and costs in Big
Data storage.
The study adopts a mixed-methods research approach to investigate cost-effective energy
solutions for Big Data storage. It integrates quantitative and qualitative methods to
provide a comprehensive understanding of the problem and potential solutions, as detailed
below:
1. Quantitative Approach:
Data Analysis:
o Objective: Analyze historical energy consumption data to identify patterns
and trends in Big Data storage.
o Methods: Statistical analysis, energy consumption modeling, and trend
analysis.
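As an illustration of the trend analysis mentioned above, the sketch below fits an ordinary least-squares line to a series of energy readings and projects the next period. It is only one possible technique under stated assumptions: the function name `linear_trend` and the monthly figures are hypothetical, and the study's real data may call for a different model.

```python
def linear_trend(values):
    """Ordinary least-squares slope and intercept for evenly spaced readings."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly energy consumption of a storage cluster, in MWh.
monthly_mwh = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]

slope, intercept = linear_trend(monthly_mwh)
# Project the reading for the month after the last observation.
forecast_next = intercept + slope * len(monthly_mwh)
print(f"Trend: {slope:.2f} MWh per month, next month: {forecast_next:.1f} MWh")
```

A positive slope indicates rising consumption over the observed period; comparing slopes before and after an energy-saving measure is one simple way to assess its effect.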
2. Qualitative Approach:
Case Studies:
o Objective: Provide detailed examples of how different organizations have
implemented energy-saving strategies.
o Methods: In-depth case studies, site visits, and interviews with key
stakeholders.
o Data Sources: Organizational reports, interviews, and on-site observations.
o Outcome: Practical examples and lessons learned from real-world
applications of energy solutions.
3. Mixed-Methods Approach:
Integration of Data:
o Objective: Combine quantitative data with qualitative insights to create a
holistic understanding of energy solutions.
o Methods: Triangulation of data, synthesis of quantitative and qualitative
findings, and integrated analysis.
o Data Sources: A combination of historical data, case studies, expert opinions,
and technology assessments.
o Outcome: A comprehensive framework for evaluating and implementing
cost-effective energy solutions for Big Data storage.
To obtain a thorough understanding of the subject, the research strategy for investigating
energy-efficient Big Data storage options will follow a multi-phase design that combines
several methods. The strategy includes the following key stages:
• Goal: To establish a strong basis by examining previous studies on energy-efficient
technologies, cost-cutting tactics, and environmental effects related to Big Data storage.
• Techniques:
o Gather and examine case studies, industry reports, and academic literature.
o Create a conceptual framework: determine the links between the important factors
(such as energy-efficient hardware, cooling technologies, and renewable energy sources).
• Result:
A Mixed-Methods Research Design is the research strategy chosen for this investigation of
affordable energy options for Big Data storage. This option offers a thorough strategy for
comprehending and resolving the research problem by combining quantitative and qualitative
research approaches.
Because energy usage in Big Data storage systems is complex, the analysis must combine
quantitative data, qualitative data, and contextual insights. A mixed-methods approach
allows a deeper investigation of the issue and therefore a more thorough understanding of
the variables at play.
Triangulation
The research can increase the validity and dependability of its findings by cross-validating its
quantitative and qualitative data sets. This combination of data sources strengthens the
argument for the
3.7. Sampling
Sampling is central to the research on energy-efficient Big Data storage options. To
obtain pertinent data for analysis, a representative sample of businesses, organizations,
and data storage facilities will be selected.
Purposive and stratified sampling techniques will be combined in the sampling strategy for
this study on affordable energy solutions for Big Data storage. Purposive sampling will be
used to select organizations and industry professionals with substantial experience in data
storage and energy management, ensuring that participants offer insights pertinent to the
study's goals.
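The combination of purposive and stratified sampling could be sketched as follows. This is an illustrative assumption rather than the study's defined procedure: the sampling frame, the five-year experience threshold used for the purposive step, and the use of organization size as the stratum are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(candidates, stratum_key, per_stratum, seed=0):
    """Draw up to `per_stratum` random candidates from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for candidate in candidates:
        strata[candidate[stratum_key]].append(candidate)
    sample = []
    for name in sorted(strata):
        group = strata[name]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# Hypothetical sampling frame of organizations.
organizations = [
    {"name": "Org A", "size": "small", "years_experience": 2},
    {"name": "Org B", "size": "small", "years_experience": 6},
    {"name": "Org C", "size": "small", "years_experience": 8},
    {"name": "Org D", "size": "medium", "years_experience": 7},
    {"name": "Org E", "size": "medium", "years_experience": 3},
    {"name": "Org F", "size": "large", "years_experience": 10},
    {"name": "Org G", "size": "large", "years_experience": 12},
]

# Purposive step: keep only organizations with substantial experience
# (the 5-year threshold is an assumption for this sketch).
eligible = [o for o in organizations if o["years_experience"] >= 5]

# Stratified step: one organization per size category.
sample = stratified_sample(eligible, "size", per_stratum=1)
print([o["name"] for o in sample])
```

Filtering first and stratifying second keeps every selected participant relevant to the study's goals while still covering each category of organization.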
Conclusion
The goal of this research is to identify energy-efficient Big Data storage systems that
address high energy consumption while remaining environmentally and economically
sustainable. The study examines cutting-edge technologies and current practices to help
firms optimize their storage systems in an environmentally responsible and effective
manner. The results will guide practical changes that lead to more scalable and
environmentally friendly data storage techniques.
REFERENCES
Zhang, J. et al. (2023) The impact of Big Data on Research Methods in Information Science,
Data and Information Management. Available at:
https://www.sciencedirect.com/science/article/pii/S2543925123000128 (Accessed: 06 August
2024).
Leonelli, S. (2020) Scientific Research and big data, Stanford Encyclopedia of Philosophy.
Available at: https://plato.stanford.edu/entries/science-big-data/ (Accessed: 06 August
2024).
Ryan, E. (2023) Research objectives: Definition & examples, Scribbr. Available at:
https://www.scribbr.com/research-process/research-objectives/ (Accessed: 06 August
2024).
Sreekumar, D. (2024) What are research objectives and how to write them (with examples),
Researcher.Life. Available at:
https://researcher.life/blog/article/what-are-research-objectives-how-to-write-them-with-examples/
(Accessed: 06 August 2024).