
Synopsis "Time Series Geospatial Big Data Analysis Using Array Database"

This document provides a summary of a synopsis for a project on analyzing time series geospatial big data using an array database. It includes an abstract describing the storage and processing of earth observation data as multidimensional arrays in an array database. A literature review covers several papers on related topics. The document identifies the problem of managing large volumes of earth observation data and proposes a methodology using the open source Rasdaman array database system. Tools to be used include Rasdaman and OSGeoLive. A schedule outlines work to be done over 4 months, including literature review, module implementation, paper submission, and final submission.

Uploaded by

saurabh

SYNOPSIS

ON

“TIME SERIES GEOSPATIAL BIG DATA ANALYSIS USING


ARRAY DATABASE”

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING

SUBMITTED BY
Jayati Gandhi

UNDER THE GUIDANCE


OF
Prof. Sathish Kumar Penchala
Asst. Professor in CSE dept.

COMPUTER SCIENCE ENGINEERING DEPARTMENT


G.H.RAISONI COLLEGE OF ENGINEERING
NAGPUR
2019-20
Abstract:-
Over the past few years, Earth Observation (EO) has continuously generated large volumes of
spatiotemporal data that serve society in resource surveillance, environmental
protection, and disaster prediction. The proliferation of EO data poses great challenges for
current approaches to data management and processing. Array database
technologies show great promise in managing and processing EO big data. This paper
proposes storing and processing EO data as multidimensional arrays based on
state-of-the-art array database technologies. A multidimensional spatiotemporal array model is
proposed for EO data, with specific strategies for mapping spatial coordinates to
dimensional coordinates during model transformation. Adopting a unified array model
for EO data allows consistent query semantics and improves in-database computing.
Our approach is implemented as an extension to
Rasdaman, an open-source array database management system that provides flexible, fast,
scalable geo services for multidimensional spatiotemporal sensor, image, simulation, and
statistics data of unlimited volume. In the final step, a Rasdaman UI will allow
the stored multidimensional data to be retrieved, viewed, and queried to obtain the
desired results.
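
The mapping from spatial coordinates to dimensional (array index) coordinates can be illustrated with a minimal Python sketch. It assumes a north-up raster on a regular grid with a constant interval between acquisitions; the names `GridSpec` and `to_index`, the grid origin, the 30 m cell size, and the 16-day step are hypothetical values chosen for illustration and are not taken from the project or from Rasdaman.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple
import math


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical description of a regular, north-up spatiotemporal grid."""
    x_min: float       # easting of the left edge of column 0
    y_max: float       # northing of the top edge of row 0
    cell_size: float   # pixel width and height in map units
    t_start: date      # acquisition date of time index 0
    t_step_days: int   # assumed constant interval between acquisitions


def to_index(grid: GridSpec, x: float, y: float, t: date) -> Tuple[int, int, int]:
    """Map a spatial coordinate and a date to (time, row, col) array indices."""
    col = math.floor((x - grid.x_min) / grid.cell_size)
    row = math.floor((grid.y_max - y) / grid.cell_size)  # rows grow southwards
    t_idx = (t - grid.t_start).days // grid.t_step_days
    if col < 0 or row < 0 or t_idx < 0:
        raise ValueError("coordinate lies before the grid origin")
    return t_idx, row, col


# Illustrative grid: all numbers below are made up for the example.
grid = GridSpec(x_min=300000.0, y_max=2400000.0, cell_size=30.0,
                t_start=date(2014, 1, 1), t_step_days=16)
index = to_index(grid, x=300045.0, y=2399950.0, t=date(2014, 2, 2))
```

Once every scene shares one such grid definition, a spatial or temporal query can be translated into index ranges on the array, which is what makes consistent query semantics possible.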

Literature survey:-

1). Yangming JIANG and Siwen BI, (2008) “Dynamic Object-Oriented Model and its
Applications for Digital Earth”, Digital Earth Summit on Geoinformatics, Nov. 12-14,
2008, Germany.

This paper proposes a dynamic object-oriented model that is deployed to trigger
changes in the digital earth. The model treats a spatiotemporal class as the base
class of four classes: ZeroTObject (ZTO), OneTObject (OTO), TwoTObject (TTO), and
ThreeTObject (THTO), where ZTO is a temporal node, OTO is a temporal arc, TTO is a
temporal polygon, and THTO is a temporal cube.

2) “An Approach for Assessing Array DBMSs for Geospatial Raster Data” by Janne
Kovanen, Ville Makinen, and Tapani Sarjakoski, GEOProcessing 2018: The Tenth
International Conference on Advanced Geographic Information Systems, Applications,
and Services

This paper describes an approach for assessing the capabilities of Array Database
Management Systems (DBMSs) in managing and processing raster data.
It presents a framework that can be used to compare the functionality of Array
DBMSs and to benchmark them. The main feature of the framework is that it assesses
functionality using both targeted test cases and benchmarking. The experience gained is
then used to assess non-functional characteristics drawn from existing quality
models. The framework can be extended with further
DBMSs, benchmarks, and additional hardware resources. The assessment was first
implemented for the community editions of SciDB and Rasdaman. The study presents
some key initial observations regarding these two Array DBMSs.

3) “Geo-Spatial Big Data Analysis: An Overview” by C. Kamali and Gethsiyal Augasta

With increasing interest in large-scale, high-resolution, real-time geographic
information system (GIS) applications and spatial big data processing, traditional GIS are
not efficient enough because of their limited computational capabilities. Geospatial
analytics on big data needs new approaches that are flexible, non-parametric, and
capable of dynamic modeling of non-linear processes. Compared to general big data,
the distinguishing feature of geographical big data is Spatiotemporal Association Analysis (SAA),
which is used to scrutinize the data. This analysis covers the vital elements of
geometrical relations, statistical correlations, and semantic relations that support effective
decision-making and predictive, measurement-based solutions. The main
aim of the paper is therefore to study and review Spatiotemporal Association Analysis (SAA) in three
aspects: measurement (observation) adjustment of geometrical quantities, human
spatial behavior analysis with trajectories, and data assimilation of physical models and
various observations.

4) “Evaluating the Open Source Data Containers for Handling Big Geospatial Raster
Data” by Fei Hu and Mengchao Xu.

This paper provides a comprehensive evaluation of six popular data containers (i.e.,
Rasdaman, SciDB, Spark, ClimateSpark, Hive, and MongoDB) for handling
multidimensional, array-based geospatial raster datasets. Their architectures, technologies,
capabilities, and performance are compared and evaluated from two perspectives: (a)
system design and architecture (distributed architecture, logical data model, physical data
model, and data operations); and (b) practical use experience and performance (data
preprocessing, data uploading, query speed, and resource consumption). Four major
conclusions are offered: (1) no data containers, except ClimateSpark, have good support
for the HDF data format used in the paper, so time- and resource-consuming data
preprocessing is required to load data; (2) SciDB, Rasdaman, and MongoDB handle queries on
small to medium volumes of data well, whereas Spark and ClimateSpark can handle large volumes
of data with stable resource consumption; (3) SciDB and Rasdaman provide mature
array-based data operation and analytical functions, while the others lack these functions
for users; and (4) SciDB, Spark, and Hive have better support for user-defined functions
(UDFs) to extend the system capability.

5) “The Australian Geoscience Data Cube – Foundations and lessons learned” by Adam
Lewis and Simon Oliver.

The Australian Geoscience Data Cube (AGDC) aims to realize the full potential of Earth
observation data holdings by addressing the Big Data challenges of volume, velocity, and
variety that otherwise limit the usefulness of Earth observation data. There have been
several iterations and AGDC version 2 is a major advance on previous work. The
foundations and core components of the AGDC are: (1) data preparation, including
geometric and radiometric corrections to Earth observation data to produce standardized
surface reflectance measurements that support time-series analysis, and
collection management systems which track the provenance of each Data Cube product
and formalize re-processing decisions; (2) the software environment used to manage and
interact with the data; and (3) the supporting high performance computing environment
provided by the Australian National Computational Infrastructure (NCI).

A growing number of examples demonstrate that the data cube approach allows analysts
to extract rich new information from Earth observation time series, including through
new methods that draw on the full spatial and temporal coverage of the Earth observation
archives. To enable easy uptake of the AGDC, and to facilitate future cooperative
development, the code is developed under the open-source Apache License, Version 2.0.
This open-source approach is enabling other organizations, including the Committee on
Earth Observation Satellites (CEOS), to explore the use of similar data cubes in developing
countries.

Problem identification:-
Traditional storage for EO data uses various kinds of files, such as Network Common
Data Form (NetCDF) for atmospheric and hydrological sciences, and GeoTIFF and
Hierarchical Data Format (HDF) for remote sensing images. These specially designed
data formats work quite well when the amount of data is not very large. However, issues
start to arise as data volumes increase. The most obvious problem is that it
is not easy to retrieve and query the information needed. To solve this problem, an array
database is designed and implemented as a common database service offering flexible
and scalable storage and retrieval of large volumes of multidimensional array data, such
as sensor, image, simulation, or statistics data. Array databases have attracted extensive attention from
data scientists in academia and industry.
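
To make the retrieval problem concrete, the following Python sketch models a time series of rasters as one three-dimensional array indexed by (time, row, column) and shows two typical array retrieval operations: trimming a sub-region and extracting the time series of a single pixel. The nested-list cube and the helper names `trim` and `pixel_series` are hypothetical illustrations and do not represent the Rasdaman API.

```python
from typing import List

Cube = List[List[List[int]]]  # indexed as cube[t][row][col]


def trim(cube: Cube, t0: int, t1: int, r0: int, r1: int, c0: int, c1: int) -> Cube:
    """Return the sub-cube with inclusive bounds on every axis."""
    return [
        [row[c0:c1 + 1] for row in frame[r0:r1 + 1]]
        for frame in cube[t0:t1 + 1]
    ]


def pixel_series(cube: Cube, row: int, col: int) -> List[int]:
    """Return the values of one pixel across all time steps."""
    return [frame[row][col] for frame in cube]


# Toy cube: 3 time steps of a 2x3 raster, value = 100*t + 10*row + col.
cube = [[[100 * t + 10 * r + c for c in range(3)] for r in range(2)]
        for t in range(3)]

sub = trim(cube, t0=1, t1=2, r0=0, r1=0, c0=1, c1=2)
series = pixel_series(cube, row=1, col=2)
```

With per-date files, the same two requests would require opening every file and reading the relevant window from each; with a single array they reduce to index ranges.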

Methodology:-

The main aim of this project is to identify the change in water body area over the last 10
years. A platform for geospatial data management and analysis is first configured using
the open-source tool Rasdaman. After configuration, time series satellite data for
water detection are downloaded and ingested into the database, where rasql queries are executed.
Proper metadata for the images are prepared and stored in the database. The images are taken
from Landsat 8, an American Earth observation satellite that records 11
spectral bands. Each band has different applications, such as coastal and aerosol studies, peak
vegetation, detection of cloud contamination, and water detection. The main purpose of
the bands is to monitor the earth and keep track of changes on the planet’s surface.
Different algorithms are then implemented to extract information from the time series
data. Finally, a web application is developed to query and
visualize the results.
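
As a hedged illustration of how water body area change might be computed, the Python sketch below classifies pixels with the Normalized Difference Water Index, NDWI = (Green - NIR) / (Green + NIR), and compares the water area between two years. The synopsis does not name the algorithm, so NDWI, the threshold of 0.0, the 30 m pixel size, and the toy reflectance values are all assumptions; on Landsat 8 the green and near-infrared inputs would correspond to bands 3 and 5.

```python
from typing import Dict, List

Band = List[List[float]]  # one 2-D band as rows of reflectance values

PIXEL_AREA_M2 = 30.0 * 30.0  # assumed 30 m x 30 m pixel


def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index for a single pixel."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def water_area_km2(green: Band, nir: Band, threshold: float = 0.0) -> float:
    """Area of pixels whose NDWI exceeds the (assumed) threshold."""
    water_pixels = sum(
        1
        for g_row, n_row in zip(green, nir)
        for g, n in zip(g_row, n_row)
        if ndwi(g, n) > threshold
    )
    return water_pixels * PIXEL_AREA_M2 / 1_000_000


def area_change(series: Dict[int, float]) -> float:
    """Water area of the latest year minus that of the earliest year."""
    years = sorted(series)
    return series[years[-1]] - series[years[0]]


# Toy 2x2 scenes as (green, nir) pairs; made-up values, not real Landsat data.
scenes = {
    2014: ([[0.30, 0.28], [0.10, 0.09]], [[0.05, 0.06], [0.40, 0.35]]),
    2019: ([[0.30, 0.12], [0.10, 0.09]], [[0.05, 0.38], [0.40, 0.35]]),
}
areas = {year: water_area_km2(g, n) for year, (g, n) in scenes.items()}
change_km2 = area_change(areas)  # negative value means the water area shrank
```

In the proposed platform the per-pixel index and the pixel count would more naturally be evaluated inside the database, so that only the resulting areas are returned to the web application.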
Tools/Software Used:-

1. Rasdaman
2. OSGeoLive

Proposed schedule of work:-

Month        Proposed work
July         Literature survey
August       Implementation of modules
September    Paper submission
October      Final submission and result

References:-

1). Yangming JIANG and Siwen BI, (2008) “Dynamic Object-Oriented Model and its
Applications for Digital Earth”, Digital Earth Summit on Geoinformatics, Nov. 12-14,
2008, Germany.

2). “An Approach for Assessing Array DBMSs for Geospatial Raster Data” by Janne
Kovanen, Ville Makinen, and Tapani Sarjakoski, GEOProcessing 2018: The Tenth
International Conference on Advanced Geographic Information Systems, Applications,
and Services.

3). “Geo-Spatial Big Data Analysis: An Overview” by C. Kamali and Gethsiyal Augasta.

4). “Evaluating the Open Source Data Containers for Handling Big Geospatial Raster
Data” by Fei Hu and Mengchao Xu.

5). “The Australian Geoscience Data Cube – Foundations and lessons learned” by Adam
Lewis and Simon Oliver.

Signature of student Signature of Guide
