Waemars 2016
Abstract-This paper lists the basic knowledge required to become a data warehouse artist, who should have basic data warehouse knowledge in order to understand a data warehouse schema picture in the same way that a picture or painting artist sees an artwork. This paper does not discuss data warehouse personnel such as the data warehouse development advisor, data warehouse consultant, data warehouse architect, data warehouse developer or any other job related to data warehouses. It only discusses how a person can become a data warehouse artist who enjoys looking at many database model design pictures, particularly data warehouse schema pictures, and enjoys spending much time in front of those pictures. Moreover, a good datawarehouser or data warehouse artist should be able to present their data warehouse pictures not only as usual and boring pictures, but treat them as artworks in order to increase the audience's engagement. Furthermore, having the knowledge and ability to build and develop a data warehouse is added value for a data warehouse artist. Thus, a data warehouse artist can recognize and distinguish each database model picture as a database design model or a data warehouse model and see them as science art.

Keywords-Data Warehouse; Data Warehouse artist; Datawarehouser

I. INTRODUCTION

When teaching the data warehouse topic in a computer science class, there are many interesting questions from students, such as: what is the purpose of this subject, do we need to implement it, what skills do we need, what if I hate databases, how can I understand data warehouses, and so on. In class, I always tell students that they should follow their interests based on what they have learned in computer science and, beyond that, that the joy of understanding something is the most important. I tell them that they do not need to become data warehouse experts, but should at least become an artist, recognized as a datawarehouser or data warehouse artist, who can understand database/data warehouse model design pictures, how to design them, where the source data came from, whether a picture shows a database or a data warehouse, and many other pieces of data warehouse knowledge.

Pictures or paintings as artworks can be seen all around us, at the office, at home, in a gallery or in any other place and in any media, including digital media, and how much time you spend in front of an artwork shows who you are. The more time people spend in front of an artwork, the more it shows that they are interested in it and have knowledge of it. Conversely, the less time they spend, the more it shows that they are uninterested and do not understand that artwork. When a picture or painting artist sees an artwork, they have the knowledge and imagination of how that artwork was created, the process, the meaning inside it, the style, whether it is fake or real, who created it, its similarity to the creator's previous artworks, and so on. Someone who has an understanding of an artwork will be keen to spend much time in front of it; otherwise, they just look at it as nothing.

Meanwhile, an artist is a person who is skilled at some activity, and the Oxford English Dictionary defines the older, broad meaning of artist as one who pursues a practical science or a follower of a manual art, such as a mechanic [3]. Moreover, artist can be used loosely or metaphorically to denote highly skilled people in any non-art activity [3]. Clearly, the word artist does not apply only to people who are involved in art; beyond that, an artist is a person who is an expert in their field and turns their activity into an artwork. The term datawarehouser parallels terms from other arts, such as painter in painting and carver in sculpture. Thus, a datawarehouser is a data warehouse artist who can understand database pictures, particularly data warehouse schema pictures, spends a lot of time in front of those pictures and enjoys looking at them.

Fig. 1. Example picture of Data Warehouse schema for higher education [1,2]
2206 2016 IEEE Region 10 Conference (TENCON) - Proceedings of the International Conference
3) SQL statement.

1) Structured and unstructured data.
When dealing with data, the data warehouse artist should know what kind of data they are dealing with. Structured data is a representation of text data in a matrix/two-dimensional form, with columns/attributes/fields and rows/tuples/records. On the other hand, unstructured data is a representation of data in a non-matrix/non-two-dimensional form, which includes multimedia data types such as text, image, sound, video and animation. Most data warehouses deal with structured data; however, some data warehouses process unstructured data [4,6]. Since it is represented in columns and rows, structured data is easy to process, and unstructured data is easy to process as well once it has been transformed into structured data [4,5].

Most data warehouses use structured data, although some use unstructured data [4,6], and the father of the data warehouse suggested an 11-step approach to building an unstructured data warehouse [7]. Apart from structured and unstructured data, there is semi-structured data, which is recognizable by its eXtensible Markup Language (XML) format and is usually grouped with structured data. Structured data is saved in storage either as a text file or in an RDBMS.

2) Normalized and non-normalized database.
Normalized and non-normalized databases are representations of structured data, and are usually used for transactional and reporting purposes respectively. Normalized databases in transactional applications/Online Transactional Processing (OLTP)/Transactional Processing Systems (TPS) are formed in third normal form, where each non-primary-key attribute in a table should depend fully on the primary key. However, a transactional application may use a non-normalized database for performance purposes. The differences between normalized and non-normalized databases are as follows:
• A normalized database has non-redundant records, while a non-normalized database has redundant records.
• A normalized database includes many tables, while a non-normalized database includes fewer tables.
• Because a normalized database includes many tables, it requires many joins in Structured Query Language (SQL) statements, while a non-normalized database requires fewer joins.
• Because a normalized database requires many joins, SQL performance decreases, while fewer joins increase SQL performance.
Clearly, transactional applications use normalized databases while reporting applications use non-normalized databases, since a non-normalized database needs fewer joins and therefore gives better SQL performance than a normalized one.

3) SQL statement.
SQL statements are an important part of accessing a data warehouse or database as structured data, and consist of Data Definition Language (DDL) and Data Manipulation Language (DML). DDL is used to define the database structure or schema and includes SQL statements such as create/alter/rename/drop/truncate table, and so on. Meanwhile, DML is used to search, add, update and delete data with SQL statements such as select from where, insert into, update set where, delete from where, merge, and so on. When creating data warehouse tables, DDL statements are applied; on the other hand, DML statements are used to access data warehouse tables.

DML statements are used by data warehouse processes such as the ETL process when inserting/updating records in data warehouse tables and when creating permanent or temporary tables for multidimensional purposes, reporting or Online Analytical Processing (OLAP) analysis. Meanwhile, the non-volatile concept in data warehouse theory means that once data is recorded in the data warehouse, it cannot be updated. However, later developments of data warehouse theory introduced the destructive merge and constructive merge concepts, under which data in the data warehouse can be updated. Destructive merge and constructive merge are concepts where data is updated without or with track-record recording respectively, where track-record recording stores the update date in a new field/attribute/column added to the table.

C. Data Warehouse basic Knowledge.
Finally, a datawarehouser should know basic data warehouse knowledge, such as data warehouse characteristics like subject-oriented, integrated, time-variant, non-volatile, summarized, non-normalized, metadata, and so on. A datawarehouser must recognize and be able to create fact, dimension and sub-dimension tables, where dimension tables normalize the fact table, while sub-dimension tables normalize the dimension tables. The more dimension tables there are, the more normalized the fact table is, and the more sub-dimension tables there are, the more normalized the dimension tables are. Thus a fact table without dimension tables is regarded as a non-normalized database, while a fact table with dimension tables acts as a normalized fact table. One fact table can produce multidimensional views and one or more reports.

Moreover, a datawarehouser must know about data warehouse architectures such as client-server, two-tier or three-tier, and possibly the real-time data warehouse. They should also know data warehouse architectures such as primitive, centralized and federated. Furthermore, they should know concepts such as the data mart, including the independent data mart and dependent data mart architectures. OLAP services such as MOLAP, ROLAP and HOLAP should also be understood, including OLAP operations such as roll up, drill down, slice and dice. Finally, this basic data warehouse knowledge consists of several sub-topics:
1) Data warehouse schema.
2) ETL process.
3) Data warehouse methodology.
4) From Data Warehouse to OLTP/TPS vice versa
5) Data Warehouse cortex.
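The DDL/DML distinction and the trade-off between normalized and non-normalized databases described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table names (department, student, student_report) and the sample rows are hypothetical and are not taken from the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema. The first two tables are normalized (OLTP style);
# student_report is a non-normalized reporting table (hypothetical names).
cur.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL,
    dept_id      INTEGER NOT NULL REFERENCES department(dept_id)
);
CREATE TABLE student_report (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL,
    dept_name    TEXT NOT NULL  -- redundant: repeated on every row
);
""")

# DML: add records to the normalized tables.
cur.executemany("INSERT INTO department VALUES (?, ?)",
                [(1, "Computer Science"), (2, "Mathematics")])
cur.executemany("INSERT INTO student VALUES (?, ?, ?)",
                [(10, "Ani", 1), (11, "Budi", 1), (12, "Citra", 2)])

# DML: fill the reporting table once, paying the join cost at load time.
cur.execute("""
INSERT INTO student_report
SELECT s.student_id, s.student_name, d.dept_name
FROM student s JOIN department d ON d.dept_id = s.dept_id
""")

# Report on the normalized tables: a join is needed.
normalized_rows = cur.execute("""
SELECT d.dept_name, COUNT(*)
FROM student s JOIN department d ON d.dept_id = s.dept_id
GROUP BY d.dept_name ORDER BY d.dept_name
""").fetchall()

# Same report on the non-normalized table: no join is needed.
report_rows = cur.execute("""
SELECT dept_name, COUNT(*)
FROM student_report
GROUP BY dept_name ORDER BY dept_name
""").fetchall()

print(normalized_rows)
print(report_rows)
```

Both queries return the same counts per department; the difference is that the reporting table stores the department name redundantly so that the report query avoids the join.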
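The fact and dimension tables and the roll-up and slice operations mentioned above can be sketched in the same way. This is only an illustrative example using Python's sqlite3 module; the tables (dim_time, dim_department, fact_enrollment), the measure column and all values are assumptions made for the example, not data from the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table with two dimension tables (hypothetical higher-education example).
conn.executescript("""
CREATE TABLE dim_time (
    time_id  INTEGER PRIMARY KEY,
    year     INTEGER,
    semester INTEGER
);
CREATE TABLE dim_department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT
);
CREATE TABLE fact_enrollment (
    time_id     INTEGER REFERENCES dim_time(time_id),
    dept_id     INTEGER REFERENCES dim_department(dept_id),
    enrollments INTEGER  -- the measure
);
INSERT INTO dim_time VALUES (1, 2015, 1), (2, 2015, 2), (3, 2016, 1);
INSERT INTO dim_department VALUES (1, 'Computer Science'), (2, 'Mathematics');
INSERT INTO fact_enrollment VALUES
    (1, 1, 120), (2, 1, 110), (3, 1, 130),
    (1, 2, 40),  (2, 2, 35),  (3, 2, 45);
""")

# Roll up: aggregate the measure from semester level to year level.
roll_up = conn.execute("""
SELECT t.year, SUM(f.enrollments)
FROM fact_enrollment f JOIN dim_time t ON t.time_id = f.time_id
GROUP BY t.year ORDER BY t.year
""").fetchall()

# Slice: fix one value of the time dimension (year 2016) and view the rest.
slice_2016 = conn.execute("""
SELECT d.dept_name, f.enrollments
FROM fact_enrollment f
JOIN dim_time t ON t.time_id = f.time_id
JOIN dim_department d ON d.dept_id = f.dept_id
WHERE t.year = 2016
ORDER BY d.dept_name
""").fetchall()

print(roll_up)
print(slice_2016)
```

Drill down is the reverse of the roll-up query (grouping by year and semester instead of year only), and dice would restrict more than one dimension at once in the WHERE clause.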
1) Data warehouse schema.
There are 3 popular data warehouse schemas: star, snowflake and fact constellation/galaxy.
• A star schema is a data warehouse schema that consists of 1 fact table and 0 or more dimension tables, without sub-dimension tables.
• A snowflake schema is a data warehouse schema that consists of 1 fact table and 1 or more dimension tables with 1 or more sub-dimension tables.
• A fact constellation or galaxy schema is a data warehouse schema in which more than 1 fact table shares 1 or more dimension or sub-dimension tables.
The fact constellation or galaxy schema is the favorite data warehouse implementation, while the star or snowflake schema is usually implemented at the beginning of a data warehouse implementation.

2) ETL process.
ETL is a process that consists of 3 activities: Extraction, Transformation and Loading.
• Extraction is a process to identify the sources of data, internal, external or both, and to define the extraction procedures.
• Transformation is a process that includes sub-activities such as cleaning the data, integrating and consolidating the data, and converting and manipulating the data.
• Loading is a process to load data from OLTP into the data warehouse tables.
There are ETL software applications such as Geokettle, CloverETL, GETL, Apatar, EpISiteETL, KETL, ETL Converter, openDigger, common-etl, and so on. However, you can write your own ETL as a procedure, created in any programming language, that consists of DML SQL statements which access the OLTP database and move the data into the data warehouse.

3) Data warehouse methodology.
A methodology is the step-by-step way to realize a concept, and here refers to the software development methodologies of software engineering, such as waterfall, spiral, Rational Unified Process (RUP), Agile, and so on. One recognized data warehouse methodology is the Kimball lifecycle methodology, and many other data warehouse methodologies have been proposed, such as data-driven, user-driven and goal-driven [8]. Moreover, data warehouse development can choose between Bill Inmon's approach, which is known as the Enterprise Data Warehouse (EDW) or top-down approach (hub-and-spoke architecture), and Ralph Kimball's approach, which is known as the data mart or bottom-up approach (bus architecture). Thus, since data warehouse development is part of building application software for decision-making purposes, it can use software development methodologies such as waterfall, spiral, RUP, Agile, and so on. The choice of software development methodology depends on each software developer's needs and skills.

4) From Data Warehouse to OLTP/TPS vice versa
A datawarehouser who observes database or data warehouse schema pictures should understand, and be able to imagine, how the design was made and the purpose of the picture. For example, if they see a database picture, they can imagine how to make that design, what simple data warehouse schema picture can be created from it and what kinds of reports can be generated. On the other hand, when they see a data warehouse schema picture, they can imagine how to make that design, what the source OLTP database design for that picture is and what kinds of reports can be generated.

5) Data Warehouse cortex.
A data warehouse is not software or hardware that needs to be bought or installed; it is a technology for improving our database environment. A data warehouse is a technology that supports the decision-making process or intelligent applications. These applications form the data warehouse cortex: they use the data warehouse as their intelligent machine, and include technologies such as Decision Support System (DSS), Executive Information System (EIS), Dashboard, Business Intelligence (BI), Knowledge Management (KM), Big Data, and so on.

III. CONCLUSION

It is important for a datawarehouser to have the basic knowledge to understand why a data warehouse needs to be implemented. The advantages of data warehouse implementation are as follows:
1. Separating the transactional/OLTP/TPS process from the reporting/OLAP process increases performance.
2. Having a non-normalized database in the data warehouse reduces the time needed to create reports, since a non-normalized database has fewer tables and joins, which increases SQL performance.

REFERENCES
[1] S. Warnars, "Perbandingan penggunaan database OLTP dan data warehouse," Creative Communication and Innovative Technology (CCIT), vol. 8, no. 1, pp. 83-100, September 2014.
[2] S. Warnars, "Tata kelola database perguruan tinggi yang optimal dengan data warehouse," Telkomnika, vol. 8, no. 1, pp. 25-34, April 2010.
[3] Wikipedia. (2016, March 10). Artist [Online]. Available: https://en.wikipedia.org/wiki/Artist. Accessed Apr. 10, 2016.
[4] D. Joa et al., "Unstructured data integration with a data warehouse," U.S. Patent 8 290 951, Oct. 16, 2012.
[5] H.G. Miller and P. Mork, "From data to decision: A value chain for big data," IEEE IT Professional, vol. 15, issue 1, pp. 57-59, Feb. 2013.
[6] J. Zheng and X. Li, "Research and application on unstructured data sharing in education information integration," in Proc. 7th Int. Conf. on Inf. Tech. in Medicine and Education (ITME), 2015, pp. 529-532.
[7] W.H. Inmon and K. Krishnan, "Building the unstructured data warehouse: Architecture, Analysis and Design," 1st ed. New Jersey, USA: Technics Publications, 2011.
[8] Y. Guo, S. Tang, Y. Tong and D. Yang, "Triple-driven data modeling methodology in data warehousing: a case study," in Proc. of 9th Int. Work. on Data Warehousing and OLAP, 2006, pp. 59-66.