UAE Smart Data Framework EN - Part 2 Implementation Guide
TABLE OF CONTENTS
Introduction
Guidance Note 1: Establishing Data Governance Roles and Processes
Guidance Note 2: Building a Smart Data Roadmap
Guidance Note 3: Developing a Data Inventory
Guidance Note 4: Prioritization Criteria and Process
Guidance Note 5: Data Conformance Process
INTRODUCTION
Context
This Smart Data Implementation Guide forms part of the UAE’s Smart Data Framework, as illustrated
below.
The Smart Data Framework outlines a common basis for each UAE Government Entity to develop its
own approach to managing data, in ways that give the Entity maximum flexibility to respond to its
own business needs while also enabling a common approach to data classification, exchange of
data, and data quality.
This Smart Data Implementation Guide, structured in a series of five Guidance Notes, provides
guidance, best practice and recommended processes for Government Entities to follow to ensure
they meet the requirements set out in the Principles and Standards of the Framework.
Overview
The diagram on the following page illustrates a typical process that an Entity might go through,
supported by the five Guidance Notes, to implement the Smart Data Framework in a phased process
over time:
1. Establish the Entity’s data governance roles and processes
2. Build a roadmap to set out the key data management and change management actions that will
need to be taken across the Entity
3. Map out key datasets within an Entity-wide Data Inventory (if that does not already exist)
4. Prioritize which datasets need action first in terms of applying the core Data Standards
5. Implement the Data Standards conformance process through a series of ‘sprints’ through which
datasets are aligned with the Standards in a phased and prioritized process over time.
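As an illustration only, steps 3 and 4 above could be tracked with a simple record structure per dataset. The sketch below is hypothetical: the field names, scales and scoring rule are not part of the Framework, which leaves prioritization criteria to Guidance Note 4.

```python
from dataclasses import dataclass

@dataclass
class DatasetRecord:
    """One entry in a hypothetical Entity-wide Data Inventory."""
    name: str
    custodian: str       # Data Custodian responsible for the dataset
    business_value: int  # 1 (low) .. 5 (high) -- illustrative scale
    readiness: int       # 1 (raw) .. 5 (already standards-aligned)

def priority_score(record: DatasetRecord) -> int:
    """Toy prioritization rule: high value plus low readiness means act first."""
    return record.business_value * (6 - record.readiness)

# Hypothetical example datasets, for illustration only
inventory = [
    DatasetRecord("road-traffic-counts", "Transport Unit", business_value=5, readiness=2),
    DatasetRecord("office-locations", "Facilities", business_value=2, readiness=4),
]
ordered = sorted(inventory, key=priority_score, reverse=True)
print([r.name for r in ordered])  # ['road-traffic-counts', 'office-locations']
```

In practice an Entity would replace this toy score with the prioritization criteria it adopts under Guidance Note 4; the point is simply that the inventory should carry enough structured information to rank datasets for action.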
This process is a recommended not mandatory one. An individual Entity may decide to follow a
different approach in some areas, provided that this still results in alignment with the UAE Smart
Data Principles and conformance with the UAE Smart Data Standards.
GUIDANCE NOTE 1: ESTABLISHING DATA GOVERNANCE ROLES AND
PROCESSES
Purpose This Guidance Note provides Entities with guidance on governance roles and
processes to support implementation of the UAE Smart Data Framework.
When to use At the outset of each Entity’s Smart Data program
Responsibility Entity Management Board
Overview
This Guidance Note recommends good practice on data governance roles and processes, as a guide
for Entities then to tailor to their specific needs. It provides guidance in turn on:
A recommended process for establishing key data governance roles, identifying suitable
candidates and growing expertise over time
Sample job descriptions with the responsibilities and skills recommended for the key roles.
[Process diagram: key governance roles (Director General and Management Board of the Federal
Government Entity; Director of Data; Data Management Officer; Data Custodians and Specialists)
and seven steps:
1. Review and commit to UAE Smart Data Principles at board level
2. Set up initial central Data Team
3. Identify suitable numbers of Data Custodians and Data Specialists
4. Establish broader governance roles, relationships and processes
5. Capacity building and training
6. Facilitate ongoing consultation, workshops and reviews
7. Continuous improvement]
1. Review and commit to the UAE Smart Data Principles at board level
The UAE Smart Data Framework is rooted in a set of guiding principles, which are summarized
below and described in more detail in Part 1 of the Smart Data Framework: Principles and
Standards. The principles for smart data that every Entity should embed within its own governance
systems and business processes cover the following topics.
A key starting point should be for the top management team of each Entity to review and sign
up to these principles at Board level, and to identify a senior, empowered member of the Board to
be accountable for leading the Entity’s work to implement them.
2. Set up initial Smart Data team
It is the responsibility of each Government Entity to decide how best it will operationalize the
principles and standards of the UAE Smart Data Framework within the Entity, and this includes the
choice of data governance roles. The right approach to staffing will vary from Entity to Entity,
dependent on current levels of maturity of data management within the Entity, on the scale of the
Entity’s operations and on how important data is to the functions of the Entity.
However, it is recommended that each Entity establish at least the following roles or their
equivalent:
Director of Data (DD): a senior and empowered staff member, who will lead the Entity’s
Data program, champion and promote data management processes and effective data
publication and exchange and ensure strategic goals are realised. Ideally, the Director of
Data should be a member of the Entity’s management board; as a minimum, they should be
a senior and empowered individual with an ability to rapidly escalate key risks and issues for
resolution at the highest levels in the Entity. For smaller Entities this role might be
performed on a part-time basis, for example by an existing member of staff but with
additional assigned responsibilities.
Data Management Officer (DMO): to report to and deputize for the DD, and to lead the
operational work of managing and coordinating the change management and processes
required to ensure conformance with the Smart Data Framework standards. This is an
important, full-time role that requires the person responsible to spend a significant part
of their time on Smart Data Standards conformance.
A suitable number of Data Custodians and Data Specialists: to act as business and technical
owners of key datasets and data sources within the Entity. They will understand the
contents and business value of the data, how it was collected and processed, and the
accuracy and quality of the data. These could be existing data owners and IT staff with new
responsibilities.
Given that each Entity is different and will have varying existing data infrastructure and processes,
the specific setup must be at the discretion and judgement of the Entity itself. The Entity can choose
to create new positions for each of these roles or add the titles and responsibilities to existing roles.
The important thing is that the business functions and responsibilities of these roles are carried out
by suitably informed and skilled staff.
We recommend that the appointment of a Director of Data and Data Management Officer are the
first positions filled. The people who fill these roles require a background in data management,
government operations and digital technology – they will lead the Entity’s smart data program and
ensure the Entity’s data is published and exchanged in line with the requirements of the Smart Data
Framework. A background or knowledge of open data is an advantage but not essential. Detailed
job descriptions are given at the end of this Guidance Note.
Data Custodian and Data Specialist roles should be filled across the business units by suitably
qualified personnel (individuals who know about parts of the Entity’s data systems). Depending on
the size of the Entity, there could be many Custodians who report to a department or business unit
Data Custodian, who in turn reports to the Data Management Officer.
5. Capacity Building
Custodians and Specialists should receive training and guidance to help them understand their role
and responsibilities. They should read the Standards and Implementation Guide of the Smart Data
Framework. We also recommend discussing how best to implement the Smart Data Principles with
key representatives of each of the Entity’s business functions. Facilitating such internal discussion is
a key role for the Data Management Officer.
7. Continuous improvement
As the Entity’s data maturity and business processes develop and as the Custodians, Specialists and
Data Management Officer run through multiple sprints of formatting and cataloguing data to ensure
their conformance with the Smart Data Standards, the Entity should continue to refine the roles and
responsibilities of its data staff. The Director of Data should keep the effectiveness of governance
arrangements under review, agreeing changes with the Entity’s Management Board as required.
Role Descriptions
The tables below give more detail on the key recommended data management roles for a typical
Government Entity, setting out in turn:
An overview of the role profile
The key responsibilities that the role needs to manage
The skills and competencies needed to do this effectively
Director of Data
Role profile
The Director of Data is the senior champion and leader for data within the Entity. They
are responsible both for communicating the social, economic and business benefits of
open data and data exchange as well as ensuring the Entity’s conformance with the
Smart Data Framework standards.
They should be responsible for developing the data strategy and policies relevant to the
Entity, and for supervising the execution of initiatives that manage data efficiently,
enable safe, secure and reliable data exchange between government entities, improve
methods of service delivery and utilization, and make data available as open data to
stimulate innovation.
The role should be fulfilled by a senior employee, with the necessary influence and
authority within the Entity to be effective in the role. This person will also be outward
facing, collaborating and communicating with external stakeholders. The Director of
Data will be the senior point of contact between the Entity and the Federal Data
Management Office, responsible for communications, coordination and escalation.
Key responsibilities
Leadership
Overseeing the development of the Entity’s implementation plan and roadmap
for meeting the Smart Data Standard requirements, and directing delivery of
that plan
Leading a program of cultural change within the Entity aimed at embedding the
Smart Data Principles within the Entity, promoting the new ways of working
and championing the benefits of higher data quality and data exchange
Performing public outreach and presentations to increase the strategic use of
the Entity’s datasets
Leading the Entity’s open data initiative
Leading the Entity’s work on benefit realization: ensuring that the benefits of
open and shared data are maximized, through high levels of adoption and
utilization of the data to improve services and decision-making.
Governance
Putting in place the necessary roles (with the appropriate skills) within their
Entity, as outlined in this document
Providing regular reports and conformance information as requested to the
Federal Data Management Office
Improving the collection, usage and exchange of data.
Conformance
Ensuring that the Entity's data is consistent with the applicable laws and
policies of data in the UAE, and meets the mandatory requirements of the
Smart Data Framework
Reviewing classified and catalogued datasets to check they are conformant
with the Smart Data Framework standards, and approving them for publication
and exchange with other Entities
Reviewing data quality reports and statistics
Ensuring timely and effective response to queries and feedback from the public
in relation to the Entity's Open Datasets
Investigating any complaints made in relation to the Entity's Datasets by the
Federal Data Management Office, a data user or a member of the public.
Skills and competencies
Ability to communicate effectively, including ability to explain technical content
to non-technical audiences
Ability to collaborate and network with subject matter experts, organizations,
and individuals to provide effective enterprise data management
Experience in technology and data, developing data strategies and overseeing
the improvement of data quality and data exchange
Ability to formulate and set goals
Proven ability to lead cross-functional teams at all organizational levels in
dealing with complex issues
Data Management Officer
Role profile
This role is the delivery and operational lead for the Entity’s data program, reporting to
the Director of Data. They will need to deliver and manage much of the work to ensure
the Entity is conformant with Smart Data Framework standards. Their role involves
ensuring the right staff are selected as Data Custodians and Data Specialists and
directing, supporting and reviewing their work on Inventories, Prioritization,
Classification and Data Conformance.
They are responsible for ensuring the readiness, reliability and security of the Entity’s
data and the accuracy of its metadata, and for making the data available and accessible
in a timely manner to support the operations of the Entity and guide the analysis of data
for decision-making.
Key responsibilities
Leadership
Effective implementation and oversight of all the data management initiatives
and processes needed to deliver on the requirements of the Smart Data
Framework in the Entity
Providing support and advice to the Data Custodians within their business unit
when classifying data within the scope of their management and assessment
of risks associated with disclosure or exchange
Supporting the Director of Data to ensure that the benefits of open and shared
data are maximised and participating in the development of the Entity’s data
roadmap
Providing mentorship and professional development for staff
Governance
Determining priorities with respect to Open Data publication or Shared Data
exchange.
Co-ordinating the work of the business unit as it prepares its open and shared
data for publication and exchange
Supporting resolution of any issues and problems in data or in conformance to the
data standards
Conformance
Preparing regular reports and conformance information for the DD and the Federal
Data Management Office, as requested
Administering the process of inventorying, prioritising, cataloguing and
classifying the Entity’s data
Cascading knowledge about classification principles and procedures to
required roles within their Entity
Ensuring that data is consistent, of high quality and fit for purpose across the Entity
Reducing data duplication across the Entity
Skills and competencies
Ability to coordinate and manage the work of a large and diverse team
Ability to collaborate with senior management of Business Units, functional
organizations and individuals to provide effective enterprise data management
Ability to provide data object domain insight and direction to Data Custodians
Experience in data management and data processes in all its aspects
Displays understanding of all business processes dependent on data in their
object domain
Proven ability to create presentations and effectively present to management
Data Custodian
Role profile
There should be a Data Custodian for each dataset, database or data-generating function
within the Entity. This person needs to understand the value and risks associated with
their data so that they can effectively prioritise, classify and catalogue it. They will be
responsible for determining whether the data should be Open or Shared and setting out
the access rules.
Generally, this is a role within a business unit (the data generator), held by someone with
business responsibility (not technical or legal responsibility) for ensuring that the data is used
effectively to meet both the business needs of the department and the wider goals of
the Smart Data program. The Data Custodian does not necessarily need to be the
creator or the primary user of the dataset but should understand its value to the Entity.
Key responsibilities
Leadership
The management of the assigned data including inventorying, prioritizing and
describing datasets
Recommending changes to data management policy and procedures, data
quality and the implementation of UAE Data Standards.
Understanding and promotion of the value of data for Entity-wide purposes
and facilitation of data sharing and integration.
Governance
Ensuring the quality, completeness and currency of their data
Working in collaboration with the Data Management Officer to determine
priorities and associated risks of making data accessible by third parties
Engaging with the external developer community to determine how
enhancements to the data set could facilitate greater levels of re-use.
Conformance
The collection and updating of the assigned data
Management of any third party use of the data in accordance with UAE policies
and processes
Advising and reporting on data management issues
Suggesting the terms and conditions upon which Shared Data should be made
available.
Skills and competencies
Collaboration skills, within the business and with the Data Management Officer, to help
provide effective solutions to data issues and problems
Displays mastery of the portions of business processes executed by their
business area
Proficiency with MS Office, basic and some more advanced data analysis and
process control methods/techniques
Understands the fundamentals of databases and data structures (tables,
hierarchical structures, flat files, etc.)
Demonstrates understanding of all the functions performed by their business
area
Displays familiarity with systems used within/by their business area
Data Specialist
Role profile
This is a role with technical responsibility for data, and in particular with responsibility
for preparing data for publication as open data or for exchange as shared data. Probably
based within IT or database administration teams, Data Specialists will need to liaise
between the Information Technology, Information Security and business teams and
ensure the data they are responsible for meets the format, schema and quality
requirements in the Smart Data Framework standards.
They should also be able to provide support to the Entity for cross-business definition of
data standards, rules, and hierarchy and refinement of data processes in accordance
with defined standards.
Key responsibilities
Leadership
Assists with the resolution of data integration issues as requested by the Data
Custodian
Assists the Data Custodian in the definition of data requirements and data rules
Supports projects and initiatives in the development and refinement of data
processes and metrics, as requested by the Data Custodian
Governance
Supports definition, approval and execution of the Data Quality program
Understands the Information Technology Landscape and has the ability to
identify what data is stored in what systems
Supports efforts to provide data awareness education for senior and upper
management
Works with the Data Custodian in the identification of root causes of major
data problems and supports the implementation of sustainable solutions
Resolves routine data problems
Conformance
Assists the Business Data Custodian with data problem resolution when
requested
Reviews data deletion and archiving requests for data in their span of
responsibility and forwards to approver with appropriate recommendations
Skills and competencies
Displays mastery of the fundamentals of problem solving and basic data quality
analysis
Displays understanding of the portions of business processes executed by their
business area
Strong understanding of the Systems Development Life Cycle and
methodologies, and familiarity with process improvement frameworks.
Proficiency with MS Office, basic data analysis and process control
methods/techniques
Proven ability to work well and contribute to cross-functional teams
Proven ability to present to peers and supervisors
Displays familiarity with systems used within/by their Entity
Displays familiarity with functions performed by their business area
GUIDANCE NOTE 2: BUILDING A SMART DATA ROADMAP
Purpose This Guidance Note provides a recommended process, templates and
supporting guidance for each Government Entity to build its own Roadmap for
implementing the UAE Smart Data Framework. An Entity-level Roadmap that
follows this guidance will, if effectively managed, ensure that the Entity:
Achieves significant business benefits from shared and open data
Converges its data management practices over time to conform with
the UAE Smart Data Principles and Smart Data Standards
When to use At the start of each Entity’s Smart Data program.
Responsibility Director of Data, with close involvement and Roadmap sign-off by the Entity’s
Management Board.
Overview
A single, undifferentiated approach for implementing smart data across all Government Entities will
not work. There are a number of factors that will influence the content of an Entity’s Roadmap, such
as the type and size of Entity, the complexity of its delivery ecosystem, and the Entity’s level of
maturity in current data sharing practices. The advice in this Guidance Note is therefore not
mandatory and Government Entities should tailor it to their own business requirements.
[Process diagram. Relationship with other parts of the Smart Data Toolkit: use Guidance Note 1 to
establish and recruit the initial key data governance roles; use Guidance Notes 3-5 to inform
Roadmap development.
Director of Data: 1. Agree resources for the Roadmap and business priorities with the Management
Board; 3. Agree Roadmap v1 with the Entity’s Management Board; 4. Present Roadmap v1 to key
external stakeholders and use feedback to inform Roadmap v2; 8. Agree Roadmap v2 with the
Management Board.
Data Management Officer: 2. Lead work across the Entity to develop Roadmap v1; 5. Coordinate
initial implementation of Roadmap actions; 7. Lead work to review and update the Roadmap in
light of feedback.
Data Custodians and Specialists: work with the DMO and central data team to ensure the Roadmap
is aligned with business needs; 6. Implement Roadmap actions in relation to individual datasets,
and feed learning back to the DMO to inform improvements to the Roadmap.]
Throughout this process, you should seek to take an approach which is:
Iterative and collaborative: You should not develop the Roadmap in isolation or see this as a
one-off exercise. Once an initial Roadmap has been developed, you will want to:
- Share it with other key stakeholders: the Federal Data Management Office, other
Entities addressing similar customer groups, other Entities using your data or
supplying you with data and so on
- Improve it in the light of implementation experience within the Entity.
The process diagram above summarizes this collaborative process in terms of producing a
first version of the Roadmap and then a second. In practice, the Entity will want to keep the
Roadmap updated on an ongoing basis.
User-focused: the Roadmap should be user-centric, i.e. it should identify and address the
needs of key internal and external data users and should allow for regular engagement with
them
Practical: the Roadmap should be achievable within the timeframe, supported with
adequate resources to deliver it and appropriate project management disciplines to ensure
high quality and timely delivery
Phased: the Roadmap should be developed to be delivered in a phased manner, ensuring
that work is closely informed by Guidance Note 4: Prioritization Criteria and process.
Recommended structure: overview of each section
1. Objectives Sets out the scope and purpose of the Roadmap, and describes the Entity’s
vision for how it will manage its data in future to align with UAE Smart Data
Principles.
2. Gap analysis Highlights the key areas in which current ways of working within the Entity are
not currently aligned with the future vision
3. Governance Describes key governance roles and processes for managing implementation of
the Roadmap.
4. Delivery plan Describes work streams in the Roadmap, mapping out key milestones,
deliverables, and dependencies.
5. Risks Sets out the key risks associated with the Roadmap, their likely impact and the
proposed mitigation strategies.
6. Impact measurement Sets out the key benefits that the Entity seeks to deliver through
implementation of its Smart Data Roadmap:
− How will success be measured and when?
− What will success look like?
− How will learnings be incorporated?
The more detailed tables below provide guidance on the scope and purpose of each of these six
recommended sections of the Smart Data Roadmap, issues to address within it, the actions you
should take to inform its development, and the resources that are available to help you.
Section 1: Objectives
Scope and purpose of the section
In order for an Entity to comply with the UAE Smart Data Framework, it needs to establish a new operating
model for its data management, ensuring that all data is managed as a cross-government asset using open
standards. This means each Entity should consider and fully plan the changes that will need to be made
internally to drive the transformations that are needed.
This introductory section should set the scene for each Entity, so that they can describe:
The changes that will be delivered, in terms that resonate with the Entity’s own internal and
external stakeholders.
The benefits that the Entity will achieve through delivery of the plan, and ultimately how these will
contribute towards the wider goals of the UAE Smart Government strategy.
Issues to address
When developing this section of the Roadmap, you will need to consider:
Why does the plan exist and what is it trying to achieve?
What are the required behaviours/actions that need to be taken?
Which areas of the Entity’s activities does this Roadmap cover? [1]
What is the Entity’s future vision for how it will manage its data?
Actions you should take to inform development of this section
All team members involved in developing the Smart Data Roadmap should have a clear understanding of
the Smart Data Principles, and of the standards and guidance that exists to support them. So before they
embark upon developing the Entity-level Roadmap they should read all the UAE Smart Data Framework
documents.
[1] In general, the Roadmap should cover the whole organization. But there may be cases where it is sensible either to exclude some
elements of the organization, or to cover activities that fall outside the organizational boundary of the Entity but which are
nevertheless most sensibly covered within the Entity’s Roadmap.
And it is vital that the Management Board of the Entity understands the key principles, the degree of
change that is involved, and understands and commits to the Roadmap process. While it will not be
necessary for all top managers in the Entity to read the full Smart Data Framework, it is important that they
review the Smart Data Principles and give a steer on how they see these best being implemented within the
organisation.
It may be worthwhile for your immediate data management team to draft this section, review it with the
Management Board, and then to circulate it to the wider stakeholder community to validate whether this
resonates with them.
Resources to help you
There are a number of key documents that will help the Entity to prepare this section of the Roadmap:
UAE Smart Data Framework: overview and principles: this sets out the key purpose and principles
of the UAE smart data initiative, and provides an easy to assimilate guide to the business changes
that are required of Government Entities
The UAE Smart Data Standards: this sets out the minimum mandatory standards that Government
Entities should deliver with their data
Guidance Notes 3 – 5 of this Smart Data Implementation Guide give detailed ‘how to’ guides on
implementation of the standards. Use of this guidance is not mandatory, but it provides a good
starting point for thinking through the Entity’s own approach.
Section 2: Gap analysis
Issues to address
You should make clear the scale of the changes that will be required – informed by real evidence of gaps
and challenges at two levels:
The organisational level: to what extent does the Entity have the governance, culture, processes and
infrastructure needed to manage and reap benefits from smart data?
The dataset level: to what extent are datasets in the Entity currently already aligned with the best
practices set out in the UAE Smart Data Standards?
Appendix B provides a Data Quality Maturity Matrix for auditing quality of a dataset against 5 levels of
maturity – of which Level 3 represents full conformance with all mandatory requirements of the Smart Data
Standards. An organisational self-assessment tool is also available.
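As a purely illustrative sketch of how an audit against the Data Quality Maturity Matrix might be recorded: the real criteria live in Appendix B, and the dimension names and the rule that the weakest dimension caps overall maturity are hypothetical here; only the fact that Level 3 marks conformance with the mandatory Smart Data Standards comes from the guide.

```python
# Level 3 represents full conformance with the mandatory requirements
# of the Smart Data Standards (per the guide); other details are assumed.
CONFORMANCE_LEVEL = 3

def audit_summary(scores: dict[str, int]) -> dict:
    """Summarize per-dimension maturity scores (1..5) for one dataset."""
    overall = min(scores.values())  # assumption: weakest dimension caps maturity
    return {
        "overall_level": overall,
        "conformant": overall >= CONFORMANCE_LEVEL,
        "gaps": [d for d, s in scores.items() if s < CONFORMANCE_LEVEL],
    }

# Hypothetical dimension names and scores for one dataset
result = audit_summary({"completeness": 4, "accuracy": 2, "timeliness": 3})
print(result)  # {'overall_level': 2, 'conformant': False, 'gaps': ['accuracy']}
```

Recording audits in a structured form like this makes it straightforward to aggregate results across the Entity and identify the common gap areas the Roadmap should target.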
Section 3: Governance
Scope and purpose of the section
This section should cover:
‒ Governance model: describing functions, roles and accountabilities – both for ensuring successful
delivery of activities and milestones in the Roadmap and also for realisation of the targeted benefits
– and the processes within which these will operate.
‒ Resourcing: staff, financial and other resources that will be deployed in delivering the Roadmap.
Issues to address
When developing this section of the Roadmap, you will need to consider a number of points:
‒ Do we have all roles filled across our data team?
‒ If not, how are we proposing to plug the gaps in the short term?
‒ Do these resources have all the skills and knowledge that they need to carry out their roles
successfully?
‒ What is our own RACI (Responsible, Accountable, Consulted, and Informed) for all data
management processes?
‒ How will we manage our Entity’s involvement in the wider governance for UAE Smart Data?
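A RACI mapping for data management processes can be kept as a simple register. The sketch below is illustrative only: the process names and role assignments are hypothetical examples, not prescriptions from the Smart Data Framework, and each Entity would substitute its own.

```python
# Illustrative RACI register: which role is Responsible, Accountable,
# Consulted and Informed for each data management process.
# All assignments below are hypothetical examples.
raci = {
    "Classify dataset": {
        "Responsible": "Data Custodian",
        "Accountable": "Data Management Officer",
        "Consulted": "Data Specialist",
        "Informed": "Director of Data",
    },
    "Approve open data publication": {
        "Responsible": "Data Management Officer",
        "Accountable": "Director of Data",
        "Consulted": "Data Custodian",
        "Informed": "Federal Data Management Office",
    },
}

def who_is(assignment: str, process: str) -> str:
    """Look up the role holding a given RACI assignment for a process."""
    return raci[process][assignment]

print(who_is("Accountable", "Classify dataset"))  # Data Management Officer
```

Writing the RACI down explicitly, in whatever form, forces the questions above to be answered: every process should have exactly one Accountable role, and any process with no Responsible role is an unfilled gap.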
You will also need to involve the ‘data practitioners’ within your Entity to map out workflow and processes
for data conformance, based on the guidance given in the Smart Data Implementation Guide for
Government Entities.
Section 4: Delivery plan
Issues to address
When developing this section of the Roadmap, you will need to consider a number of points:
‒ Is the sequencing logical and aligned with the data preparation and management processes?
‒ Are there any other Entity-specific limitations that may prevent us from being successful in
delivering this plan?
A high-level view of the work streams in a typical Delivery Plan is shown on the following page. This is
divided into four main phases:
1. Initiation: when the initial planning is undertaken and governance systems established
2. Inventorying and prioritization: when the Entity pulls together an inventory of the key data assets it
manages, and prioritises which should be addressed first
3. Data conformance: when prioritized datasets are taken through a systematic process to ensure they
meet the mandatory requirements of the UAE Data Standards for Classification, Quality and Exchange,
in a series of ‘data sprints’.
4. Continuous improvement: when the Entity drives forward longer-term improvements (beyond the
mandatory minimum in the UAE Data Standards) – improvements both to data quality and to the level
of re-use of the Entity’s data by data users across the public and private sectors.
[Delivery plan diagram, showing the four phases as work streams:
Initiation: set up data team; initiation process and planning.
Inventorying and prioritization: develop initial Data Inventory; prioritize datasets.
Data compliance (Sprint 1, then Sprints 2, 3, etc.): classify data; apply formats, permissions,
metadata and schema; quality audit and improvement plan; publish open data; exchange shared
data.
Benefit realization: ongoing throughout.]
Actions you should take to inform development of this section
In order to develop a realistic and achievable plan, you will need to:
Involve the ‘data practitioners’ within your Entity. They will need to have a clear understanding of the
processes that will need to be followed, using the training materials and policy products for preparation
and cataloguing.
Undertake sample Data Quality Audits using the Data Quality Maturity Matrix provided in Appendix B,
to get an early sense of the current state of data quality across the Entity and key areas where
improvement will commonly be needed.
Engage with current users of any data that you currently publish as open data or share with other
Entities, to help understand user priorities.
Resources to help you
Guidance Note 1: Data governance roles and processes gives advice on steps to take during the Project
Initiation phase of the Roadmap.
Guidance Note 3: Developing a Data Inventory and Guidance Note 4: Prioritization criteria and process
give guidance on the steps to take during the Inventorying and Prioritization phase of the Roadmap.
Guidance Note 5: Data Conformance Process gives guidance on the steps to take during the Data
Conformance phase of the Roadmap.
Section 5: Risk
Scope and purpose of the section
This section should do two things:
1. set out the current list of key risks associated with delivering the Roadmap, including their likely
impact and proposed mitigation strategies
2. set out the process that the Entity will follow to raise and manage delivery risks.
Issues to address
When developing this section of the Roadmap, you will need to consider:
‒ Ownership: who is the most suitable owner for each risk, responsible for driving forward the
agreed mitigating action?
‒ Governance: what is the most appropriate escalation route for any high impact risks that are not
being managed satisfactorily?
‒ Coordination: who within our Entity will coordinate the whole risk management process, to
regularly review whether the register is being managed effectively?
‒ Tools: how will we manage reviewing and updating our risk register, so that the latest version is
visible to all relevant team members and they can contribute updates quickly and easily?
A useful starting point is the nine-category risk framework in the Transformation Government Framework (TGF) standard, developed following international best practice research and consultation into the key reasons
why ICT-enabled change programs such as UAE Smart Data are most likely to fail. We recommend that all
Entities review what risks they face in relation to each of these nine categories, using the checklist tool in
the TGF standard.
[Diagram: TGF risk management categories, including Strategic clarity; Leadership; User focus; Collaborative engagement; Skills; Supplier partnership.]
Section 6: Benefit realization
Issues to address
When developing this section of the Roadmap, you will need to consider several issues:
What will success look like for our Entity in its use of data?
How will success be measured and when?
Who are the likely owners of each of the Entity-level benefits?
How will learnings be incorporated?
What systems and tools can be put in place to monitor the ongoing delivery of benefits?
What quick wins can we deliver early in the program that will start the ball rolling?
What are the longer term benefits that we are seeking to achieve, and how do we sustain and
embed the business changes required to achieve the desired impacts?
Determine how you can best measure your own Entity’s impact on the delivery of these objectives. Seek to
develop success criteria and targets that are SMART:
1) Specific – clear and unambiguous;
2) Measurable – quantifiable;
3) Achievable – realistic and attainable;
4) Relevant – applicable and worthwhile;
5) Time-bound – delivered within a specific timeframe.
Resources to help you
Further advice on developing an effective approach to Benefit Realization is set out in the global standard
on ICT-enabled, data-driven service transformation (‘The Transformation Government Framework’, or
TGF).
Note: refer to V2 of the TGF standard, published by the international open standards consortium OASIS in 2014.
GUIDANCE NOTE 3: DEVELOPING A DATA INVENTORY
Purpose: This document describes how to create a list of datasets which are collected,
managed or maintained by the Entity. While it may not be possible to create a
complete list in one step, this Guidance Note helps Entities ensure that the
most valuable data assets are listed as an initial priority, and then to expand
the Inventory over time.
When to use: When the Entity has a management structure and a team responsible for data
in place (for example by using Guidance Note 1: Data governance roles and
processes).
Overview
To realize the strategic vision of efficient and effective data management in government that
enables better decisions and better services, each Entity requires a good understanding of its current
data assets and data processes. The first step is to produce an inventory of all datasets in the
Entity. This allows the Entity to identify gaps where data is currently not fit for purpose, to spot and
address duplication, and to become more standardized.
This Guidance Note provides help in listing and inventorying the data an Entity holds.
Types of data
Structured data
Structured, machine-readable data (such as a table in a spreadsheet, a database, or data
on a geospatial map) is the main set of data which needs to be inventoried. Entities should list
existing data which the Entity uses, maintains or collects. This could be data frequently used by the
departments within this Entity which might not have a clear owner. It should also include all data
where the Entity is responsible for collecting and updating the data, even if this work is done by
others on its behalf.
Data should be listed in the form of datasets. A dataset consists of data with its metadata. The
metadata provides context and information about the data. Therefore, a dataset should be an
individual object that makes sense as a whole by itself.
A dataset may be a database or spreadsheet along with its name, location, description. It could also
be a map or a table from a report moved to a spreadsheet.
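The idea of a dataset as data plus metadata can be sketched in a few lines. This is purely illustrative: the field names and the file path below are hypothetical, and the mandatory metadata fields are those defined in the [DE2] Metadata specification, not these.

```python
# Illustrative sketch only: a dataset is the data itself plus metadata
# describing it. Field names and the path are hypothetical examples,
# not the [DE2] Metadata standard.
dataset = {
    "metadata": {
        "name": "Bus Transport timetables",
        "location": "shared-drive/transport/timetables.xlsx",  # hypothetical
        "description": "All current bus timetables",
    },
    "data": [
        # rows of the underlying table (column headings = data attributes)
        {"bus_number": "12", "bus_stop": "Central Station", "arrival": "08:05"},
        {"bus_number": "12", "bus_stop": "Marina Mall", "arrival": "08:20"},
    ],
}

# The metadata gives the context that lets the dataset make sense by itself.
print(dataset["metadata"]["name"], "-", len(dataset["data"]), "rows")
```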
A collection of data, such as a database, may be counted either as one dataset or as several.
You should count it as one dataset if the data within it is:
thematically related
easiest to describe as a whole
interrelated.
Otherwise, it likely consists of several datasets. The split depends on the existing and potential use of
the data. It’s up to the data owners who understand the data best to make the judgement decisions
on how data should be listed as datasets.
Unstructured data
While we recommend primarily focusing on structured data, unstructured data or information such
as text documents, diagrams, pictures or media can also be important to publish or share effectively.
Entities will generally have a lot more unstructured content and it will be difficult to inventory all of
it, so two key steps are recommended:
1. Identify opportunities to turn unstructured data into structured data. For example, by
putting tables in Word documents into a spreadsheet or seeing if there’s a geodata version
of a map picture available. Then deal with this structured data as explained in the process
below.
2. Identify information that’s particularly relevant for re-use – within the Entity itself or
externally by the public or other Entities. This might be a report, presentation or video
which imparts important information or can be utilized in new ways; list these in the
Inventory.
1. Identify a data representative per department within the Entity
Each department or business unit within the Entity should have a named responsible Data Custodian
who has a good understanding of the data their department produces, uses and manages. This
person should be someone who is in a senior role (or appointed directly by a senior role) and
regularly deals with data and is aware of the variety of data which exists within their department.
They may be supported by a Data Specialist who has technical ownership or understanding of the
data. For larger departments that handle a lot of data, this role may be covered by more than one
person. Further guidance on the role of Data Custodian and Data Specialists is set out in Guidance
Note 1: Data governance roles and processes.
There is no need to change or rearrange data before adding information about the dataset to the
list, or to collect any data which is not already held.
Draw together existing lists of datasets that are collected, maintained or managed by the Entity.
These may include:
The list of Primary Registries identified by the Federal Data Management Office.
Data which has been previously requested by other Entities, external bodies, Federal Data
Management Office or other departments within the current Entity.
Data listed in the Entity’s Information Asset Register (as required by the Information Security
Regulation)
Existing data catalogs or lists: e.g. data available in a catalog or portal, documentation of
previous information audits, datacenter inventory, management databases or software asset
lists.
Next the Data Custodian and/or Data Specialist should think about and list any datasets the Entity:
Collects
Stores
Maintains and updates
Commissions externally
Aim to be as comprehensive as possible, but it is not expected that all datasets will be captured in
the initial version of the inventory. It may be useful to consider:
Any datasets that have already been openly published or are currently shared with other
Government Entities.
Obvious or high-value datasets: for example, data which can be used to provide a service to
individuals or businesses (such as setting up a business, hiring a car, managing insurance,
choosing where to live), make government more transparent, or any data which is expected
to be managed by this Entity
What datasets exist for each type of data: e.g. real-time, operational, reference data,
aggregated data (see illustrative table below).
Any strategic reference data the Entity may hold, see table of examples below:
C: Create inventory and prioritize
Within a spreadsheet or table, add the following for each identified dataset from the previous two
steps:
A name or title for the dataset. If it doesn’t have an existing name that you are aware of,
choose a short descriptive name.
A brief description to clarify what data is being referred to and its scope.
The department or business unit responsible for managing the data
The list of data attributes (normally column headings for tabular data) that are used in this
dataset. Entities do not need to list data attributes for unstructured data.
The Data Custodian, if known, i.e. the role or person within the department responsible for
the data
The Data Custodian’s initial assessment of the extent to which this dataset should be a
priority for initial publication as open or shared data (using the process and prioritization
scoring set out in the Guidance Note 4: Prioritization criteria and process).
Example:
Name: Bus Transport timetables
Description: All current bus timetables
Department: Operations department
Data attributes: Bus number, bus stop location, arrival time, departure time, frequency
Priority score: 34/38
Data Custodian: Mark Jones
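The fields listed above map naturally onto a spreadsheet. A minimal sketch of building such an inventory as a CSV follows; the column names and the sample row are taken from the example above, and in practice Entities would maintain this in their own tooling rather than in code.

```python
import csv
import io

# Columns from the guidance: name, description, responsible department,
# data attributes, Data Custodian, and the initial priority assessment.
COLUMNS = ["name", "description", "department", "attributes",
           "data_custodian", "priority_score"]

rows = [{
    "name": "Bus Transport timetables",
    "description": "All current bus timetables",
    "department": "Operations department",
    "attributes": "Bus number, bus stop location, arrival time, "
                  "departure time, frequency",
    "data_custodian": "Mark Jones",
    "priority_score": "34/38",
}]

# Write the inventory to CSV (to a string here, to keep the sketch
# self-contained; in practice this would be a shared file or portal).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```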
4. Prioritize
This Inventory should be taken through the prioritization process described in Guidance Note 4:
Prioritization criteria and process. The Data Management Officer should ensure:
the inventory contains a reasonably comprehensive list of data held by the Entity
no key datasets are missing
the inventorying was carried out by the appropriate staff members
it contains the prioritization information specified in Guidance Note 4: Prioritization criteria
and process.
Maintaining the Inventory
The Inventory should be kept up to date after its initial creation. The process to follow should be
similar, but instead of listing all possible datasets it should involve using the existing inventory as a
basis and using each step of the process to see how the inventory can be expanded or amended.
Expansion should cover:
Identifying new datasets managed by the Entity (these could be completely new or extensions
and reformulations of existing data)
Responding to user demand for data
Reviewing the existing inventory in light of publication and sharing: both lessons learned and
feedback received from other Entities, the public, external stakeholders and internal staff.
GUIDANCE NOTE 4: PRIORITIZATION CRITERIA AND PROCESS
Purpose: It will not be possible to ensure all the Entity’s datasets meet the Smart Data
Framework requirements at once. This Guidance Note provides criteria and a clear
process for prioritising the order in which datasets should be made conformant to the
standards prior to their publication or exchange.
When to use: After producing an inventory of the Entity’s data assets, and before proceeding to take
the initial highest priority group of datasets through the Data Conformance Process
described in Guidance Note 5.
Overview
This Guidance Note helps Entities focus resources on making the most important datasets conform
to the Smart Data Framework standards first and ensuring there is a clear prioritized plan for which
datasets will be ready for publication and exchange next.
Once a Government Entity has prepared an initial draft of its Data Inventory (using Guidance Note 3:
Developing an Entity-wide Data Inventory), it should not seek to ensure full conformance with the
UAE Smart Data Standards for all its data at once. We recommend prioritising which of the
inventoried datasets it should prepare first for publication as open data or for exchange with other
Entities.
By starting with a subset of its data inventory, Entities can:
Quickly publish and exchange high value and low effort data
Go through the process faster, learn from it and adopt desired changes to the process in
future.
The guidance below looks in turn at:
The recommended process for prioritising datasets
The criteria which are recommended for use within the prioritisation process.
[Diagram: recommended prioritisation process.
Relationship with other parts of Smart Data Toolkit: use Guidance Note 3 to prepare an inventory of the Entity’s datasets; use Guidance Note 5 to start the process of Smart Data Standards compliance for datasets in the first sprint.
Director of Data: (1) identify whether the Entity has any datasets matching UAE Primary Registries; (2) identify datasets requested by the FDMO for priority projects; (7) review the Prioritised Inventory against UAE and Entity strategic objectives, and decide on priority groups of datasets to manage in a series of ‘sprints’.
Data Custodians and Specialists: (3) assess the datasets each Custodian is responsible for against the prioritisation criteria; (4) review complete ordering and make adjustments.
Data Management Officer: (5) integrate Custodian lists together and validate overall priority; (6) prepare a full Entity-wide Prioritised Inventory.]
4. Review Ordering
Data Custodians should review their complete prioritized list (which includes primary registries,
datasets needed for projects and results of applying the prioritization criteria to the data in their
inventory) and re-arrange as needed. For example, if there are many datasets with the same score,
use judgement to prioritize between them; if a dataset looks out of place and feels like it should be
above or below others, rearrange it as needed.
The tools below provide simple-to-use recommended approaches for quantifying both of these
dimensions.
Please assess each dataset for the following, recording the score in the right-hand column (pick the
most suitable score, and 1 – medium if unsure):

Accuracy: How accurate is the data?
2 – High accuracy (we review and check accuracy)
1 – Medium accuracy
0 – Low accuracy (there are known errors in the data)

Completeness: How complete is the data?
2 – High completeness (we have all the data at current granularity)
1 – Medium completeness
0 – Low completeness (there is known missing data, or this data will not make sense by itself)

Timeliness: How up to date is the data?
2 – Latest month / week / year is available
1 – Data is not time sensitive OR we have all the data apart from latest month / week / year
0 – Data is out of date

Validation: Does the data use a schema or is it standardised?
2 – Yes, data is published with same headings / fields (schema) each time
1 – The data does not use a schema AND is not published regularly (i.e. it is one-off data)
0 – Data is regularly updated, but does not use a set schema

Ownership: Is there a clear specific data owner?
2 – Yes
0 – No

Description: Does the data have existing metadata – that is, information on what the data is about, how it was generated etc.?
2 – Yes
0 – No

Accessibility: Is the data already published somewhere or available on the web / through an API?
2 – Yes
0 – No

Interoperability: Is the data in an open machine-readable format?
2 – Yes
0 – No

License: Does the data have a license?
2 – Yes
0 – No

Combine the two scores to get an overall priority score out of 38.
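The readiness half of the scoring can be tallied mechanically once each criterion has been assessed. A sketch, assuming the nine criteria and 0/1/2 scale above (the yes/no criteria use only 0 or 2; since readiness totals at most 18, the value-scoring dimension presumably contributes the remaining 20 points of the 38):

```python
# Readiness criteria from the table; each is scored 0, 1 or 2.
READINESS_CRITERIA = [
    "accuracy", "completeness", "timeliness", "validation", "ownership",
    "description", "accessibility", "interoperability", "license",
]

def readiness_score(scores: dict) -> int:
    """Sum the per-criterion scores, defaulting to 1 (medium) if unsure."""
    total = 0
    for criterion in READINESS_CRITERIA:
        value = scores.get(criterion, 1)  # "pick 1 - medium if unsure"
        if value not in (0, 1, 2):
            raise ValueError(f"invalid score for {criterion}: {value}")
        total += value
    return total

# Hypothetical example: a dataset that is mostly ready but unlicensed
# and not yet published anywhere.
scores = {"accuracy": 2, "completeness": 2, "timeliness": 2,
          "validation": 2, "ownership": 2, "description": 1,
          "accessibility": 0, "interoperability": 2, "license": 0}
print(readiness_score(scores))  # 13 out of a maximum 18
```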
GUIDANCE NOTE 5: DATA CONFORMANCE PROCESS
Purpose: This Guidance Note outlines the process that Entities are recommended to
follow to ensure their data conforms with the Smart Data Standards, and is
ready for publication and exchange.
When to use: Before publishing data as open data or exchanging shared data with other
Entities. Entities should focus in turn on successive batches of data that have
been prioritised for data conformance in line with the advice in Guidance
Note 4: Prioritization process and criteria.
Responsibility: The Director of Data has overall accountability for ensuring effective systems
are in place to manage data conformance, but the lead responsibility for
operating these systems will lie with the Data Management Officer.
[Diagram: recommended data conformance process.
1. Data Management Officer: (A) oversee and manage the data compliance process of the Data Custodians and Data Specialists within each department, ensuring each dataset is fully described and its quality improved; (C) validate and add to the expanded Data Inventory, then publish or exchange data.
2. Manage datasets – Data Custodians and Specialists: (A) classify each dataset (DC1); (B) choose appropriate format (DE1); (C) document permissions (DE7); (D) add metadata and schema (DE2 and DE3); (E) manage ongoing data quality (DQ1).
3. Validate and publish – Director of Data: (B) check dataset descriptions and quality meet Smart Data Standard requirements.]
Ensuring that Data Custodians and Data Specialists are fully briefed on their roles and on the
requirements of relevant Smart Data Framework standards
Facilitating opportunities for Data Custodians and Data Specialists to come together and
exchange experiences and lessons learned through the process.
A summary is set out below of the steps that need to be taken as part of this coordinated approach:
Steps 2[A] to 2[E] look at the actions which Data Custodians and/or Data Specialists should
take to ensure that an individual dataset is conformant with the UAE Smart Data Standards
Steps 3[A] to 3[C] then look at the actions which the Data Management Officer and Director
of Data should then take to validate and approve for publication the datasets that have
come through this process.
2A: Classify
The Data Custodian should classify the dataset as Open, Confidential, Sensitive or Secret, in
accordance with the [DC1] Data Classification specification. Detailed guidance on the process to
follow is given below in Guidance Note 5.1: Classifying data.
Once this is done, you might be left with the original dataset and one or more derived (or ‘child’)
datasets which have been modified to allow an Open classification. Both the original and derived
datasets should be catalogued separately in the following steps.
2B: Choose format
Decide on an appropriate format in which to make the data available that complies with the [DE1]
Data Formats specification and produce a sample dataset in that format. Detailed guidance on the
process to follow is given below in Guidance Note 5.2: Formatting data.
2C: Document permissions
For data which will be exchanged with other Entities rather than published as Open Data, you will
need to comply with [DE7] Shared Data Access Permissions. This will involve determining and then
documenting the appropriate permissions model. Detailed guidance on the process to follow is
given below in Guidance Note 5.3: Documenting a permissions model for shared data.
2D: Add metadata and schema
Describe each dataset with metadata ensuring that all Core Metadata fields required in [DE2]
Metadata are complete and as many Optional Metadata fields as can be easily filled in. Detailed
guidance on the process to follow is given below in Guidance Note 5.4: Adding metadata and
schema.
2E: Manage data quality
Assess the data against the [DQ1] Data Quality Principles. The Data Custodian should then:
Identify and implement any ‘quick wins’
Then develop a longer term plan for improving the quality of the dataset to better meet user
requirements.
The Data Custodian’s work on this will need to feed into broader work on improving data quality in
the Entity, as detailed in Guidance Note: 2 Building a Smart Data Roadmap.
Detailed guidance on the process that Data Custodians should follow is given below in Guidance
Note 5.5: Managing data quality.
Guidance Note 5.1: Classifying data
When to use: Before a dataset is published as open data or exchanged with other Entities,
the data should be correctly classified.
Responsibility: The Data Custodian that the Entity has identified as accountable for a
particular dataset should be the lead person responsible for applying the Data
Classification Standard to the dataset.
The Entity’s Data Management Officer is responsible for supporting all Data
Custodians across the Entity as they undertake this task, and for ensuring a
consistent approach at the Entity-wide level.
Overview
At the start of the Data Conformance Process, it may be that a dataset has already been classified as
Open, Confidential, Sensitive or Secret – because FGEs have been required to use such a classification
for several years (see for example the ‘Regulation of Information Security at the Federal Entities of
UAE Cabinet Resolution’ No. (21), 2013). Previously, however, Entities were free to establish their
own criteria for determining what sort of data they assigned to each class. Now, following
agreement of the UAE Smart Data Standards, there is a common government-wide set of criteria
which all Government Entities should apply. These criteria are intended to enable much greater levels
of open data publication and data exchange between organisations than has historically been the
practice in the UAE.
The table below gives criteria for assessing what data falls into each class, with examples. Deciding
the classification level depends on a risk assessment of the level of damage that may result
from unrestricted disclosure of the data (to privacy, security, commercial confidentiality etc).
Data classification
Open Criteria:
Data that can be openly disclosed to individuals, governmental, semi-government entities
and private sector for use, re-use and sharing with third parties. This should be the
default classification for all non-personal data, and exceptions to this should have a
documented rationale that clearly explains why open publication of the data would
contravene specific criteria listed below that require classification as Confidential,
Sensitive or Secret
Examples:
Open data can include:
Real time data: constantly updating data, often high volume and high velocity
Examples include: weather data; footfall through airport; cars passing toll
booths; pollution levels; real-time location data; electricity usage
Operational data: the records that are made as part of an Entity carrying out its day-to-day business
Examples include: Entity organisation chart; forecast or modelling data; buildings
owned/maintained; budget; staff levels; performance against metrics
Reference data: authoritative or definitive data that rarely changes about things
Examples include: timetables; names and locations of schools, hospitals, bus
stops; tax codes; land holdings; mapping data; indicators; address data
Aggregated data: analysed and summarised data, which provides overview
information in relation to other types of data
Examples include: hospital operation success rates; school exam pass rates;
population statistics; housing; tourist numbers by month/year; nationalities of
visitors
Confidential Criteria:
This is the default classification for datasets containing personal data which is non-
sensitive. "Personal data" means any information relating to an identified or identifiable
natural person; an identifiable person is one who can be identified, directly or indirectly,
in particular by reference to an identifier such as a name, an identification number,
location data, online identifier or to one or more factors specific to the physical,
physiological, genetic, mental, economic, cultural or social identity of that person. Non-sensitive
personal data refers to all types of personal information which are not ‘sensitive’ (as defined in
the criteria for Sensitive data below).
Adversely affecting public safety, criminal justice and enforcement activities.
Examples:
Typically, non-sensitive personal data will include information which is personal but does
not impact on the reputation of the person. Examples include name, date of birth and
address.
Examples of other types of Confidential information include:
Minutes of meetings, internal regulations and policies, and government-body
performance reports
Correspondence within a government body or with other government bodies or third
parties
Financial transactions and financial reports
Company data such as tenders or contracts which provide for non-disclosure clauses
Individual’s dealings with the government, which include personal data (details of
ownership of properties of various kinds, commercial or professional licenses,
personal documents, residence permits, visas, and leases).
Sensitive Criteria:
This is the default classification for datasets containing sensitive personal data. Sensitive
personal data are personal data that directly or indirectly reveal an Individual's family,
racial or ethnic origin, sectarian origin, political opinions, religious or philosophical beliefs,
their union membership, criminal record, health, sexual orientation, genetic data or
biometric data
Examples:
For example, this might be the details and content of:
Draft government laws and policies and legislation
Audit reports of a government body
Employees’ complaints and investigation minutes
Staff salaries and performance reports
Confidential financial expenses
Data, plans or technical documentation for technological information systems and
networks of a governmental body
Credit card or bank accounts data
Judgments, irregularities or violations under investigation relevant to individuals
Attachment orders over assets and property of individuals and companies.
Secret Criteria:
Data the unrestricted disclosure or exchange of which may cause significant damage to
the supreme interests of the United Arab Emirates and very high damage to government
bodies, companies or individuals, such as:
Disclosing any personal information of a VIP (very important person) or infringing any
Intellectual Property Rights of a VIP
A significant or noticeable negative impact to the supreme interests of the United
Arab Emirates
A sharp decrease in the ability of one of the vital bodies to carry out its functions, or
very high damage to its assets, heavy financial loss, clear negative impact on the
image of the body and a loss of public confidence in such body and in the government
in general
Causing significant damage to private sector entities that have vital and strategic roles
in the national economy, which may lead to heavy financial losses, bankruptcy or loss
of their leading role
Seriously endangering the safety and lives of certain individuals associated with a
security role (e.g., security forces and police) or as parties to serious judicial cases
(e.g. witnesses)
Information the disclosure of which would negatively affect the maintenance of
security and the administration of justice, or cause major, long-term impairment to
the ability to investigate or prosecute serious crimes.
Examples:
Examples include details and content of:
It is therefore vital that every dataset prioritised for open publishing and for inter-Entity exchange
has its classification status reviewed against the requirements of the Data Classification Standard.
The diagram below summarises the process that the Government Entity is recommended to follow
when classifying a dataset against the Data Classification Standard mandated in the UAE Smart Data
Framework.
Relationship with other parts of Smart Data Toolkit: the current batch of datasets to be classified
will be listed in the prioritized data inventory created using the process recommended in
Guidance Notes 3 and 4.
[Diagram: recommended classification workflow.
Data Management Officer: (1) provide support to Data Custodians as they classify data to ensure a consistent approach; (3) validate and add to the expanded Data Inventory.
Data Custodians and Specialists: (2) for each dataset, (A) think open; (B) check for barriers to disclosure; (C) check for negative effects; (D) weigh public interest; (E) assess level of restriction; (F) consider public inventory; (G) identify derivative public data; (H) add classification to metadata.]
It is recommended that the Data Management Officer is responsible for overseeing and helping the
Entity’s Data Custodians and Data Specialists complete steps [2A] – [2H] above, by:
Establishing a clear internal timetable for completing the data classification process, aligned
with Entity and Federal milestones for publishing and exchanging data
Ensuring that Data Custodians and Data Specialists are fully briefed on their roles and on the
requirements of the Data Classification Standard
Facilitating opportunities for Data Custodians and Data Specialists to come together and
exchange experiences and lessons learned through the process.
For each dataset, the responsible Data Custodian should classify the dataset using the eight step
process recommended above. These steps that an individual Data Custodian should go through can
also be visualized as a logical decision model, as illustrated below.
[Diagram: logical decision model for classifying a dataset.
A (1). Think open.
B (2). Barriers to disclosure? If yes, the data is shared data: go to step E. If no, go to step C.
C (3). Negative effects from disclosure? If no, the data is open data: go to step H. If yes, go to step D.
D (4). Benefits outweigh negatives? If yes, the data is open data: go to step H. If no, the data is shared data: go to step E.
E (5). Assess level of restriction (Confidential, Sensitive, Secret).
F (6). Include in public inventory?
G (7). Create related open data? (e.g. with only part of the data)
H (8). Add classification to dataset metadata.]
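The decision model can be read as a small function. This is a hypothetical sketch of the control flow only: the yes/no inputs stand for the human judgements made at steps B, C and D, and the restriction level for non-open data comes from the assessment at step E.

```python
# Hedged sketch of the classification decision model (steps A-H).
# The boolean inputs represent human judgements; this illustrates the
# flow of the procedure, it is not an automated classifier.

def classify(barriers_to_disclosure: bool,
             negative_effects: bool,
             benefits_outweigh_negatives: bool,
             restriction_level: str = "Confidential") -> str:
    """Return 'Open' or a restricted class per the decision model."""
    # Step B: absolute barriers force a restricted (shared-data) class.
    if barriers_to_disclosure:
        return restriction_level  # step E: Confidential / Sensitive / Secret
    # Step C: no negative effects from disclosure -> open data.
    if not negative_effects:
        return "Open"
    # Step D: negative effects exist, but do the benefits outweigh them?
    if benefits_outweigh_negatives:
        return "Open"
    return restriction_level

# Examples ("think open" is the default assumption):
print(classify(False, False, False))           # Open
print(classify(False, True, True))             # Open
print(classify(True, False, False, "Secret"))  # Secret
```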
It is vital to recognize the UAE Government’s strategic commitment to high levels of openness.
When following the steps of this procedure, the default assumption about a dataset should be that it
will be classified as open. Exceptions require a compelling case linked to clear criteria, which should
be documented and then personally signed off by the Entity’s Director of Data.
‘Thinking Open’ is often the most difficult part of the classification procedure, especially if the Entity
is inexperienced with open data. Staff may be concerned that publication will reflect badly on them
where, for example, some of the data may be interpreted as unfavorable, or the data may have gaps
or inaccuracies. It is vital that staff understand that they will have the backing and support of
the management for the decision to publish data in which problems are later found. Such problems
plague all Entities and all data, and publication should be seen as an opportunity to help find and
fix errors and problems.
For these reasons, it is very helpful if at the start of the data conformance process, the senior
management communicate to the staff their and the Entity’s commitment to openness. The Director
of Data should be available to respond to any concerns raised by staff.
The following steps should be carried out by the person(s) most familiar with the data, such as the
Data Custodian, for each of the datasets in the current batch going through the data conformance
process.
2B. Check for barriers to disclosure
There are certain criteria that may preclude a dataset from being classified as ‘Open’ and then
disclosed as Open Data. These include two absolute barriers to disclosure. A dataset cannot be Open if its
publication would:
Represent a significant threat to the supreme national interest and/or national security.
Check that your dataset’s publication would not violate one of these conditions. In most cases it
should be obvious if one of these barriers applies, but in cases of doubt you may need to consult
your Entity's legal department. If a dataset is barred from publication by one of these barriers, then
it cannot be classified as Open. Proceed to Step [E] to determine whether it should be classified as
Confidential, Sensitive or Secret.
If the barriers to disclosure in Step [B] do not apply, then there are other possible harmful effects to
consider before the data can be confirmed as open. Consider whether release of the dataset would
entail a significant risk of one or more of the following, by checking whether the answer is ‘yes’ to any
of the questions listed in the checklist below.
If not, then classify as ‘Open’ and add to the Data
Inventory.
Engage with the IPR holder to establish whether it will give consent to
opening up the data, potentially with some license restrictions
If not, classify the dataset as Confidential or Sensitive, depending on
the degree of damage that would be caused by breach of
confidentiality – unless there is an overriding public interest in
publishing. (See Steps D and E)
In cases where the Private-Sector Entity’s IPR arises from the
performance of a commercial contract on behalf of the Government
Entity, seek to re-negotiate these contract terms, particularly at any
contract review or renewal points.
Note: In answering this question, Government Entities should note that it
is not acceptable to classify a dataset as Confidential on the grounds that a
Government Entity has Intellectual Property Rights in the data, even in
cases where it is currently exploiting that IPR on a commercial basis.
Rather, the dataset should be classified as Open Data, albeit with
consideration given to the nature of the licensing and pricing basis on
which it is made Open.
Note: greater transparency is in general a force for social good rather than
a social risk.
Risk of negatively affecting 7. Consider whether disclosure of this data poses risks to the
the administration of justice administration of justice and the maintenance of security.
and maintenance of security
If any risks identified under these two questions are:
Specific and clear, not general and vague
Evidence-based
Proceed to Step [D].
Classify the data as Open and add this classification to the inventory
Move on to another dataset, or proceed to step [H].
If harmful effects of publishing are identified in Step [C], then there is a presumption not to publish,
but they are not absolute barriers to disclosure. In some instances, the public interest in publishing a
dataset may outweigh the negative consequences.
Consider whether there is a high economic value or public interest in publishing the data. For
example, would making the data open:
Have significant economic benefits, e.g. could the data be used in the provision of new high-
value services?
Increase transparency of government spending or decision making?
If so, the Entity should provisionally decide whether it would be reasonable and proportionate to
publish the data, in spite of the negative effects identified in Step [C]. The final decision will lie with
the Federal Data Management Office.
If you consider that the public interest outweighs the risk of harm:
Where a dataset cannot be classified as Open after following Steps [A]-[D], it should be classified as
Confidential, Sensitive or Secret, depending on the damage that would be risked by disclosure.
Where there is no potential for damage, classify the data as Confidential
Where the potential for damage is limited, classify the data as Sensitive
Where the potential for damage is very high, classify the data as Secret.
The [DC1] Data Classification Criteria within the UAE Smart Data Standards sets out the criteria to
be applied when making this classification.
As illustrated below, this classification will affect who can see the data. Confidential data will be
easier to share between officials to whom it is directly relevant, based on their area of work and
seniority. Sensitive data is more restricted with new access permissions requiring explicit approval.
Secret data will have access strictly controlled to named individuals.
Where a dataset cannot be classified as Open after following Steps [A]-[D]
By default, all Confidential and Sensitive datasets should be included in the published version of the
Entity’s Data Inventory. That is, it will be a matter of public record that the Entity holds the data,
even though the data itself will not be accessible except to authenticated and authorized users.
If an Entity wishes to make an exception to this, it should demonstrate that simply putting into the
public domain the fact that the dataset exists (as opposed to the data itself) will cause negative
impacts of the type considered in Step [C]. This decision should be agreed personally by the Entity’s
Director of Data.
Where data has been categorized as Confidential or Sensitive, a balance needs to be struck between
the need for confidentiality and the benefits of openness. It may be possible to publish a summary,
redacted version, extract, or other derivative of the data, which would have value as open data but
avoid the negative effects identified at Step [C].
For example, personal data can be removed from a dataset through a range of anonymization
techniques, as illustrated below. To anonymize data (that is, to conceal the identity of the people it
relates to), European Union guidance states that data "should be stripped of sufficient elements so
that the data subject can no longer be identified"; specifically, data should be processed in such a
way as to make it impossible to identify a natural person by all the means reasonably likely to be
used. It should be borne in mind that a proper anonymization process is not reversible.
Anonymization Process (Hiding Personal Identity)
Consider a dataset of school students’ educational results. The data would be of value in various ways: for
example, to researchers looking at variation in educational achievement between different genders or
different areas, or economic and social value through an app provided by a startup to help parents compare
different schools. However, the dataset has been labelled as Confidential because the records include
personal information about students and releasing the dataset would breach their privacy.
In this example, there are a number of ways that a derivative dataset could be prepared and published,
depending on the details. It may be that simply anonymizing the records would be sufficient, as individual
students could no longer be identified. If the data is very granular and specific, it may need to be aggregated
or small numbers suppressed to ensure that individual results or performance can’t be traced to particular
people. In this case, results could be shown by year group, gender and school, or with particular attributes /
fields removed.
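The aggregation and small-number suppression described in this example can be sketched in code. The following is an illustrative sketch only: the records, the grouping fields and the threshold of 3 are hypothetical, and a real implementation would follow the Entity's own disclosure-control rules.

```python
from collections import Counter

# Hypothetical student records: (school, year_group, gender, passed_exam).
# Direct identifiers (names, student IDs) have already been removed.
records = [
    ("Blue Water High", 6, "F", True),
    ("Blue Water High", 6, "F", False),
    ("Blue Water High", 6, "F", True),
    ("Blue Water High", 6, "M", True),
    ("Green Tree Academy", 6, "F", True),
]

SUPPRESSION_THRESHOLD = 3  # hypothetical minimum group size

def aggregate_with_suppression(rows, threshold=SUPPRESSION_THRESHOLD):
    """Aggregate pass counts by (school, year group, gender), suppressing any
    group smaller than the threshold so results cannot be traced to a person."""
    totals, passes = Counter(), Counter()
    for school, year, gender, passed in rows:
        key = (school, year, gender)
        totals[key] += 1
        passes[key] += int(passed)
    return {
        key: ({"students": n, "passed": passes[key]} if n >= threshold else "suppressed")
        for key, n in totals.items()
    }

print(aggregate_with_suppression(records))
```

Groups of one or two students are replaced by "suppressed", while larger groups are published only as counts.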
Data Custodians should therefore consider whether it would be possible to publish a modified
version of the data. If there is the possibility of a derivative dataset that avoids the barriers and
negative effects in Steps [B]-[C], or where the negatives are outweighed by public interest as in Step
[D], then the Data Custodian should list a new derivative (or ‘child’) dataset in the Data Inventory,
noted as such and linked from the original dataset, but classified as Open Data.
This dataset should then also be catalogued following the rest of the Data Conformance process.
There may also be cases where Sensitive or Secret datasets could be summarized or
otherwise adapted in ways which, while still not allowing open publication, might enable less
restrictive sharing across Government Entities. Again, if this is the case, then a new dataset should
be created on the Data inventory, at the appropriate lower classification level.
2H. Add classification and documentation to dataset metadata, as part of the Data
Inventory
The classification, and the reasoning supporting it, should be documented in order to inform the
Data Inventory. The classification will form a mandatory part of the metadata for the dataset, along
with the other elements specified in the [DE2] Metadata specification within the UAE Smart Data
Standards. For a smaller Entity this could be done in a standalone document or spreadsheet, but a
large Entity with sufficient technical resources may wish to install their own data catalogue, allowing
data custodians from each department to enter and edit metadata on the datasets for which they
are responsible.
Once all datasets have been fully catalogued, the classification and its supporting documentation
(along with the rest of the Metadata and Format sample) will be collected by the Data Management
Officer and reviewed. The Data Management Officer should then include all relevant results and
metadata for those datasets in the Data Inventory for the Entity.
The resulting catalogued Inventory should then be reviewed internally by the Director of Data. The
Director of Data should confirm that:
Every dataset being catalogued in the current batch (as identified in the Prioritization
process) has been classified correctly
Where a dataset has been classified other than Open, a proper consideration has been given
to whether a derived dataset could be recorded as Open or with a less Confidential
classification (evidenced by the documented reasoning)
The reasons for classifying any data as non-Open are documented in the Data Inventory.
5.2 Formatting data
Purpose This Guidance Note outlines the recommended process for ensuring that a
dataset is correctly formatted in accordance with UAE Smart Data Standard:
[DE1] Data Formats.
When to use Before a dataset is published as open data or exchanged with other Entities, the
data should be correctly formatted.
Responsibility The Data Custodian is accountable for ensuring the correct formatting of each
dataset for which he or she is responsible, but may delegate responsibility for
the work to a Data Specialist.
Process
Each dataset that is published openly or exchanged with other Entities by a Government Entity
should comply with the [DE1] Data Formats specification. To achieve this, we recommend the
responsible Data Specialist should:
1. Identify the type of dataset that is being prepared for conformance
2. Choose an appropriate format to match that type
3. Produce a sample dataset which can be easily shared, shown and approved
4. Lastly: add the format to the dataset metadata and continue with conformance process
[Figure: the formatting process. Data Specialists: (1) identify the type of dataset; (2) choose an appropriate format; (3) produce a sample dataset. Data Management Officer: oversee and manage the data compliance process of the data custodians and data specialists. Director of Data: (4) validate and add to the expanded Data Inventory.]
1. Identify type of dataset
The first step in choosing a data format is to determine what kind of data you are dealing with.
Different types of data have different properties and need to be formatted in different ways.
Tabular data
Most government data are tabular data. If the data you are dealing with is a list, or would make
sense to record in a spreadsheet, then it is almost certainly tabular data.
Tabular data consists of rows, each of which is an individual record in the dataset, and columns, each
of which represents one field of the record. For instance, a dataset about schools might be:
Unique Id Name Highest age Lowest age
AB292 Blue Water High 11 5
HG383 Green Tree Academy 11 5
The second row containing “Blue Water High” is the record about Blue Water High, and the column
titled “Highest age” contains the data from the highest age field about each school.
Geospatial data
Geospatial data relates to information about how you would draw things on a map.
We know that data is geospatial when:
It contains the coordinates used to point to something on a map - for instance a latitude and
longitude pair - for example the location of parking spaces, or public libraries.
It contains the shape that we would draw onto a map to represent a particular area. For
instance: data about the catchment area for a school; the boundaries of an electoral district;
administrative regions for school districts; or zoning areas for planning permission.
Real-time data
Real time data is generally provided immediately via an API (Application Programming Interface) that
can be consumed by other software applications. Data is real time if it changes so frequently that
most questions you would ask about it would be quickly out of date.
One example would be the status of trains on a rail network, or information about current flights -
departures, arrivals and delays at an airport.
Data being provided to power real-time services which frequently access or need to update records
automatically should also be provided via an API.
Structured non-tabular data
Some data is structured, but does not fit into a tabular form in a natural manner. If your data is
hierarchical or contains many levels, then it is likely structured non-tabular data. Examples would
include the organization chart for your department, or a project plan.
2. Choose an appropriate format
Once you have identified the type of data you are dealing with in any specific dataset, the
following table sets out the format requirements to use for common types of data:
In addition to the above generic criteria, there are format-specific criteria detailed below.
Data Format Conformance requirements
CSV data The format of a CSV dataset will be conformant if:
It contains a header row which includes the names of the columns
The formatting of dates or numbers is consistent throughout the whole file
It does not include empty rows
It does not include rows with missing or extra cells
It does not use header names more than once in the same file
It does not include any commentary or explanatory text
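Most of these CSV conformance rules can be checked automatically. The following sketch is illustrative only: the function name, the exact violation messages and the subset of rules checked (unique header names, no empty rows, no ragged rows) are assumptions, not part of the Standard.

```python
import csv
import io

def check_csv_conformance(text):
    """Return a list of violations of basic CSV conformance rules:
    header present with unique names, no empty rows, no missing/extra cells."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if len(set(header)) != len(header):
        problems.append("duplicate header names")
    for i, row in enumerate(rows[1:], start=2):
        if not row or all(cell.strip() == "" for cell in row):
            problems.append(f"row {i} is empty")
        elif len(row) != len(header):
            problems.append(f"row {i} has missing or extra cells")
    return problems

good = "Unique Id,Name,Highest age\nAB292,Blue Water High,11\n"
bad = "Unique Id,Name,Name\nAB292,Blue Water High\n"
print(check_csv_conformance(good))  # -> []
print(check_csv_conformance(bad))   # -> ['duplicate header names', 'row 2 has missing or extra cells']
```

Consistency of date and number formatting, and detection of commentary text, would need additional domain-specific checks.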
Structured non- The format of structured non-tabular data is conformant if:
tabular data a) It conforms to a pre-existing open standard for representing such data, such as GTFS,
Popolo, or the Schema.org job posting standard
or
b) It is in a valid open machine-readable standard such as JSON, XML and:
The structure of this data is clearly documented and published alongside it
The structure of the data is appropriate for re-use given the nature of the domain
to which the data relates.
Real-time and An API is conformant if the API endpoint and API documentation are available.
service data API documentation should include:
Clear reference information providing the functions, remote call and methods for
the API
Guidance to help developers experiment with the API
Information about security, versioning and rate limiting so users can plan their
commitment to using the API
Entities may provide an API to their data in addition to publishing or exchanging the data in
one of the other formats.
Alternative formats
If you want to use an alternative format not listed above, you should have a clear reason for why
that is the most appropriate format to publish in. You should also check you’ve selected the most
open, standardized and machine-readable format available to meet the requirements.
CSV data If a field contains a comma, a line ending or a double quote, then the field is escaped by
wrapping it in double quotes. Within a field that is escaped like that, any double quotes
are doubled up.
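Standard CSV libraries apply exactly this escaping automatically, so publishers rarely need to implement it by hand. A quick illustration using Python's standard csv module (the record values are hypothetical):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
# A field containing a comma and a double quote is wrapped in double quotes,
# and the embedded double quote is doubled up.
writer.writerow(["AB292", 'Blue Water "High", Dubai'])
print(buf.getvalue())  # -> AB292,"Blue Water ""High"", Dubai"
```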
Geospatial data Linting tools are available for GeoJSON, such as http://geojsonlint.com/, which will help
catch errors in your data files. For KML it is possible to validate your data against the KML
Schema (https://developers.google.com/kml/schema/kml21.xsd?csw=1).
High quality open tools exist to convert geospatial data between formats, and can be
included in automated dataset generation pipelines to easily publish in multiple formats.
One good example is Ogr2Ogr http://www.gdal.org/ogr2ogr.html
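Alongside the linting tools above, a basic structural check of a GeoJSON file can also be scripted. This sketch (the function name is illustrative) checks only the top-level FeatureCollection structure, not the full GeoJSON specification:

```python
import json

def looks_like_geojson_collection(text):
    """Light structural check for a GeoJSON FeatureCollection; a real linter
    checks far more (geometry types, coordinate ranges, etc.)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict) or doc.get("type") != "FeatureCollection":
        return False
    features = doc.get("features")
    if not isinstance(features, list):
        return False
    return all(
        isinstance(f, dict)
        and f.get("type") == "Feature"
        and "geometry" in f
        and "properties" in f
        for f in features
    )

library = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [55.2708, 25.2048]},
        "properties": {"name": "Public Library"},
    }],
}
print(looks_like_geojson_collection(json.dumps(library)))  # -> True
```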
Structured non- For many types of common dataset there exist open standards for representing that
tabular data information as structured data, which should be re-used wherever possible.
Examples of such standards include:
- Schemas found on http://schema.org/
- The Popolo data standard for people, organizations and voting
http://www.popoloproject.com/
Non-tabular structured data should in general use JSON, unless there is a clear reason to
use an alternative format, such as a common standard in an alternative format (e.g. GTFS
for transport data).
Real-time data APIs should be designed to meet the requirements for your use-case and with privacy and
and APIs security built in. Where possible, ensure data minimization – giving access to the smallest
amount of information required for the service outcome or to enable a decision. For
example, sending ‘yes’, ’no’ or ‘not found’ in response to a query of whether a citizen or
user is over 18 or has a valid driving license instead of sending personal information.
Guidance on good practice when designing and documenting APIs can be found here:
- UK Government Service Manual
- US White House API standards
An example of data API documentation for the UK Government Registers is here.
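The data-minimization pattern described above (answering an age check without revealing a date of birth) can be sketched as follows. The registry, identifiers and function name are hypothetical; a real service would sit behind authentication and the secured data platform.

```python
from datetime import date

# Hypothetical registry; in practice this sits behind a secured data service.
REGISTRY = {"id-0001": {"date_of_birth": date(1990, 5, 17)}}

def is_over_18(user_id, today=date(2025, 1, 1)):
    """Answer an age check with 'yes', 'no' or 'not found',
    never disclosing the date of birth itself."""
    record = REGISTRY.get(user_id)
    if record is None:
        return "not found"
    dob = record["date_of_birth"]
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return "yes" if age >= 18 else "no"

print(is_over_18("id-0001"))  # -> yes
print(is_over_18("id-9999"))  # -> not found
```

The caller learns only the answer to the stated business question, which is the minimum required for the service outcome.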
The sample is representative of what will be available to users. Its purpose is to help the Data
Custodian, Data Management Officer, Director of Data and the Federal Data Management Office see
that the data:
- Conforms to the Smart Data Framework standards
- Makes sense as a dataset
5.3 Documenting a permissions model for shared data
Purpose This Guidance Note outlines the recommended process for determining who
may access a dataset and with what level of access, in conformance with the
[DE7] Shared data access permissions specification.
When to use When preparing Confidential or Sensitive data for exchange with another Entity
for the first time. Also when responding to future requests for additional access
permissions.
Process
For each dataset in the current prioritized batch being catalogued, the responsible Data Custodians
need to ensure that the requirements of the [DE7] Shared data access permissions specification are
met in respect of Confidential and Sensitive Data. The following process is recommended.
[Figure: the permissions process. Data Custodians and Specialists: (1) baseline the current access permissions; (2) check against the UAE Privacy Principles; (3) document an initial set of Shared Data Access Permissions. Data Management Officer: oversee and manage the data compliance process of the data custodians and data specialists within each department, ensuring each dataset is fully described and quality improved. Director of Data: (4) establish longer term processes to review and respond to requests for new Shared Data Access Permissions.]
Similarly, are there individuals or groups of people in other organisations who are permitted
access to the data, and on what basis?
Does the Entity have the consent of the data subject to share their data with all those people
who currently access it?
If not, is the dataset covered by sector specific regulations which mean that such consent is
not required?
Are there controls in place to ensure that people permitted access to the data may only use
it for specified business purposes?
Is the level of access proportionate to the stated purpose? (For example, if an official has a
business need to check whether an individual is over 18, they should be permitted yes/no
query access to the data rather than being able to see the date of birth of the individual.)
Entities should embed the following UAE Data Privacy Principles in their data management practices,
and in those of third parties contracted to manage data and services on their behalf.
Data Privacy Description
Principles
1. Consent Personal Data in relation to individuals and Commercial Data in relation to Private
Entities should not be disclosed or shared without the data subject’s consent.
When providing a service to an individual or a Private Entity, Government Entities
should seek the consent of that data subject for the data to be exchanged with
other Government Entities for the purpose of enabling any Government Entity to
provide services to the data subject without the need for the data subject to
provide the same information again.
2. Transparency Data subjects should be informed - at the point of data collection - when and by
whom their data is being collected, why it is needed, and how it will be used.
3. Purpose Data should only be used for limited and explicitly stated purposes and not for
any other purposes without first gaining informed consent from the data subject.
4. Proportionality When data is requested and stored, the type of data collected should be the
minimum required to carry out the stated purpose, individual users of the data
should only be given the minimum access to that data that they need, and the
data should not be kept for longer than is necessary for that purpose.
5. Personal access Data subjects should be enabled to:
and control - Access and take copies of data that is held about them
- Correct inaccuracies in data that is held about them
- Request removal of data that is held about them, but is no longer relevant
or applicable to the business of the Entity
6. Security Collected data should be protected by robust and tested security safeguards
(technical and organizational) against such risks as loss and unauthorized access,
destruction, use, modification or disclosure.
To help achieve this, Government Entities should
7. Sectoral Each sector has its own laws and regulations, some of which are relevant to the basis
compliance on which data can be shared with other entities or with the public. Examples of these
laws include the United Arab Emirates Penal Code, the Copyrights’ Act, and the
Telecommunications’ Act.
Entities should ensure they comply with both relevant sectoral regulations and
this Standard, and should notify the Federal Data Management Office in the
event of any perceived conflict.
8. Documentation Entities should document who is permitted to access each data set, either in the
form of the [DE5] Open Data License (for all Open Data) or through a
documented set of [DE7] Shared Data Access Permissions.
Entities should produce and maintain privacy metadata in relation to these access
permissions, as part of their broader work on [DE2] Metadata, and store this in
their Data Inventory.
9. Awareness Entities should develop an awareness programme for their data privacy policy,
which shall be disseminated to all staff within the Entity who manage data (both
from business and technical areas) in order to remind them of the Entity's
obligations and their personal responsibilities concerning data privacy.
10. Accountability Entities should establish and publicise effective complaints and redress
mechanisms for data subjects who believe their data is not being managed in
accordance with the above principles.
Who may have access to the shared data. These permissions may be given to either:
- Named individuals
- One or more classes of individuals, such as government employees:
In a specific professional function (such as finance, HR, operations, IT)
In a specific grade
In specific positions (such as Head of Finance)
In specific Entities, or departments within Entities
With specific levels of security clearance
- A combination of the two.
What purpose this access is for. This documentation is particularly important to ensure
conformance with the ‘purpose’ and ‘proportionality’ principles of [DE6] Data protection
and privacy and to enable effective auditing.
The level of access that they may have. These permissions (which may be different for
different data users) should specify whether access to the dataset is permitted as:
- Query-only access
- Read-only access
- Read-write full access.
Key elements of this documentation should then be codified in the metadata for the dataset – see
Guidance Note 5.4: Adding metadata and schema – as the dataset moves to the next stage of the
Data Conformance process.
Government Entities should embed the following Access Permission Principles in their data
management practices.
Access Permission Description
Principles
1. Entities should Access to shared data shall be approved by the Government Entity responsible for
facilitate cross- that data.
government However, data ownership does not mean the monopoly of data by any Government
data sharing of Entity, or entitle it to obstruct the reasonable needs of other parties to access that
their data data in pursuit of their legitimate functions. This means that:
- Whenever a Government Entity wishes to use data that is owned and managed
by another Government Entity, the data-owning Government Entity has a duty
to respond rapidly and positively to that request
- Data-owning Entities have a duty to invest in systems and process which
facilitate rapid, effective and secure data-sharing – in particular in respect of
datasets that have been identified as Primary Registries
- Government Entities may not charge other Government Entities to access their
Shared Data.
2. Use of the Wherever possible, Government Entities should exchange data via the Federal
Federal Electronic Data Platform, or Emirate-level electronic platforms that are securely
Electronic Data inter-connected with the Federal Electronic Data Platform. For Federal Government
Platform Entities use of the Federal Electronic Data Platform is mandatory, and exceptions to
this require prior written approval from the Federal Data Management Office.
3. Data Sharing Government Entities that share and exchange non-open data with other Entities
Access should document (and record within their [DE2] Metadata):
Permissions - Who may have access to the shared data. These permissions may be given to
should be either:
documented
Named individuals
One or more classes of individuals, such as government employees:
In a specific professional function (such as finance, HR, operations,
IT)
In a specific grade
In specific positions (such as Head of Finance)
In specific Entities, or departments within Entities
With specific levels of security clearance
A combination of the two.
- What purpose this access is for. This documentation is important to ensure
compliance with the ‘purpose’ and ‘proportionality’ principles of [DE6] Data
protection and privacy and to enable effective auditing.
- The level of access that they may have. These permissions (which may be
different for different data users) should specify whether access to the dataset
is permitted as:
Query-only access
Read-only access
Read-write full access.
For most data users, query-only access that returns the minimum necessary
information will be sufficient for their business purposes. Access permissions should
therefore be designed to give access to the smallest amount of information required
for the service outcome or to enable a decision. (For example, sending ‘yes’, ’no’ or
‘not found’ in response to a query of whether a citizen or user is over 18 or has a
valid driving license instead of sending personal information.)
4. Access to Entities should establish systems to ensure that:
shared data - A shared dataset can only be accessed by identified individuals, who have been
should be appropriately authenticated as being permitted such access under the terms of
secured and the Data Sharing Access Permissions
audited
- All electronic platforms providing access to Shared Data should store an audit log of what
data was accessed, when and by whom.
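The audit-log requirement above can be sketched as a simple append-only record written around each data access. The structure shown (field names, in-memory list) is illustrative only; a production platform would use an append-only, tamper-evident store.

```python
from datetime import datetime, timezone

audit_log = []  # in practice: an append-only, tamper-evident store

def access_shared_data(dataset_id, user_id, purpose, fetch):
    """Log who accessed which dataset, when and for what documented purpose,
    then perform the access itself."""
    audit_log.append({
        "dataset": dataset_id,
        "user": user_id,
        "purpose": purpose,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return fetch(dataset_id)

result = access_shared_data(
    "school-results-2024",
    "officer@entity.example",
    "eligibility check",
    lambda dataset_id: {"dataset": dataset_id, "rows": []},
)
print(audit_log[0]["dataset"])  # -> school-results-2024
```

Recording the documented purpose alongside each access is what later enables the audit trail for data subjects described under Recommended actions.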
Mandatory actions
Government Entities should:
Develop a detailed Data Sharing Plan, setting out how they will implement the Data Exchange
Principles described in this Standard, including any investments in systems and processes that
they will need. They should share this Plan with the Federal Data Management Office.
Respond in writing within a reasonable time to requests for data sharing from other Entities,
giving either:
- Agreement to the request, and a clear timetable for implementation
- A refusal of the request, accompanied by a clear rationale for the refusal that is rooted in
the principles of this Standard.
Notify the Federal Data Management Office of all requests from other Entities for sharing and
exchange of Confidential or Sensitive Data. This includes requests both for access to a dataset
which is currently not shared, and requests to add new individuals or classes of individual to the
Shared Data Access Permissions for a dataset that is already being shared across Entities.
Notification should be made as follows:
- Approved Confidential: For Confidential data sharing requests from other Entities which
the data-owning Entity approves, it may notify the Federal Data Management Office after
the event, for example by giving a quarterly update on all data-sharing initiatives it has
approved.
- Refused Confidential and Sensitive: For data sharing requests which the data-owning Entity
proposes to refuse, it should inform the Federal Data Management Office at the same time
as declining the requesting Entity, documenting its rationale for declining the request. The
office has the power to issue binding decisions to change data access decisions in the cases
of dispute between Government Entities or with third parties.
- Approved Sensitive: Given the extra sensitivity of such data, where a data-owning Entity
believes that sharing Sensitive Data with another Entity is in the public interest and
follows the principles of both this Standard and [DE6] Data protection and privacy, it should
consult the Federal Data Management Office before giving approval.
Recommended actions
When implementing Access Permission Principle 4 (“Access to shared data should be secured and
audited”), Government Entities are recommended to make this audit functionality openly available
for use by individual data subjects. This means:
Configuring electronic platforms and supporting business processes so that individual data
subjects (citizens, residents and businesses) can see an audit trail of who accessed their data,
and for which documented purpose (excluding security service or law enforcement access)
Providing mechanisms by which data subjects can raise concerns / escalate if they believe access has
been misused.
The service standard set out in the [DE7] Shared Data Access Permissions specification is that
Government Entities should respond to such requests within a reasonable period, giving either:
Agreement to the data sharing request, and a clear timetable for implementation
A refusal of the request, accompanied by a clear rationale for the refusal that is rooted in
the principles of the [DE7] Shared Data Access Permissions specification.
In certain cases, the Specification requires this response to be agreed with the Federal Data
Management Office.
In the early stages of sharing a dataset with other Entities, such responses will be managed on a
case-by-case basis. As the Entity develops more experience of assessing the privacy implications of
data sharing, it will increasingly want to codify that experience into a set of rules that can be
simplified and automated, to speed up access permission management in a risk-based way.
5.4 Adding metadata and schema
Purpose This Guidance Note outlines the recommended process for ensuring that a
dataset has all required metadata in accordance with UAE Smart Data
Standards: [DE2] Metadata and [DE3] Schema.
When to use Before a dataset is published as open data or exchanged with other Entities,
the data should be appropriately described so that it is discoverable and users
understand its reliability.
The Data Custodian is accountable for ensuring the dataset contains all
Responsibility
required metadata. The Data Specialist is responsible for providing a data
Schema if applicable.
Process
For each dataset in the current prioritized batch being catalogued, the responsible Data Custodians
and Data Specialists need to ensure the Metadata and Schema and Data Quality requirements are
met. The following process is recommended.
Director of Data
Data Management Officer: oversee and manage the data compliance process of the Data Custodians and Data Specialists within each department, ensuring each dataset is fully described and its quality improved; then (step 4) validate and add each dataset to the expanded Data Inventory.
Data Custodians and Specialists: (step 1) ensure all Mandatory Metadata is added; (step 2) add existing / easy-to-fill Recommended Metadata; (step 3) add Recommended Metadata and a Data Schema for Primary Registries and high-priority datasets.
1. Ensure all Mandatory Metadata is complete for the dataset – this should include the title,
description, subject, format, size, publisher, custodian, classification, access permissions, license,
coverage (temporal and geospatial) as well as the data files and last updated timestamp.
Discoverability
Title (Mandatory): Brief descriptive name for the dataset. Should communicate subject and scope.
Description (Mandatory): A description of the dataset. This could provide more detail about what the data contains and what it is about, how and why it was collected, and any known errors or limitations. Ideally the description covers all the relevant context that would be useful for users to help them decide if this data is fit for their purpose.
Subject (Mandatory): The top-level theme or category for the data, for example: health, transport, business, education. This should use a pre-defined taxonomy or vocabulary that is common across the UAE. It could have one level or include sub-categorizations.
Technical information
Data files (Mandatory): Links to or uploads of the data relevant to this dataset. Might be in multiple formats (for example as CSV and Excel). If providing an API, ensure the API endpoint and API documentation are linked to or uploaded.
Size of the dataset (Mandatory): Size of the dataset files (in MB, kB, etc.). If using a platform to publish the data, this can be configured to display automatically.
Last Updated (Mandatory): Timestamp of when this dataset was last updated. If using a platform to publish the data, this can be configured to display automatically.
Source
Publisher (Mandatory): The name of the Entity that owns the dataset. This should be in the format "Entity, Business Unit", for example "TRA, Wireless Networks & Service Section".
Contact information (Recommended): The email address or web form that should be used to contact the Entity for queries, feedback or requests concerning this dataset.
Temporal Coverage start date (Mandatory): Indicates the earliest date that the data in this dataset relates to. Should use ISO 8601 date format.
Temporal Coverage end date (Mandatory): Indicates the latest date that the data in this dataset relates to. Should use ISO 8601 date format.
Geographic coverage (Mandatory): Region covered by this data, for instance the name of a city, district, council or country.
Language (Mandatory): Language used in the dataset. Should use ISO 639 codes.
Access
License (Mandatory): Link to or copy of the license terms under which the data may be used. By default, this should use [DE4] Open data licensing for Open Data.
Personal data? (Recommended): Does this data contain any personal data? Yes/No. Personal data means any information relating to an identified or identifiable natural person; an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data or an online identifier, or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that person.
Sensitive personal data? (Recommended): Does this data contain sensitive personal data? Yes/No. Sensitive personal data are personal data that directly or indirectly reveal an individual's family, racial or ethnic origin, communal origin, political opinions, affiliations, religious or philosophical beliefs, union membership, criminal record, health, sexual life, genetic data or biometric data.
Reliability
Provenance (Recommended): Details of how the data was collected, processed, redacted or amended.
Data provenance documents the sources, inputs, organizations, systems, and processes that have formed and influenced the data, in effect providing a historical record of the data and its origins. This allows data-dependency analysis, awareness of limitations and coverage, error detection and auditing. It helps other users (including the Entity) understand the limitations and level of trust they can place in the data.
The Entity should aim to standardize its provenance descriptions over time and provide them in a machine-readable form, for example by using the World Wide Web Consortium standard for data provenance.
Examples of what to include in the methodology:
- Where the data came from (survey, third party, etc.)
- Sample size (if a survey)
- Data collection method (face-to-face interviews, online, requests from authorities)
- Exclusions (what data was not included and why)
- Statistical aggregation methods used (small-number suppression, averaging, etc.)
The Office for National Statistics in the UK regularly publishes provenance and methodology data which can be reviewed as a real example.
Publishing Frequency (Recommended): The rate at which the data in the dataset will be updated. Responses should correspond to a value contained in the Dublin Core Collection Description Frequency Vocabulary (which describes frequency periods from "triennial" through to "continuous"). Updates are expected to be additional data files following the same schema, but with new temporal coverage (e.g. the latest month).
To ensure good practice, the Data Custodian should:
flag any known issues with data collection, or note where particular fields are unvalidated and
rely on the data subject self-reporting.
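The Mandatory Metadata check in step 1 lends itself to a simple automated completeness test. The sketch below is illustrative only: the machine-readable field names are assumptions standing in for the [DE2] fields listed above.

```python
# Sketch: check a dataset's metadata record for the Mandatory fields of
# [DE2]. The snake_case field names here are illustrative assumptions.
MANDATORY_FIELDS = [
    "title", "description", "subject", "data_files", "size",
    "last_updated", "publisher", "temporal_coverage_start",
    "temporal_coverage_end", "geographic_coverage", "language", "license",
]

def missing_mandatory(metadata: dict) -> list[str]:
    """Return the Mandatory fields that are absent or empty."""
    return [f for f in MANDATORY_FIELDS if not metadata.get(f)]

# A partially described dataset: several Mandatory fields are still missing.
record = {
    "title": "Registered Wireless Networks",
    "publisher": "TRA, Wireless Networks & Service Section",
    "language": "en",
}
print(missing_mandatory(record))
```

A publishing platform would typically run such a check automatically before allowing a dataset to move to the next step.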
2. Add any existing or easy to fill Recommended Metadata fields out of tags, schema, unique ID,
contact information, source system, provenance, publishing frequency, known issues and data
completeness as well as details on whether the data contains personal or sensitive personal data
or intellectual property and associated terms of use.
3. Add all Recommended Metadata and a data Schema, using the relevant standards, for Primary
Registries datasets and for high-priority, structured and regularly updated datasets (as defined in
Guidance Note 4: Prioritization Criteria and Process). These might be datasets needed for cross-
entity service delivery projects or which have been frequently requested by users and deliver on
strategic objectives.
SQL databases will already have a schema in place, although Entities may want to ensure these
are fit for purpose by modelling the data to be stored and deciding on the relationships,
vocabularies, validation and range(s) to be applied.
Structured non-tabular (e.g. JSON) data should provide a schema in JSON Schema format
according to the specification here: http://json-schema.org/.
Tabular (e.g. CSV) data should be expressed as a JSON Table Schema according to the open
specification here: https://frictionlessdata.io/specs/table-schema/.
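To illustrate, a minimal Table Schema-style description of a two-column CSV, with a hand-rolled row check, might look like the sketch below. This is an assumption-laden simplification: the field names are invented, and in practice the frictionless tooling linked above provides full validation.

```python
# Sketch: a Table Schema-style description of a two-column CSV, with a
# minimal hand-rolled validator. Field names are illustrative assumptions.
table_schema = {
    "fields": [
        {"name": "emirate", "type": "string", "constraints": {"required": True}},
        {"name": "population", "type": "integer", "constraints": {"required": True}},
    ]
}

def validate_row(row: dict, schema: dict) -> list[str]:
    """Return a list of validation errors for one CSV row (as a dict)."""
    errors = []
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        value = row.get(name)
        if field.get("constraints", {}).get("required") and value in (None, ""):
            errors.append(f"{name}: missing required value")
        elif ftype == "integer":
            try:
                int(value)
            except (TypeError, ValueError):
                errors.append(f"{name}: not an integer")
    return errors

print(validate_row({"emirate": "Dubai", "population": "3400000"}, table_schema))  # []
```

Publishing the schema alongside the data lets re-users run the same checks before relying on the dataset.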
A human-readable version of a schema can be viewed at: http://csvlint.io/schemas/530b16c163737676e9260000
Having completed this process, proceed to assess whether the dataset meets the requirements of
the Data Quality Standard, following the process described below in Guidance Note 5.3 Managing
data quality. If Guidance Notes 5.1, 5.2, 5.3 and 5.4 have been followed properly, then all the
mandatory quality requirements will already be met. But there may be additional steps you take
in improving data quality which will generate additional metadata requirements.
The Entity-level context for data quality
Managing data quality should be a strategic priority for the Entity as a whole. Having discoverable,
reliable, trusted and well managed data enables the Entity to be more efficient, effective and
accountable. It allows for the automation of processes as well as enabling better service delivery and
decision making. The [DQ3] Data Quality Improvement Plan specification requires all Government
Entities to develop a Data Quality Plan. Guidance on how to do this at an Entity-wide level is given
in:
Guidance Note 1: Establishing data governance roles and processes, which gives advice on
specific roles that will have key responsibilities for data quality and suggests there need to
be at minimum:
- A full time dedicated role of a Data Management Officer or equivalent who day-to-day
manages and tracks the quality of the Entity’s data
- Data Custodians and Data Specialists who own and are responsible for the data quality
of the data assets they manage, undertaking data quality assessments and data
cleansing
- A Director of Data to set the strategic direction and requirements for data quality
management across the Entity
Guidance Note 2: Building a smart data roadmap , which gives advice on how to embed data
quality within a broader roadmap for managing and improving data
specific proprietary software and ensuring it is interoperable with other data) and,
- Is made available for bulk download or via an API either on the web or through a
platform with reliable lasting permanent access that is supported over time.
Accuracy The data is sufficiently accurate for its intended use and any gaps, known limitations,
approximations or errors are clearly described so that re-users understand the
limitations of the data.
Users both inside and outside the Entity should have a way to communicate their
requirements for greater accuracy and have those acted on by the Data Custodian /
Data Specialist responsible.
Descriptiveness Data has context so that potential re-users know what is in the data and how reliable it
is, so that they can effectively judge whether it is fit for their purpose.
This means all datasets should have associated metadata and ideally a schema
specifying the ranges and values of each field.
Re-users should be able to understand how the data was created and processed, its
temporal and geographic coverage, granularity and limitations.
Timeliness Data is published or made accessible in real-time or soon after the data has been
generated. The data being published for re-use as open data or exchanged with other
Entities should be the same data as that being used for its intended purpose within the
data generating Entity.
If the data is regularly updated (such as a monthly report), the update schedule should be
clear in the metadata and should be followed reasonably closely, to ensure re-users can
rely on and trust this data for operational needs and decision-making purposes.
Completeness The data should make sense as a complete dataset. It should be usable without
requiring other data (other than Primary Registries data) to make sense or use of it. This
means data should be published or exchanged as datasets which are comprehensive, and
relevant missing records should be flagged.
Validation The data should be valid and effort made to ensure it is accurate and reliable over time.
This means for core and frequently updated data:
- Using a schema
- Having a clear data model with unique identifiers for the main objects in the
data (for example National ID for citizens)
- Regularly cleaning and testing the data to remove errors or duplicates.
Not all of these Principles will need to be applied in full to every dataset in order for data to be
appropriate for purpose. Data quality may be appropriate for current purposes even if one or more of
the principles is overlooked. (For example, data may be collected with a lower accuracy level to provide
data in a timely manner if time is a priority.) This means that these Principles should be balanced
against the importance and intended use of the relevant data.
However, some quality characteristics are essential in order to enable effective data publication and
exchange. These characteristics have been built in as mandatory elements of the Data Exchange
Standards, as summarised in the table below.
Once roles are in place and there is a prioritized inventory of the Entity’s data assets, the assigned
Data Custodians should first assess the state of data quality, then assess the quality level required by
data users and then make a plan to close the gap between the two over time, as illustrated below.
Director of Data
Data Management Officer: oversee and support the data quality audit, user needs gathering and quality improvement plan; ensure plans are ambitious, well designed and achievable, as well as tracking and reviewing progress; (step 6) general data quality maintenance of core data functions.
Data Custodians and Specialists: (step 1) perform a data audit; (step 2) gather input from users; (step 3) define required quality level; (step 4) create Quality Improvement Plan; (step 5) report and track against targets.
1. Assess current data quality by performing a data audit
Use the Data Quality Maturity Matrix provided in Appendix B to assess the dataset against the
UAE Data Quality Principles, looking in turn at:
- Does the data have a clear owner? Is this the authoritative source of data?
- How accessible is the data?
- How accurate is the data?
- How well described is the data?
- Is the data up to date, and does it have a publishing schedule?
- Is the data complete? Can it be used and understood by itself?
- Has the data been validated against a schema or checked for duplication, errors, and
inaccuracies?
The Data Quality Maturity Matrix defines, for each of the seven Data Quality Principles, five levels of
maturity:
Level 1: Initial – unmanaged data, no owner, no open format, no metadata etc
Level 2: Partially conformant – the dataset has an identified owner and is making progress
towards conformance with the Data Quality Standard
Level 3: Conformant – the dataset meets all core requirements of data quality and UAE Data
Standards
Level 4: Improving – the dataset meets all core requirements and also is implementing
additional good practices
Level 5: Optimizing – data quality fully meets the needs of current and potential future users,
with clear systems for driving continuous improvement.
The detailed tool for use when completing this matrix is at Appendix B of the UAE Smart Data
Implementation Guide.
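As a sketch of how an audit result against the matrix might be recorded, assuming the seven principles implied by the audit questions and the five maturity levels described above (the principle names used as keys are illustrative):

```python
# Sketch: record a maturity level (1-5) per Data Quality Principle and
# derive an overall status. Principle names are illustrative assumptions.
LEVEL_NAMES = {1: "Initial", 2: "Partially conformant", 3: "Conformant",
               4: "Improving", 5: "Optimizing"}

audit = {
    "Ownership": 4, "Accessibility": 3, "Accuracy": 3, "Descriptiveness": 2,
    "Timeliness": 3, "Completeness": 3, "Validation": 3,
}

# A dataset is only as mature as its weakest principle.
overall = min(audit.values())
conformant = overall >= 3  # Level 3 = meets all core requirements
print(LEVEL_NAMES[overall], "- conformant:", conformant)
```

Here the weak Descriptiveness score drags the whole dataset below the Conformant threshold, which is exactly the kind of gap the improvement plan in step 4 would target.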
2. Gather input from users
It is also important to assess the needs of potential users who may not yet have access to the data or
would use the data if it was of a higher quality or reliability. Therefore, it is recommended to invite
input and feedback from potential future users. The Entity should publish its Data Inventory on its
electronic portal, including datasets that have not yet been prepared for publication and exchange,
in order to give potential users visibility of its data assets. It should also provide online channels for
data users to give feedback on their priorities for expanding the number of datasets that are
available on the portal and improving the quality of existing open data.
3. Define and determine required data quality per dataset or data source
Using the feedback from users, define what good data quality looks like for the dataset. Develop a
documented statement of Data Quality Requirements – including what the appropriate target level
should be for each element of the Data Quality Maturity Matrix provided in Appendix B. Record any
specific measures or indicators which are key to ensuring the data is reliable and fit for purpose for
the majority of users.
4. Create a plan to close the gap between existing and required quality level
Create a plan to reach defined quality level. This may require new processes, change management,
upskilling and training, better tools or other steps. Ensure your plan is ambitious, but realistic.
Milestones should be specific, measurable and time bound with clarity on who is responsible for the
milestone being achieved and how this will be measured and tracked.
The Data Specialist and the Entity's IT and security teams should create automated tracking for
quantitative measures, such as: % of metadata complete, use of open machine-readable formats, % of
publishing frequency dates met, whether datasets have a schema, results of data validation
against the schema, and results of scripts that check for duplicate records or non-conforming data entries.
Data Custodians should also track qualitative measures such as feedback from users and impact of
use of data.
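One of the automated checks mentioned above, a duplicate-record scan, could be sketched as follows. The CSV content and the choice of `national_id` as the duplicate key are purely illustrative; the appropriate key fields depend on the dataset.

```python
# Sketch: scan CSV rows for duplicate records based on a chosen key field.
import csv
import io
from collections import Counter

csv_text = """national_id,name
784-1111,Ahmed
784-2222,Fatima
784-1111,Ahmed
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
counts = Counter(row["national_id"] for row in rows)
duplicates = {key: n for key, n in counts.items() if n > 1}
print(duplicates)  # {'784-1111': 2}
```

Run on a schedule, the same pattern feeds the quantitative tracking described above (e.g. "number of duplicate records per dataset per month").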
Embedding the [DQ1] Data Quality Principles across all data within an Entity will require a
phased, prioritised and Entity-wide plan of action.
Development and delivery of such a plan is a mandatory requirement for Government Entities.
An effective Data Quality Improvement Plan will be prioritized, baselined, user-focused, SMART,
managed and reported as described in the table below.
An effective Data Quality Improvement Plan is:
Prioritised The Plan should focus first on driving up the quality of data needed for Primary
Registries, the Entity's own core business functions, and other high-priority datasets.
Within these priority areas, it should focus first on fixing known quality issues, in
particular ensuring that the Core Quality Standards are met.
Baselined The Entity should ensure that its plans are informed by Data Quality Audits that give a
clear assessment of current performance against the [DQ1] Data Quality Principles.
These should include quantitative and qualitative measures of data quality, including
both use of the [DQ2] Data Quality Maturity Matrix and additional measures that are
relevant to the specific dataset.
User-focused For all priority datasets, the Entity should develop clear statements of Data Quality
Requirements. These should be evidence-based and reflect the documented quality
needs of users.
In developing these user requirements, the Entity should engage with existing internal
and external data users – but also consider the wider potential re-use of their data
(either as open data or shared data exchanged with other Entities).
These Data Quality Requirements should specify and define required data quality
measures for their different types of data sources and business processes, aligned with
the [DQ1] Data Quality Principles.
SMART For each priority dataset, the Entity should:
- Identify the gaps between the current baseline performance level, as revealed in
the Data Quality Audit, and the data quality requirement as expressed by users
- Set quantitative and/or qualitative targets for improvement. Targets should be
SMART (Specific, Measurable, Achievable, Relevant and Time-bound)
A one-size-fits-all approach to data quality targets across the Entity is not
recommended: rather, these should be related to the current and potential use of the
specific dataset, to ensure the quality is appropriate for that use.
Managed The Entity should set out an overall Entity-wide plan for how it will deliver its targets
for quality improvement. This should include:
- Establishing clear accountabilities for data quality, at the Entity-wide level and for
each dataset
- Establishing systems and processes that guarantee data quality as part of the
normal business activity of the Entity
- Building data quality requirements into any contracts and outsourcing of data
management or data generation
- Assessing data quality of third party suppliers (which could include another
government Entity, business partner, customer, service provider or other
stakeholder), and performing spot checks (ideally against Service Level
Agreements with the data supplier).
Reported Entities should establish systems to track and report on data quality status, with the
Entity’s Management Board receiving regular progress reports (for example, on a
quarterly basis) showing progress across the Entity as a whole and by individual
business units.
Ideally, elements of this reporting will be automated and managed in real-time, for
example through:
- automated reports on data quality indicators (such as completeness of metadata
fields, use of schemas, success rate of validation, use of open machine readable
formats, update frequency % met, etc)
- Regular analysis of structured data against its data schema.
The recommended data-cleansing steps are described below.
1. Extract data from operational data sources for profiling: Data profiling tools perform complex analysis on data, and performing this analysis directly against live data sources is not recommended. Data extraction may be performed using separate ETL tools, or may be a capability of the data profiling tools themselves.
2. Perform data profiling analysis: This shall occur as part of a regular data audit process, enabling data quality issues to be identified. The output of data profiling shall be used to build the technical knowledge base for data cleansing.
3. Build a cleansing knowledge base for each data profile: The cleansing knowledge base includes mappings and correction rules that may be automatically applied. For example, the range of mobile phone formats identified by data profiling may include (nnn) nnn nnnn, +nnn nnnnnnn and nnn nnn-nnnn; the knowledge base should include the rules for converting these formats into a single format. A knowledge base may include the ability to query external data services, such as telephone number validation, reference data management systems, and data-enriching systems, such as an Emirates ID service to provide more citizen profile data. Physically, the knowledge base may be one or more systems, and may include master data management tools, reference data management tools, and vendor-specific data cleansing solutions.
4. Automated cleansing using the knowledge base: Automated cleansing may be performed in batch against live systems, typically out of hours, and subject to sufficient testing. The batch size chosen should be determined by the smallest batch of data that can reasonably be completed within the time window allowed. The choice of records that form part of each cleansed batch shall be defined, for example, by order of insertion, by age (newest or oldest first), or by most active records first. Automated cleansing can also be applied to data extracts; however, the plan to refresh the live data with the cleansed data should be considered carefully to avoid conflicts where the live data has since changed.
5. Interactive data cleansing: Automatic matching will reject data that cannot be cleansed. The Data Custodian shall use this rejected data to perform manual cleansing. Cleansing decisions should be recorded and fed back into the knowledge base to improve the quality of automated matching. This iterative cycle will initially occur often during the development of the knowledge base.
6. Automated cleansing services: Automated cleansing services can then be delivered as interactive services, allowing information systems to have data validated and cleansed at the point of data entry. For example, a CRM system capturing a citizen's name and address may make a service request to the automated cleansing service to enrich the address, validate the telephone number, and match the individual citizen with their other records stored in datasets elsewhere within the Entity.
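The phone-format correction rules in step 3 could be sketched as a tiny knowledge-base entry. This is an illustrative assumption: the choice of a digits-only canonical format (which drops any leading "+") and the sample numbers are invented for the example.

```python
# Sketch: a tiny cleansing rule that normalizes the mobile phone formats
# mentioned in step 3 into one digits-only canonical format.
import re

def normalize_phone(raw: str) -> str:
    """Collapse (nnn) nnn nnnn, +nnn nnnnnnn and nnn nnn-nnnn
    to a single digits-only string (the '+' is dropped)."""
    return re.sub(r"\D", "", raw)  # strip everything except 0-9

samples = ["(050) 123 4567", "+050 1234567", "050 123-4567"]
print({s: normalize_phone(s) for s in samples})
```

A real knowledge base would hold many such rules per data profile, plus lookups against external validation services as described above.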
5.6 Validation and publication of data
Purpose This Guidance Note provides Entities with guidance on the process and steps
for validating and then publishing datasets that have been prepared for data
conformance using Guidance Notes 5.1 to 5.5.
When to use Before publication of Open Data or exchange of Shared Data
Responsibility Data Management Officer, reporting to the Director of Data who has overall
accountability for conformance with the UAE Data Exchange Standard.
Once the relevant Data Custodian has taken a dataset through the process described in Guidance
Notes 5.1 to 5.5 (that is: classify; format; document the permissions model; add metadata and
develop schema; manage data quality), the Data Management Officer and Director of Data will need
to validate and approve the dataset either for publication or for sharing and exchange with other
government entities over appropriate electronic networks.
First, the Data Custodian / Specialist should provide all of the relevant information (on classification,
format, metadata and quality) to the Data Management Officer in a single ‘conformant dataset’ with
the conformant data sample file, the metadata, and the details of the business processes needed to
support quality publication (what needs doing, who is responsible, timelines).
This complete dataset should be reviewed by the Data Management Officer and added to an Entity-
wide Data Inventory. The Data Management Officer should check that each dataset:
Has a classification. In cases where the classification is not Open, then a clear rationale for
this is documented and an Open derivative dataset provided
Has a data quality assessment report
Has a sample dataset in the appropriate format
Contains all Mandatory Metadata as defined in the [DE2] Metadata specification
Has easy-to-add and appropriate Recommended Metadata
That the above have been provided by a qualified person familiar with the data
The Director of Data should then satisfy themselves that each dataset is conformant to the Smart
Data Framework standards. In most cases, this decision will be taken within the Entity itself by the
Director of Data.
In some cases, however, the UAE Smart Data Standards require that the approval of the Federal Data
Management Office is given ahead of publication or exchange. In particular, the consent of the
Office will need to be given in advance of publication for:
a) Any Open data which the Entity believes it should charge users a fee to access, despite the
general principle that open data will be published free of charge as Open Data
b) Any data that Entities wish to exchange which has been classified as Confidential.
(Figure: approval to publish, and choice of publication platform)
Once satisfied that the relevant approvals are in place, the Director of Data should communicate to
the relevant teams that the compliant datasets can be published as open data or exchanged with
external Entities. As illustrated above, publication to the smart data electronic platform will be
through one of two routes depending on the type of data being published.
For Open data, compliant datasets should be published through the data-owning Entity’s online
portal. By default, this should be as Open Data - except in exceptional circumstances where an
access fee is charged following the approvals process illustrated above and in conformance with the
[DE5] Data commercialization and fair trading specification. The portal should:
Include a full list of datasets on the Entity’s Data Inventory (including as yet unpublished
data, in order to facilitate feedback on future publication priorities from data users)
For Open Data, the portal should:
- Provide a clear and user-friendly Open Data Licence, which complies with the [DE4]
Open Data Licensing specification giving a clear and unambiguous license to use and
distribute including for commercial purposes. (The UAE Federal Open Data License
at Appendix A is recommended as the ideal way of meeting this requirement.)
- Enable anonymous access to the data, without requiring users to register any
personal details or fill out forms
- Not charge any access fees
- Not discriminate between types of user
- Provide data in open, machine-readable formats that comply with the [DE1] Data
Formats specification, or enable direct API access to the dataset
- Provide bulk download functionality for data and guarantee a level of permanence by
not breaking URIs and ensuring original URIs redirect if the dataset location (URL)
changes.
- Provide metadata that complies with the [DE2] Metadata specification
- Provide measures of data quality, including use of the [DQ2] Data Quality Maturity
Matrix
- Provide online mechanisms for users to give feedback about data quality and to
express their views on future priorities for expanding the number of datasets that
are available on the portal and improving the quality of existing open data
Or, in exceptional cases where the Federal Data Management Office have approved the
Entity to charge an access fee for Open Data, the Entity should:
- Ensure that the rationale for charging, and the principles that the Entity applies to
ensure fair and competitive provision in line with the requirements of the [DE5]
Commercialization and fair trading specification, are published clearly on the
website
- Provide access to effective complaints and redress mechanisms, again in
conformance with the [DE5] Commercialization and fair trading specification.
For Shared Data, the Entity should use the Government Service Bus as the key platform for
exchanging its data with other Entities. No charge is required for Entities to integrate with the
Government Service Bus. For Shared Datasets that have been identified as Primary Registries, the
data should be made available:
Via API
Under the terms of a Service Level Agreement setting out the expectations that data users
can have in terms of the Entity’s commitment to the quality of the data
With privacy ‘designed-in’. This means that the API should give access to the smallest
amount of information required for the service outcome or to enable a decision. (For
example, sending ‘yes’, ’no’ or ‘not found’ in response to a query of whether a citizen or user
is over 18 or has a valid driving license instead of sending personal information.)
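The privacy-by-design response pattern described above might be sketched as follows. The registry lookup, the Emirates ID key and the field names are assumptions for illustration; a real service would query the Primary Registry over the Government Service Bus.

```python
# Sketch: an API handler that answers an age-check query with the minimum
# possible information ('yes' / 'no' / 'not found') instead of returning
# the citizen's record. REGISTRY stands in for a Primary Registry lookup.
from datetime import date

REGISTRY = {"784-1990-1234567-1": date(1990, 5, 1)}  # Emirates ID -> date of birth

def is_over_18(emirates_id: str, today: date) -> str:
    dob = REGISTRY.get(emirates_id)
    if dob is None:
        return "not found"
    # Subtract one if this year's birthday has not yet occurred.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return "yes" if age >= 18 else "no"

print(is_over_18("784-1990-1234567-1", date(2024, 1, 1)))  # yes
```

The caller never sees the date of birth itself, only the answer to the question the service is entitled to ask.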
Approval to charge for access to raw Open Data will only be given when this is clearly in the
public interest, and where it is not feasible for the Government Entity to collect and publish the
data without charging access fees.
In making this determination, the Federal Data Management Office will take into account that all
Government Entities are expected - as part of their routine operations and investment planning
– to continually improve the methods, quality and timeliness of the data they collect without
seeking to charge data users for this.
Whenever, in such exceptional cases, a Government Entity does charge fees for access to raw
Open Data it should follow these principles:
Raw Data Commercialization Principles
1. Public interest: Charged-for Open Data should be accompanied by a clear published explanation on the Government Entity’s electronic portal of why such charging furthers the goals of UAE Smart Government and is in the public interest.
2. Fair pricing and conditions: Open Data should be available to all users on a fair, reasonable and non-discriminatory basis.
3. Accountability: The Entity should establish and publicise effective complaints and redress mechanisms for third parties who believe that it is failing to comply with the above principles.
2. Fair competition: The Government Entity should ensure that it does not have an unfair advantage over third parties who might also wish to market similar services. In particular, this means:
- Publication of the underlying Open Data: the Government Entity should publish as Open Data on its electronic portal the underlying Open Data that it is using to create value-added services. This publication should be undertaken at the same time as, or before, launch of the Government Entity’s value-added data service.
- No use of Shared Data: for the purposes of developing a value-added data service, the Government Entity should only use data that has been classified as Open Data.
- No use of public funds: in order that Private Entities may compete on a fair basis in the provision of commercial services using Open Data, the Government Entity should set fees for any value-added data services in ways that at least recover the full costs of providing those services, including a reasonable return on investment, and should ensure that the provision of these value-added data services is not based upon anti-competitive support or funding from other Government Entities.
3. Fair pricing and conditions: The Government Entity should make the value-added data services available to all users on a fair, reasonable and non-discriminatory basis.
4. Accountability: The Entity should establish and publicise effective complaints and redress mechanisms for third parties who believe that it is failing to comply with the above principles.
APPENDIX A: UAE FEDERAL OPEN DATA LICENSE
The Federal Open Data License is shown below in two forms:
- A user-friendly, plain language summary for publication on the web pages of Open Data
Portals
- The detailed License terms which support this, which should be available for download
from Open Data Portals and linked from the summary.
Federal open data license: plain language summary
This is a summary of (and not a substitute for) the license that applies to Your use of Information
accessed via [name of Open Data Portal] ("License"). A copy of that license may be accessed
<here>.
1. Overview
We grant you a worldwide, royalty-free, perpetual, non-exclusive license to use and re-use the
Information that is available under this license freely and flexibly, subject to the conditions below.
Under this license you may:
- copy, reproduce and communicate to the public the Information in any format
provided that you:
- do not use the Information in any way that is unlawful and/or misleading to the general
public
- include the name or identification of the author and retain any copyright notice featured
in the original material
4. Exemptions:
The License does not permit the use of:
- any trademarks associated with a Database or with the Open Data Platform
Federal open data license: full text
When you access and use the Information, you accept and agree to be bound by the terms and
conditions of this License in connection with your use of the information provided by the License
issuer.
Federal Government Entity: Means any Ministry, authority, department, public body,
independent body, public institution, federal government council or any other governmental or
public institution of the federal government of the United Arab Emirates;

Original Materials: Means all the contents of any database (or any part thereof), including any
data, content, work products or other materials that have been collected in the database and
made available for disclosure by the license grantor to users under the terms of this general
license; or derived materials;

Modified Materials: Means any work in any medium (whether currently produced or to be
created in the future) created by an entity or by any other recipient that incorporates or uses any
original information or material, either alone or in conjunction with materials from another
source, as an independent product;

You or the addressee: Means the individual or entity using the original material to develop
modified or derivative materials under this general license;

Copyright and dissemination rights: Means the rights granted by copyright and/or similar or
related rights closely related to copyright, including performance, broadcasting and recording of
sound, and rights in databases or literary collections;

User License: Means the license you apply to the user of the modified materials or derivative
materials in accordance with the terms and conditions of this general license;

Participation: Means the disclosing of material to the public through any means or process
requiring permission under this license, such as copying, public disclosure, public performance,
distribution, dissemination, media or import, and making material available to the public,
including by means of access to materials at the place and time of their choice;

Exceptions and Limitations: Means any exclusion or other restriction on copyright and similar
rights applied to your use of the Original Materials;
2.1.1 Subject to the terms and conditions of this General License:
2.1.2 The license grantor grants you a free and non-exclusive license to download
information over the Internet and technology media.
2.1.3 The License grantor authorizes you to exercise the licensed rights in all media and
formats currently produced and known or to be created later, and to apply the
necessary technical modifications.
2.1.4 This General License may not be sublicensed and is irrevocable for the exercise of rights
under this License in order to copy and share original materials, in whole or in part, or
to produce, reproduce and share modified or derivative materials.
2.2 The License grantor waives, and/or agrees not to assert, any right or authority to prohibit any
entity or individual from making the technical modifications necessary to exercise the licensed
privileges; making the technical modifications authorized in this section does not in itself result
in any modified or derivative materials.
3.1 Terms and conditions of Use:
3.1.1 You must ensure that your use is not contrary to UAE or international laws.
3.2 You may allow users to use the original materials and, if you do so, you must comply with the
terms of this license; you are prohibited from displaying or imposing any additional or
different terms or conditions on the use of that information by any other user.
3.3.1 Personal data within the information, such as identity documents (for example a
passport number or national identity number);
3.3.2 Information that has not been disseminated or disclosed with the consent of the license
grantor;
3.3.3 Rights of third parties that the license grantor has no authority to disclose;
3.3.4 Any images (including logos, drawings or photographs) disclosed within the original
materials;
3.3.5 Information subject to other intellectual property rights, including patents, trademarks
and design rights.
4.1 Attribution:
4.1.1 The user may share the original materials by considering the following:
4.1.3 Inclusion of an electronic link or hyperlink to the original material in reasonable form;
4.2.1 This license grants users the right to redistribute, modify, change, and quote from your
materials, whether for commercial or non-commercial purposes, as long as they
attribute the original material to your name.
4.2.2 This license grants users the right to modify, improve, and create new derivative
materials from previously modified or derivative materials, whether for commercial or
non-commercial purposes, as long as they license their new derivative works under the
User License in accordance with the terms of this license.
4.2.3 You must comply with the requirements stated in Section 3 and include them in the
User License if the contents of the entire database or a portion of the original material
are shared.
Article (5) Disclaimer of warranties and limitation of liability
5.1 The License grantor is not responsible for any damage or misuse suffered by third parties as a
result of the use of such data and does not guarantee the continuity of the availability of such
data or any part thereof, nor shall it be liable to users of such data for any damage or loss they
may suffer due to reuse.
5.2 It is prohibited to sell or resell any original information that has been used in accordance with
this License for any fee or amount of money or for any form of compensation or reimbursement.
6.2 The license grantor has the right to disclose the original materials under separate terms or
conditions, or to discontinue the disclosing of the original materials at any time; however, the
terms of this license shall remain in force notwithstanding such discontinuation.
6.3 The terms of the License shall remain in force after termination of this General License.
7.2 Any arrangements, considerations or agreements relevant to the original materials not
mentioned in this License shall be considered separate and independent from the terms and
conditions of the General License.
7.3 This General License shall not be construed as derogation, restriction, prohibition or imposition
of conditions on any use of the original materials which may be made legally without
permission under this General License.
7.4 No condition or provision of this General License may be waived, and no non-compliance with
it may be excused, unless expressly agreed to by the license grantor.
7.5 Nothing in this General License shall constitute or be construed as a restriction or waiver of any
privileges or immunities applicable to the license grantor or to you, including immunity from
legal procedure in any jurisdiction or authority.
7.6 This License is governed by and construed in accordance with the laws of the United Arab
Emirates.
End of license
APPENDIX B: DATA QUALITY MATURITY MATRIX – ASSESSMENT TOOL

Each quality principle is assessed at one of five maturity levels: 1 = Initial, 2 = Partially
conformant, 3 = Conformant, 4 = Improving, 5 = Optimizing.

Ownership and authority
1. Initial: The dataset has no clear accountable owner within the Entity. Multiple data users keep
and manage duplicate versions of the data.
2. Partially conformant: A named Data Custodian takes personal responsibility for the quality of
the data. The Data Custodian has undertaken a baseline assessment of current data quality,
documenting known quality issues.
3. Conformant: As at Level 2. In addition:
- The Data Custodian has engaged with current and potential future users of the data to
understand and document their Data Quality Requirements, and is managing a plan to close
any gaps between current and required quality levels.
- For a data set used by multiple organizations, systems and processes have been established
to ensure that it can be managed as a Primary Registry (ie able to provide the data as a
service to all relevant users).
4. Improving: As at Level 3. In addition:
- Feedback mechanisms have been established to allow data users to request quality
improvements.
- In the case of a Primary Registry, the dataset is now widely used as the single authoritative
source of data. There are no duplicate versions of the data managed elsewhere.
5. Optimizing: As at Level 4. In addition:
- There is clear evidence that effective processes are in place to enable user-driven continuous
improvement.
- In the case of a Primary Registry, the dataset is now accompanied by clear Service Level
Agreements for data users.

Accessibility
1. Initial: The data is inaccessible by third parties because it is not published on the web or
currently shared with other Entities.
2. Partially conformant: The data is at least one of:
- Published on the web or via an API
- Available to external users in an open machine-readable format.
3. Conformant: The data is accessible through both:
- Publication on the web or via an API
- Publication in an open machine-readable format.
4. Improving: As at Level 3. In addition:
- Published data is available for bulk download
- The data uses URIs / URLs to enable others easily to link their data to it.
5. Optimizing: As at Level 4. In addition, the dataset is linked to other relevant data to provide
context.
Note: The maturity levels for accessibility draw on the Five Star deployment scheme for open
data developed by Sir Tim Berners-Lee, but expanded to cover shared data as well as open data.
An open data set scoring 1 to 5 on the Five Star model would score the same on the accessibility
dimension of the UAE Data Quality Maturity Model.

Accuracy
1. Initial: Accuracy issues in the dataset (errors, gaps, limitations) are either unknown, or known
to be very significant.
2. Partially conformant: There are significant accuracy problems with the data, but these are
documented and explained to data users.
3. Conformant: Known accuracy problems are documented and explained to data users. The
accuracy level is adequate for the current use or purpose.
4. Improving: As at Level 3. In addition, the Entity is actively reaching out to potential data users
to understand how accuracy improvements could support new use cases for the data.
5. Optimizing: Data accuracy is fit for purpose for both existing and potential uses of the data,
based on clear, documented user-research and feedback.

Descriptiveness
1. Initial: The dataset has no metadata or schema.
2. Partially conformant: The dataset has some metadata or a schema describing the data.
3. Conformant: The dataset has all mandatory metadata. In the case of a Primary Registry, it has
a schema.
4. Improving: The dataset has all mandatory metadata and it has a schema.
5. Optimizing: As at Level 4. In addition, the dataset has all the additional recommended
metadata.

Timeliness (for updated datasets)
1. Initial: The dataset is out of date.
2. Partially conformant: The dataset is regularly updated, on a timescale that meets the needs of
current users.
3. Conformant: As at Level 2. In addition, the dataset has a publishing schedule which is being
met in practice and which is included in metadata for publishing frequency.
4. Improving: As at Level 3. In addition, guarantees exist that the most up to date data will be
available in future over a specified period.
5. Optimizing: As at Level 4. In addition, data updates are managed in real-time, with publication
or exchange occurring at the same time for internal data users and external data re-users.

Timeliness (for one-off datasets)
1. Initial: The dataset is out of date, to the point where it has no useful value.
2. Partially conformant: The dataset is out of date, but still has some use value for users.
3. Conformant: The dataset is recent enough to meet all the needs of its users.
[No Levels 4 and 5 for one-off datasets]

Completeness
1. Initial: There is significant missing data in the dataset or coverage is poor, and this is not
documented.
2. Partially conformant: There is missing data in the dataset or coverage is poor, and this is
documented and explained to data users.
3. Conformant: Data completeness is fit for current purposes. The data makes sense as a dataset,
can be used by itself or in combination with reference data, and any missing records or gaps are
documented and explained.
4. Improving: As at Level 3. In addition, the Entity is actively reaching out to potential data users
to understand how more complete data coverage could support new use cases for the data.
5. Optimizing: Data completeness is fit for purpose for both existing and potential uses of the
data, based on clear, documented user-research and feedback.

Validation
1. Initial: No validation of data.
2. Partially conformant: For creating/recording the data, use is made of vocabularies or validated
fields (e.g. ensuring phone numbers are in a conformant agreed format).
3. Conformant: As at Level 2. In addition, all fields which do not need to be free text are now
validated (e.g. address lookup from postcode, checking a number is entered when a number is
expected).
4. Improving: As at Level 3. In addition, this is encoded into a schema against which the data can
automatically be validated.
5. Optimizing: As at Level 4. In addition, regular data cleaning is carried out to remove duplicate
records and errors across data systems.
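As an illustration only (not part of the Framework), the schema-driven validation described at
Levels 2 to 4 of the Validation principle might be sketched as follows. The field names, phone
number format and rules below are hypothetical assumptions chosen for the example:

```python
import re

# Hypothetical schema: each field maps to a validation rule.
# At maturity Level 4, rules like these are encoded in a schema
# so that records can be validated automatically on entry.
SCHEMA = {
    # Assumed phone format for illustration: +971-<area>-<7 digits>
    "phone": lambda v: isinstance(v, str)
        and re.fullmatch(r"\+971-\d{1,2}-\d{7}", v) is not None,
    # Controlled vocabulary (Level 2: validated fields / vocabularies)
    "emirate": lambda v: v in {"Abu Dhabi", "Dubai", "Sharjah", "Ajman",
                               "Umm Al Quwain", "Ras Al Khaimah", "Fujairah"},
    # Checking a number is entered when a number is expected (Level 3)
    "year": lambda v: isinstance(v, int) and 1971 <= v <= 2100,
}

def validate_record(record):
    """Return the names of fields that are missing or fail their schema rule."""
    return [field for field, rule in SCHEMA.items()
            if field not in record or not rule(record[field])]

record = {"phone": "+971-4-1234567", "emirate": "Dubai", "year": 2015}
print(validate_record(record))  # an empty list means the record is valid
```

In practice an Entity would express such rules in a published schema format rather than in
code, so that data users can run the same validation automatically against the dataset.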