RESPONSIBLE APPROACHES TO DATA SHARING
KEY TAKEAWAYS:
• Open sharing of timely and accurate data is essential to effective and efficient humanitarian
response. How humanitarian organizations approach data sharing directly relates to trust and
cooperation in the sector.
• As the humanitarian data ecosystem grows, the opportunities and risks of sharing data become
clearer, prompting organizations to explore more limited approaches to data sharing.
• Humanitarian organizations widely recognize the sensitivity of personal data: its exposure has a
high likelihood of causing harm. The majority of non-personal data is safe to share openly, but
non-personal data can also be sensitive and should be handled with caution.
• Humanitarian organizations should take into account four factors when deciding whether to
share non-personal data: (i) utility; (ii) sensitivity; (iii) human and technical capacity; and (iv)
governance.
• Humanitarian organizations should identify and compare all available approaches for data
sharing, considering the most open approach first and working down to more limited approaches
as necessary.
INTRODUCTION
The use and exchange of data have become core functions of humanitarian organizations. Staff regularly
need to decide whether and how to share their organization’s data, even if their role is not primarily
focused on data or information management. Beyond individual organizations, the interest in the
sharing and use of data generated in humanitarian action has also grown. In response to this interest, the
humanitarian sector has seen a surge in data generation and sharing in recent years.
Open sharing of timely and accurate data is essential to effective and efficient humanitarian response and
should remain a key objective for the sector. For example, the COVID-19 epidemiological data1 compiled
and shared daily by the Johns Hopkins University Center for Systems Science and Engineering has been
1
Access the Novel Coronavirus (COVID-19) Cases Data on HDX.
How humanitarian organizations approach data sharing directly relates to trust and cooperation in the
sector. Maintaining trust within the data ecosystem is critical to the sustainability of data sharing and
relates to issues such as the quality of the data, the level to which the data will be secured after sharing
and the responsible use of data by the recipient. Because data in the humanitarian sector often relates to
the most at-risk populations, managing and sharing it warrants caution.
Many humanitarian organizations have developed or updated their guidance, governance and practices
to support different aspects of data responsibility: the safe, ethical and effective management of data. The
sector has also seen an increasing number of collaborative efforts to improve data responsibility beyond
individual organizations.3 Still, as the humanitarian system learns more about the risks associated with
data sharing, organizations face more complex challenges in sharing this data responsibly.4
This guidance note aims to support decision-making around the sharing of non-personal data in
humanitarian settings. It explains data sensitivity, provides common examples of sensitive non-personal
data, and explains an approach to information and data sensitivity classification in humanitarian settings.
It also offers a framework that organizations can use to weigh four factors that help determine whether
data can be shared and explains common approaches for doing so responsibly.
When the Humanitarian Data Exchange (HDX) was launched in 2014, it held close to 900 datasets, shared
by a handful of ‘early adopter’ organizations. By the end of 2020, that number had grown to over 18,000
datasets. Only approved organizations are able to share data on the platform. They can make data
available publicly to anyone who visits the site or privately to only the members of their organizations.
In 2017, the HDX team added another option for data sharing: HDX Connect. This feature enables
organizations to only publish the metadata, with the underlying data available upon request. If access is
granted, the data is shared bilaterally without passing through the HDX platform. For example, Ground
Truth Solutions use HDX Connect to provide access to COVID-19 Community Perceptions Data collected in
Iraq.
As part of its quality assurance process, the HDX team also runs a disclosure risk assessment on any
resource added to the platform that contains microdata. The HDX team does this because it may be
possible to re-identify individuals or expose confidential information even after direct identifiers have been
removed from microdata.5
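The re-identification risk described above can be illustrated by counting how many records share each combination of indirect identifiers. The sketch below is not the HDX team's actual assessment procedure: the function name, column names, sample records and the threshold of 3 are all assumptions chosen for illustration.

```python
from collections import Counter


def risky_combinations(records, quasi_identifiers, k=3):
    """Return combinations of indirect identifiers shared by fewer than k records."""
    counts = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical microdata with direct identifiers (names, phone numbers) already removed.
records = [
    {"district": "North", "age_group": "18-29", "occupation": "farmer"},
    {"district": "North", "age_group": "18-29", "occupation": "farmer"},
    {"district": "North", "age_group": "18-29", "occupation": "farmer"},
    {"district": "South", "age_group": "60+", "occupation": "teacher"},
]

flagged = risky_combinations(records, ["district", "age_group", "occupation"], k=3)
for combo, n in flagged.items():
    print(f"{combo}: only {n} record(s) share this combination")
```

In this made-up sample, the single teacher aged 60+ in the South district stands out even though no name is present, which is the kind of case a disclosure risk assessment is meant to catch.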
Some organizations on HDX have become more oriented towards controlled access to their data, either
due to the sensitive nature of the data, increased pressure to track and report on how the data is used,
or resource constraints related to operational sustainability. HDX will always support different ways of
sharing data. However, open access remains the best option for the majority of data that is generated for
humanitarian response.
2
“From August 2016 through August 2020 (the period for which the data is available), growth in monthly users from HRP+ countries was 943%
compared to 566% across all countries.” From the HDX Case Study, September 2020.
3
These include, for example, the IASC Sub-Group on Data Responsibility in Humanitarian Action, the Protection Information Management
initiative, and the Responsible Data for Children initiative, among others.
4
For a better understanding of the challenge facing humanitarian organizations when sharing data specifically in protracted humanitarian crises,
see ALNAP, Data Collection, Analysis and Use in Protracted Humanitarian Crises, June 2020.
5
Learn more about the Centre’s risk mitigation process for microdata, ‘Statistical Disclosure Control’ or SDC in the Learning Path on the topic.
1. Data about the context in which a response is taking place (e.g. legal frameworks, political, social
and economic conditions, infrastructure, etc.) and the humanitarian situation (e.g. security incidents,
protection risks, drivers of the situation or crisis).
2. Data about the people affected by the situation and their needs, the threats and vulnerabilities they
face, and their capacities.
3. Data about humanitarian response actors and their activities (e.g. as reported in 3W/4W/5W).
The majority of this data is safe to share openly. However, non-personal data can also be sensitive.
Examples of sensitive non-personal data include data on groups experiencing gender-based violence or the
location of ethnic minorities in conflict settings. Such data is considered sensitive because it enables the
identification of groups of individuals by demographically defining factors, such as ethnicity, gender, age,
occupation, religion or location of origin. Non-personal data can also create risk in other ways, for example
by exposing the location of medical facilities in areas where they are prone to attack. As the awareness of
the risk associated with sharing such data continues to grow, some organizations are turning from a focus
on open data to more controlled sharing.
Many organizations have information and data sensitivity classifications (see figure 1 below) that define
which data falls into which category of sensitivity in order to facilitate responsible data management. These
classifications may also be developed as a collective exercise to help organizations align around what
constitutes sensitive data in their context and identify the appropriate disclosure or dissemination methods
for different data types depending on their sensitivity.
6
Personal data should not be shared openly, and management of personal data should always comply with national and regional data protection laws, or with
internal data protection policies in the case of organizations covered by privileges and immunities.
7
UNOCHA (2019), Working Draft Data Responsibility Guidelines.
FOUR FACTORS FOR DETERMINING WHETHER TO SHARE NON-PERSONAL DATA
There are four factors humanitarian organizations should take into account when deciding whether to
share non-personal data.
3. What human and technical capacity do the organizations sharing and using the data have?
Both the organization sharing data and the organization(s) receiving and using the data should
have sufficient human and technical capacity for responsible data management. This includes staff
availability, data literacy, technical infrastructure and related resources. In environments with low
connectivity, bandwidth-heavy data sharing methods may not be appropriate. For contexts with
known security risks, data should typically be shared through more limited approaches.
8
See the Guidance Note on Data Impact Assessments.
9
For data management in the humanitarian sector, risk can be defined as the likelihood and impact of harm resulting from data management.
10
The World Economic Forum, together with Washington University Centre for Information Assurance and Cybersecurity, the Sustainable
Development Solutions Network TReNDS and the NYU GovLab, has begun building a repository of data sharing agreements to support the
professionalization of this practice through their Contracts for Data Collaboration (C4DC) project.
11
The licenses recommended for data sharing via HDX are listed here: https://data.humdata.org/about/license.
12
For more information about data incident management in humanitarian response, see our Guidance Note on Data Incident Management.
Organizations should determine the best approach to data sharing based on the four factors above. These
approaches range from open sharing to maximize the benefit of data, to more limited approaches such
as bilateral data sharing or only sharing data insights. The table below contains an overview of different
approaches to data sharing and offers examples of some commonly used tools and platforms.
In comparing these different approaches, always consider the most open approach first and work down
to more limited approaches as necessary. Different data types will require different ways of sharing. For
example, large data files require specialized infrastructure, while Application Programming Interfaces
(APIs) are suited to data that is published in the same format on a regular basis. Because technologies
for data sharing continue to evolve, organizations should regularly revisit and compare available data
sharing approaches.
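The 'most open first' rule described above can be sketched as a walk down an ordered list of approaches, stopping at the first one that fits. This is a minimal illustration only: the sensitivity scale, the thresholds attached to each approach and the function name are assumptions, and the sketch covers only sensitivity and governance, leaving utility and capacity to human judgment.

```python
from dataclasses import dataclass

# Hypothetical sensitivity scale for illustration; organizations define their own.
SENSITIVITY_ORDER = ["low", "moderate", "high"]


@dataclass(frozen=True)
class Approach:
    name: str
    max_sensitivity: str   # highest sensitivity level the approach is assumed to suit
    needs_agreement: bool  # whether a data sharing agreement is assumed to be required


# Ordered from most open to most limited (assumed ordering and thresholds).
APPROACHES = [
    Approach("open sharing", "low", False),
    Approach("sharing on request", "moderate", False),
    Approach("bilateral sharing", "high", True),
    Approach("sharing insights only", "high", False),
]


def choose_approach(sensitivity, has_agreement):
    """Return the most open approach that fits, or None if none does."""
    level = SENSITIVITY_ORDER.index(sensitivity)
    for approach in APPROACHES:
        fits = level <= SENSITIVITY_ORDER.index(approach.max_sensitivity)
        if fits and (has_agreement or not approach.needs_agreement):
            return approach.name
    return None


print(choose_approach("low", has_agreement=False))
print(choose_approach("high", has_agreement=True))
```

Because the list is ordered from most open to most limited, a more limited approach is only returned when every more open one has been ruled out.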
13
Not all tools and platforms in this overview have been vetted by the UN Secretariat. Always consult the relevant Information Technology advisors
before using a new tool.
14
UNHCR’s MicroData Library.
15
IFRC’s GO Platform.
16
Within humanitarian responses, one of the most common ways to share data is via email attachments. When sharing data via email, always take
the necessary security precautions. This way of sharing is responsible in some cases, but there are often more suitable ways to share data. For
information on how to encrypt email, see for example: https://www.cloudwards.net/how-to-encrypt-your-emails/.
17
The Open Algorithms Project.
18
Aircloak Insights.
19
To learn more about homomorphic encryption as a way of sharing the value of sensitive data, see here:
https://www.microsoft.com/en-us/research/project/homomorphic-encryption/ and here: https://www.wired.com/story/google-private-join-compute-database-encryption/.
20
To learn more about multi-party computation, see here:
https://www.tno.nl/en/focus-areas/information-communication-technology/roadmaps/data-sharing/secure-multi-party-computation/.
A relatively new approach to utilizing data without transferring the data itself is ‘querying’. Querying
allows third parties to formulate specific questions to be asked of the data without accessing it directly.
The resulting insights can then be checked for sensitivity and any other issues by the holder of the data.
This approach avoids transfers of data, which can raise legal and ethical concerns, while still allowing
valuable insights to be used for public good.
In implementing a querying approach, it is critical to establish governance in the form of instructions and
boundaries regarding the queries that may be sent, in order to prevent retrieval of sensitive information by
posing a combination of questions.21 Vetting users as well as their questions should always be a key step in
the process around this type of approach.
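The governance steps described above (vetting users, restricting the questions that may be asked, and checking results before release) can be sketched as a small gate in front of the data. This is an illustrative sketch under stated assumptions, not how Aircloak Insights, OPAL or any other tool works: the class name, the minimum group size of 5 and the per-user query budget of 3 are all hypothetical.

```python
class QueryGate:
    """Answers count queries on behalf of the data holder without returning rows."""

    def __init__(self, records, vetted_users, allowed_fields,
                 min_group_size=5, max_queries=3):
        self._records = records
        self._vetted_users = set(vetted_users)
        self._allowed_fields = set(allowed_fields)
        self._min_group_size = min_group_size
        self._max_queries = max_queries
        self._query_counts = {}

    def count(self, user, field, value):
        if user not in self._vetted_users:
            raise PermissionError("user has not been vetted")
        if field not in self._allowed_fields:
            raise PermissionError(f"queries on '{field}' are not permitted")
        used = self._query_counts.get(user, 0)
        if used >= self._max_queries:
            # A crude boundary on combining many questions; an assumption, not a full defence.
            raise PermissionError("query budget exhausted")
        self._query_counts[user] = used + 1
        n = sum(1 for record in self._records if record.get(field) == value)
        # Suppress small counts, which could point to an identifiable group.
        return n if n >= self._min_group_size else None


# Hypothetical records held by the data provider.
records = ([{"district": "North", "ethnicity": "A"}] * 6
           + [{"district": "South", "ethnicity": "B"}] * 2)
gate = QueryGate(records, vetted_users={"analyst@example.org"},
                 allowed_fields={"district"})

north = gate.count("analyst@example.org", "district", "North")
south = gate.count("analyst@example.org", "district", "South")
print(north, south)
```

A fixed query budget and small-count suppression only reduce the risk of combination attacks; they do not remove it, which is why human review of users and questions remains part of the process.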
Commercial solutions to set up querying approaches include Aircloak Insights, which acts as a ‘proxy
between analysts and the sensitive data they need to work with.’ Another querying tool is the Open
Algorithms (OPAL) platform. This tool was specifically developed for the humanitarian and development
sectors and is currently being piloted in Colombia.
In close collaboration with Flowminder and building on their Flowkit, JIPS developed a prototype
querying approach to enable humanitarian and development actors to safely access and query sensitive
individual-level data without needing to share it. The team developed a technical workflow to demonstrate
the viability of this approach with a single data provider and mapped the problems and limitations that
arise with multiple data providers.
Organizations are encouraged to share their experience in promoting responsible data sharing with the
Centre for Humanitarian Data via [email protected].
The Centre for Humanitarian Data ('the Centre'), together with key partners, is publishing a series of eight
guidance notes on Data Responsibility in Humanitarian Action over the course of 2019 and 2020. The Guidance
Note series follows the publication of the working draft OCHA Data Responsibility Guidelines in March 2019.
Through the series, the Centre aims to provide additional guidance on specific issues, processes and tools
for data responsibility in practice. This series is made possible with the generous support of the Directorate-
General for European Civil Protection and Humanitarian Aid Operations (DG ECHO).
This document covers humanitarian aid activities implemented with the financial assistance of the
European Union. The views expressed herein should not be taken, in any way, to reflect the official opinion
of the European Union, and the European Commission is not responsible for any use that may be made of
the information it contains.
21
For an explanation of this risk, see for example: https://www.usenix.org/conference/usenixsecurity19/presentation/gadotti.