
SCIENCE ADVANCES | RESEARCH ARTICLE

COMPUTER SCIENCE

Breaking medical data sharing boundaries by using synthesized radiographs

Tianyu Han1, Sven Nebelung2, Christoph Haarburger3, Nicolas Horst4, Sebastian Reinartz1,5, Dorit Merhof4,6,7, Fabian Kiessling6,7,8, Volkmar Schulz1,6,7*†, Daniel Truhn3,5*

Copyright © 2020 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).

Computer vision (CV) has the potential to change medicine fundamentally. Expert knowledge provided by CV can enhance diagnosis. Unfortunately, existing algorithms often remain below expectations, as databases used for training are usually too small, incomplete, and heterogeneous in quality. Moreover, data protection is a serious obstacle to the exchange of data. To overcome this limitation, we propose to use generative models (GMs) to produce high-resolution synthetic radiographs that do not contain any personal identification information. Blinded analyses by CV and radiology experts confirmed the high similarity of synthesized and real radiographs. The combination of pooled GM improves the performance of CV algorithms trained on smaller datasets, and the integration of synthesized data into patient data repositories can compensate for underrepresented disease entities. By integrating federated learning strategies, even hospitals with few datasets can contribute to and benefit from GM training.



INTRODUCTION

The application of computer vision (CV) in medicine promises to personalize diagnosis, decision management, and therapy based on the combination of patient information with the knowledge of thousands of experts and the outcomes of billions of patients. In recent years, scientific effort has focused on applications of CV in medicine, in particular in radiology (1). Where there has been progress toward this vision of an omniscient radiological CV, this has mostly been anticipated by corresponding technical advances in the field of CV on natural images. A prominent example is convolutional neural networks (CNNs), which had their breakthrough when the performance of AlexNet surpassed more conventional CV algorithms in 2012 (2). Since then, CNNs have matched and even surpassed human performance on natural image recognition tasks (3). Similar developments took place in medicine, where CNNs performed comparably to experts in computed tomography (CT) screening for lung cancer (4) and retinal disease detection (5). However, human performance in CV on medical images has so far only been achieved but not surpassed. Whenever human performance in CV on medical images was achieved, large datasets were used, often pooled from many sites, containing thousands of images. Going a step further and surpassing human performance in CV on natural images, however, always required even larger databases containing up to billions of natural images (6).

Unfortunately, collecting and sharing such large quantities of medical images seem inconceivable, caused, in part, by their insufficient public availability. Even if the combined data worldwide reach billions of images, as in the case of thoracic radiographs, patient privacy issues prohibit combining data from multiple sites. This is even more conspicuous given that the majority of patients are willing to share their data for research purposes if adequate measures have been taken to protect their privacy (7). Secure ways to share and merge medical images are essential for the development of future CV algorithms (8).

Federated learning has gathered attention and is suitable where data sharing is hindered by privacy considerations. In this paradigm, a central model is updated by exchanging encrypted gradients or weights between global and selected models (9). To further improve privacy in medical applications, a fraction of weights or gradients within local models can be blurred by injecting random noise, i.e., differential privacy. Such a random module has been successfully integrated into a federated brain segmentor (10). However, in conventional federated learning settings, the central instance cannot inspect the raw training data due to privacy concerns, and hence, modeling tasks become challenging.

Another promising solution to overcome data sharing limitations is the use of generative adversarial networks (GANs), which enable the generation of an anonymous and potentially infinite dataset of images based on a limited database of radiographs. GANs are a special class of neural networks that were first introduced by Goodfellow et al. (11) in 2014 and have since been advanced to generate high-resolution, photorealistic synthetic images (12). While the first implementations of GANs made it possible to synthesize unconditioned images, the development and usage of informative priors to drive generators that output conditional samples are desired in medical applications. A common choice for such a conditional prior is an existing image, as used in pix2pix (13) and Cycle-GAN (14). Recently, Cycle-GAN–based networks have gained attention in the medical imaging community due to their capability of achieving intermodality image transitions. On the basis of Cycle-GAN frameworks, researchers such as Wolterink et al. (15) and Chartsias et al. (16) successfully demonstrated bidirectional CT–magnetic resonance imaging (MRI) transitions in both brain and heart imaging. Furthermore, Zhang et al. (17) introduced a segmentor-based shape consistency term to the Cycle-GAN loss and achieved realistically looking volumetric CT-MRI data transitions. The performance of

1Physics of Molecular Imaging Systems, Experimental Molecular Imaging, RWTH Aachen University, Aachen, Germany. 2Department of Diagnostic and Interventional Radiology, University Hospital Düsseldorf, Düsseldorf, Germany. 3Aristra GmbH, Berlin, Germany. 4Institute of Imaging and Computer Vision, RWTH Aachen University, Aachen, Germany. 5Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany. 6Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany. 7Comprehensive Diagnostic Center Aachen (CDCA), University Hospital RWTH Aachen, Aachen, Germany. 8Institute for Experimental Molecular Imaging, RWTH Aachen University, Aachen, Germany.
*These authors contributed equally to this work.
†Corresponding author. Email: [email protected]
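The differential-privacy idea mentioned above, blurring a fraction of the locally computed update with random noise before it is shared, can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation; the function name, clipping bound, and noise scale are hypothetical parameters.

```python
import random

def privatize_update(gradients, clip_norm=1.0, noise_std=0.5, seed=None):
    """Clip a local gradient vector to a fixed L2 norm, then add
    Gaussian noise, so that the shared update leaks less information
    about any single training example (differential-privacy-style
    blurring of local model updates)."""
    rng = random.Random(seed)
    norm = sum(g * g for g in gradients) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in gradients]
    # Noise is added AFTER clipping so that its scale can be calibrated
    # to the known sensitivity (the clip bound).
    return [g + rng.gauss(0.0, noise_std) for g in clipped]

# Only the noisy, clipped update would leave the hospital:
noisy = privatize_update([3.0, 4.0], clip_norm=1.0, noise_std=0.1, seed=0)
```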

Han et al., Sci. Adv. 2020; 6 : eabb7973 2 December 2020 1 of 11



segmentors and GANs was boosted by shape consistency and online augmentation, respectively. Nevertheless, image-based conditioning always carries the risk of leaking patient-sensitive data to the generator during the training process.

Here, we propose to use generative models (GMs) on the basis of convolutional GANs (18) to break the boundary of sharing medical images and to enable merging of disparate databases without the limitations that are now restricting the collection of radiographs in a public database (see Fig. 1). To demonstrate the performance of our concept, we show that fully synthetic and thus anonymous images can be generated, which look deceivingly real—even to the expert's eye—and that these images can be used in the medical data sharing process. Our concept proposes how medical images or data can be shared in the future.

RESULTS

Generation of synthesized radiographs
Generating synthesized two-dimensional images in high resolution is a nontrivial task and has only recently been made feasible by using progressive growing during training (12) or by using large-scale networks that demand massive amounts of computing power. As the computing power required for the latter approach is, in general, not accessible to most hospitals, we used progressive spatial resolution growing during training of our networks. Thus, the GAN was trained by starting with a spatial resolution of 4 × 4 and stepping up in powers of 2 (8 × 8, 16 × 16, 32 × 32, 64 × 64, 128 × 128, 256 × 256, 512 × 512) to a spatial resolution of 1024 × 1024.

We measured the time needed to train a GAN on a dataset of 112,120 radiographs with a hardware setup that is accessible to any small hospital: We used a desktop computer with an Intel Xeon(R) E5-2650 v4 processor (Intel, Santa Clara, CA) and an Nvidia Tesla P100 16 GB GPU (Nvidia, Santa Clara, CA). Completely training the GAN with this setup to generate synthesized x-rays with a spatial resolution of 256 × 256 took 60 hours. Continuing the training to generate synthesized x-rays of spatial resolutions 512 × 512 and 1024 × 1024 took 114 and 272 hours of computational time, respectively. Once the training had finished, inference, i.e., the generation of synthesized radiographs, was much faster, with rates of 67,925, 41,379, and 4511 generated radiographs per hour at the three spatial resolution stages. Sample images are shown in fig. S4A for a spatial resolution of 256 × 256. Further images for spatial resolutions of 512 × 512 and 1024 × 1024 are given in the Supplementary Materials.

We have chosen the multiscale structural similarity (MS-SSIM) as a metric (19) to detect a possible mode collapse of our GAN (i.e., missing diversity in the images). The MS-SSIM has been successfully used in predicting perceptual similarity judgments of humans. A lower MS-SSIM reflects perceptually distinct samples and proves the high diversity of a dataset. In fig. S2, we depict the MS-SSIM of 1000 randomly selected pairs of samples within a given pathology class. As can be seen, the overall MS-SSIM among synthesized pairs is comparable to that of real sample pairs.

Ability of human readers to distinguish synthesized radiographs from real x-ray images
To test the quality of the synthesized radiographs (i.e., radiographs synthesized by the generator), six readers were each presented 50 synthesized radiographs and 50 radiographs of real patients in randomized order, and the readers were separately tasked with deciding whether the presented radiograph was real or synthesized. The tests were repeated with spatial resolutions of 256 × 256, 512 × 512, and 1024 × 1024, resulting in a total of 18 tests with 100 radiographs each.

To assess whether experience with machine learning or radiological expertise was necessary to identify synthesized radiographs, the readers were grouped and chosen as follows: group 1 consisted of three readers with a background in CV (readers 1, 2, and 3, who had 4, 2, and 5 years of experience in CV, respectively), while group 2 consisted of experienced radiologists (readers 4, 5, and 6, who had 4, 19, and 6 years of experience in general radiology, with no dedicated specialization in thoracic radiology).

Accuracies in differentiating the synthesized images from the real images at a spatial resolution of 256 × 256 were 60 ± 5% for group 1 and 51 ± 5% for group 2. Generating convincing radiographs at higher spatial resolutions proved more difficult, and experts were able to distinguish real from synthesized radiographs
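The MS-SSIM check described above compares randomly drawn image pairs; if generated samples are too similar to one another, the average similarity rises, signaling mode collapse. The paper uses the windowed, multiscale MS-SSIM (19); the sketch below only illustrates the underlying single-scale SSIM formula computed globally over a whole image, which is a simplification, not the authors' implementation.

```python
def ssim_global(x, y, dynamic_range=255.0):
    """Single-scale, global SSIM between two equally sized grayscale
    images given as flat lists of pixel values. MS-SSIM additionally
    applies SSIM in local windows and across several scales; this
    global variant only illustrates the core formula."""
    assert len(x) == len(y) and len(x) > 1
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # standard stabilization constants
    c2 = (0.03 * dynamic_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def mean_pairwise_similarity(images, pairs):
    """Average similarity over index pairs; a value close to 1 across
    many random pairs would hint at mode collapse (low diversity)."""
    return sum(ssim_global(images[i], images[j]) for i, j in pairs) / len(pairs)

# Tiny toy "images" (flat 2x2 pixel lists), purely illustrative:
imgs = [[10, 20, 30, 40], [12, 18, 33, 41], [200, 10, 5, 90]]
diversity_proxy = mean_pairwise_similarity(imgs, [(0, 1), (0, 2), (1, 2)])
```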

Fig. 1. Concept of constructing a public database without disclosing patient-sensitive data. The GAN in each hospital consists of a generator G and a discriminator D. During training, patient-sensitive data (shown in red) are never exhibited to the generator G directly. Patient-sensitive data are only exhibited to the discriminator D while it is trying to differentiate between real and synthesized radiographs. After training is completed, only the generators G are transferred to a public database and can be used to generate synthesized radiographs.
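The privacy argument in Fig. 1 rests on the data flow of adversarial training: real radiographs enter only the discriminator's update, while the generator learns solely from the discriminator's feedback on its own outputs. A minimal mock of this loop makes the flow explicit; the stub classes and method names are illustrative (no actual learning happens), not the authors' code.

```python
import random

class Generator:
    """Stub generator: maps a latent vector to a fake 'image'."""
    def __init__(self):
        self.seen_real_images = []  # stays empty by construction
    def sample(self, rng):
        return [rng.random() for _ in range(4)]
    def update(self, feedback):
        pass  # would ascend the discriminator's score on fakes

class Discriminator:
    """Stub discriminator: the only component shown real data."""
    def __init__(self):
        self.seen_real_images = []
    def update(self, real_batch, fake_batch):
        self.seen_real_images.extend(real_batch)
    def score(self, batch):
        return [0.5 for _ in batch]

def train(real_images, steps=3, seed=0):
    rng = random.Random(seed)
    g, d = Generator(), Discriminator()
    for _ in range(steps):
        fakes = [g.sample(rng) for _ in range(2)]
        d.update(real_images, fakes)  # D touches patient data
        g.update(d.score(fakes))      # G only sees D's feedback
    return g, d

g, d = train([[0.1] * 4, [0.2] * 4])
# Only the generator g would be uploaded to the public database.
```

The design point is that after training, shipping `g` alone transfers no component that ever had direct access to patient images.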




more easily at spatial resolutions of (512 × 512)/(1024 × 1024), with accuracies of 67 ± 17%/82 ± 4% for group 1 and 65 ± 5%/77 ± 13% for group 2. Thus, radiologists and CV experts performed similarly when identifying synthesized radiographs at high resolutions when judged by their accuracy. As shown in table S2, the sensitivity, i.e., the correct identification of a synthesized radiograph, was higher than the specificity, i.e., the correct identification of a real radiograph. This is probably attributable to the fact that some synthesized radiographs show telltale signs of synthesization (see fig. S4E) and thus allow for a more reliable identification. While radiologists predominantly detected errors in anatomical details such as bone shape or rib cage morphology, CV experts tended to focus more on tiny details such as wave-like patterns (see fig. S4E). There was no interreader agreement between the readers at the spatial resolution of 256 × 256, underlining the fact that identification of synthesized radiographs at this spatial resolution stage is hardly possible (see Table 1). At higher spatial resolutions, the interreader agreement was consistently higher, in line with the higher accuracy in identifying synthesized radiographs. These results were observed under restrictions: The radiographs were assessed on conventional 24-inch computer monitors without zooming into the images. The radiographs were presented in a given order: first the low-resolution radiographs, followed by the mid- and then the high-resolution radiographs. The readers were not allowed to go back and change previous decisions. When these restrictions were lifted, accuracy in determining whether a radiograph is real or synthesized increased significantly. A radiologist with 9 years of experience who was given unlimited amounts of time and who first examined the high-resolution radiographs on specialized radiological monitors to identify typical GAN-related artifacts before going back to the 256 × 256 radiographs was able to identify synthesized radiographs in 86% of cases.

The difficulty of generating convincing radiographs at high resolutions was understandable, as the task becomes harder with the growing number of pixels: even for low-resolution grayscale images of 100 × 100 pixels and 8-bit grayscale depth, the number of possible different images amounts to 256^(100 × 100). The GAN was tasked with identifying the subset of real-looking images out of this set, which grows exponentially in size with increasing spatial resolution. Not unexpectedly, this process was not perfect, and although the GAN managed to capture the general appearance of a real radiograph at high resolutions, small details revealed the synthesized origin. After having performed the tests and with knowledge of the ground truth, the readers conferred to identify the typical patterns that allowed for the differentiation of real from synthesized images at high resolution. Among these were unphysiological configurations of the pulmonary vessels, aberrant bone structures, and subtle periodic, wave-like patterns superimposed on the lung parenchyma, which reflect the network's difficulty in generating fine details (see fig. S4E).

Ensuring non-transference of private information
To exclude the possibility that the GAN memorizes and subsequently merely reproduces the given training examples, 1000 randomly synthesized radiographs were generated, and their nearest neighbors in the database of real radiographs were sought according to the structural similarity index (SSIM). All 1000 radiographs along with their respective three nearest neighbors were then plotted, and a board-certified radiologist assessed whether an entity from the database of real radiographs had been duplicated.

In this set of 1000 randomly drawn GAN images, we did not find a single instance in which the synthesized radiograph looked identical to its closest neighbor in the real dataset (fig. S4B). When assessing similarity in terms of the SSIM, we did not find a single case in a set of 10^5 randomly drawn synthesized radiographs in which a digital twin was found among the real radiographs. In addition, the reader was asked to examine the synthesized radiographs for local information that might lead to the identification of a specific patient, e.g., an anatomic variant unique to a patient or a necklace with a name on it. No such information was found in this set of 1000 images.

We reason that the duplication of images from the database of real radiographs is unlikely. The GAN consists of a generator and a discriminator network. Only the discriminator network is ever in direct contact with patient images. The generator is never directly presented a patient image in the training process. Thus, only the part of the architecture (the generator) that has never been presented with real patient images is transferred to the central database.

Performance of classifiers trained on synthesized radiographs
To demonstrate the feasibility of our approach in a clinical setting, as shown in Fig. 1, we decided to apply our concept to the detection of pneumonia. In the United States alone, pneumonia accounted for over 500,000 visits to emergency departments and over 50,000 deaths in 2015 (20). The Radiological Society of North America (RSNA) has recently hosted a challenge to automatically detect pneumonia in x-rays using machine learning algorithms. Often, local hospitals can only gather medical datasets with limited diversity due to a specific patient population with associated pathologies. However, the diversity of the datasets is crucial to the performance of deep learning algorithms due to the complex features of a specific pathology. By using our approach of pooled GANs, different patients from different locations can be jointly considered and thus boost the diversity of the local dataset without violating any privacy protection

Table 1. Real/synthesized radiographs test. Accuracy and interreader agreement for the group of three CV experts, three radiologists, and all readers when differentiating whether the presented radiograph is real or synthesized.

                 256 × 256                  512 × 512                  1024 × 1024
                 Accuracy, %  Fleiss' kappa  Accuracy, %  Fleiss' kappa  Accuracy, %  Fleiss' kappa
CV experts       60 ± 5       −0.03          67 ± 17      0.07           82 ± 4       0.46
Radiologists     51 ± 5       0.10           65 ± 5       0.18           77 ± 13      0.39
All readers      55 ± 7       0.00           67 ± 14      0.07           80 ± 10      0.37
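The interreader agreement in Table 1 is measured with Fleiss' kappa. Under its standard definition (this is a textbook implementation, not the authors' analysis code), it can be computed from a ratings matrix as follows:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a ratings matrix: ratings[i][j] is the number
    of raters who assigned item i to category j. Every row must sum
    to the same number of raters."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    # Observed agreement for each item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the marginal category proportions.
    totals = [sum(row[j] for row in ratings) for j in range(n_categories)]
    p_j = [t / (n_items * n_raters) for t in totals]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: three raters, two categories (real/synthesized),
# four radiographs; raters disagree on the last item only.
kappa = fleiss_kappa([[3, 0], [0, 3], [3, 0], [2, 1]])
# kappa == 0.625
```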




laws. We simulated a local dataset with limited diversity by using a subset of the RSNA dataset with 1000 real x-rays for training, of which only 5% exhibited signs of pneumonia. The resulting classifier achieved an area under the curve (AUC) of 0.74 on a test set of 6000 previously unseen x-rays from the RSNA dataset (see Fig. 2B). To alleviate the limiting scarcity of pathological images and improve the classifier performance, we used our public database of generated images [trained on the National Institutes of Health (NIH) and the Stanford datasets]. From this database, we randomly sampled a total of 500 synthesized x-rays: 100 that exhibited signs of pneumonia and 400 that exhibited no signs of pneumonia. These were then joined with 500 real x-rays from the RSNA subset (450 healthy and 50 pneumonia), which resulted in a set of 1000 x-rays for training of the classifier (healthy: 450 real and 400 synthesized; pneumonia: 50 real and 100 synthesized). When trained on this artificially enriched set of x-rays, the performance of the classifier increased to an overall AUC of 0.81. We hypothesize that the improvement in performance was due to the greater diversity of pathological cases produced by the generator: As reflected by the lower MS-SSIM in Fig. 2A, the GAN-augmented dataset (MS-SSIM, 0.18 ± 0.09) achieved a higher level of diversity than the local RSNA subset (MS-SSIM, 0.24 ± 0.12). Note that for both cases, we chose to train the classifiers on the same number of x-rays to exclude any potential influence that the size of the training set might have had on the performance of the classifiers. Similarly, improved performance measures were found for sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), and F1 score (see Fig. 2C). This experiment thus demonstrated that our pooled dataset approach is capable of improving deep learning classifiers by enriching scarce datasets.

To simulate the data merging scenario outlined in Fig. 1, we compared the results of a comprehensive pathology classification, i.e., cardiomegaly, effusion, pneumothorax, atelectasis, edema, consolidation, and pneumonia, with a classifier solely trained on the NIH-GAN versus a classifier trained on merged synthesized images from different sources. Generated samples of our Stanford-GAN can be found in fig. S5. The average values of the AUC, accuracy, sensitivity, and specificity all increased significantly after integration of the synthesized external dataset (see Fig. 3). This demonstrated that the merging of multiple databases of generated radiographs can boost the performance of classifier networks and can alleviate the performance bottleneck due to insufficient amounts of training data.

The performance improvements were achieved without any techniques of domain adaptation, i.e., without any efforts to homogenize the appearance of the radiographs from different databases. Adopting these techniques not only offers an opportunity for


Fig. 2. Pooled GAN training can improve pneumonia detection by enriching the diversity of the dataset. (A) Distributions of MS-SSIM of 2450 randomly selected pneumonia-positive pairs. Higher diversity of pneumonia cases in the GAN-augmented dataset is confirmed by a lower MS-SSIM (GAN-augmented MS-SSIM: 0.18 ± 0.09 versus RSNA subset MS-SSIM: 0.24 ± 0.12). (B) The performance of the classifier when trained on 1000 x-rays from the GAN-enriched dataset (healthy: 450 real and 400 synthesized; pneumonia: 50 real and 100 synthesized) reaches an AUC of 0.81 in pneumonia detection, outperforming that of a classifier trained on 1000 real x-rays (healthy, 950; pneumonia, 50), which reaches an AUC of 0.74. (C) Similarly, improved performance measures were found for sensitivity (Sens), specificity (Spec), accuracy (Accu), PPV, NPV, and F1 score. We used a test set of 6000 x-rays randomly sampled from the RSNA dataset to calculate those scores. The GANs used to generate the synthesized x-rays were trained on the NIH and Stanford datasets.
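The scores in Fig. 2C follow their standard definitions from a binary confusion matrix (pneumonia-positive taken as the positive class). The sketch below uses hypothetical counts, not the study's actual predictions; note that AUC cannot be derived from counts alone, as it needs the full score distribution.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Standard threshold-dependent measures from confusion counts:
    tp/fp/tn/fn = true/false positives and negatives."""
    sens = tp / (tp + fn)               # sensitivity (recall on diseased)
    spec = tn / (tn + fp)               # specificity (recall on healthy)
    ppv = tp / (tp + fp)                # positive predictive value
    npv = tn / (tn + fn)                # negative predictive value
    accu = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sens / (ppv + sens)  # harmonic mean of PPV and recall
    return {"sensitivity": sens, "specificity": spec, "ppv": ppv,
            "npv": npv, "accuracy": accu, "f1": f1}

# Hypothetical counts on a 6000-image test set:
m = classifier_metrics(tp=40, fp=60, tn=5840, fn=60)
```

Note how accuracy alone would look excellent here purely because healthy cases dominate, which is why the paper reports the full panel of measures.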






Fig. 3. Using pooled synthesized data from different sites, classification performance can be increased. To simulate the scenario in Fig. 1, two classifiers were trained and compared: a classifier solely trained on anonymous radiographs generated with the NIH-GAN (blue) and a classifier trained on the pooled anonymized dataset generated with the NIH-GAN and the Stanford-GAN (red). The schematic of the data selection process is shown in (A). AUC, sensitivity, and specificity for the seven diseases are given in (B). In particular, the classification performances of formerly problematic cases such as edema, consolidation, and pneumonia were boosted by merging data from multiple sites (red arrows).

further performance improvements through domain adaptation—now an active area of research (21)—but would also most likely make the classifier network more robust to deployment in different environments. This is an important aspect in the translation of CV algorithms from the workbench to the clinic.

Federated averaging facilitates the training of local GANs
Large amounts of data are required to obtain robust results from GANs that are trained locally. This potentially limits the sites at which a GAN can be trained to large hospitals. Federated learning algorithms (22) offer a remedy to this limitation, as the GAN can be trained without the original images leaving the protected space of the hospital. One possible reservation is that, because of the uncontrollable gradient/model updates, it is difficult to detect adversarial attacks and protect against them (23). However, GAN-based federated learning has the advantage of offering an additional degree of freedom for the screening of databases by using confidence-calibrated checking (24) or manual inspection (25). We therefore investigated the use of federated learning in training one central GAN as an alternative to the pooled GAN approach.

To simulate hospitals with a limited amount of training data, we randomly sampled 20,000 patient radiographs from the Stanford CheXpert dataset and then partitioned them into 20 local clients, each receiving 1000 patient radiographs. We trained and compared the following models: a centralized "20k model," which was trained on 20,000 patient radiographs; a centralized "1k model," which was




solely trained on 1000 local radiographs; and a federated "20 × 1k model," which was trained federally (22) on 20 distributed datasets consisting of 1000 radiographs each (see Fig. 4A).

An important property of Wasserstein GANs is that their discriminator loss directly reflects the quality of generated samples (26). We therefore visualized the negative discriminator loss in Fig. 4B. As can be seen in Fig. 4B, because of insufficient training images, the centralized 1k GAN overfitted quickly, which led to unstable training. However, as indicated by a lower loss and Fréchet inception distance (FID) in Fig. 4 (B and C), the federally trained GAN (federated 20 × 1k) overcame this local overfitting issue and significantly outperformed the locally trained GAN. The loss curve of the federated GAN was smoother because it represented an average over local iteration losses.

Generated images as a visualization of what neural networks see
The images generated by the generator could be specifically controlled: By changing the part of the input vector signifying the disease, radiographs with specific pathologies could be generated. We used two techniques to visualize the disease-specific hallmark changes. First, the disease-specific entry in the input vector was gradually changed from 0 to 1, while all other entries were kept at 0. The generated images then showed the transition from healthy to diseased states and were stitched together to form an animation. Exemplary frames visualizing the transition are given in fig. S4C for cardiomegaly and effusion. With cardiomegaly, we observed an enlargement of the projected heart shape, reflecting the expected radiological change. Similarly, effusion showed the typical opacification of the lower lung field, mirroring the collection of fluid there. Animations for all of the 14 disease states are given in the Supplementary Materials.

Second, the pixel-wise difference image between the fully diseased and the healthy radiograph was calculated and superimposed on the healthy radiograph as a colormap (see Fig. 5A for a schematic of the process). Examples of such visualizations are given in Fig. 5B for all 14 pathologies.

One advantage of having full control over the disease state of the GAN radiographs is that any combination of diseases in a single radiograph can be generated by changing the corresponding entries in the input vector simultaneously. We found that the disease state as represented by the GAN transition reflected the underlying disease and was in good agreement with radiological expertise if many marked examples of this disease were present in the original dataset and if the disease-related changes occurred on a large scale of the radiograph (e.g., cardiomegaly or effusion) rather than on small patches at different sites (nodules).

To uncover correlations between disparate pathologies, we let the classifier rate the score of a specific pathology when the GAN was tasked with generating a synthesized radiograph of another disease


Fig. 4. Federated learning facilitates GAN training when facing insufficient amounts of local data. Hospitals can use federated learning algorithms to train a global GAN, and the central GAN deposit can serve as a hub. (A) Illustration of the GAN-related federated learning system. After local model initialization, local hospitals B and C (in red frames) were selected to update their local models. The global generator and discriminator were updated by the weights (w) transferred to the aggregation server (red arrows). All local models were subsequently redefined by the updated global GAN (blue arrows). The exchange of local and global weights continued until the global GAN converged. (B) Discriminator loss curves for the three trained Wasserstein GANs. The Wasserstein GAN trained by the federated averaging algorithm (federated 20 × 1k) outperformed the centralized GAN trained on only 1000 x-rays (centralized 1k) and performed comparably to the centralized 20k GAN. (C) FID evaluations of the GAN training process.
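The aggregation step sketched in Fig. 4A follows federated averaging (22): the server replaces the global weights with a dataset-size-weighted mean of the client weights. A minimal sketch of one aggregation round, with plain Python lists standing in for network tensors (not the authors' implementation):

```python
def federated_average(client_weights, client_sizes):
    """One FedAvg aggregation round: average each weight coordinate
    across clients, weighting client k by n_k / sum(n), where n_k is
    the number of local training images."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * size for w, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two clients with equal data (1000 radiographs each, as in the 20 x 1k setup):
global_w = federated_average([[0.0, 2.0], [1.0, 4.0]], [1000, 1000])
# global_w == [0.5, 3.0]
```

With equal client sizes, as in the paper's 20 × 1k partition, this reduces to a plain mean; the size weighting matters only when hospitals contribute unequal amounts of data.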




[Fig. 5, panels A to D (schematic): a generator receives a 512-dimensional random patient vector concatenated with a one-hot disease encoding; the generated healthy (null-vector) and diseased radiographs are subtracted, and the difference is overlaid on the healthy image. Disease labels shown: healthy, atelectasis, cardiomegaly, consolidation, edema, effusion, emphysema, fibrosis, hernia, infiltration, mass, nodule, pleural thickening, pneumonia, and pneumothorax.]

Fig. 5. Learned pathological features. (A) Generation of the disease-specific pixel map. A randomly chosen vector with 512 Gaussian distributed entries characterizes one specific patient. The GAN was tasked with generating a healthy and a diseased radiograph of that patient (cardiomegaly in this example). A subtraction map was generated to denote the changes brought about by the disease and was superimposed as a colormap over the generated healthy radiograph. (B) Disease-specific patterns generated by the generator for an exemplary randomly drawn pseudopatient. Red denotes higher signal intensity in the pathological radiograph, while blue denotes lower signal intensity. Note that for some diseases such as cardiomegaly and edema, the pattern looks realistic, while the GAN struggled with diseases that have a variable appearance and where ground truth data are limited, e.g., pneumonia. (C) Revealing correlations within generated pathological radiographs by the classifier trained on the real dataset. For each pathology, 5000 random synthesized radiographs with a pathology label drawn from a uniform distribution between 0.0 and 1.0 were generated. The images were then rated by the classifier network, and Pearson's correlation coefficient was calculated for each pairing of pathologies [shown in (C) with the GAN cardiomegaly label on the x axis and the cardiomegaly and fibrosis classifier output on the y axis in red and blue, respectively]. (D) Resulting correlation coefficients for all 14 × 14 pairings are displayed and color coded. Clustering on the x axis (i.e., the GAN label axis) was performed to group related diseases. The obtained clustered blocks are marked with white-bordered boxes.
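The analysis described in (C) reduces to computing Pearson's correlation coefficient between the label handed to the GAN and the score returned by the classifier. Below is a minimal, self-contained sketch with simulated scores; the toy score model and variable names are illustrative assumptions, not the study's data or code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the analysis in (C)/(D): for one pathology, draw the
# conditioning label given to the GAN from a uniform [0, 1] distribution,
# and simulate classifier scores that partly track that label.
n = 5000
gan_label = rng.uniform(0.0, 1.0, n)
score_same = 0.8 * gan_label + 0.2 * rng.normal(0.0, 0.25, n)   # related pathology
score_other = rng.normal(0.5, 0.25, n)                          # unrelated pathology

def pearson(a, b):
    """Pearson's correlation coefficient between two score vectors."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

print(round(pearson(gan_label, score_same), 2))   # high: the label is recovered
print(round(pearson(gan_label, score_other), 2))  # near zero: no relation
```

In the study's real pipeline, `gan_label` would be the conditioning value for one pathology and the score vectors would come from the trained DenseNet classifier; the resulting 14 × 14 coefficient matrix is what (D) displays.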

and calculated the Pearson's correlation coefficient (Fig. 5C). We found that the clustering of related pathologies based on the correlation coefficient agreed well with clinical intuition: infiltration, pneumonia, consolidation, effusion, and edema (all pathologies where lung opacity increases) were related to each other while being distant from, e.g., emphysema or pneumothorax (diseases that are associated with increased radiolucency). The magnitude of the diagonal elements in the 14 × 14 matrix in Fig. 5D directly reflects the quality of the generation of pathological images with our method. Diseases such as cardiomegaly (corr = 0.8) and effusion (corr = 0.7) could be reliably generated owing to their localized and predictable pathological features. However, the GAN trained on these particular datasets performed less reliably in generating infiltration (corr = 0.5), emphysema (corr = 0.5), and pneumonia (corr = 0.5). This might be due to the limited number of those cases in the datasets. For example, for pneumonia-labeled cases in the NIH dataset, only 1431 cases are positive and 110,689 are negative. As can be seen in Fig. 5D, the NIH generator cannot reliably generate pneumonia-related features. The nature of pneumonia was better captured by the Stanford-GAN in Fig. 5D, which shows the challenging cases from this particular dataset. This GAN was trained with a much higher number of 20,656 pneumonia cases in the Stanford CheXpert dataset.

DISCUSSION
In this study, we demonstrated that GANs can be used to generate deceivingly real-looking radiographs and to merge databases of




radiological images without disclosing patient-sensitive information. This helps to build large radiological image databases for the training of CV algorithms. While radiographs may, in principle, be abundantly available, universal access is, in general, severely restricted due to data protection laws: privacy concerns restrict the export of sensitive patient information to extramural institutions, and often, only a small fraction of the available data can be used (e.g., a patient consent form may not be universally available). In these cases, GANs that have been trained in-house may serve as a means to distribute the information contained within the database without actually providing a real snapshot of patient-sensitive data: only the weight distribution of the GAN needs to be transferred, and a representative synthesized dataset of millions of radiographs may be generated in reasonable computational time at a peripheral site. This is in contrast to the previous work of Shin and co-workers, in which lower-spatial-resolution synthesized images could be produced but always required recourse to the original patient images as inputs to an image-to-image translational network (13). Another group has previously demonstrated that synthesized chest radiographs can be used to augment training; see, e.g., Salehinejad et al. (27). However, they used a less advanced GM and were only capable of synthesizing radiographs of limited quality, which are of less clinical value. We demonstrated the feasibility of using GANs as a tool for effective oversampling when the pathology distribution within a medical dataset is highly imbalanced. In particular, we demonstrated how a deep learning classifier can benefit from using synthesized x-rays from a publicly accessible database in cases where only few instances of a particular disease are present. Moreover, our developed GANs could be used to visualize what the generator neural network sees and to reveal correlations between diseases. The synthesis of pathological radiographs and the subsequent analysis by classification networks reveal correlations between diseases. For example, there seemed to be a block of diseases (lower right corner of Fig. 5D) characterized by lung opacification, namely, effusion, consolidation, pneumonia, infiltration, and edema. This makes sense from a clinical viewpoint, as these diseases are clinically correlated, and similar observations hold for the remaining clusters in this figure. Cardiomegaly was an outlier in the sense that it was associated with seemingly only one disease from that block (effusion), but not the others. This again makes sense, considering that effusion can be a direct consequence of congestive heart failure. The correlation to edema was quite low for this case, which may indicate that edema was not described consistently in the radiological reports and was therefore difficult for the neural network to synthesize and classify.

A potential problem in training a GAN on-site exists when the set of training images is limited, as is the case in small community hospitals. In those cases, the GAN might not converge, and realistic-looking radiological images might not be produced. To overcome this problem, we proposed to use federated learning for training of the GAN: radiological images remained on-site and never left the premises, while only the weight updates were transferred to the central repository. We simulated this approach by splitting a set of 20,000 patients from the CheXpert dataset into 20 smaller datasets, and we found that the federated model significantly outperformed the locally constrained model.

One caveat when dealing with many smaller distributed databases is the potentially low quality of locally trained GANs. To prevent the central repository from being contaminated by inferior synthetic x-rays, we propose two possible remedies. One way of pursuing the approach of pooling locally trained GANs is to apply a quality criterion such as the FID or the inception score (28); locally trained generators failing such a criterion can be rejected from inclusion in the central GAN repository. Second, federated learning allows for training of a single global GAN with several smaller distributed databases, as demonstrated by Fig. 4A. In this way, several smaller databases can be combined and act as one large database without actually sharing the underlying patient information.

Attention needs to be paid to adversarial attacks on distributed learning systems. Models might be affected by poisoning attacks (29). Local gradients can be easily manipulated and distorted before being transmitted to central servers, and adversarial attacks might not be detected in the federated learning approach. Our GAN-based distributed learning approach offers passive and active robustness against adversarial attacks. The posterior distribution could be estimated (30), and the confidence threshold (24) of any given example in the local training set could be deduced. Such confidence thresholds could be used to detect and filter suspicious training examples to secure GAN training against dataset poisoning (29). In addition, adversarial training (31) is an efficient method to increase model robustness against adversarial perturbations. We demonstrated (in fig. S3) that the robustness of our radiograph classifier was significantly improved by adversarial training.

The concepts demonstrated in this work rely on two-dimensional images, but there is no principal restriction on the number of dimensions that the real and synthesized images are allowed to have, or even that the data have to consist of images only. Thus, the same concept could be translated to volumetric CT or magnetic resonance images, to fluoroscopy, to time series of volumetric data (e.g., contrast-enhanced CT), or even to imaging data in conjunction with clinical data (e.g., an MRI with associated expression profiles of laboratory tumor markers). However, because of the exponentially increasing size of the data, we expect that the problem of generating synthesized data of very high dimensionality is much more difficult and that a far greater number of real cases would be needed for the GAN to converge.

Diagnosis in the clinical setting usually relies on more than just imaging and comprises the patients' demographics, their medical history, and previous and ongoing treatments. Future work will investigate how to include these important parameters into our approach by letting the GANs generate not only radiographs but also accompanying clinical data such as laboratory values. However, to realize this, more training data are probably needed, as the data to be synthesized will have higher dimensions/degrees of freedom. Federated learning as presented here can help overcome those difficulties by providing the means to combine several distinct databases.

MATERIALS AND METHODS
Dataset and preprocessing
Three datasets were used in this study: first, the ChestX-ray dataset released by the NIH in 2017, containing 112,120 frontal radiographs of 30,805 unique patients. At the time of its publication, this dataset comprised 8 disease entities and was later updated to contain 14 pathologies (32). To ensure that no information leaked into the test set used for the evaluation of the algorithms, patient-wise stratification into training (21,528 patients, 78,468 radiographs, 70%), validation (3,090 patients, 11,219 radiographs, 10%), and test set (6,187 patients, 22,433 radiographs, 20%) was performed. The test




set was kept separately until the final testing of the algorithms. Detailed label statistics for the ChestX-ray14 dataset can be found in the "Preprocessing steps in CheXpert dataset" section in the Supplementary Materials and in table S3.

The second dataset used in this study is the CheXpert dataset, which was released by Irvin et al. (33) in January 2019. It contains 224,316 chest radiographs of 65,240 patients. This dataset was used to train a second GAN to demonstrate the feasibility of the proposed data sharing approach (see Fig. 1). A detailed explanation of the label preparation and statistics for the CheXpert dataset is given in table S3. Classification algorithms were tested on the NIH test set. Therefore, no subdivision of the CheXpert dataset into test, training, and validation sets was needed, and all available frontal radiographs of the CheXpert dataset (n = 191,027) were used for training of the GAN.

The third dataset used in this study is a dataset of x-rays released by the RSNA to host a challenge on pneumonia detection. We used this dataset to train a classifier for pneumonia detection and to test whether the inclusion of synthesized x-rays could improve the performance of said classifier.

Before training, the radiograph datasets (NIH and CheXpert) were downsampled to dedicated spatial resolutions, i.e., ranging from 4 × 4, 8 × 8, …, up to 2¹⁰ × 2¹⁰, and converted into separate files. Thus, each of those files contained all training radiographs at a fixed spatial resolution. The radiographs' intensity values were normalized to the range of [−1, 1] (12).

Model architecture and implementation
Two neural network architectures were used here. First, GANs as introduced by Goodfellow et al. (11) were adapted to incorporate an input condition (19) to selectively generate synthesized radiographs with a certain pathology. We used two different inputs to the networks: the conditional vector, which controls the type of disease present in the synthetic image, and the random noise vector, which determines which item from the set of possible x-rays is generated. Both vectors are concatenated and fed to the network as an input (19). As depicted in fig. S1C, such concatenation-based conditioning is equivalent to adding a bias to the hidden activations based on the conditional input (34). In addition, we also added an auxiliary classifier at the end of the discriminator and additional classification loss terms in the objective

L_C^G = 𝔼_{x̃∼ℙ_g}[−log P(C = c ∣ x̃)], L_C^D = 𝔼_{x∼ℙ_r}[−log P(C = c ∣ x)]  (1)

where c is the pathological class label.

To generate high-spatial-resolution images, we used progressive growing, a technique in which the GAN is trained in progressively higher-spatial-resolution stages (12). The network architecture resulting in a final spatial resolution of 1024 × 1024 is shown in table S5. We picked the leaky rectified linear unit (ReLU) (α = 0.2) and pixel norm (12) as the major activation function and normalization layer. Note that, instead of using the common tanh activation function, Karras et al. (12) suggested using a linear activation at the end of the generator. During training, we used a mini-batch size of 128 for spatial resolutions 4² to 32² and then decreased the batch size by a factor of 2 whenever the spatial resolution doubled, to account for the limited memory size: 64 × 64 → 64, 128 × 128 → 32, 256 × 256 → 16, 512 × 512 → 8, and 1024 × 1024 → 4.

Dedicated explanations of the techniques used in our GANs can be found in the "Network training details" section in the Supplementary Materials.

Second, a densely connected CNN with 121 layers (DenseNet-121) was used as a classifier. It was pretrained on 14 million natural images [ImageNet database (2)] and subsequently trained on the radiographs in this study. The architecture has been shown to achieve state-of-the-art performance on the ChestX-ray dataset (35) before. Implementations were done using TensorFlow 1.9.0 and PyTorch 1.1.0.

Training of the GANs
We trained two GANs on the basis of two separate datasets in a progressive growing strategy: on the NIH ChestX-ray14 dataset and the Stanford CheXpert dataset. Note that weights were initialized randomly. Training proceeded in repetitive stages: once training at one spatial resolution stage had stabilized after being presented a total of 600,000 real radiographs (with repetitions), the layers responsible for the next spatial resolution stage were gradually faded in, and training continued with another 600,000 radiographs during this fade-in stage (again with repetitions). In total, the discriminators of the GANs were each presented 12 million radiographs. The training scheme was chosen so that the GANs learned to first explore the large-scale pattern and overall contrast before focusing their attention on finer details.

To measure whether the images generated by the generator converged to real-looking images, we used the FID between a set of 10,000 real x-rays and 10,000 generated x-rays at each training epoch (28). We ensured an equal contribution from each pathological class by using a uniform distribution among the 14 classes, i.e., roughly 700 radiographs per class. To compute the FID, we extracted features of radiographs from the third pooling layer of the inception network (28). The FID score between real and synthesized radiographs was then computed according to

FID(x, g) = ‖μ_x − μ_g‖₂² + Tr(Σ_x + Σ_g − 2(Σ_x Σ_g)^(1/2))  (2)

where μ and Σ are the means and covariances of multivariate Gaussian distributions that model the feature distributions. We found that the FID decreased nearly monotonically, indicating that the general appearance of the generated images approaches that of the real x-rays. The corresponding figures depicting the evolution of the FID are given in fig. S1D.

Training of the classifier with real and synthesized data
All classifier models used validation-based early stopping with sigmoid binary cross-entropy loss as the criterion. No oversampling of underrepresented classes was used except for the experiment in which we specifically tested for the effect of oversampling. Training of the classifier network was done for a variety of different settings.

In the experiment depicted in Fig. 2, we first trained a classifier on a set of 1000 real x-rays (950 healthy, 50 exhibiting signs of pneumonia) provided by the RSNA. Subsequently, we trained a classifier on a set of 500 real and 500 synthesized x-rays (450 healthy real, 50 pneumonia real, 400 healthy synthesized, and 100 pneumonia synthesized), whereby the synthesized radiographs were generated by generators that had previously been trained on the NIH and Stanford datasets. As a test set, we used a random subset of the dataset published by the RSNA, comprising 6000 real x-rays with a healthy:pneumonia ratio of 2:3. In addition, we tested whether this concept could also be used in a more challenging task of differentiating




between a variety of diseases. As not all of the 14 pathologies labeled in the NIH dataset had been labeled in the Stanford dataset, we only classified those pathologies that were present in both datasets' labels, namely, cardiomegaly, effusion, pneumothorax, atelectasis, edema, consolidation, and pneumonia. We trained a classifier to differentiate between these classes with three different training sets: (i) synthesized x-rays generated by the NIH-GAN, (ii) synthesized x-rays generated by both the NIH-GAN and the Stanford-GAN, and (iii) real x-rays from the NIH dataset.

In addition, an experiment was carried out in which the generated images were evaluated by the trained DenseNet to discover correlations between different pathologies. For each pairing of pathology as generated by the generator and pathology as classified by the classifier, we calculated Pearson's correlation coefficient and performed clustering on the resulting correlation matrix.

Federated averaging GAN
The pseudocode of our federated averaging GAN is given in algorithm S1. Specifically, we controlled our federated learning experiment by setting the fraction of local clients that ran local GAN updates per round to 10% (C = 10%), the number of local generator iterations on each round to 10 (E = 10), and the local batch size to 32 (b = 32). Following Gulrajani et al. (26), the parameters of local Wasserstein GAN training were set to λ = 10 and n_discriminator = 5. All local models were initialized identically. During one global update round, as shown in Fig. 4A, a subset of clients (10% here) was picked to run local GAN updates on isolated datasets. Local clients were asked to transmit updated weights (red arrows in Fig. 4A) to the aggregation server once local updates were finished. The global model was updated by the weighted average over the collected weights (22). To finish the global round, all local models were updated by the weights from the global model (blue arrows in Fig. 4A).

Reader study
Six readers were tasked with identifying whether a radiograph was real or synthesized. The tests were performed as follows. Each reader was given 30 s within which she or he had to decide whether the presented radiograph was real or synthesized. To prevent readers from identifying GAN-related features on the high-spatial-resolution radiographs first (which are harder to produce and thus presumably more prone to artifacts) and transferring that knowledge to the low-spatial-resolution images, the radiographs were presented in the following order: 100 radiographs of 256 × 256, 100 radiographs of 512 × 512, and, lastly, 100 radiographs of 1024 × 1024. All presented radiographs were different, i.e., the 256 × 256 radiographs were different from the 512 × 512 and 1024 × 1024 radiographs. Reading tests were done on a 24-inch computer monitor. To exclude the possibility that readers investigated the metal markers or pixel-hardcoded letters (e.g., denoting the patient side, L or R) as potential artifacts to differentiate between real and synthesized images, these were covered by an independent investigator before handing out the x-ray to the testers.

Statistical analysis
For each of the experiments, we calculated the following parameters on the test set: AUC, accuracy, sensitivity, and specificity. To assess the errors due to sampling of the specific test set, we used bootstrapping with 10,000 redraws. The SE of the accuracy in the real versus synthesized tests for each human reader was calculated among the reader performances, and Fleiss' kappa was used to assess interreader agreement between readers.

To determine the number of needed samples for the performed experiments, we used power analyses according to (36). In general, all of our performed experiments followed a binomial distribution, because each decision for a radiograph was binary: either yes (e.g., "was real" for the case of deciding between real and synthesized radiographs, or "disease was present" for the case of the classifiers) or no ("was not real" or "disease not present"). We could thus use the binomial formula for the SD of absolute numbers, SD = √(n × p × q), or equally well for the SD of percentages, SD = √(p × q/n). The difference of metrics, such as AUC, sensitivity, and specificity, was defined as Δmetric (see table S4). For a total number of n = 1000 bootstrapping rounds, models were built after randomly permuting the predictions of two classifiers, and metric differences Δmetric_i were computed from their respective scores. We obtained the P value of individual metrics by counting all Δmetric_i above the threshold Δmetric. Statistical significance was defined as P < 0.001.

SUPPLEMENTARY MATERIALS
Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/6/49/eabb7973/DC1
View/request a protocol for this paper from Bio-protocol.

REFERENCES AND NOTES
1. S. Bickelhaupt, P. F. Jaeger, F. B. Laun, W. Lederer, H. Daniel, T. A. Kuder, L. Wuesthof, D. Paech, D. Bonekamp, A. Radbruch, S. Delorme, H.-P. Schlemmer, F. H. Steudle, K. H. Maier-Hein, Radiomics based on adapted diffusion kurtosis imaging helps to clarify most mammographic findings suspicious for cancer. Radiology 287, 761–770 (2018).
2. A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84–90 (2017).
3. K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, in Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7 to 13 December 2015.
4. A. Hosny, C. Parmar, T. P. Coroller, P. Grossmann, R. Zeleznik, A. Kumar, J. Bussink, R. J. Gillies, R. H. Mak, H. J. W. L. Aerts, Deep learning for lung cancer prognostication: A retrospective multi-cohort radiomics study. PLOS Med. 15, e1002711 (2018).
5. V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016).
6. D. Mahajan, R. Girshick, V. Ramanathan, K. He, M. Paluri, Y. Li, A. Bharambe, L. van der Maaten, Exploring the limits of weakly supervised pretraining, in Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8 to 14 September 2018.
7. M. M. Mello, V. Lieou, S. N. Goodman, Clinical trial participants' views of the risks and benefits of data sharing. N. Engl. J. Med. 378, 2202–2211 (2018).
8. L. M. Prevedello, S. S. Halabi, G. Shih, C. C. Wu, M. D. Kohli, F. H. Chokshi, B. J. Erickson, J. Kalpathy-Cramer, K. P. Andriole, A. E. Flanders, Challenges related to artificial intelligence research in medical imaging and the importance of image analysis competitions. Radiol. Artif. Intell. 1, e180031 (2019).
9. J. Xu, F. Wang, Federated learning for healthcare informatics. arXiv:1911.06270 (2019).
10. W. Li, F. Milletarì, D. Xu, N. Rieke, J. Hancox, W. Zhu, M. Baust, Y. Cheng, S. Ourselin, M. J. Cardoso, A. Feng, Privacy-preserving federated brain tumour segmentation, in International Workshop on Machine Learning in Medical Imaging (Springer, 2019), pp. 133–141.
11. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems (Neural Information Processing Systems Foundation, Inc., 2014), pp. 2672–2680.
12. T. Karras, T. Aila, S. Laine, J. Lehtinen, Progressive growing of GANs for improved quality, stability, and variation. arXiv:1710.10196 (2017).
13. H.-C. Shin, N. A. Tenenholtz, J. K. Rogers, C. G. Schwarz, M. L. Senjem, J. L. Gunter, K. P. Andriole, M. Michalski, Medical image synthesis for data augmentation and anonymization using generative adversarial networks, in International Workshop on Simulation and Synthesis in Medical Imaging (Springer, 2018), pp. 1–11.
14. J.-Y. Zhu, T. Park, P. Isola, A. A. Efros, Unpaired image-to-image translation using cycle-consistent adversarial networks, in Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22 to 29 October 2017.




15. J. M. Wolterink, A. M. Dinkla, M. H. F. Savenije, P. R. Seevinck, C. A. T. van den Berg, I. Išgum, Deep MR to CT synthesis using unpaired data, in Simulation and Synthesis in Medical Imaging (Springer, 2017), pp. 14–23.
16. A. Chartsias, T. Joyce, R. Dharmakumar, S. A. Tsaftaris, Adversarial image synthesis for unpaired multi-modal cardiac data, in Simulation and Synthesis in Medical Imaging (Springer, 2017), pp. 3–13.
17. Z. Zhang, L. Yang, Y. Zheng, Translating and segmenting multimodal medical volumes with cycle- and shape-consistency generative adversarial network, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, USA, 18 to 22 June 2018.
18. A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434 (2015).
19. A. Odena, C. Olah, J. Shlens, Conditional image synthesis with auxiliary classifier GANs, in Proceedings of the 34th International Conference on Machine Learning, Volume 70 (JMLR.org, 2017), pp. 2642–2651.
20. P. Rui, K. Kang, National Hospital Ambulatory Medical Care Survey: 2015 Emergency Department Summary Tables (2015); https://www.cdc.gov/nchs/data/nhamcs/web_tables/2015_ed_web_tables.pdf [accessed 16 January 2020].
21. S. Conjeti, A. Katouzian, A. G. Roy, L. Peter, D. Sheet, S. Carlier, A. Laine, N. Navab, Supervised domain adaptation of decision forests: Transfer of models trained in vitro for in vivo intravascular ultrasound tissue characterization. Med. Image Anal. 32, 1–17 (2016).
22. B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20 to 22 April 2017.
23. A. N. Bhagoji, S. Chakraborty, P. Mittal, S. Calo, Analyzing federated learning through an adversarial lens. arXiv:1811.12470 (2018).
24. D. Stutz, M. Hein, B. Schiele, Confidence-calibrated adversarial training: Generalizing to unseen attacks. arXiv:1910.06259 (2019).
25. S. Augenstein, H. B. McMahan, D. Ramage, S. Ramaswamy, P. Kairouz, M. Chen, R. Mathews, B. A. y Arcas, Generative models for effective ML on private, decentralized datasets. arXiv:1911.06679 (2019).
26. I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, A. C. Courville, Improved training of Wasserstein GANs, in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett, Eds. (Curran Associates, Inc., 2017), pp. 5767–5777.
27. H. Salehinejad, S. Valaee, T. Dowdell, E. Colak, J. Barfett, Generalization of deep neural networks for chest pathology classification in X-rays using generative adversarial networks, in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2018), pp. 990–994.
28. M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, S. Hochreiter, GANs trained by a two time-scale update rule converge to a local Nash equilibrium, in Advances in Neural Information Processing Systems (Curran Associates, Inc., 2017), pp. 6626–6637.
29. S. G. Finlayson, J. D. Bowers, J. Ito, J. L. Zittrain, A. L. Beam, I. S. Kohane, Adversarial attacks on medical machine learning. Science 363, 1287–1289 (2019).
32. X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, R. M. Summers, ChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2017), pp. 2097–2106.
33. J. Irvin, P. Rajpurkar, M. Ko, Y. Yu, S. Ciurea-Ilcus, C. Chute, H. Marklund, B. Haghgoo, R. Ball, K. Shpanskaya, J. Seekins, D. A. Mong, S. S. Halabi, J. K. Sandberg, R. Jones, D. B. Larson, C. P. Langlotz, B. N. Patel, M. P. Lungren, A. Y. Ng, CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. arXiv:1901.07031 (2019).
34. E. Perez, F. Strub, H. de Vries, V. Dumoulin, A. Courville, FiLM: Visual reasoning with a general conditioning layer. arXiv:1709.07871 (2017).
35. P. Rajpurkar, J. Irvin, R. L. Ball, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. P. Langlotz, B. N. Patel, K. W. Yeom, K. Shpanskaya, F. G. Blankenberg, J. Seekins, T. J. Amrhein, D. A. Mong, S. S. Halabi, E. J. Zucker, A. Y. Ng, M. P. Lungren, Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLOS Med. 15, e1002686 (2018).
36. B. Rosner, Fundamentals of Biostatistics (Nelson Education, 2015).
37. M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN. arXiv:1701.07875 (2017).
38. D. P. Kingma, P. Dhariwal, Glow: Generative flow with invertible 1x1 convolutions, in Advances in Neural Information Processing Systems (2018), pp. 10215–10224.
39. T. Salimans, A. Karpathy, X. Chen, D. P. Kingma, PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications. arXiv:1701.05517 (2017).
40. J. Hoffman, E. Tzeng, T. Park, J.-Y. Zhu, P. Isola, K. Saenko, A. A. Efros, T. Darrell, CyCADA: Cycle-consistent adversarial domain adaptation. arXiv:1711.03213 (2017).

Acknowledgments
Funding: This research project was supported by the START program of the Faculty of Medicine, RWTH Aachen, Germany, through the START rotation program granted to D.T. and by the DFG, Germany, through a grant given to S.N. Author contributions: T.H., D.T., V.S., and F.K. conceived the idea and approach. F.K., V.S., S.R., S.N., C.H., N.H., D.M., and D.T. contributed to the experiments. T.H., D.T., C.H., and N.H. developed the code infrastructure and GAN training setup. T.H., D.T., F.K., and V.S. wrote the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: This study used three publicly available datasets: the NIH ChestX-ray14 dataset (https://nihcc.app.box.com/v/ChestXray-NIHCC), the Stanford CheXpert dataset (https://stanfordmlgroup.github.io/competitions/chexpert), and the RSNA pneumonia dataset (https://kaggle.com/c/rsna-pneumonia-detection-challenge). The full images used in our real/synthesized radiograph test are available at https://drive.google.com/open?id=1_snb7hQ47WYxJEYK95G3cYlWSqckvRDW. Details of the implementation as well as the weights of the neural networks after training and the full code producing the results of this paper are made publicly available at https://github.com/peterhan91/Thorax_GAN.git. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

Submitted 19 March 2020
30. Y. Li, Y. Gal, Dropout inference in Bayesian neural networks with alpha-divergences, in Accepted 14 October 2020
Proceedings of the 34th International Conference on Machine Learning-Volume 70 Published 2 December 2020
(JMLR.org, 2017), pp. 2052–2061. 10.1126/sciadv.abb7973
31. A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models
resistant to adversarial attacks. arXiv:1706.06083 (2017). Citation: T. Han, S. Nebelung, C. Haarburger, N. Horst, S. Reinartz, D. Merhof, F. Kiessling, V. Schulz,
32. X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, R. M. Summers, ChestX-ray8: Hospital-scale D. Truhn, Breaking medical data sharing boundaries by using synthesized radiographs. Sci. Adv.
chest X-ray database and benchmarks on weakly-supervised classification and 6, eabb7973 (2020).

Han et al., Sci. Adv. 2020; 6 : eabb7973 2 December 2020 11 of 11


Breaking medical data sharing boundaries by using synthesized radiographs
Tianyu Han, Sven Nebelung, Christoph Haarburger, Nicolas Horst, Sebastian Reinartz, Dorit Merhof, Fabian Kiessling,
Volkmar Schulz, and Daniel Truhn

Sci. Adv. 6 (49), eabb7973. DOI: 10.1126/sciadv.abb7973



Copyright © 2020 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim
to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).
