Medical Image Analysis (2024)

Contents lists available at ScienceDirect

Medical Image Analysis


journal homepage: www.elsevier.com/locate/media

Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation

Yinchi Zhou^a,∗, Tianqi Chen^d, Jun Hou^d, Huidong Xie^a, Nicha C. Dvornek^a,b, S. Kevin Zhou^e, David L. Wilson^f, James S. Duncan^a,b,c, Chi Liu^a,b, Bo Zhou^g,∗

arXiv:2405.12223v3 [eess.IV] 14 Aug 2024


a Department of Biomedical Engineering, Yale University, New Haven, CT, USA
b Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
c Department of Electrical Engineering, Yale University, New Haven, CT, USA.
d Department of Computer Science, University of California Irvine, Irvine, CA, USA
e School of Biomedical Engineering & Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
f Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, USA.
g Department of Radiology, Northwestern University, Chicago, IL, USA

ARTICLE INFO

2000 MSC: 41A05, 41A10, 65D05, 65D17
Keywords: Image Translation, Diffusion Model, Uncertainty, Cascade Framework

ABSTRACT

Image-to-image translation is a vital component in medical image processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their capability in medical image translation tasks, the potential of combining a GAN and DM to further improve translation performance and to enable uncertainty estimation remains largely unexplored. In this work, we address these challenges by proposing a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation. To reduce the required number of iterations and ensure robust performance, our method first obtains a conditional GAN-generated prior image that is used for efficient reverse translation with a DM in the subsequent step. Additionally, a multi-path shortcut diffusion strategy is employed to refine translation results and estimate uncertainty. A cascaded pipeline further enhances translation quality, incorporating residual averaging between cascades. We collected three different medical image datasets, with two sub-tasks for each dataset, to test the generalizability of our approach. Our experimental results found that CMDM can produce high-quality translations comparable to state-of-the-art methods while providing reasonable uncertainty estimations that correlate well with the translation error.

© 2024 Elsevier B.V. All rights reserved.

1. Introduction

Image-to-image translation (I2I) plays an important role in medical imaging, with wide applications in different medical imaging modalities, such as Digital Radiography (DR), Computed Tomography (CT), and Magnetic Resonance Imaging (MRI). The applications can be summarized into both intra-modality I2I and inter-modality I2I in medical imaging. In the applications of medical X-ray, intra-modality I2I can achieve the high-quality reconstruction of images under radiation dose reduction scenarios. For example, CT radiation dose reduction can be accomplished by translating the sparse-view CT, i.e. acquired with a reduced number of projection views, into the full-view CT (Zhou et al., 2021; Zhang et al., 2018; Wu et al., 2021). Dual-energy (DE) DR radiation dose can be reduced by nearly half by translating the standard single-shot DR into two-shot DE images, i.e. soft-tissue and bone images (Zhou et al., 2019; Yang et al., 2017; Liu et al., 2023b). In MRI applications, intra-modality I2I can be used for image acquisition acceleration. For example, one can use T1 to assist the synthesis/reconstruction of T2 and FLAIR with no or undersampled k-space data (Yang et al., 2020; Zhou and Zhou, 2020). In the application of CT-free PET or SPECT attenuation correction, inter-modality I2I that translates PET or SPECT into CT also helps remove the need for a CT acquisition, thus reducing the overall radiation dose (Zhou et al., 2024; Chen et al., 2022b,a). Therefore, building an accurate and robust I2I method that is generalizable to a wide range of medical imaging applications is important.

∗ Corresponding author. e-mail: [email protected] (Yinchi Zhou), [email protected] (Bo Zhou)
With the recent advancements in deep learning (DL), many DL-based I2I methods have been proposed and adapted to the medical imaging field, demonstrating promising performance. In general, prior I2I methods can be summarized into two classes: Generative Adversarial Networks (GANs) and Diffusion Models (DMs).

Fig. 1. Illustration of previous I2I diffusion model generation process. Starting the reverse process with different noise initialization leads to divergent translation results.
With paired training data available for I2I, one of the most widely used I2I GANs is the conditional GAN (cGAN (Isola et al., 2017)), which consists of 1) a generator that aims to translate an input image into a target image, and 2) a discriminator that conditions on the initial input and the translation for adversarial training. A large number of cGAN variants have been developed for various medical imaging applications. For example, Huang et al. (2021) proposed a GAN with dual discriminators on both image and gradient domains for low-dose CT (LDCT) to full-dose CT (FDCT) translation. Denck et al. (2021) proposed a cGAN with an additional input of MRI acquisition information for intra-MRI-modality translations. Nie et al. (2018) proposed to modify the cGAN with the addition of a gradient-based loss function, and showed successful applications in MRI-to-CT translation and 3T-MRI-to-7T-MRI translation. Based on this, Zhou et al. (2019) further designed a multi-scale cGAN for single-shot DR image to DE image translation. In PET, Gong et al. (2020) also proposed a GAN with parameter transferring for low-dose PET (LDPET) to full-dose PET (FDPET) translation. Even though reasonable translation performance can be achieved with simple and fast one-step inference from the generator, training GANs can be challenging due to the need to balance the optimization of the generator and discriminator (e.g. finding the saddle point of the min-max objective). The training is therefore susceptible to non-convergence and mode collapse.

On the other hand, I2I diffusion models have recently been developed and have shown superior performance to GANs. For general-purpose I2I with DM, Saharia et al. (2022) proposed a unified framework, Palette, which adds conditional image inputs to the previously developed Denoising Diffusion Probabilistic Model (DDPM (Ho et al., 2020)), thus enabling the I2I functionality of DDPM. To reduce the randomized initialization process and improve the stability of I2I DMs, direct bridging diffusion methods have been investigated. Notably, Li et al. (2023) developed a Brownian Bridge Diffusion Model (BBDM) that learns the translation between two domains directly through the bidirectional diffusion process, i.e. Brownian Bridge, rather than a conditional generation process. Similarly, Liu et al. (2023a) proposed a Schrodinger Bridge I2I Diffusion Model (I2SB) that directly learns the nonlinear diffusion processes between two domains. Both have shown improved I2I performance in natural image translation tasks. Similar to I2I GANs, these DM methods have been applied in medical imaging. For example, Moghadam et al. (2023) utilized DDPM to synthesize artificial histopathology images with rare cancer subtypes to mitigate data imbalance problems for medical data. Lyu and Wang (2022) proposed to translate CT into MRI with conditional DDPM and score-matching models, where the forward and backward diffusion processes are guided by T2 MRI. Gong et al. (2023) proposed to perform brain PET image denoising with MRI as prior information to improve image quality. Gao et al. (2023) utilized a contextual contained network in the DM to improve LDCT denoising. Furthermore, 2D DMs have also been employed for 3D translation tasks, including low-count PET image denoising (Xie et al., 2023), CT reconstruction (Chung et al., 2023), and MRI super-resolution and reconstruction (Lee et al., 2023). Direct extensions to 3D DMs were also explored (Pan et al., 2023). However, there are several unique challenges of DM for I2I. First, those methods require iterating over a large number of steps in the reverse process, and most methods start the generation with pure random noise (Saharia et al., 2022; Lyu and Wang, 2022; Gong et al., 2023; Xie et al., 2023; Chung et al., 2023; Lee et al., 2023). This protocol not only significantly slows down the translation speed, but could also lead to diverged and sub-optimal translation results if different random noise initializations are used when running multiple reverse runs (Figure 1). Even though direct bridging methods (Li et al., 2023; Liu et al., 2023a) are translation-deterministic given that no random noise input is used, they still require a large number of reverse iteration steps.
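The divergence issue above comes from stochasticity accumulated along the reverse trajectory. The following toy NumPy sketch (not the paper's model; the random-walk "sampler", step counts, and noise scale are illustrative assumptions) shows why runs started from independent pure-noise images at T = 1000 drift further apart than runs started from one shared prior image at a shortcut point t_s = 250:

```python
import numpy as np

def reverse_walk(start, steps, seed, noise_scale=0.05):
    """Toy stand-in for a stochastic reverse process: each step injects
    fresh Gaussian noise, so run-to-run spread accumulates with steps."""
    r = np.random.default_rng(seed)
    y = start.copy()
    for _ in range(steps):
        y = y + noise_scale * r.normal(size=y.shape)
    return y

d = 256  # toy "image" as a flat vector
# (a) five runs, each from its own pure-noise initialization, T = 1000 steps
runs_noise = [reverse_walk(np.random.default_rng(100 + s).normal(size=d), 1000, s)
              for s in range(5)]
# (b) five runs from one shared prior image, shortcut start with t_s = 250 steps
prior = np.zeros(d)
runs_prior = [reverse_walk(prior, 250, s) for s in range(5)]

spread_noise = np.std(np.stack(runs_noise), axis=0).mean()
spread_prior = np.std(np.stack(runs_prior), axis=0).mean()
assert spread_prior < spread_noise  # shortcut runs stay far more consistent
```

Averaging the shared-prior runs further shrinks the residual spread by roughly 1/√N_p, which is the intuition behind multi-path averaging and the standard-deviation uncertainty map exploited by CMDM.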
Fig. 2. The overall workflow of our proposed Cascade Multi-path Shortcut Diffusion Model (CMDM). CMDM consists of a one-step inference model
(green) and cascades of MPD block (grey). Each MPD block consists of multiple shortcut reverse paths starting with a prior image with different noise.
The cascades are connected with residual averaging operations.

Another challenge of deterministic translation is that such methods also cannot generate translation uncertainty maps, which are crucial for medical images, since the model's prediction error can be used to pinpoint problem areas or give clinicians more information (Shi et al., 2021; Jungo and Reyes, 2019; Wolleb et al., 2022). It is then a unique advantage of the stochastic sampling process of the conditional DDPM (Saharia et al., 2022) to obtain the uncertainty map by running the DM repeatedly with multiple random noises (Wolleb et al., 2022). Therefore, it is desirable to develop an I2I DM method that can generate high-quality converged translation results with a reduced number of required iterations, while also being able to provide translation uncertainty estimation.

Looking into prior works, even though both GAN and DM methods have individually exhibited their capability in medical image translation tasks, the potential of combining GAN and DM for further improving translation performance remains largely unexplored. With this, and to address the aforementioned challenges in DM, we proposed a Cascade Multi-path Shortcut Diffusion Model (CMDM) for medical image-to-image translation in this work. Specifically, CMDM consists of three key components. Firstly, we proposed to utilize a cGAN-generated prior image with diffusion (i.e. noise addition) for providing an arbitrary time point's input in the reverse process. With this shortcut strategy, 1) we need a smaller number of iterations, thus reducing the processing time, and 2) the reverse process starts with prior information from the cGAN instead of pure noise, thus leading to more consistent and robust performance. Second, we proposed to perform this shortcut reverse process multiple times with different noise additions to the cGAN-generated prior. Then, a refined translation can be obtained by averaging the multi-path shortcut diffusion results. Meanwhile, the translation uncertainty can also be estimated by computing the standard deviation of the multi-path shortcut diffusion results. Lastly, to further refine the translation, we devised a cascade pipeline with the multi-path shortcut diffusion embedded in each cascade. Between each cascade, we used a residual averaging strategy where each cascade's prior image is perturbed by averaging the last cascade's output and the previous prior image. We collected three datasets in different medical imaging modalities with different image translation applications. Our experimental results on these datasets demonstrated that we can generate high-quality translation images, competitive with the prior state-of-the-art I2I methods. We also show our method can generate reasonable uncertainty estimation that correlates well with the translation error.

2. Methods

2.1. Cascaded Multi-path Shortcut Diffusion Model

The overall architecture of the Cascaded Multi-path Shortcut Diffusion Model (CMDM) is illustrated in Figure 2. The CMDM consists of (1) a one-step inference model, i.e. cGAN (Isola et al., 2017), for generating a prior translation image, and (2) a conditional denoising diffusion probabilistic model (cDDPM) to further refine the prior translation image in a cascade
and multi-path fashion. The training and inference details are as follows.

Training: Let us denote the input image as x and the translation target as y_0. For the prior image generation part, we utilized a generative network, i.e. UNet (Ronneberger et al., 2015), that aims to predict y_0 from x. The network can be trained in a conditional adversarial fashion (Isola et al., 2017) using both a pixel-wise L2 loss

L_gen = ||f_prior(x) − y_0||²₂,  (1)

and a conditional adversarial loss

L_adv = −log(f_adv(y_0|x)) − log(1 − f_adv(f_prior(x)|x)),  (2)

where f_prior(·) is the generative network for generating the prior image and f_adv(·) is the discriminator network.

On the other hand, the diffusion model consists of a forward diffusion process and a reverse denoising process. The forward diffusion process is a Markovian process that gradually adds Gaussian noise to the target image y_0 over T iterations, and can be defined as:

q(y_{1:T}|y_0) = ∏_{t=1}^{T} q(y_t|y_{t−1}),  (3)

where q(y_t|y_{t−1}) = N(y_t; √α_t y_{t−1}, (1 − α_t)I), and α_t are the noise schedule parameters. T is empirically set to 1000 here such that y_T is visually indistinguishable from Gaussian noise. Then, the forward process can be further marginalized at each step as:

q(y_t|y_0) = N(y_t; √γ_t y_0, (1 − γ_t)I),  (4)

where γ_t = ∏_{s=0}^{t} α_s. Then, the posterior distribution of y_{t−1} given (y_0, y_t) can be derived as:

q(y_{t−1}|y_0, y_t) = N(y_{t−1}|µ, σ²I),  (5)

where µ = (√γ_{t−1}(1 − α_t))/(1 − γ_t) · y_0 + (√α_t(1 − γ_{t−1}))/(1 − γ_t) · y_t and σ² = ((1 − γ_{t−1})(1 − α_t))/(1 − γ_t). With this, the noisy image during the forward process can thus be written as

ŷ_t = √γ_t y_0 + √(1 − γ_t) ϵ,  (6)

where ϵ ∼ N(0, I). Here, the goal is to estimate the noise and thus gradually remove it during the reverse process to recover the target image y_0. In our conditional diffusion model, we utilized another generative network f_dm(·) to estimate the noise with another pixel-wise L2 loss

L_dm = ||f_dm(x, ŷ_t, γ_t) − ϵ||²₂,  (7)

where x is the input image that is also used as conditional input here, ŷ_t is the noisy image, and γ_t is the current noise level. The prior image generation network and the diffusion model network were trained separately.

Inference: Once the prior image generation network f_prior(·) of cGAN and the conditional diffusion network f_dm(·) have converged from training, we can use them in CMDM for image translation. The overall inference pipeline of CMDM is illustrated in Figure 2. Instead of starting the reverse process from a standard normal distribution N(y_T|0, I) at T, the reverse process starts at a pre-defined time point t_s ∈ [0, T] with

ŷ_{t_s} = √γ_{t_s} y_prior + √(1 − γ_{t_s}) ϵ_prior,  (8)

where y_prior = f_prior(x) and ϵ_prior ∼ N(0, I). t_s is empirically set to 250, depending on the translation application. By rearranging equation 6, we can approximate the target image y_0 as

y_0 = (y_t − √(1 − γ_t) f_dm(x, ŷ_t, γ_t)) / √γ_t.  (9)

Then, by substituting this estimation of y_0 into the posterior distribution q(y_{t−1}|y_0, y_t) in equation 5, each iteration of the reverse process can be formulated as

y_{t−1} = (1/√α_t) (y_t − ((1 − α_t)/√(1 − γ_t)) f_dm(x, y_t, γ_t)) + √(1 − α_t) ϵ_t,  (10)

where ϵ_t ∼ N(0, I). By starting the reverse process at the shortcut time point t = t_s with guidance from the prior image, the conditional diffusion model is closer to the endpoint, i.e. t = 0, thus providing less diverged predictions across multiple runs. To further improve the robustness, instead of only performing a single shortcut reverse path, we perform multiple shortcut reverse paths at t_s with different noise initializations of ϵ_prior in equation 8, and ensemble these multi-path predictions by averaging

y_0^avg = (1/N_p) ∑_{p=1}^{N_p} y_0^p,  (11)

where y_0^p is the prediction from a single shortcut reverse path and N_p is the number of shortcut paths. To further refine the translation prediction, we perform this operation in a cascade style. To avoid over-fitting in the reverse process, we designed a residual averaging strategy for new prior image generation in the next cascade. Specifically, the new prior image is the averaged image of the previous prior image and the translated image from the last cascade. The full algorithm is summarized in Algorithm 1.

2.2. Dataset Preparation

We collected three medical image datasets with different medical image translation applications to validate our method. The first application is the image translation of conventional single-exposure chest radiography images into two-shot-based dual-energy (DE) images, which aims to reduce the expensive system cost of the DE system and the higher radiation dose of two X-ray shots. Specifically, we collected 210 posterior-anterior DE chest radiographs with a two-shot DE digital radiography system (Zhou et al., 2019; Wen et al., 2018). The data was acquired using a 60 kVp exposure followed by a 120 kVp exposure with 100 ms between exposures. The size of the images is 1024 × 1024 pixels. Based on this dataset, we further divide this task into two sub-tasks: the translation of standard chest radiography into the soft-tissue image, and the translation of standard chest radiography into the bone image. The second application is image translation across MRI
Algorithm 1: Inference Process - Cascaded Multi-path Shortcut Diffusion Model (CMDM)

Input: x ∈ N^{d1×d2}
Initialize #1: t_s ∈ [0, T]: the start timestep of the denoising process
Initialize #2: N_c: the number of cascades; N_p: the number of shortcut paths
Initialize #3: f_prior(·): prior image generation network; f_dm(·): conditional diffusion network
for c = 1, 2, 3, ..., N_c do
    if c = 1 then
        y_0^c = f_prior(x) ;  ▷ Prior image generation by one-step CNN inference
    else
        y_0^c = ½(y_0^avg + y_0^{c−1}) ;  ▷ Subsequent prior image generation by residual averaging
    for p = 1, 2, 3, ..., N_p do
        y_{t_s}^p = √γ_{t_s} y_0^c + √(1 − γ_{t_s}) ϵ_p, ϵ_p ∼ N(0, I) ;  ▷ Adding noise to the prior image for shortcut at t_s
        for t = t_s, t_s − 1, t_s − 2, ..., 1 do
            ϵ_t ∼ N(0, I) ;  ▷ Sampling noise in the reverse process
            y_{t−1}^p = (1/√α_t) (y_t^p − ((1 − α_t)/√(1 − γ_t)) f_dm(x, y_t^p, γ_t)) + √(1 − α_t) ϵ_t ;  ▷ Iterative reverse process in a single path
    y_0^avg = (1/N_p) ∑_{p=1}^{N_p} y_0^p ;  ▷ Averaging the multiple shortcut paths' outputs
return y_0^avg ;  ▷ Outputting the last cascade's multi-path averaging result
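Algorithm 1 can be transcribed almost line-for-line in NumPy. The sketch below is an illustration, not the authors' code: `f_prior` and `f_dm` are stand-ins (an imperfect prior and an "oracle" noise predictor built from a known ground truth, so the refinement can be checked), and the linear schedule follows the T = 1000 setting from the text.

```python
import numpy as np

def cmdm_inference(x, f_prior, f_dm, alpha, gamma, t_s, n_cascades, n_paths, rng):
    """Sketch of Algorithm 1: cascaded multi-path shortcut diffusion inference."""
    y_avg = unc = y_prev = None
    for c in range(1, n_cascades + 1):
        # prior image: one-step inference, then residual averaging between cascades
        y_c = f_prior(x) if c == 1 else 0.5 * (y_avg + y_prev)
        paths = []
        for _ in range(n_paths):
            # shortcut: diffuse the prior image to timestep t_s (eq. 8)
            y_t = (np.sqrt(gamma[t_s]) * y_c
                   + np.sqrt(1 - gamma[t_s]) * rng.normal(size=y_c.shape))
            for t in range(t_s, 0, -1):
                eps_t = rng.normal(size=y_t.shape)
                # single reverse step (eq. 10)
                y_t = ((y_t - (1 - alpha[t]) / np.sqrt(1 - gamma[t])
                        * f_dm(x, y_t, gamma[t])) / np.sqrt(alpha[t])
                       + np.sqrt(1 - alpha[t]) * eps_t)
            paths.append(y_t)
        y_prev = y_c
        y_avg = np.mean(paths, axis=0)   # multi-path averaging (eq. 11)
        unc = np.std(paths, axis=0)      # pixel-wise uncertainty map
    return y_avg, unc

# Toy check with an oracle noise predictor that knows the true target:
T = 1000
alpha = np.concatenate([[1.0], 1.0 - np.linspace(1e-4, 0.02, T)])  # alpha[1..T]
gamma = np.cumprod(alpha)                                          # γ_t = ∏ α_s

rng = np.random.default_rng(0)
y_true = rng.uniform(size=(8, 8))
f_prior = lambda x: y_true + 0.3 * rng.normal(size=y_true.shape)   # imperfect prior
f_dm = lambda x, y_t, g: (y_t - np.sqrt(g) * y_true) / np.sqrt(1 - g)

y_hat, unc = cmdm_inference(None, f_prior, f_dm, alpha, gamma,
                            t_s=250, n_cascades=2, n_paths=4, rng=rng)
err = np.mean(np.abs(y_hat - y_true))
assert err < 0.1 and unc.shape == y_true.shape
```

With the oracle predictor, the multi-path average lands very close to the ground truth even though the one-step prior carries visible error, which is the refinement behaviour the cascade targets; in the real model, f_dm is the trained conditional network of Eq. (7).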

Table 1. Quantitative comparisons of translation results from different methods. I2I applications include DE X-ray image generation (soft-tissue and bone
image), Sparse-view CT reconstruction (1/6 projection under-sampling and 1/4 projection under-sampling), and MRI inter-modality synthesis (T1-to-T2
and T1-to-FLAIR). The best results are marked in bold. "†" means the differences between CMDM and all the previous baseline methods are significant
at p < 0.002. The averaged inference time of each method is reported in the right column.
DE X-ray Soft-Tissue Bone Average
Evaluation PSNR SSIM MAE PSNR SSIM MAE Time (Sec)
UNet 39.76 ± 2.36 0.984 ± 0.003 0.606 ± 0.071 41.33 ± 3.18 0.988 ± 0.003 0.571 ± 0.066 0.013
cGAN 39.82 ± 2.37 0.985 ± 0.003 0.603 ± 0.072 41.36 ± 3.17 0.988 ± 0.003 0.572 ± 0.065 0.013
Palette v1 42.89 ± 2.34 0.987 ± 0.002 0.390 ± 0.047 43.06 ± 3.16 0.989 ± 0.002 0.373 ± 0.042 13.670
Palette v2 43.11 ± 2.34 0.988 ± 0.002 0.382 ± 0.045 43.47 ± 3.13 0.990 ± 0.002 0.363 ± 0.043 273.420
I2SB 43.18 ± 2.35 0.988 ± 0.002 0.381 ± 0.045 43.49 ± 3.14 0.990 ± 0.002 0.367 ± 0.043 14.551
BBDM 43.08 ± 2.35 0.988 ± 0.002 0.382 ± 0.044 43.52 ± 3.13 0.989 ± 0.002 0.359 ± 0.043 15.121
Ours 44.27 ± 2.33† 0.991 ± 0.002† 0.369 ± 0.041† 44.58 ± 3.16† 0.992 ± 0.003† 0.348 ± 0.038† 154.663
CT 1/6 Sparse-view 1/4 Sparse-view Average
Evaluation PSNR SSIM MAE PSNR SSIM MAE Time (Sec)
UNet 44.11 ± 1.38 0.977 ± 0.004 0.372 ± 0.047 46.32 ± 1.27 0.981 ± 0.004 0.315 ± 0.040 0.006
cGAN 44.13 ± 1.39 0.978 ± 0.004 0.370 ± 0.047 46.35 ± 1.28 0.981 ± 0.004 0.314 ± 0.040 0.006
Palette v1 44.96 ± 1.24 0.980 ± 0.003 0.321 ± 0.041 46.75 ± 1.26 0.987 ± 0.004 0.310 ± 0.039 8.863
Palette v2 45.56 ± 1.24 0.981 ± 0.003 0.318 ± 0.040 46.95 ± 1.25 0.988 ± 0.003 0.308 ± 0.038 177.202
I2SB 45.86 ± 1.26 0.982 ± 0.003 0.317 ± 0.039 46.91 ± 1.26 0.989 ± 0.003 0.309 ± 0.039 9.561
BBDM 45.73 ± 1.24 0.981 ± 0.003 0.318 ± 0.040 46.96 ± 1.26 0.989 ± 0.003 0.309 ± 0.038 9.987
Ours 46.42 ± 1.22† 0.986 ± 0.003† 0.302 ± 0.039† 47.02 ± 1.25† 0.990 ± 0.003† 0.299 ± 0.038† 108.821
MRI T1 → T2 T1 → FLAIR Average
Evaluation PSNR SSIM MAE PSNR SSIM MAE Time (Sec)
UNet 27.17 ± 1.56 0.885 ± 0.042 0.222 ± 0.051 27.38 ± 1.59 0.891 ± 0.046 0.216 ± 0.052 0.006
cGAN 27.19 ± 1.58 0.887 ± 0.044 0.220 ± 0.052 27.41 ± 1.58 0.891 ± 0.047 0.217 ± 0.053 0.006
Palette v1 27.52 ± 1.57 0.890 ± 0.044 0.218 ± 0.051 27.68 ± 1.54 0.897 ± 0.046 0.210 ± 0.052 8.863
Palette v2 27.68 ± 1.55 0.891 ± 0.043 0.211 ± 0.051 27.79 ± 1.52 0.899 ± 0.044 0.206 ± 0.051 177.202
I2SB 27.85 ± 1.56 0.892 ± 0.043 0.208 ± 0.051 27.89 ± 1.54 0.898 ± 0.043 0.208 ± 0.052 9.561
BBDM 27.88 ± 1.56 0.892 ± 0.043 0.207 ± 0.051 27.86 ± 1.53 0.899 ± 0.045 0.207 ± 0.051 9.987
Ours 27.93 ± 1.54† 0.898 ± 0.042† 0.202 ± 0.051† 27.98 ± 1.54† 0.901 ± 0.044† 0.201 ± 0.051† 108.821
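For reference, the PSNR and MAE columns in Table 1 can be computed as in the minimal NumPy sketch below (an illustration only: the exact normalization/data range used by the authors is not stated, so `data_range` here is an assumption, and SSIM is omitted):

```python
import numpy as np

def psnr(pred, gt, data_range=1.0):
    # Peak Signal-to-Noise Ratio (dB) against the paired ground truth.
    mse = np.mean((pred - gt) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def mae(pred, gt):
    # Mean Absolute Error.
    return np.mean(np.abs(pred - gt))

rng = np.random.default_rng(0)
gt = rng.uniform(size=(256, 256))
good = np.clip(gt + 0.01 * rng.normal(size=gt.shape), 0, 1)  # small residual error
bad = np.clip(gt + 0.10 * rng.normal(size=gt.shape), 0, 1)   # 10x larger error

assert psnr(good, gt) > psnr(bad, gt)   # higher PSNR = better translation
assert mae(good, gt) < mae(bad, gt)     # lower MAE = better translation
```

SSIM, the third metric, is commonly computed with an existing implementation such as scikit-image's `structural_similarity`; whether the authors used that implementation is not stated in the text.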

modalities, which aims to speed up the MRI acquisition that requires multiple protocols (Zhou and Zhou, 2020). Specifically, we collected an in-house MRI dataset consisting of 20 patients. We scanned each patient using three protocols, including T1, T2, and FLAIR, resulting in three 3D volumes of 320 × 230 × 18 for each patient, which were resized to 256 × 256 × 18. 360 2D axial images are generated for each protocol. We further sub-divided this task into two components: translating the T1 image into the T2 image, and translating the T1 image into the FLAIR image. The third application is the image translation of sparse-view
Fig. 3. Qualitative comparison of translation results and corresponding error map from different methods. Examples from DE X-ray soft-tissue generation
(Left), Sparse-view CT reconstruction (Middle), and MRI T1-to-T2 synthesis are shown. The image quality metrics of each sample are indicated at the
bottom left of the images.

CT (SVCT) images into full-view CT images, which aims to reduce the radiation dose in CT acquisition (Zhou et al., 2021, 2022b). We collected 10 whole-body CT scans from the AAPM Low-Dose CT Grand Challenge (McCollough, 2016). Each 3D scan contains 318 ∼ 856 axial slices covering a wide range of anatomical regions from chest to abdomen to pelvis, resulting in a total of 3397 axial 2D images. Using the CT projection simulator, the fully sampled sinogram data was generated via 360 projection views uniformly spaced between 0 and 360 degrees. Then, we uniformly sampled 90 and 60 projection views from the 360 projection views, mimicking 4- and 6-fold projection view/radiation dose reduction. The paired full-view and sparse-view CT images were then reconstructed using Filtered Back Projection (FBP) based on these sinograms, with a size of 256 × 256. For all three applications/datasets, we performed 5-fold cross-validation for evaluation considering their moderate scale.

2.3. Evaluation Metrics and Baseline Comparisons

To evaluate the translated image quality for the above-mentioned applications, we used the Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Mean Absolute Error (MAE), computed against the corresponding paired ground truth. For baseline comparisons, we compared our method's results against previous one-step CNN-based and diffusion-based image-to-image translation methods, including cGAN (Isola et al., 2017), Palette (Saharia et al., 2022), the Schrodinger Bridge Diffusion Model (I2SB) (Liu et al., 2023a), and the Brownian Bridge Diffusion Model (BBDM) (Li et al., 2023). Given that Palette utilizes random Gaussian noise as the initial input, we also compared two versions of Palette: Palette with 1 sampling run (Palette v1) and Palette with 20 sampling runs with results averaging (Palette v2). I2SB and BBDM only have the 1-sampling-run version given that there is no randomized input during sampling. Furthermore, we also conducted ablative studies on the hyper-parameters of CMDM, including the shortcut time point, number of shortcut
Fig. 4. Ablative studies on the reverse starting time (Left), the number of paths (Middle), and the number of cascades (Right). DE X-ray soft-tissue image
generation and 1/6 SVCT reconstruction were utilized for these studies. Peak performances were annotated on the plots with the corresponding image
quality metric, i.e. PSNR.

paths, and number of cascades.

2.4. Implementation Details

We implemented our method in PyTorch and performed experiments using an NVIDIA H100 GPU. We trained all models with a batch size of 8 for 500k training steps. The Adam solver was used to optimize our models with lr = 1 × 10⁻⁴, β₁ = 0.9, and β₂ = 0.99. We used an EMA rate of 0.9999. A 10k-step linear learning rate warmup schedule was implemented. We used a linear noise schedule with 1000 time steps.

3. Experimental Results

Figure 3 shows qualitative comparisons between previous state-of-the-art methods and ours. Examples from the DE X-ray dataset, SVCT reconstruction dataset, and MRI translation dataset are illustrated. For the DE X-ray example (left two columns), we can see all the previous translation methods can generate reasonable soft-tissue images, i.e. rib-suppression images, from the standard X-ray image. While cGAN could generate visually plausible results with a PSNR of 44.74dB, the translated images still suffer from relatively inaccurate quantification, as indicated by the error map. On the other hand, the previous diffusion-based methods, e.g. Palette and BBDM, both achieved significantly better translation as compared to cGAN, with PSNR improving to 46.05dB and much fewer pixel-wise errors indicated by the error maps. In the last row, we can find that our CMDM further improved over the previous diffusion-based methods, with PSNR reaching 47.34dB and further reduced pixel-wise error in the cardiac and lung regions. Similarly, for the SVCT example (the middle two columns), cGAN can reasonably suppress the streak artifact in the input FBP SVCT image. However, significant residual errors can be found in the femoral head and pelvic bone regions. On the other hand, we observe that the previous diffusion-based methods can suppress these errors, with PSNR reaching close to 46dB. Furthermore, with CMDM combining cGAN and diffusion, the overall error of our translation results is reduced even more, and the image quality is enhanced to a PSNR of 46.23dB. Similar observations can be found for the T1-to-T2 translation example in the last two columns.

The quantitative comparisons are summarized in Table 1. Similar to the observations from the visualizations, the traditional CNN-based approaches generally under-performed the diffusion-based approaches. For example, cGAN only achieved an average PSNR of 39.82dB and MAE of 0.603 for the soft-tissue image translation, while the single-reverse-path Palette, i.e. Palette v1, significantly outperformed it with a PSNR of 42.89dB and MAE of 0.390. Running multiple reverse paths of Palette and averaging the outputs, i.e. Palette v2, led to improved performance that reached similar performance to I2SB and BBDM, with a PSNR of 43.11dB and MAE of 0.382. In the last row, our CMDM achieved an average PSNR of 44.27dB and MAE of 0.369, significantly outperforming all the previous baseline methods. Comparing the soft-tissue image translation task to the bone image translation task, CMDM had slightly higher performance on the latter task, since the bone image, without complex soft-tissue texture, can be relatively easier to generate as compared to the soft-tissue image. For the inference speed in DE X-ray applications, I2SB and BBDM with a single reverse path took an average of 14.55 and 15.12 seconds, respectively. CMDM with the best performance took an average of 154.66 seconds per inference
since multiple shortcut reverse paths are needed. Similar to the quantitative results for the DE X-ray, we found our CMDM consistently outperformed previous CNN- and diffusion-based baseline methods for both the SVCT reconstruction applications and the MRI inter-modality translation applications.

Table 2. Quantitative comparison of CMDM with different prior strategies. Analyses with the DE X-ray soft-tissue generation task, 1/6 SVCT reconstruction task, and T1-to-T2 MRI synthesis task are reported.
MAE DE X-Ray SVCT MRI
w/o prior 0.379 ± 0.043 0.316 ± 0.042 0.210 ± 0.051
UNet prior 0.370 ± 0.041 0.306 ± 0.041 0.203 ± 0.051
UFNet prior 0.366 ± 0.041 0.303 ± 0.040 0.201 ± 0.050
cGAN prior 0.369 ± 0.041 0.302 ± 0.039 0.201 ± 0.051

We conducted ablative studies for the hyper-parameters in CMDM, including the reverse starting time, the number of shortcut paths, and the number of cascades. The results for the DE X-ray and SVCT are summarized in Figure 4. First, for the reverse starting time, we can see that setting t_s to around 200 yields the best performance, and the performance starts to degrade if we further increase it. It is worth noting that using t_s = 200 here not only yields the best performance but also allows us to reduce the inference time by about 5 times as compared to the previous diffusion methods that start at t = 1000 or beyond. Second, for the number of shortcut paths, we can see that the performance increases as we use an increasing number of paths. The performance started to converge when 20 paths were used. Because the inference time increases linearly as we increase the number of paths, we chose the converging point N_p = 20 in our method. Thirdly, for the number of cascades, we found that the performance gradually boosted as the number of cascades increased. However, peak performance was reached when N_c = 3, and the inference started to overfit, leading to degraded translation performance. Lastly, we investigated the impact on CMDM when different prior image generations were used, including priors from UNet (Ronneberger

ablative studies on CMDM's uncertainty estimation. Two examples of DE X-ray and MRI T1-to-T2 translation are shown in Figure 5. On the bottom, both the pixel-wise absolute error and the pixel-wise uncertainty (i.e. computed by the standard deviation of multiple shortcut path predictions) are visualized. The corresponding scatter plot of their pixel-wise relationship is also shown on the right. We found that the pixel-wise uncertainty and the absolute error have a good correlation. For the DE X-ray example and the MRI example here, we have correlation coefficients equal to 0.76 and 0.81, respectively. This is particularly useful when ground truth is unavailable to compute the translation error, where uncertainty can indicate the potential error distributions. The correlation of the pixel-wise uncertainty and the absolute error for the whole test set is summarized in Table 4. By running multiple sampling runs of Palette (Saharia et al., 2022), i.e. Palette v2, it can also produce the pixel-wise standard deviation for uncertainty estimation. In Table 4, we can see CMDM achieving a better averaged correlation across all three translation applications.

Table 4. Averaged correlation of the pixel-wise absolute error and the pixel-wise uncertainty, i.e. computed by the standard deviation of multiple paths' predictions. The DE X-ray soft-tissue generation task, 1/6 SVCT reconstruction task, and T1-to-T2 MRI synthesis task are reported.
Correlation DE X-Ray SVCT MRI
Palette v2 0.678 ± 0.162 0.702 ± 0.137 0.676 ± 0.108
CMDM 0.695 ± 0.142 0.718 ± 0.127 0.687 ± 0.089

4. Discussion

In this work, we developed a novel image translation method, called CMDM, that efficiently integrates GAN and DM to enable high-quality medical image-to-image translation. There are several key advantages of this method. First, we utilized a previous CNN-based translation method to generate a virtual "t = 0" image for the diffusion model. This image is added
et al., 2015), Under-to-fully-complete Network (UFNet (Zhou with the scheduled noise, so we can start the diffusion reverse
et al., 2022a)), and cGAN (Isola et al., 2017). As we can process at a scheduled shortcut time point. As illustrated in
see from Table 2, using CMDM with prior always outperforms Figure 1, initializing the reverse process with pure noise may
CMDM without prior. Among all the prior generated, CMDM lead to sub-optimal results, while here, starting the reverse pro-
with priors generated from cGAN and UFNet yields the best cess with a roughly estimated image (e.g. cGAN’s prediction)
performance. Moreover, we also studied CMDM with or with- with the scheduled noise not only can help stabilize the reverse
out the conditional input for the diffusion part. As we can see sampling process, but also reduce the required number of re-
from Table 3, CMDM without conditional input can still gen- verse iterations, i.e shorten the inference time. Second, instead
erate a reasonable translation guided by the prior image. How- of adding one noise schedule (Chung et al., 2022; Gao et al.,
ever, CMDM with condition input with more translation guid- 2023), we added different noises to this ”t = 0” image and per-
ance still yields the best performance. formed the same reverse process multiple times in each cas-
cade. The corresponding cascade output is simply the averaged
outputs from these paths. This averaging operation inherently
Table 3. Quantitative comparison of CMDM with or without images to be
reduces the randomness from the different noises and thus im-
translated as conditional inputs in the diffusion part. Analysis with DE
X-ray soft-tissue generation task, 1/6 SVCT reconstruction task, and T1- proves the translation robustness. Based on results from mul-
to-T2 MRI synthesis task are reported. tiple reverse runs, we can generate pixel-wise uncertainty esti-
MAE DE X-Ray SVCT MRI mation for the translation results, which is also a key advantage.
w/o condition 0.517 ± 0.059 0.339 ± 0.043 0.219 ± 0.053 Lastly, we also devised a cascade framework with a residual
w condition 0.369 ± 0.041 0.302 ± 0.039 0.202 ± 0.051 averaging strategy. This design helps us enhance performance
without training additional models, but may come at the cost of
In addition to the translation performance, we also conducted additional inference time. It is worth noticing that our CMDM
Fig. 5. Examples of CMDM's uncertainty estimation for DE X-ray soft-tissue image generation (left) and MRI T1-to-T2 synthesis (right). The relationship plots between the absolute error (bottom left) and the uncertainty (bottom right) are shown as well. Positive correlations with R > 0.75 were found for both cases.
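The uncertainty map discussed above is simply the pixel-wise standard deviation across the N_p shortcut-path outputs, and the reported correlations are Pearson coefficients between this map and the absolute-error map. The computation can be sketched as follows; note that `reverse_fn` is a hypothetical stand-in for the learned reverse diffusion process (not the actual network), and the value of `alpha_bar_ts` is an illustrative assumption, not the paper's schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def shortcut_uncertainty(prior, reverse_fn, alpha_bar_ts, n_paths=20):
    """Multi-path shortcut sampling (a sketch of the CMDM idea).

    prior        : CNN/cGAN prediction used as the virtual t=0 image.
    reverse_fn   : stand-in for the reverse process from t_s back to 0.
    alpha_bar_ts : cumulative noise-schedule value at the shortcut time t_s.
    Returns the averaged translation and the pixel-wise uncertainty (std).
    """
    outputs = []
    for _ in range(n_paths):
        eps = rng.standard_normal(prior.shape)
        # Forward-diffuse the prior to t_s: x_ts = sqrt(a)*x0 + sqrt(1-a)*eps
        x_ts = np.sqrt(alpha_bar_ts) * prior + np.sqrt(1.0 - alpha_bar_ts) * eps
        outputs.append(reverse_fn(x_ts))
    outputs = np.stack(outputs)
    # Mean over paths = translation output; std over paths = uncertainty map
    return outputs.mean(axis=0), outputs.std(axis=0)

# Toy demo with an identity "reverse process" (hypothetical stand-in):
prior = rng.random((8, 8))
mean_img, unc = shortcut_uncertainty(prior, lambda x: x, alpha_bar_ts=0.4)

# Pearson correlation between uncertainty and absolute error, as in Table 4
# (here against a toy target; in practice the held-out ground truth):
target = prior
err = np.abs(mean_img - target).ravel()
r = np.corrcoef(err, unc.ravel())[0, 1]
```

With a real reverse network, `r` is the per-case quantity averaged in Table 4; the toy identity mapping only illustrates the bookkeeping, not the correlation values.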
can be viewed as a plug-and-play module that improves the performance of cGAN, i.e., the one-step inference model used in CMDM, as shown in Table 1. Ideally, our approach could also be added as a plug-and-play module to other previous translation methods for potential performance improvements.

We collected three medical image datasets covering a total of six different medical image translation tasks to validate our method. Our experimental results demonstrated that our method can generate high-quality translated images that consistently outperformed previous baseline methods (Figure 3 and Table 1). For example, CMDM achieved PSNR > 44 dB for both DE soft-tissue image generation and DE bone image generation, while all the previous methods remained below 44 dB. Although CMDM achieves the best performance, it requires a relatively longer inference time compared to previous methods that need only a single reverse run. For example, CMDM needs 154.66 seconds on average for the DE X-ray application, whereas Palette v1, I2SB, and BBDM need only about 13 seconds. However, we can reduce either the number of cascades or the number of shortcut paths in CMDM to balance computation time against performance. The default settings in our CMDM are N_c = 3 and N_p = 20. According to the studies reported in Figure 4, we could reduce the number of cascades (N_c) to 1 to shorten the inference time by nearly three times, which would result in PSNR = 43.75 dB; this result still outperforms all the previous baseline methods (Table 1). Similarly, we could reduce the number of shortcut paths (N_p) to 10 to cut the inference time by nearly half and still outperform all the previous baselines. On the other hand, we believe these hyper-parameters need to be tuned for each translation application to find the optimal balance between performance and computation/time budgets. Besides the translation itself, CMDM also generates a pixel-wise uncertainty estimate. As shown in Figure 5 and Table 4, CMDM's uncertainty estimate demonstrated good correlations with the absolute error, which can only be computed when the ground truth is available. Since ground truth is commonly unavailable for estimating the error when deployed in clinical scenarios, we believe our uncertainty estimation is potentially useful for deciding which regions are trustworthy for downstream applications, such as diagnosis and treatment planning.

The presented work also has limitations, with several potential improvements that are important subjects of our future studies. Firstly, we only validated our method on three different modalities, and evaluations on more diverse applications could be included. Even though we framed CMDM as an image-space post-processing tool here, we believe it can be further tailored to specific translation problems. For example, we could include physics-informed modules, such as data consistency (Schlemper et al., 2017; Song et al., 2021), in CMDM, which may further improve its applications in medical image reconstruction (Zbontar et al., 2018; Sidky and Pan, 2022). Secondly, the current CMDM is implemented in a 2D fashion, while 3D is desirable in many medical image translation tasks. Theoretically, we could directly change all the networks in CMDM into 3D networks to enable 3D applications, but this may be infeasible with current computational resources. For example, we attempted to employ a 3D CMDM with an input size of 256 × 256 × 128 on an 80 GB H100 GPU; however, it could not fit into memory even with a batch size of one. Alternatively, we could utilize multi-view, 2.5D, or memory-efficient diffusion strategies to scale CMDM to 3D (Chung et al., 2023; Xie et al., 2023; Bieder et al., 2024; Chen et al., 2024), which will be extensively investigated in our future work. Thirdly, the inference speed is still relatively slow compared to previous methods, especially classic CNN-based methods. While we discussed the trade-off between performance and speed in the previous paragraph, it is also desirable to maintain optimal performance with increased inference speed. Utilizing accelerated diffusion models, such as DDIM and ResShift (Song et al., 2020; Yue et al., 2024), in CMDM could potentially help achieve this goal. To accelerate the inference speed for time-critical clinical scenarios, such as real-time translation in interventional radiology, one could also consider alternative solutions. For example, we could consider distilling the
diffusion model knowledge into the one-step inference GAN model (Kang et al., 2024), such that a GAN with diffusion-model-level performance and real-time capability can be realized. Fourthly, in the current implementation of CMDM, we did not implement ways to monitor the first step's image generation. If unsatisfactory results are generated in the first step, the error could propagate to the next step. However, this should be reflected in the CMDM final uncertainty map, where an increased uncertainty value, i.e., pixel-wise standard deviation, should be observed. On the other hand, we could also include uncertainty estimation techniques, e.g., Monte Carlo Dropout (Gal and Ghahramani, 2016), in the first-step cGAN to monitor the prior image generation. Lastly, CMDM requires data with paired images for training, but such data may not always be available in certain applications. Unpaired translation diffusion model strategies (Sasaki et al., 2021; Özbey et al., 2023) could potentially be deployed here to mitigate this challenge. For example, one could use CycleGAN to generate the prior image and then use a multi-path version of UNIT-DDPM (Sasaki et al., 2021) to further refine it. This is an interesting direction to be investigated in our future work. Moreover, future work also includes evaluating how CMDM impacts downstream clinical applications. For example, we will investigate whether CMDM-translated images provide similar lesion detection capability or radiomic features compared to the ground truth images, thus validating the clinical value of our method.

5. Conclusion

Our work proposes the Cascaded Multi-path Shortcut Diffusion Model (CMDM), a simple and novel strategy for high-quality medical image-to-image translation. The proposed method first utilizes a classic CNN-based translation method to generate a prior image. By adding different noises to this image, we then run multiple reverse samplings starting from the noisy images, i.e., shortcuts. With this process in each cascade, the translation output is obtained by averaging the path outputs, and the uncertainty estimate is obtained by calculating their standard deviation. Based on this, a cascade framework with residual averaging is further proposed to gradually refine the translation. For validation, we utilized three medical image datasets across X-ray, CT, and MRI. Our experimental results showed that CMDM provides high-quality translation results, better than previous translation baselines on the different sub-tasks. In parallel, CMDM also provides reasonable uncertainty estimates that correlate well with the translation error maps. We believe CMDM could potentially be adapted to other applications where both high-quality translation and uncertainty estimation are required.

Acknowledgments

This work was supported by the National Institutes of Health (NIH) grants R01EB025468 and R01CA275188.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Yinchi Zhou: Conceptualization, Methodology, Software, Visualization, Validation, Formal analysis, Writing - original draft. Tianqi Chen: Results analysis, Writing - review and editing. Jun Hou: Results analysis, Writing - review and editing. Huidong Xie: Conceptualization, Methodology, Software, Writing - review and editing. Nicha C. Dvornek: Writing - review and editing. S. Kevin Zhou: Data preparation, Writing - review and editing. David L. Wilson: Data preparation, Writing - review and editing. James S. Duncan: Writing - review and editing. Chi Liu: Writing - review and editing. Bo Zhou: Conceptualization, Methodology, Software, Visualization, Validation, Formal analysis, Writing - original draft, Supervision.

References

Bieder, F., Wolleb, J., Durrer, A., Sandkuehler, R., Cattin, P.C., 2024. Denoising diffusion models for memory-efficient processing of 3d medical images, in: Medical Imaging with Deep Learning, PMLR. pp. 552–567.
Chen, T., Hou, J., Zhou, Y., Xie, H., Chen, X., Liu, Q., Guo, X., Xia, M., Duncan, J.S., Liu, C., et al., 2024. 2.5d multi-view averaging diffusion model for 3d medical image translation: Application to low-count pet reconstruction with ct-less attenuation correction. arXiv preprint arXiv:2406.08374.
Chen, X., Pretorius, P.H., Zhou, B., Liu, H., Johnson, K., Liu, Y.H., King, M.A., Liu, C., 2022a. Cross-vender, cross-tracer, and cross-protocol deep transfer learning for attenuation map generation of cardiac spect. Journal of Nuclear Cardiology 29, 3379–3391.
Chen, X., Zhou, B., Xie, H., Shi, L., Liu, H., Holler, W., Lin, M., Liu, Y.H., Miller, E.J., Sinusas, A.J., et al., 2022b. Direct and indirect strategies of deep-learning-based attenuation correction for general purpose and dedicated cardiac spect. European Journal of Nuclear Medicine and Molecular Imaging 49, 3046–3060.
Chung, H., Ryu, D., McCann, M.T., Klasky, M.L., Ye, J.C., 2023. Solving 3d inverse problems using pre-trained 2d diffusion models, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 22542–22551.
Chung, H., Sim, B., Ye, J.C., 2022. Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12413–12422.
Denck, J., Guehring, J., Maier, A., Rothgang, E., 2021. Mr-contrast-aware image-to-image translations with generative adversarial networks. International Journal of Computer Assisted Radiology and Surgery 16, 2069–2078.
Gal, Y., Ghahramani, Z., 2016. Dropout as a bayesian approximation: Representing model uncertainty in deep learning, in: International Conference on Machine Learning, PMLR. pp. 1050–1059.
Gao, Q., Li, Z., Zhang, J., Zhang, Y., Shan, H., 2023. Corediff: Contextual error-modulated generalized diffusion model for low-dose ct denoising and generalization. IEEE Transactions on Medical Imaging.
Gong, K., Johnson, K., El Fakhri, G., Li, Q., Pan, T., 2023. Pet image denoising based on denoising diffusion probabilistic model. European Journal of Nuclear Medicine and Molecular Imaging, 1–11.
Gong, Y., Shan, H., Teng, Y., Tu, N., Li, M., Liang, G., Wang, G., Wang, S., 2020. Parameter-transferred wasserstein generative adversarial network (pt-wgan) for low-dose pet image denoising. IEEE Transactions on Radiation and Plasma Medical Sciences 5, 213–223.
Ho, J., Jain, A., Abbeel, P., 2020. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 33, 6840–6851.
Huang, Z., Zhang, J., Zhang, Y., Shan, H., 2021. Du-gan: Generative adversarial networks with dual-domain u-net-based discriminators for low-dose ct denoising. IEEE Transactions on Instrumentation and Measurement 71, 1–12.
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A., 2017. Image-to-image translation with conditional adversarial networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134.
Jungo, A., Reyes, M., 2019. Assessing reliability and challenges of uncertainty estimations for medical image segmentation, in: Medical Image Computing and Computer Assisted Intervention–MICCAI 2019, Springer. pp. 48–56.
Kang, M., Zhang, R., Barnes, C., Paris, S., Kwak, S., Park, J., Shechtman, E., Zhu, J.Y., Park, T., 2024. Distilling diffusion models into conditional gans. arXiv preprint arXiv:2405.05967.
Lee, S., Chung, H., Park, M., Park, J., Ryu, W.S., Ye, J.C., 2023. Improving 3d imaging with pre-trained perpendicular 2d diffusion models. arXiv preprint arXiv:2303.08440.
Li, B., Xue, K., Liu, B., Lai, Y.K., 2023. Bbdm: Image-to-image translation with brownian bridge diffusion models, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1952–1961.
Liu, G.H., Vahdat, A., Huang, D.A., Theodorou, E.A., Nie, W., Anandkumar, A., 2023a. I2SB: Image-to-image schrödinger bridge. arXiv preprint arXiv:2302.05872.
Liu, Y., Zeng, F., Ma, M., Zheng, B., Yun, Z., Qin, G., Yang, W., Feng, Q., 2023b. Bone suppression of lateral chest x-rays with imperfect and limited dual-energy subtraction images. Computerized Medical Imaging and Graphics 105, 102186.
Lyu, Q., Wang, G., 2022. Conversion between ct and mri images using diffusion and score-matching models. arXiv preprint arXiv:2209.12104.
McCollough, C., 2016. Tu-fg-207a-04: Overview of the low dose ct grand challenge. Medical Physics 43, 3759–3760.
Moghadam, P.A., Van Dalen, S., Martin, K.C., Lennerz, J., Yip, S., Farahani, H., Bashashati, A., 2023. A morphology focused diffusion probabilistic model for synthesis of histopathology images, in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2000–2009.
Nie, D., Trullo, R., Lian, J., Wang, L., Petitjean, C., Ruan, S., Wang, Q., Shen, D., 2018. Medical image synthesis with deep convolutional adversarial networks. IEEE Transactions on Biomedical Engineering 65, 2720–2730.
Özbey, M., Dalmaz, O., Dar, S.U., Bedel, H.A., Özturk, Ş., Güngör, A., Çukur, T., 2023. Unsupervised medical image translation with adversarial diffusion models. IEEE Transactions on Medical Imaging.
Pan, S., Abouei, E., Wynne, J., Chang, C.W., Wang, T., Qiu, R.L., Li, Y., Peng, J., Roper, J., Patel, P., et al., 2023. Synthetic ct generation from mri using 3d transformer-based denoising diffusion model. Medical Physics.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, Springer. pp. 234–241.
Saharia, C., Chan, W., Chang, H., Lee, C., Ho, J., Salimans, T., Fleet, D., Norouzi, M., 2022. Palette: Image-to-image diffusion models, in: ACM SIGGRAPH 2022 Conference Proceedings, pp. 1–10.
Sasaki, H., Willcocks, C.G., Breckon, T.P., 2021. Unit-ddpm: Unpaired image translation with denoising diffusion probabilistic models. arXiv preprint arXiv:2104.05358.
Schlemper, J., Caballero, J., Hajnal, J.V., Price, A.N., Rueckert, D., 2017. A deep cascade of convolutional neural networks for dynamic mr image reconstruction. IEEE Transactions on Medical Imaging 37, 491–503.
Shi, Y., Zhang, J., Ling, T., Lu, J., Zheng, Y., Yu, Q., Qi, L., Gao, Y., 2021. Inconsistency-aware uncertainty estimation for semi-supervised medical image segmentation. IEEE Transactions on Medical Imaging 41, 608–620.
Sidky, E.Y., Pan, X., 2022. Report on the aapm deep-learning sparse-view ct grand challenge. Medical Physics 49, 4935–4943.
Song, J., Meng, C., Ermon, S., 2020. Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502.
Song, Y., Shen, L., Xing, L., Ermon, S., 2021. Solving inverse problems in medical imaging with score-based generative models. arXiv preprint arXiv:2111.08005.
Wen, D., Nye, K., Zhou, B., Gilkeson, R.C., Gupta, A., Ranim, S., Couturier, S., Wilson, D.L., 2018. Enhanced coronary calcium visualization and detection from dual energy chest x-rays with sliding organ registration. Computerized Medical Imaging and Graphics 64, 12–21.
Wolleb, J., Sandkühler, R., Bieder, F., Valmaggia, P., Cattin, P.C., 2022. Diffusion models for implicit image segmentation ensembles, in: International Conference on Medical Imaging with Deep Learning, PMLR. pp. 1336–1348.
Wu, W., Hu, D., Niu, C., Yu, H., Vardhanabhuti, V., Wang, G., 2021. Drone: Dual-domain residual-based optimization network for sparse-view ct reconstruction. IEEE Transactions on Medical Imaging 40, 3002–3014.
Xie, H., Gan, W., Zhou, B., Chen, X., Liu, Q., Guo, X., Guo, L., An, H., Kamilov, U.S., Wang, G., et al., 2023. Dose-aware diffusion model for 3d ultra low-dose pet imaging. arXiv preprint arXiv:2311.04248.
Yang, Q., Li, N., Zhao, Z., Fan, X., Chang, E.I.C., Xu, Y., 2020. Mri cross-modality image-to-image translation. Scientific Reports 10, 3753.
Yang, W., Chen, Y., Liu, Y., Zhong, L., Qin, G., Lu, Z., Feng, Q., Chen, W., 2017. Cascade of multi-scale convolutional neural networks for bone suppression of chest radiographs in gradient domain. Medical Image Analysis 35, 421–433.
Yue, Z., Wang, J., Loy, C.C., 2024. Resshift: Efficient diffusion model for image super-resolution by residual shifting. Advances in Neural Information Processing Systems 36.
Zbontar, J., Knoll, F., Sriram, A., Murrell, T., Huang, Z., Muckley, M.J., Defazio, A., Stern, R., Johnson, P., Bruno, M., et al., 2018. fastmri: An open dataset and benchmarks for accelerated mri. arXiv preprint arXiv:1811.08839.
Zhang, Z., Liang, X., Dong, X., Xie, Y., Cao, G., 2018. A sparse-view ct reconstruction method based on combination of densenet and deconvolution. IEEE Transactions on Medical Imaging 37, 1407–1417.
Zhou, B., Chen, X., Xie, H., Zhou, S.K., Duncan, J.S., Liu, C., 2022a. Dudoufnet: Dual-domain under-to-fully-complete progressive restoration network for simultaneous metal artifact reduction and low-dose ct reconstruction. IEEE Transactions on Medical Imaging 41, 3587–3599.
Zhou, B., Chen, X., Zhou, S.K., Duncan, J.S., Liu, C., 2022b. Dudodr-net: Dual-domain data consistent recurrent network for simultaneous sparse view and metal artifact reduction in computed tomography. Medical Image Analysis 75, 102289.
Zhou, B., Hou, J., Chen, T., Zhou, Y., Chen, X., Xie, H., Liu, Q., Guo, X., Tsai, Y.J., Panin, V.Y., et al., 2024. Pour-net: A population-prior-aided over-under-representation network for low-count pet attenuation map generation. arXiv preprint arXiv:2401.14285.
Zhou, B., Lin, X., Eck, B., Hou, J., Wilson, D., 2019. Generation of virtual dual energy images from standard single-shot radiographs using multi-scale and conditional adversarial network, in: Computer Vision–ACCV 2018, Springer. pp. 298–313.
Zhou, B., Zhou, S.K., 2020. Dudornet: Learning a dual-domain recurrent network for fast mri reconstruction with deep t1 prior, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4273–4282.
Zhou, B., Zhou, S.K., Duncan, J.S., Liu, C., 2021. Limited view tomographic reconstruction using a cascaded residual dense spatial-channel attention network with projection data fidelity layer. IEEE Transactions on Medical Imaging 40, 1792–1804.