FFT Slides

The document discusses methods for implementing fast cryogenic FFT for sensors operating at 50GHz, highlighting the limitations of fully parallel implementations in cryogenic environments. It presents two approaches: a pipelined FFT using Passive Transmission Lines for buffering and a sliding window FFT that updates previous FFT values with new samples. The sliding FFT can be optimized further with batch updates to reduce throughput requirements, though this increases hardware demands.

UW-Madison • Department of Electrical and Computer Engineering

Physical Computation Laboratory

Fast cryogenic FFT for sensors

George Tzimpragos
April 24, 2025
Fully Parallel FFT
- We assume the FFT is applied to a stream of samples from a sensor that operates at 50 GHz
- In the fully parallel implementation, butterfly units form a network that connects inputs a fixed distance apart at each stage
- A fully parallel implementation is not suitable for the cryogenic environment's area and power budget
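The "fixed distance apart at each stage" wiring can be listed explicitly. The sketch below assumes a radix-2 decimation-in-frequency network (the slides do not state the radix or ordering), and the function name `butterfly_pairs` is made up for illustration.

```python
def butterfly_pairs(n, stage):
    """Index pairs joined by butterflies at one stage of a radix-2
    decimation-in-frequency network of size n (assumed structure).
    Stage 0 is the first stage."""
    distance = n >> (stage + 1)          # n/2, n/4, ..., 1
    block = 2 * distance                 # butterflies never cross a block
    return [(start + i, start + i + distance)
            for start in range(0, n, block)
            for i in range(distance)]

N = 8
num_stages = N.bit_length() - 1          # log2(N) for a power of two
for s in range(num_stages):
    print(f"stage {s}: distance {N >> (s + 1)} ->", butterfly_pairs(N, s))

# Every stage needs N/2 butterflies, so the parallel network needs
# (N/2) * log2(N) butterfly units in total.
total_butterflies = (N // 2) * num_stages
print("butterfly units:", total_butterflies)
```

For N = 8 the distances are 4, 2 and 1, and the network already needs 12 butterfly units. That count grows as (N/2)·log2(N), which is the area and power cost the last bullet refers to.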

Approach 1: Pipelined FFT
- To apply the FFT to a streaming input sequence, we can use a pipelined implementation like SDF (Figure)
- At each stage, inputs are buffered for a delay equal to the distance between connected samples in the fully parallel network, so that they arrive aligned at a butterfly unit
- All memory in this implementation takes the form of constant-size delays, which can be implemented cheaply with delay constructs based on Passive Transmission Lines (PTLs)
- There is no feedback from the outputs of a stage to its inputs, so the butterfly units can be further pipelined to increase throughput
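The delay-and-align idea can be checked with a small behavioral model. This is a sketch of one common radix-2 SDF formulation, not a reconstruction of the slide's figure: it assumes decimation in frequency, reuses each stage's delay line to hold the difference term until it is emitted, and produces outputs in bit-reversed order. The names `SdfStage`, `sdf_fft` and `naive_dft` are hypothetical.

```python
import cmath
from collections import deque

class SdfStage:
    """One radix-2 stage: a fixed delay of `delay` samples plus one butterfly."""
    def __init__(self, delay):
        self.delay = delay
        self.fifo = deque([0j] * delay)   # constant-size delay (a PTL in hardware)
        self.count = 0

    def step(self, x):
        phase = self.count % (2 * self.delay)
        self.count += 1
        delayed = self.fifo.popleft()     # sample that entered `delay` cycles ago
        if phase < self.delay:
            self.fifo.append(x)           # first half of a block: just buffer
            return delayed
        i = phase - self.delay            # second half: partner sample is aligned
        twiddle = cmath.exp(-1j * cmath.pi * i / self.delay)
        self.fifo.append((delayed - x) * twiddle)
        return delayed + x

def sdf_fft(frame):
    """Stream one frame (length a power of two) through log2(n) stages."""
    n = len(frame)
    bits = n.bit_length() - 1
    stages = [SdfStage(n >> (s + 1)) for s in range(bits)]  # delays n/2 ... 1
    outputs = []
    for x in list(frame) + [0j] * (n - 1):   # extra zeros flush the pipeline
        for stage in stages:
            x = stage.step(x)
        outputs.append(x)
    stream = outputs[n - 1:]                 # total latency is n - 1 cycles
    result = [0j] * n
    for m, value in enumerate(stream):       # undo the bit-reversed output order
        result[int(format(m, f"0{bits}b")[::-1], 2)] = value
    return result

def naive_dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

frame = [complex(v) for v in (1, 2, 0, -1, 3, 0.5, -2, 1)]
sdf_error = max(abs(a - b) for a, b in zip(sdf_fft(frame), naive_dft(frame)))
print("matches direct DFT:", sdf_error < 1e-9)
```

The only storage in the model is the per-stage `fifo`, whose length never changes; those are the constant-size delays that the slides propose to build from PTLs.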

Reducing clock speed
- To provide a throughput of 50 GSa/s at a lower clock speed for the butterfly units, we can divide the inputs round-robin to create slower sub-sequences, apply a pipelined FFT to each in parallel, then combine them (MDF architecture)
- While this architecture has 50% utilization, there are similar architectures that fix this by shuffling the buffered data (e.g., MDC)
- The tradeoff between lowering operating speed requirements and increasing hardware requirements due to increased parallelism extends to considerations of different logic families (e.g., RSFQ: faster; xSFQ: more area-efficient)
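The split-and-combine step rests on a standard DFT identity, which can be checked numerically. The sketch below verifies only that identity and the resulting per-lane rate; it does not model the MDF datapath. The lane count `P = 4` and the names `naive_dft` and `split_combine_dft` are assumptions chosen for illustration.

```python
import cmath

def naive_dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def split_combine_dft(x, p):
    """DFT of x computed from p round-robin sub-sequences (len(x) % p == 0)."""
    n = len(x)
    m = n // p
    # Lane r sees samples r, r+p, r+2p, ... and so runs at 1/p of the input rate.
    subs = [naive_dft(x[r::p]) for r in range(p)]
    # Combine: X[k] = sum_r W_N^(k*r) * S_r[k mod m]
    return [sum(cmath.exp(-2j * cmath.pi * k * r / n) * subs[r][k % m]
                for r in range(p))
            for k in range(n)]

P = 4                                     # assumed number of parallel lanes
x = [complex((5 * t) % 7 - 3, (2 * t) % 3 - 1) for t in range(16)]
split_error = max(abs(a - b)
                  for a, b in zip(split_combine_dft(x, P), naive_dft(x)))
lane_rate_gsps = 50 / P                   # each lane handles 12.5 GSa/s
print("identity holds:", split_error < 1e-9, "| lane rate:", lane_rate_gsps, "GSa/s")
```

With four lanes each pipelined FFT only has to keep up with 12.5 GSa/s, at the price of four copies of the pipeline plus the combining stage.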

Approach 2: Sliding window FFT
- In a continuous input setting, rather than calculating a new FFT from scratch every N samples, the known previous values of the FFT can be updated for each new sample using a complex multiplication and an addition for each frequency bin
- Unfortunately, the feedback loop that carries this addition and multiplication from one cycle to the next prevents these operations from being pipelined, so the clock frequency cannot be reduced below 50 GHz
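The per-sample update is the standard sliding DFT recurrence, X_k(n) = e^{j2πk/N} · (X_k(n-1) + x(n) - x(n-N)), where the difference x(n) - x(n-N) is shared by all bins. A minimal sketch follows, assuming a window that starts out filled with zeros; the class name `SlidingDFT` and the tracked bins are hypothetical.

```python
import cmath
from collections import deque

class SlidingDFT:
    """Tracks selected DFT bins of the most recent n samples."""
    def __init__(self, n, bins):
        self.window = deque([0j] * n, maxlen=n)        # last n samples
        self.rot = {k: cmath.exp(2j * cmath.pi * k / n) for k in bins}
        self.X = {k: 0j for k in bins}                 # DFT of an all-zero window

    def push(self, sample):
        delta = sample - self.window[0]                # x(n) - x(n-N), shared
        self.window.append(sample)                     # oldest sample drops out
        for k in self.X:
            # One addition and one complex multiplication per bin. The new
            # value depends on the previous one, which is the feedback loop.
            self.X[k] = (self.X[k] + delta) * self.rot[k]
        return self.X

def naive_bin(window, k):
    n = len(window)
    return sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
               for m, v in enumerate(window))

sdft = SlidingDFT(8, bins=(1, 3))
sliding_error = 0.0
for t in range(40):
    sdft.push(complex((7 * t) % 5 - 2, (3 * t) % 4 - 1))
    for k in (1, 3):
        ref = naive_bin(list(sdft.window), k)
        sliding_error = max(sliding_error, abs(sdft.X[k] - ref))
print("tracks direct DFT:", sliding_error < 1e-9)
```

Because `self.X[k]` on the right-hand side is the value produced on the previous sample, the add and multiply must finish within one sample period, which is the constraint the second bullet describes.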

Sliding FFT with Batch updates
- We can reduce the throughput requirement by collecting a batch of k consecutive samples and applying the update to the value of the FFT for all of them together, parallelizing the complex multiplications
- The batch size can be increased until the update can be performed at a feasible clock frequency, but more hardware will be required for the parallelized calculation
- Sliding FFT is appealing when the downstream task only uses the results for a few frequencies, as only those will need to be implemented
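Unrolling the per-sample recurrence over a batch gives one way to write the batched update. To avoid clashing with the bin index, the sketch writes the batch size as B and the bin as K. With R = e^{j2πK/N} and d_i the difference between the i-th new sample and the sample it displaces, the new bin value is R^B · X + Σ_i R^{B-i+1} · d_i (i = 1..B). This is a sketch under those assumptions, with hypothetical names, not the slides' exact formulation.

```python
import cmath

def naive_bin(window, k):
    n = len(window)
    return sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
               for m, v in enumerate(window))

def batch_update(x_k, k, n, new, old):
    """Advance bin k of an n-point sliding DFT by len(new) samples at once.
    old[i] is the sample that new[i] pushes out of the window."""
    b = len(new)
    r = cmath.exp(2j * cmath.pi * k / n)
    # The b products below are independent of x_k and of each other, so they
    # can run in parallel and be pipelined outside the feedback loop.
    correction = sum(r ** (b - i) * (new[i] - old[i]) for i in range(b))
    # Only this multiply-add is in the loop, and it runs once per batch.
    return r ** b * x_k + correction

N, K, B = 8, 3, 4                        # window size, bin, batch size (assumed)
samples = [complex((7 * t) % 5 - 2, (3 * t) % 4 - 1) for t in range(24)]
x_k = naive_bin(samples[:N], K)          # start from a directly computed bin
batch_error = 0.0
for end in range(N, len(samples) - B + 1, B):
    new = samples[end:end + B]
    old = samples[end - N:end - N + B]
    x_k = batch_update(x_k, K, N, new, old)
    ref = naive_bin(samples[end + B - N:end + B], K)
    batch_error = max(batch_error, abs(x_k - ref))
loop_rate_ghz = 50 / B                   # feedback loop rate with B = 4
print("batched update matches:", batch_error < 1e-9, "| loop rate:", loop_rate_ghz, "GHz")
```

With B = 4 the feedback loop runs at 12.5 GHz instead of 50 GHz, while each tracked bin now needs B parallel multipliers for the correction term. That is the hardware cost the second bullet mentions.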
