Final Year Project: Sparse-View CT Reconstruction
My undergraduate Final Year Project on sparse-view CT reconstruction using physics-guided deep learning, with a companion study guide on the theoretical foundations.
🎯 Project Overview
Sparse-View CT Reconstruction for Semiconductor and Human Imaging
Summary: Computed Tomography (CT) is a critical imaging modality across both semiconductor inspection and medical diagnostics, enabling non-destructive visualization of internal structures. In medical settings, conventional CT requires many projection views, leading to high radiation doses, while in semiconductor inspection, acquiring dense projections increases acquisition time and throughput costs. A key challenge is therefore to reconstruct high-quality CT images from sparse projection data, reducing dose, time, and resource demands without sacrificing structural fidelity.
This project aims to develop advanced sparse-view CT reconstruction algorithms that are broadly applicable to semiconductor devices and human anatomy. Traditional methods such as filtered back projection (FBP) fail under sparse sampling, introducing streaking artifacts and detail loss. To overcome these challenges, we will design physics-guided deep learning frameworks that integrate knowledge of CT forward models with the expressive power of neural networks.
Keywords: CT Imaging · Sparse Reconstruction · Physics-guided Deep Learning · Inverse Problems
Reference survey: Review of Sparse-View or Limited-Angle CT Reconstruction Based on Deep Learning
📖 Chapter 1: Why Do We Need These Unconventional CT Approaches?
The Fundamental Trade-off in CT Imaging
CT (computed tomography) works by firing X-rays through the patient from hundreds of angles (often 300 or more) and recording the transmitted signal at each one. The conflict between image quality and patient safety drives the development of sparse and limited-angle CT.
Full-dose CT captures projections at many angles with high X-ray intensity, producing clear images, but exposes the patient to ionizing radiation that carries a carcinogenic risk.
Low-dose CT reduces radiation by capturing fewer or weaker projections. This gives rise to two main acquisition strategies:
- Sparse-view CT: Instead of projections at every degree, only every 5th degree is acquired. Many intermediate angles are missing.
- Limited-angle CT: Projections can only be captured within a restricted angular range (e.g., 0° to 120°), with the remaining angles physically inaccessible.
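The sparse-view sampling pattern is easy to express in code. A minimal sketch (the angle counts are illustrative, not taken from any specific scanner):

```python
import numpy as np

# Full-view vs. sparse-view angle sets: a full scan samples one projection
# per degree over 360 degrees; the sparse scan keeps only every 5th angle.
full_angles = np.arange(0, 360, 1.0)   # 360 projections
sparse_angles = full_angles[::5]       # keep every 5th -> 72 projections

print(len(full_angles), len(sparse_angles))  # 360 72
```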
Resulting Image Artifacts
Reconstructing a 3D volume from an insufficient set of projections produces characteristic degradations:
- Streak artifacts: Sharp linear bands overlaying the image, resembling scribbles across the scan
- Noise: Loss of clarity, appearing as a grainy, snow-like texture
- Ill-posed problem: Fewer measurements than unknowns means there is no unique, deterministic solution
| Method | Speed | Artifact Handling | Practical Use |
|---|---|---|---|
| FBP (analytical) | ⚡ Fast | Poor at sparse views | Clinical standard (full-view) |
| Iterative (TV) | 🐢 Minutes/scan | Good | Limited clinical use |
| Deep Learning | ⚡ Fast | Excellent | Emerging clinical use |
Ill-posedness is the mathematical root cause of reconstruction difficulty: with more unknowns than measurements, infinitely many images are consistent with the data.
🔬 Chapter 2: From X-Rays to Sinograms to CT Slices — The Physics Pipeline
X-Ray Attenuation
As X-rays pass through tissue, they are partially absorbed by bone, muscle, and other structures. The Lambert-Beer law describes this attenuation mathematically. Dense bone absorbs more X-rays; air-filled lungs absorb almost none. The detector records the surviving intensity, providing a measure of how much material the ray passed through.
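The Lambert-Beer relationship can be sketched numerically. The attenuation coefficients below are illustrative placeholders, not calibrated tissue values:

```python
import numpy as np

# Lambert-Beer law along one ray: I = I0 * exp(-sum(mu_i * dl)).
# The mu values are illustrative, not real attenuation coefficients.
I0 = 1.0                                  # incident X-ray intensity
mu = np.array([0.02, 0.5, 0.5, 0.02])     # soft tissue, bone, bone, soft tissue
dl = 1.0                                  # path length per voxel (arbitrary units)

I = I0 * np.exp(-np.sum(mu * dl))         # surviving intensity at the detector
line_integral = -np.log(I / I0)           # what the sinogram actually stores
print(I, line_integral)                   # line_integral == sum(mu * dl) = 1.04
```

Taking the negative logarithm of the measured intensity ratio recovers exactly the line integral of density that the Radon transform describes in Chapter 3.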
The Sinogram: CT’s Raw Data Format
A single X-ray projection captures one “shadow” of the body at one angle. Stacking all projection profiles into a single 2D image produces the sinogram.
Why “sinogram”? A single point inside the body traces a perfect sinusoidal curve in the sinogram as the X-ray source rotates.
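The sinusoidal trace follows directly from parallel-beam geometry: a point at $(x_0, y_0)$ lands at detector coordinate $\rho(\theta) = x_0\cos\theta + y_0\sin\theta$, a pure sinusoid in the angle. A quick numerical check:

```python
import numpy as np

# A point at (x0, y0) projects to detector position rho = x0*cos(t) + y0*sin(t)
# in parallel-beam geometry -- a pure sinusoid in the angle, hence "sinogram".
x0, y0 = 3.0, 4.0
theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
rho = x0 * np.cos(theta) + y0 * np.sin(theta)

# Equivalent single sinusoid: amplitude r = sqrt(x0^2 + y0^2), phase atan2(y0, x0)
r, phi = np.hypot(x0, y0), np.arctan2(y0, x0)
print(np.allclose(rho, r * np.cos(theta - phi)))  # True
```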
Radon Transform and Back-Projection
- Radon Transform (forward process): Converts the internal density map (CT image) into the sinogram by integrating density along each X-ray path
- Back-projection (inverse process): Smears the sinogram data back into image space to reconstruct the object
FBP: The Classic Reconstruction Algorithm
FBP (Filtered Back-Projection) applies a sharpening filter before back-projecting:
- Fourier transform: Convert each projection profile into the frequency domain
- Ramp filter: Multiply by $|\omega|$ to suppress low frequencies (which cause blur) and boost high frequencies (which carry edges and fine detail)
- Back-projection: Smear the filtered data back into image space
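The filtering step can be sketched in a few lines with the FFT. This is a minimal illustration; production FBP implementations apodize the ramp (e.g., with a Hann window) to limit noise amplification:

```python
import numpy as np

def ramp_filter(projection):
    """Filter one projection profile with the ramp filter |omega| via the FFT.
    Minimal sketch; practical FBP code apodizes the ramp to control noise."""
    n = projection.shape[0]
    freqs = np.fft.fftfreq(n)             # frequencies in cycles per sample
    ramp = np.abs(freqs)                  # the |omega| ramp
    return np.real(np.fft.ifft(np.fft.fft(projection) * ramp))

proj = np.ones(64)                        # a flat (zero-detail) profile
filtered = ramp_filter(proj)
# A constant profile is pure zero-frequency content; the ramp removes it.
print(np.allclose(filtered, 0.0))         # True
```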
FBP’s weakness: The algorithm assumes complete projection data at every angle. When data is missing — as in sparse-view or limited-angle CT — the reconstruction contains severe streak artifacts.
🧮 Chapter 3: The Mathematics of CT Reconstruction
The Radon Transform: Generating Projections
Consider an object described by a 2D density function $f(x, y)$. Each point on a projection profile is the line integral of density along the corresponding ray path:
\[g(\rho, \theta) = \int_L f(x, y) \, dl\]

where $\theta$ is the rotation angle and $\rho$ is the perpendicular distance from the rotation center to the ray.
The Central Slice Theorem
The 1D Fourier transform of a projection at angle θ equals the 2D Fourier transform of the object, evaluated along a line through the origin at the same angle.
In other words, each projection profile, when Fourier-transformed, gives one “slice” through the object’s 2D frequency spectrum. This is the mathematical foundation of FBP.
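The theorem is easy to verify numerically at $\theta = 0$, where the projection is simply the image summed along one axis:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))       # stand-in for a CT slice

# Projection at theta = 0: integrate (sum) the image along the y-axis.
proj0 = img.sum(axis=0)

# Central Slice Theorem: the 1D FFT of this projection equals the
# ky = 0 line of the image's 2D FFT.
slice_from_2d = np.fft.fft2(img)[0, :]
print(np.allclose(np.fft.fft(proj0), slice_from_2d))  # True
```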
FBP: Three Steps to a Sharp Image
- Fourier transform each projection profile into the frequency domain
- Multiply by the ramp filter $|\omega|$
- Inverse-transform and back-project the filtered data into image space

The blurring in simple back-projection arises because the Fourier slices sample the 2D frequency plane densely near the origin. The ramp filter $|\omega|$ compensates by suppressing over-represented low frequencies and amplifying under-represented high frequencies that encode edges.
🖼️ Chapter 4: AI as an Image Enhancer — Image Domain Post-Processing
The simplest strategy is to let FBP perform the geometric reconstruction (however noisy), then train a neural network to clean up the resulting image.
U-Net: The Workhorse of Medical Image Processing
U-Net is the most widely cited architecture for this task:
- Encoder (contracting path): Progressively reduces spatial resolution while increasing feature depth
- Decoder (expanding path): Progressively restores spatial resolution to produce a full-resolution output
- Skip connections: Preserve fine spatial detail (vessel edges, organ boundaries) that would otherwise be lost during downsampling
GAN-Based Refinement
Standard U-Net outputs can appear over-smoothed (“plastic-looking”). Generative Adversarial Networks address this:
- Generator: Produces the denoised CT image
- Discriminator: Learns to distinguish AI-generated outputs from real full-dose scans
Diffusion Models (DDPM)
Diffusion models have emerged as the highest-quality generative approach — iterative denoising produces extremely stable, high-fidelity outputs but requires many inference steps, making it slow for time-critical clinical applications.
Residual Learning
A key training strategy across all image-domain methods is residual learning: rather than predicting the clean image directly, the network predicts the artifact/noise component to be subtracted from the FBP input:
\[\text{Clean image} = \text{FBP image} - \text{Predicted artifact}\]

This simplifies the learning problem and accelerates convergence.
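A minimal sketch of how the residual training target is constructed from paired data (all arrays here are synthetic stand-ins, not real scans):

```python
import numpy as np

rng = np.random.default_rng(1)
clean = rng.uniform(size=(8, 8))             # stand-in ground-truth slice
streaks = 0.3 * rng.standard_normal((8, 8))  # stand-in sparse-view artifacts
fbp = clean + streaks                        # what FBP hands the network

# Training target for a residual network: the artifact component itself.
target_residual = fbp - clean

# At inference, subtracting a perfect residual prediction recovers the image.
recovered = fbp - target_residual
print(np.allclose(recovered, clean))         # True
```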
Important caveat: If the FBP reconstruction has lost structural information entirely (e.g., a small lesion completely obscured by a streak), even the best image-domain network cannot recover it — it can only hallucinate. This is the primary motivation for sinogram-domain and dual-domain approaches.
📡 Chapter 5: AI as a Data Repairer — Sinogram Domain Pre-Processing
Instead of fixing the image after reconstruction, sinogram-domain methods repair the missing projection data before reconstruction. If the sinogram can be completed, standard FBP can produce a clean image without any deep learning in the image domain.
Motivation
Sparse-view CT sinograms look like a venetian blind — regular columns of measured data separated by gaps of missing angles. The goal is to fill those gaps.
Three Completion Methods
- Linear Interpolation + CNN Refinement: Use traditional linear interpolation to fill gaps with a rough estimate, then apply a CNN to correct the interpolation error
- Multi-Scale CNN with Dense Connections: Multi-scale receptive fields capture large-scale sinusoidal trends and fine local detail simultaneously
- GAN-Based Completion: When missing angles are numerous (e.g., limited-angle CT), GANs can learn to generate entire missing regions
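The interpolation stage of the first strategy can be sketched with NumPy; the CNN refinement that would follow is omitted, and the sinogram here is random stand-in data:

```python
import numpy as np

# "Venetian blind" sinogram: 360 angle rows x 64 detector bins, but only
# every 5th angle was measured. Fill the gaps by linear interpolation along
# the angle axis, per detector bin -- the rough estimate a CNN would refine.
n_angles, n_bins = 360, 64
measured_idx = np.arange(0, n_angles, 5)

rng = np.random.default_rng(2)
full_sino = rng.uniform(size=(n_angles, n_bins))  # hypothetical dense scan
sparse_sino = full_sino[measured_idx]             # what we actually acquire

completed = np.empty_like(full_sino)
all_idx = np.arange(n_angles)
for b in range(n_bins):
    completed[:, b] = np.interp(all_idx, measured_idx, sparse_sino[:, b])

# Measured rows are reproduced exactly; the gaps hold linear estimates.
print(np.allclose(completed[measured_idx], sparse_sino))  # True
```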
Limitation: Errors in the completed sinogram can be amplified by FBP into visible artifacts in the reconstructed image. The sinogram-domain network also lacks direct awareness of what the corresponding anatomical structures should look like.
🔁 Chapter 6: Dual-Domain Joint Processing — The Best of Both Worlds
```mermaid
flowchart LR
    S["📡 Sinogram"]:::input --> SN["🤖 Sino-Net\n(repair gaps)"]:::net
    SN --> FBP["⚡ Differentiable\nFBP"]:::proc
    FBP --> IN["🤖 Image-Net\n(remove artifacts)"]:::net
    IN --> DC["🔄 Data\nConsistency"]:::check
    DC --> SN
    classDef input fill:#4A90D9,stroke:#2c5f8a,color:#fff
    classDef net fill:#5BA85A,stroke:#3a6e39,color:#fff
    classDef proc fill:#888,stroke:#555,color:#fff
    classDef check fill:#D97B4A,stroke:#9e5430,color:#fff
```
Dual-domain methods combine sinogram-domain and image-domain processing, with the two branches communicating through the reconstruction operator.
The Enabling Technology: Differentiable FBP
Traditionally, FBP was a standalone algorithm outside the neural network. Differentiable FBP embeds the FBP operation as a differentiable layer inside the network graph. This allows gradients to flow through the reconstruction step, so the sinogram-domain network can “anticipate” how its outputs will affect the final image quality.
Three Dual-Domain Strategies
- Serial approach (SPID model): Sinogram CNN → Differentiable FBP → Image CNN
- Iterative approach (DuDoDR-Net): Alternating cycles of sinogram correction → FBP → image refinement → re-projection → correction…
- Transformer-enhanced approach: Swin Transformer architectures applied in both domains, capturing long-range dependencies
Trade-off: Dual-domain architectures are significantly more complex and memory-intensive, requiring multiple forward/backward projections and two interacting networks.
🔄 Chapter 7: Deep Learning Meets Iterative Reconstruction
Rather than treating deep learning and iterative reconstruction as separate paradigms, this approach integrates them — the neural network becomes part of the iterative algorithm.
Classical Iterative Reconstruction
Traditional iterative reconstruction solves:
\[\hat{x} = \arg\min_x \frac{1}{2} \|Ax - y\|^2 + \lambda R(x)\]

where:
- $\frac{1}{2}\|Ax - y\|^2$ is the data fidelity term: the reconstructed image $x$, when forward-projected through operator $A$, should match the measured projections $y$
- $R(x)$ is the regularization term — a prior assumption about what a good CT image looks like
- $\lambda$ is the regularization weight, balancing fidelity and prior
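A toy instance of this objective, with a small dense matrix standing in for the projection operator and Tikhonov regularization ($R(x) = \tfrac{1}{2}\|x\|^2$) standing in for a learned prior:

```python
import numpy as np

# Toy instance of  min_x 0.5*||Ax - y||^2 + lam*R(x)  with R(x) = 0.5*||x||^2
# (Tikhonov). A tiny dense matrix plays the role of the projection operator.
rng = np.random.default_rng(3)
A = rng.standard_normal((30, 10))         # 30 measurements, 10 unknowns
x_true = rng.standard_normal(10)
y = A @ x_true
lam, step = 0.1, 0.01

x = np.zeros(10)
for _ in range(2000):
    grad = A.T @ (A @ x - y) + lam * x    # gradient of fidelity + prior
    x -= step * grad

# Closed-form Tikhonov solution for comparison.
x_star = np.linalg.solve(A.T @ A + lam * np.eye(10), A.T @ y)
print(np.allclose(x, x_star, atol=1e-6))  # converged to the regularized optimum
```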
Learned Regularization
Deep learning replaces the hand-crafted regularizer $R(x)$ with a learned regularizer: a neural network trained on thousands of CT images that captures rich, data-driven priors about realistic anatomy.
Deep Unrolling
Deep unrolling “unrolls” the iterative optimization loop into a fixed-depth neural network:
- Each layer of the network corresponds to one iteration of the optimization algorithm
- The step sizes, regularization weights, and other parameters become learnable parameters optimized by backpropagation
FISTA-Net is a prominent example: it unrolls the FISTA algorithm into a neural network, inheriting FISTA’s theoretical convergence guarantees while learning all parameters end-to-end.
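The unrolling idea can be sketched as a fixed-depth loop in which each "layer" owns its own step size. The step sizes here are hand-set for illustration; in FISTA-Net and similar models they, along with the regularizer, are learned by backpropagation:

```python
import numpy as np

# Skeleton of deep unrolling: K iterations of a gradient scheme become K
# network layers, each with its own step size.
rng = np.random.default_rng(4)
A = rng.standard_normal((20, 8))
y = A @ rng.standard_normal(8)

K = 10                                    # unrolled depth = number of layers
step_sizes = np.full(K, 0.01)             # one "learnable" step per layer

def unrolled_net(y, A, step_sizes):
    x = np.zeros(A.shape[1])
    for eta_k in step_sizes:              # layer k = one unrolled iteration
        x = x - eta_k * A.T @ (A @ x - y)  # gradient step on 0.5*||Ax - y||^2
    return x

x10 = unrolled_net(y, A, step_sizes)
# The data residual shrinks over the unrolled layers for a safe step size.
print(np.linalg.norm(A @ x10 - y) < np.linalg.norm(y))  # True
```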
🔮 Chapter 8: End-to-End Mapping — Direct Sinogram-to-Image Reconstruction
The most radical departure from traditional CT pipelines: a neural network takes the raw sinogram as input and outputs the reconstructed CT image directly, without any explicit FBP or iterative step.
The core problem:
\[y = Ax + u\]

where $y$ is the measured sinogram, $x$ is the unknown image, $A$ is the Radon transform, and $u$ is noise. End-to-end methods learn the inverse mapping from $y$ directly to $x$.
AUTOMAP: Fully Learned Reconstruction
AUTOMAP uses fully connected layers to learn the complete mapping from sinogram to image. Limitation: For 512×512 images, the fully connected layer would contain billions of parameters — computationally prohibitive.
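The scaling claim follows from back-of-envelope arithmetic. The sinogram geometry below (720 views × 736 detector bins) is an illustrative assumption, not AUTOMAP's actual configuration:

```python
# Back-of-envelope check of the dense-layer scaling for a 512x512 image.
image_pixels = 512 * 512        # 262,144 outputs
sinogram_samples = 720 * 736    # 529,920 inputs (assumed geometry)

dense_params = image_pixels * sinogram_samples
print(f"{dense_params:.2e}")    # on the order of 1e11 weights for one layer
```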
iRadonMAP: Physics-Informed End-to-End Networks
iRadonMAP structures its layers to mirror the FBP pipeline — filtering step, then back-projection step — but implements each with learnable parameters rather than fixed formulas. A differentiable back-projection layer encodes the geometric relationship as a network operation.
Self-Supervised and Unsupervised Methods
IntraTomo and related self-supervised approaches circumvent the need for paired training data by leveraging the physics of CT itself as the training signal:
- Initialize with a random image estimate
- Forward-project the estimate back into sinogram space
- Compare the simulated sinogram with the acquired measurements
- Update the image estimate to reduce the mismatch
No ground truth image is ever needed.
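The four steps above can be sketched with a small dense matrix standing in for the forward projector; only the measurements drive the updates, and the ground truth is touched solely to verify the result afterwards:

```python
import numpy as np

# Self-supervised fitting loop: only the measured sinogram y is used.
rng = np.random.default_rng(5)
A = rng.standard_normal((40, 12))         # toy forward projector
x_true = rng.standard_normal(12)
y = A @ x_true                            # acquired measurements

x = rng.standard_normal(12)               # step 1: random image estimate
for _ in range(3000):
    y_sim = A @ x                         # step 2: forward-project the estimate
    mismatch = y_sim - y                  # step 3: compare with measurements
    x -= 0.005 * A.T @ mismatch           # step 4: update to reduce mismatch

print(np.allclose(x, x_true, atol=1e-5))  # recovered without ground truth
```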
💡 Summary: Five Paradigms in Deep Learning CT Reconstruction
| Paradigm | Pipeline | Strength | Weakness |
|---|---|---|---|
| Image domain post-processing | FBP → CNN | Simple, fast training | Cannot recover lost information |
| Sinogram domain pre-processing | CNN → FBP | Physically motivated | Errors amplified by FBP |
| Dual-domain | SinoNet ↔ FBP ↔ ImgNet | Highest accuracy, physically consistent | Complex, memory-intensive |
| Iterative + deep learning | Unrolled optimization | Theoretically grounded, adaptive | Requires forward projection operator |
| End-to-end | Sinogram → CNN → Image | Fully data-driven | Hard to train; needs massive datasets |
The field has progressed from “fix the image after reconstruction” to “jointly optimize physics and learning” — a trajectory that continues to push the boundaries of what is achievable from minimal radiation dose.
Part of my FYP documentation series. Next: experimental results comparing FBP-ConvNet vs. dual-domain approaches on the AAPM dataset.