Final Year Project: Sparse-View CT Reconstruction

My undergraduate Final Year Project on sparse-view CT reconstruction using physics-guided deep learning, with a companion study guide on the theoretical foundations.

🎯 Project Overview

Sparse-View CT Reconstruction for Semiconductor and Human Imaging

Summary: Computed Tomography (CT) is a critical imaging modality across both semiconductor inspection and medical diagnostics, enabling non-destructive visualization of internal structures. In medical settings, conventional CT requires many projection views, leading to high radiation doses, while in semiconductor inspection, acquiring dense projections increases acquisition time and throughput costs. A key challenge is therefore to reconstruct high-quality CT images from sparse projection data, reducing dose, time, and resource demands without sacrificing structural fidelity.

This project aims to develop advanced sparse-view CT reconstruction algorithms that are broadly applicable to semiconductor devices and human anatomy. Traditional methods such as filtered back projection (FBP) fail under sparse sampling, introducing streaking artifacts and detail loss. To overcome these challenges, we will design physics-guided deep learning frameworks that integrate knowledge of CT forward models with the expressive power of neural networks.

Keywords: CT Imaging · Sparse Reconstruction · Physics-guided Deep Learning · Inverse Problems

Reference survey: Review of Sparse-View or Limited-Angle CT Reconstruction Based on Deep Learning


📖 Chapter 1: Why Do We Need These Unconventional CT Approaches?

The Fundamental Trade-off in CT Imaging

CT (Computed Tomography) works by firing X-rays through the patient from hundreds of angles (often 300 or more) and recording the transmitted signal at each one. The resulting tension between image quality and radiation dose drives the development of sparse-view and limited-angle CT.

Full-dose CT captures projections at many angles with high X-ray intensity, producing clear images, but exposes the patient to ionizing radiation that carries a carcinogenic risk.

Low-dose CT reduces radiation by capturing fewer or weaker projections. This gives rise to two main acquisition strategies:

  • Sparse-view CT: Instead of acquiring a projection at every degree, only every fifth degree is measured, leaving many intermediate angles missing.
  • Limited-angle CT: Projections can only be captured within a restricted angular range (e.g., 0° to 120°), with the remaining angles physically inaccessible.

Resulting Image Artifacts

Reconstructing a 3D volume from an insufficient set of projections produces characteristic degradations:

  • Streak artifacts: Sharp linear bands overlaying the image, resembling scribbles across the scan
  • Noise: Loss of clarity, appearing as a grainy, snow-like texture
  • Ill-posed problem: Fewer measurements than unknowns means there is no unique, deterministic solution

| Method | Speed | Artifact Handling | Practical Use |
|---|---|---|---|
| FBP (analytical) | ⚡ Fast | Poor at sparse views | Clinical standard (full-view) |
| Iterative (TV) | 🐢 Minutes/scan | Good | Limited clinical use |
| Deep Learning | ⚡ Fast | Excellent | Emerging clinical use |

The ill-posed problem is the mathematical root cause of reconstruction difficulty: with more unknowns than measurements, infinitely many images are consistent with the data.


🔬 Chapter 2: From X-Rays to Sinograms to CT Slices — The Physics Pipeline

X-Ray Attenuation

As X-rays pass through tissue, they are partially absorbed by bone, muscle, and other structures. The Beer–Lambert law describes this attenuation mathematically. Dense bone absorbs more X-rays; air-filled lungs absorb almost none. The detector records the surviving intensity, providing a measure of how much material the ray passed through.
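The attenuation law can be sketched in a few lines of numpy. The attenuation coefficients below are illustrative round numbers, not clinical constants:

```python
import numpy as np

# Beer–Lambert law: I = I0 * exp(-sum(mu_i * d_i)) along the ray.
# The mu values (in 1/cm) are illustrative, not calibrated clinical constants.
def transmitted_intensity(I0, mus, thicknesses):
    """Intensity surviving a ray through segments with attenuation mus and lengths thicknesses (cm)."""
    mus = np.asarray(mus, dtype=float)
    thicknesses = np.asarray(thicknesses, dtype=float)
    return I0 * np.exp(-np.sum(mus * thicknesses))

# A ray crossing 1 cm of soft tissue (mu ~ 0.2) and 2 cm of bone (mu ~ 0.5):
I = transmitted_intensity(1.0, [0.2, 0.5], [1.0, 2.0])

# Taking -log(I/I0) recovers the line integral of mu -- exactly the quantity
# the Radon transform describes: 0.2*1 + 0.5*2 = 1.2
line_integral = -np.log(I)
```

This is why CT works with log-transformed detector readings: the logarithm converts the multiplicative attenuation into the additive line integrals that reconstruction algorithms assume.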

The Sinogram: CT’s Raw Data Format

A single X-ray projection captures one “shadow” of the body at one angle. Stacking all projection profiles into a single 2D image produces the sinogram.

Why “sinogram”? A single point inside the body traces a perfect sinusoidal curve in the sinogram as the X-ray source rotates.
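The sinusoid is easy to verify from the projection geometry. For a point at $(x_0, y_0)$, its offset in the projection at angle $\theta$ is $\rho = x_0\cos\theta + y_0\sin\theta$, which is a pure sinusoid in $\theta$:

```python
import numpy as np

# A point at (x0, y0) appears in the projection at angle theta at offset
# rho(theta) = x0*cos(theta) + y0*sin(theta) -- a sinusoid, hence "sinogram".
x0, y0 = 3.0, 4.0
thetas = np.linspace(0, 2 * np.pi, 361)
rho = x0 * np.cos(thetas) + y0 * np.sin(thetas)

# Equivalently rho(theta) = r*cos(theta - phi): amplitude r is the point's
# distance from the rotation center, phase phi encodes its angular position.
r, phi = np.hypot(x0, y0), np.arctan2(y0, x0)
assert np.allclose(rho, r * np.cos(thetas - phi))
```

Every point in the object contributes one such sine curve; the full sinogram is the superposition of all of them, weighted by density.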

Radon Transform and Back-Projection

  • Radon Transform (forward process): Converts the internal density map (CT image) into the sinogram by integrating density along each X-ray path
  • Back-projection (inverse process): Smears the sinogram data back into image space to reconstruct the object

FBP: The Classic Reconstruction Algorithm

FBP (Filtered Back-Projection) applies a sharpening filter before back-projecting:

  1. Fourier transform: Convert each projection angle into the frequency domain
  2. Ramp filter: Multiply by $|\omega|$ to suppress over-represented low frequencies (which cause blur) and boost high frequencies (which carry edges and fine detail)
  3. Back-projection: Smear the filtered data back into image space
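Steps 1 and 2 applied to a single projection can be sketched in numpy (the projection here is a synthetic Gaussian "shadow", not real data):

```python
import numpy as np

# Steps 1-2 of FBP on one projection profile: FFT, multiply by the ramp
# |omega|, inverse FFT. The projection is a synthetic smooth example.
n = 256
x = np.arange(n)
projection = np.exp(-0.5 * ((x - n / 2) / 20.0) ** 2)  # Gaussian "shadow"

freqs = np.fft.fftfreq(n)   # digital frequencies omega, in cycles/sample
ramp = np.abs(freqs)        # the ramp filter |omega|
filtered = np.fft.ifft(np.fft.fft(projection) * ramp).real

# The ramp zeroes the DC component, so the filtered profile has (numerically)
# zero mean and negative side lobes flanking the peak -- these lobes are the
# seeds of streak artifacts when back-projecting incomplete data.
```

Back-projecting such filtered profiles from all angles (step 3) then yields a sharp image; with angles missing, the uncancelled side lobes appear as streaks.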

FBP’s weakness: The algorithm assumes complete projection data at every angle. When data is missing — as in sparse-view or limited-angle CT — the reconstruction contains severe streak artifacts.


🧮 Chapter 3: The Mathematics of CT Reconstruction

The Radon Transform: Generating Projections

Consider an object described by a 2D density function $f(x, y)$. Each point on a projection profile is the line integral of density along the corresponding ray path:

\[g(\rho, \theta) = \int_L f(x, y) \, dl\]

where θ is the rotation angle, ρ is the perpendicular distance from the rotation center to the ray, and $L$ is the line $x\cos\theta + y\sin\theta = \rho$.

The Central Slice Theorem

The 1D Fourier transform of a projection at angle θ equals the 2D Fourier transform of the object, evaluated along a line through the origin at the same angle.

In other words, each projection profile, when Fourier-transformed, gives one “slice” through the object’s 2D frequency spectrum. This is the mathematical foundation of FBP.
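The theorem can be checked numerically for the simplest angle, θ = 0, where the projection is just a sum along one image axis (the object here is a random array, standing in for a CT slice):

```python
import numpy as np

# Central slice theorem check at theta = 0: the 1D FFT of the projection
# (summing the object along y) equals the ky = 0 row of the 2D FFT.
rng = np.random.default_rng(0)
f = rng.random((64, 64))            # object f[y, x]

projection = f.sum(axis=0)          # line integrals along y (theta = 0 view)
slice_1d = np.fft.fft(projection)   # 1D spectrum of the projection
slice_2d = np.fft.fft2(f)[0, :]     # central row of the 2D spectrum

assert np.allclose(slice_1d, slice_2d)
```

Repeating this over all angles shows how a full set of projections tiles the entire 2D frequency plane, which is exactly what FBP inverts.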

FBP: Three Steps to a Sharp Image

The blurring in simple back-projection arises because the Fourier slices sample the 2D frequency plane densely near the origin and sparsely far from it. The ramp filter $|\omega|$ compensates by suppressing over-represented low frequencies and amplifying under-represented high frequencies that encode edges.

🖼️ Chapter 4: AI as an Image Enhancer — Image Domain Post-Processing

The simplest strategy is to let FBP perform the geometric reconstruction (however noisy), then train a neural network to clean up the resulting image.

U-Net: The Workhorse of Medical Image Processing

U-Net is the most widely cited architecture for this task:

  • Encoder (contracting path): Progressively reduces spatial resolution while increasing feature depth
  • Decoder (expanding path): Progressively restores spatial resolution to produce a full-resolution output
  • Skip connections: Preserve fine spatial detail (vessel edges, organ boundaries) that would otherwise be lost during downsampling

GAN-Based Refinement

Standard U-Net outputs can appear over-smoothed (“plastic-looking”). Generative Adversarial Networks address this:

  • Generator: Produces the denoised CT image
  • Discriminator: Learns to distinguish AI-generated outputs from real full-dose scans

Diffusion Models (DDPM)

Diffusion models have emerged as the highest-quality generative approach — iterative denoising produces extremely stable, high-fidelity outputs but requires many inference steps, making it slow for time-critical clinical applications.

Residual Learning

A key training strategy across all image-domain methods is residual learning: rather than predicting the clean image directly, the network predicts the artifact/noise component to be subtracted from the FBP input:

\[\text{Clean image} = \text{FBP image} - \text{Predicted artifact}\]

This simplifies the learning problem and accelerates convergence.
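The target construction is a one-liner; the sketch below uses synthetic stand-ins for the clean and FBP images rather than a real network:

```python
import numpy as np

# Residual learning target: the network is trained to predict the artifact,
# not the clean image. `clean` and `artifact` here are synthetic stand-ins.
rng = np.random.default_rng(1)
clean = rng.random((32, 32))
artifact = 0.1 * rng.standard_normal((32, 32))
fbp = clean + artifact              # degraded FBP reconstruction

target = fbp - clean                # residual the network should learn

# At inference, subtracting the predicted residual restores the image.
# A perfect prediction (assumed here) recovers `clean` exactly:
restored = fbp - target
assert np.allclose(restored, clean)
```

Because the residual is typically sparser and lower-energy than the full image, the network starts close to the identity mapping, which is what accelerates convergence.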

Important caveat: If the FBP reconstruction has lost structural information entirely (e.g., a small lesion completely obscured by a streak), even the best image-domain network cannot recover it — it can only hallucinate. This is the primary motivation for sinogram-domain and dual-domain approaches.


📡 Chapter 5: AI as a Data Repairer — Sinogram Domain Pre-Processing

Instead of fixing the image after reconstruction, sinogram-domain methods repair the missing projection data before reconstruction. If the sinogram can be completed, standard FBP can produce a clean image without any deep learning in the image domain.

Motivation

Sparse-view CT sinograms look like a venetian blind — regular columns of measured data separated by gaps of missing angles. The goal is to fill those gaps.

Three Completion Methods

  1. Linear Interpolation + CNN Refinement: Use traditional linear interpolation to fill gaps with a rough estimate, then apply a CNN to correct the interpolation error
  2. Multi-Scale CNN with Dense Connections: Multi-scale receptive fields capture large-scale sinusoidal trends and fine local detail simultaneously
  3. GAN-Based Completion: When missing angles are numerous (e.g., limited-angle CT), GANs can learn to generate entire missing regions
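The interpolation stage of method 1 can be sketched with numpy, filling missing angles independently for each detector bin. The shapes, the 1-in-5 sampling, and the toy sinogram are illustrative; the CNN refinement stage is not shown:

```python
import numpy as np

# Fill missing sinogram rows (angles) by linear interpolation along the angle
# axis, one detector bin at a time. Note np.interp holds endpoint values
# constant for angles beyond the last measured view.
def interpolate_sinogram(sparse_sino, measured_angles, all_angles):
    """sparse_sino: (n_measured, n_detectors); returns (n_all, n_detectors)."""
    n_det = sparse_sino.shape[1]
    full = np.empty((len(all_angles), n_det))
    for d in range(n_det):
        full[:, d] = np.interp(all_angles, measured_angles, sparse_sino[:, d])
    return full

all_angles = np.arange(0, 180)           # 1-degree spacing
measured_angles = all_angles[::5]        # sparse-view: every 5th angle kept
dense = np.sin(np.deg2rad(all_angles))[:, None] * np.ones((1, 8))  # toy sinogram
sparse = dense[::5]

completed = interpolate_sinogram(sparse, measured_angles, all_angles)
```

Smooth sinusoidal trends are recovered almost exactly; sharp features between measured angles are what the CNN correction stage must restore.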

Limitation: Errors in the completed sinogram can be amplified by FBP into visible artifacts in the reconstructed image. The sinogram-domain network also lacks direct awareness of what the corresponding anatomical structures should look like.


🔁 Chapter 6: Dual-Domain Joint Processing — The Best of Both Worlds

```mermaid
flowchart LR
    S["📡 Sinogram"]:::input --> SN["🤖 Sino-Net\n(repair gaps)"]:::net
    SN --> FBP["⚡ Differentiable\nFBP"]:::proc
    FBP --> IN["🤖 Image-Net\n(remove artifacts)"]:::net
    IN --> DC["🔄 Data\nConsistency"]:::check
    DC --> SN

    classDef input fill:#4A90D9,stroke:#2c5f8a,color:#fff
    classDef net fill:#5BA85A,stroke:#3a6e39,color:#fff
    classDef proc fill:#888,stroke:#555,color:#fff
    classDef check fill:#D97B4A,stroke:#9e5430,color:#fff
```

Dual-domain methods combine sinogram-domain and image-domain processing, with the two branches communicating through the reconstruction operator.

The Enabling Technology: Differentiable FBP

Traditionally, FBP was a standalone algorithm outside the neural network. Differentiable FBP embeds the FBP operation as a differentiable layer inside the network graph. This allows gradients to flow through the reconstruction step, so the sinogram-domain network can “anticipate” how its outputs will affect the final image quality.

Three Dual-Domain Strategies

  1. Serial approach (SPID model): Sinogram CNN → Differentiable FBP → Image CNN
  2. Iterative approach (DuDoDR-Net): Alternating cycles of sinogram correction → FBP → image refinement → re-projection → correction…
  3. Transformer-enhanced approach: Swin Transformer architectures applied in both domains, capturing long-range dependencies

Trade-off: Dual-domain architectures are significantly more complex and memory-intensive, requiring multiple forward/backward projections and two interacting networks.


🔄 Chapter 7: Deep Learning Meets Iterative Reconstruction

Rather than treating deep learning and iterative reconstruction as separate paradigms, this approach integrates them — the neural network becomes part of the iterative algorithm.

Classical Iterative Reconstruction

Traditional iterative reconstruction solves:

\[\hat{x} = \arg\min_x \frac{1}{2} \|Ax - y\|^2 + \lambda R(x)\]

where:

  • $\tfrac{1}{2}\|Ax - y\|^2$ is the data fidelity term — the reconstructed image $x$, when forward-projected through operator $A$, should match the measured projections $y$
  • $R(x)$ is the regularization term — a prior assumption about what a good CT image looks like
  • $\lambda$ is the regularization weight, balancing fidelity and prior
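A minimal numerical sketch of this objective, using plain gradient descent and the simple Tikhonov regularizer $R(x) = \|x\|^2$; the random matrix $A$ stands in for the sparse-view projection operator, and all sizes and step values are illustrative:

```python
import numpy as np

# Gradient descent on 0.5*||Ax - y||^2 + lam*||x||^2 with a hand-crafted
# Tikhonov regularizer. A is a random underdetermined system standing in
# for the sparse-view projection operator.
rng = np.random.default_rng(2)
m, n = 40, 100                        # fewer measurements than unknowns: ill-posed
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = rng.standard_normal(n)
y = A @ x_true

lam, step = 0.1, 0.1
x = np.zeros(n)
obj = lambda v: 0.5 * np.sum((A @ v - y) ** 2) + lam * np.sum(v ** 2)
history = [obj(x)]
for _ in range(200):
    grad = A.T @ (A @ x - y) + 2 * lam * x   # gradient of fidelity + prior
    x -= step * grad
    history.append(obj(x))
# The objective decreases toward the regularized solution; without the
# regularizer, infinitely many x would fit the 40 measurements exactly.
```

Learned-regularization methods keep exactly this structure but replace the $2\lambda x$ gradient term with the output of a trained network.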

Learned Regularization

Deep learning replaces the hand-crafted regularizer $R(x)$ with a learned regularizer: a neural network trained on thousands of CT images that captures rich, data-driven priors about realistic anatomy.

Deep Unrolling

Deep unrolling “unrolls” the iterative optimization loop into a fixed-depth neural network:

  • Each layer of the network corresponds to one iteration of the optimization algorithm
  • The step sizes, regularization weights, and other parameters become learnable parameters optimized by backpropagation

FISTA-Net is a prominent example: it unrolls the FISTA algorithm into a neural network, inheriting FISTA’s theoretical convergence guarantees while learning all parameters end-to-end.
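The classical FISTA iteration that gets unrolled looks like the loop below (a numpy sketch for $\tfrac{1}{2}\|Ax - y\|^2 + \lambda\|x\|_1$ with illustrative sizes; in the unrolled network, the step size and threshold would become learnable per-layer parameters):

```python
import numpy as np

# Classical FISTA for 0.5*||Ax - y||^2 + lam*||x||_1. Each loop body
# corresponds to one network layer after unrolling.
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(3)
m, n = 50, 120
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = 1.0   # sparse ground truth
y = A @ x_true

lam = 0.05
L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the data-fit gradient
x, z, t = np.zeros(n), np.zeros(n), 1.0
for _ in range(100):
    x_new = soft_threshold(z - (A.T @ (A @ z - y)) / L, lam / L)  # proximal step
    t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
    z = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum extrapolation
    x, t = x_new, t_new
```

Unrolling fixes the loop at, say, 10 iterations and backpropagates through all of them, so each layer can learn its own step size, threshold, and (in FISTA-Net) a learned transform in place of the fixed soft-threshold.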


🔮 Chapter 8: End-to-End Mapping — Direct Sinogram-to-Image Reconstruction

The most radical departure from traditional CT pipelines: a neural network takes the raw sinogram as input and outputs the reconstructed CT image directly, without any explicit FBP or iterative step.

The core problem:

\[y = Ax + u\]

where $y$ is the measured sinogram, $x$ is the unknown image, $A$ is the Radon transform, and $u$ is noise. End-to-end methods learn the inverse mapping from $y$ directly to $x$.

AUTOMAP: Fully Learned Reconstruction

AUTOMAP uses fully connected layers to learn the complete mapping from sinogram to image. Limitation: For 512×512 images, the fully connected layer would contain billions of parameters — computationally prohibitive.
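A back-of-the-envelope check of that scaling claim (the sinogram dimensions below, 720 detector bins × 180 angles, are illustrative, not taken from the paper):

```python
# Parameter count of a single dense sinogram-to-image layer.
# Sinogram size (720 x 180) is an illustrative assumption.
image_pixels = 512 * 512            # 262,144 output values
sinogram_samples = 720 * 180        # 129,600 input values
fc_params = image_pixels * sinogram_samples

print(f"{fc_params:,} weights")     # ~3.4e10 -- tens of billions
```

This is why AUTOMAP-style fully learned mappings have only been demonstrated at small image sizes, and why iRadonMAP-style physics-structured layers are attractive at clinical resolutions.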

iRadonMAP: Physics-Informed End-to-End Networks

iRadonMAP structures its layers to mirror the FBP pipeline — filtering step, then back-projection step — but implements each with learnable parameters rather than fixed formulas. A differentiable back-projection layer encodes the geometric relationship as a network operation.

Self-Supervised and Unsupervised Methods

IntraTomo and related self-supervised approaches circumvent the need for paired training data by leveraging the physics of CT itself as the training signal:

  1. Initialize with a random image estimate
  2. Forward-project the estimate back into sinogram space
  3. Compare the simulated sinogram with the acquired measurements
  4. Update the image estimate to reduce the mismatch

No ground truth image is ever needed.
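The four steps above can be sketched as a Landweber-style data-consistency loop. The toy linear operator $A$ stands in for the Radon transform, and the system is deliberately well-posed (more measurements than unknowns) so that data consistency alone suffices; real sparse-view CT would additionally need a learned prior:

```python
import numpy as np

# Self-supervised fitting: the only training signal is agreement with the
# measured sinogram. A is a toy forward projector; the true object is never
# shown to the optimization.
rng = np.random.default_rng(4)
m, n = 200, 100
A = rng.standard_normal((m, n)) / np.sqrt(m)   # toy forward model
x_secret = rng.random(n)                        # scanned object (hidden)
y = A @ x_secret                                # acquired measurements

x = rng.random(n)                               # step 1: random initial estimate
step = 0.2
for _ in range(500):
    simulated = A @ x                           # step 2: forward-project estimate
    mismatch = simulated - y                    # step 3: compare with measurements
    x -= step * (A.T @ mismatch)                # step 4: update to reduce mismatch
```

After enough iterations the estimate matches the measurements, and with it the hidden object, even though no ground-truth image ever entered the loop.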


💡 Summary: Five Paradigms in Deep Learning CT Reconstruction

| Paradigm | Pipeline | Strength | Weakness |
|---|---|---|---|
| Image domain post-processing | FBP → CNN | Simple, fast training | Cannot recover lost information |
| Sinogram domain pre-processing | CNN → FBP | Physically motivated | Errors amplified by FBP |
| Dual-domain | SinoNet ↔ FBP ↔ ImgNet | Highest accuracy, physically consistent | Complex, memory-intensive |
| Iterative + deep learning | Unrolled optimization | Theoretically grounded, adaptive | Requires forward projection operator |
| End-to-end | Sinogram → CNN → Image | Fully data-driven | Hard to train; needs massive datasets |

The field has progressed from “fix the image after reconstruction” to “jointly optimize physics and learning” — a trajectory that continues to push the boundaries of what is achievable from minimal radiation dose.


Part of my FYP documentation series. Next: experimental results comparing FBP-ConvNet vs. dual-domain approaches on the AAPM dataset.

This post is licensed under CC BY 4.0 by the author.