Understanding CT Projection Data — Why .npy and What It Actually Means

CT projection data lives in .npy files — here's the why, the what, and how to actually read them.

Posted Mar 24, 2026

By YuXuan Yan

CT projection data is just a NumPy array — but knowing exactly what that array means is the difference between debugging blind and understanding your pipeline.

🗂️ Why `.npy`? — Storage Format Rationale

.npy is the native binary serialization format for NumPy arrays. For numerical imaging data, it wins on almost every axis:

| Format | Speed | Size | Precision | Dependencies |
|---|---|---|---|---|
| `.npy` | ⚡ Fast (memmap) | ✅ Compact | ✅ Exact | NumPy only |
| `.csv` | 🐢 Slow | ❌ Large | ⚠️ Lossy | None |
| `.h5` | ✅ Fast | ✅ Compact | ✅ Exact | `h5py` |
| `.tiff` | ✅ OK | ⚠️ Medium | ✅ Exact | imaging libs |

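Key advantage: np.load() restores the exact array, including shape, dtype, and memory order (C or Fortran), with almost no parsing overhead and no precision loss.

A .npy file is a short header (magic string, format version, and a small dictionary holding shape, dtype, and order) followed by the raw array bytes. By default np.load() reads those bytes into memory; pass mmap_mode="r" to memory-map the file instead, so the OS pages data in only when you index it.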
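To make the load behaviour concrete, here is a minimal sketch of a save/load round trip. It assumes a small synthetic array in place of a real scan, and the file name projections_demo.npy is invented for the example. Note that np.load() copies the whole file into memory unless mmap_mode is passed.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for a real scan: 4 angles, 3 detector rows, 5 columns.
original = np.arange(4 * 3 * 5, dtype=np.float32).reshape(4, 3, 5)

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "projections_demo.npy")  # illustrative file name
np.save(path, original)

# Default load: the whole array is read into RAM.
loaded = np.load(path)

# Opt-in memory map: data is paged in from disk only when it is indexed.
mapped = np.load(path, mmap_mode="r")

# Shape, dtype, and every value should survive the round trip unchanged.
round_trip_ok = (
    loaded.shape == original.shape
    and loaded.dtype == original.dtype
    and np.array_equal(loaded, original)
)
is_memmap = isinstance(mapped, np.memmap)

# Copy just one projection out of the memory map into a regular array.
first_view = np.array(mapped[0])

print(round_trip_ok, is_memmap, first_view.shape)
```

For a full-size scan, the memory-mapped form is the one worth reaching for when you only need a few projections at a time.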

When would you reach for something else?

.h5 → bundle multiple arrays with metadata (e.g., projections + angles + geometry in one file)
.tiff → interop with imaging tools like ImageJ or Fiji
.mat → MATLAB compatibility

🔢 The Shape of Projection Data

Load your file and inspect it first:

  
```python
import numpy as np

projections = np.load("projections.npy")
print(projections.shape, projections.dtype)
# e.g. (180, 512, 512)  float32
```

The three axes map directly to CT geometry:

| Axis | Meaning |
|---|---|
| `axis 0` | Projection angle (which view) |
| `axis 1` | Detector rows (slice height) |
| `axis 2` | Detector columns (fan of rays) |

For sparse-view CT, axis 0 is small by design — e.g., 60 angles instead of 720. The array is physically smaller, but the reconstruction problem becomes much harder.
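One common way to simulate sparse-view data from a dense scan is to keep every k-th projection along axis 0. The sketch below assumes evenly spaced angles over 360 degrees and a tiny placeholder detector; the array sizes and the angle convention are illustrative, not taken from any particular scanner.

```python
import numpy as np

# Hypothetical dense scan: 720 views of a small 4 x 8 detector.
n_dense, n_rows, n_cols = 720, 4, 8
dense = np.zeros((n_dense, n_rows, n_cols), dtype=np.float32)

# Assumed angle convention: evenly spaced over a full rotation, in degrees.
dense_angles = np.linspace(0.0, 360.0, n_dense, endpoint=False)

# Keep every 12th view: 720 / 12 = 60 angles.
step = 12
sparse = dense[::step]
sparse_angles = dense_angles[::step]

# The angle list must be subsampled together with the projections,
# otherwise the reconstruction will place each view at the wrong angle.
print(sparse.shape)  # (60, 4, 8)
print(sparse.nbytes / dense.nbytes)
```

The storage shrinks by the same factor as the number of views, which is the "physically smaller" array mentioned above.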

📐 What the Values Actually Represent

Each value in the array is a line integral — the total X-ray attenuation accumulated along one ray path through the object.

\[p(\theta, t) = \int_{L(\theta, t)} f(x, y) \, dl\]

Where:

$f(x,y)$: the object’s attenuation map (what we want to reconstruct)
$(\theta, t)$: the projection angle and detector position
$L(\theta, t)$: the ray path that the integral runs along
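A toy numerical version of the line integral may help. The sketch below assumes an idealised parallel-beam setup on a pixel grid, where rays run exactly along one array axis so the integral reduces to a sum; the phantom, the pixel size, and the mapping of array axes to angles are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical attenuation map f(x, y): a 4 x 4 square with attenuation 0.5
# per pixel, sitting in an 8 x 8 grid of air (attenuation 0).
f = np.zeros((8, 8), dtype=np.float64)
f[2:6, 2:6] = 0.5
pixel_size = 1.0  # assumed path length through one pixel

# Rays travelling along axis 0, one ray per column (call this theta = 0).
p_theta0 = f.sum(axis=0) * pixel_size

# Rays travelling along axis 1, one ray per row (theta = 90 degrees).
p_theta90 = f.sum(axis=1) * pixel_size

print(p_theta0)  # [0. 0. 2. 2. 2. 2. 0. 0.]
```

Each ray that crosses the square passes through four pixels of 0.5, giving 2.0; rays that miss it give 0. Every view also sums to the same total attenuation, which is a handy consistency check on parallel-beam data.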

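The Beer-Lambert law says intensity decays exponentially along the ray, $I = I_0 e^{-p}$, so in practice raw detector readings $I$ are converted to line integrals by taking the negative natural logarithm: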

\[p = -\log\left(\frac{I}{I_0}\right)\]

Where $I_0$ is the blank-scan (no object) intensity. This gives you the log-normalized sinogram — the standard input for reconstruction algorithms.

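Always check that your data has been log-normalized before feeding it into FBP or your U-Net. A quick tell: in raw intensity data the air around the object is the brightest part of each projection, while in log-normalized data it sits near zero. Raw intensity values will produce garbage reconstructions.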
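The conversion above can be sketched as a small helper. The function name log_normalize and the eps clipping floor are assumptions for this example; real pipelines often add dark-current subtraction and per-pixel flat fields, which are left out here.

```python
import numpy as np


def log_normalize(intensity, blank, eps=1e-6):
    """Convert raw intensities I to line integrals p = -log(I / I0).

    `eps` is an assumed floor that keeps zero photon counts from
    producing infinities.
    """
    ratio = np.asarray(intensity, dtype=np.float64) / np.asarray(blank, dtype=np.float64)
    ratio = np.clip(ratio, eps, None)
    return -np.log(ratio)


# Round-trip check on synthetic values: simulate I from known p, then recover p.
I0 = 1000.0                          # hypothetical blank-scan intensity
true_p = np.array([0.0, 0.5, 2.0])   # known line integrals
raw = I0 * np.exp(-true_p)           # Beer-Lambert forward model
recovered = log_normalize(raw, I0)

print(np.allclose(recovered, true_p))  # True
```

Passing a per-pixel blank scan as `blank` works the same way through broadcasting, as long as its shape matches one projection.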

👁️ Visualising Your Data — Sanity Checks

Two views every CT engineer should check:

  
```python
import matplotlib.pyplot as plt
import numpy as np

proj = np.load("projections.npy")

# 1. Single projection: one angle's detector image
plt.figure()
plt.imshow(proj[0], cmap="gray")
plt.title("Single Projection (angle 0)")
plt.colorbar()

# 2. Sinogram: all angles for the middle detector row
plt.figure()
# aspect="auto" keeps the plot readable when there are far fewer angles than columns
plt.imshow(proj[:, proj.shape[1] // 2, :], cmap="gray", aspect="auto")
plt.xlabel("Detector column")
plt.ylabel("Projection angle")
plt.title("Sinogram (middle row)")
plt.colorbar()

plt.show()  # needed when running as a script rather than in a notebook
```

A healthy sinogram looks like smooth sinusoidal curves — hence the name. Each point in the object traces a sine wave across angles.
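The sinusoid claim can be checked numerically. This sketch assumes parallel-beam geometry, where a point at $(x, y)$ lands at detector position $t = x\cos\theta + y\sin\theta$; the point coordinates and the 1-degree angle spacing are made up for the example, and fan-beam or cone-beam traces are only approximately sinusoidal.

```python
import numpy as np

# Hypothetical object point, measured from the rotation axis.
x, y = 30.0, 40.0

# Assumed acquisition: 180 views, 1 degree apart.
theta = np.deg2rad(np.arange(0, 180))

# Detector position of the point in each view (parallel-beam assumption).
t = x * np.cos(theta) + y * np.sin(theta)

# The same curve written as a single sinusoid: amplitude r, phase phi.
r = np.hypot(x, y)
phi = np.arctan2(y, x)
t_sinusoid = r * np.cos(theta - phi)

print(np.allclose(t, t_sinusoid))  # True
```

The amplitude is the point's distance from the rotation axis, so features near the centre trace narrow curves and features near the edge sweep across the whole detector.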

```mermaid
graph LR
    A["Object point\n(x, y)"] -->|"rotates through angles"| B["Traces a sinusoid\nin sinogram space"]
    B --> C["Radon Transform\np(θ, t)"]
    style A fill:#4A90D9,color:#fff
    style B fill:#E8A838,color:#fff
    style C fill:#5BA85A,color:#fff
```

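Read the artifacts by direction. With the projection angle on the vertical axis, missing views show up as horizontal gaps (blank rows) if the sinogram has been zero-filled onto a full angular grid, or simply as a sinogram with very few rows. That’s your sparse-view problem: missing angles mean missing sinusoidal segments, and your reconstruction model needs to fill in those gaps intelligently. Vertical stripes are a different issue: they point to dead or miscalibrated detector columns, which turn into ring artifacts after reconstruction.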
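A quick automated version of this check is sketched below. It assumes the sinogram is laid out as (angle, detector column), that missing views were zero-filled, and that a dead detector column reads exactly zero; the helper name find_blank_lines and the tolerance are hypothetical.

```python
import numpy as np


def find_blank_lines(sinogram, tol=1e-12):
    """Return (blank_angle_rows, blank_detector_columns) of a 2D sinogram.

    Assumes axis 0 is projection angle and axis 1 is detector column.
    """
    blank = np.abs(sinogram) <= tol
    missing_angles = np.flatnonzero(blank.all(axis=1))  # all-zero rows
    dead_columns = np.flatnonzero(blank.all(axis=0))    # all-zero columns
    return missing_angles, dead_columns


# Synthetic 6-view, 5-column sinogram with two missing views and one dead column.
sino = np.ones((6, 5))
sino[[1, 3]] = 0.0   # zero-filled missing angles
sino[:, 4] = 0.0     # dead detector column

missing_angles, dead_columns = find_blank_lines(sino)
print(missing_angles, dead_columns)
```

On real data a miscalibrated column is rarely exactly zero, so comparing each column's mean against its neighbours is usually a more sensitive test than an all-zero check.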

🧠 One-Sentence Intuition

A .npy projection file is a 3D array of X-ray attenuation line integrals, indexed by angle, detector row, and detector column, and the sinogram is simply what one detector row of that array looks like when you stack all angles vertically.

🔗 Where This Fits in the Sparse-View Pipeline

```mermaid
graph TD
    A["projections.npy\n(N_angles × H × W)"] --> B["FBP / FDK\nInitial Reconstruction"]
    B --> C["Sparse CT Volume\n(degraded, streaky)"]
    C --> D["3D U-Net\n+ Data Consistency Module"]
    D --> E["High-Quality\nReconstructed Volume"]
    style A fill:#4A90D9,color:#fff
    style B fill:#9B6EBD,color:#fff
    style C fill:#D9534F,color:#fff
    style D fill:#E8A838,color:#fff
    style E fill:#5BA85A,color:#fff
```

Part of FYP notes series — Sparse-View CT Reconstruction. Next: FBP/FDK reconstruction and its limitations.

This post is licensed under CC BY 4.0 by the author.