Capture and Representation in Image Formation

Capture and representation are the foundational stages that transform a real-world scene into a digital image suitable for subsequent processing and analysis.

1. Image Capture

1.1 Physical Interaction and Illumination

A scene comprises objects illuminated by natural or artificial light sources.
Light reflected from or transmitted through the scene carries information about shape, color, and texture.

1.2 Optical System

A lens focuses incoming light rays onto the image sensor.
The aperture sets how much light reaches the sensor: a wider aperture gives a brighter image but a shallower depth of field.
The shutter controls exposure time: a longer exposure collects more light, which lowers noise, but increases motion blur.
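
The aperture/shutter trade-off can be made concrete with the standard exposure value formula, $EV = \log_2(N^2/t)$, where $N$ is the f-number and $t$ the exposure time in seconds. The sketch below is only an illustration; the function name and the example settings are assumptions, not part of the notes above.

```python
import math

def exposure_value(f_number: float, shutter_s: float) -> float:
    """Exposure value EV = log2(N^2 / t) for f-number N and exposure time t (seconds)."""
    return math.log2(f_number ** 2 / shutter_s)

# Two hypothetical settings that admit the same amount of light:
ev_wide_fast = exposure_value(4.0, 1 / 400)    # wider aperture, shorter exposure
ev_narrow_slow = exposure_value(8.0, 1 / 100)  # narrower aperture, longer exposure

print(round(ev_wide_fast, 3), round(ev_narrow_slow, 3))
```

Both settings give the same exposure value, but the first has a shallower depth of field and less motion blur, while the second has the opposite.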

1.3 Sensor Technologies

CCD (Charge-Coupled Device) and CMOS (Complementary Metal-Oxide-Semiconductor) arrays consist of millions of photodiodes that convert incoming photons into electric charge, which is read out as an analog voltage [1].
Line-scan sensors capture one row at a time (common in scanners); area-scan sensors capture the full 2D scene at once.

1.4 Digitization: Sampling and Quantization

Sampling discretizes the continuous spatial domain into an $M \times N$ grid of pixels (spatial resolution).
Quantization maps each pixel’s analog voltage into a finite set of levels (gray-level or color resolution).
E.g., an 8-bit grayscale image uses values 0–255 per pixel [2].
Higher spatial or gray-level resolution improves fidelity but increases data size; smaller pixels also collect less light each, which makes them more sensitive to noise.
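
The two digitization steps can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the scene is available as a hypothetical function `scene(x, y)` that returns a brightness in [0, 1]; a real sensor performs these steps in hardware.

```python
def sample_and_quantize(scene, m, n, bits):
    """Sample scene(x, y) on an m x n grid, then quantize to 2**bits levels.

    `scene` is a hypothetical function mapping (x, y) in [0, 1) x [0, 1)
    to a brightness in [0.0, 1.0].
    """
    levels = 2 ** bits
    image = []
    for u in range(m):
        row = []
        for v in range(n):
            # Sampling: evaluate the continuous scene at the pixel centre.
            value = scene((u + 0.5) / m, (v + 0.5) / n)
            # Quantization: clamp to [0, 1], then map to an integer 0 .. levels-1.
            value = min(max(value, 0.0), 1.0)
            row.append(min(int(value * levels), levels - 1))
        image.append(row)
    return image

def ramp(x, y):
    """Hypothetical scene: brightness increases smoothly with x."""
    return x

img_8bit = sample_and_quantize(ramp, 4, 2, bits=8)
img_1bit = sample_and_quantize(ramp, 4, 2, bits=1)
print(img_8bit)  # [[32, 32], [96, 96], [160, 160], [224, 224]]
print(img_1bit)  # [[0, 0], [0, 0], [1, 1], [1, 1]]
```

With 8 bits the ramp keeps four distinct levels; with 1 bit it collapses to black and white, which shows the loss in gray-level resolution.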

2. Image Representation

2.1 Pixel-Based Models

A digital image is a 2D function $I(u,v)$, where $(u,v)\in\{0,\dots,M-1\}\times\{0,\dots,N-1\}$ and

$$I(u,v)\in\{0,1,\dots,2^B-1\},$$

with $B$ the number of bits per pixel [3].

Grayscale: Single channel of intensities.
RGB Color: Three channels—Red, Green, Blue—each quantized separately.
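
As a rough sketch of what these models mean in memory, the snippet below stores a tiny grayscale and a tiny RGB image as nested lists and computes the uncompressed size of an $M \times N$ image. The helper name `raw_size_bytes` and the example dimensions are assumptions made for illustration.

```python
def raw_size_bytes(m, n, bits_per_channel, channels=1):
    """Uncompressed size of an m x n image, rounded up to whole bytes."""
    total_bits = m * n * bits_per_channel * channels
    return (total_bits + 7) // 8

# Grayscale: one intensity per pixel, here with B = 8 (values 0..255).
gray = [
    [0, 128],
    [200, 255],
]

# RGB: one (R, G, B) triple per pixel, each channel quantized to 8 bits.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

max_level = 2 ** 8 - 1  # largest value representable with B = 8
gray_bytes = raw_size_bytes(1080, 1920, 8)             # 2073600
rgb_bytes = raw_size_bytes(1080, 1920, 8, channels=3)  # 6220800
print(max_level, gray_bytes, rgb_bytes)
```

Going from grayscale to RGB at the same bit depth triples the raw storage, since each channel is stored separately.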

2.2 Data Structures and File Formats

Bitmap (raster) formats store the image as an array of pixels (e.g., BMP, TIFF, PNG); BMP files are typically uncompressed.
Compressed formats exploit redundancy via lossless (PNG) or lossy (JPEG) schemes.
Vector graphics (SVG) represent shapes mathematically, which suits diagrams better than natural images [4].
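
To illustrate how a lossless scheme exploits redundancy, the sketch below compresses a raw pixel array with Python's `zlib` module, which implements DEFLATE, the compression method PNG uses. It is not a PNG encoder (PNG also adds per-row filtering and a file structure), and the synthetic image is an assumption chosen to be highly redundant.

```python
import zlib

width, height = 64, 64

# Hypothetical 8-bit grayscale image: left half black (0), right half white (255).
pixels = bytes(
    0 if x < width // 2 else 255
    for y in range(height)
    for x in range(width)
)

compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)

# Lossless: decompression returns exactly the original pixel values.
print(restored == pixels)
print(len(pixels), "raw bytes ->", len(compressed), "compressed bytes")
```

A lossy scheme such as JPEG would instead discard detail that is hard to see, so the restored pixels would only approximate the originals.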

2.3 Mathematical Models

Pinhole Camera Model:
$x = f\,\frac{X}{Z},\quad y = f\,\frac{Y}{Z},$
where $(X,Y,Z)$ are the coordinates of a scene point in the camera frame (with $Z$ measured along the optical axis), $(x,y)$ are image-plane coordinates, and $f$ is the focal length.
Lens Distortion: Real lenses introduce radial and tangential distortion functions that must be calibrated and corrected.
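
The projection equations translate directly into code. The sketch below also adds a one-coefficient radial distortion term, $(1 + k_1 r^2)$, as a simplified stand-in for the radial part of a full distortion model; the function names, the coefficient `k1`, and the sample values are assumptions for illustration, and tangential distortion is omitted.

```python
def project_pinhole(point, f):
    """Project a camera-frame point (X, Y, Z) to image-plane coordinates (x, y)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return f * X / Z, f * Y / Z

def apply_radial_distortion(x, y, k1):
    """Simplified radial model: scale (x, y) by (1 + k1 * r^2), r^2 = x^2 + y^2.

    Assumes (x, y) are normalized image coordinates (projected with f = 1).
    """
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2
    return x * scale, y * scale

near = project_pinhole((2.0, 1.0, 4.0), f=2.0)  # (1.0, 0.5)
far = project_pinhole((2.0, 1.0, 8.0), f=2.0)   # (0.5, 0.25)
print(near, far)

# Hypothetical negative k1 (barrel distortion): points are pulled toward the centre.
x_n, y_n = project_pinhole((2.0, 1.0, 2.0), f=1.0)
x_d, y_d = apply_radial_distortion(x_n, y_n, k1=-0.1)
print((x_n, y_n), (x_d, y_d))
```

Doubling the depth $Z$ halves the projected coordinates, which is the perspective effect the pinhole model captures. Correcting distortion in practice means estimating the coefficients through calibration and inverting the distortion mapping.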

Understanding capture and representation is critical: it determines image fidelity, influences noise characteristics, and underpins all higher-level vision algorithms.

Shanlaksh
