A CLIP-based toolkit for embedding image folders and generating compact pairwise distance matrices for retrieval and evaluation.
- 🔍 CLIP-based image embedding (any OpenCLIP model, default is Apple's DFN5B-CLIP-ViT-H-14-384)
- ⚡ GPU-accelerated batch inference
- 📦 Compact flattened pairwise distance arrays (upper-triangular matrix, float32/float16 storage, top-k neighbors)
- 🔒 Privacy-preserving series label anonymization helper
- 📊 Mean Average Precision (mAP) computation from either flattened distances or stored top-k neighbors
```bash
make run \
  INPUT_DIR=/path/to/images \
  OUTPUT_DIR=/path/to/output \
  MODEL=hf-hub:apple/DFN5B-CLIP-ViT-H-14-384 \
  BATCH_SIZE=16 \
  DEVICE=cuda \
  ANONYMIZE_LABELS=/path/to/labels.json \
  PAIRWISE_DTYPE=float16
```

Using the Makefile (auto-creates a venv and installs dependencies):
```bash
make install
source .venv/bin/activate
```

Running the CLI:
```bash
python -m clip_image_similarity.cli \
  --input-dir /path/to/images \
  --output-dir /path/to/output \
  --model hf-hub:apple/DFN5B-CLIP-ViT-H-14-384 \
  --batch-size 16 \
  --device cuda
```

Or use the Makefile wrapper (installs and activates the venv automatically):

```bash
make run INPUT_DIR=/path/to/images OUTPUT_DIR=/path/to/output
```

| Parameter | Required | Default | Description |
|---|---|---|---|
| `--input-dir`, `-i` | ✅ | – | Root directory containing images to process. |
| `--output-dir`, `-o` | ✅ | – | Directory where results will be written. |
| `--model`, `-m` | ❌ | `hf-hub:apple/DFN5B-CLIP-ViT-H-14-384` | Hugging Face Hub model ID for OpenCLIP. |
| `--batch-size`, `-b` | ❌ | 32 | Batch size for embedding computation. |
| `--device`, `-d` | ❌ | Auto (CUDA if available) | Device to run on (e.g., `cuda`, `cuda:0`, `cpu`). |
| `--pairwise-dtype` | ❌ | `float32` | Numeric precision for storing distances (`float32` or `float16`). |
| `--top-k` | ❌ | None | Save top-k neighbors per image instead of full flattened distances. |
| `--anonymize-labels` | ❌ | None | Path to labels JSON (series → image paths); converts to series → indices. |
| `--image-exts` | ❌ | Common formats | Comma-separated list of image extensions (e.g., `jpg,png,jpeg`). |
| `--overwrite` | ❌ | false | Allow overwriting existing output files. |
| File | Description |
|---|---|
| `evaluation_results/pairwise_distances.npz` | Flattened upper-triangular distances (`1 - cosine_similarity`); dtype `float32` (default) or `float16` via `--pairwise-dtype`, saved with dtype metadata. |
| `evaluation_results/pairwise_topk.npz` | Emitted when `--top-k` is set; contains per-image neighbor indices/distances plus stored `top_k`, dtype, and index-dtype metadata. |
| `image_paths.json` | Ordered list of image paths corresponding to indices in the flattened array. **Do not share if filenames are sensitive.** |
| `series_to_indices.json` | Optional; written only when `--anonymize-labels` is provided. Maps series → list of indices for downstream mAP while keeping paths private. |
| `config.json` | Snapshot of the run configuration. |
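To consume the flattened array downstream, you need the mapping from a pair `(i, j)` to its position in the condensed layout. The sketch below assumes row-major upper-triangular order (the `np.triu_indices(n, k=1)` / SciPy `squareform` convention) and uses a synthetic matrix in place of the real `.npz`; verify key names in your output with `np.load(path).files`:

```python
import numpy as np

def condensed_index(i: int, j: int, n: int) -> int:
    """Position of pair (i, j), i != j, in a flattened upper-triangular
    array of n items (diagonal excluded), in np.triu_indices(n, k=1) order."""
    if i > j:
        i, j = j, i
    return i * n - i * (i + 1) // 2 + (j - i - 1)

# Synthetic stand-in for evaluation_results/pairwise_distances.npz.
n = 5
rng = np.random.default_rng(0)
full = rng.random((n, n)).astype(np.float32)
full = (full + full.T) / 2                    # symmetric distances
np.fill_diagonal(full, 0.0)
flat = full[np.triu_indices(n, k=1)]          # the stored condensed form

# Recover the distance between images 1 and 3 from the flattened array.
assert np.isclose(flat[condensed_index(1, 3, n)], full[1, 3])
```

The condensed form stores `n*(n-1)//2` values instead of `n*n`, which is where the storage savings come from.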
If you ran the CLI without `--anonymize-labels` but later want to generate `series_to_indices.json`, you can use the standalone script:

```bash
make anonymize-labels OUTPUT_DIR=./results LABELS=./path/to/labels.json
```

Or run directly:

```bash
python -m clip_image_similarity.generate_anonymous_labels \
  --output-dir ./results \
  --labels ./path/to/labels.json \
  --overwrite  # optional: overwrite an existing series_to_indices.json
```

This reads `image_paths.json` from the output directory and generates `series_to_indices.json` using your provided labels file.
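Conceptually, the conversion is a path-to-index lookup against the ordered path list. A minimal sketch (not the script's actual code; the function name is illustrative):

```python
def series_to_indices(image_paths, labels):
    """Map series -> sorted positions of its images in the ordered path list."""
    position = {path: i for i, path in enumerate(image_paths)}
    return {
        series: sorted(position[p] for p in paths if p in position)
        for series, paths in labels.items()
    }

paths = ["a/1.jpg", "a/2.jpg", "b/1.jpg"]
labels = {"series_A": ["a/2.jpg", "a/1.jpg"], "series_B": ["b/1.jpg"]}
print(series_to_indices(paths, labels))
# {'series_A': [0, 1], 'series_B': [2]}
```

The resulting file contains only integer indices, so it can be shared without exposing filenames.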
After generating results, compute Mean Average Precision from the flattened distances and series indices:
```bash
python -m metrics.map \
  --distances ./results/evaluation_results/pairwise_distances.npz \
  --series-indices ./results/series_to_indices.json \
  --output_csv ./results/metrics/map.csv
```

If you saved top-k neighbors instead of the full flattened distances:
```bash
python -m metrics.map \
  --topk ./results/evaluation_results/pairwise_topk.npz \
  --series-indices ./results/series_to_indices.json \
  --output_csv ./results/metrics/map.csv
```

If you need to derive the indices locally from labels and paths instead, pass `--labels` and `--image-paths` (using the saved `image_paths.json`) to `metrics/map.py`, but be aware that sharing paths reveals filenames:
```bash
python -m metrics.map \
  --distances ./results/evaluation_results/pairwise_distances.npz \
  --labels ./resources/labels/images_series_labels.json \
  --image-paths ./results/image_paths.json \
  --output_csv ./results/metrics/map.csv
```

Start with a small batch size (16 or 32) and increase it gradually while monitoring GPU memory usage. For reference, a batch size of 256 reaches ~81% VRAM utilization on an RTX 5090 (32 GB) when processing 30K images.
Use `--pairwise-dtype float16` to reduce storage size by approximately 50% with negligible impact on retrieval accuracy. The default `float32` provides higher precision but produces larger output files.
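The flattened array holds n(n-1)/2 values, so its raw size is easy to estimate. A quick back-of-the-envelope helper (the function name is illustrative):

```python
def flattened_storage_bytes(n_images: int, bytes_per_value: int) -> int:
    """Raw size of the flattened upper-triangular distance array."""
    return n_images * (n_images - 1) // 2 * bytes_per_value

n = 30_000
print(f"float32: {flattened_storage_bytes(n, 4) / 1e9:.2f} GB")  # → 1.80 GB
print(f"float16: {flattened_storage_bytes(n, 2) / 1e9:.2f} GB")  # → 0.90 GB
```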
When working with large datasets, consider using `--top-k` to save only the k nearest neighbors per image instead of the full distance matrix; this significantly reduces storage requirements when k << the total number of images.
Important: If you plan to compute mAP later, ensure k is at least as large as the size of the largest series in your labels. Otherwise, some relevant images may be excluded from the evaluation.
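A small helper for choosing k from your labels file, following the rule above (the function name is illustrative, not part of the toolkit):

```python
def minimal_top_k(labels: dict) -> int:
    """Conservative lower bound for --top-k: the size of the largest series,
    per the advice above (k >= largest series keeps all relevant neighbors)."""
    return max(len(paths) for paths in labels.values())

labels = {"series_A": ["a/1.jpg", "a/2.jpg", "a/3.jpg"], "series_B": ["b/1.jpg"]}
print(minimal_top_k(labels))  # → 3
```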
Performance benchmarks are available in BENCHMARK.md, including detailed timing breakdowns, resource usage, and throughput metrics.