Image Analysis 3 - Distances in Poincare Spherical Space
Paper: Poincaré Embeddings for Learning Hierarchical Representations
Project (not looked at yet; used Java):
Implementation: Poincaré Embeddings for Learning Hierarchical Representations
Earlier image distance analysis.
Results for various image histogram methods applied to a sampling of 328 photos with '11' in their sequence number,
from a collection of 8K photos by Bill and Elle, and a sampling of 568 photos with '33' in their sequence number from 16K
photos, including photos from Ellen and Raf & Skot as well. Units are scaled to give a reasonable integer range.
Pair distances stratify into three groups in each case. In the two extreme cases examined, Greyscale and RGB 32^3,
every photo in the sample appears in the central Group 2 layer of pairs. Groups 1 and 3 contain fewer unique
photos and pairs, and they split the photos disjointly: each photo appears in Group 1 or Group 3, but not both. This holds for both samples.
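For orientation, here is a minimal Java sketch of the two extreme histogram methods named above. The page does not state the grey conversion, the distance function, or the scaling actually used, so three things here are assumptions: grey level is the plain average of R, G and B, the distance is L1 between normalized histograms, and no integer scaling is applied. The class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;

class HistogramDistanceSketch {

    // 256-bin greyscale histogram, normalized so the bins sum to 1.
    // Assumption: grey level = plain average of the three channels.
    static double[] greyHistogram(BufferedImage img) {
        double[] h = new double[256];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int p = img.getRGB(x, y);
                int r = (p >> 16) & 0xFF, g = (p >> 8) & 0xFF, b = p & 0xFF;
                h[(r + g + b) / 3]++;
            }
        }
        return normalize(h, (double) img.getWidth() * img.getHeight());
    }

    // "RGB 32^3": 32 bins per channel, 32768 bins in total.
    // Each 8-bit channel is reduced to 5 bits (value >> 3).
    static double[] rgb32Histogram(BufferedImage img) {
        double[] h = new double[32 * 32 * 32];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int p = img.getRGB(x, y);
                int r = (p >> 16) & 0xFF, g = (p >> 8) & 0xFF, b = p & 0xFF;
                h[((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)]++;
            }
        }
        return normalize(h, (double) img.getWidth() * img.getHeight());
    }

    // Divide every bin by the pixel count so images of different sizes compare.
    static double[] normalize(double[] h, double total) {
        for (int i = 0; i < h.length; i++) {
            h[i] /= total;
        }
        return h;
    }

    // Assumption: L1 (sum of absolute bin differences) as the pair distance.
    // Ranges from 0 (identical histograms) to 2 (no overlap at all).
    static double l1Distance(double[] a, double[] b) {
        double d = 0;
        for (int i = 0; i < a.length; i++) {
            d += Math.abs(a[i] - b[i]);
        }
        return d;
    }
}
```

Under these assumptions, an all-black and an all-white image are at distance 2 under both histograms, and any image is at distance 0 from itself.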
Photo/Pair counts per Group
| Metric | N Photos | Group 1 | Group 2 | Group 3 |
Groups 1 & 3: Intersections between Greyscale and RGB 32^3 (photos/aggregate)
| N Photos | Group 1 | Group 3 |
Intersection of Pairs between Greyscale/RGB32 Groups (pairs/aggregate)
| N Photos | Group 1 | Group 2 | Group 3 |
Sorted picture-picture distances
Sample size 568 photos from 16K
With '33' in sequence number.
Numerical differences due to color profiles, and distributions
There are significant numerical differences between Oracle JDK and
OpenJDK in the distance calculations for many pairs, due to the different
default color profiles used.
These differences don't affect the overall distributions, but they can affect which Group an individual pair falls into. OpenJDK vs. Oracle:
< 1:1 1:499 513731
> 1:1 1:499 513732
< 1:1 1:776 505618
> 1:1 1:776 505599
< 1:1 1:793 505533
> 1:1 1:793 723759
< 1:1 1:1690 722604
> 1:1 1:1690 509486
< 1:1 1:1869 735261
> 1:1 1:1869 517035
Overall greyscale distributions for ~118M pairs are the same for the two JDKs:
Oracle Greyscale distribution
OpenJDK Greyscale distribution
For some images, ImageIO.read() returns different pixel data under the two JDKs, e.g. 673/58903 differing bytes, including these from a stretch of about 100 pixels:
< -9876933
> -9942469
< -9876931
> -9942467
< -4279385
> -4344921
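The values above look like packed ARGB ints of the kind BufferedImage.getRGB() returns. That reading is an assumption, since the text calls them bytes. If it holds, unpacking the channels shows how small each difference is. A short Java sketch (the class and method names are hypothetical):

```java
class ArgbDiffSketch {

    // Unpack one 8-bit channel from a packed 0xAARRGGBB int.
    static int red(int argb)   { return (argb >> 16) & 0xFF; }
    static int green(int argb) { return (argb >> 8) & 0xFF; }
    static int blue(int argb)  { return argb & 0xFF; }

    // Format two packed pixels side by side as (R,G,B) triples.
    static String describe(int a, int b) {
        return String.format("(%d,%d,%d) vs (%d,%d,%d)",
                red(a), green(a), blue(a), red(b), green(b), blue(b));
    }

    public static void main(String[] args) {
        // The three differing pairs quoted above, "<" value first.
        int[][] pairs = {
            {-9876933, -9942469},
            {-9876931, -9942467},
            {-4279385, -4344921},
        };
        for (int[] p : pairs) {
            System.out.println(describe(p[0], p[1]));
        }
        // prints:
        // (105,74,59) vs (104,74,59)
        // (105,74,61) vs (104,74,61)
        // (190,179,167) vs (189,179,167)
    }
}
```

Read this way, each quoted pair differs by exactly 65536, which is one step in the red channel, with green and blue unchanged.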
Here is the image:
Histograms from BoofCV.