Based on a report made by the NYU Library, Phase One and Digital Transitions have investigated an issue that results in TIFF files that do not pass JHOVE validation. We’ve tracked the probable cause of the issue to Adobe Photoshop’s handling of some larger EXIF tags. If an EXIF tag value is larger than four bytes, and its byte count is not a multiple of two, then it must be stored at an offset in the file with padding. The padding ensures that every such value occupies an even number of bytes, so the values stay aligned on two-byte boundaries. It appears that Photoshop eliminates the padding, which causes these tags to fall out of compliance with the TIFF specification. This affects files from modern raw processors including Adobe Camera Raw, Adobe Photoshop Lightroom, and Capture One 10. For instance, a raw file processed to TIFF in Capture One 10 will pass JHOVE validation. However, if it is opened in Photoshop and saved again, the file will no longer pass JHOVE validation.
When the issue is detected, JHOVE reports the “Not well-formed” status with “ErrorMessage: Value offset not word-aligned:” followed by a number. JHOVE does not detect the issue when using its default settings; the issue is only detected if the TIFF-hul module is used (if using the GUI version, select Edit > Select Module > TIFF-hul).
JHOVE is a file format validation tool, originally developed by JSTOR and the Harvard University Library, that verifies whether files meet preservation standards. It is widely used by museums, libraries, and other cultural heritage institutions. http://jhove.openpreservation.org/
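For batch work, the TIFF-hul check described above can be scripted. The following is a minimal sketch, not an official JHOVE integration: it assumes the command-line launcher is on the PATH under the name jhove, and that the text report contains the error string exactly as quoted above. The function names jhove_command and alignment_error_numbers are hypothetical helpers made up for this example.

```python
import re

# Matches the error text quoted above and captures the number that follows it.
ALIGNMENT_ERROR = re.compile(r"Value offset not word-aligned:\s*(\d+)")


def jhove_command(path, jhove="jhove"):
    """Build the argument list for validating `path` with the TIFF-hul module.

    Assumption: the launcher is called `jhove`; adjust for your installation.
    """
    return [jhove, "-m", "TIFF-hul", path]


def alignment_error_numbers(report):
    """Return the numbers reported after each word-alignment error message."""
    return [int(number) for number in ALIGNMENT_ERROR.findall(report)]
```

The command can be run with subprocess.run(jhove_command("scan.tif"), capture_output=True, text=True), where scan.tif is a placeholder file name, and the captured stdout passed to alignment_error_numbers. An empty list means this particular error was not reported; it does not mean the file is valid in every other respect.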
Technical Nature of the Error
When metadata tags are embedded in a TIFF file, there are clear specifications on how they must be written. If a tag value is not a multiple of 2 bytes, and is also larger than 4 bytes, the specification mandates that it must be stored at an offset in the file, with padding added to ensure that all tag values are aligned on 2-byte boundaries. The recent ISO standard 12234-3 calls for a variety of variable-size EXIF tags, such as LensModel, which will almost always be longer than 4 bytes. When such tags are included, they may contain an odd number of bytes, requiring the aforementioned padding. It appears Photoshop is trimming out this specification-mandated padding when it rewrites such a file. JHOVE properly notes this as a deviation from a strict reading of the TIFF specification.
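To illustrate the alignment rule, here is a minimal sketch of a checker that walks a TIFF's tag directories and reports any out-of-line value stored at an odd offset. It is not JHOVE's implementation. It assumes a classic (non-BigTIFF) file, and it only follows the main directory chain and the Exif sub-directory. The function name find_unaligned_offsets is a hypothetical name chosen for this example.

```python
import struct

# Size in bytes of one element of each TIFF field type (types 1 through 12).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1,
              7: 1, 8: 2, 9: 4, 10: 8, 11: 4, 12: 8}
EXIF_IFD_POINTER = 0x8769  # tag whose value is the offset of the Exif sub-IFD


def find_unaligned_offsets(data):
    """Return (tag, offset) pairs for values stored out of line at odd offsets."""
    if data[:2] == b"II":
        endian = "<"  # little-endian file
    elif data[:2] == b"MM":
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a TIFF file")
    magic, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    problems = []
    pending = [first_ifd]
    seen = set()  # guards against directory loops in damaged files
    while pending:
        ifd = pending.pop()
        if ifd == 0 or ifd in seen:
            continue
        seen.add(ifd)
        (count,) = struct.unpack_from(endian + "H", data, ifd)
        for i in range(count):
            entry = ifd + 2 + 12 * i
            tag, ftype, n, value = struct.unpack_from(endian + "HHII", data, entry)
            size = TYPE_SIZES.get(ftype, 1) * n  # unknown types: assume 1 byte
            # Values longer than 4 bytes live at `value`, which must be even.
            if size > 4 and value % 2:
                problems.append((tag, value))
            if tag == EXIF_IFD_POINTER:
                pending.append(value)
        (next_ifd,) = struct.unpack_from(endian + "I", data, ifd + 2 + 12 * count)
        pending.append(next_ifd)
    return problems
```

Calling find_unaligned_offsets(open(path, "rb").read()) on a file returns an empty list when every out-of-line value in the directories it visits starts on an even byte. Each returned pair gives the tag number and the odd offset at which its value was found.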
Many modern raw processors have added support for such EXIF tags, including Capture One 10, Adobe Camera Raw CC 2017, and Lightroom CC 2017. Photoshop will take TIFFs generated by these raw processors, which are fully compliant with the TIFF specification (down to the nitty-gritty of the tags-start-on-even-bytes rule), and save a TIFF which has a very slight deviation from the TIFF specification.
Legacy raw processors, such as Capture One 9, do not have support for these EXIF tags. Because they do not include these tags, Photoshop does not mishandle them. Notably, such raw processors are not as fully compliant with ISO 12234-3, which recommends such tags.