Photoshop: Redundant data stored in a PSD with multiple copies of the same RAW smart objects

  • Problem
  • Updated 3 weeks ago (Edited)

Steps to reproduce:
  1. Open any RAW photo in Photoshop as a Smart Object (tested on a CR2 file, 36 756 199 bytes).
  2. Save the PSD file as test_1.psd (don't close the document) and record its size (in my case: 178 566 666 bytes).
  3. Choose Layer > Smart Objects > New Smart Object via Copy.
  4. Save the PSD file as test_2.psd (don't close the document) and record its size (in my case: 354 302 054 bytes).
  5. Choose Layer > Smart Objects > New Smart Object via Copy two more times, so there are 4 layers in total.
  6. Save the PSD file as test_3.psd (don't close the document) and record its size (in my case: 705 771 044 bytes).
  7. Use 7-Zip to compress test_1.psd, test_2.psd, and test_3.psd into three separate archives (with large dictionary and word sizes) and record their sizes: test_1.7z (80 785 939 bytes), test_2.7z (80 812 834 bytes), test_3.7z (80 863 735 bytes).
Observation:

Duplicating an embedded RAW file causes the entire RAW byte-stream to be stored again, increasing the size of the PSD by the size of the RAW file each time the Smart Object is duplicated. After compression, however, all three files are nearly identical in size: no new information is being stored, the same data is simply duplicated.
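The compression result can be reproduced in miniature. A minimal sketch, using Python's lzma module as a stand-in for 7-Zip's LZMA and random bytes as a stand-in for the RAW payload: compressing four concatenated copies of an incompressible payload costs barely more than compressing one copy, because the compressor encodes the repeats as back-references.

```python
import lzma
import random

# Stand-in for the embedded RAW payload: 1 MiB of pseudo-random
# (hence essentially incompressible) bytes.
random.seed(42)
raw = random.randbytes(1 << 20)

one_copy = lzma.compress(raw)
four_copies = lzma.compress(raw * 4)

print(len(one_copy), len(four_copies))
# The four-copy stream compresses to nearly the same size as the
# single copy: the repeats carry no new information, which is what
# the near-identical .7z sizes above demonstrate.
```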

Proposed solution:

In the case of RAW files there is no need to store the same binary data multiple times: only the Camera Raw settings (the "sidecar" XMP data) need to be stored separately for each Smart Object copy, with the RAW byte-stream itself stored once and shared.
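As an illustration of the proposal (a simplified model with hypothetical names, not Adobe's actual PSD layout): the document stores each unique byte-stream once, keyed by its hash, and each Smart Object copy holds only a reference plus its own Camera Raw settings.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class SmartObjectInstance:
    blob_key: str        # reference to the shared RAW bytes
    camera_raw_xmp: str  # per-copy develop settings (XMP sidecar data)

@dataclass
class Document:
    blobs: dict = field(default_factory=dict)   # hash -> raw bytes
    layers: list = field(default_factory=list)

    def add_raw_smart_object(self, raw_bytes: bytes, xmp: str):
        key = hashlib.sha256(raw_bytes).hexdigest()
        self.blobs.setdefault(key, raw_bytes)   # stored only once
        self.layers.append(SmartObjectInstance(key, xmp))

    def stored_size(self) -> int:
        return sum(len(b) for b in self.blobs.values())

doc = Document()
raw = b"\x00" * 1000  # stand-in for a ~36 MB CR2 file
for i in range(4):
    doc.add_raw_smart_object(raw, f"<xmp>exposure={i}</xmp>")

print(doc.stored_size())  # 1000: one copy of the bytes, four layers
```

Under this layout the file grows by only the size of the settings per copy, instead of by the full RAW payload.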

Rationale:

When creating composite HDR images from a single RAW file, one needs multiple copies of the same RAW file with different Camera Raw settings.
Today, such a composite file can easily take up 1 GiB or more (with a high-megapixel RAW file and 4 copies of the Smart Object).

Implementing the proposed solution would decrease the size to around 200 MiB.
Damian Sepczuk

Posted 4 weeks ago
Max Johnson, Champion
This is as-designed. New Smart Object Via Copy creates an *additional* version of that smart object so it will not affect other instances.

To create an instance of an existing smart object, just duplicate the layer like any other.

You should be able to apply separate RAW processing to each instanced layer with a smart filter via Filter > Camera Raw Filter, though I've never tried this workflow and don't have a RAW file to test on.

Good luck!

An alternative, though not a direct solution, is to use Lightroom and virtual copies for metadata-only changes to a base RAW image. It is much more lightweight and flexible for multi-process and multi-crop/export-size needs. That may not be acceptable for your workflow, however.
Damian Sepczuk
Hi Max, thanks for the quick reply!

Nice idea! Unfortunately, using the Camera Raw Filter is not an option: it works on the already-processed pixel data, not on the original RAW data (for example, the exposure slider behaves differently, and you can't recover the full dynamic range of the original).

I agree that the "front-end" behavior is as-designed. I expect that "New Smart Object Via Copy" creates an independent version of the Smart Object.

But I cannot agree that copying the full byte-stream is the only (or the best) solution in the case of RAW files, where the data separates cleanly into two parts: the constant binary data, shared between all independent Smart Objects in the PSD file, and the user-modifiable part, the Camera Raw settings. As far as I know, it is impossible to change the underlying binary data of a Smart-Object-embedded RAW file (not counting the "Replace Contents" or "Relink" options).

For other types of Smart Objects (where the underlying binary data can be changed), implementing simple copy-on-write seems a reasonable step. I cannot think of any use case for copying non-RAW Smart Objects with identical underlying binary data (using New Smart Object via Copy) that could not be solved with simple layer duplication.
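The copy-on-write idea can be sketched as follows (an illustrative model with hypothetical names, not Photoshop's actual internals): copies share one reference-counted byte-stream, and a copy only gets its own private byte-stream at the moment it is edited.

```python
import hashlib

class BlobStore:
    """Content-addressed store: each unique byte-stream kept once."""
    def __init__(self):
        self._blobs = {}  # key -> (bytes, refcount)

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        blob = self._blobs.get(key)
        self._blobs[key] = (data, (blob[1] if blob else 0) + 1)
        return key

    def release(self, key: str):
        data, refs = self._blobs[key]
        if refs == 1:
            del self._blobs[key]
        else:
            self._blobs[key] = (data, refs - 1)

    def get(self, key: str) -> bytes:
        return self._blobs[key][0]

    def total_bytes(self) -> int:
        return sum(len(d) for d, _ in self._blobs.values())

class SmartObject:
    def __init__(self, store: BlobStore, key: str):
        self.store, self.key = store, key

    def copy(self):
        # Cheap: just bumps the refcount on the shared bytes.
        self.store.put(self.store.get(self.key))
        return SmartObject(self.store, self.key)

    def edit(self, new_data: bytes):
        # Copy-on-write happens here: this copy diverges.
        self.store.release(self.key)
        self.key = self.store.put(new_data)

store = BlobStore()
a = SmartObject(store, store.put(b"x" * 1000))
b = a.copy()                 # still one 1000-byte blob in the store
b.edit(b"y" * 1000)          # now two blobs: a's and b's diverged
print(store.total_bytes())   # 2000
```

For RAW Smart Objects the edit path would never trigger, since the embedded RAW bytes are immutable; only the per-copy Camera Raw settings would change.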