Saturday, April 9, 2011

Fitting of Images Into Frames in Automated and Semi-Automated Page Layout Workflows

When developing workflows around InDesign Server or similar automated page layout systems, one of the recurring themes is the fitting of image material into frames on a page.

In this blog post I'll demonstrate a method for efficiently storing re-usable image fitting data. The image fitting data is resistant to image replacement or layout changes in the page layout.

This method can easily be implemented in ExtendScript/JavaScript or any other programming language.

I'll also show a corresponding user interface.

It all starts with an image and a frame. Most commonly, both are rectangular, but their shape is not necessarily related: the image might have a landscape orientation and the layout frame that the image needs to fit in might have a portrait orientation.

What I'll present in this article is (as far as I know) a novel way of encoding image fitting data into four numbers, in such a way that the fitting data becomes totally decoupled from the actual frame and image shapes. The same fitting data can be re-applied to images and frames of wildly different shapes, and always gives sensible results.

The fitting data is fairly straightforward to grasp intuitively - the four numbers I use are called (xFactor, yFactor, scale, rotation).

xFactor and yFactor range between -1.0 and 1.0, scale between 0.0 and 1.0, and rotation = 0, 1, 2 or 3.

xFactor = -1.0 means: frame at the extreme left side of image.
xFactor = 0.0 means: frame horizontally in the middle of the image
xFactor = +1.0 means: frame at the extreme right side of the image.

yFactor = -1.0 means: frame at the extreme top side of image.
yFactor = 0.0 means: frame vertically in the middle of the image
yFactor = +1.0 means: frame at the extreme bottom side of the image. 

scale = 1.0 means: frame as large as possible without going 'outside' the image
scale = 0.0 means: frame infinitely small

rotation = 0: image not rotated
rotation = 1: image rotated 90 degrees
rotation = 2: image rotated 180 degrees
rotation = 3: image rotated 270 degrees

Below are a few intuitive samples.

Keep in mind: in the samples below I'm continually using different images and/or frames - the main 'claim to fame' of the decoupled image fitting data is that no matter what image or frame shapes you throw at it, it always gives sensible results.

First two samples, both using the exact same fitting data - xFactor = 0.0, yFactor = 0.0, scale = 1.0, rotation = 0.
As you can see (0.0, 0.0, 1.0, 0) translates to: center the frame on the image, and make the frame as large as possible. It does not matter whether the image has portrait or landscape orientation, and it does not matter whether the frame has portrait or landscape orientation - the same fitting data always works.

(0.0, 0.0, 1.0, 0) is the default fitting data I most commonly use for images and frames that have not been edited by a user.

Now a sample image, and a differently shaped frame, this time using (0.0, 0.0, 0.5, 0) as the fitting data: the scale is 0.5 instead of 1.0. The frame shrinks down to half the largest possible size both horizontally and vertically:
Another sample image and frame, now using (-1.0, -1.0, 0.5, 0): the scale is the same as in the previous sample, but now the frame has moved to the top left corner.
Any fitting data with xFactor = -1.0 and yFactor = -1.0 always means: move the frame to the top left corner.

To help understand my image fitting method, think of the frame as a window through which we can see part of the image.

Imagine I float a rectangle in the same shape as the frame on the page 'on top' of the image. I'll combine two actions:

1) Shifting the frame around within the image
2) Resizing the frame so it covers more or less of the image
I'll initially assume the floating frame rectangle is not allowed to go 'outside' the image. It is forced to stay within the confines of the image.

With regards to the positioning of the floating frame: I will be encoding the position of the center of the frame with respect to the center of the image.

For the sake of argument, I'll first only consider the vertical position of the floating frame.

I observe that when the floating frame covers the full height of the image, it has no room to move vertically (otherwise it would escape the confines of the image).
(No vertical leeway: the frame cannot move up or down without leaving the confines of the image)

In essence, the amount of 'vertical leeway' is determined by the difference in height between the floating frame and the image. If the image and floating frame have the same height, the vertical leeway is zero. At the other extreme, if the floating frame height were reduced to zero, the vertical leeway would be the full height of the image.
(Vertical leeway is difference between image height and frame height)

When the floating frame is positioned in the center of the image, I can move it half the vertical leeway towards the top, and half the vertical leeway towards the bottom.
(When the frame is centered on the image we can move half the vertical leeway up, and half the vertical leeway down)

That brings me to one of the key numerical values: given an image and a frame, I calculate half the vertical leeway (in other words: I subtract the frame height from the image height and divide by two).

I then express the vertical position of the frame center with respect to the image center as a value between -1.0 and 1.0. When I multiply this value by half the vertical leeway, I get the frame's center offset along the y-axis in relation to the image center.

I do the same for the horizontal axis: encode the horizontal offset of the frame center from the image center as a value between -1.0 and 1.0. Multiplying this value by half the horizontal leeway gives me the frame center's horizontal offset from the image center.
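The two offset calculations above can be sketched as a small helper. This is a minimal sketch in plain JavaScript; the variable names follow the article, the function name `centerOffset` is my own:

```javascript
// Offset of the frame center from the image center, given the fitting
// factors. Frame and image sizes are in the same units, and the frame
// is assumed to fit inside the image. Negative x means 'to the left',
// negative y means 'towards the top'.
function centerOffset(xFactor, yFactor,
                      widthImage, heightImage,
                      widthFrame, heightFrame) {
  var horHalfLeeway = (widthImage - widthFrame) / 2.0;
  var verHalfLeeway = (heightImage - heightFrame) / 2.0;
  return {
    x: xFactor * horHalfLeeway, // -1.0 = extreme left, +1.0 = extreme right
    y: yFactor * verHalfLeeway  // -1.0 = extreme top,  +1.0 = extreme bottom
  };
}
```

For example, a 50 x 40 frame on a 100 x 80 image with (xFactor, yFactor) = (-1.0, -1.0) yields an offset of (-25, -20): the frame sits in the top left corner of the image.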

To encode the frame size, I first rescale the frame so it 'fits' the image in one of the two dimensions. I make the frame as large as possible without 'spilling out' of the image. When I do that, either the horizontal or vertical sizes of image and floating frame will be equal, and in the other dimension the frame size will be less than, or equal to the image size.

I call the 'equal' dimension 'the dominant dimension' - each time someone gives me an image and a frame, I can determine the dominant dimension for that particular (image, frame) pair - it's either horizontal or vertical (and occasionally both - in which case I pick either).
(The vertical dimension is dominant: the frame can be made to fit the image vertically while it remains narrower than the image horizontally)

I now encode the frame scaling as a number between 0.0 and 1.0 along the dominant dimension. If the scale is 1.0 the frame and image have the exact same size along the dominant dimension. If the scale is 0.5, the frame is half the size of the image along the dominant dimension. The non-dominant dimension is resized proportionally.
(scale = 0.5 means that the frame is half of the image size along the dominant dimension)

Finally, I need to encode image rotation: the image can be rotated underneath the frame. I encode that as an integer number from 0 to 3 (0 = not rotated, 1 = 90 degrees... ).

So, to make an image fit into a frame, I have the fitting data encoded as four numbers:

xFactor = -1.0 to 1.0 (floating point)
yFactor = -1.0 to 1.0 (floating point)
scale = 0.0 to 1.0 (floating point)
rotation = 0 to 3 (integer)
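Collected in code, the fitting data is just a small record. A minimal sketch in JavaScript - only the four fields come from the article; the `makeFittingData` helper and its range checks are my own:

```javascript
// Build a fitting data record, rejecting out-of-range values.
// xFactor, yFactor: -1.0 to 1.0; scale: above 0.0 up to 1.0 (a zero
// scale is never valid - it would cause a division by zero later on);
// rotation: integer 0 to 3 (quarter turns).
function makeFittingData(xFactor, yFactor, scale, rotation) {
  if (xFactor < -1.0 || xFactor > 1.0) throw new Error("xFactor out of range");
  if (yFactor < -1.0 || yFactor > 1.0) throw new Error("yFactor out of range");
  if (scale <= 0.0 || scale > 1.0) throw new Error("scale out of range");
  if (rotation !== 0 && rotation !== 1 && rotation !== 2 && rotation !== 3)
    throw new Error("rotation must be 0, 1, 2 or 3");
  return { xFactor: xFactor, yFactor: yFactor, scale: scale, rotation: rotation };
}

// The default fitting: frame centered on the image, as large as possible.
var defaultFit = makeFittingData(0.0, 0.0, 1.0, 0);
```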

I'll now work through a step-by-step procedure of using this fitting data.

Assume I'm given fitting information (a set of four numbers: (xFactor, yFactor, scale, rotation)) as well as an image with dimensions (widthImageUnrotated, heightImageUnrotated) and a frame in a page layout with dimensions (widthFrame, heightFrame).

If scale is 0.0, I don't even start the process - it would lead to a division by zero.

First I take into account the rotation. If the rotation is 1 or 3, I need to swap the image width and image height because the image is turned on its side through a quarter turn.

Based on rotation and (widthImageUnrotated, heightImageUnrotated) I define a new pair of rotation-adjusted image dimensions (widthImage, heightImage).
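In code, this first step is a simple swap. A sketch (the function name is my own; the dimension names follow the article):

```javascript
// Rotation-adjusted image dimensions: a quarter turn (rotation 1 or 3)
// puts the image on its side, swapping width and height.
function rotatedImageSize(widthImageUnrotated, heightImageUnrotated, rotation) {
  if (rotation === 1 || rotation === 3) {
    return { width: heightImageUnrotated, height: widthImageUnrotated };
  }
  return { width: widthImageUnrotated, height: heightImageUnrotated };
}
```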

The next step is determining the dominant dimension. I calculate two ratios:

horRatio = widthFrame / widthImage


verRatio = heightFrame / heightImage

The larger of these two ratios belongs to the dominant dimension.

For the sake of argument, I'll first assume horRatio turns out to be the larger of the two. That makes the horizontal dimension the dominant one. This means that I can make the frame 'fit' the image horizontally while leaving the frame height less than or equal to the image height.

In that case, the scale division factor that I apply to the image, imageScaleDiv, is determined as

imageScaleDiv = scale / horRatio

In the other case, where the verRatio turns out to be the larger of the two instead of horRatio, I would use

imageScaleDiv = scale / verRatio

I rescale the image by dividing the image dimensions by imageScaleDiv, so its new size becomes (widthImageScaled, heightImageScaled):

widthImageScaled = widthImage / imageScaleDiv


heightImageScaled = heightImage / imageScaleDiv

Normally, imageScaleDiv should not be zero at this point of the procedure - we don't allow for zero scale values.

I now calculate horizontal and vertical leeways:

horHalfLeeway = (widthFrame - widthImageScaled) / 2.0


verHalfLeeway = (heightFrame - heightImageScaled) / 2.0

Now I multiply these 'half leeways' by the xFactor and yFactor to find the offset of the image center compared to the frame center:

xOffset = xFactor * horHalfLeeway


yOffset = yFactor * verHalfLeeway

This concludes the procedure: I've taken a set of four numbers (xFactor, yFactor, scale, rotation), applied them to an image and a frame, and ended up with sufficient scaling information and positioning information to allow me to put the image into the frame.
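The whole procedure fits in a single function. A sketch in plain JavaScript - the formulas follow the article step by step, while the function and field names are my own:

```javascript
// Apply fitting data (xFactor, yFactor, scale, rotation) to an image
// and a frame. Returns the scaled image size plus the offset of the
// image center from the frame center.
function applyFitting(fit, widthImageUnrotated, heightImageUnrotated,
                      widthFrame, heightFrame) {
  // A zero scale would lead to a division by zero - don't even start.
  if (fit.scale === 0.0) {
    throw new Error("scale must not be zero");
  }
  // Step 1: rotation-adjusted image dimensions (quarter turns swap them).
  var quarterTurn = (fit.rotation === 1 || fit.rotation === 3);
  var widthImage = quarterTurn ? heightImageUnrotated : widthImageUnrotated;
  var heightImage = quarterTurn ? widthImageUnrotated : heightImageUnrotated;
  // Step 2: the larger ratio belongs to the dominant dimension.
  var horRatio = widthFrame / widthImage;
  var verRatio = heightFrame / heightImage;
  var imageScaleDiv = fit.scale / Math.max(horRatio, verRatio);
  // Step 3: rescale the image by dividing its dimensions by imageScaleDiv.
  var widthImageScaled = widthImage / imageScaleDiv;
  var heightImageScaled = heightImage / imageScaleDiv;
  // Step 4: half leeways, then the center offsets.
  var horHalfLeeway = (widthFrame - widthImageScaled) / 2.0;
  var verHalfLeeway = (heightFrame - heightImageScaled) / 2.0;
  return {
    widthImageScaled: widthImageScaled,
    heightImageScaled: heightImageScaled,
    xOffset: fit.xFactor * horHalfLeeway,
    yOffset: fit.yFactor * verHalfLeeway
  };
}
```

For example, fitting a 200 x 100 landscape image into a 100 x 100 frame with (-1.0, 0.0, 1.0, 0) scales the image to 200 x 100 (vertical is dominant) and shifts the image center 50 units to the right of the frame center, so the frame shows the left edge of the image.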

The power of this encoding becomes apparent when I retain only the fitting information, and apply the same set of four numbers to a completely different image and/or frame: it still works, and in most cases, the 'fit' is 'natural'.

So, if I have a person's portrait in a frame, and swap it for a different portrait, the fit will most probably still be correct - most headshots share the same basic proportions. Here's an example where the two images are very different, yet re-using the same fitting data works well:
(fitting info: (0.0, 0.0, 0.75, 0) )
(fitting info: (0.0, 0.0, 0.75, 0). The same fitting info as the previous image. Even though the image shape is quite different, the same fitting info still works well)

Another advantage is that I can tie the image fitting info to the user interface, and get something that acts rather 'naturally'.

If I position the cropping frame, say, in the top left corner and change the scale, the cropping frame will remain in the top left corner and extend down and to the right.
(fitting info: (-1.0, -1.0, 0.5, 0) )
(fitting info: (-1.0, -1.0, 0.75, 0). Changing the scale from 0.5 to 0.75 keeps the frame 'stuck' in the corner)

If I position the cropping frame dead center and change the scale: it will grow and shrink while remaining centered.
(fitting info: (0.0, 0.0, 0.5, 0) )
(fitting info: (0.0, 0.0, 0.75, 0). Changing the scale from 0.5 to 0.75 keeps the frame 'stuck' in the center)

When an automated or semi-automatic workflow has some image fitting operation in it, storing the image fitting data as a set of four numbers, and applying them as described here increases the chances that the fitting data will continue to give sensible results even when the image is replaced with another, or the frame is resized later on (e.g. when a template is adjusted). The fitting data is fully decoupled from image and frame dimensions.

The method described here also remains usable outside the limitations I originally set out.

By allowing xFactor and yFactor to go outside of the interval [-1.0, 1.0] I can allow for frames that only partially overlap the image.

Similarly, scale factors outside of the interval ] 0.0, 1.0 ] can be used too - I could allow negative scale factors or scale factors larger than 1.0. However, a zero scale factor will never give sensible results - it results in a division by zero.

Finally, the rotation can be allowed to be a floating point number in the interval [0.0, 4.0[ instead of an integer 0 - 3. That way, I can rotate over arbitrary angles - e.g. a rotation of 0.5 would translate into a rotation of the image of 45 degrees.
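The fractional-rotation extension is a simple linear mapping. A sketch (the function name is my own):

```javascript
// Extension: fractional rotation values map linearly onto degrees,
// so rotation = 0.5 becomes a 45 degree turn. The modulo keeps the
// value in the [0.0, 4.0[ interval.
function rotationToDegrees(rotation) {
  return (rotation % 4.0) * 90.0;
}
```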

The screen shots that accompany this blog post demonstrate the related user interface - the screen shots are from an AJAX implementation I made for a real-life project, using Google Web Toolkit (GWT) running in an ordinary web browser.

I found it helpful to make the scale slider show the effective image resolution (instead of the underlying scale factor).

In general it works like this: initially, in an automated setup, I always start with a default fitting of (xFactor, yFactor, scale, rotation) = (0.0, 0.0, 1.0, 0). That centers the frame onto the image, and makes the frame as large as possible, cropping away part of the image if necessary.

When the user edits the image fitting through my user interface, I store the resulting fitting parameters (xFactor, yFactor, scale, rotation) alongside the layout (e.g. the script labels are a great spot to put this).

When the layout or image changes, I retrieve the fitting data and re-apply the fitting data to the new image or the modified frame.

It works really well for me - hope it does for you too!

(c) 2011 Kris Coppieters - Rorohiko Ltd.