How to Clean Up Scanned Engravings and Old Photographs
Please read these notes if you are helping out, either by processing images someone scanned for you or by scanning images yourself.
If you want to practice, ask Liam for some sample files (liam at holoweb dot net; mention the colour of your socks and/or include a picture of your ankles)
I can also send you some images that I have scanned and not yet processed. Please note that they are usually between 15 Megabytes and 100 Megabytes each. Yes, that's the compressed size.
I'll pay most attention to The Gimp, an open source image editing tool. If you are using proprietary software such as PaintShop Pro or Adobe Photoshop, the steps are very similar, and I'll try and point out differences.
1. Saving the Scan
Save the scanned image in a lossless format such as PNG or TIFF. Do not use JPEG (which introduces slight distortions, and is called lossy because it loses certain kinds of detail to make the file smaller) and do not use GIF (which loses colour information).
If you aren't sure, use PNG.
If your scanning software imports the image into a program such as PaintShop Pro, GIMP or Adobe PhotoShop, remember to quit the scanning tool and save the images fairly often, as at 1200 dpi they can use a lot of memory!
Scanned images of line engravings at less than 1200 dpi are very hard to work with; it's much better to scan in greyscale at 1200dpi than colour at 300dpi.
If the optical resolution of your scanner is less than 1200dpi, use the maximum optical resolution. Send a single sample result before spending a lot of time on it, though. Anything below 600 dpi often isn't worth the effort for engravings, but may work fine for photographs.
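To get a feel for the sizes involved, here is a rough sketch of how big a scan is before compression. The function name and the 8 by 10 inch page are made up for illustration, and it assumes 8 bits per channel; an image editor will generally need more than this while you work.

```python
def raw_scan_bytes(width_in, height_in, dpi, channels):
    """Uncompressed size in bytes, assuming 8 bits (1 byte) per channel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels

# A hypothetical 8 x 10 inch page scanned three different ways.
grey_1200 = raw_scan_bytes(8, 10, 1200, channels=1)
colour_300 = raw_scan_bytes(8, 10, 300, channels=3)
colour_1200 = raw_scan_bytes(8, 10, 1200, channels=3)

for label, size in [
    ("greyscale, 1200 dpi", grey_1200),
    ("colour, 300 dpi", colour_300),
    ("colour, 1200 dpi", colour_1200),
]:
    print(f"{label}: {size / 1_000_000:.1f} MB")
```

Greyscale at 1200 dpi is a third of the size of colour at the same resolution, which is one more reason to prefer it for line engravings.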
2. Colour Correction: Overview
My scanner tends to have two problems: the scan is too light, and too red. Although it's sometimes possible to correct for these errors by calibrating a scanner, the results are rarely perfect, and you'll probably have to do colour correction for every image. Americans can do color correction if they prefer!
Figure 2.1. Part of a scanned engraving
Figure 2.1 shows a detail of an engraving from 1845, scanned at 600 dpi. Although it's hard to tell from this detail, there are three immediate problems.
- Range: neither true black nor true white is present;
- Red cast: everything is too red;
- Resolution too low: there isn't enough space between the lines to do a good job of cleaning up this image. We'll do our best though!
We'll fix the first two problems one at a time. The third isn't one we can do anything about except scan the image again!
3. Correcting Range of Levels
You need to get the image so that it has both black and white in it before fixing up other problems. If you don't do this first, you'll risk losing details when you do other transformations such as rotating.
In The Gimp, right-click on the image and from the Layer menu choose Colours; you can tear this menu off by clicking on the dotted line, which is useful as we'll be using it a lot. In older versions of the program, the Colours menu might be under Image instead of Layer, although the items within it mostly apply to the current layer.
From the Colours menu you can choose Levels, which brings forth a window with a graph in it, as shown in Figure 3.1.
Figure 3.1. Levels Dialogue
Pressing the Auto button in the Levels dialogue will expand the levels; you can use the preview feature to see the effect in the main window. If the result is too dark or too light, you can move the central small triangle just under the curve to one side or the other.
If Auto doesn't seem to do anything useful for your image, click Cancel and go on to the next step.
PaintShop Pro users: use either Stretch Histogram or Auto Contrast Adjustment to do roughly the same thing.
Figure 3.2. After Adjusting Levels
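If you are curious what expanding the levels amounts to, here is a minimal sketch on a handful of grey values. It is only an approximation of the idea: the real Auto button may well pick its end points differently, and the function name and sample values are invented. The gamma parameter stands in for the central triangle.

```python
def stretch_levels(pixels, gamma=1.0):
    """Map the darkest value to 0 and the lightest to 255.

    gamma plays the role of the middle triangle: values above 1.0
    lighten the midtones, values below 1.0 darken them.
    """
    low, high = min(pixels), max(pixels)
    if high == low:
        return list(pixels)  # flat image: nothing to stretch
    out = []
    for p in pixels:
        t = (p - low) / (high - low)   # position between darkest and lightest
        t = t ** (1.0 / gamma)         # midtone adjustment
        out.append(round(t * 255))
    return out

washed_out = [70, 100, 160, 220]       # no real black, no real white
print(stretch_levels(washed_out))      # [0, 51, 153, 255]
```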
4. Correcting A Colour Cast
If the image is too red, you have two choices: reduce the red (which will make the image darker, and may lose detail in the darker areas) or increase the blue and green. This works because we have an RGB three-channel image; if you have a CMYK image in PhotoShop it's easier to use the PhotoShop colour correction tools.
Problem | Fix |
---|---|
image too red | increase blue and green |
image too blue | increase red and green |
image too green | increase red and blue |
Although you can use colour balance controls, you can get a finer control using the Curves dialogue. You'll find this in the Layer/Colours menu, either by right-clicking on the image or using the Layer menu at the top of the image window.
Tip: you can turn the menu bar in individual image windows on or off by going to the main GIMP palette window and choosing File/Preferences; it's under Image Windows/Appearance, at least in Gimp 2.2.
Figure 4.1 shows the curves dialogue. Notice that at the top of the window I've selected the Blue channel, so that we're only affecting the level of blue in each pixel. If the levels in your window are hard to see, make the dialogue box larger. If it's still flat, make sure the filename is right (if you're editing more than one file). Another common mistake is to have an active selection, as then the Curves dialogue only affects the pixels you selected. Use Select/None first if you're not sure.
A grey pixel is made up of equal amounts of red, blue and green, so you can examine a pixel that should be grey with the eye-dropper tool (shortcut: the lower-case letter o) and see what the levels actually are. In my case the number next to red is always largest, but the difference is bigger for light colours than for dark ones, even after adjusting levels as in the previous section.
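The eye-dropper check can be written down as a tiny sketch. The function name, the tolerance of 3 and the sample reading are all assumptions for illustration, and it only covers the three single-channel casts in the table above.

```python
def diagnose_cast(red, green, blue, tolerance=3):
    """Given eye-dropper values for a pixel that should be grey,
    name the strongest channel and the channels to raise."""
    levels = {"red": red, "green": green, "blue": blue}
    strongest = max(levels, key=levels.get)
    others = [name for name in levels if name != strongest]
    if all(levels[strongest] - levels[name] <= tolerance for name in others):
        return None, []              # close enough to neutral grey
    return strongest, others

# A light "grey" sampled from a reddish scan (made-up numbers).
print(diagnose_cast(205, 188, 184))  # ('red', ['green', 'blue'])
```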
Figure 4.1. Blue Curves
Since my image is too red, I increase both the blue and the green curves. It's important to make sure that you don't lose any details you want when you do this, although losing the grain of the paper is just fine.
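For the curious, a curve is really just a lookup table from old values to new ones. The sketch below joins the control points with straight lines, whereas the Curves dialogue draws a smooth curve, so treat it as an approximation; the control points are hypothetical, and the list must start at input 0 and end at input 255.

```python
def curve_lut(points):
    """Build a 256-entry lookup table from (input, output) control points,
    joining them with straight lines."""
    points = sorted(points)
    lut = []
    for value in range(256):
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= value <= x1:
                t = (value - x0) / (x1 - x0)
                lut.append(round(y0 + t * (y1 - y0)))
                break
    return lut

# Hypothetical blue curve: dark values barely move, light ones are raised more.
blue_lut = curve_lut([(0, 0), (64, 66), (184, 205), (255, 255)])

def apply_to_blue(pixel, lut):
    """Return the pixel with only its blue value passed through the curve."""
    red, green, blue = pixel
    return (red, green, lut[blue])

print(apply_to_blue((205, 188, 184), blue_lut))  # (205, 188, 205)
```

A second table built the same way would do the job for the green channel.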
5. Save Memory
Scanned images can use a lot of memory, and it's not unusual to see The Gimp saying it's using 500 Megabytes or more for a single image. If this is close to or more than the amount of physical RAM in your computer (not disk space but memory), things will go very slowly.
The main Gimp window has a File menu with Preferences, and under Preferences/Environment you will find the Tile Cache Size; I use 100 Megabytes for this, but you may want to experiment.
To save memory, you can go to the main Gimp window and find the Undo History palette. You can make Gimp show it with the Edit/Undo History menu item. At the bottom right of the Undo History palette there's a button (in Gimp 2.2 and later) that will throw away all the undo history, and this can save a lot of memory.
If things are still too slow, save the image in Gimp's native xcf format, for example as tmp.xcf, quit Gimp and start it again on the new image file. This works with all image editors (except that the name of the native format changes from program to program of course, e.g. use tmp.psp for PaintShop Pro).
Once you've rotated and resized the image, you'll need a lot less memory.
6. Note on Screen Prints
Photographs are often reproduced with a screen made of lots of little dots of varying size. Our eye doesn't see the individual dots, unless (as in cheap newspapers) they are very large. But the scanner will see them!
Use Filters/Generic/Convolution matrix to smooth these. Put a 1 in every box in the matrix (you can use the tab key to go from box to box) except at the corners, where I use 0.3 instead. Choose Automatic for the weighting, or add all the numbers up yourself and enter the total as Divisor if you like! (it should be 22.20 if you use the values I described).
On the right, it should have Red, Green and Blue channels ticked (or Grey for a greyscale image), and Border should say Extend.
This is a controlled blur. You will probably need to do it twice.
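Here is a small sketch of what that matrix does, in case the arithmetic helps. It assumes a 5 by 5 matrix (which is what gives a total of 22.20) and a greyscale image held as a list of rows; the function names and the checkerboard test pattern are invented. Edge pixels are repeated past the border, which is my reading of Extend.

```python
def descreen_kernel(size=5, corner=0.3):
    """size x size matrix of 1s with the four corners set to `corner`."""
    kernel = [[1.0] * size for _ in range(size)]
    for row in (0, size - 1):
        for col in (0, size - 1):
            kernel[row][col] = corner
    return kernel

def convolve_extend(image, kernel):
    """Convolve a greyscale image with the kernel, dividing by the kernel
    total and repeating edge pixels past the border."""
    height, width = len(image), len(image[0])
    half = len(kernel) // 2
    divisor = sum(sum(row) for row in kernel)
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            total = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    sy = min(max(y + ky - half, 0), height - 1)
                    sx = min(max(x + kx - half, 0), width - 1)
                    total += weight * image[sy][sx]
            out_row.append(round(total / divisor))
        result.append(out_row)
    return result

kernel = descreen_kernel()
divisor = sum(sum(row) for row in kernel)
print(f"{divisor:.2f}")                    # 22.20

# An 8 x 8 checkerboard standing in for very coarse screen dots.
dots = [[0, 255] * 4, [255, 0] * 4] * 4
smoothed = convolve_extend(dots, kernel)
print(smoothed[4])
```

After one pass the pure black and pure white dots have all turned into mid greys, though you can still see a faint pattern, which is why a second pass is usually needed.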
After that, try Filters/Blur/Selective Gaussian Blur with a threshold (delta) of maybe 15 to 20, to avoid blurring across sharp edges.
You might also resize the image down to 50% and then do the Selective Gaussian Blur.
Sometimes you see regular patterns appearing over the image that aren't in the photo. This happens when you scan at too low a resolution, or if you rotate the image before blurring out the dots. So do this before you rotate. If that doesn't help, you may need to scan the image again, taking care to have it exactly square on the scanner.
7. Rotate the Image
It's unusual for an image to be perfectly level when it is scanned. For one thing the original printing press often wasn't perfectly lined up, and sometimes sheets went through a little wonky, and sometimes the actual plate itself wasn't perfectly square.
Use the Rotate tool to fix this. If you're using PaintShop Pro 8 or older, you will want to switch to the Gimp for this step! With the rotate tool selected, choose Backward (corrective), Interpolation: Cubic and Preview: Grid. You should get a grid over the image; if not, click on the middle of the image. If you can't see the middle of the image, press Control-Shift-E to scale the image view down so it all fits in your window.
PhotoShop users: right-click on the triangle at the lower-right corner of the Eyedropper tool in the Tools palette to get the Ruler tool. Click on one point of something that should be horizontal, and, keeping the left mouse button pressed, drag to another point on the same line. PhotoShop will draw a guideline for you; now if you choose Image/Rotate Canvas/Arbitrary, the angle will already be filled in for you and you can click OK. I often use the caption at the bottom of the image, if it was included in the scan, although in some books the plate for the image isn't perfectly straight on the page, and does not line up with the text.
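The angle worked out from two points on a line that should be horizontal is simple trigonometry, sketched below. The function name and the coordinates are made up; the positions are in pixels with y growing downwards, as in most image editors.

```python
import math

def correction_angle(x1, y1, x2, y2):
    """Degrees by which the line from (x1, y1) to (x2, y2) departs from
    horizontal, in image coordinates where y grows downwards."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1))

# Hypothetical caption baseline: 4000 pixels long, 35 pixels lower at the right.
angle = correction_angle(200, 5100, 4200, 5135)
print(f"{angle:.2f} degrees")
```

A positive result means the right-hand end of the line is lower than the left. Even half a degree is easy to see on a large plate, so it is worth measuring over as long a line as you can find.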
Some Window Managers, such as sawfish, let you drag a window so that the title bar is off the top of the screen. This saves a lot of space, and can be more useful than the full screen mode. Make sure you know how to drag the window back, usually by holding the Alt key down while you drag the window. For Metacity you may need to configure a Super key in the Gnome Keyboard control panel.
Now that you have the grid visible you'll need to drag it so that it lines up with the image. You can zoom in using the plus and minus (+ and -) and you can zoom straight to 100% with the digit one (1). It's often easiest to choose the bottom of the image, or some printed text beneath the image, and line the grid up with that. Change the Grid line spacing or the Number of grid lines in the gimp main window until a line aligns perfectly with something in the image.
When the grid is rotated so it lines up with the bottom of the image, press Rotate and wait for a while!
When it finishes, use View/Shrink Wrap (or Control-Shift-E) to see the whole picture, and satisfy yourself it's straight.
If you're unhappy, press Undo (Control-z) and try again!
Remember, you only need to get the bottom straight at this point; if the sides or top are crooked, we'll deal with that later.
Once you're happy, use Image/Flatten Image to bring the image back to a single layer and save memory. This is also a good time to save a copy, and to throw away the Undo memory, as described in section 5 (Save Memory).
8. Skew and Perspective
If the image isn't square, but you think that it should be square, you can fix it. This often happens if the book didn't sit flat on the scanner, for example if it's too large.
Use the Crop tool to make a rectangle around the image as close to it as possible without cutting off any image. You can zoom in to full size (e.g. with the View menu or by pressing the one (1) key).
This will show you clearly whether both sides of the image are slanting in the same direction (skew) or whether the image is wider at one end than the other.
If it's skewed, use the Shear tool. If one end is wider than the other, use the perspective tool. The approach is the same in either case.
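The choice between the two tools can be sketched as a comparison of the four corner positions. This is only an illustration: the function name and the tolerance of 2 pixels are assumptions, it expects the bottom edge to be level already, and it only compares the top against the bottom.

```python
def classify_distortion(top_left, top_right, bottom_left, bottom_right, tolerance=2):
    """Each corner is an (x, y) pixel position read off the image.
    Assumes the bottom edge has already been levelled by rotating."""
    top_width = top_right[0] - top_left[0]
    bottom_width = bottom_right[0] - bottom_left[0]
    if abs(top_width - bottom_width) > tolerance:
        return "perspective"     # one end is wider than the other
    left_lean = top_left[0] - bottom_left[0]
    right_lean = top_right[0] - bottom_right[0]
    if abs(left_lean) > tolerance and abs(right_lean) > tolerance:
        return "shear"           # both sides slant the same way
    return "square"

# Both sides lean 40 pixels to the right: use the Shear tool.
print(classify_distortion((40, 0), (1040, 0), (0, 1400), (1000, 1400)))
# The top is 40 pixels narrower than the bottom: use the Perspective tool.
print(classify_distortion((20, 0), (980, 0), (0, 1400), (1000, 1400)))
```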
Choose the right tool, and click the image to get a grid. Change the mode to Corrective as you did with Rotating earlier.
Line the edges of the grid up with the edges of the image, as best you can. You can change the number of grid lines (or the grid spacing) to get lines near the edge if necessary.
After this, you will want to flatten the image.
8. Strengthen Those Lines
If you are working on an engraving, you will probably find there are a lot of grey areas where the lines were close together. Assuming you used at least 1200dpi for the scan, and preferably 1600 or 2400, there should be at least four or five pixels between the lines, which should be at least two or three pixels wide.
If that is the case, the Gimp has a useful filter: right-click on the image and choose Filters, Distorts, and then Value Propagate. Set the type to more black and the amount to about a third of a pixel. You can also play with the upper and lower thresholds, but I am not sure they work properly. This filter can take several minutes on a large image, but afterwards you can be aggressive with curves and make the grey areas all become white. An alternative is to use Gaussian Blur with a radius of 1.5 to make the dark areas larger. The point of this is that when you make the image brighter the edges of the dark lines get eaten away, so you need to strengthen them first.
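To show why strengthening has to come first, here is a toy example on a single row of pixels crossing one thin line. This is not the Gimp's Value Propagate algorithm, just a stand-in I made up that moves each pixel part of the way towards its darkest neighbour; the cutoff of 180 and the pixel values are invented too.

```python
def darken_towards_neighbours(row, amount=0.33):
    """Move each pixel part of the way towards its darkest neighbour."""
    out = []
    for i, value in enumerate(row):
        neighbours = row[max(i - 1, 0):i + 2]
        darkest = min(neighbours)
        out.append(round(value + (darkest - value) * amount))
    return out

def brighten(row, cutoff=180):
    """Crude stand-in for an aggressive curve: anything lighter than
    the cutoff becomes pure white."""
    return [255 if value > cutoff else value for value in row]

# One thin engraved line on greyish paper, seen in cross-section.
profile = [230, 200, 60, 200, 230]

print(brighten(profile))                             # [255, 255, 60, 255, 255]
print(brighten(darken_towards_neighbours(profile)))  # [255, 154, 60, 154, 255]
```

Without strengthening, the soft edges of the line are wiped out and only one dark pixel survives; with it, the line keeps its width.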
9. Cleanup
Now you have an image that's the right way up, and that has a good range of colour from black all the way to white. Crop it as close to the image as you can, and check the colour levels again (in case you just cropped out all the black, for example!)
You may want to experiment with Curves to try to get rid of as much of the paper noise as possible without losing any of the black ink. The same shape curve that I drew in Figure 4.1 is a good thing to try. You can also try Filters/Unsharp Mask with a threshold of maybe 15 to 30.
If you scanned at 1200 dpi and the image is bold, not too fine, you might be able to use the Magic Wand tool to select all the light coloured regions, including between the lines. Shrink the selection by 1 or 2 pixels (not from the image edge though) and then feather the selection (Select/Feather) by 2 or 3 pixels before either using a bucket fill or Cut (control-x), so that the transition isn't too abrupt.
Use the Clone tool (it has an icon of a rubber stamp in the Gimp) to get rid of hairs, bits of old leather or dirt, and other obvious blemishes.
I find this part is usually the most time-consuming, and if you want to send me what you've got up to the start of this stage and have me finish, that's OK, you'll still have helped out!
Your goal in this stage is a clean image with sharp dark lines on a white background.
I sometimes also resize the image larger by 200% (double both width and height) and then use the Gimp Value Propagate filter set to more black (maybe an amount of 18% and with an upper threshold of 50% or so) to strengthen the lines of an engraving before resizing the image down again.
10. Resize and Save
Make sure you have saved a copy of the cleaned-up image before you resize it. If the image is close to monochrome, you can set it to be greyscale (Image/Mode/Greyscale), which saves a lot of memory, makes smaller files, and also helps people download the images more quickly.
If the image is close to a 3:4 ratio (height:width) I usually try to make a 1600 by 1200 pixel version, because people like to use those for their computer screen backgrounds.
If not, I resize down so that the smallest dimension is about a thousand pixels. If that makes any dimension bigger than about 1600 pixels, I resize down so the largest dimension is about a thousand pixels instead. Then I use a sharpen filter (often as much as 40%).
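The sizing rules above can be sketched as a small function. The name and the ratio tolerance of 0.03 are my own assumptions (I have never measured what "close to" means), and it expects a scan that is larger than the target so that it always resizes down.

```python
def web_size(width, height, ratio_tolerance=0.03):
    """Suggest a (width, height) for the large web version of an image."""
    # Close to 3:4 (height:width)? Use the popular desktop size.
    if abs(width / height - 4 / 3) <= ratio_tolerance:
        return 1600, 1200
    # Otherwise make the smallest dimension about 1000 pixels ...
    scale = 1000 / min(width, height)
    # ... unless that pushes the other dimension past about 1600,
    # in which case make the largest dimension about 1000 instead.
    if max(width, height) * scale > 1600:
        scale = 1000 / max(width, height)
    return round(width * scale), round(height * scale)

print(web_size(8000, 6000))   # (1600, 1200)
print(web_size(6000, 5000))   # (1200, 1000)
print(web_size(9000, 3000))   # (1000, 333)
```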
I then save as JPEG, with a descriptive filename such as 0841-Prudhoe-castle-northumberland-1x1.jpg for plate 841. The 1x1 at the end will turn into the actual image size automatically when the image goes online.
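If you have a lot of plates to name, a helper along these lines might save some typing. It is a guess at the pattern from the one example above: the four-digit padding and the function name are assumptions.

```python
def plate_filename(plate_number, description):
    """Zero-padded plate number, hyphen-joined description, and the
    1x1 placeholder that is replaced by the real size later."""
    words = description.split()
    return f"{plate_number:04d}-{'-'.join(words)}-1x1.jpg"

print(plate_filename(841, "Prudhoe castle northumberland"))
# 0841-Prudhoe-castle-northumberland-1x1.jpg
```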
Images must be saved in the Gimp, not in PhotoShop. The file sizes are simply too large in PhotoShop. If you only have PhotoShop either download the gimp or send me PNG files instead.
If you save as JPEG, use Gimp's preview feature. Click on Advanced Options (Gimp 2.2 or later) and set the DCT Method to Floating-Point, choose Optimise, Progressive and Force baseline JPG, but do not save EXIF data or a thumbnail.
Smoothing should usually be zero: it blurs the image slightly as it is saved, which can hide the JPEG noise but also loses the detail you carefully preserved!
Once you've chosen a quality level that's acceptable to you, using the slider at the top, and that leaves the resulting file under 500KBytes, go to the Comment box, and type quality=75% (where the 75 must be the actual JPEG Quality you used). The lower the number the worse the image, so I prefer values between 75 and 85 percent, but sometimes for a large image a lower value works OK and can make the file much smaller.
I then scale down again by proportion to 75% height and width, and repeat the process, sharpening as needed.
Undo the resize, and scale down to 56.25%, running sharpen again if needed, and perhaps using Curves to brighten or darken the image, as resizing sometimes affects the overall lightness.
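The 56.25% figure is simply 75% of 75%, applied to the full web version rather than to the 75% copy. A quick sketch of the resulting sizes, with a made-up function name and a 1600 by 1200 starting size as the example:

```python
def scaled_versions(width, height, factors=(1.0, 0.75, 0.5625)):
    """Pixel sizes of the full web version and the two smaller ones.
    Each factor applies to the full web version, not to the previous step."""
    return [(round(width * f), round(height * f)) for f in factors]

print(0.75 * 0.75)                   # 0.5625
print(scaled_versions(1600, 1200))   # [(1600, 1200), (1200, 900), (900, 675)]
```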
Finally, save a copy whose largest side is exactly 500 pixels, by going to Image/Resize and typing 500 into the larger of the Height and Width fields. This will be the preview image, and can be slightly lower quality. You can also add smoothing if you like. If you do, change the Comment box to say, for example quality=35% smoothing=10% for a smoothing value of 0.10 and a quality of 35%. Please make sure the 500-pixel version is well under 100 KBytes in file size.
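As a final sketch, here are the preview size and the Comment box text worked out in code. Both function names are invented; the comment format just follows the two examples given in these notes.

```python
def preview_size(width, height, longest=500):
    """Size of the preview image: the larger side becomes exactly `longest`."""
    scale = longest / max(width, height)
    if width >= height:
        return longest, round(height * scale)
    return round(width * scale), longest

def jpeg_comment(quality, smoothing=0.0):
    """Text for the Comment box. quality is the slider value (0-100);
    smoothing is the 0.0-1.0 smoothing value."""
    comment = f"quality={quality}%"
    if smoothing > 0:
        comment += f" smoothing={round(smoothing * 100)}%"
    return comment

print(preview_size(1600, 1200))   # (500, 375)
print(jpeg_comment(75))           # quality=75%
print(jpeg_comment(35, 0.10))     # quality=35% smoothing=10%
```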