Dilate by One
Motivation
Dilate
Dilate is a function that accepts a black and white image. It is also known by the names "grow", "bolden", and "expand". It turns on pixels which were near pixels that were on originally, thereby thickening the items in the image.
Erode
Erode is the sister function to dilate. It is also known by the name "shrink". It turns off pixels which were near pixels that were off originally, thereby eating away at the edges of the items in the image.
Detecting Image "Snow"
Dilate and erode can be used in conjunction do detect and correct a common problem with digital and scanned images: snow. Snow is caused by bad pixels on the CCD of a digital camera or dust that gets onto a scanned image or negative. If we dilate and image, and then erode the image, we can see that holes in the image get filled in:
Original  After Dilate  After Dilate and Erode  




A typical algorithm to correct snow would usually be something like this:
Image correctSnow(Image image, int radiusOfSnow){
// Copy the image and convert the copy to black and white
Image blackAndWhiteImage = threshold(copy(image), THRESHOLD_VALUE);
// Copy the threshold and dilate and erode the copy
Image dilatedAndEroded = erode(
dilate(
copy(blackAndWhiteImage),
radiusOfSnow
),
radiusOfSnow
);
// Figure out which pixels were snow based on the difference
// between the thresholded image and the dilated and eroded image
Image snowPixels = diff(blackAndWhiteImage, dilatedAndEroded);
// Fix up any pixels in the original that were marked as snow by
// filling them with color of surrounding pixels
return blendSnowPixels(image, snowPixels);
}
As we will see, dilate and erode can be efficient, no matter how big the snow that you want to remove is.
The Challenge
Write a function that dilates an image. Assume that the image can be of arbitrary size (n by n). An image is represented by a two dimensional array of integers. The image data is binary (zeros and ones), black and white, no color or gray scale. Dilate should turn on any pixel that is touching a pixel in the north, east, south, or west direction (no diagonals) that is already turned on in the input.
This is how dilation is supposed to behave (Flip a couple bits and the press dilate):
First Try
When first presented with this problem most people make a few mistake related to bounds checking (off by one errors and such). Those problems are easily corrected. However the most common incorrect algorithm is that it is easy to step on ones toes.
// Incorrect solution  steps on its own toes
int[][] dilate(int[][] image){
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 1){
if (i>0) image[i1][j] = 1;
if (j>0) image[i][j1] = 1;
if (i+1<image.length) image[i+1][j] = 1;
if (j+1<image[i].length) image[i][j+1] = 1;
}
}
}
return image;
}
Second Try
The easiest way to correct this is to create a copy of the image to work with a copy of the image. This is perfectly correct solution, however it uses O(n^2) extra space. Notice that the copy of the image has to be initialized to all zeros and that the current pixel has to be copied over as well as the surrounding pixels.
// Correct, but creates a copy of the image which is inefficient
int[][] dilate(int[][] image){
int[][] imagecopy = new int[image.length][image[0].length];
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 1){
imagecopy[i][j] = 1;
if (i>0) imagecopy[i1][j] = 1;
if (j>0) imagecopy[i][j1] = 1;
if (i+1<image.length) imagecopy[i+1][j] = 1;
if (j+1<image[i].length) imagecopy[i][j+1] = 1;
}
}
}
return imagecopy;
}
Best Solution
The best solution is to store information about what is newly turned on (vs was on originally) in the pixels themselves. This solution uses the value of 2 to indicate something was newly turned on. Notice that before a pixel is flipped to a two, it must be checked to ensure that it wasn't a one first. Also notice that a second pass is made at the end to turn the twos back to ones. This does not effect the magnitude of the running time: The algorithm still runs in O(n^2) time.
// Best dilate by one solution
int[][] dilate(int[][] image){
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 1){
if (i>0 && image[i1][j]==0) image[i1][j] = 2;
if (j>0 && image[i][j1]==0) image[i][j1] = 2;
if (i+1<image.length && image[i+1][j]==0) image[i+1][j] = 2;
if (j+1<image[i].length && image[i][j+1]==0) image[i][j+1] = 2;
}
}
}
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 2){
image[i][j] = 1;
}
}
}
return image;
}
Dilate by K
Motivation
What we have written is dilate by one. This can be generalized into dilate by k. k will be defined in terms of Manhattan distance. In Manhattan, the distance between any two places is the number of blocks you have to walk to get there. You must travel in a stair step fashion as you cannot cut diagonally through buildings. The dilation by k will turn on all pixels that are within k Manhattan distance of a pixel that was on in the input.
Here is how dilation by k is supposed to work:
Nasty Solution
The temptation is to follow a similar algorithm to the one presented by dilate by one. In such a solution we would walk each point in the image. When a point that is on is found, travel k in each direction turning on pixels. While possible to write, such a solution would be fairly messy and run in O(n^2*k^2) time. Since k is bounded by n, that is really a O(n^4) solution. There are both much cleaner ways and much more efficient ways to write the algorithm.
// n^4 (very inefficient) solution for dilate by k
int[][] dilate(int[][] image, int k){
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 1){
for (int l=ik; l<=i+k; l++){
int remainingk = kMath.abs(il);
for (int m=jremainingk; m<=j+remainingk; m++){
if (l>=0 && m>=0 && l<image.length && m<image.length && image[l][m]==0){
image[l][m] = 2;
}
}
}
}
}
}
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 2){
image[i][j] = 1;
}
}
}
return image;
}
Easy solution
If you pressed the dilate button more than once in the dilate by one examples, you may have noticed that dilating by one k times, is equivalent to dilating by k. This suggests the super easy algorithm of calling what was already written k times. Notice that this runs in O(n^2*k) time. Since k is bounded by n, that is really O(n^3) time.
// easy to write (but n^3 inefficient) dilate by k solution
int[][] dilate(int[][] image, int k){
for (int i=0; i<k; i++){
image = dilate(image);
}
return image;
}
Efficient Solution
To find an O(n^2) solution to this problem, the problem needs to be framed differently. What if there were an oracle that told us (for every pixel) the Manhattan distance to the nearest pixel. Pixels that are on would get a zero from the oracle. Pixels next to them a one, and so on.
The Manhattan distance oracle:
If we had that oracle we could just threshold by the Manhattan distance.
// n^2 solution with Manhattan oracle
int[][] dilate(int[][] image, int k){
image = manhattan(image);
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
image[i][j] = ((image[i][j]<=k)?1:0);
}
}
return image;
}
Manhattan Oracle
In One Dimension
So the question just becomes how can we write the Manhattan oracle to run in O(n^2) time. The best way to approach this problem is to take it down a dimension. In a single dimensional array, for each point find the distance to the nearest "on" point.
There are two facts that help:
 The maximum distance to a pixel that is on is the length of the array
 A pixel is at most one further from a pixel that is on than the pixel before it
Those with eagle eyes will soon discover that if you walk the array, and find a pixel that is on, you can count up until you get halfway to the next pixel that is on. That suggests that a two pass solution: One pass to find the nearest pixel to the left and one pass to find the nearest pixel to the right. The distance to the nearest pixel will be the minimum of the pixel to the left and the pixel to the right. In cases in which the distance is unknown, it is safe to assume the maximum because that value will be corrected later.
// O(n) solution to find the distance to "on" pixels in a single dimension array
int[] distance(int[] arr){
// traverse forwards
for (int i=0; i<arr.length; i++){
if (arr[i] == 1){
// first pass and pixel was on, it gets a zero
arr[i] = 0;
} else {
// pixel was off
// It is at most the length of array
// away from a pixel that is on
arr[i] = arr.length;
// One more than the pixel to the left
if (i>0) arr[i] = Math.min(arr[i], arr[i1]+1);
}
}
// traverse backwards
for (int i=arr.length1; i>=0; i){
// what we had on the first pass
// or one more than the pixel to the right
if (i+1<arr.length) arr[i] = Math.min(arr[i], arr[i+1]+1);
}
return arr;
}
In Two Dimensions
Generalizing from one dimension, we have a similar solution, but now have pixels on four sides of us instead of two sides to which to pay attention. Again, we can make two passes. On the first pass, look north and west, adding one to each (use the minimum). On the second pass, look south and east, adding one to each (use the minimum). The maximum Manhattan distance that a pixel can be away from a pixel that is "on" is the sum of the width and the height of the image.
// O(n^2) solution to find the Manhattan distance to "on" pixels in a two dimension array
int[][] manhattan(int[][] image){
// traverse from top left to bottom right
for (int i=0; i<image.length; i++){
for (int j=0; j<image[i].length; j++){
if (image[i][j] == 1){
// first pass and pixel was on, it gets a zero
image[i][j] = 0;
} else {
// pixel was off
// It is at most the sum of the lengths of the array
// away from a pixel that is on
image[i][j] = image.length + image[i].length;
// or one more than the pixel to the north
if (i>0) image[i][j] = Math.min(image[i][j], image[i1][j]+1);
// or one more than the pixel to the west
if (j>0) image[i][j] = Math.min(image[i][j], image[i][j1]+1);
}
}
}
// traverse from bottom right to top left
for (int i=image.length1; i>=0; i){
for (int j=image[i].length1; j>=0; j){
// either what we had on the first pass
// or one more than the pixel to the south
if (i+1<image.length) image[i][j] = Math.min(image[i][j], image[i+1][j]+1);
// or one more than the pixel to the east
if (j+1<image[i].length) image[i][j] = Math.min(image[i][j], image[i][j+1]+1);
}
}
return image;
}
4 thoughts on “Efficiently Implementing Dilate and Erode Image Functions”
Thank you for the post. I enjoyed reading it. Just one comment. The following statement “If you pressed the dilate button more than once in the dilate by one examples, you may have noticed that dilating by one k times, is equivalent to dilating by k.” is correct for a rectangular structuring element, but it is not for a circular one, for example. And the circular one is also a popular choice…
Hi.
Great article. It is very useful for me.
But there is one mistake. When you write about complexity in O notation.
As I understood, n – is a one dimension of your image(array). And n^2 – is a square of it.
If it’s true, then you must write complexity O(n), if you’r algorithm pass throw(or copy) image only once.
Because O argument(n) describes all data, instead of only one side of image.
If algorithm passes throw the image with AxB size only once it means O(n) complexity. Not O(n^2).
If algorithm passes throw the image with AxB size (AxB)*(AxB) times it means O(n^2) complexity.
If you don’t understand or not agree, please reply and I’ll try to explain more clear.
It depends how your define
n
. I define it as ann
byn
image, so an image hasn^2
pixels.actually, his bigO analysis is correct. If you represent an image as a 2D grid, that means you’re running two nested loops. For NxN image, that is O(N^2)