Detect diffs of similar images in OpenCV-Python


What I want to do

For example, I want to check the appearance of office documents converted to PDF with different typesetting engines. I want an image that can serve as evidence and that a person can look at and roughly understand. Looking at every image is impractical, so I also output the degree of difference as a number, and only the images with large differences are checked by a person.

What to prepare

  • Environment in which Python3 runs
  • OpenCV-Python

It's unofficial, but an OpenCV package for Python is available, so I install it quickly with pip:

    pip install opencv-python

Its dependency numpy is installed along with it.

Python code
import pathlib
import cv2
import numpy as np

source_dir = pathlib.Path('source_img')
source_files = source_dir.glob('*.*')
target_dir = pathlib.Path('target_img')
result_dir = pathlib.Path('result_img')
log_file = result_dir / pathlib.Path('result.log')
kernel = np.ones((3, 3), np.uint8)

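# Create the output directory if needed so the log and diff images can be written.
result_dir.mkdir(exist_ok=True)

with open(log_file, mode='w') as fs:
    for source_file in source_files:
        source_img = cv2.imread(str(source_file))
        # The comparison target is the file with the same name in target_dir.
        target_file = target_dir / source_file.name
        target_img = cv2.imread(str(target_file))
        # cv2.imread returns None when a file is missing or is not a readable image.
        if source_img is None or target_img is None:
            fs.write(str(target_file) + '...skipped.\n')
            continue

        # Pad both images with black onto a canvas large enough for either one.
        max_height = max(source_img.shape[0], target_img.shape[0])
        max_width = max(source_img.shape[1], target_img.shape[1])

        temp_img = source_img
        source_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
        source_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img

        temp_img = target_img
        target_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
        target_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img

        # Blend the two images 50/50 as the background of the evidence image.
        result_img = cv2.addWeighted(source_img, 0.5, target_img, 0.5, 0)

        # Binarize the grayscale difference (Otsu) and remove small specks (opening).
        source_img = cv2.cvtColor(source_img, cv2.COLOR_BGR2GRAY)
        target_img = cv2.cvtColor(target_img, cv2.COLOR_BGR2GRAY)
        img = cv2.absdiff(source_img, target_img)
        rtn, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
        img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

        # Outline the differing regions in red (BGR) on the blended image.
        # Two return values as in OpenCV 4.x; CHAIN_APPROX_SIMPLE is assumed here.
        contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE,
                                               cv2.CHAIN_APPROX_SIMPLE)
        result_img = cv2.drawContours(result_img, contours, -1, (0, 0, 255))

        # Score = total contour area / image area.
        score = 0
        for contour in contours:
            score += cv2.contourArea(contour)
        score /= max_height * max_width
        fs.write(source_file.name + ', ' + str(score) + '\n')
        diff_file = result_dir / source_file.name
        cv2.imwrite(str(diff_file), result_img)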

The comparison method is based on the article "Saizeriya's spot-the-difference puzzle was too hard, so I solved it with the power of an adult".

The score is simply the area of the regions where a difference was detected divided by the area of the entire image. It is written to the log along with the file name.

test-1.png, 0.01231201710816777
test-2.png, 0.0084626793598234
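
To pick out only the images that a person should look at, the log can be filtered by score. The following is a sketch under assumptions: the function name `files_to_review` and the threshold of 0.01 are hypothetical, and the log is assumed to contain `name, score` lines plus `...skipped.` lines as written by the script.

```python
def files_to_review(log_lines, threshold):
    """Return (name, score) pairs whose score exceeds threshold, largest first."""
    flagged = []
    for line in log_lines:
        name, sep, value = line.strip().rpartition(', ')
        if not sep:
            continue  # e.g. "...skipped." lines carry no score
        try:
            score = float(value)
        except ValueError:
            continue  # ignore lines whose last field is not a number
        if score > threshold:
            flagged.append((name, score))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


log_lines = [
    'test-1.png, 0.01231201710816777',
    'test-2.png, 0.0084626793598234',
    'target_img/test-3.png...skipped.',  # hypothetical skipped entry
]
for name, score in files_to_review(log_lines, threshold=0.01):
    print(name, score)  # prints: test-1.png 0.01231201710816777
```

A suitable threshold depends on the documents being compared, so it is worth checking a few results by eye before settling on a value.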

January 7, 2020 Update

Images can now be compared even when they have different sizes.
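
The size handling amounts to placing each image at the top-left of a black canvas that is as tall as the taller image and as wide as the wider one. A minimal sketch of that step follows; the helper name `pad_to` and the dummy images are introduced only for illustration.

```python
import numpy as np


def pad_to(img, height, width):
    """Return img placed at the top-left of a black height x width canvas."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    canvas[0:img.shape[0], 0:img.shape[1]] = img
    return canvas


# Two dummy BGR images of different sizes (white and mid-gray).
a = np.full((4, 6, 3), 255, dtype=np.uint8)
b = np.full((5, 3, 3), 128, dtype=np.uint8)

height = max(a.shape[0], b.shape[0])
width = max(a.shape[1], b.shape[1])
a_padded = pad_to(a, height, width)
b_padded = pad_to(b, height, width)

print(a_padded.shape, b_padded.shape)  # prints: (5, 6, 3) (5, 6, 3)
```

Because the padding is black, any area that exists in only one of the two images is compared against black and can therefore show up as a difference.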

Execution example

test-1.png: The tables are similar, but the comparison target was slightly narrower and had a larger margin.

Updated on January 08, 2020