Detecting differences between similar images with OpenCV-Python
What I want to do
For example, checking the appearance of office documents that were converted to PDF with different typesetting engines. I want an image that can serve as evidence and that a person can look at and understand at a glance. Also, because it is hard to look at every image, I output the degree of difference as a number, so that only the images with large differences need to be checked by a person.
What to prepare
- Environment in which Python3 runs
- OpenCV-Python
It is an unofficial package, but a ready-made OpenCV environment for Python is available, so I install it quickly with pip.
pip install opencv-python
NumPy, which it depends on, is installed along with it.
Python code
import pathlib

import cv2
import numpy as np

source_dir = pathlib.Path('source_img')
source_files = source_dir.glob('*.*')
target_dir = pathlib.Path('target_img')
result_dir = pathlib.Path('result_img')
log_file = result_dir / 'result.log'
# 3x3 kernel for the morphological opening that removes small noise
kernel = np.ones((3, 3), np.uint8)

# Make sure the output directory exists before opening the log file
result_dir.mkdir(exist_ok=True)

with open(log_file, mode='w') as fs:
    for source_file in source_files:
        source_img = cv2.imread(str(source_file))
        target_file = target_dir / source_file.name
        target_img = cv2.imread(str(target_file))
        # imread returns None when a file is missing or is not an image
        if source_img is None or target_img is None:
            fs.write(str(target_file) + '...skipped.\n')
            continue

        # Pad both images with black so that they have the same size
        max_height = max(source_img.shape[0], target_img.shape[0])
        max_width = max(source_img.shape[1], target_img.shape[1])
        temp_img = source_img
        source_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
        source_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img
        temp_img = target_img
        target_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
        target_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img

        # Blend the two images 50:50 as the background of the result image
        result_img = cv2.addWeighted(source_img, 0.5, target_img, 0.5, 0)

        # Grayscale difference -> Otsu threshold -> opening to drop noise
        source_img = cv2.cvtColor(source_img, cv2.COLOR_BGR2GRAY)
        target_img = cv2.cvtColor(target_img, cv2.COLOR_BGR2GRAY)
        img = cv2.absdiff(source_img, target_img)
        rtn, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
        img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

        # Outline the differing regions in red on the blended image
        contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE,
                                               cv2.CHAIN_APPROX_SIMPLE)
        result_img = cv2.drawContours(result_img, contours, -1, (0, 0, 255))

        # Score: total contour area divided by the area of the whole image
        score = 0
        for contour in contours:
            score += cv2.contourArea(contour)
        score /= max_height * max_width
        fs.write(target_file.name + ', ' + str(score) + '\n')

        diff_file = result_dir / source_file.name
        cv2.imwrite(str(diff_file), result_img)
For the comparison method, I referred to the article "The Saizeriya spot-the-difference puzzle was too hard, so I solved it with the power of an adult" (http://kawalabo.blogspot.com/2014/11/blog-post.html).
The score is simply the total area of the regions where a difference was detected, divided by the area of the whole image. It is written to the log together with the file name.
test-1.png, 0.01231201710816777
test-2.png, 0.0084626793598234
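To pick out only the images that a person should check, the log can be filtered by score. The following is a rough sketch that uses only the standard library; the function name `files_to_review`, the threshold of 0.01, and the third log line are assumptions made for this example, not values from the original workflow.

```python
def files_to_review(log_lines, threshold):
    """Return (name, score) pairs whose score is at least `threshold`.

    Lines that are not in the 'name, score' form (such as the
    '...skipped.' lines) are ignored.
    """
    flagged = []
    for line in log_lines:
        name, sep, value = line.strip().rpartition(', ')
        if not sep:
            continue
        try:
            score = float(value)
        except ValueError:
            continue
        if score >= threshold:
            flagged.append((name, score))
    # Largest difference first
    return sorted(flagged, key=lambda item: item[1], reverse=True)


log_lines = [
    'test-1.png, 0.01231201710816777',
    'test-2.png, 0.0084626793598234',
    'target_img/test-3.png...skipped.',  # hypothetical skipped entry
]
# 0.01 is an arbitrary threshold chosen for this example
print(files_to_review(log_lines, 0.01))
```

With this threshold only test-1.png is flagged. A suitable threshold depends on the documents being compared, so it is worth tuning it against a few images that were checked by eye.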
January 7, 2020 Update
I made it possible to compare images even when their sizes differ.
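The size handling in the script amounts to placing each image at the top-left corner of a black canvas that is as tall and as wide as the larger of the two. A small sketch of that idea with NumPy alone, where the helper name `pad_to` and the dummy images are assumptions for illustration:

```python
import numpy as np


def pad_to(img, height, width):
    """Place a BGR image at the top-left of a black canvas of the given size."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    canvas[0:img.shape[0], 0:img.shape[1]] = img
    return canvas


# Two dummy white images of different sizes (hypothetical data)
source_img = np.full((4, 6, 3), 255, dtype=np.uint8)
target_img = np.full((5, 3, 3), 255, dtype=np.uint8)

max_height = max(source_img.shape[0], target_img.shape[0])
max_width = max(source_img.shape[1], target_img.shape[1])
source_padded = pad_to(source_img, max_height, max_width)
target_padded = pad_to(target_img, max_height, max_width)
print(source_padded.shape, target_padded.shape)
```

Both results have the same shape, so they can be passed to `cv2.absdiff`. Note that the padding is black, so on pages with a white background the area that exists in only one image will itself show up as a difference.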
Execution example
Both images show a similar table, but in the comparison target the table is slightly narrower and the margins are larger.