Detect diffs of similar images in OpenCV-Python
What I want to do
For example, checking the appearance of office documents converted to PDF with different typesetting engines. I want an image that can serve as evidence and that a person can look at and understand at a glance. Also, since it is hard to look through every image, I output the degree of difference as a number, so that only the images with large differences need to be checked by a person.
What to prepare
- An environment in which Python 3 runs
The opencv-python package is unofficial, but it provides a ready-to-use OpenCV environment for Python, so I'll simply install that.
pip install opencv-python
The dependency numpy also comes in with it.
import pathlib

import cv2
import numpy as np

source_dir = pathlib.Path('source_img')
source_files = source_dir.glob('*.*')
target_dir = pathlib.Path('target_img')
result_dir = pathlib.Path('result_img')
result_dir.mkdir(exist_ok=True)  # make sure the output directory exists
log_file = result_dir / 'result.log'
kernel = np.ones((3, 3), np.uint8)

with open(log_file, mode='w') as fs:
    for source_file in source_files:
        source_img = cv2.imread(str(source_file))
        target_file = target_dir / source_file.name
        target_img = cv2.imread(str(target_file))
        # Skip pairs where either image could not be read
        if source_img is None or target_img is None:
            fs.write(str(target_file) + '...skipped.\n')
            continue

        # Pad both images onto black canvases of the same size
        max_height = max(source_img.shape[0], target_img.shape[0])
        max_width = max(source_img.shape[1], target_img.shape[1])
        temp_img = source_img
        source_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
        source_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img
        temp_img = target_img
        target_img = np.zeros((max_height, max_width, 3), dtype=np.uint8)
        target_img[0:temp_img.shape[0], 0:temp_img.shape[1]] = temp_img

        # Blend the two images to form the background of the evidence image
        result_img = cv2.addWeighted(source_img, 0.5, target_img, 0.5, 0)

        # Binarize the absolute difference (Otsu) and remove small noise
        source_img = cv2.cvtColor(source_img, cv2.COLOR_BGR2GRAY)
        target_img = cv2.cvtColor(target_img, cv2.COLOR_BGR2GRAY)
        img = cv2.absdiff(source_img, target_img)
        rtn, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
        img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

        # Outline the differing regions in red
        contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        result_img = cv2.drawContours(result_img, contours, -1, (0, 0, 255))

        # Score: total contour area divided by the area of the whole image
        score = 0
        for contour in contours:
            score += cv2.contourArea(contour)
        score /= max_height * max_width
        fs.write(target_file.name + ', ' + str(score) + '\n')

        diff_file = result_dir / source_file.name
        cv2.imwrite(str(diff_file), result_img)
For the comparison method, I referred to the article "Saizeriya's spot-the-difference puzzle was too hard, so I solved it with the power of an adult": http://kawalabo.blogspot.com/2014/11/blog-post.html
The score is simply the total area of the regions where a difference was detected divided by the area of the entire image. It is written to the log along with the file name.
test-1.png, 0.01231201710816777
test-2.png, 0.0084626793598234
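Since the goal is to have a person look only at the images with large differences, the log can be filtered by score. The following is a minimal sketch using only the standard library. The function name `files_to_review` and the threshold of 0.01 are assumptions for illustration, and a suitable threshold will depend on the documents being compared.

```python
import csv
import io


def files_to_review(log_lines, threshold):
    """Return (file name, score) pairs above the threshold, largest first."""
    flagged = []
    for row in csv.reader(log_lines, skipinitialspace=True):
        # Blank lines and "...skipped." entries have no score column.
        if len(row) != 2:
            continue
        name, score = row[0], float(row[1])
        if score > threshold:
            flagged.append((name, score))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# The two sample log lines shown above, held in memory instead of a file.
sample_log = io.StringIO(
    "test-1.png, 0.01231201710816777\n"
    "test-2.png, 0.0084626793598234\n"
)
flagged = files_to_review(sample_log, threshold=0.01)
print(flagged)
```

To run it against a real log, an open file object can be passed in place of `sample_log`, for example `open('result_img/result.log', newline='')`.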
January 7, 2020 Update
I made it possible to compare images even when they are of different sizes.
The example was a similar table, but the comparison target was slightly narrower and had a larger margin.
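The size handling can be sketched in isolation as follows: both images are copied into the top-left corner of a black canvas whose height and width are the larger of the two. The helper name `pad_to` and the tiny array sizes are assumptions chosen for this example, and only NumPy is needed to try it.

```python
import numpy as np


def pad_to(img, height, width):
    """Copy img into the top-left corner of a black canvas (hypothetical helper)."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    canvas[0:img.shape[0], 0:img.shape[1]] = img
    return canvas


# Two all-white stand-in "pages" with different sizes (rows, columns, channels).
source_img = np.full((4, 6, 3), 255, dtype=np.uint8)
target_img = np.full((5, 4, 3), 255, dtype=np.uint8)

max_height = max(source_img.shape[0], target_img.shape[0])
max_width = max(source_img.shape[1], target_img.shape[1])
source_padded = pad_to(source_img, max_height, max_width)
target_padded = pad_to(target_img, max_height, max_width)

print(source_padded.shape, target_padded.shape)
```

One consequence of this choice is that the padding is black. Where one page has white background and the other has only padding, the strip is likely to be detected as a difference and to raise the score. If that is not wanted, filling the canvas with the page's background color instead of zeros would be one option.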