Visual Segmentation Of Documents Contained In Files
Filing date: June 17, 2023
Application number: 18/336888
The present invention is directed to apparatuses, systems, and methods that use artificial intelligence to facilitate document segmentation. Document segmentation is useful for isolating multiple documents that appear in, e.g., a single image so that each document can be processed individually. In one aspect of the inventive subject matter, a method of document segmentation using artificial intelligence (AI) includes the steps of:
receiving, by an AI system, a file comprising an image of a first document and a second document;
converting, by the AI system, the file into a tensor;
applying a deep learning model to the tensor to create a mask image, where the deep learning model has been trained using a training set of images having ground truth masks and where each image in the training set comprises at least two documents;
converting the mask image to a grayscale image;
applying thresholding to the grayscale image to create a black and white image;
applying image processing to the black and white image to identify a first white space and a second white space along with a first contour surrounding the first white space and a second contour surrounding the second white space, where the first contour comprises a first list of vectors that form a first closed shape around the first white space, where the second contour comprises a second list of vectors that form a second closed shape around the second white space, where the first white space has a first area and the second white space has a second area, and where the black and white image has a total area;
comparing the first area to the total area to get a first ratio and comparing the second area to the total area to get a second ratio;
comparing the first ratio and the second ratio to a threshold value;
upon determining the first ratio exceeds the threshold value, recording the first contour;
upon determining the second ratio exceeds the threshold value, recording the second contour;
identifying a first minimum bounding rectangle that surrounds the first contour and cropping the image according to the first minimum bounding rectangle to create a first processable image; and
identifying a second minimum bounding rectangle that surrounds the second contour and cropping the image according to the second minimum bounding rectangle to create a second processable image.
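The post-model steps recited above (grayscale conversion, thresholding, white-space detection, area-ratio filtering, bounding rectangle, and cropping) can be illustrated with a minimal sketch. It is not the claimed implementation and rests on several assumptions: the mask image is taken as already produced by the deep learning model; images are plain nested Python lists; a 4-connected flood fill stands in for contour detection; the area of a white space is approximated by its pixel count; and the bounding rectangle is axis-aligned. All function names and the default values of `cutoff` and `min_ratio` are hypothetical.

```python
from collections import deque

def to_grayscale(mask_rgb):
    """Convert an RGB mask (rows of (r, g, b) tuples) to 0-255 gray values."""
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in mask_rgb]

def threshold(gray, cutoff=127):
    """Binarize: 1 (white) where the gray value exceeds cutoff, else 0 (black)."""
    return [[1 if v > cutoff else 0 for v in row] for row in gray]

def white_regions(bw):
    """Return each 4-connected white region as a list of (row, col) pixels.

    Assumption: flood fill replaces the contour-finding step of the source.
    """
    h, w = len(bw), len(bw[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r0 in range(h):
        for c0 in range(w):
            if bw[r0][c0] == 0 or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, pixels = deque([(r0, c0)]), []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < h and 0 <= nc < w and bw[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append(pixels)
    return regions

def bounding_boxes(bw, min_ratio=0.05):
    """Keep regions whose area / total area exceeds min_ratio.

    Returns axis-aligned boxes as (top, left, bottom, right), with bottom and
    right exclusive. Area is approximated by pixel count (an assumption).
    """
    total = len(bw) * len(bw[0])
    boxes = []
    for pixels in white_regions(bw):
        if len(pixels) / total > min_ratio:
            rows = [r for r, _ in pixels]
            cols = [c for _, c in pixels]
            boxes.append((min(rows), min(cols), max(rows) + 1, max(cols) + 1))
    return boxes

def crop(image, box):
    """Cut the sub-image described by box out of the original image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def segment(image, mask_rgb, cutoff=127, min_ratio=0.05):
    """Run mask -> grayscale -> black and white -> boxes -> one crop per document."""
    bw = threshold(to_grayscale(mask_rgb), cutoff)
    return [crop(image, box) for box in bounding_boxes(bw, min_ratio)]
```

Calling `segment(image, mask_rgb)` returns one cropped image per white space that passes the ratio test, so small specks in the mask are discarded before cropping. A production system would more likely rely on an image processing library for contour extraction and bounding rectangles; the pure-Python version here only makes each step explicit.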