dawsonia.table_detect.opencv_contours#
Module Contents#
Functions#
page_curvature_along_x_quad | A quadratic polynomial with coefficients. Coefficients should be determined by curve-fitting
page_curvature_along_x_cubic | A cubic polynomial with coefficients. Coefficients should be determined by curve-fitting
cluster_axis | Cluster bboxes along axis.
Data#
API#
- dawsonia.table_detect.opencv_contours.logger#
‘getLogger(…)’
- dawsonia.table_detect.opencv_contours.__all__#
(‘table_detect_opencv_contours’, ‘get_table_structure’, ‘cluster_axis’, ‘create_bbox_array’, ‘set_na…
- dawsonia.table_detect.opencv_contours.table_detect_opencv_contours(filtered_image: numpy.typing.NDArray, thresh_value: numpy.typing.NDArray, binary_tables: numpy.typing.NDArray[numpy.bool_], size_tables: dawsonia.typing.TableSizesGeneric, preproc_cfg: dawsonia.typing.PreprocConfig, original_image: numpy.typing.NDArray) tuple[dawsonia.typing.TablePositions, dawsonia.typing.TableSizes, int]#
- dawsonia.table_detect.opencv_contours.get_table_structure(expected_size_tables: dawsonia.typing.TableSizesGeneric, preproc_cfg: dawsonia.typing.PreprocConfig, filtered_image: numpy.typing.NDArray, filtered_image_inv: numpy.typing.NDArray, thresh_value: numpy.typing.NDArray, original_image: numpy.typing.NDArray, label_tables: numpy.typing.NDArray[numpy.int64], nb_labels: int, min_nb_pixels: int, sensibility: float, list_sizes: dawsonia.typing.TableSizes, list_positions: dawsonia.typing.TablePositions) None#
- dawsonia.table_detect.opencv_contours.create_bbox_array(bboxes: list[dawsonia.typing.BBoxTuple], column_labels: numpy.typing.NDArray[numpy.int64], column_bboxes_idxs: dict[dawsonia.typing.ClusterLabel, numpy.typing.NDArray[numpy.int64]], row_labels: numpy.typing.NDArray[numpy.int64], row_bboxes_idxs: dict[dawsonia.typing.ClusterLabel, numpy.typing.NDArray[numpy.int64]])#
- dawsonia.table_detect.opencv_contours.set_nan_to_odd_bboxes(all_bboxes)#
- dawsonia.table_detect.opencv_contours.fix_missing_bboxes(all_bboxes)#
- dawsonia.table_detect.opencv_contours.page_curvature_along_x_quad(xs, y_median, a0, a1, a2)#
A quadratic polynomial with coefficients a0, a1, a2. The coefficients should be determined by curve fitting.
- dawsonia.table_detect.opencv_contours.page_curvature_along_x_cubic(xs, y_median, a0, a1, a2, a3)#
A cubic polynomial with coefficients a0, a1, a2, a3. The coefficients should be determined by curve fitting.
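The docstrings above say the coefficients come from curve fitting but do not show the functional form. The sketch below is an illustration under assumptions, not the dawsonia implementation: `quad_curve` is a hypothetical stand-in that adds a quadratic offset to `y_median`, the data are synthetic, and the fit uses `numpy.polynomial.polynomial.polyfit` rather than whatever fitting routine the library actually calls.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def quad_curve(xs, y_median, a0, a1, a2):
    # Assumed form (not taken from the dawsonia source): a quadratic
    # offset added to the median y position of a row of bounding boxes.
    return y_median + a0 + a1 * xs + a2 * xs**2


# Synthetic bbox centres along a slightly curved row (made-up values).
xs = np.linspace(0.0, 1000.0, 21)
y_median = 500.0
true_coeffs = (2.0, -0.03, 3e-5)
ys = quad_curve(xs, y_median, *true_coeffs)

# Least-squares fit of the offset from the median. The coefficients are
# returned lowest order first, i.e. (a0, a1, a2).
a0, a1, a2 = P.polyfit(xs, ys - y_median, deg=2)

# Evaluate the fitted curve and measure how far it is from the data.
fitted = quad_curve(xs, y_median, a0, a1, a2)
max_error = float(np.max(np.abs(fitted - ys)))
```

A cubic variant would follow the same pattern with `deg=3` and one extra coefficient. On real scans the residual will not be zero, so `max_error` could serve as a rough check on whether the quadratic is adequate.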
- dawsonia.table_detect.opencv_contours.bboxes_from_contours(contours, area_range=(300, 10000), aspect_ratio_range=(0.05, 20))#
- dawsonia.table_detect.opencv_contours.cluster_axis(ref_image: numpy.typing.NDArray, bboxes: collections.abc.Sequence[dawsonia.typing.BBoxTuple], axis: Literal[0, 1] = 0, distance_threshold: float | None = None, min_cluster_size: int = 0) tuple[numpy.typing.NDArray[numpy.int64], dict[dawsonia.typing.ClusterLabel, numpy.typing.NDArray[numpy.int64]]]#
Cluster bboxes along axis.
Parameters
ref_image: NDArray Image for debugging
bboxes: Sequence[BBoxTuple] Bounding boxes to be clustered
axis: int Axis along which clustering should be done. 0 identifies columns and 1 identifies rows
distance_threshold: float | None Maximum distance in pixels between bbox centers within a cluster. This is the linkage distance threshold; clusters farther apart than this are not merged.
min_cluster_size: int Minimum number of bounding boxes required for a group to be considered a cluster
Notes
A reference for this method can be found at https://pyimagesearch.com/2022/02/28/multi-column-table-ocr/. See also sklearn.cluster.AgglomerativeClustering.
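To make the roles of `axis`, `distance_threshold` and `min_cluster_size` concrete, here is a minimal standard-library sketch. It is not the dawsonia implementation and does not use scikit-learn: `cluster_centers_1d` is a hypothetical helper that applies single-linkage grouping to sorted 1-D centres (splitting wherever the gap between neighbouring centres exceeds the threshold), and it assumes each bbox is an `(x, y, w, h)` tuple, which the documentation above does not confirm.

```python
from collections import defaultdict


def cluster_centers_1d(bboxes, axis=0, distance_threshold=25.0, min_cluster_size=0):
    """Group bboxes whose centres along `axis` lie close together.

    Hypothetical helper for illustration only. Each bbox is assumed to be
    (x, y, w, h); axis=0 clusters x centres (columns), axis=1 clusters
    y centres (rows).
    """
    centers = [b[axis] + b[axis + 2] / 2 for b in bboxes]
    order = sorted(range(len(bboxes)), key=lambda i: centers[i])

    labels = [-1] * len(bboxes)
    label = -1
    prev = None
    for i in order:
        # Start a new cluster when the gap to the previous centre is too large.
        if prev is None or centers[i] - prev > distance_threshold:
            label += 1
        labels[i] = label
        prev = centers[i]

    groups = defaultdict(list)
    for i, lab in enumerate(labels):
        groups[lab].append(i)
    # Drop groups that are too small to count as a cluster.
    kept = {lab: idxs for lab, idxs in groups.items() if len(idxs) >= min_cluster_size}
    return labels, kept


# Made-up boxes: two columns of two cells each, plus one stray box.
bboxes = [(10, 10, 40, 20), (12, 50, 40, 20), (110, 10, 40, 20),
          (108, 50, 40, 20), (300, 10, 40, 20)]

col_labels, col_groups = cluster_centers_1d(bboxes, axis=0, min_cluster_size=2)
row_labels, row_groups = cluster_centers_1d(bboxes, axis=1)
```

With these inputs the stray box at x = 300 still receives a column label but is left out of `col_groups` because its group has fewer than two members. The return shape (labels plus a mapping from label to bbox indices) mirrors the signature documented above, but the label values themselves are only meaningful within this sketch.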