dawsonia.io._pdf
Module Contents
Functions
read_pdf_book | Read PDF book and detect pages
check_pdf_page_range | Analyze PDF metadata for the maximum page number and compare it with the input parameters
station_name_from_pdf | Deduce weather station name from the directory name where it is stored
get_pdf_pages | Read two pages from a PDF as images using pdfplumber
Parse string CLI skip arguments and generate a dictionary of tables, rows and columns to skip
Data
API
- dawsonia.io._pdf.logger
‘getLogger(…)’
- dawsonia.io._pdf.read_pdf_book(path_file: pathlib.Path, first_page: int = 1, last_page: int = 1000000, page_middle: int | None = None, size_cell: list[float] | None = None, table_fmt_dir: pathlib.Path = Path('table_formats')) → tuple[int, int, dawsonia.io._book.Book]
Read PDF book and detect pages
- dawsonia.io._pdf.check_pdf_page_range(path_pdf, first_page=1, last_page=1000000)
Analyze the PDF metadata for the maximum page number and compare it with the input parameters.
Returns
first_page, last_page: tuple[int, int] The first and last page, with corrections applied if any. A ValueError is raised instead when the requested range is invalid.
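A minimal sketch of the kind of check described above, assuming the page count has already been read from the PDF metadata. The helper name `clamp_page_range`, the `num_pages` argument, and the choice of which cases are corrected and which raise are assumptions made for illustration; they are not taken from dawsonia's implementation.

```python
def clamp_page_range(
    num_pages: int, first_page: int = 1, last_page: int = 1_000_000
) -> tuple[int, int]:
    """Hypothetical stand-in for a page-range check (not dawsonia's code).

    Assumed policy: a too-large ``last_page`` is corrected down to the
    document length, while a ``first_page`` outside the document or an
    inverted range raises ``ValueError``.
    """
    if num_pages < 1:
        raise ValueError("the PDF reports no pages")
    if not 1 <= first_page <= num_pages:
        raise ValueError(
            f"first_page={first_page} is outside the range 1..{num_pages}"
        )
    if last_page < first_page:
        raise ValueError(
            f"last_page={last_page} is smaller than first_page={first_page}"
        )
    # Correction step: never read past the end of the document.
    return first_page, min(last_page, num_pages)


# With the default arguments, the whole (hypothetical) 120-page book is selected.
print(clamp_page_range(120))  # (1, 120)
```

Under these assumptions the large default for `last_page` simply means "read to the end of the book".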
- dawsonia.io._pdf.station_name_from_pdf(path_pdf: pathlib.Path) → str
Deduce weather station name from the directory name where it is stored
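The idea of reading a station name off the containing directory can be sketched with `pathlib` alone. The helper name `station_name_from_parent`, the example path, and the assumption that the directory name is used verbatim (without further cleaning) are illustrative only and may differ from what dawsonia actually does.

```python
from pathlib import Path


def station_name_from_parent(path_pdf: Path) -> str:
    """Hypothetical helper: return the name of the directory holding the PDF."""
    # Path.parent is the containing directory; .name is its last component.
    return path_pdf.parent.name


# The path below is made up for illustration; no file needs to exist.
example = Path("scans") / "EXAMPLE_STATION" / "book.pdf"
print(station_name_from_parent(example))  # EXAMPLE_STATION
```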
- dawsonia.io._pdf.year_from_pdf(path_pdf: pathlib.Path) → str
- dawsonia.io._pdf.get_pdf_pages(path_pdf: pathlib.Path, left_page: int, right_page: int) → Iterator[dawsonia.typing.NDArray[numpy.int32]]
Read two pages from a PDF as images using pdfplumber