Digitize hAndWritten obServatiONs In weather journAls

Dawsonia is a young project aimed at the data rescue of weather journals stored as PDF or Zarr files. Some of its salient features are:

Table detection

Specializes in tables with grid lines, detecting both their position and structure.

Handwritten text recognition

Built-in support for image-to-text AI to read handwritten numbers and symbols.


TOML-based configuration file to adapt to different table layouts.


Free to use, modify and distribute. Contributions welcome. AGPL-3.0 licensed.

The digitization pipeline is implemented in Python, using well-known open-source scientific libraries: NumPy, SciPy, pandas, OpenCV, scikit-image, TensorFlow, scikit-learn and Typer, to name a few.
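The idea of locating a table through its grid lines can be illustrated with a toy example. The sketch below uses a simple projection-profile heuristic on a tiny binary image: a row or column counts as a grid line when most of its pixels are ink. This is a generic technique written in plain Python for clarity, not a description of how Dawsonia detects tables, and the function names `find_grid_lines` and `make_grid` as well as the `min_fill` threshold are assumptions made for this example.

```python
def find_grid_lines(image, min_fill=0.8):
    """Return (row_indices, column_indices) of likely grid lines.

    `image` is a list of equal-length rows holding 0 (background) or 1 (ink).
    A row or column is reported when at least `min_fill` of it is ink.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    rows = [r for r, row in enumerate(image) if sum(row) / n_cols >= min_fill]
    cols = [
        c
        for c in range(n_cols)
        if sum(row[c] for row in image) / n_rows >= min_fill
    ]
    return rows, cols


def make_grid(height, width, row_lines, col_lines):
    """Build a synthetic binary image containing the given grid lines."""
    image = [[0] * width for _ in range(height)]
    for r in row_lines:
        image[r] = [1] * width
    for c in col_lines:
        for row in image:
            row[c] = 1
    return image


# A 2 x 2 table drawn on a 9 x 13 pixel canvas.
image = make_grid(9, 13, row_lines=[0, 4, 8], col_lines=[0, 6, 12])
image[2][3] = 1  # a stray ink pixel inside a cell; too sparse to be a line

rows, cols = find_grid_lines(image)
print("horizontal lines at rows:", rows)
print("vertical lines at columns:", cols)
```

Consecutive pairs of detected rows and columns bound the individual cells, which is the structure a later text-recognition step would need. Real scans are skewed and noisy, so a practical pipeline would typically rely on more robust image-processing routines from libraries such as OpenCV or scikit-image.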

Indices and tables