dawsonia.ml.ml#

NOTE: Provides options via the command line to perform project tasks.

  • --source: dataset/model name (bentham, iam, rimes, saintgall, washington)

  • --arch: network to be used (puigcerver, bluche, flor)

  • --transform: transform dataset to the HDF5 file

  • --cv2: visualize sample from transformed dataset

  • --kaldi_assets: save all assets for use with kaldi

  • --image: predict a single image with the source parameter

  • --train: train model with the source argument

  • --test: evaluate the model and run predictions with the source argument

  • --norm_accentuation: discard accentuation marks in the evaluation

  • --norm_punctuation: discard punctuation marks in the evaluation

  • --epochs: number of epochs

  • --batch_size: number of samples per batch
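
As a rough illustration of how the options above could be declared, here is a minimal sketch using the standard-library argparse module. It is not the module's actual parser: the helper name build_parser, the default values, and the choice to make --source required are all assumptions made for this example.

```python
import argparse

# Values taken from the option descriptions above.
SOURCES = ("bentham", "iam", "rimes", "saintgall", "washington")
ARCHS = ("puigcerver", "bluche", "flor")

# Options that act as on/off switches in this sketch.
SWITCHES = (
    "transform",
    "cv2",
    "kaldi_assets",
    "train",
    "test",
    "norm_accentuation",
    "norm_punctuation",
)


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(description="Perform project tasks.")
    parser.add_argument("--source", choices=SOURCES, required=True)
    parser.add_argument("--arch", choices=ARCHS, default="flor")  # assumed default
    for name in SWITCHES:
        parser.add_argument(f"--{name}", action="store_true")
    parser.add_argument("--image", type=str, default="")  # path to a single image
    parser.add_argument("--epochs", type=int, default=1000)  # assumed default
    parser.add_argument("--batch_size", type=int, default=16)  # assumed default
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--source", "iam", "--train", "--epochs", "5"])
print(args.source, args.arch, args.train, args.epochs, args.batch_size)
```

Passing an explicit list to parse_args keeps the example independent of how the module is actually invoked, which this page does not state.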

Module Contents#

Functions#

model_predict

After training, load model and run predictions on an image

make_datagen

Data (image) reader

make_htr_model

Neural network: Handwritten Text Recognition (HTR) system.

model_test

Test model after training

model_train

Train model

Data#

logger

CHARSET_BASE

INPUT_SIZE

MAX_TEXT_LENGTH

__all__

API#

dawsonia.ml.ml.logger#

‘getLogger(…)’

dawsonia.ml.ml.CHARSET_BASE#

‘0123456789n.-+x’

dawsonia.ml.ml.INPUT_SIZE#

(1024, 128, 1)

dawsonia.ml.ml.MAX_TEXT_LENGTH#

128
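
To make the roles of CHARSET_BASE and MAX_TEXT_LENGTH concrete, the sketch below maps a label to a fixed-length list of integer indices and back. It is only an illustration: the real encoding is handled by dawsonia.ml.data.generator.Tokenizer, whose behaviour is not documented on this page. The functions encode and decode and the padding index PAD are hypothetical.

```python
CHARSET_BASE = "0123456789n.-+x"  # value documented above
MAX_TEXT_LENGTH = 128             # value documented above
PAD = 0                           # hypothetical padding index, not from the library


def encode(text: str, charset: str = CHARSET_BASE,
           max_len: int = MAX_TEXT_LENGTH) -> list[int]:
    """Map each character to a 1-based index and right-pad to max_len."""
    index = {ch: i + 1 for i, ch in enumerate(charset)}
    unknown = sorted({ch for ch in text if ch not in index})
    if unknown:
        raise ValueError(f"characters outside the charset: {unknown}")
    if len(text) > max_len:
        raise ValueError(f"text longer than {max_len} characters")
    ids = [index[ch] for ch in text]
    return ids + [PAD] * (max_len - len(ids))


def decode(ids: list[int], charset: str = CHARSET_BASE) -> str:
    """Invert encode by dropping padding and looking up each index."""
    return "".join(charset[i - 1] for i in ids if i != PAD)


encoded = encode("-12.5")
print(encoded[:5], len(encoded))
print(decode(encoded))
```

Reserving index 0 for padding is a common convention in sequence models, but whether this package follows it is an assumption here.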

dawsonia.ml.ml.__all__#

(‘CHARSET_BASE’, ‘INPUT_SIZE’, ‘MAX_TEXT_LENGTH’, ‘make_htr_model’, ‘model_train’, ‘model_test’, ‘mo…

dawsonia.ml.ml.model_predict(image: str | pathlib.Path | numpy.typing.NDArray | collections.abc.Iterator[numpy.typing.NDArray], arch: str = '', checkpoint_path: pathlib.Path | None = None, input_size: tuple[int, int, int] = INPUT_SIZE, max_text_length: int = MAX_TEXT_LENGTH, charset_base: str = CHARSET_BASE, tokenizer: dawsonia.ml.data.generator.Tokenizer | None = None, model: dawsonia.ml.network.model.HTRModel | None = None) → tuple[dawsonia.typing.Prediction, dawsonia.typing.Probability]#

After training, load model and run predictions on an image

dawsonia.ml.ml.make_datagen(batch_size, source_path, max_text_length: int = MAX_TEXT_LENGTH, charset_base: str = CHARSET_BASE)#

Data (image) reader

dawsonia.ml.ml.make_htr_model(arch, checkpoint_path, vocab_size, input_size=INPUT_SIZE, test_mode=True, learning_rate=None, **kwargs)#

Neural network: Handwritten Text Recognition (HTR) system.

See the GitHub repository arthurflor23/handwritten-text-recognition

dawsonia.ml.ml.model_test(norm_accentuation, norm_punctuation, output_path, model, dtgen)#

Test model after training
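
The norm_accentuation and norm_punctuation arguments correspond to the command-line options that discard accentuation and punctuation marks during evaluation. The sketch below shows one plausible way such normalization could be applied to predictions and ground-truth text before scoring. It uses only the standard library and is not taken from the package. The function names are hypothetical.

```python
import string
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters, then drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_punctuation(text: str) -> str:
    """Remove every ASCII punctuation character."""
    return text.translate(str.maketrans("", "", string.punctuation))


def normalize(text: str, norm_accentuation: bool = False,
              norm_punctuation: bool = False) -> str:
    """Apply the requested normalizations before comparing strings."""
    if norm_accentuation:
        text = strip_accents(text)
    if norm_punctuation:
        text = strip_punctuation(text)
    return text


print(normalize("Été, 1850!", norm_accentuation=True, norm_punctuation=True))
```

Note that with this particular approach, punctuation normalization would also strip the ".", "-" and "+" characters that appear in CHARSET_BASE, so it may not suit numeric labels.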

dawsonia.ml.ml.model_train(epochs, output_path, checkpoint_path, model: dawsonia.ml.network.model.HTRModel, dtgen: dawsonia.ml.data.generator.DataGenerator)#

Train model