dawsonia.ml.data.generator

Uses generator functions to supply data for training and testing.

Image renderings and text are created on the fly each time.

Module Contents

Classes

DataGenerator

Generator class with data streaming.

Tokenizer

Manages token functions and charset/dictionary properties.

API

class dawsonia.ml.data.generator.DataGenerator(source: str, batch_size: int, charset: str, max_text_length: int, stream: bool = False)

Generator class with data streaming.

Initialization

next_train_batch() → Iterator[tuple[numpy.typing.NDArray, numpy.typing.NDArray]]

Yield the next batch from the training partition.

next_valid_batch() → Iterator[tuple[numpy.typing.NDArray, numpy.typing.NDArray]]

Yield the next batch from the validation partition.

next_test_batch()

Return the parameters for model prediction.
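
The batch methods above are documented as generators that yield pairs of NumPy arrays. The following is a minimal sketch of that pattern only, not the dawsonia implementation: the function `batch_stream`, its arguments, and the toy arrays are hypothetical names introduced for illustration, and the wrap-around behaviour at the end of a partition is an assumption.

```python
from collections.abc import Iterator

import numpy as np
from numpy.typing import NDArray


def batch_stream(
    images: NDArray, labels: NDArray, batch_size: int
) -> Iterator[tuple[NDArray, NDArray]]:
    """Yield (images, labels) batches forever, wrapping around at the end."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    index = 0
    while True:
        if index >= len(images):
            index = 0  # assumed behaviour: start a new pass over the partition
        stop = index + batch_size
        # The last batch of a pass may be shorter than batch_size.
        yield images[index:stop], labels[index:stop]
        index = stop


# Hypothetical toy partition: 5 blank 4x8 "images" with integer-encoded labels.
toy_images = np.zeros((5, 4, 8), dtype=np.float32)
toy_labels = np.arange(10).reshape(5, 2)

stream = batch_stream(toy_images, toy_labels, batch_size=2)
first_x, first_y = next(stream)
```

A training loop would typically call `next()` on such a generator once per step, or pass the generator itself to a framework that consumes it lazily.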

class dawsonia.ml.data.generator.Tokenizer(chars, max_text_length=128)

Manages token functions and charset/dictionary properties.

Initialization

encode(text)

Encode text to vector.

decode(text)

Decode vector to text.

remove_tokens(text)

Remove tokens (PAD) from text.
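
To illustrate how encode, decode, and remove_tokens might fit together, here is a small character-level sketch. It is not the dawsonia `Tokenizer`: the class `SketchTokenizer`, the choice of padding symbol, the decision to drop unknown characters, and the use of plain lists instead of arrays are all assumptions made for this example.

```python
class SketchTokenizer:
    """Hypothetical character-level tokenizer; not the dawsonia implementation."""

    PAD = "¶"  # assumed padding symbol, chosen only for this sketch

    def __init__(self, chars: str, max_text_length: int = 128):
        self.chars = self.PAD + chars  # PAD takes index 0
        self.max_text_length = max_text_length
        self._index = {c: i for i, c in enumerate(self.chars)}

    def encode(self, text: str) -> list[int]:
        """Map characters to indices, drop unknown ones, pad to a fixed length."""
        ids = [self._index[c] for c in text if c in self._index]
        ids = ids[: self.max_text_length]
        return ids + [0] * (self.max_text_length - len(ids))

    def decode(self, ids: list[int]) -> str:
        """Map indices back to characters, padding symbols included."""
        return "".join(self.chars[i] for i in ids)

    def remove_tokens(self, text: str) -> str:
        """Strip the padding symbol from decoded text."""
        return text.replace(self.PAD, "")


# Hypothetical charset for numeric table cells.
tok = SketchTokenizer("0123456789.-", max_text_length=8)
vector = tok.encode("-12.5")
decoded = tok.decode(vector)
cleaned = tok.remove_tokens(decoded)
```

In this sketch, decoding returns the padded string, so `remove_tokens` is applied afterwards to recover the original text.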