@kholdrex kholdrex commented Jan 6, 2026

This pull request introduces a new text module and refactors the character-level LSTM example to use its utilities for vocabulary management, embedding, and sampling. It also adds a dedicated example demonstrating these utilities. Together, the changes replace the hand-written character mapping, embedding, and sampling code in the text generation pipeline with shared library components.

Text utilities integration and example:

  • Added a new text module to the library, providing TextVocabulary, CharacterEmbedding, and advanced sampling functions, and re-exported these for easy access.
  • Created a new example file, text_utils_example.rs, that demonstrates vocabulary encoding/decoding, character embeddings, an LSTM/linear pipeline, and various sampling strategies.
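
To make the vocabulary encoding/decoding mentioned above concrete, here is a minimal, standard-library-only sketch of a character vocabulary. It is not the TextVocabulary added by this PR: the CharVocab type and its from_text, encode, and decode methods are hypothetical names chosen for illustration, and the real API may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a character vocabulary (NOT the PR's TextVocabulary).
pub struct CharVocab {
    char_to_id: BTreeMap<char, usize>,
    id_to_char: Vec<char>,
}

impl CharVocab {
    /// Build a vocabulary from a corpus: every distinct character gets an id,
    /// assigned in sorted character order so the mapping is deterministic.
    pub fn from_text(text: &str) -> Self {
        let mut chars: Vec<char> = text.chars().collect();
        chars.sort_unstable();
        chars.dedup();
        let char_to_id = chars.iter().enumerate().map(|(i, &c)| (c, i)).collect();
        CharVocab { char_to_id, id_to_char: chars }
    }

    /// Number of distinct characters in the vocabulary.
    pub fn len(&self) -> usize {
        self.id_to_char.len()
    }

    /// Map each character to its id; returns None if any character is unknown.
    pub fn encode(&self, text: &str) -> Option<Vec<usize>> {
        text.chars().map(|c| self.char_to_id.get(&c).copied()).collect()
    }

    /// Map ids back to characters; returns None if any id is out of range.
    pub fn decode(&self, ids: &[usize]) -> Option<String> {
        ids.iter().map(|&i| self.id_to_char.get(i).copied()).collect()
    }
}
```

With this sketch, CharVocab::from_text("hello") holds four characters, and encoding followed by decoding returns the original string. Returning None for unknown characters is one possible design choice; a real vocabulary might instead reserve an unknown-character id.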

Refactor of character-level LSTM example:

  • Refactored CharacterLSTM in text_generation_advanced.rs to use the new TextVocabulary and CharacterEmbedding, replacing manual character-index mapping and custom embedding logic.
  • Updated training and generation logic to use cross-entropy loss and output logits over the vocabulary, with sampling handled by the new utilities.
  • Simplified and modernized the code structure, removing redundant projection and sampling methods, and improving clarity in the training and generation workflow.
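
As a rough illustration of training on logits with cross-entropy loss and then sampling from them, the sketch below uses only the standard library. The function names (softmax, cross_entropy, sample_with_temperature) are hypothetical and are not taken from the PR. Temperature sampling is shown as one common strategy; the PR does not say which strategies its utilities implement. To keep the example deterministic, the caller supplies the uniform random number u.

```rust
/// Numerically stable softmax over logits scaled by 1/temperature.
/// Assumes `logits` is non-empty and `temperature` is positive.
pub fn softmax(logits: &[f64], temperature: f64) -> Vec<f64> {
    let scaled: Vec<f64> = logits.iter().map(|&x| x / temperature).collect();
    // Subtract the maximum before exponentiating to avoid overflow.
    let max = scaled.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scaled.iter().map(|&x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Cross-entropy loss for a single target index:
/// the negative log of the softmax probability assigned to `target`.
pub fn cross_entropy(logits: &[f64], target: usize) -> f64 {
    -softmax(logits, 1.0)[target].ln()
}

/// Sample an index by inverse-CDF lookup on the temperature-scaled distribution.
/// `u` is a uniform random number in [0, 1) supplied by the caller
/// (an assumption made here so the sketch needs no random-number crate).
pub fn sample_with_temperature(logits: &[f64], temperature: f64, u: f64) -> usize {
    let probs = softmax(logits, temperature);
    let mut cumulative = 0.0;
    for (i, p) in probs.iter().enumerate() {
        cumulative += p;
        if u < cumulative {
            return i;
        }
    }
    // Fall back to the last index if rounding leaves the cumulative sum below u.
    probs.len() - 1
}
```

Lower temperatures concentrate probability on the highest logit, so generation becomes closer to greedy decoding; higher temperatures flatten the distribution and produce more varied text.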

@kholdrex kholdrex assigned kholdrex and unassigned kholdrex Jan 6, 2026
@kholdrex kholdrex added the enhancement (New feature or request) label Jan 6, 2026