Instructions to use keras-io/cl_s2s with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- TF-Keras
How to use keras-io/cl_s2s with TF-Keras:
# Note: 'keras<3.x' or 'tf_keras' must be installed (legacy) # See https://github.com/keras-team/tf-keras for more details. from huggingface_hub import from_pretrained_keras model = from_pretrained_keras("keras-io/cl_s2s") - Notebooks
- Google Colab
- Kaggle
Tensorflow Keras Implementation of Character-level Recurrent Sequence-to-Sequence Model
This repo contains code using the model. Character-level recurrent sequence-to-sequence model.
Credits: fchollet - Original Author
HF Contribution: Rishav Chandra Varma
Background Information
Introduction
This example demonstrates how to implement a basic character-level recurrent sequence-to-sequence model. We apply it to translating short English sentences into short French sentences, character-by-character. Note that it is fairly unusual to do character-level machine translation, as word-level models are more common in this domain.
Summary of the algorithm
- We start with input sequences from a domain (e.g. English sentences) and corresponding target sequences from another domain (e.g. French sentences).
- An encoder LSTM turns input sequences to 2 state vectors (we keep the last LSTM state and discard the outputs).
- A decoder LSTM is trained to turn the target sequences into the same sequence but offset by one timestep in the future, a training process called "teacher forcing" in this context. It uses as initial state the state vectors from the encoder. Effectively, the decoder learns to generate targets[t+1...] given targets[...t], conditioned on the input sequence.
- In inference mode, when we want to decode unknown input sequences, we: - Encode the input sequence into state vectors - Start with a target sequence of size 1 (just the start-of-sequence character) - Feed the state vectors and 1-char target sequence to the decoder to produce predictions for the next character - Sample the next character using these predictions (we simply use argmax). - Append the sampled character to the target sequence - Repeat until we generate the end-of-sequence character or we hit the character limit.
- Downloads last month
- 11
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support