As an exercise to get acquainted with Keras, I want to train a simple model with attention to translate sentences.
I am not calling a tf function, only using Keras layers. But I get the following error:
A KerasTensor cannot be used as input to a TensorFlow function. A KerasTensor is a symbolic placeholder for a shape and dtype, used when constructing Keras Functional models or Keras Functions. You can only use it as input to a Keras layer or a Keras operation (from the namespaces keras.layers and keras.operations). [...]
Here is the code for the model using Keras' functional API:
encoder_inputs=tf.keras.layers.Input(shape=[], dtype=tf.string)
decoder_inputs=tf.keras.layers.Input(shape=[], dtype=tf.string)
embed_size=128
encoder_inputs_ids=text_vec_layer_en(encoder_inputs)
decoder_inputs_ids=text_vec_layer_es(decoder_inputs)
encoder_embedding_layer=tf.keras.layers.Embedding(vocab_size, embed_size, mask_zero=True)
decoder_embedding_layer=tf.keras.layers.Embedding(vocab_size, embed_size, mask_zero=True)
encoder_embeddings=encoder_embedding_layer(encoder_inputs_ids)
decoder_embeddings=decoder_embedding_layer(decoder_inputs_ids)
encoder=tf.keras.layers.LSTM(512, return_sequences=True, return_state=True)
encoder_outputs, *encoder_state=encoder(encoder_embeddings)
decoder=tf.keras.layers.LSTM(512, return_sequences=True)
decoder_outputs=decoder(decoder_embeddings, initial_state=encoder_state)
# Attention layer here!
# Problems getting it to work on Keras 3
attention_layer=tf.keras.layers.Attention()
attention_outputs=attention_layer([decoder_outputs, encoder_outputs])
output_layer=tf.keras.layers.Dense(vocab_size, activation="softmax")
Y_probas=output_layer(attention_outputs)
Expected behavior: The Keras attention layer accepts Keras tensor inputs. Or a more helpful error message is given.
Python version: 3.11.0
Tensorflow version: 2.17.0
Keras version: 3.4.1 (bundled with that Tensorflow version)
Thanks for reporting the issue. Based on the code, I understand that you are trying to create a model with attention to translate sentences.
Instead of tf.keras.layers.Attention, you can use tf.keras.layers.MultiHeadAttention, passing query, key and value for the dot product. The attention output then needs to be combined with the decoder output, and the model built with the functional API.
Attached gist for your reference here.
tf.keras.layers.Attention does not accept its inputs in the form attention_outputs = attention_layer([decoder_outputs, encoder_outputs]) here. You can find more details about the attention layer here.
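For reference, here is a minimal self-contained sketch of the suggested MultiHeadAttention approach. It replaces the `TextVectorization` layers from the original code with plain integer-token inputs, uses a placeholder `vocab_size`, and omits `mask_zero` for simplicity, so it only illustrates the wiring, not a full translation pipeline:

```python
import tensorflow as tf

vocab_size = 1000   # placeholder; use your real vocabulary size
embed_size = 128

# Integer token ids stand in for the TextVectorization outputs
encoder_inputs = tf.keras.layers.Input(shape=(None,), dtype=tf.int64)
decoder_inputs = tf.keras.layers.Input(shape=(None,), dtype=tf.int64)

encoder_embeddings = tf.keras.layers.Embedding(vocab_size, embed_size)(encoder_inputs)
decoder_embeddings = tf.keras.layers.Embedding(vocab_size, embed_size)(decoder_inputs)

encoder_outputs, state_h, state_c = tf.keras.layers.LSTM(
    512, return_sequences=True, return_state=True)(encoder_embeddings)
decoder_outputs = tf.keras.layers.LSTM(512, return_sequences=True)(
    decoder_embeddings, initial_state=[state_h, state_c])

# MultiHeadAttention: query = decoder states, value (and key) = encoder states
attention_outputs = tf.keras.layers.MultiHeadAttention(
    num_heads=1, key_dim=512)(query=decoder_outputs, value=encoder_outputs)

Y_probas = tf.keras.layers.Dense(vocab_size, activation="softmax")(attention_outputs)
model = tf.keras.Model(inputs=[encoder_inputs, decoder_inputs], outputs=Y_probas)
```

The output has one softmax distribution over the vocabulary per decoder time step, so the model compiles and trains with a sparse categorical cross-entropy loss as in the book-style seq2seq setup.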
Thanks @mehtamansi29 . This does give a working model.
However, my interest was not so much to build a model to translate, but rather to understand how the Keras interface works. Is the behavior of the Attention layer expected? If so, what is the logic? Or is this a bug?