
Directly use the backbone functional graph for CausalLM.generate() #1862

Open
mattdangerw opened this issue Sep 22, 2024 · 0 comments
Assignees
Labels
Gemma Gemma model specific issues

Comments

@mattdangerw
Member

Currently, to support the extra inputs we need for generation (e.g. cache, index, encoder hidden states for seq2seq), we use layers from our backbone class while disregarding the backbone's functional graph and layer connectivity. See `call_with_cache`. If we could use the backbone graph directly for generation, we would support many more advanced generative use cases.

Keras recently added support for optional functional inputs. We should build on that by adding a number of optional inputs to our backbones (e.g. `cache`, `cache_index`, `token_positions`, `attention_mask`). This would allow customization in several directions:

  • Backbones would be more readily useful for advanced non-generative use cases without needing to reach into sublayers.
  • Generation would be more easily customizable by passing a modified backbone to a CausalLM.
@mattdangerw mattdangerw self-assigned this Sep 22, 2024
@github-actions github-actions bot added the Gemma Gemma model specific issues label Sep 22, 2024
@mattdangerw mattdangerw changed the title Directly use the functional graph for generative forward passes Directly use the backbone functional graph for CausalLM.generate() Sep 22, 2024