diff --git a/README.md b/README.md
index 4ca23f1c..616e0268 100644
--- a/README.md
+++ b/README.md
@@ -240,6 +240,18 @@ Character Model：<a href="https://civitai.com/models/13237/genshen-impact-yoimi
 Character Model：<a href="https://civitai.com/models/9850/paimon-genshin-impact">Paimon</a>;
 Pose Model：<a href="https://civitai.com/models/107295/or-holdingsign">Hold Sign</a></p>
 
+
+<table>
+  <tr>
+    <td><img src="__assets__/animations/model_09/01.gif"></td>
+    <td><img src="__assets__/animations/model_09/02.gif"></td>
+    <td><img src="__assets__/animations/model_09/03.gif"></td>
+    <td><img src="__assets__/animations/model_09/04.gif"></td>
+  </tr>
+</table>
+<p style="margin-left: 2em; margin-top: -1em">
+Style Model：<a href="https://civitai.com/models/27670?modelVersionId=87497">ShinYunBok</a></p>
+
 ## BibTeX
 ```
 @article{guo2023animatediff,
@@ -256,4 +268,4 @@ Pose Model：<a href="https://civitai.com/models/107295/or-holdingsign">Hold Sig
 **Bo Dai**: [daibo@pjlab.org.cn](mailto:daibo@pjlab.org.cn)
 
 ## Acknowledgements
-Codebase built upon [Tune-a-Video](https://github.com/showlab/Tune-A-Video).
\ No newline at end of file
+Codebase built upon [Tune-a-Video](https://github.com/showlab/Tune-A-Video).
diff --git a/__assets__/animations/model_09/01.gif b/__assets__/animations/model_09/01.gif
new file mode 100644
index 00000000..1622749e
Binary files /dev/null and b/__assets__/animations/model_09/01.gif differ
diff --git a/__assets__/animations/model_09/02.gif b/__assets__/animations/model_09/02.gif
new file mode 100644
index 00000000..819c0ba4
Binary files /dev/null and b/__assets__/animations/model_09/02.gif differ
diff --git a/__assets__/animations/model_09/03.gif b/__assets__/animations/model_09/03.gif
new file mode 100644
index 00000000..02e7cecf
Binary files /dev/null and b/__assets__/animations/model_09/03.gif differ
diff --git a/__assets__/animations/model_09/04.gif b/__assets__/animations/model_09/04.gif
new file mode 100644
index 00000000..16f3cb44
Binary files /dev/null and b/__assets__/animations/model_09/04.gif differ
diff --git a/site/en/gemma/docs/keras_inference.ipynb b/site/en/gemma/docs/keras_inference.ipynb
new file mode 100644
index 00000000..a7d6bc17
--- /dev/null
+++ b/site/en/gemma/docs/keras_inference.ipynb
@@ -0,0 +1,623 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Tce3stUlHN0L"
+      },
+      "source": [
+        "##### Copyright 2024 Google LLC."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {
+        "cellView": "form",
+        "id": "tuOe1ymfHZPu"
+      },
+      "outputs": [],
+      "source": [
+        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+        "# you may not use this file except in compliance with the License.\n",
+        "# You may obtain a copy of the License at\n",
+        "#\n",
+        "# https://www.apache.org/licenses/LICENSE-2.0\n",
+        "#\n",
+        "# Unless required by applicable law or agreed to in writing, software\n",
+        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+        "# See the License for the specific language governing permissions and\n",
+        "# limitations under the License."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "4qxv4Sn9b8CE"
+      },
+      "source": [
+        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
+        "  <td>\n",
+        "    <a target=\"_blank\" href=\"https://ai.google.dev/gemma/docs/keras_inference\"><img src=\"https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png\" height=\"32\" width=\"32\" />View on ai.google.dev</a>\n",
+        "  </td>\n",
+        "    <td>\n",
+        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/keras_inference.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
+        "  </td>\n",
+        "  <td>\n",
+        "    <a target=\"_blank\" href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/google/generative-ai-docs/main/site/en/gemma/docs/keras_inference.ipynb\"><img src=\"https://ai.google.dev/images/cloud-icon.svg\" width=\"40\" />Open in Vertex AI</a>\n",
+        "  </td>\n",
+        "  <td>\n",
+        "    <a target=\"_blank\" href=\"https://github.com/google/generative-ai-docs/blob/main/site/en/gemma/docs/keras_inference.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
+        "  </td>\n",
+        "</table>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "PXNm5_p_oxMF"
+      },
+      "source": [
+        "# Get started with Gemma using KerasNLP\n",
+        "\n",
+        "This tutorial shows you how to get started with Gemma using [KerasNLP](https://keras.io/keras_nlp/). Gemma is a family of lightweight, state-of-the art open models built from the same research and technology used to create the Gemini models. KerasNLP is a collection of natural language processing (NLP) models implemented in [Keras](https://keras.io/) and runnable on JAX, PyTorch, and TensorFlow.\n",
+        "\n",
+        "In this tutorial, you'll use Gemma to generate text responses to several prompts. If you're new to Keras, you might want to read [Getting started with Keras](https://keras.io/getting_started/) before you begin, but you don't have to. You'll learn more about Keras as you work through this tutorial."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "mERVCCsGUPIJ"
+      },
+      "source": [
+        "## Setup"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "QQ6W7NzRe1VM"
+      },
+      "source": [
+        "### Gemma setup\n",
+        "\n",
+        "To complete this tutorial, you'll first need to complete the setup instructions at [Gemma setup](https://ai.google.dev/gemma/docs/setup). The Gemma setup instructions show you how to do the following:\n",
+        "\n",
+        "* Get access to Gemma on kaggle.com.\n",
+        "* Select a Colab runtime with sufficient resources to run\n",
+        "  the Gemma 2B model.\n",
+        "* Generate and configure a Kaggle username and API key.\n",
+        "\n",
+        "After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_gN-IVRC3dQe"
+      },
+      "source": [
+        "### Set environment variables\n",
+        "\n",
+        "Set environment variables for `KAGGLE_USERNAME` and `KAGGLE_KEY`."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {
+        "id": "DrBoa_Urw9Vx"
+      },
+      "outputs": [],
+      "source": [
+        "import os\n",
+        "from google.colab import userdata\n",
+        "\n",
+        "# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env\n",
+        "# vars as appropriate for your system.\n",
+        "os.environ[\"KAGGLE_USERNAME\"] = userdata.get('KAGGLE_USERNAME')\n",
+        "os.environ[\"KAGGLE_KEY\"] = userdata.get('KAGGLE_KEY')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "z9oy3QUmXtSd"
+      },
+      "source": [
+        "### Install dependencies\n",
+        "\n",
+        "Install Keras and KerasNLP."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "UcGLzDeQ8NwN"
+      },
+      "outputs": [],
+      "source": [
+        "# Install Keras 3 last. See https://keras.io/getting_started/ for more details.\n",
+        "!pip install -q -U keras-nlp\n",
+        "!pip install -q -U \"keras>=3\""
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Pm5cVOFt5YvZ"
+      },
+      "source": [
+        "### Select a backend\n",
+        "\n",
+        "Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. [Keras 3](https://keras.io/keras_3) lets you choose the backend: TensorFlow, JAX, or PyTorch. All three will work for this tutorial."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 3,
+      "metadata": {
+        "id": "7rS7ryTs5wjf"
+      },
+      "outputs": [],
+      "source": [
+        "import os\n",
+        "\n",
+        "os.environ[\"KERAS_BACKEND\"] = \"jax\"  # Or \"tensorflow\" or \"torch\".\n",
+        "os.environ[\"XLA_PYTHON_CLIENT_MEM_FRACTION\"] = \"0.9\""
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "599765c72722"
+      },
+      "source": [
+        "### Import packages\n",
+        "\n",
+        "Import Keras and KerasNLP."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 4,
+      "metadata": {
+        "id": "f2fa267d75bc"
+      },
+      "outputs": [],
+      "source": [
+        "import keras\n",
+        "import keras_nlp"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "ZsxDCbLN555T"
+      },
+      "source": [
+        "## Create a model\n",
+        "\n",
+        "KerasNLP provides implementations of many popular [model architectures](https://keras.io/api/keras_nlp/models/). In this tutorial, you'll create a model using `GemmaCausalLM`, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.\n",
+        "\n",
+        "Create the model using the `from_preset` method:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "yygIK9DEIldp"
+      },
+      "outputs": [],
+      "source": [
+        "gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset(\"gemma2_2b_en\")\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XrAWvsU6pI0E"
+      },
+      "source": [
+        "The `GemmaCausalLM.from_preset()` function instantiates the model from a preset architecture and weights. In the code above, the string `\"gemma2_2b_en\"` specifies the preset the Gemma 2 2B model with 2 billion parameters. Gemma models with [7B, 9B, and 27B parameters](/gemma/docs/get_started#models-list) are also available. You can find the code strings for Gemma models in their **Model Variation** listings on [Kaggle](https://www.kaggle.com/models/google/gemma).\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Ij73k0PfUhjE"
+      },
+      "source": [
+        "Note: To run the larger models in Colab, you need access to the premium GPUs available in paid plans. Alternatively, you can perform inferences using Kaggle notebooks or Google Cloud projects.\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "E-cSEjULUhST"
+      },
+      "source": [
+        "Use `summary` to get more info about the model:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 6,
+      "metadata": {
+        "id": "e5nEbTdApL7W",
+        "outputId": "c7f65192-7fbf-489d-8b1d-a24b1e83cce2",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 408
+        }
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "\u001b[1mPreprocessor: \"gemma_causal_lm_preprocessor\"\u001b[0m\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Preprocessor: \"gemma_causal_lm_preprocessor\"</span>\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
+              "┃\u001b[1m \u001b[0m\u001b[1mTokenizer (type)                                  \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m                                            Vocab #\u001b[0m\u001b[1m \u001b[0m┃\n",
+              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
+              "│ gemma_tokenizer (\u001b[38;5;33mGemmaTokenizer\u001b[0m)                   │                                             \u001b[38;5;34m256,000\u001b[0m │\n",
+              "└────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
+              "┃<span style=\"font-weight: bold\"> Tokenizer (type)                                   </span>┃<span style=\"font-weight: bold\">                                             Vocab # </span>┃\n",
+              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
+              "│ gemma_tokenizer (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">GemmaTokenizer</span>)                   │                                             <span style=\"color: #00af00; text-decoration-color: #00af00\">256,000</span> │\n",
+              "└────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "\u001b[1mModel: \"gemma_causal_lm\"\u001b[0m\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"gemma_causal_lm\"</span>\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
+              "┃\u001b[1m \u001b[0m\u001b[1mLayer (type)                 \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mOutput Shape             \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m        Param #\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mConnected to              \u001b[0m\u001b[1m \u001b[0m┃\n",
+              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
+              "│ padding_mask (\u001b[38;5;33mInputLayer\u001b[0m)     │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m)              │               \u001b[38;5;34m0\u001b[0m │ -                          │\n",
+              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
+              "│ token_ids (\u001b[38;5;33mInputLayer\u001b[0m)        │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m)              │               \u001b[38;5;34m0\u001b[0m │ -                          │\n",
+              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
+              "│ gemma_backbone                │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m2304\u001b[0m)        │   \u001b[38;5;34m2,614,341,888\u001b[0m │ padding_mask[\u001b[38;5;34m0\u001b[0m][\u001b[38;5;34m0\u001b[0m],        │\n",
+              "│ (\u001b[38;5;33mGemmaBackbone\u001b[0m)               │                           │                 │ token_ids[\u001b[38;5;34m0\u001b[0m][\u001b[38;5;34m0\u001b[0m]            │\n",
+              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
+              "│ token_embedding               │ (\u001b[38;5;45mNone\u001b[0m, \u001b[38;5;45mNone\u001b[0m, \u001b[38;5;34m256000\u001b[0m)      │     \u001b[38;5;34m589,824,000\u001b[0m │ gemma_backbone[\u001b[38;5;34m0\u001b[0m][\u001b[38;5;34m0\u001b[0m]       │\n",
+              "│ (\u001b[38;5;33mReversibleEmbedding\u001b[0m)         │                           │                 │                            │\n",
+              "└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
+              "┃<span style=\"font-weight: bold\"> Layer (type)                  </span>┃<span style=\"font-weight: bold\"> Output Shape              </span>┃<span style=\"font-weight: bold\">         Param # </span>┃<span style=\"font-weight: bold\"> Connected to               </span>┃\n",
+              "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
+              "│ padding_mask (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">InputLayer</span>)     │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>)              │               <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │ -                          │\n",
+              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
+              "│ token_ids (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">InputLayer</span>)        │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>)              │               <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │ -                          │\n",
+              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
+              "│ gemma_backbone                │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">2304</span>)        │   <span style=\"color: #00af00; text-decoration-color: #00af00\">2,614,341,888</span> │ padding_mask[<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>][<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>],        │\n",
+              "│ (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">GemmaBackbone</span>)               │                           │                 │ token_ids[<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>][<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>]            │\n",
+              "├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤\n",
+              "│ token_embedding               │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">256000</span>)      │     <span style=\"color: #00af00; text-decoration-color: #00af00\">589,824,000</span> │ gemma_backbone[<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>][<span style=\"color: #00af00; text-decoration-color: #00af00\">0</span>]       │\n",
+              "│ (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">ReversibleEmbedding</span>)         │                           │                 │                            │\n",
+              "└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "\u001b[1m Total params: \u001b[0m\u001b[38;5;34m2,614,341,888\u001b[0m (9.74 GB)\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">2,614,341,888</span> (9.74 GB)\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "\u001b[1m Trainable params: \u001b[0m\u001b[38;5;34m2,614,341,888\u001b[0m (9.74 GB)\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">2,614,341,888</span> (9.74 GB)\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "\u001b[1m Non-trainable params: \u001b[0m\u001b[38;5;34m0\u001b[0m (0.00 B)\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "gemma_lm.summary()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "81KHdRYOrWYm"
+      },
+      "source": [
+        "As you can see from the summary, the model has 2.6 billion trainable parameters.\n",
+        "\n",
+        "Note: For purposes of naming the model (\"2B\"), the embedding layer is not counted against the number of parameters."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "FOBW7piN5-sl"
+      },
+      "source": [
+        "## Generate text\n",
+        "\n",
+        "Now it's time to generate some text! The model has a `generate` method that generates text based on a prompt. The optional `max_length` argument specifies the maximum length of the generated sequence.\n",
+        "\n",
+        "Try it out with the prompt `\"what is keras in 3 bullet points?\"`."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 9,
+      "metadata": {
+        "id": "aae5GHrdpj2_",
+        "outputId": "362ab0d4-80d2-4a65-eb91-f28d055511ec",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 53
+        }
+      },
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "'what is keras in 3 bullet points?\\n\\n[Answer 1]\\n\\nKeras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, Theano, or PlaidML. It is designed to be user-friendly and easy to extend.\\n\\n'"
+            ],
+            "application/vnd.google.colaboratory.intrinsic+json": {
+              "type": "string"
+            }
+          },
+          "metadata": {},
+          "execution_count": 9
+        }
+      ],
+      "source": [
+        "gemma_lm.generate(\"what is keras in 3 bullet points?\", max_length=64)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "qH0eFH_DvYwM"
+      },
+      "source": [
+        "Try calling `generate` again with a different prompt."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 16,
+      "metadata": {
+        "id": "VEyTnnNGvgGG",
+        "outputId": "a6193127-2a71-481d-8521-d3c21f7dd148",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 53
+        }
+      },
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "'The universe is a vast and mysterious place, filled with countless stars, planets, and galaxies. But what if there was a way to see the universe in a whole new way? What if we could see the universe as it was when it was first created? What if we could see the universe as it is now'"
+            ],
+            "application/vnd.google.colaboratory.intrinsic+json": {
+              "type": "string"
+            }
+          },
+          "metadata": {},
+          "execution_count": 16
+        }
+      ],
+      "source": [
+        "gemma_lm.generate(\"The universe is\", max_length=64)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "vVlCnY7Gvm7U"
+      },
+      "source": [
+        "If you're running on JAX or TensorFlow backends, you'll notice that the second `generate` call returns nearly instantly. This is because each call to `generate` for a given batch size and `max_length` is compiled with XLA. The first run is expensive, but subsequent runs are much faster."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "mw5XkiHU11Ft"
+      },
+      "source": [
+        "You can also provide batched prompts using a list as input:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 17,
+      "metadata": {
+        "id": "xV6vs8_C2BGt",
+        "outputId": "e972806a-1983-408b-bf0e-43f36c95c14a",
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        }
+      },
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "['what is keras in 3 bullet points?\\n\\n[Answer 1]\\n\\nKeras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, Theano, or PlaidML. It is designed to be user-friendly and easy to extend.\\n\\n',\n",
+              " 'The universe is a vast and mysterious place, filled with countless stars, planets, and galaxies. But what if there was a way to see the universe in a whole new way? What if we could see the universe as it was when it was first created? What if we could see the universe as it is now']"
+            ]
+          },
+          "metadata": {},
+          "execution_count": 17
+        }
+      ],
+      "source": [
+        "gemma_lm.generate(\n",
+        "    [\"what is keras in 3 bullet points?\",\n",
+        "     \"The universe is\"],\n",
+        "    max_length=64)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "MaVWoSpo3XyY"
+      },
+      "source": [
+        "### Optional: Try a different sampler\n",
+        "\n",
+        "You can control the generation strategy for `GemmaCausalLM` by setting the `sampler` argument on `compile()`. By default, [`\"greedy\"`](https://keras.io/api/keras_nlp/samplers/greedy_sampler/#greedysampler-class) sampling will be used.\n",
+        "\n",
+        "As an experiment, try setting a [`\"top_k\"`](https://keras.io/api/keras_nlp/samplers/top_k_sampler/) strategy:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 20,
+      "metadata": {
+        "id": "mx55VQpN4DAK",
+        "outputId": "29c7a52c-6575-4487-cf83-66c8eb7d8772",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 53
+        }
+      },
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "'The universe is a big place, and there are so many things we do not know or understand about it.\\n\\nBut we can learn a lot about our world by studying what is known to us.\\n\\nFor example, if you look at the moon, it has many features that can be seen from the surface.'"
+            ],
+            "application/vnd.google.colaboratory.intrinsic+json": {
+              "type": "string"
+            }
+          },
+          "metadata": {},
+          "execution_count": 20
+        }
+      ],
+      "source": [
+        "gemma_lm.compile(sampler=\"top_k\")\n",
+        "gemma_lm.generate(\"The universe is\", max_length=64)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "-okKgK4LfO0f"
+      },
+      "source": [
+        "While the default greedy algorithm always picks the token with the largest probability, the top-K algorithm randomly picks the next token from the tokens of top K probability.\n",
+        "\n",
+        "You don't have to specify a sampler, and you can ignore the last code snippet if it's not helpful to your use case. If you'd like learn more about the available samplers, see [Samplers](https://keras.io/api/keras_nlp/samplers/)."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "jBrbTYasoo-J"
+      },
+      "source": [
+        "## What's next\n",
+        "\n",
+        "In this tutorial, you learned how to generate text using KerasNLP and Gemma. Here are a few suggestions for what to learn next:\n",
+        "\n",
+        "* Learn how to [finetune a Gemma model](https://ai.google.dev/gemma/docs/lora_tuning).\n",
+        "* Learn how to perform [distributed fine-tuning and inference on a Gemma model](https://ai.google.dev/gemma/docs/distributed_tuning).\n",
+        "* Learn about [Gemma integration with Vertex AI](https://ai.google.dev/gemma/docs/integrations/vertex)\n",
+        "* Learn how to [use Gemma models with Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/open-models/use-gemma)."
+      ]
+    }
+  ],
+  "metadata": {
+    "accelerator": "GPU",
+    "colab": {
+      "provenance": [],
+      "gpuType": "T4"
+    },
+    "google": {
+      "image_path": "/site-assets/images/marketing/gemma.png",
+      "keywords": [
+        "examples",
+        "gemma",
+        "python",
+        "quickstart",
+        "text"
+      ]
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
\ No newline at end of file