From 4a99db684d406ea8cd1c76cb33c4515c34148c0c Mon Sep 17 00:00:00 2001
From: Aden Haussmann
Date: Wed, 27 Mar 2024 22:07:46 +0000
Subject: [PATCH 1/2] Change n to m for Y

---
 encoder-decoder.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/encoder-decoder.md b/encoder-decoder.md
index 8db8d2d936..ef72bec0f7 100644
--- a/encoder-decoder.md
+++ b/encoder-decoder.md
@@ -366,10 +366,10 @@ $$ f_{\theta_{\text{enc}}}: \mathbf{X}_{1:n} \to \mathbf{\overline{X}}_{1:n}. $$
 
 The transformer-based decoder part then models the conditional
 probability distribution of the target vector sequence
-\\(\mathbf{Y}_{1:n}\\) given the sequence of encoded hidden states
+\\(\mathbf{Y}_{1:m}\\) given the sequence of encoded hidden states
 \\(\mathbf{\overline{X}}_{1:n}\\):
 
-$$ p_{\theta_{dec}}(\mathbf{Y}_{1:n} | \mathbf{\overline{X}}_{1:n}).$$
+$$ p_{\theta_{dec}}(\mathbf{Y}_{1:m} | \mathbf{\overline{X}}_{1:n}).$$
 
 By Bayes\' rule, this distribution can be factorized to a product of
 conditional probability distribution of the target vector \\(\mathbf{y}_i\\)
@@ -377,7 +377,7 @@ given the encoded hidden states \\(\mathbf{\overline{X}}_{1:n}\\) and all
 previous target vectors \\(\mathbf{Y}_{0:i-1}\\):
 
 $$
-p_{\theta_{dec}}(\mathbf{Y}_{1:n} | \mathbf{\overline{X}}_{1:n}) = \prod_{i=1}^{n} p_{\theta_{\text{dec}}}(\mathbf{y}_i | \mathbf{Y}_{0: i-1}, \mathbf{\overline{X}}_{1:n}). $$
+p_{\theta_{dec}}(\mathbf{Y}_{1:m} | \mathbf{\overline{X}}_{1:n}) = \prod_{i=1}^{m} p_{\theta_{\text{dec}}}(\mathbf{y}_i | \mathbf{Y}_{0: i-1}, \mathbf{\overline{X}}_{1:n}). $$
 
 The transformer-based decoder hereby maps the sequence of encoded
 hidden states \\(\mathbf{\overline{X}}_{1:n}\\) and all previous target
 vectors

From 2f19d6a6fdafb27000db4b8c4584ad0794e9ad0d Mon Sep 17 00:00:00 2001
From: Aden Haussmann
Date: Wed, 27 Mar 2024 22:16:21 +0000
Subject: [PATCH 2/2] found another n

---
 encoder-decoder.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/encoder-decoder.md b/encoder-decoder.md
index ef72bec0f7..1a7b5f3aa2 100644
--- a/encoder-decoder.md
+++ b/encoder-decoder.md
@@ -352,7 +352,7 @@ mapping.
 
 Similar to RNN-based encoder-decoder models, the transformer-based
 encoder-decoder models define a conditional distribution of target
-vectors \\(\mathbf{Y}_{1:n}\\) given an input sequence \\(\mathbf{X}_{1:n}\\):
+vectors \\(\mathbf{Y}_{1:m}\\) given an input sequence \\(\mathbf{X}_{1:n}\\):
 
 $$ p_{\theta_{\text{enc}}, \theta_{\text{dec}}}(\mathbf{Y}_{1:m} | \mathbf{X}_{1:n}).