Commit 4810a7c
1 parent 6a2d48b
Commit message: save
2 files changed: +27 −17 lines


lecture7.md (+27 −17)
@@ -260,6 +260,11 @@ class: middle
 
 .center.width-100[![](figures/lec7/translation-attention.png)]
 
+???
+
+- Source = English
+- Target = French
+
 ---
 
 class: middle
@@ -397,18 +402,13 @@ or $O(logk(n))$ in the case of dilated convolutions [18], increasing the length
 between any two positions in the network. Convolutional layers are generally more expensive than
 recurrent layers, by a factor of $k$.
 
-As side benefit, self-attention could yield more interpretable models. We inspect attention distributions
-from our models and present and discuss examples in the appendix. Not only do individual attention
-heads clearly learn to perform different tasks, many appear to exhibit behavior related to the syntactic
-and semantic structure of the sentences.
-
 ---
 
 class: middle
 
 ## A toy example
 
-To illustrate the behavior of the attention mechanism, we consider a toy problem with 1D sequences composed of two triangular and two rectangular patterns. The target sequence averages the heights in each pair of shapes.
+To illustrate the behavior of the attention mechanism, we consider a toy problem with 1d sequences composed of two triangular and two rectangular patterns. The target sequence averages the heights in each pair of shapes.
 
 .center.width-100[![](figures/lec7/toy1.png)]
 
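For reference, the attention mechanism this toy example exercises can be sketched in a few lines. The NumPy snippet below is an editorial illustration, not part of the committed slides; the helper names are hypothetical. It implements scaled dot-product attention, $\text{softmax}(QK^T/\sqrt{d})\,V$:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries of dimension 8
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values
out = attention(Q, K, V)
print(out.shape)  # one output row per query
```

Each output row is a convex combination of the value rows, with weights given by the softmax over query-key similarities.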

@@ -443,6 +443,8 @@ $$\begin{aligned}
 \end{aligned}$$
 for any permutation $\sigma$ of the key-value pairs.
 
+(It is also permutation-equivariant with permutation $\sigma$ of the queries.)
+
 ---
 
 class: middle
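Both properties added in this hunk are easy to check numerically. A minimal sketch (editorial, not part of the commit; `attention` is a hypothetical helper implementing scaled dot-product attention): permuting key-value pairs together leaves the output unchanged, while permuting the queries permutes the outputs the same way.

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(42)
Q = rng.normal(size=(3, 5))
K = rng.normal(size=(7, 5))
V = rng.normal(size=(7, 5))

sigma = rng.permutation(7)  # a permutation of the key-value pairs
# Permutation invariance: shuffling keys and values together is a no-op.
assert np.allclose(attention(Q, K, V), attention(Q, K[sigma], V[sigma]))

tau = rng.permutation(3)    # a permutation of the queries
# Permutation equivariance: shuffled queries give identically shuffled outputs.
assert np.allclose(attention(Q, K, V)[tau], attention(Q[tau], K, V))
```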
@@ -527,10 +529,6 @@ The encoders start by processing the input sequence. The output of the top encod
 
 .footnote[Credits: Jay Alammar, [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/).]
 
-???
-
-R: Check UDL
-
 ---
 
 class: middle
@@ -543,10 +541,6 @@ The output of each step is fed to the bottom decoder in the next time step, and
 
 .footnote[Credits: Jay Alammar, [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/).]
 
-???
-
-R: Check UDL
-
 ---
 
 class: middle
@@ -674,6 +668,16 @@ Large models also enjoy better sample efficiency than small models.
 
 ---
 
+class: middle
+
+## Conversational agents
+
+.center.width-70[![](./figures/lec7/chatgpt.png)]
+
+All modern conversational agents are based on the same transformer models, scaled up to billions of parameters, trillions of training tokens, and thousands of petaflop/s-days of compute.
+
+---
+
 class: middle
 count: false
 
@@ -709,8 +713,6 @@ class: middle
 
 Just like text transformers, vision transformers learn representations of the input image that can be used for various tasks, such as image classification, object detection, and image generation.
 
-
-
 ---
 
 class: middle
@@ -719,7 +721,15 @@ class: middle
 
 .center.width-100[![](./figures/lec7/sam2.png)]
 
-.center[Segment anything (Kirillov et al., 2024) combines a vision transformer with a prompt encoder to produce masks with a transformer-based decoder.]
+.center[Segment anything (Kirillov et al., 2023) combines a vision transformer with a prompt encoder to produce masks with a transformer-based decoder.]
+
+---
+
+class: middle, center, black-slide
+
+<iframe width="600" height="450" src="https://www.youtube.com/embed/oYUcl_cqKcs" frameborder="0" allowfullscreen></iframe>
+
+Segment anything (Kirillov et al., 2023)
 
 ---
 

pdf/lec7.pdf (246 KB): Binary file not shown.
