
How large language models work, a visual intro to transformers

Oct 27, 2024 · 15m 31s
Description

This episode explores the inner workings of large language models (LLMs) like ChatGPT, focusing on the transformer architecture. The speaker starts by defining what LLMs are and how they use pre-trained transformers to generate text. The main focus is the attention mechanism, which lets an LLM relate the words in a sentence to one another and interpret each word in context. The video takes a visual approach and uses simple analogies to explain complex concepts. It also briefly discusses the embedding process, which maps words to numerical vectors, and the softmax function, which normalizes a list of scores into a probability distribution.
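As a rough illustration of the ideas named in the description, here is a minimal sketch of softmax and single-query attention in plain Python. It is not taken from the episode: the tiny 2-dimensional embeddings, the token list, and the function names are hypothetical, and the raw embeddings stand in for the learned query, key, and value projections that a real transformer would apply.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights and the weighted average of the values.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, output

# Hypothetical embeddings, chosen by hand for illustration only
embeddings = {"the": [0.1, 0.0], "river": [0.9, 0.2], "bank": [0.8, 0.3]}
tokens = ["the", "river", "bank"]
vectors = [embeddings[t] for t in tokens]

# Assumption: embeddings are reused as queries, keys, and values
weights, context = attention(embeddings["bank"], vectors, vectors)
```

With these made-up numbers, "bank" puts more weight on "river" than on "the", which is the kind of context sensitivity the attention mechanism is meant to provide.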
Information
Author: Alan Shore and Denise
Organization: DeepDive
