Transformer Explainer: Interactive Learning of Text-Generative Models

Transformer Explainer helps users (A) visually examine how a text-generative Transformer model (GPT-2) transforms input text to predict next tokens, (B) interactively experiment in real time with key model parameters like temperature to understand prediction determinism, and (C) transition seamlessly between abstraction levels to visualize the interplay between low-level mathematical operations and high-level model structures.
Demo Video
Abstract
Transformers have revolutionized machine learning, yet their inner workings remain opaque to many. We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. Our tool helps users understand complex Transformer concepts by integrating a model overview and enabling smooth transitions across abstraction levels of mathematical operations and model structures. It runs a live GPT-2 instance locally in the user's browser, empowering users to experiment with their own input and observe in real time how the internal components and parameters of the Transformer work together to predict the next tokens. Our tool requires no installation or special hardware, broadening the public's access to education about modern generative AI techniques. Our open-sourced tool is available at https://poloclub.github.io/transformer-explainer/. A video demo is available at https://youtu.be/ECR4oAwocjs.
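The temperature parameter highlighted in the overview figure controls how deterministic next-token prediction is. The following is a minimal Python sketch of temperature-scaled softmax, the standard way temperature is applied to a model's output scores. It is not the tool's own code (the tool runs GPT-2 in the browser); the function name `softmax_with_temperature` and the logit values are hypothetical and chosen only for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw next-token scores (logits) into probabilities.

    Dividing by the temperature before the softmax sharpens the
    distribution when temperature < 1 and flattens it when temperature > 1.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate next tokens (illustrative only).
example_logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(example_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a low temperature almost all probability mass falls on the highest-scoring token, so sampling becomes close to deterministic; at a higher temperature the probabilities move closer together and sampled outputs vary more.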
Citation
Transformer Explainer: Interactive Learning of Text-Generative Models
@article{choTransformerExplainerInteractive2024,
  title = {Transformer {{Explainer}}: {{Interactive Learning}} of {{Text-Generative Models}}},
  shorttitle = {Transformer {{Explainer}}},
  author = {Cho, Aeree and Kim, Grace C. and Karpekov, Alexander and Helbling, Alec and Wang, Zijie J. and Lee, Seongmin and Hoover, Benjamin and Chau, Duen Horng},
  year = {2024},
  url = {http://arxiv.org/abs/2408.04619},
  urldate = {2024-08-12},
  archiveprefix = {arXiv},
  journal = {arXiv 2408.04619}
}