Re: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

This project is part of the UCSC OSPO Summer of Reproducibility fellowship. It aims to create an interactive notebook for teaching undergraduate and graduate students about different levels of reproducibility in computer vision research.

The project is based on the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Dosovitskiy et al., which applies the transformer architecture, originally designed for natural language processing, to image recognition. The paper shows that transformers can achieve state-of-the-art results on image classification benchmarks such as ImageNet when pre-trained on large-scale datasets.
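The "16x16 words" in the paper's title refer to fixed-size image patches that are flattened and fed to the transformer as a sequence, much like word tokens in a sentence. The following is a minimal sketch of that patch-splitting step in plain Python. It is not code from the artifact's notebook: the function names `patch_grid` and `split_into_patches` are hypothetical, and the 224x224 RGB input size is an assumption chosen for illustration.

```python
def patch_grid(height, width, patch=16):
    """Return (rows, cols) of the patch grid for an image of the given size."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    return height // patch, width // patch


def split_into_patches(image, patch=16):
    """Split an image given as nested lists [H][W][C] into flattened patches.

    Patches are returned in row-major order (left to right, top to bottom).
    Each patch is a flat list of patch * patch * C values.
    """
    height, width = len(image), len(image[0])
    rows, cols = patch_grid(height, width, patch)
    patches = []
    for r in range(rows):
        for c in range(cols):
            flat = []
            for y in range(r * patch, (r + 1) * patch):
                for x in range(c * patch, (c + 1) * patch):
                    flat.extend(image[y][x])
            patches.append(flat)
    return patches


# Hypothetical all-black 224x224 RGB image (size assumed for illustration).
image = [[[0, 0, 0] for _ in range(224)] for _ in range(224)]
patches = split_into_patches(image)
print(len(patches), len(patches[0]))  # 196 768
```

Under these assumptions the image becomes a sequence of 196 "words" (a 14x14 grid), each with 16 * 16 * 3 = 768 values. In the actual model, each flattened patch would then pass through a learned linear projection and receive a position embedding before entering the transformer encoder; that part is omitted here.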

20 10 6 1 Sep. 20, 2023, 3:20 PM


Launch on Chameleon

Launching this artifact will open it within Chameleon’s shared Jupyter experiment environment, which is accessible to all Chameleon users with an active allocation.

Download Archive

Download an archive containing the files of this artifact.

Download with git

Clone the git repository for this artifact and check out the version's commit:

git clone
# cd into the created directory
git checkout d17897de3ee0ca27790d9d3f6682c4a63ae6fcf7

Submit feedback through GitHub issues
