Reproducibility on Chameleon: Trovi meets YouTube

The Chameleon team is pleased to announce that several short (roughly five-minute) YouTube videos have been added to our channel, explaining how to launch a notebook from Trovi, provision resources, and run some example experiments! Each video pairs with a Trovi experiment: the author who packaged the experiment talks through the experiment itself and shows how to use Chameleon with Jupyter.

If you’ve never worked with Jupyter Notebook before, you can follow along with the videos, which explain each step in simple terms. If you teach a class, you can use the videos and Trovi experiments as quick tutorials to get students started experimenting on Chameleon.

Read on to learn more about the experiments currently available, how to access them, and what you need to do to begin experimenting! At the end of this blog, you’ll find steps, complete with pictures, to guide you through launching an experiment on Trovi with Jupyter Notebook.

 

Trovi Experiments with YouTube Videos

 

  1. Tiny-Tail Flash: Near Perfect Elimination of Garbage Collection Tail Latencies in NAND SSDs Reproduction 

Princeton University PhD student Nanqinqin Li packaged this Jupyter notebook, available on Trovi, which reproduces the Dev Tools Release experiment from this paper. The author has also recorded a video of himself talking through the notebook, launching it from Trovi, and provisioning resources.

Experiment Estimated Time: 30 minutes

Run the notebook yourself: https://bit.ly/3hdiBX4

Watch on YouTube: Reproducing the DTRS experiment from tinyTailFlash on Chameleon with Jupyter

 

  2. LinnOS: Predictability on Unpredictable Flash Storage with a Light Neural Network

This experiment, featured in the October 2020 user experiments blog post, with the accompanying paper published at OSDI ’20, is available on Trovi as a fully packaged notebook. Experiment with the notebook to create end-to-end baseline and LinnOS workflows. The author has also recorded a video of himself talking through the notebook, launching it from Trovi, and provisioning resources.

Experiment Estimated Time: > 1 hour 

Run the notebook yourself: https://bit.ly/2T5SOrT

Watch on YouTube: Reproducing the LinnOS experiment on Chameleon with Jupyter

 

  3. Image Classification with AlexNet on Stanford Dogs Dataset

This machine learning experiment, packaged as a Jupyter notebook, is designed to be run with tools available within Chameleon and OpenStack. The notebook is available on Trovi; it reproduces the AlexNet experiment from this paper and applies it to the Stanford Dogs dataset. The author has also recorded a video of herself talking through the notebook, launching it from Trovi, and provisioning resources.

Experiment Estimated Time: 1 hour 

Run the notebook yourself: https://bit.ly/33jR1kk

Watch on YouTube: Reproducing AlexNet with Stanford Dogs Dataset on Chameleon using Jupyter

 

  4. Accuracy Levels with DAWNBench and TensorFlow End-to-End Training

This packaged notebook reproduces Figure 1 of the DAWNBench experiment using TensorFlow. Created at Stanford University, DAWNBench introduces a benchmark and competition that focuses on the end-to-end training time a model needs to reach a fixed accuracy level. The graph reproduced in this experiment illustrates the relationship between the accuracy level and the end-to-end training time for three different batch sizes. The author has also recorded a video of himself talking through the notebook, launching it from Trovi, and provisioning resources.

Experiment Estimated Time: 24 hours

Run the notebook yourself: https://bit.ly/2U8VNjg

Watch on YouTube: Reproducing Figure 1 of DAWNBench experiment with Tensorflow on Chameleon with Jupyter
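
To make the DAWNBench metric described above concrete, here is a minimal Python sketch of "time to reach a fixed accuracy level." It is not taken from the packaged notebook: the function name, the log format, and the numbers are all made up for illustration.

```python
from typing import Optional, Sequence, Tuple


def time_to_accuracy(
    log: Sequence[Tuple[float, float]], target: float
) -> Optional[float]:
    """Return the elapsed time of the first checkpoint that meets the target.

    `log` is a sequence of (elapsed_hours, accuracy) pairs in time order.
    Returns None if the target accuracy is never reached.
    """
    for elapsed_hours, accuracy in log:
        if accuracy >= target:
            return elapsed_hours
    return None


# Hypothetical checkpoints, not real DAWNBench results.
example_log = [(1.0, 0.62), (2.0, 0.81), (3.0, 0.90), (4.0, 0.94)]

print(time_to_accuracy(example_log, 0.93))  # prints 4.0
```

Running the same function over one log per batch size would give the kind of accuracy-versus-training-time comparison that the reproduced figure plots.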

 

Chameleon Quick Start:

If you’re unfamiliar with using Jupyter Notebook to provision resources, the Chameleon team has created a few notebooks designed to introduce this workflow and the various capabilities available. You can also read the documentation on using Chameleon with Jupyter Notebook.

 

OpenFlow Quick Start Example

This artifact is designed to help you get started using OpenFlow on Chameleon and can be used as a base for OpenFlow experiments or advanced network appliances.

 

Jupyter Usage Metric Exploration

This notebook is an example data analysis notebook that looks at usage patterns on Chameleon. Feel free to use it as a model for your own data analysis needs.

 

Power Management Experiment Example

This example illustrates how to create a reproducible experiment in power management and describes tools available within the Chameleon base images, as well as the orchestration and snapshot capabilities.

 

Other Experiments Reproducible on Chameleon

These experiments follow the same structure as the previous experiments, but the authors haven’t filmed an accompanying video (yet!). 

 

  1. Training Convolutional Neural Networks with PyTorch on MNIST Dataset

 

This packaged notebook reproduces a simple benchmark experiment that trains a convolutional neural network with the MNIST dataset using PyTorch. The MNIST dataset contains 60,000 training images and 10,000 testing images of hand-written digits.

 

Experiment Estimated Time: 45 minutes

 

  2. Training Convolutional Neural Networks with TensorFlow on MNIST Dataset

 

This packaged notebook reproduces a simple benchmark experiment that trains a convolutional neural network with the MNIST dataset using TensorFlow. This notebook serves as a way to get hands-on experience with Chameleon and the basics of Machine Learning. The MNIST dataset contains 60,000 training images and 10,000 testing images of hand-written digits.

 

Experiment Estimated Time: 45 minutes

 

  3. Using FlyMC to Explore Data Center and Cloud System Bugs

 

FlyMC is a fast and scalable testing approach for data center and cloud systems like Cassandra, Hadoop, Spark, and ZooKeeper, developed by the University of Chicago’s systems research group, UCARE. In this packaged notebook, you’ll use FlyMC to capture the Cassandra bug-5925, step by step. There are also options to try other Cassandra, ZooKeeper, Spark, and MapReduce bugs.

 

Experiment Estimated Time: 1 hour

 

  4. Image Classification with Network-in-Network model on MNIST Dataset

 

This packaged notebook illustrates how to create a reproducible experiment using machine learning libraries, models, and datasets and describes tools available within Chameleon and OpenStack. It replicates a Network in Network model implementation found on Kaggle and reproduces the original Network in Network model.

 

Experiment Estimated Time: 1 hour

 

 

Launching an Experiment from Trovi: All You Need to Know

 

One of the experiments has piqued your interest. Now what? All it takes is three simple steps to get started experimenting!

 

  1. After you click on any of the packaged notebook links mentioned in this blog, you’ll be taken to its shared Trovi page. From here, click ‘Launch on Chameleon’. If you aren’t logged into Chameleon, you’ll have to log in, but then the Jupyter Notebook will begin to load.


 


 

  2. Once the Jupyter Notebook loads, you’ll see all the files associated with that experiment. On some experiments, you might have to click through folders first to navigate to the folder with the experimental files.

 

Most of the experiments contain a reservation script, GPU or CPU setup scripts, a YAML setup file, the experiment in a Python file, and ‘run_experiment’ and analysis Python notebooks.

 

  3. To run the experiment, open the Python notebook named run_experiment (or a variation of that name) and begin running cells.

 

 

Other Notes:

  • Replace the project name in the notebook with the name of a project you belong to that has an active allocation.

  • You can change the Chameleon site that provides your resources by setting OS_REGION_NAME to "CHI@TACC", "CHI@UC", or "CHI@NU".
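
As a rough illustration of the two notes above, the Python sketch below sets a project and a site through environment variables. OS_REGION_NAME and the three site names come from this post; the helper name `configure_chameleon`, the use of OS_PROJECT_NAME (the usual OpenStack variable for the project), and the placeholder project name "CHI-XXXXXX" are assumptions. The packaged notebooks may set these values differently, so treat this as a sketch rather than the notebooks' actual code.

```python
import os

# Site names listed in the note above.
KNOWN_SITES = ("CHI@TACC", "CHI@UC", "CHI@NU")


def configure_chameleon(project_name: str, site: str) -> dict:
    """Set the project and site environment variables (hypothetical helper)."""
    if not project_name:
        raise ValueError("project_name must not be empty")
    if site not in KNOWN_SITES:
        raise ValueError(f"unknown site {site!r}; expected one of {KNOWN_SITES}")

    # Assumption: the project is read from OS_PROJECT_NAME.
    os.environ["OS_PROJECT_NAME"] = project_name
    os.environ["OS_REGION_NAME"] = site
    return {"OS_PROJECT_NAME": project_name, "OS_REGION_NAME": site}


# "CHI-XXXXXX" is a placeholder; use a project with an active allocation.
settings = configure_chameleon("CHI-XXXXXX", "CHI@UC")
print(settings["OS_REGION_NAME"])  # prints CHI@UC
```

Checking the site name before setting it catches typos such as "CHI@UCC" early, before any resources are requested.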

