ICML2021

Boosting the Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size

Jack Kosaian, Amar Phanishayee, Matthai Philipose, Debadeepta Dey, Rashmi Vinayak


Abstract

A. Datasets

A.1. Datasets and models used in game-scraping application

This section provides details on the datasets and models used in the production video-game-scraping workload described in §2. The images in each dataset represent the style of text that will appear in a particular portion of a game screen, which will be used in a downstream event detection pipeline.

Dataset generation. The location and style of relevant text in a particular video game may differ from stream to stream. To avoid the need to manually label streams, the game-scraping application generates synthetic datasets for training, validation, and testing. Specifically, the text that appears in the images of a particular dataset follows a predefined structure. For example, the text appearing in images of the V1 task is of the form "XY.Zk", where X, Y, and Z each represent a digit 0 through 9, and k is the string literal "k". From these specifications, examples that match a given class of a particular dataset can be generated. For example, V1 classifies the Z digit in the specification above, and might generate "67.8k" and "04.8k" as instances of this specification for class "8".

Once an instance of a specification has been constructed, an image containing this text is generated. To train a model that is robust to perturbations in text location, text font, and background color/texture, the generation process selects the font, location, and background of each generated image at random from a set of prespecified options. Figures 1-6 below show the effects of this randomization.

We now provide details of each dataset used for this task in the paper. Example images chosen randomly from the validation set of each dataset are displayed. We also describe the detailed architecture of the specialized CNN employed for each dataset. For brevity, we use the following notation

* Work done in part as an intern at Microsoft Research.
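The specification-driven generation described in A.1 can be illustrated with a minimal Python sketch for the V1 task. It is not the authors' implementation: the function names and the contents of the `FONTS`, `LOCATIONS`, and `BACKGROUNDS` option sets are hypothetical placeholders, since the paper states only that such options are prespecified. The sketch also stops short of rendering an image and returns just the text, the class label, and the randomly selected rendering options.

```python
import random

# Hypothetical option sets. The paper says only that fonts, locations, and
# backgrounds are drawn from "prespecified options"; these values are made up.
FONTS = ["font_a", "font_b", "font_c"]
LOCATIONS = [(0, 0), (4, 2), (8, 4)]  # assumed (x, y) pixel offsets of the text
BACKGROUNDS = ["bg_plain", "bg_texture_1", "bg_texture_2"]


def generate_v1_text(label, rng):
    """Return a string of the form "XY.Zk" whose Z digit equals `label`."""
    if not 0 <= label <= 9:
        raise ValueError("label must be a digit 0 through 9")
    x = rng.randint(0, 9)  # X: any digit 0 through 9
    y = rng.randint(0, 9)  # Y: any digit 0 through 9
    return f"{x}{y}.{label}k"


def generate_v1_example(label, rng):
    """Describe one synthetic example: its text, class, and rendering options."""
    return {
        "text": generate_v1_text(label, rng),
        "label": label,
        "font": rng.choice(FONTS),
        "location": rng.choice(LOCATIONS),
        "background": rng.choice(BACKGROUNDS),
    }


# Generate a few instances of class "8"; a seeded generator keeps runs repeatable.
rng = random.Random(0)
examples = [generate_v1_example(8, rng) for _ in range(3)]
for example in examples:
    print(example)
```

Each returned dictionary would then be passed to an image renderer, which is omitted here because the paper does not describe one at this point.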