Baechi-PyTorch: Automated Model Parallelism in PyTorch

What? To train large DNNs on GPUs with limited memory, the model must be split across multiple devices - this is Model Parallelism. Placement also affects speed: training time can be reduced by distributing parallel branches of the model across the devices.
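For context, this is what manual model parallelism looks like in plain PyTorch. The split point, layer sizes, and device choices below are arbitrary and purely illustrative:

```python
import torch
import torch.nn as nn

# Manual model parallelism: the user decides which layers live on which GPU
# and moves activations between devices by hand.
class ManuallySplitNet(nn.Module):
    def __init__(self):
        super().__init__()
        # First part of the network on GPU 0
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        # Second part on GPU 1
        self.part2 = nn.Linear(4096, 10).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Activations must be explicitly transferred to the next device
        x = self.part2(x.to("cuda:1"))
        return x

model = ManuallySplitNet()
out = model(torch.randn(8, 1024))   # output lives on cuda:1
```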

Why? Today, this placement process is manual and largely based on heuristics, as we demonstrate here (Section 1.2).

How? In Baechi, we adopt an algorithmic approach to the placement problem for running DNN training graphs on a small cluster of memory-constrained devices. Baechi-PyTorch automatically and optimally splits the model, given the number of GPU devices and their memory capacities.
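To make the placement problem concrete, here is a highly simplified toy sketch: layers are assigned to devices in topological order, moving on to the next device when the current one's memory budget is exhausted. This is not Baechi's actual algorithm (Baechi uses the memory-constrained placement algorithms described in the paper); it only illustrates the kind of decision being automated.

```python
# Toy illustration of device placement under per-device memory limits.
# NOT Baechi's algorithm - just a sketch of the placement problem.

def toy_placement(layer_mem, device_caps):
    """layer_mem: per-layer memory cost, in topological order.
    device_caps: memory capacity of each device.
    Returns a list mapping layer index -> device index."""
    placement, dev, used = [], 0, 0.0
    for mem in layer_mem:
        while used + mem > device_caps[dev]:   # current device is full
            dev += 1                           # move to the next device
            used = 0.0
            if dev >= len(device_caps):
                raise RuntimeError("model does not fit on the given devices")
        placement.append(dev)
        used += mem
    return placement

# Example: 6 layers, two devices with 4 GB each -> [0, 0, 0, 1, 1, 1]
print(toy_placement([1.5, 1.0, 1.2, 0.8, 2.0, 1.0], [4.0, 4.0]))
```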

Please find the design and usage information for Baechi-PyTorch here: link

The TensorFlow implementation of Baechi can be found here: Baechi.
The corresponding paper was presented at SoCC 2020.

A draft of the extended version of the Baechi paper is here (currently under review).

For any queries, suggestions, etc., please feel free to reach out at cshetty2@illinois.edu.
