Have you ever copy-pasted chunks of utility code between projects, resulting in multiple versions of the same code living in different repositories? Or, perhaps, you had to make pull requests to tens of projects after the name of the GCP bucket in which you store your data was updated?
Situations like these arise way too often in ML teams, and their consequences vary from a single developer's annoyance to the team's inability to ship their code as needed. Luckily, there's a remedy.
Let's dive into the world of monorepos, an architecture widely adopted in major tech companies like Google, and how they can enhance your ML workflows. A monorepo offers a plethora of advantages which, despite some drawbacks, make it a compelling choice for managing complex machine learning ecosystems.
We'll briefly debate monorepos' merits and demerits, examine why they are an excellent architecture choice for machine learning teams, and peek into how Big Tech is using them. Finally, we'll see how to harness the power of the Pants build system to organize your machine learning monorepo into a robust CI/CD build system.
Strap in as we embark on this journey to streamline your ML project management.
This article was first published on the neptune.ai blog.
A monorepo (short for monolithic repository) is a software development strategy where code for many projects is stored in the same repository. The idea can be as broad as all of a company's code, written in a variety of programming languages, stored together (did somebody say Google?) or as narrow as a couple of Python projects developed by a small team thrown into a single repository.
In this blog post, we focus on repositories storing machine learning code.