Source Control, Configuration, Build
What is Vesta?
- Source Control: keeping track of changes to source files over time; assigning version numbers as sources change
- Configuration: specifying which versions of different sources go together
- Building: processing source files using compilers and other tools to produce derived files
History
- Vesta was a research project at the Digital/Compaq Systems Research Center
- Vesta represents over 10 years of research and development
- The current version (Vesta 2) is a complete from-scratch rewrite based on what was learned from the first version
- The Alpha microprocessor group started using Vesta in 1998
- When the Alpha group was acquired by Intel in 2001, Compaq agreed to release Vesta as free software (LGPL)
- The first downloadable kit became available in early 2002
- Intel is now using Vesta in Massachusetts, California, and Bangalore, India on multiple microprocessor projects
Reinventing the Wheel?
- "make was a really good project for a college student [to implement] 20 years ago."
  -- Tim Leonard, Intel Massachusetts Microprocessor Design Center
- “CVS is a horrible unmaintainable mess.”
  -- Karl Fogel, CVS maintainer/Subversion developer
Problems With Other Systems
- Irreproducible build results
- A compiler binary, library, shared header, or other file gets upgraded: builds start producing different results
- If you can't reproduce a bug, was it fixed or masked?
- Inconsistent build results
- Build works for user A but not user B: why?
- Missing dependencies cause mysterious differences
- Explicit dependencies
- Users and hacks like makedepend get them wrong
- Some are inexpressible (e.g. “file X doesn't exist”)
- Barriers to truly incremental builds
- Incomplete dependencies cause users to periodically build from scratch (i.e. “make clean”)
- User A builds version N of source X; user B often has to perform the same build step even if it would be correct to use a partial result from user A's build
- Time-driven dependencies are imprecise
- Partially written result files
- Clock skew between hosts
- Limited build description languages
- Many tools exist to generate Makefiles from more abstract descriptions (autoconf, xmkmf, etc.)
- Slow repository operations
- Checkout and checkin are slowed down by delta compression
- Accessing old versions can be slow and awkward
- Building a configuration (aka “tagging”) can be slow
- Non-atomic repository operations
- When checkout and checkin aren't atomic with respect to each other, a user performing a checkout can get an inconsistent set of sources
- Assumptions about stored data
- Not all sources are line-oriented text
- Barriers to parallel development
- User A makes a change; user B unintentionally incorporates it into their build and has no easy path back to their previous configuration
- Poor support for distributed development (distributed clients, mirroring content in remote repositories, propagation of changes between repositories)
Vesta's Goals
- Consistent builds
- No external changes affecting builds
- User-independent builds
- Precisely repeatable builds
- Any build ever performed should be precisely repeatable forever
- Incremental builds (even between users)
- Parallel development
Vesta's Approach
- Immutable, immortal, versioned storage of all sources and tools
- Anything that contributes to the result of a build is considered a source (including compilers, system headers, libraries, etc.)
- Builds are always done from immutable versions of sources (which guarantees build repeatability)
- Sources are theoretically immortal (versions can be deleted to reclaim storage space, but once a version is created it can't be re-created with different contents)
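The immutability rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not Vesta's repository code; the `ImmutableStore` class and its method names are hypothetical.

```python
class ImmutableStore:
    """Toy versioned store: once a version exists, its contents never change."""

    def __init__(self):
        self._versions = {}    # (package, version) -> bytes
        self._retired = set()  # versions whose contents were deleted

    def create(self, package, version, contents):
        key = (package, version)
        # A version name can be used exactly once, even after deletion.
        if key in self._versions or key in self._retired:
            raise ValueError(f"{package}/{version} already exists")
        self._versions[key] = bytes(contents)

    def read(self, package, version):
        return self._versions[(package, version)]

    def delete(self, package, version):
        # Reclaims the storage but keeps the name reserved, so the version
        # can never be re-created with different contents.
        key = (package, version)
        del self._versions[key]
        self._retired.add(key)
```

Because a `(package, version)` pair always names the same bytes (or nothing at all), a build that names its inputs by version reads the same contents every time it is repeated.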
Vesta's Approach
- Complete, source-based build descriptions
- Build descriptions include all parameters that affect the result
- A build cannot depend upon any aspect of the user's personal environment
- The versions of all sources and tools to use are completely specified in the build description
- Build descriptions are versioned just like sources and tools
- Automatic dependency detection
- Dependencies are detected and recorded during each build
- Dependency detection is language and tool independent (unlike makedepend)
- Inconsistent builds due to missing dependencies are impossible
- The user doesn't have to spend time thinking about dependencies
- Automatic derived file management
- Intermediate files are managed transparently, out of the user's way (no management of "build areas" required)
- Only the target derived files (executables, libraries, formatted documentation, etc.) need to be copied outside Vesta
- Site-wide caching of all build work
- All users share each other's compilation work implicitly (you never need to perform a compile if someone else already has)
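A rough sketch of how a site-wide cache lets one user's compile satisfy another's. It assumes a cache keyed on the command plus the contents of every input. Vesta's real cache is more sophisticated; `BuildCache`, `fake_compile`, and the SHA-256 key are stand-ins chosen for this example.

```python
import hashlib


class BuildCache:
    """Toy shared cache: results are keyed by command and input contents."""

    def __init__(self):
        self._entries = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(command, inputs):
        h = hashlib.sha256()
        h.update(command.encode())
        for name in sorted(inputs):  # sorted so the key is order-independent
            h.update(b"\0" + name.encode() + b"\0")
            h.update(hashlib.sha256(inputs[name]).digest())
        return h.hexdigest()

    def run(self, command, inputs, tool):
        k = self.key(command, inputs)
        if k in self._entries:
            self.hits += 1  # someone already did this work
            return self._entries[k]
        self.misses += 1
        result = tool(inputs)
        self._entries[k] = result
        return result


def fake_compile(inputs):
    # Hypothetical stand-in for a real compiler invocation.
    return b"OBJ:" + inputs["hello.c"].upper()


site_cache = BuildCache()
sources = {"hello.c": b"int main(void) { return 0; }\n"}
obj_a = site_cache.run("cc -c hello.c", sources, fake_compile)        # user A
obj_b = site_cache.run("cc -c hello.c", dict(sources), fake_compile)  # user B
```

User B's request hits the entry created by user A, so the compile step runs only once.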
Vesta Repository
- A virtual filesystem
- Exports an NFS interface
- Provides direct access to all versions (diff them, grep across them, etc.)
- Existing versions are immutable (read-only)
- The working copy of an active checkout is mutable (read/write)
- Higher-level features are accessed through an RPC interface (using existing tools or API)
- Checkout, checkin, and most other operations are atomic and fast
- Repositories act as peers; any repository can replicate from another with read permission
- Once configured to allow it, checkout from and checkin to a remote master repository is seamless
- Binary files are not a problem; every file is treated as a sequence of bytes (i.e. like a file)
Anatomy of a Package
Checkout Process
Immutable Snapshots
Checkin Process
Build Encapsulation
- Each build step uses chroot into a temporary filesystem that exists just for that step
- Makes each step functional (well-defined inputs and outputs, no side effects) which enables caching and safe re-use
- Temporary filesystem provided through repository's NFS interface
- Provides automatic dependency detection: if a file was read, the tool depended on it, otherwise it didn't
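The "if a file was read, the tool depended on it" rule can be illustrated with an in-process simulation. Vesta observes accesses through the NFS interface of the encapsulated filesystem; the sketch below only mimics that with a hypothetical `TrackedFS` class and a toy include-expanding tool.

```python
class TrackedFS:
    """Read-only view of a file tree that records every lookup."""

    def __init__(self, files):
        self._files = dict(files)
        self.reads = {}  # path -> contents, or None if the lookup failed

    def read(self, path):
        contents = self._files.get(path)
        self.reads[path] = contents  # failed lookups are dependencies too
        if contents is None:
            raise FileNotFoundError(path)
        return contents


def find_header(fs, name, search_path):
    for directory in search_path:
        try:
            return fs.read(f"{directory}/{name}")
        except FileNotFoundError:
            continue
    raise FileNotFoundError(name)


def preprocess(fs, text, search_path):
    """Toy tool: expands #include "..." lines, nothing else."""
    out = []
    for line in text.splitlines():
        if line.startswith('#include "'):
            name = line.split('"')[1]
            header = find_header(fs, name, search_path)
            out.append(preprocess(fs, header, search_path))
        else:
            out.append(line)
    return "\n".join(out)


fs = TrackedFS({
    "src/main.c": '#include "util.h"\nint main(void) { return f(); }',
    "include/util.h": "int f(void);",
    "include/unused.h": "int g(void);",
})
result = preprocess(fs, fs.read("src/main.c"), ["src", "include"])
```

After the run, `fs.reads` holds `src/main.c`, `include/util.h`, and a failed lookup of `src/util.h`, while `include/unused.h` never appears. The failed lookup is the kind of "file X doesn't exist" dependency that explicit dependency lists cannot express: creating `src/util.h` later would change the result.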
Build Language
- The Vesta System Description Language (SDL) is a full functional programming language
- Most users don't need to know the language in detail, because a library of general build functions is available for common tasks like:
- Build an executable from these sources linked against these libraries
- Process this file with (f)lex to produce C source
- Build a library from these sources
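To give a feel for what "functional" means here, the following is a Python analogy, not SDL syntax. The helper names `compile_c`, `library`, and `program` are invented for this sketch, and the "derived files" are just strings describing what a real tool would produce.

```python
def compile_c(name, text):
    # Pure function: the result depends only on the arguments.
    return {name.rsplit(".", 1)[0] + ".o": f"obj({text})"}


def library(name, sources):
    """Build a library from these sources."""
    objs = {}
    for src, text in sorted(sources.items()):
        objs.update(compile_c(src, text))
    return {f"lib{name}.a": "ar(" + ",".join(sorted(objs)) + ")"}


def program(name, sources, libs):
    """Build an executable from these sources linked against these libraries."""
    objs = {}
    for src, text in sorted(sources.items()):
        objs.update(compile_c(src, text))
    inputs = sorted(objs) + sorted(libs)
    return {name: "ld(" + ",".join(inputs) + ")"}


libutil = library("util", {"util.c": "int f(void) { return 0; }"})
hello = program("hello", {"hello.c": "int main(void) { return f(); }"}, libutil)
```

Each function takes its sources as explicit arguments and returns its derived files as a value, with no hidden state. That property is what makes results safe to cache and reuse.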
Parallel & Multi-OS Builds
- Builder contacts the RunToolServer daemon over the network to run tools
- Builder is multi-threaded and can run individual build steps on multiple hosts
- Builder can run tools on hosts of a different OS, even simultaneously
- With a single command, build for multiple target platforms
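A minimal sketch of a multi-threaded builder farming independent steps out to several hosts. The `run_tool` function is a local stand-in for the network request to a tool server, and the host names are made up; nothing here reflects the actual RunToolServer protocol.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def run_tool(host, step):
    # Stand-in for asking the tool server on `host` to run `step`.
    return (step, host)


def build(steps, hosts, max_workers=4):
    # Assign independent steps to hosts round-robin and run them concurrently.
    assignments = list(zip(steps, cycle(hosts)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_tool, host, step) for step, host in assignments]
        return [f.result() for f in futures]  # results in submission order


results = build(["a.c", "b.c", "c.c"], ["linux-1", "tru64-1"])
```

This only works safely because each step is encapsulated: steps with no side effects can run anywhere, in any order, without interfering with one another.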
Content-based Fingerprinting
- Source and derived files are identified by a 128-bit fingerprint for dependency analysis
- A fingerprint is essentially a checksum of the file's contents
- If a source has not changed or is edited back to be identical to any previous version, both the repository and the build system know.
- If a result file (e.g. an object file produced by a compiler) is identical to any previous one, the build system knows.
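A small illustration of content-based identification. It uses a 128-bit BLAKE2b digest purely as a stand-in with the same width; it is not Vesta's fingerprint algorithm.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # 16-byte (128-bit) digest; a stand-in for Vesta's own fingerprint function.
    return hashlib.blake2b(data, digest_size=16).hexdigest()


v1 = b"int x = 1;\n"
v2 = b"int x = 2;\n"  # an edit
v3 = b"int x = 1;\n"  # edited back to the original contents

changed = fingerprint(v1) != fingerprint(v2)
restored = fingerprint(v1) == fingerprint(v3)
```

A timestamp comparison would treat `v3` as newer than `v1` and trigger a rebuild. Comparing fingerprints shows that the contents are identical, so earlier results can be reused.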
Scalability
- Compaq and Intel microprocessor design groups have been using Vesta for over 6 years
- At the largest Intel site:
- Over 300 users
- Manages tens of GB of sources and hundreds of GB of derived data
- Individual builds have over 2 GB of sources
- 500-800 individual builds done per day
- Repository can out-perform OS native NFS server
- Hundreds of hosts used for parallel builds
Work in Progress
- New language bindings using SWIG
- Currently: Perl, Python, Tcl, Java
- “Triggers” : scriptable hooks for repository tools
- Send e-mail on checkout/checkin
- Verify some property before checkin
Coming Soon
- Ports to new platforms
- Make-based source kit autogenerated by Vesta builder
- Exploring adding autoconf support
- Next targets: Solaris, Mac OS X, FreeBSD
- Better support for merging parallel changes
- Designing a pluggable merging system to support multiple merge tools
Thanks To...
- Digital/Compaq : developing Vesta and releasing it as free software
- Compaq : donated hardware/software
- Enertron LLC : co-lo space
- SourceForge : hosting
- Intel : donating hardware and continuing support
- Ken Schalk: slides and demo, not to mention running the Vesta project.