39 changes: 38 additions & 1 deletion README.md
@@ -278,12 +278,49 @@ This is most useful when using sccache for Rust compilation, as rustc supports u

---

Normalizing Paths with `SCCACHE_BASEDIRS`
-----------------------------------------

By default, sccache requires absolute paths to match for cache hits. To enable cache sharing across different build directories, you can set `SCCACHE_BASEDIRS` to strip a base directory from paths before hashing:

```bash
export SCCACHE_BASEDIRS=/home/user/project
```

You can also specify multiple base directories, separated by `;` on Windows hosts and by `:` on any other operating system. When multiple directories are provided, the longest matching prefix is used:

```bash
export SCCACHE_BASEDIRS="/home/user/project:/home/user/workspace"
```

Path matching is **case-insensitive** on Windows and **case-sensitive** on other operating systems.
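
To make the longest-prefix rule concrete, here is a minimal Rust sketch of the rewrite described above. It is illustrative only, not sccache's implementation: the `normalize_path` helper is hypothetical, the `./` prefix mirrors the example in `docs/Configuration.md`, and case-insensitive Windows matching is omitted for brevity.

```rust
/// Illustrative sketch of the longest-matching-prefix rewrite.
/// Real matching is case-insensitive on Windows; this sketch is not.
fn normalize_path(path: &str, basedirs: &[&str]) -> String {
    // Among all configured base directories that prefix `path`,
    // pick the longest one.
    let best = basedirs
        .iter()
        .copied()
        .filter(|base| path.starts_with(*base))
        .max_by_key(|base| base.len());
    match best {
        // Strip the prefix so the hashed path is root-independent.
        Some(base) => format!(".{}", &path[base.len()..]),
        // No configured base matches: hash the absolute path as-is.
        None => path.to_owned(),
    }
}

fn main() {
    let basedirs = ["/home/user", "/home/user/project"];
    // The longer of the two matching prefixes wins.
    assert_eq!(
        normalize_path("/home/user/project/src/main.c", &basedirs),
        "./src/main.c"
    );
    // Paths outside every base directory are left untouched.
    assert_eq!(normalize_path("/opt/other/a.c", &basedirs), "/opt/other/a.c");
}
```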

This is similar to ccache's `CCACHE_BASEDIR` and helps when:
* Building the same project from different directories
* Sharing cache between CI jobs with different checkout paths
* Multiple developers building under different home directory paths
* Working with multiple project checkouts simultaneously

**Note:** Only absolute paths are supported. Relative paths will prevent the server from starting.
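
As an illustration of the separator and absolute-path rules above, here is a hedged Rust sketch of how such a value could be parsed and validated. The `parse_basedirs` helper and its error type are hypothetical; this is not sccache's own parser.

```rust
use std::path::Path;

/// Illustrative sketch: split an SCCACHE_BASEDIRS-style value on the
/// documented platform separator and reject relative entries, mirroring
/// the "server refuses to start" behavior described above.
fn parse_basedirs(value: &str) -> Result<Vec<String>, String> {
    // ';' on Windows hosts, ':' on any other operating system.
    let sep = if cfg!(windows) { ';' } else { ':' };
    value
        .split(sep)
        .filter(|entry| !entry.is_empty())
        .map(|entry| {
            if Path::new(entry).is_absolute() {
                Ok(entry.to_owned())
            } else {
                Err(format!("SCCACHE_BASEDIRS entry is not absolute: {entry}"))
            }
        })
        .collect()
}

fn main() {
    // These checks assume a Unix host (':' separator, '/'-rooted paths).
    #[cfg(unix)]
    {
        assert!(parse_basedirs("/home/user/project:/home/user/workspace").is_ok());
        assert!(parse_basedirs("relative/path").is_err());
    }
}
```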

You can also configure this in the sccache config file:

```toml
# Single directory
basedirs = ["/home/user/project"]

# Or multiple directories
basedirs = ["/home/user/project", "/home/user/workspace"]
```

---

Known Caveats
-------------

### General

-* Absolute paths to files must match to get a cache hit. This means that even if you are using a shared cache, everyone will have to build at the same absolute path (i.e. not in `$HOME`) in order to benefit each other. In Rust this includes the source for third party crates which are stored in `$HOME/.cargo/registry/cache` by default.
+* By default, absolute paths to files must match to get a cache hit. To work around this, use `SCCACHE_BASEDIRS` (see above) to normalize paths before hashing.

### Rust

21 changes: 21 additions & 0 deletions docs/Configuration.md
@@ -6,6 +6,26 @@
# If specified, wait this long for the server to start up.
server_startup_timeout_ms = 10000

# Base directories to strip from source paths during cache key
# computation.
#
# Similar to ccache's CCACHE_BASEDIR, but supports multiple paths.
#
# 'basedirs' enables cache hits across different absolute root
# paths when compiling the same source code, such as between
# parallel checkouts of the same project, Git worktrees, or different
# users in a shared environment.
# When multiple configured paths match, the longest prefix
# is used.
#
# Path matching is case-insensitive on Windows and case-sensitive on other OSes.
#
# Example:
# basedirs = ["/home/user/project"] results in the path prefix rewrite:
# "/home/user/project/src/main.c" -> "./src/main.c"
basedirs = ["/home/user/project"]
# basedirs = ["/home/user/project", "/home/user/workspace"]

[dist]
# where to find the scheduler
scheduler_url = "http://1.2.3.4:10600"
@@ -134,6 +154,7 @@ Note that some env variables may need sccache server restart to take effect.

* `SCCACHE_ALLOW_CORE_DUMPS` to enable core dumps by the server
* `SCCACHE_CONF` configuration file path
* `SCCACHE_BASEDIRS` base directory (or directories) to strip from paths for cache key computation. This is similar to ccache's `CCACHE_BASEDIR` and enables cache hits across different absolute paths when compiling the same source code. Multiple directories can be separated by `;` on Windows hosts and by `:` on any other operating system. When multiple directories are specified, the longest matching prefix is used. Path matching is **case-insensitive** on Windows and **case-sensitive** on other operating systems. The environment variable takes precedence over the file configuration. Only absolute paths are supported; relative paths will cause an error and prevent the server from starting.
* `SCCACHE_CACHED_CONF`
* `SCCACHE_IDLE_TIMEOUT` how long the local daemon process waits for more client requests before exiting, in seconds. Set to `0` to run sccache permanently
* `SCCACHE_STARTUP_NOTIFY` specify a path to a socket which will be used for server completion notification
128 changes: 115 additions & 13 deletions src/cache/cache.rs
@@ -381,6 +381,10 @@ pub trait Storage: Send + Sync {
// Enable by default, only in local mode
PreprocessorCacheModeConfig::default()
}
/// Return the base directories for path normalization if configured
fn basedirs(&self) -> &[Vec<u8>] {
&[]
}
/// Return the preprocessor cache entry for a given preprocessor key,
/// if it exists.
/// Only applicable when using preprocessor cache mode.
@@ -453,6 +457,38 @@ impl PreprocessorCacheModeConfig {
}
}

/// Wrapper for opendal::Operator that adds basedirs support
#[cfg(any(
feature = "azure",
feature = "gcs",
feature = "gha",
feature = "memcached",
feature = "redis",
feature = "s3",
feature = "webdav",
feature = "oss",
))]
pub struct RemoteStorage {
operator: opendal::Operator,
basedirs: Vec<Vec<u8>>,
}

#[cfg(any(
feature = "azure",
feature = "gcs",
feature = "gha",
feature = "memcached",
feature = "redis",
feature = "s3",
feature = "webdav",
feature = "oss",
))]
impl RemoteStorage {
pub fn new(operator: opendal::Operator, basedirs: Vec<Vec<u8>>) -> Self {
Self { operator, basedirs }
}
}

/// Implement storage for operator.
#[cfg(any(
feature = "azure",
@@ -462,11 +498,12 @@ impl PreprocessorCacheModeConfig {
feature = "redis",
feature = "s3",
feature = "webdav",
feature = "oss",
))]
#[async_trait]
-impl Storage for opendal::Operator {
+impl Storage for RemoteStorage {
async fn get(&self, key: &str) -> Result<Cache> {
-match self.read(&normalize_key(key)).await {
+match self.operator.read(&normalize_key(key)).await {
Ok(res) => {
let hit = CacheRead::from(io::Cursor::new(res.to_bytes()))?;
Ok(Cache::Hit(hit))
@@ -482,7 +519,9 @@ impl Storage for opendal::Operator {
async fn put(&self, key: &str, entry: CacheWrite) -> Result<Duration> {
let start = std::time::Instant::now();

-self.write(&normalize_key(key), entry.finish()?).await?;
+self.operator
+.write(&normalize_key(key), entry.finish()?)
+.await?;

Ok(start.elapsed())
}
@@ -493,7 +532,7 @@
let path = ".sccache_check";

// Read is required, return error directly if we can't read .
-match self.read(path).await {
+match self.operator.read(path).await {
Ok(_) => (),
// Read not exist file with not found is ok.
Err(err) if err.kind() == ErrorKind::NotFound => (),
@@ -512,7 +551,7 @@
Err(err) => bail!("cache storage failed to read: {:?}", err),
}

-let can_write = match self.write(path, "Hello, World!").await {
+let can_write = match self.operator.write(path, "Hello, World!").await {
Ok(_) => true,
Err(err) if err.kind() == ErrorKind::AlreadyExists => true,
// Tolerate all other write errors because we can do read at least.
@@ -534,7 +573,7 @@
}

fn location(&self) -> String {
-let meta = self.info();
+let meta = self.operator.info();
format!(
"{}, name: {}, prefix: {}",
meta.scheme(),
@@ -550,6 +589,10 @@
async fn max_size(&self) -> Result<Option<u64>> {
Ok(None)
}

fn basedirs(&self) -> &[Vec<u8>] {
&self.basedirs
}
}

/// Normalize key `abcdef` into `a/b/c/abcdef`
@@ -572,8 +615,9 @@ pub fn storage_from_config(
key_prefix,
}) => {
debug!("Init azure cache with container {container}, key_prefix {key_prefix}");
-let storage = AzureBlobCache::build(connection_string, container, key_prefix)
+let operator = AzureBlobCache::build(connection_string, container, key_prefix)
.map_err(|err| anyhow!("create azure cache failed: {err:?}"))?;
let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "gcs")]
@@ -587,7 +631,7 @@
}) => {
debug!("Init gcs cache with bucket {bucket}, key_prefix {key_prefix}");

-let storage = GCSCache::build(
+let operator = GCSCache::build(
bucket,
key_prefix,
cred_path.as_deref(),
Expand All @@ -597,14 +641,16 @@ pub fn storage_from_config(
)
.map_err(|err| anyhow!("create gcs cache failed: {err:?}"))?;

let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "gha")]
CacheType::GHA(config::GHACacheConfig { version, .. }) => {
debug!("Init gha cache with version {version}");

-let storage = GHACache::build(version)
+let operator = GHACache::build(version)
.map_err(|err| anyhow!("create gha cache failed: {err:?}"))?;
let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "memcached")]
@@ -617,14 +663,15 @@
}) => {
debug!("Init memcached cache with url {url}");

-let storage = MemcachedCache::build(
+let operator = MemcachedCache::build(
url,
username.as_deref(),
password.as_deref(),
key_prefix,
*expiration,
)
.map_err(|err| anyhow!("create memcached cache failed: {err:?}"))?;
let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "redis")]
@@ -672,6 +719,7 @@
_ => bail!("Only one of `endpoint`, `cluster_endpoints`, `url` must be set"),
}
.map_err(|err| anyhow!("create redis cache failed: {err:?}"))?;
let storage = RemoteStorage::new(storage, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "s3")]
@@ -682,7 +730,7 @@
);
let storage_builder =
S3Cache::new(c.bucket.clone(), c.key_prefix.clone(), c.no_credentials);
-let storage = storage_builder
+let operator = storage_builder
.with_region(c.region.clone())
.with_endpoint(c.endpoint.clone())
.with_use_ssl(c.use_ssl)
@@ -691,13 +739,14 @@
.build()
.map_err(|err| anyhow!("create s3 cache failed: {err:?}"))?;

let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "webdav")]
CacheType::Webdav(c) => {
debug!("Init webdav cache with endpoint {}", c.endpoint);

-let storage = WebdavCache::build(
+let operator = WebdavCache::build(
&c.endpoint,
&c.key_prefix,
c.username.as_deref(),
Expand All @@ -706,6 +755,7 @@ pub fn storage_from_config(
)
.map_err(|err| anyhow!("create webdav cache failed: {err:?}"))?;

let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[cfg(feature = "oss")]
@@ -715,14 +765,15 @@
c.bucket, c.endpoint
);

-let storage = OSSCache::build(
+let operator = OSSCache::build(
&c.bucket,
&c.key_prefix,
c.endpoint.as_deref(),
c.no_credentials,
)
.map_err(|err| anyhow!("create oss cache failed: {err:?}"))?;

let storage = RemoteStorage::new(operator, config.basedirs.clone());
return Ok(Arc::new(storage));
}
#[allow(unreachable_patterns)]
@@ -736,12 +787,14 @@
let preprocessor_cache_mode_config = config.fallback_cache.preprocessor_cache_mode;
let rw_mode = config.fallback_cache.rw_mode.into();
debug!("Init disk cache with dir {:?}, size {}", dir, size);

Ok(Arc::new(DiskCache::new(
dir,
size,
pool,
preprocessor_cache_mode_config,
rw_mode,
config.basedirs.clone(),
)))
}

@@ -823,4 +876,53 @@ mod test {
});
}
}

#[test]
#[cfg(feature = "s3")]
fn test_operator_storage_s3_with_basedirs() {
// Create S3 operator (doesn't need real credentials for this test)
let operator = crate::cache::s3::S3Cache::new(
"test-bucket".to_string(),
"test-prefix".to_string(),
true, // no_credentials = true
)
.with_region(Some("us-east-1".to_string()))
.build()
.expect("Failed to create S3 cache operator");

let basedirs = vec![b"/home/user/project".to_vec(), b"/opt/build".to_vec()];

// Wrap with RemoteStorage
let storage = RemoteStorage::new(operator, basedirs.clone());

// Verify basedirs are stored and retrieved correctly
assert_eq!(storage.basedirs(), basedirs.as_slice());
assert_eq!(storage.basedirs().len(), 2);
assert_eq!(storage.basedirs()[0], b"/home/user/project".to_vec());
assert_eq!(storage.basedirs()[1], b"/opt/build".to_vec());
}

#[test]
#[cfg(feature = "redis")]
fn test_operator_storage_redis_with_basedirs() {
// Create Redis operator
let operator = crate::cache::redis::RedisCache::build_single(
"redis://localhost:6379",
None,
None,
0,
"test-prefix",
0,
)
.expect("Failed to create Redis cache operator");

let basedirs = vec![b"/workspace".to_vec()];

// Wrap with RemoteStorage
let storage = RemoteStorage::new(operator, basedirs.clone());

// Verify basedirs work
assert_eq!(storage.basedirs(), basedirs.as_slice());
assert_eq!(storage.basedirs().len(), 1);
}
}