Problem
Daft's S3 HTTP client silently ignores standard proxy environment variables (HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY). This prevents users behind corporate proxies from accessing S3-compatible storage through Daft.
Affected file: src/daft-io/src/s3_like.rs (lines 564–572)
Root Cause
The default_client() function uses aws_smithy_http_client::Builder::build_https() — the high-level API. Internally, this creates a ConnectorBuilder but never sets proxy_config, which defaults to ProxyConfig::disabled():
    // Current code in Daft
    fn default_client() -> SharedHttpClient {
        Builder::new()
            .tls_provider(Provider::Rustls(CryptoMode::AwsLc))
            .build_https()
    }
Inside aws-smithy-http-client, ConnectorBuilder::build_https() does:
    let proxy_config = self.proxy_config.clone()
        .unwrap_or_else(proxy::ProxyConfig::disabled); // explicitly disabled!
The lower-level ConnectorBuilder API does support proxies via .proxy_config(ProxyConfig::from_env()), which reads the standard env vars — but Builder never wires it up.
Call chain
- Daft calls Builder::new().tls_provider(...).build_https()
- Builder::build_https() internally creates a ConnectorBuilder via new_conn_builder()
- new_conn_builder() sets TLS provider and timeout settings, but never calls .proxy_config()
- ConnectorBuilder::build_https() defaults proxy_config to ProxyConfig::disabled()
- All proxy env vars are ignored
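The fallback at the end of this chain can be illustrated with a minimal, standard-library-only sketch. The types `ProxySetting` and `ConnBuilder` are hypothetical stand-ins invented for this illustration, not the real aws-smithy-http-client types; the sketch only shows how an option that is never set ends up resolving to "disabled".

```rust
// Hypothetical stand-ins for the real builder types; names are illustrative only.
#[derive(Clone, Debug, PartialEq)]
enum ProxySetting {
    Disabled,
    Url(String),
}

#[derive(Default)]
struct ConnBuilder {
    // Stays `None` unless the caller opts in explicitly.
    proxy: Option<ProxySetting>,
}

impl ConnBuilder {
    /// Opt in to a proxy. In the scenario described above, the high-level
    /// path never makes the equivalent call.
    fn proxy(mut self, setting: ProxySetting) -> Self {
        self.proxy = Some(setting);
        self
    }

    /// Resolve the effective setting: an unset option falls back to `Disabled`.
    fn effective_proxy(&self) -> ProxySetting {
        self.proxy.clone().unwrap_or(ProxySetting::Disabled)
    }
}
```

Because the fallback is applied silently at build time, a caller who never reaches the opt-in method gets no error or warning, which matches the "silently ignores" symptom.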
Proposed Fix
Replace Builder with the lower-level Connector::builder() (i.e., ConnectorBuilder) and add .proxy_config(ProxyConfig::from_env()). Use http_client_fn from aws_smithy_runtime_api to wrap the result into a SharedHttpClient:
    fn default_client() -> SharedHttpClient {
        use std::sync::OnceLock;

        use aws_smithy_http_client::{
            Connector,
            proxy::ProxyConfig,
            tls::{Provider, rustls_provider::CryptoMode},
        };
        use aws_smithy_runtime_api::client::http::{
            http_client_fn, SharedHttpConnector,
        };

        // Read the proxy settings from the environment once, when the client is created.
        let proxy_config = ProxyConfig::from_env();
        let cached: OnceLock<SharedHttpConnector> = OnceLock::new();

        http_client_fn(move |settings, _components| {
            // Build the connector on first use, then hand out clones of the cached one.
            cached
                .get_or_init(|| {
                    let connector = Connector::builder()
                        .proxy_config(proxy_config.clone())
                        .connector_settings(settings.clone())
                        .tls_provider(Provider::Rustls(CryptoMode::AwsLc))
                        .build();
                    SharedHttpConnector::new(connector)
                })
                .clone()
        })
    }
Why this works
- ProxyConfig::from_env() reads HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY (and lowercase variants). When none are set, behavior is identical to the current disabled behavior
- OnceLock caches the connector to preserve connection pooling
- .connector_settings(settings.clone()) forwards the SDK's connect/read timeout settings, matching the original Builder::build_https() behavior
- No new crate dependencies: all types are already available in existing dependencies (aws_smithy_http_client and aws_smithy_runtime_api)
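The caching point can be checked in isolation with a small sketch that uses only the standard library. `FakeConnector` and `cached_factory` are hypothetical names made up for this illustration; the `Arc<String>` stands in for a connection pool, and the counter records how many times the expensive build step runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a pooled connector; clones share the same pool.
#[derive(Clone)]
struct FakeConnector {
    pool: Arc<String>,
}

/// Returns a factory closure that builds the connector on the first call only.
/// `builds` counts how many times the build step actually runs.
fn cached_factory(builds: Arc<AtomicUsize>) -> impl Fn() -> FakeConnector {
    let cached: OnceLock<FakeConnector> = OnceLock::new();
    move || {
        cached
            .get_or_init(|| {
                // Runs at most once, even if the factory is called many times.
                builds.fetch_add(1, Ordering::SeqCst);
                FakeConnector { pool: Arc::new("pool".to_string()) }
            })
            .clone()
    }
}
```

Calling the returned closure twice yields two handles to the same pool and a single build. Without the `OnceLock`, each call would construct a fresh connector with its own pool.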
Context
- Other DataFrame libraries (e.g., Polars) work behind proxies because they use reqwest (via object_store), which respects proxy env vars by default
- The aws-smithy-http-client crate (v1.1.11) fully supports proxy configuration; it's just not exposed through the high-level Builder API that Daft uses
- This is a single-function, ~15-line change with no new dependencies
Environment variables that would be respected after fix
| Variable | Purpose |
| --- | --- |
| HTTP_PROXY / http_proxy | Proxy for HTTP traffic |
| HTTPS_PROXY / https_proxy | Proxy for HTTPS traffic |
| ALL_PROXY / all_proxy | Proxy for all traffic |
| NO_PROXY / no_proxy | Comma-separated bypass rules (e.g., localhost,*.internal) |
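To make the roles of these variables concrete, here is a simplified sketch of how such variables are commonly interpreted. The helpers `pick_proxy` and `bypasses_proxy` are hypothetical, and the precedence (scheme-specific variable before ALL_PROXY, uppercase before lowercase) and the matching rules (exact host names and `*.suffix` wildcards only) are assumptions chosen for illustration; the actual behavior of ProxyConfig::from_env() may differ.

```rust
use std::collections::HashMap;

/// Pick a proxy URL for `scheme` ("http" or "https") from a map of variables.
/// Assumed precedence: scheme-specific variable, then ALL_PROXY, with the
/// uppercase name checked before the lowercase one. Empty values are skipped.
fn pick_proxy(vars: &HashMap<String, String>, scheme: &str) -> Option<String> {
    let specific = if scheme == "https" { "HTTPS_PROXY" } else { "HTTP_PROXY" };
    [specific, "ALL_PROXY"]
        .iter()
        .flat_map(|name| [name.to_string(), name.to_lowercase()])
        .find_map(|name| vars.get(&name).filter(|v| !v.is_empty()).cloned())
}

/// Return true when `host` matches a comma-separated NO_PROXY-style rule list.
/// Only exact names and `*.suffix` wildcards are handled in this sketch.
fn bypasses_proxy(no_proxy: &str, host: &str) -> bool {
    no_proxy
        .split(',')
        .map(str::trim)
        .filter(|rule| !rule.is_empty())
        .any(|rule| match rule.strip_prefix("*.") {
            Some(suffix) => host.ends_with(&format!(".{}", suffix)),
            None => host.eq_ignore_ascii_case(rule),
        })
}
```

Taking the variables as a map instead of reading the process environment directly keeps the sketch easy to test without mutating global state.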