Skip to content

fix: respect proxy env vars (HTTP_PROXY, HTTPS_PROXY, etc.) in S3 client#6679

Open
BABTUNA wants to merge 3 commits intoEventual-Inc:mainfrom
BABTUNA:fix/s3-proxy-env-vars
Open

fix: respect proxy env vars (HTTP_PROXY, HTTPS_PROXY, etc.) in S3 client#6679
BABTUNA wants to merge 3 commits intoEventual-Inc:mainfrom
BABTUNA:fix/s3-proxy-env-vars

Conversation

@BABTUNA
Copy link
Copy Markdown
Contributor

@BABTUNA BABTUNA commented Apr 12, 2026

Summary

  • Fix S3 HTTP client silently ignoring standard proxy environment variables
    (HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY)
  • Replace high-level Builder::build_https() with lower-level
    Connector::builder() + ProxyConfig::from_env()
  • No new crate dependencies - > use existing aws-smithy-http-client and
    aws-smithy-runtime-api APIs

Closes #6638

Test plan

  • When no proxy env vars are set, behavior is identical to before
    (ProxyConfig::from_env() returns disabled config)
  • When HTTP_PROXY/HTTPS_PROXY are set, S3 traffic routes through the proxy

@BABTUNA BABTUNA requested a review from a team as a code owner April 12, 2026 17:16
@github-actions github-actions bot added the fix label Apr 12, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 12, 2026

Greptile Summary

This PR fixes S3 proxy support by replacing the high-level Builder::build_https() call with Connector::builder() + ProxyConfig::from_env(), making Daft honour HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, and NO_PROXY environment variables when building the S3 HTTP client. No new crate dependencies are added.

Confidence Score: 4/5

Safe to merge after considering the OnceLock settings-stickiness trade-off; no regression from the original behavior.

Both findings are P2: the OnceLock caching silently ignores HttpConnectorSettings after the first call, but this is no worse than the original code which ignored them entirely, and in practice Daft's S3 client uses consistent settings per instance. The inline import style violation is minor. The proxy fix itself is correct and achieves the stated goal.

src/daft-io/src/s3_like.rs — specifically the OnceLock caching logic in default_client()

Important Files Changed

Filename Overview
src/daft-io/src/s3_like.rs Replaces Builder::build_https() with Connector::builder() + ProxyConfig::from_env() to honour proxy env vars; introduces OnceLock-based caching that silently ignores HttpConnectorSettings after the first call, and adds inline use statements inside a function body.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["build_s3_conf()"] --> B{"verify_ssl && check_hostname_ssl?"}
    B -->|true| C["default_client()"]
    B -->|false| D["unimplemented!()"]

    C --> E["ProxyConfig::from_env()\n(reads HTTP_PROXY, HTTPS_PROXY, NO_PROXY, ALL_PROXY)"]
    E --> F["OnceLock created\n(per client instance)"]
    F --> G["http_client_fn closure\n→ SharedHttpClient"]

    G --> H{"OnceLock\ninitialised?"}
    H -->|No — first call| I["Connector::builder()\n+ tls_provider\n+ proxy_config\n+ connector_settings\n→ SharedHttpConnector"]
    H -->|Yes — subsequent calls| J["Return cached\nSharedHttpConnector\n⚠️ ignores new settings"]
    I --> K["Cache in OnceLock"]
    K --> L["Return SharedHttpConnector"]
    J --> L
    L --> M["S3 SDK uses connector\nfor all requests"]
Loading

Reviews (1): Last reviewed commit: "fix: respect proxy env vars (HTTP_PROXY,..." | Re-trigger Greptile

Comment thread src/daft-io/src/s3_like.rs Outdated
Comment on lines +576 to +589
let cached: OnceLock<SharedHttpConnector> = OnceLock::new();

http_client_fn(move |settings: &HttpConnectorSettings, _components| {
cached
.get_or_init(|| {
let connector = Connector::builder()
.tls_provider(Provider::Rustls(CryptoMode::AwsLc))
.proxy_config(proxy_config.clone())
.connector_settings(settings.clone())
.build();
SharedHttpConnector::new(connector)
})
.clone()
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 OnceLock bakes in the first call's HttpConnectorSettings

OnceLock::get_or_init initialises once, so the connector built during the first closure call (using that call's settings) is returned for every subsequent call — even if the SDK supplies different HttpConnectorSettings later (e.g. per-operation timeout overrides). The connector_settings arg on lines 584–585 is silently ignored after the first invocation.

The original Builder::build_https() also didn't thread through settings, so this isn't a regression, but it means the SDK can never vary the connector for this client. If proper per-call settings are not needed, drop the OnceLock and the cache entirely for clarity:

let proxy_config = ProxyConfig::from_env();

http_client_fn(move |settings: &HttpConnectorSettings, _components| {
    let connector = Connector::builder()
        .tls_provider(Provider::Rustls(CryptoMode::AwsLc))
        .proxy_config(proxy_config.clone())
        .connector_settings(settings.clone())
        .build();
    SharedHttpConnector::new(connector)
})

Copy link
Copy Markdown
Contributor Author

@BABTUNA BABTUNA Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread src/daft-io/src/s3_like.rs Outdated
Comment on lines +570 to +573
use aws_smithy_runtime_api::client::http::{
HttpConnectorSettings, SharedHttpConnector, http_client_fn,
};
use std::sync::OnceLock;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Inline use imports inside function body

The two new use statements (aws_smithy_runtime_api::client::http and std::sync::OnceLock) are placed inside the default_client function body rather than at the top of the file. The project style guide requires imports to be at the top of the file, not inline within functions or methods.

Rule Used: Import statements should be placed at the top of t... (source)

Learnt From
Eventual-Inc/Daft#5078

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nah this style is used within the context of that file

Copy link
Copy Markdown
Contributor

@ohbh ohbh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: S3 HTTP client ignores HTTP_PROXY/HTTPS_PROXY environment variables

2 participants