Skip to content

Address review feedback on device tensor helpers (#20078)#20078

Open
shoumikhin wants to merge 1 commit into
pytorch:mainfrom
shoumikhin:export-D106842466
Open

Address review feedback on device tensor helpers (#20078)#20078
shoumikhin wants to merge 1 commit into
pytorch:mainfrom
shoumikhin:export-D106842466

Conversation

@shoumikhin
Copy link
Copy Markdown
Contributor

@shoumikhin shoumikhin commented Jun 5, 2026

Summary:

Follow-up to D99913077 applying review feedback on the TensorPtr device tensor helpers: aliasing make_tensor_ptr now preserves device metadata, clone_tensor_ptr requires a CPU source, device alloc/copy failures report their error codes, and the device test is pinned to its abort messages and built in non-aten Buck/CMake/OSS configs. device_allocator moves to exported_deps so the exported header compiles for aten consumers. Mirrored in fbcode and xplat.

Also replaces the two device-transfer helpers clone_tensor_ptr_to_device and clone_tensor_ptr_to_cpu with a single clone_tensor_ptr_to(tensor, target) keyed on the target device. The direction (host-to-device or device-to-host) is inferred from the source and target, which removes the asymmetry where one helper named the device and the other inferred it, and removes the footgun where clone_tensor_ptr_to_device(t, CPU) aborted. CPU-to-CPU and device-to-device are rejected with clear messages; clone_tensor_ptr remains the same-device copy and the make_tensor_ptr device tag is unchanged. This mirrors ATen's single to(device) and keeps the public surface minimal. The extension-tensor.md guide and its ATen equivalence table are updated to match.

This also fixes a pre-existing portable-build break: the aliasing make_tensor_ptr(const Tensor&) overload passed device_type() and device_index() as two separate arguments to a primary factory that takes a single Device, so the non-USE_ATEN_LIB build did not compile; it now wraps them in a Device.

Reviewed By: Gasoonjia

Differential Revision: D106842466

Copilot AI review requested due to automatic review settings June 5, 2026 20:26
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Jun 5, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20078

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit e7e6ec2 with merge base 0d904b6 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 5, 2026
@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync Bot commented Jun 5, 2026

@shoumikhin has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106842466.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the TensorPtr device-transfer helper APIs and related tests/docs, consolidating the previous directional helpers into a single clone_tensor_ptr_to(tensor, target) and improving portability/build integration.

Changes:

  • Replaces clone_tensor_ptr_to_device / clone_tensor_ptr_to_cpu with clone_tensor_ptr_to(tensor, target) and updates device transfer tests accordingly.
  • Fixes portable build issues around device metadata propagation and improves error reporting for allocation/copy failures.
  • Updates OSS/Buck/CMake test configurations and documentation to reflect the new API surface.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/utils/OSSTestConfig.json Adds the device transfer test to OSS test sources.
extension/tensor/test/tensor_ptr_device_test.cpp Updates tests to the unified clone_tensor_ptr_to API and pins death-test messages.
extension/tensor/test/targets.bzl Gates the device test target to non-ATen (portable) builds in Buck.
extension/tensor/test/CMakeLists.txt Adds the device transfer test source to the CMake test target.
extension/tensor/tensor_ptr.h Updates API docs, fixes portable device metadata propagation for aliasing, and declares clone_tensor_ptr_to.
extension/tensor/tensor_ptr.cpp Implements clone_tensor_ptr_to, tightens clone preconditions, and improves error messages.
extension/tensor/targets.bzl Moves device_allocator to exported_deps so exported headers compile for consumers.
docs/source/extension-tensor.md Documents clone_tensor_ptr_to and updates the ATen equivalence table.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +201 to +205
#ifndef USE_ATEN_LIB
ET_CHECK_MSG(
tensor.device_type() == runtime::etensor::DeviceType::CPU,
"clone_tensor_ptr only supports CPU tensors; use clone_tensor_ptr_to with a CPU target first.");
#endif // USE_ATEN_LIB
Comment on lines 432 to +435
#else // USE_ATEN_LIB
executorch::aten::TensorShapeDynamism::DYNAMIC_BOUND,
#endif // USE_ATEN_LIB
std::move(deleter));
#endif // USE_ATEN_LIB
@shoumikhin
Copy link
Copy Markdown
Contributor Author

@pytorchbot label "release notes: none"

@pytorch-bot pytorch-bot Bot added the release notes: none Do not include this in the release notes label Jun 5, 2026
@meta-codesync meta-codesync Bot changed the title Address review feedback on device tensor helpers Address review feedback on device tensor helpers (#20078) Jun 5, 2026
shoumikhin added a commit to shoumikhin/executorch that referenced this pull request Jun 5, 2026
Summary:

Follow-up to D99913077 applying review feedback on the TensorPtr device tensor helpers: aliasing make_tensor_ptr now preserves device metadata, clone_tensor_ptr requires a CPU source, device alloc/copy failures report their error codes, and the device test is pinned to its abort messages and built in non-aten Buck/CMake/OSS configs. device_allocator moves to exported_deps so the exported header compiles for aten consumers. Mirrored in fbcode and xplat.

Also replaces the two device-transfer helpers `clone_tensor_ptr_to_device` and `clone_tensor_ptr_to_cpu` with a single `clone_tensor_ptr_to(tensor, target)` keyed on the target device. The direction (host-to-device or device-to-host) is inferred from the source and target, which removes the asymmetry where one helper named the device and the other inferred it, and removes the footgun where `clone_tensor_ptr_to_device(t, CPU)` aborted. CPU-to-CPU and device-to-device are rejected with clear messages; `clone_tensor_ptr` remains the same-device copy and the `make_tensor_ptr` device tag is unchanged. This mirrors ATen's single `to(device)` and keeps the public surface minimal. The `extension-tensor.md` guide and its ATen equivalence table are updated to match.

This also fixes a pre-existing portable-build break: the aliasing `make_tensor_ptr(const Tensor&)` overload passed `device_type()` and `device_index()` as two separate arguments to a primary factory that takes a single `Device`, so the non-`USE_ATEN_LIB` build did not compile; it now wraps them in a `Device`.

Reviewed By: Gasoonjia

Differential Revision: D106842466
@shoumikhin shoumikhin force-pushed the export-D106842466 branch from 58b9455 to 68dd1a0 Compare June 5, 2026 21:14
shoumikhin added a commit to shoumikhin/executorch that referenced this pull request Jun 5, 2026
Summary:

Follow-up to D99913077 applying review feedback on the TensorPtr device tensor helpers: aliasing make_tensor_ptr now preserves device metadata, clone_tensor_ptr requires a CPU source, device alloc/copy failures report their error codes, and the device test is pinned to its abort messages and built in non-aten Buck/CMake/OSS configs. device_allocator moves to exported_deps so the exported header compiles for aten consumers. Mirrored in fbcode and xplat.

Also replaces the two device-transfer helpers `clone_tensor_ptr_to_device` and `clone_tensor_ptr_to_cpu` with a single `clone_tensor_ptr_to(tensor, target)` keyed on the target device. The direction (host-to-device or device-to-host) is inferred from the source and target, which removes the asymmetry where one helper named the device and the other inferred it, and removes the footgun where `clone_tensor_ptr_to_device(t, CPU)` aborted. CPU-to-CPU and device-to-device are rejected with clear messages; `clone_tensor_ptr` remains the same-device copy and the `make_tensor_ptr` device tag is unchanged. This mirrors ATen's single `to(device)` and keeps the public surface minimal. The `extension-tensor.md` guide and its ATen equivalence table are updated to match.

This also fixes a pre-existing portable-build break: the aliasing `make_tensor_ptr(const Tensor&)` overload passed `device_type()` and `device_index()` as two separate arguments to a primary factory that takes a single `Device`, so the non-`USE_ATEN_LIB` build did not compile; it now wraps them in a `Device`.

Reviewed By: Gasoonjia

Differential Revision: D106842466
@shoumikhin shoumikhin force-pushed the export-D106842466 branch from 68dd1a0 to 5db031c Compare June 5, 2026 21:20
Summary:

Follow-up to D99913077 applying review feedback on the TensorPtr device tensor helpers: aliasing make_tensor_ptr now preserves device metadata, clone_tensor_ptr requires a CPU source, device alloc/copy failures report their error codes, and the device test is pinned to its abort messages and built in non-aten Buck/CMake/OSS configs. device_allocator moves to exported_deps so the exported header compiles for aten consumers. Mirrored in fbcode and xplat.

Also replaces the two device-transfer helpers `clone_tensor_ptr_to_device` and `clone_tensor_ptr_to_cpu` with a single `clone_tensor_ptr_to(tensor, target)` keyed on the target device. The direction (host-to-device or device-to-host) is inferred from the source and target, which removes the asymmetry where one helper named the device and the other inferred it, and removes the footgun where `clone_tensor_ptr_to_device(t, CPU)` aborted. CPU-to-CPU and device-to-device are rejected with clear messages; `clone_tensor_ptr` remains the same-device copy and the `make_tensor_ptr` device tag is unchanged. This mirrors ATen's single `to(device)` and keeps the public surface minimal. The `extension-tensor.md` guide and its ATen equivalence table are updated to match.

This also fixes a pre-existing portable-build break: the aliasing `make_tensor_ptr(const Tensor&)` overload passed `device_type()` and `device_index()` as two separate arguments to a primary factory that takes a single `Device`, so the non-`USE_ATEN_LIB` build did not compile; it now wraps them in a `Device`.

Reviewed By: Gasoonjia

Differential Revision: D106842466
Copilot AI review requested due to automatic review settings June 6, 2026 06:59
@shoumikhin shoumikhin force-pushed the export-D106842466 branch from 5db031c to e7e6ec2 Compare June 6, 2026 06:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment on lines +201 to +205
#ifndef USE_ATEN_LIB
ET_CHECK_MSG(
tensor.device_type() == runtime::etensor::DeviceType::CPU,
"clone_tensor_ptr only supports CPU tensors; use clone_tensor_ptr_to with a CPU target first.");
#endif // USE_ATEN_LIB
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. meta-exported release notes: none Do not include this in the release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants