feat(jp62): add Jetson Linux 6.2 base layer #61

Draft
jerry73204 wants to merge 6 commits into autowarefoundation:main from NEWSLabNTU:main

Conversation

@jerry73204

Summary

  • New Dockerfile.jp62 producing common-base-jp62 / common-devel-jp62 from nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel
  • JP62 images fulfill the common-base-cuda / common-devel-cuda contract — downstream Dockerfile.cuda component files work unmodified
  • build.sh --platform jp62 and bake targets added
  • Requires Autoware pinned to a release tag (e.g. 1.7.1), not main

JP62-specific handling

| Concern | Solution |
| --- | --- |
| L4T OpenCV 4.8 conflicts with ROS | Replaced with Ubuntu 4.5.4 via apt pin |
| L4T CMake 3.14 too old | CMake 3.28 from Kitware PPA |
| Ansible CUDA role pulls from sbsa repo | `--no-nvidia --no-cuda-drivers`; L4T provides CUDA |
| TensorRT/cuDNN cmake modules skipped by `--no-nvidia` | Installed explicitly via apt |
| `ament_cmake_export_libraries` `_lib` cache pollution | sed patch: CACHE FORCE + unset before `find_library` |
| colcon mixin index missing | Registered explicitly in Dockerfile |
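For the OpenCV conflict, the pin might look like the sketch below (the filename and package globs are assumptions; Dockerfile.jp62 has the authoritative version). A priority above 1000 allows apt to downgrade from L4T's 4.8 packages to Ubuntu's 4.5.4:

```shell
# Sketch of the OpenCV apt pin; written to a local file here for illustration.
# In the image it would live under /etc/apt/preferences.d/.
pin=opencv-ubuntu.pref
cat > "$pin" <<'EOF'
Package: libopencv* opencv* python3-opencv
Pin: version 4.5.4*
Pin-Priority: 1001
EOF
cat "$pin"
```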

Known issues

cmake find_library under QEMU (x86 cross-build only)

The colcon build step hits intermittent find_library failures when building under QEMU arm64 emulation on x86. cmake searches the correct path but fails to stat .so files that exist on disk. The failure only occurs in colcon's Python subprocess context, not in direct cmake invocations. Two contributing factors were identified and mitigated:

  1. cmake 3.22 set(_lib "NOTFOUND") bug -- find_library skips search when result variable is pre-set. Fixed by upgrading to cmake 3.28 from Kitware PPA.
  2. ament _lib cache variable reuse (ament_cmake#182) -- shared cache variable across all packages' export templates causes cross-package pollution. Fixed by patching templates with CACHE FORCE + unset.
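The template patch for factor 2 could be sketched roughly as follows (the template path and sed expression are assumptions; the real patch lives in Dockerfile.jp62). The idea: force the shared `_lib` cache entry back to NOTFOUND and unset it before each `find_library()`, so a hit cached by package A cannot satisfy and skip package B's search:

```shell
# Local stand-in for the installed ament_cmake_export_libraries template,
# reduced to the two lines the patch cares about.
tmpl=ament_cmake_export_libraries-extras.cmake.in
printf 'set(_lib "NOTFOUND")\nfind_library(_lib NAMES ${_library_name})\n' > "$tmpl"

# Replace the plain set() with a forced cache reset followed by unset,
# so find_library() always performs a fresh search.
sed -i 's/set(_lib "NOTFOUND")/set(_lib "NOTFOUND" CACHE INTERNAL "" FORCE)\nunset(_lib CACHE)/' "$tmpl"
cat "$tmpl"
```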

These fixes resolve most failures (e.g. builtin_interfaces and tier4_metric_msgs now build via colcon). Residual failures for some packages (e.g. rcutils in tier4_debug_msgs) persist under QEMU only -- cmake debug output shows the search path is correct, but the file is not found even though if(EXISTS) confirms it is present. A full colcon build therefore requires native Jetson hardware.

Test method

./build.sh --platform jp62 --target {common,components,universe}

jerry73204 and others added 4 commits April 7, 2026 14:14
- Upgrade cmake to 3.28 from Kitware PPA. System cmake 3.22 has a bug
  where find_library() skips the search when the result variable is
  pre-set to "NOTFOUND" via set(), breaking ament_cmake_export_libraries.

- Patch ament export templates to fix _lib cache variable pollution
  across packages (ament_cmake#182). The template reuses a shared cache
  variable "_lib" across all packages' find_library() calls. When
  find_package(A) caches _lib, subsequent find_package(B) sees the stale
  cache entry and skips the search.

- Install ros-humble-tensorrt-cmake-module and ros-humble-cudnn-cmake-module
  explicitly since --no-nvidia ansible flag skips them.

- Restore .env file COPY for Autoware release tags (1.7.1+).

- Add common-devel-jp62-debug target for interactive debugging.

- Update docker-bake.hcl to target common-devel-jp62-build stage.

- Protect ROS packages from apt-get autoremove with apt-mark manual.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@mitsudome-r
Member

Thanks for the PR!

> JP62 images fulfill the common-base-cuda / common-devel-cuda contract — downstream Dockerfile.cuda component files work unmodified

Do you think it makes it easier for you if we extract common-base and common-base-cuda out of Dockerfile so that you don't have to care about autoware code in Dockerfile.jp62?

Autoware 1.7.1 hardcodes -gencode arch=compute_101 (Blackwell) in 14
CMakeLists.txt files. CUDA 12.6 on JP62 only supports up to compute_90,
causing nvcc fatal errors during the sensing-perception build.

Add patch-cuda-arch.sh that gates compute_101/120 flags behind
CUDA_VERSION >= 12.8, and invoke it automatically from build.sh when
--platform jp62 is used.
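The gating logic described above might look like this sketch (the script name, variable names, and sed expression are illustrative assumptions, not the actual patch-cuda-arch.sh contents):

```shell
# Drop compute_101/compute_120 gencode flags when the CUDA toolkit is older
# than 12.8, since nvcc rejects architectures it does not know about.
cuda_version="12.6"   # would normally be parsed from `nvcc --version`
major=${cuda_version%%.*}
minor=${cuda_version#*.}

supports_blackwell() {
    [ "$major" -gt 12 ] || { [ "$major" -eq 12 ] && [ "$minor" -ge 8 ]; }
}

flags='-gencode arch=compute_87,code=sm_87 -gencode arch=compute_101,code=sm_101'
if ! supports_blackwell; then
    # Remove any gencode pair naming compute_101 or compute_120.
    flags=$(printf '%s' "$flags" | sed -E 's/-gencode arch=compute_1(01|20),code=sm_1(01|20) ?//g')
fi
echo "$flags"
```

With CUDA 12.6 only the compute_87 pair survives; on a 12.8+ toolkit the flags pass through untouched.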

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jerry73204
Author

> Thanks for the PR!
>
> > JP62 images fulfill the common-base-cuda / common-devel-cuda contract — downstream Dockerfile.cuda component files work unmodified
>
> Do you think it makes it easier for you if we extract common-base and common-base-cuda out of Dockerfile so that you don't have to care about autoware code in Dockerfile.jp62?

Sounds like a good idea. I'll look into how to revise it.

BTW, the local build on AGX Orin is successful on my side. The Autoware 1.7.1 source requires patches to work around the compute capability issue; I just pushed an auto-patcher script to the PR. I'd like to ask about the proper way to apply upstream patches. I used to maintain a patched Autoware repo myself. In case we build a stable version, I'd like to know the proper way to ship upstream patches.

Add docker-compose.jp62.yaml that swaps all services to locally-built
JP62 images with nvidia runtime and ROS_DISTRO=humble. The visualizer
uses the standalone visualizer-jp62 image since it requires VNC/noVNC
components not present in the universe image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>