add metax dockerfile and its requirements for ms-swift 4.2.x by WendaDeng · Pull Request #1734 · modelscope/modelscope

WendaDeng · 2026-06-10T03:31:53Z

add metax dockerfiles and related build scripts and requirements for ms-swift 4.2.x

gemini-code-assist

Code Review

This pull request introduces Dockerfiles, build scripts, and documentation to support building MS-Swift v4.2.3 images for Metax accelerators, offering both a full build from a UBI9 base image and a faster build using a prebuilt Metax release image. The review feedback focuses on optimizing the Dockerfiles by consolidating multiple RUN instructions to reduce image layers, removing redundant package installations and debug statements, and fixing a potential build failure in Dockerfile.metax by properly initializing the cu-bridge build process with its pre_make script.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-10T03:33:49Z

+RUN cd /tmp/ && \
+    export MACA_PATH=/opt/maca && \
+    curl -o ${CU_BRIDGE_VERSION}.zip -LsSf https://gitee.com/metax-maca/cu-bridge/repository/archive/${CU_BRIDGE_VERSION}.zip && \
+    unzip ${CU_BRIDGE_VERSION}.zip && \
+    mv cu-bridge-${CU_BRIDGE_VERSION} cu-bridge && \
+    chmod 755 cu-bridge -Rf && \
+    cd cu-bridge && \
+    mkdir build && cd build && \
+    cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../ && \
+    make && make install


The cu-bridge build process needs to be initialized by running its pre_make script. Without this initialization, the /root/cu-bridge/CUDA_DIR directory (referenced by CUDA_PATH on line 19) is never created. As a result, the symbolic link creation on line 160 (RUN ln -sf ${CUDA_PATH}/bin/nvcc ${CUDA_PATH}/bin/cucc) will fail because the destination directory does not exist.

Additionally, we should use the standard -R flag for chmod instead of the non-standard -Rf syntax at the end of the command.

RUN cd /tmp/ && \
    export MACA_PATH=/opt/maca && \
    curl -o ${CU_BRIDGE_VERSION}.zip -LsSf https://gitee.com/metax-maca/cu-bridge/repository/archive/${CU_BRIDGE_VERSION}.zip && \
    unzip ${CU_BRIDGE_VERSION}.zip && \
    mv cu-bridge-${CU_BRIDGE_VERSION} cu-bridge && \
    chmod -R 755 cu-bridge && \
    cd cu-bridge && \
    mkdir build && cd build && \
    cmake -DCMAKE_INSTALL_PREFIX=/opt/maca/tools/cu-bridge ../ && \
    make && make install && \
    /opt/maca/tools/cu-bridge/tools/pre_make
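
The failure mode described in this comment (a symlink created inside a directory that was never generated) can be made explicit with a guard. The sketch below is only an illustration: `link_cucc` is a hypothetical helper, not part of cu-bridge or of this PR, and it assumes the directory layout named in the review (`${CUDA_PATH}/bin`).

```shell
#!/bin/sh
# Hypothetical helper (not part of cu-bridge or this PR).
# Creates bin/cucc -> bin/nvcc under the given CUDA path, and fails with a
# readable message if the bin directory was never generated.
link_cucc() {
    cuda_path="$1"
    if [ ! -d "${cuda_path}/bin" ]; then
        echo "error: ${cuda_path}/bin is missing; was pre_make run?" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link of the same name
    ln -sf "${cuda_path}/bin/nvcc" "${cuda_path}/bin/cucc"
}
```

Calling `link_cucc "${CUDA_PATH}"` from a RUN step would stop the build with that message rather than the less specific error from `ln` when the directory does not exist.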

gemini-code-assist · 2026-06-10T03:33:49Z

+RUN printf "[metax-centos]\n\
+name=Maca Driver Yum Repository\n\
+baseurl=https://repos.metax-tech.com/r/metax-driver-centos-$(uname -m)/\n\
+enabled=1\n\
+gpgcheck=0" > /etc/yum.repos.d/metax-driver-centos.repo
+
+RUN dnf -y install python3-pip hostname && \
+    dnf clean all
+
+RUN python3 -m pip install uv -i $UV_INDEX_URL --trusted-host ${UV_TRUSTED_INDEX_HOST} && \
+    uv venv /opt/venv --python=${PYTHON_VERSION}
+
+RUN python3 --version && \
+    uv self version
+
+RUN yum install -y \
+    unzip vim git openblas-devel make cmake \
+    ninja-build gcc g++ procps-ng \
+    libibverbs librdmacm libibumad \
+    && yum clean all
+
+RUN git clone --depth 1 --branch ${SWIFT_VERSION} https://github.com/modelscope/ms-swift.git
+RUN git clone --depth 1 --branch ${VLLM_METAX_VERSION} https://github.com/MetaX-MACA/vLLM-metax.git
+RUN git clone --depth 1 --branch ${VLLM_VERSION} https://github.com/vllm-project/vllm.git
+RUN git clone --depth 1 --branch ${MEGATRON_VERSION} https://github.com/NVIDIA/Megatron-LM.git
+
+# Step 1: install MACA SDK, Metax-Driver and cu-bridge
+# Metax-Driver mainly contains vbios and kmd files, which are not needed in a container.
+# Here we keep the mx-smi management tool. Kernel version mismatch errors are ignored.
+RUN yum install -y metax-driver-${MACA_VERSION}* mxgvm && \
+    yum clean all && rm -rf /var/cache/yum /tmp/*
+
+RUN printf "[maca-sdk]\n\
+name=Maca Sdk Yum Repository\n\
+baseurl=https://repos.metax-tech.com/r/maca-sdk-rpm-$(uname -m)/\n\
+enabled=1\n\
+gpgcheck=0" > /etc/yum.repos.d/maca-sdk-rpm.repo
+
+RUN yum install -y maca_sdk-${MACA_VERSION}* && \
+    yum clean all && rm -rf /var/cache/yum /tmp/*


To optimize the Docker image build time and reduce the number of layers, we can combine the repository configurations, package installations, and git clones into fewer RUN instructions.

By adding both repositories first, we can install all required system packages (including metax-driver and maca_sdk) in a single yum install command. This also allows us to install binutils and numactl-libs early, completely eliminating the redundant yum install step later in the file.

RUN printf "[metax-centos]\n\
name=Maca Driver Yum Repository\n\
baseurl=https://repos.metax-tech.com/r/metax-driver-centos-$(uname -m)/\n\
enabled=1\n\
gpgcheck=0" > /etc/yum.repos.d/metax-driver-centos.repo && \
    printf "[maca-sdk]\n\
name=Maca Sdk Yum Repository\n\
baseurl=https://repos.metax-tech.com/r/maca-sdk-rpm-$(uname -m)/\n\
enabled=1\n\
gpgcheck=0" > /etc/yum.repos.d/maca-sdk-rpm.repo

RUN yum install -y \
    python3-pip hostname \
    unzip vim git openblas-devel make cmake \
    ninja-build gcc g++ procps-ng \
    libibverbs librdmacm libibumad \
    binutils numactl-libs \
    metax-driver-${MACA_VERSION}* mxgvm \
    maca_sdk-${MACA_VERSION}* \
    && yum clean all && rm -rf /var/cache/yum /tmp/*

RUN python3 -m pip install uv -i $UV_INDEX_URL --trusted-host ${UV_TRUSTED_INDEX_HOST} && \
    uv venv /opt/venv --python=${PYTHON_VERSION}

RUN python3 --version && \
    uv self version

RUN git clone --depth 1 --branch ${SWIFT_VERSION} https://github.com/modelscope/ms-swift.git && \
    git clone --depth 1 --branch ${VLLM_METAX_VERSION} https://github.com/MetaX-MACA/vLLM-metax.git && \
    git clone --depth 1 --branch ${VLLM_VERSION} https://github.com/vllm-project/vllm.git && \
    git clone --depth 1 --branch ${MEGATRON_VERSION} https://github.com/NVIDIA/Megatron-LM.git

gemini-code-assist · 2026-06-10T03:33:49Z

+RUN yum install -y \
+    gcc \
+    binutils \
+    procps-ng \
+    libibverbs \
+    librdmacm \
+    libibumad \
+    openblas \
+    numactl-libs \
+    && yum clean all && rm -rf /var/cache/yum /tmp/*


This entire yum install block is redundant and can be safely removed because all of these packages (or their development headers/dependencies) are already installed in the combined initial setup step.

# System dependencies are already installed in the base setup step.
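
One way to sanity-check the redundancy claim is to compare the two package lists by name. The sketch below is an illustration only: `missing_packages` is a hypothetical helper, the lists are copied from the hunks quoted in this review, and the version-globbed `metax-driver`, `mxgvm` and `maca_sdk` entries are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper: prints each package from the second list whose exact
# name does not appear in the first list.
missing_packages() {
    first="$1"
    second="$2"
    for pkg in $second; do
        case " $first " in
            *" $pkg "*) ;;          # already named in the first list
            *) echo "$pkg" ;;       # not named there
        esac
    done
}

# Package names from the combined install step suggested earlier.
combined="python3-pip hostname unzip vim git openblas-devel make cmake ninja-build gcc g++ procps-ng libibverbs librdmacm libibumad binutils numactl-libs"
# Package names from the later yum install block under review.
later="gcc binutils procps-ng libibverbs librdmacm libibumad openblas numactl-libs"

missing_packages "$combined" "$later"
```

With these lists the only name printed is `openblas`, which the combined step covers only indirectly through `openblas-devel`. This is the "development headers/dependencies" case the comment mentions, and it assumes `openblas-devel` pulls in the runtime library as a dependency.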

gemini-code-assist · 2026-06-10T03:33:50Z

+RUN cd vllm && \
+    python3 use_existing_torch.py && \
+    uv pip install -r requirements/build/cuda.txt
+
+RUN cd vllm && \
+    VLLM_TARGET_DEVICE=empty uv pip install -v . --no-build-isolation


These two RUN blocks can be combined into a single RUN instruction to reduce the number of image layers and avoid redundant directory changes (cd vllm).

RUN cd vllm && \
    python3 use_existing_torch.py && \
    uv pip install -r requirements/build/cuda.txt && \
    VLLM_TARGET_DEVICE=empty uv pip install -v . --no-build-isolation

gemini-code-assist · 2026-06-10T03:33:50Z

+RUN echo $PATH
+RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*


The RUN echo $PATH debug statement is unnecessary for the production image build and adds an extra layer. It should be removed.

RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*

gemini-code-assist · 2026-06-10T03:33:50Z

+# Clone all GitHub sources while the external proxy is enabled.
+RUN rm -rf /workspace/ms-swift /workspace/vLLM-metax /workspace/vllm /workspace/Megatron-LM
+
+RUN git clone --depth 1 --branch ${SWIFT_VERSION} https://github.com/modelscope/ms-swift.git
+RUN git clone --depth 1 --branch ${VLLM_METAX_VERSION} https://github.com/MetaX-MACA/vLLM-metax.git
+RUN git clone --depth 1 --branch ${VLLM_VERSION} https://github.com/vllm-project/vllm.git
+RUN git clone --depth 1 --branch ${MEGATRON_VERSION} https://github.com/NVIDIA/Megatron-LM.git


We can combine the cleanup and the multiple git clone commands into a single RUN instruction to minimize the number of intermediate image layers.

RUN rm -rf /workspace/ms-swift /workspace/vLLM-metax /workspace/vllm /workspace/Megatron-LM && \
    git clone --depth 1 --branch ${SWIFT_VERSION} https://github.com/modelscope/ms-swift.git && \
    git clone --depth 1 --branch ${VLLM_METAX_VERSION} https://github.com/MetaX-MACA/vLLM-metax.git && \
    git clone --depth 1 --branch ${VLLM_VERSION} https://github.com/vllm-project/vllm.git && \
    git clone --depth 1 --branch ${MEGATRON_VERSION} https://github.com/NVIDIA/Megatron-LM.git
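
To see how much a consolidation like this changes, the RUN instructions in a Dockerfile can be counted before and after the edit, since each RUN instruction produces its own layer. A minimal sketch follows; `count_run_instructions` is a hypothetical helper and the file path passed to it is whatever Dockerfile is being checked.

```shell
#!/bin/sh
# Hypothetical helper: counts lines that begin a RUN instruction.
# Continuation lines (after a trailing backslash) do not start with RUN,
# so a multi-line RUN is counted once.
count_run_instructions() {
    grep -cE '^[[:space:]]*RUN[[:space:]]' "$1"
}
```

Run against the hunk above, the original form (one cleanup plus four clones) would count as five RUN instructions and the combined form as one.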

gemini-code-assist · 2026-06-10T03:33:50Z

+# Step 1: build original vLLM for torch setup
+RUN cd vllm && \
+    python3 use_existing_torch.py && \
+    pip install -r requirements/build/cuda.txt
+
+# Step 2: build vLLM with empty device to avoid CUDA dependency
+RUN cd vllm && \
+    VLLM_TARGET_DEVICE=empty pip install -v . --no-build-isolation


These two RUN blocks can be combined into a single RUN instruction to reduce the number of image layers and avoid redundant directory changes (cd vllm).

RUN cd vllm && \
    python3 use_existing_torch.py && \
    pip install -r requirements/build/cuda.txt && \
    VLLM_TARGET_DEVICE=empty pip install -v . --no-build-isolation

Jintao-Huang · 2026-06-10T03:47:30Z

Please run the lint test first with the following commands:

pip install pre-commit
pre-commit run --all-files

* Upgrade numpy to 2.x for 1.38 Docker images
  - Replace deprecated numpy aliases (np.math.ceil → math.ceil, np.Inf → np.inf)
  - Upgrade Docker constraints: numpy>=2.0, cython>=3.0, remove scipy upper bound
* Add ipywidgets dependency to Docker images
* update docker
* fix
* fix
* fix
* fix cpu image
* fix(docker): force numpy>=2.0 after evalscope install
  ms-opencompass pulls numpy<2.0, downgrading numpy from 2.x to 1.26.4. Force reinstall numpy>=2.0 after pip install .[eval] to restore it.
* add metax dockerfile and its requirements for ms-swift 4.2.x (#1734)
* Update npu dockerfile (#1736)
* fix(docker): remove unrelated added docker files

---------

Co-authored-by: Jintao Huang <huangjintao.hjt@alibaba-inc.com>
Co-authored-by: dwd <dwd1044898101@gmail.com>
Co-authored-by: addsubmuldiv <zyh13227@163.com>

feat(docker/Metax): add metax dockerfile and its requirements for ms-swift 4.2.x

8260939

gemini-code-assist Bot reviewed Jun 10, 2026

tastelikefeet approved these changes Jun 10, 2026

Jintao-Huang approved these changes Jun 10, 2026

Wenda Deng (m01359) added 3 commits June 12, 2026 17:34

fix metax 4.2 dockerfiles

d6dc257

fix override.txt

5c63262

fix requirements_extra.txt

939ac6d

fjcobu14 mentioned this pull request Jun 14, 2026

Please enable private vulnerability reporting (Security Advisories) #1719

Open

Jintao-Huang merged commit 37eb369 into modelscope:master Jun 15, 2026
2 checks passed

Yunnglin pushed a commit that referenced this pull request Jun 15, 2026

add metax dockerfile and its requirements for ms-swift 4.2.x (#1734)

113c93b

Yunnglin pushed a commit that referenced this pull request Jun 17, 2026

add metax dockerfile and its requirements for ms-swift 4.2.x (#1734)

de32b02

Jintao-Huang merged 4 commits into modelscope:master from WendaDeng:add-metax-4.2-dockerfiles

3 participants
