MPI Distributed Programming (CPU Environment) -- 3. OpenMPI Multi-Node Run Error

    xiaoxiao  2025-02-10

    1. OpenMPI multi-node run error

    Problem description: node 1 (host3) uses mpirun to launch the MPI program on node 2 (host4), and the run fails with the error below.

    $ mpirun -np 1 --host host4 ./main
    [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ess_env_module.c at line 367
    [[INVALID],INVALID]-[[59225,0],0] mca_oob_tcp_peer_try_connect: connect to 255.255.255.255:51754 failed: Network is unreachable (101)
    --------------------------------------------------------------------------
    ORTE was unable to reliably start one or more daemons.
    This usually is caused by:

    * not finding the required libraries and/or binaries on one or more nodes.
      Please check your PATH and LD_LIBRARY_PATH settings, or configure OMPI
      with --enable-orterun-prefix-by-default

    * lack of authority to execute on one or more specified nodes.
      Please verify your allocation and authorities.

    * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
      Please check with your sys admin to determine the correct locat
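
    The log above contains one concrete clue besides the generic ORTE hints: the daemon tried to connect back to 255.255.255.255:51754. That is the IPv4 limited broadcast address, not a normal peer address, which may suggest a hostname resolution or network interface selection problem between host3 and host4 (this reading is an assumption, not something the log states). The following is a minimal Python sketch that extracts such failed connect attempts from a saved log and flags broadcast targets. The names find_failed_connects and CONNECT_FAILED are hypothetical helpers introduced here, and the pattern is modelled only on the single connect-failure line shown above, so other OpenMPI versions may print it differently.

```python
import ipaddress
import re

# Pattern modelled on the one connect-failure line in the log above
# (assumption: other OpenMPI versions may format this line differently).
CONNECT_FAILED = re.compile(
    r"connect to (?P<addr>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+) "
    r"failed: (?P<reason>[^(]+)\((?P<errno>\d+)\)"
)

# IPv4 limited broadcast address; it is never a valid unicast peer.
LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def find_failed_connects(log_text):
    """Return one dict per failed connect attempt found in log_text."""
    results = []
    for match in CONNECT_FAILED.finditer(log_text):
        try:
            addr = ipaddress.IPv4Address(match.group("addr"))
        except ValueError:
            # Skip anything that looks like an address but is not valid IPv4.
            continue
        results.append({
            "address": str(addr),
            "port": int(match.group("port")),
            "reason": match.group("reason").strip(),
            "errno": int(match.group("errno")),
            "is_broadcast": addr == LIMITED_BROADCAST,
        })
    return results


if __name__ == "__main__":
    # The connect-failure line copied from the log above.
    sample = (
        "[[INVALID],INVALID]-[[59225,0],0] mca_oob_tcp_peer_try_connect: "
        "connect to 255.255.255.255:51754 failed: Network is unreachable (101)"
    )
    for item in find_failed_connects(sample):
        print(item)
```

    If is_broadcast comes back true, a reasonable next step is to check how host3 and host4 resolve each other's hostnames and which network interfaces they share, in addition to the PATH and LD_LIBRARY_PATH checks that the ORTE message itself recommends.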