libfabric/SOURCES/0001-Fix-segment-fault-issu...

From b091a17b1ec7a5b546c2450bbd24bd26716c2f67 Mon Sep 17 00:00:00 2001
From: Honggang Li <honli@redhat.com>
Date: Sun, 4 Aug 2019 21:26:04 -0400
Subject: [PATCH] Fix segment fault issue for linux container
When running openmpi/mpirun in Linux containers, libfabric failed with a
segmentation fault:
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0xfffffffffffffff0
[ 0] /lib64/libpthread.so.0(+0x12d80)[0x14feb5d4dd80]
[ 1] /lib64/libfabric.so.1(+0x23cd1)[0x14fea8105cd1]
[ 2] /lib64/libfabric.so.1(+0x18240)[0x14fea80fa240]
[ 3] /lib64/libfabric.so.1(fi_getinfo+0x695)[0x14fea80faea5]
[ 4] /lib64/libfabric.so.1(fi_getinfo+0x4e)[0x14fea80ffe9e]
[ 5] /usr/lib64/openmpi/lib/openmpi/mca_btl_usnic.so(+0xdf4e)[0x14fea8445f4e]
[ 6] /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_btl_base_select+0xed)[0x14feb547815d]
[ 7] /usr/lib64/openmpi/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x16)[0x14fea9fab2f6]
[ 8] /usr/lib64/openmpi/lib/libmpi.so.40(mca_bml_base_init+0xa4)[0x14feb5ffef94]
[ 9] /usr/lib64/openmpi/lib/libmpi.so.40(ompi_mpi_init+0x654)[0x14feb5fac474]
[10] /usr/lib64/openmpi/lib/libmpi.so.40(MPI_Init+0x72)[0x14feb5fdc6b2]
[11] /home/mpi/ring[0x4009ad]
[12] /lib64/libc.so.6(__libc_start_main+0xf3)[0x14feb599a813]
[13] /home/mpi/ring[0x4008be]
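The 'scandir' function called by 'ofi_mem_init' returned -1 with errno
set to ENOENT. Because -1 is non-zero, the 'while (n--)' condition was
true, so the loop body ran with a negative index into 'pglist'.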
Fixes: 8ce14923ba67 (core/mem: Obtain a list of available huge pages in system)
Signed-off-by: Honggang Li <honli@redhat.com>
---
src/mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
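The following is a minimal sketch of why the one-character change matters.
It is not code from libfabric. The helper names (body_runs_old,
body_runs_new, null_slot_address) and the 'cap' parameter are hypothetical,
and the address calculation assumes 8-byte pointers and a 'pglist' that is
still NULL when scandir fails.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how often the loop body runs with the pre-patch condition.
 * 'cap' is a hypothetical safety bound: for a negative n the condition
 * stays non-zero, so the loop would otherwise keep going. */
int body_runs_old(int n, int cap)
{
    int runs = 0;

    assert(cap >= 0);
    while (n--) {
        if (runs == cap)
            break;
        runs++;
    }
    return runs;
}

/* Same count with the patched condition: a negative n never enters. */
int body_runs_new(int n)
{
    int runs = 0;

    while (n-- > 0)
        runs++;
    return runs;
}

/* Address of slot 'index' in an array of 8-byte pointers based at NULL
 * (both the pointer size and the NULL base are assumptions). */
uint64_t null_slot_address(int index)
{
    return (uint64_t)((int64_t)index * 8);
}
```

With n = -1, body_runs_old(-1, 4) runs the body until the cap is reached,
while body_runs_new(-1) never enters it. Under the stated assumptions,
null_slot_address(-2) is 0xfffffffffffffff0, the same value as the failing
address in the trace: the first pass of the old loop has already decremented
n from -1 to -2 before indexing.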
diff --git a/src/mem.c b/src/mem.c
index 91836a79c..23617a0a4 100644
--- a/src/mem.c
+++ b/src/mem.c
@@ -84,7 +84,7 @@ void ofi_mem_init(void)
 		num_page_sizes = 1;
 	}
 
-	while (n--) {
+	while (n-- > 0) {
 		if (sscanf(pglist[n]->d_name, "hugepages-%zukB", &hpsize) == 1) {
 			hpsize *= 1024;
 			if (hpsize != page_sizes[OFI_DEF_HUGEPAGE_SIZE])
--
2.20.1