diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..f86d3a4
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+/perftest-4.5-0.12.ge93c538.tar.gz
diff --git a/EMPTY b/EMPTY
deleted file mode 100644
index 0519ecb..0000000
--- a/EMPTY
+++ /dev/null
@@ -1 +0,0 @@
- 
\ No newline at end of file
diff --git a/ib_atomic_bw.1 b/ib_atomic_bw.1
new file mode 100644
index 0000000..890b65a
--- /dev/null
+++ b/ib_atomic_bw.1
@@ -0,0 +1,298 @@
+.\" Copyright (c) 2014, Jan Chaloupka
+.\"
+.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
+.\" This is free documentation; you can redistribute it and/or
+.\" modify it under the terms of the GNU General Public License as
+.\" published by the Free Software Foundation; either version 2 of
+.\" the License, or (at your option) any later version.
+.\"
+.\" The GNU General Public License's references to "object code"
+.\" and "executables" are to be interpreted as the output of any
+.\" document formatting or typesetting system, including
+.\" intermediate and printed output.
+.\"
+.\" This manual is distributed in the hope that it will be useful,
+.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
+.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.\" GNU General Public License for more details.
+.\"
+.\" You should have received a copy of the GNU General Public
+.\" License along with this manual; if not, see
+.\" .
+.\" %%%LICENSE_END
+.TH "IB_ATOMIC_BW" 1 2014 "Open Fabrics Enterprise Distribution"
+.\" IB_ATOMIC_BW
+.SH NAME
+ib_atomic_bw, ib_atomic_lat, ib_read_bw, ib_read_lat, ib_send_bw,
+ib_send_lat, ib_write_bw, ib_write_lat
+\- Collection of tests written over uverbs intended for use as a
+performance micro-benchmark
+.SH SYNOPSIS
+.sp
+.B ib_atomic_bw [] [options]
+.sp
+.B ib_atomic_lat [] [options]
+.sp
+.B ib_read_bw [] [options]
+.sp
+.B ib_read_lat [] [options]
+.sp
+.B ib_write_bw [] [options]
+.sp
+.B ib_write_lat [] [options]
+.SH DESCRIPTION
+This is a collection of tests written over uverbs intended for use as a
+performance micro-benchmark. As an example, the tests can be used for
+HW or SW tuning and/or functional testing.
+
+The collection contains a set of BW and latency benchmarks such as:
+.sp
+* Read - ib_read_bw and ib_read_lat.
+.sp
+* Write - ib_write_bw and ib_write_lat.
+.sp
+* Send - ib_send_bw and ib_send_lat.
+.sp
+* Atomic - ib_atomic_bw and ib_atomic_lat
+.sp
+* Raw Ethernet (when working with MOFED2) - raw_ethernet_bw, raw_ethernet_lat
+
+The benchmarks use the CPU cycle counter to get time stamps without a context
+switch. Some CPU architectures (e.g., Intel's 80486 or older PPC) do NOT
+have such capability.
+
+The latency benchmarks measure round-trip time but report half of that as
+one-way latency.
+This means that the result may not be sufficiently accurate for asymmetrical
+configurations.
+
+In BW benchmarks, the BW is calculated on the send side only, since the sender
+calculates the BW after collecting completions from the receive side.
+When the bidirectional flag is used, BW is calculated on both sides.
+In ib_send_bw, the server side also calculates the received throughput.
+
+Min/Median/Max results are reported in latency tests.
+The median (vs average) is less sensitive to extreme scores.
+Typically, the "Max" value is the first value measured.
+
+Larger samples help only marginally. The default (1000) is pretty good.
+Note that an array of cycles_t (typically unsigned long) is allocated
+once to collect samples and again to store the difference between them.
+Really big sample sizes (e.g., 1 million) might expose other problems
+with the program. In this case you can use the -N flag (No Peak) to instruct
+the test to sample only 2 times (beginning and end).
+
+All throughput tests now have a duration feature as well (-D )
+to instruct the test to run for .
+Another feature added is --run_infinitely, which instructs the test to run
+all the time and print throughput every 5 seconds.
+
+The "-H" option (latency) will dump the histogram for additional statistical
+analysis.
+See xgraph, ygraph, r-base (http://www.r-project.org/), pspp, or other
+statistical math programs.
+
+
+Architectures tested: i686, x86_64, ia64
+.SH OPTIONS
+The SAME OPTIONS must be passed to both server and client.
+
+If
+.I
+is not present, the command starts a server and waits for a connection.
+If it is, the command connects to the server at
+.I .
+.sp
+.B Common Options:
+.RS 4
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+Display this help message screen.
+.TP
+\fB\-p\fR, \fB\-\-port\fR=\fI\fR
+Listen on/connect to port (default: 18515) when exchanging data.
+.TP
+\fB\-R\fR, \fB\-\-rdma_cm\fR
+Connect QPs with rdma_cm and run test on those QPs.
+.TP
+\fB\-z\fR, \fB\-\-com_rdma_cm\fR
+Communicate with rdma_cm module to exchange data \- use regular QPs.
+.TP
+\fB\-m\fR, \fB\-\-mtu\fR=\fI\fR
+QP MTU size (default: active_mtu from ibv_devinfo).
+.TP
+\fB\-c\fR, \fB\-\-connection\fR=\fI\fR
+Connection type RC/UC/UD (default RC)
+.TP
+\fB\-d\fR, \fB\-\-ib\-dev\fR=\fI\fR
+Use IB device (default: first device found).
+.TP
+\fB\-i\fR, \fB\-\-ib\-port\fR=\fI\fR
+Use port of IB device (default: 1).
+.TP
+\fB\-s\fR, \fB\-\-size\fR=\fI\fR
+Size of message to exchange (default: 1).
+.TP
+\fB\-a\fR, \fB\-\-all\fR
+Run sizes from 2 to 2^23.
+.TP
+\fB\-n\fR, \fB\-\-iters\fR=\fI\fR
+Number of exchanges (at least 100, default: 1000).
+.TP
+\fB\-x\fR, \fB\-\-gid\-index\fR=\fI\fR
+Test uses GID with GID index taken from command
+.TP
+\fB\-V\fR, \fB\-\-version\fR
+Display version number.
+.TP
+\fB\-e\fR, \fB\-\-events\fR
+Sleep on CQ events (default poll).
+.TP
+\fB\-F\fR, \fB\-\-CPU\-freq\fR
+Do not fail even if cpufreq_ondemand module is loaded.
+.TP
+\fB\-I\fR, \fB\-\-inline_size\fR=\fI\fR
+Max size of message to be sent in inline mode.
+.TP
+\fB\-u\fR, \fB\-\-qp\-timeout\fR=\fI\fR
+QP timeout, timeout value is 4 usec * 2^timeout (default: 14).
+.TP
+\fB\-S\fR, \fB\-\-sl\fR=\fI\fR
+SL \- Service Level (default 0)
+.TP
+\fB\-r\fR, \fB\-\-rx\-depth\fR=\fI\fR
+Make rx queue bigger than tx (default 600).
+.RE
+.sp
+.B Latency tests options:
+.RS 4
+.TP
+\fB\-C\fR, \fB\-\-report\-cycles\fR
+Report times in CPU cycle units.
+.TP
+\fB\-H\fR, \fB\-\-report\-histogram\fR
+Print out all results (default: summary only).
+.TP
+\fB\-U\fR, \fB\-\-report\-unsorted\fR
+Print out unsorted results (default sorted).
+.RE
+.sp
+.B BW tests options:
+.RS 4
+.TP
+\fB\-b\fR, \fB\-\-bidirectional\fR
+Measure bidirectional bandwidth (default uni).
+.TP
+\fB\-N\fR, \fB\-\-no\fR
+peak\-bw Cancel peak\-bw calculation (default with peak\-bw)
+.TP
+\fB\-Q\fR, \fB\-\-cq\-mod\fR
+Generate CQE only after completion
+.TP
+\fB\-t\fR, \fB\-\-tx\-depth=\fR
+Size of tx queue (default: 128).
+.TP
+\fB\-O\fR, \fB\-\-dualport\fR
+Run test in dual\-port mode (2 QPs). Both ports must be active (default OFF).
+.TP
+\fB\-D\fR, \fB\-\-duration=\fR
+Run test for period of seconds.
+.TP
+\fB\-f\fR, \fB\-\-margin=\fR
+When in Duration, measure results within margins (default: 2)
+.TP
+\fB\-l\fR, \fB\-\-post_list=\fR
+Post list of WQEs of size (instead of single post).
+.TP
+\fB\-q\fR, \fB\-\-qp=\fR
+Num of QPs running in the process (default: 1).
+.TP
+\fB\-\-run_infinitely \fR
+Run test forever, print results every 5 seconds.
+.RE
+.sp
+.B SEND tests options:
+.RS 4
+.TP
+\fB\-r\fR, \fB\-\-rx\-depth=\fR
+Size of RX queue (default: 512 in BW test).
+.TP
+\fB\-g\fR, \fB\-\-mcg=\fR
+Send messages to multicast group with qps attached to it.
+.TP
+\fB\-M\fR, \fB\-\-MGID=\fR
+In multicast, uses as the group MGID.
+.RE
+.sp
+.B Raw Ethernet BW test options:
+.RS 4
+.TP
+\fB\-A\fR, \fB\-\-atomic_type=\fR
+Type of atomic operation from {CMP_AND_SWAP,FETCH_AND_ADD}.
+.TP
+\fB\-o\fR, \fB\-\-outs=\fR
+Number of outstanding read/atomic requests \- also on READ tests.
+.TP
+\fB\-B\fR, \fB\-\-source_mac\fR
+Source MAC address in the format XX:XX:XX:XX:XX:XX (default: take the MAC address from GID).
+.TP
+\fB\-E\fR, \fB\-\-dest_mac\fR
+Destination MAC address in the format XX:XX:XX:XX:XX:XX **MUST** be entered.
+.TP
+\fB\-J\fR, \fB\-\-server_ip\fR
+Server IP address in the format X.X.X.X (used to send packets with IP header).
+.TP
+\fB\-j\fR, \fB\-\-client_ip\fR
+Client IP address in the format X.X.X.X (used to send packets with IP header).
+.TP
+\fB\-K\fR, \fB\-\-server_port\fR
+Server UDP port number (used to send packets with UDP header).
+.TP
+\fB\-k\fR, \fB\-\-client_port\fR
+Client UDP port number (used to send packets with UDP header).
+.TP
+\fB\-Z\fR, \fB\-\-server\fR
+Choose server side for the current machine (\-\-server/\-\-client must be selected).
+.TP
+\fB\-P\fR, \fB\-\-client\fR
+Choose client side for the current machine (\-\-server/\-\-client must be selected).
+.RE
+.SH ENVIRONMENT
+.B Prerequisites:
+.RS
+kernel 2.6
+.RE
+.RS
+(kernel module) matches libibverbs
+.RE
+.RS
+(kernel module) matches librdmacm
+.RE
+.RS
+(kernel module) matches libibumad
+.RE
+.RS
+(kernel module) matches libmath (lm).
+.RE
+.SH NOTES
+If you are in an IB fabric, you need to be running a Subnet Manager on the switch or on one of the nodes in your fabric.
+.SH BUGS
+1. Multicast feature in ib_send_lat and in ib_send_bw still has many problems!
+Support will be improved and bugs fixed in this quarter, but for now the tests may get stuck
+and could produce undefined behaviour.
+.sp
+2. Bidirectional feature in ib_send_bw test, when running in UD or UC mode.
+The algorithm we use for the bidirectional measurement is designed for RC connection type.
+When running in UC or UD connection types, there is a small probability the test will get stuck.
+.sp
+3. RDMA_CM feature in read tests still doesn't work.
+.sp
+4. Dual-port support currently works only with ib_write_bw.
+.sp
+5. Compatibility issues may occur between different versions of perftest.
+Please make sure you work with the same version on both sides to ensure
+consistency of the test.
+.SH AUTHORS
+Please post results/observations to the openib-general mailing list.
+See "Contact Us" at http://openib.org/mailman/listinfo/openib-general and
+http://www.openib.org.
diff --git a/perftest.spec b/perftest.spec
new file mode 100644
index 0000000..146a0a8
--- /dev/null
+++ b/perftest.spec
@@ -0,0 +1,207 @@
+Name: perftest
+Summary: IB Performance Tests
+Version: 4.5
+%define minor_release 0.12
+%define git_hash ge93c538
+Release: 12%{?dist}
+License: GPLv2 or BSD
+Source0: https://github.com/linux-rdma/perftest/releases/download/V%{version}-%{minor_release}/perftest-%{version}-%{minor_release}.%{git_hash}.tar.gz
+Source1: ib_atomic_bw.1
+Url: https://github.com/linux-rdma/perftest
+BuildRequires: libibverbs-devel > 1.1.4, librdmacm-devel > 1.0.14
+BuildRequires: libibumad-devel > 1.3.6
+BuildRequires: pciutils-devel
+BuildRequires: autoconf, automake, libtool
+Obsoletes: openib-perftest < 1.3
+
+%description
+Perftest is a collection of simple test programs designed to utilize
+RDMA communications and provide performance numbers over those RDMA
+connections. It does not work on normal TCP/IP networks, only on
+RDMA networks.
+
+%prep
+%setup -q
+autoreconf --force --install
+
+%build
+%configure
+%make_build
+
+%install
+for file in ib_{atomic,read,send,write}_{lat,bw}; do
+    install -D -m 0755 $file %{buildroot}%{_bindir}/$file
+done
+for file in raw_ethernet_{lat,bw}; do
+    install -D -m 0755 $file %{buildroot}%{_bindir}/$file
+done
+mkdir -p %{buildroot}%{_mandir}/man1/
+install -D -m 0644 %{SOURCE1} %{buildroot}%{_mandir}/man1/
+pushd %{buildroot}%{_mandir}/man1/
+for file in ib_atomic_lat ib_{read,send,write}_{lat,bw} raw_ethernet_{lat,bw}; do
+    ln -s ib_atomic_bw.1 ${file}.1
+done
+popd
+
+%files
+%doc README
+%{_bindir}/*
+%{_mandir}/man1/*
+%license COPYING
+
+%changelog
+* Wed Nov 10 2021 Honggang Li - 4.5-12
+- Rebase to upstream release perftest-4.5-0.12
+- Resolves: rhbz#2020062
+
+* Thu May 13 2021 Honggang Li - 4.5-1
+- Rebase to upstream release perftest-4.5-0.2
+- Resolves: rhbz#1960074
+
+* Sat Jan 30 2021 Honggang Li - 4.4-8
+- Check PCIe relaxed ordering compliant
+- Resolves: rhbz#1902855
+
+* Thu Nov 05 2020 Honggang Li - 4.4-7
+- Rebase to upstream release perftest-4.4-0.32
+- Resolves: bz1888570
+
+* Fri Jul 24 2020 Honggang Li - 4.4-3
+- Fix segment fault with large QP numbers
+- Resolves: rhbz#1859358
+
+* Mon May 25 2020 Honggang Li - 4.4-2
+- Update to upstream 4.4-0.29.g817ec38 tarball
+- Resolves: rhbz#1832709
+
+* Wed Apr 15 2020 Honggang Li - 4.4-1
+- Update to upstream 4.4-0.23.g89e176a tarball
+- Resolves: rhbz#1817830
+
+* Mon Jul 23 2018 Jarod Wilson - 4.2-2
+- Update to upstream 4.2-0.8.g0e24e67 tarball
+
+* Mon Apr 30 2018 Jarod Wilson - 4.2-1
+- Update to upstream 4.2-0.5.gdd28746 tarball
+
+* Mon Apr 03 2017 Jarod Wilson - 3.4-1
+- Update to upstream 3.4-0.9.g98a9a17 tarball
+- Resolves: rhbz#1437978
+
+* Thu Aug 18 2016 Jarod Wilson - 3.0-7
+- Address a myriad more coverity/clang warnings
+- Add raw_ethernet_* man page symlinks
+- Related: rhbz#1273176
+- Related: rhbz#948476
+
+* Mon Aug 15 2016 Jarod Wilson - 3.0-6
+- Update to upstream 3.0-3.1.gb36a595 tarball for upstream fixes
+- Add in manpages
+- Related: rhbz#1365750
+- Resolves: rhbz#948476
+
+* Fri Aug 12 2016 Jarod Wilson - 3.0-5
+- Make it possible to actually test with XRC connections again
+- Resolves: rhbz#1365750
+
+* Mon Aug 08 2016 Jarod Wilson - 3.0-4
+- Install raw_ethernet{lat,bw} tools
+- Resolves: rhbz#1365182
+
+* Wed May 18 2016 Jarod Wilson - 3.0-3
+- Fix additional memory leaks reported and spotted after last fix
+
+* Wed May 18 2016 Jarod Wilson - 3.0-2
+- Fix issues uncovered by coverity
+
+* Wed May 04 2016 Jarod Wilson - 3.0-1
+- Update to upstream release v3.0
+- Resolves: bz1309586, bz1273176
+
+* Tue Jun 16 2015 Michal Schmidt - 2.4-1
+- Update to latest upstream release
+- Enable s390x platform
+- Resolves: bz1182177
+
+* Fri Oct 17 2014 Doug Ledford - 2.3-1
+- Update to latest upstream release
+- Resolves: bz1061582
+
+* Tue May 20 2014 Kyle McMartin - 2.0-4
+- aarch64: add get_cycles implementation since is no longer
+  exported by the kernel.
+- Resolves: #1100043
+
+* Thu Jan 23 2014 Doug Ledford - 2.0-3
+- Fix for rpmdiff found issues
+- Related: bz1017321
+
+* Fri Dec 27 2013 Daniel Mach - 2.0-2
+- Mass rebuild 2013-12-27
+
+* Wed Jul 17 2013 Doug Ledford - 2.0-1
+- Update to latest upstream version
+
+* Thu Feb 14 2013 Fedora Release Engineering - 1.3.0-4
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_19_Mass_Rebuild
+
+* Fri Jul 20 2012 Fedora Release Engineering - 1.3.0-3
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild
+
+* Fri Jan 06 2012 Doug Ledford - 1.3.0-2
+- Update to latest upstream release
+- Initial import into Fedora
+- Remove runme from docs section (review item)
+- Improve description of package (review item)
+
+* Fri Jul 22 2011 Doug Ledford - 1.3.0-1
+- Update to latest upstream release (1.2.3 -> 1.3.0)
+- Strip rocee related code out of upstream update
+- Add a buildrequires on libibumad because upstream needs it now
+- Fix lack of build on i686
+- Related: bz725016
+- Resolves: bz724896
+
+* Mon Jan 25 2010 Doug Ledford - 1.2.3-3.el6
+- More minor pkgwrangler cleanups
+- Related: bz543948
+
+* Mon Jan 25 2010 Doug Ledford - 1.2.3-2.el6
+- Fixes for pkgwrangler review
+- Related: bz543948
+
+* Tue Dec 22 2009 Doug Ledford - 1.2.3-1.el5
+- Update to latest upstream version
+- Related: bz518218
+
+* Mon Jun 22 2009 Doug Ledford - 1.2-14.el5
+- Rebuild against libibverbs that isn't missing the proper ppc wmb() macro
+- Related: bz506258
+
+* Sun Jun 21 2009 Doug Ledford - 1.2-13.el5
+- Update to ofed 1.4.1 final bits
+- Rebuild against non-XRC libibverbs
+- Related: bz506097, bz506258
+
+* Sat Apr 18 2009 Doug Ledford - 1.2-12.el5
+- Update to ofed 1.4.1-rc3 version
+- Remove dead patch
+- Related: bz459652
+
+* Wed Sep 17 2008 Doug Ledford - 1.2-11
+- Upstream has updated the tarball without updating the version, so we
+  grabbed the one from the OFED-1.3.2-20080728.0355 tarball
+- Resolves: bz451481
+
+* Wed Apr 09 2008 Doug Ledford - 1.2-10
+- Fix the fact that the itc clock on ia64 may be a multiple of the cpu clock
+- Resolves: bz433659
+
+* Tue Apr 01 2008 Doug Ledford - 1.2-9
+- Update to OFED 1.3 final bits
+- Related: bz428197
+
+* Sun Jan 27 2008 Doug Ledford - 1.2-8
+- Split out to separate package (used to be part of openib package)
+- Related: bz428197
+
diff --git a/sources b/sources
new file mode 100644
index 0000000..2a34d06
--- /dev/null
+++ b/sources
@@ -0,0 +1 @@
+SHA512 (perftest-4.5-0.12.ge93c538.tar.gz) = 1768273026a9fd6177d559a897b9e356862f5f6db035432df1ffc464f73a706fb71bad6df9878dd3a8e3a1ef35b70b670e37c62b670df326523c81e9bcd74da7