From a812bc0cd2607ee380873480179f8b49a5fb6941 Mon Sep 17 00:00:00 2001 From: CentOS Sources Date: Tue, 7 May 2019 09:21:02 -0400 Subject: [PATCH] import hardlink-1.3-6.el8 --- .gitignore | 0 .hardlink.metadata | 0 SOURCES/gpl-2.0.txt | 339 ++++++++++++++++++++++++++++++++++ SOURCES/hardlink.1 | 62 +++++++ SOURCES/hardlink.c | 432 ++++++++++++++++++++++++++++++++++++++++++++ SPECS/hardlink.spec | 193 ++++++++++++++++++++ 6 files changed, 1026 insertions(+) create mode 100644 .gitignore create mode 100644 .hardlink.metadata create mode 100644 SOURCES/gpl-2.0.txt create mode 100644 SOURCES/hardlink.1 create mode 100644 SOURCES/hardlink.c create mode 100644 SPECS/hardlink.spec diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..e69de29 diff --git a/.hardlink.metadata b/.hardlink.metadata new file mode 100644 index 0000000..e69de29 diff --git a/SOURCES/gpl-2.0.txt b/SOURCES/gpl-2.0.txt new file mode 100644 index 0000000..d159169 --- /dev/null +++ b/SOURCES/gpl-2.0.txt @@ -0,0 +1,339 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version.
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + <signature of Ty Coon>, 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License.
diff --git a/SOURCES/hardlink.1 b/SOURCES/hardlink.1 new file mode 100644 index 0000000..5aa022a --- /dev/null +++ b/SOURCES/hardlink.1 @@ -0,0 +1,62 @@ +.TH "hardlink" "1" +.SH "NAME" +hardlink \- Consolidate duplicate files via hardlinks +.SH "SYNOPSIS" +.PP +\fBhardlink\fP [\fB-c\fP] [\fB-n\fP] [\fB-v\fP] [\fB-vv\fP] [\fB-x pattern\fP] [\fB-h\fP] directory1 [ directory2 ... ] +.SH "DESCRIPTION" +.PP +This manual page documents \fBhardlink\fP, a +program which consolidates duplicate files in one or more directories +using hardlinks. +.PP +\fBhardlink\fP traverses one +or more directories searching for duplicate files. When it finds duplicate +files, it uses one of them as the master. It then removes all other +duplicates and places a hardlink for each one pointing to the master file. +This allows for conservation of disk space where multiple directories +on a single filesystem contain many duplicate files. +.PP +Since hard links can only span a single filesystem, \fBhardlink\fP +is only useful when all directories specified are on the same filesystem. +.SH "OPTIONS" +.PP +.IP "\fB-c\fP" 10 +Compare only the contents of the files being considered for consolidation. +Disregards permission, ownership and other differences. +.IP "\fB-f\fP" 10 +Force hardlinking across file systems. +.IP "\fB-n\fP" 10 +Do not perform the consolidation; only print what would be changed. +.IP "\fB-v\fP" 10 +Print summary after hardlinking. +.IP "\fB-vv\fP" 10 +Print every hardlinked file and bytes saved. Also print summary after hardlinking. +.IP "\fB-x pattern\fP" 10 +Exclude files and directories matching pattern from hardlinking. +.IP "\fB-h\fP" 10 +Show help. +.PP +The optional pattern for excluding files and directories must be a PCRE2 +compatible regular expression. Only the basename of the file or directory +is checked, not its path. Excluded directories' contents will not be examined. +.SH "AUTHOR" +.PP +\fBhardlink\fP was written by Jakub Jelinek . 
+.PP +Man page written by Brian Long. +.PP +Man page updated by Jindrich Novy +.SH "BUGS" +.PP +\fBhardlink\fP assumes that its target directory trees do not change from under +it. If a directory tree does change, this may result in \fBhardlink\fP +accessing files and/or directories outside of the intended directory tree. +Thus, you must avoid running \fBhardlink\fP on potentially changing directory +trees, and especially on directory trees under control of another user. +.PP +Historically \fBhardlink\fP silently excluded any names beginning with +".in.", as well as any names beginning with "." followed by exactly 6 +other characters. That prior behavior can be achieved by specifying +.br +-x '^(\\.in\\.|\\.[^.]{6}$)' diff --git a/SOURCES/hardlink.c b/SOURCES/hardlink.c new file mode 100644 index 0000000..8e74ca0 --- /dev/null +++ b/SOURCES/hardlink.c @@ -0,0 +1,432 @@ +/* Copyright (C) 2001 Red Hat, Inc. + + Written by Jakub Jelinek . + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software Foundation, + Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ + +/* Changes by Rémy Card to use constants and add option -n. 
*/ +/* Changes by Jindrich Novy to add option -h, -f, replace mmap(2), fix overflows */ +/* Changes by Travers Carter to make atomic hardlinking */ +/* Changes by Todd Lewis that adds option -x to exclude files with pcre lib */ + +#define _GNU_SOURCE +#define PCRE2_CODE_UNIT_WIDTH 8 +#include <pcre2.h> +#include <sys/types.h> +#include <stdlib.h> +#include <getopt.h> +#include <stdio.h> +#include <unistd.h> +#include <sys/stat.h> +#include <string.h> +#include <dirent.h> +#include <fcntl.h> +#include <errno.h> + +#define NHASH (1<<17) /* Must be a power of 2! */ +#define NIOBUF (1<<12) +#define NAMELEN 4096 +#define NBUF 64 + +pcre2_code *re; +PCRE2_SPTR exclude_pattern; +pcre2_match_data *match_data; + +struct _f; +typedef struct _h { + struct _h *next; + struct _f *chain; + off_t size; + time_t mtime; +} h; + +typedef struct _d { + struct _d *next; + char name[0]; +} d; + +d *dirs; + +h *hps[NHASH]; + +int no_link = 0; +int verbose = 0; +int content_only = 0; +int force = 0; + +typedef struct _f { + struct _f *next; + ino_t ino; + dev_t dev; + unsigned int cksum; + char name[0]; +} f; + +__attribute__((always_inline)) inline unsigned int hash(off_t size, time_t mtime) +{ + return (size ^ mtime) & (NHASH - 1); +} + +__attribute__((always_inline)) inline int stcmp(struct stat *st1, struct stat *st2, int content_only) +{ + if (content_only) + return st1->st_size != st2->st_size; + return st1->st_mode != st2->st_mode || st1->st_uid != st2->st_uid || + st1->st_gid != st2->st_gid || st1->st_size != st2->st_size || + st1->st_mtime != st2->st_mtime; +} + +long long ndirs, nobjects, nregfiles, ncomp, nlinks, nsaved; + +void doexit(int i) +{ + if (verbose) { + fprintf(stderr, "\n\n"); + fprintf(stderr, "Directories %lld\n", ndirs); + fprintf(stderr, "Objects %lld\n", nobjects); + fprintf(stderr, "IFREG %lld\n", nregfiles); + fprintf(stderr, "Comparisons %lld\n", ncomp); + fprintf(stderr, "%s %lld\n", (no_link ? "Would link" : "Linked"), nlinks); + fprintf(stderr, "%s %lld\n", (no_link ?
"Would save" : "saved"), nsaved); + } + exit(i); +} + +void usage(char *prog) +{ + fprintf (stderr, "Usage: %s [-cnvhf] [-x pat] directories...\n", prog); + fprintf (stderr, " -c When finding candidates for linking, compare only file contents.\n"); + fprintf (stderr, " -n Don't actually link anything, just report what would be done.\n"); + fprintf (stderr, " -v Print summary after hardlinking.\n"); + fprintf (stderr, " -vv Print every hardlinked file and bytes saved + summary.\n"); + fprintf (stderr, " -f Force hardlinking across filesystems.\n"); + fprintf (stderr, " -x pat Exclude files matching pattern.\n"); + fprintf (stderr, " -h Show help.\n"); + exit(255); +} + +unsigned int buf[NBUF]; +char iobuf1[NIOBUF], iobuf2[NIOBUF]; + +__attribute__((always_inline)) inline size_t add2(size_t a, size_t b) +{ + size_t sum = a + b; + if (sum < a) { + fprintf(stderr, "\nInteger overflow\n"); + doexit(5); + } + return sum; +} + +__attribute__((always_inline)) inline size_t add3(size_t a, size_t b, size_t c) +{ + return add2(add2(a, b), c); +} + +typedef struct { + char *buf; + size_t alloc; +} dynstr; + +void growstr(dynstr *str, size_t newlen) +{ + if (newlen < str->alloc) + return; + str->buf = realloc(str->buf, str->alloc = add2(newlen, 1)); + if (!str->buf) { + fprintf(stderr, "\nOut of memory 4\n"); + doexit(4); + } +} +dev_t dev = 0; +void rf (const char *name) +{ + struct stat st, st2, st3; + const size_t namelen = strlen(name); + nobjects++; + if (lstat (name, &st)) + return; + if (st.st_dev != dev && !force) { + if (dev) { + fprintf(stderr, "%s is on different filesystem than the rest.\nUse -f option to override.\n", name); + doexit(6); + } + dev = st.st_dev; + } + if (S_ISDIR (st.st_mode)) { + d * dp = malloc(add3(sizeof(d), namelen, 1)); + if (!dp) { + fprintf(stderr, "\nOut of memory 3\n"); + doexit(3); + } + memcpy(dp->name, name, namelen + 1); + dp->next = dirs; + dirs = dp; + } else if (S_ISREG (st.st_mode)) { + int fd, i; + f * fp, * fp2; + h * hp; + const 
char *n1, *n2; + int cksumsize = sizeof(buf); + unsigned int cksum; + time_t mtime = content_only ? 0 : st.st_mtime; + unsigned int hsh = hash (st.st_size, mtime); + off_t fsize; + nregfiles++; + if (verbose > 1) + fprintf(stderr, " %s", name); + fd = open (name, O_RDONLY); + if (fd < 0) return; + if (st.st_size < sizeof(buf)) { + cksumsize = st.st_size; + memset (((char *)buf) + cksumsize, 0, (sizeof(buf) - cksumsize) % sizeof(buf[0])); + } + if (read (fd, buf, cksumsize) != cksumsize) { + close(fd); + if (verbose > 1 && namelen <= NAMELEN) + fprintf(stderr, "\r%*s\r", (int)(namelen + 2), ""); + return; + } + cksumsize = (cksumsize + sizeof(buf[0]) - 1) / sizeof(buf[0]); + for (i = 0, cksum = 0; i < cksumsize; i++) { + if (cksum + buf[i] < cksum) + cksum += buf[i] + 1; + else + cksum += buf[i]; + } + for (hp = hps[hsh]; hp; hp = hp->next) + if (hp->size == st.st_size && hp->mtime == mtime) + break; + if (!hp) { + hp = malloc(sizeof(h)); + if (!hp) { + fprintf(stderr, "\nOut of memory 1\n"); + doexit(1); + } + hp->size = st.st_size; + hp->mtime = mtime; + hp->chain = NULL; + hp->next = hps[hsh]; + hps[hsh] = hp; + } + for (fp = hp->chain; fp; fp = fp->next) + if (fp->cksum == cksum) + break; + for (fp2 = fp; fp2 && fp2->cksum == cksum; fp2 = fp2->next) + if (fp2->ino == st.st_ino && fp2->dev == st.st_dev) { + close(fd); + if (verbose > 1 && namelen <= NAMELEN) + fprintf(stderr, "\r%*s\r", (int)(namelen + 2), ""); + return; + } + for (fp2 = fp; fp2 && fp2->cksum == cksum; fp2 = fp2->next) + if (!lstat (fp2->name, &st2) && S_ISREG (st2.st_mode) && + !stcmp (&st, &st2, content_only) && + st2.st_ino != st.st_ino && + st2.st_dev == st.st_dev) { + int fd2 = open (fp2->name, O_RDONLY); + if (fd2 < 0) continue; + if (fstat (fd2, &st2) || !S_ISREG (st2.st_mode) || st2.st_size == 0) { + close (fd2); + continue; + } + ncomp++; + lseek(fd, 0, SEEK_SET); + for (fsize = st.st_size; fsize > 0; fsize -= NIOBUF) { + off_t rsize = fsize >= NIOBUF ? 
NIOBUF : fsize;
+                    if (read (fd, iobuf1, rsize) != rsize || read (fd2, iobuf2, rsize) != rsize) {
+                        close(fd);
+                        close(fd2);
+                        fprintf(stderr, "\nReading error\n");
+                        return;
+                    }
+                    if (memcmp (iobuf1, iobuf2, rsize)) break;
+                }
+                close(fd2);
+                if (fsize > 0) continue;
+                if (lstat (name, &st3)) {
+                    fprintf(stderr, "\nCould not stat %s again\n", name);
+                    close(fd);
+                    return;
+                }
+                st3.st_atime = st.st_atime;
+                if (stcmp (&st, &st3, 0)) {
+                    fprintf(stderr, "\nFile %s changed underneath us\n", name);
+                    close(fd);
+                    return;
+                }
+                n1 = fp2->name;
+                n2 = name;
+                if (!no_link) {
+                    const char *suffix = ".$$$___cleanit___$$$";
+                    const size_t suffixlen = strlen(suffix);
+                    size_t n2len = strlen(n2);
+                    dynstr nam2 = {NULL, 0};
+                    growstr(&nam2, add2(n2len, suffixlen));
+                    memcpy(nam2.buf, n2, n2len);
+                    memcpy(&nam2.buf[n2len], suffix, suffixlen + 1);
+                    /* First create a temporary link to n1 under a new name */
+                    if (link(n1, nam2.buf)) {
+                        fprintf(stderr, "\nFailed to hardlink %s to %s (create temporary link as %s failed - %s)\n", n1, n2, nam2.buf, strerror(errno));
+                        free(nam2.buf);
+                        continue;
+                    }
+                    /* Then rename into place over the existing n2 */
+                    if (rename (nam2.buf, n2)) {
+                        fprintf(stderr, "\nFailed to hardlink %s to %s (rename temporary link to %s failed - %s)\n", n1, n2, n2, strerror(errno));
+                        /* Something went wrong, try to remove the now redundant temporary link */
+                        if (unlink(nam2.buf)) {
+                            fprintf(stderr, "\nFailed to remove temporary link %s - %s\n", nam2.buf, strerror(errno));
+                        }
+                        free(nam2.buf);
+                        continue;
+                    }
+                    free(nam2.buf);
+                }
+                nlinks++;
+                if (st3.st_nlink > 1) {
+                    /* We actually did not save anything this time, since the link second argument
+                       had some other links as well. */
+                    if (verbose > 1)
+                        fprintf(stderr, "\r%*s\r%s %s to %s\n", (int)(((namelen > NAMELEN) ? 0 : namelen) + 2), "", (no_link ? "Would link" : "Linked"), n1, n2);
+                } else {
+                    nsaved+=((st.st_size+4095)/4096)*4096;
+                    if (verbose > 1)
+                        fprintf(stderr, "\r%*s\r%s %s to %s, %s %ld\n", (int)(((namelen > NAMELEN) ? 0 : namelen) + 2), "", (no_link ? "Would link" : "Linked"), n1, n2, (no_link ? "would save" : "saved"), st.st_size);
+                }
+                close(fd);
+                return;
+            }
+        fp2 = malloc(add3(sizeof(f), namelen, 1));
+        if (!fp2) {
+            fprintf(stderr, "\nOut of memory 2\n");
+            doexit(2);
+        }
+        close(fd);
+        fp2->ino = st.st_ino;
+        fp2->dev = st.st_dev;
+        fp2->cksum = cksum;
+        memcpy(fp2->name, name, namelen + 1);
+        if (fp) {
+            fp2->next = fp->next;
+            fp->next = fp2;
+        } else {
+            fp2->next = hp->chain;
+            hp->chain = fp2;
+        }
+        if (verbose > 1 && namelen <= NAMELEN)
+            fprintf(stderr, "\r%*s\r", (int)(namelen + 2), "");
+        return;
+    }
+}
+
+int main(int argc, char **argv)
+{
+    int ch;
+    int i;
+    int errornumber;
+    PCRE2_SIZE erroroffset;
+    dynstr nam1 = {NULL, 0};
+    while ((ch = getopt (argc, argv, "cnvhfx:")) != -1) {
+        switch (ch) {
+        case 'n':
+            no_link++;
+            break;
+        case 'v':
+            verbose++;
+            break;
+        case 'c':
+            content_only++;
+            break;
+        case 'f':
+            force=1;
+            break;
+        case 'x':
+            exclude_pattern = (PCRE2_SPTR)optarg;
+            break;
+        case 'h':
+        default:
+            usage(argv[0]);
+        }
+    }
+    if (optind >= argc)
+        usage(argv[0]);
+    if (exclude_pattern) {
+        re = pcre2_compile(
+            exclude_pattern, /* the pattern */
+            PCRE2_ZERO_TERMINATED, /* indicates pattern is zero-terminate */
+            0, /* default options */
+            &errornumber,
+            &erroroffset,
+            NULL); /* use default compile context */
+        if (!re) {
+            PCRE2_UCHAR buffer[256];
+            pcre2_get_error_message(errornumber, buffer, sizeof(buffer));
+            fprintf(stderr, "pattern error at offset %d: %s\n", (int)erroroffset, buffer);
+            usage(argv[0]);
+        }
+        match_data = pcre2_match_data_create_from_pattern(re, NULL);
+    }
+    for (i = optind; i < argc; i++)
+        rf(argv[i]);
+    while (dirs) {
+        DIR *dh;
+        struct dirent *di;
+        d * dp = dirs;
+        size_t nam1baselen = strlen(dp->name);
+        dirs = dp->next;
+        growstr(&nam1, add2(nam1baselen, 1));
+        memcpy(nam1.buf, dp->name, nam1baselen);
+        free (dp);
+        nam1.buf[nam1baselen++] = '/';
+        nam1.buf[nam1baselen] = 0;
+        dh = opendir (nam1.buf);
+        if (dh == NULL)
+            continue;
+        ndirs++;
+        while ((di = readdir (dh)) != NULL) {
+            if (!di->d_name[0])
+                continue;
+            if (di->d_name[0] == '.') {
+                if (!di->d_name[1] || !strcmp(di->d_name, ".."))
+                    continue;
+            }
+            if (re && pcre2_match(
+                    re, /* compiled regex */
+                    (PCRE2_SPTR)di->d_name,
+                    strlen(di->d_name),
+                    0, /* start at offset 0 */
+                    0, /* default options */
+                    match_data, /* block for storing the result */
+                    NULL) /* use default match context */
+                >= 0) {
+                if (verbose) {
+                    nam1.buf[nam1baselen] = 0;
+                    fprintf(stderr,"Skipping %s%s\n", nam1.buf, di->d_name);
+                }
+                continue;
+            }
+            {
+                size_t subdirlen;
+                growstr(&nam1, add2(nam1baselen, subdirlen = strlen(di->d_name)));
+                memcpy(&nam1.buf[nam1baselen], di->d_name, add2(subdirlen, 1));
+            }
+            rf(nam1.buf);
+        }
+        closedir(dh);
+    }
+    doexit(0);
+    return 0;
+}
diff --git a/SPECS/hardlink.spec b/SPECS/hardlink.spec
new file mode 100644
index 0000000..a635370
--- /dev/null
+++ b/SPECS/hardlink.spec
@@ -0,0 +1,193 @@
+Summary: Create a tree of hardlinks
+Name: hardlink
+Version: 1.3
+Release: 6%{?dist}
+Epoch: 1
+License: GPLv2+
+URL: https://pagure.io/hardlink
+Source0: https://pagure.io/hardlink/raw/master/f/hardlink.c
+Source1: https://pagure.io/hardlink/raw/master/f/hardlink.1
+Source2: https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt
+BuildRequires: pcre2-devel, gcc
+
+%description
+hardlink is used to create a tree of hard links. It's used by kernel
+installation to dramatically reduce the amount of disk space used by each
+kernel package installed.
+
+%prep
+%setup -q -c -T
+install -pm 644 %{SOURCE0} %{SOURCE2} .
+
+%build
+%{__cc} %{optflags} %{__global_ldflags} hardlink.c -o hardlink -lpcre2-8
+
+%install
+install -D -m 644 %{SOURCE1} %{buildroot}%{_mandir}/man1/hardlink.1
+install -D -m 755 hardlink %{buildroot}%{_sbindir}/hardlink
+
+%files
+%license gpl-2.0.txt
+%{_sbindir}/hardlink
+%{_mandir}/man1/hardlink.1*
+
+%changelog
+* Mon Feb 19 2018 Francisco Javier Tsao Santín - 1:1.3-6
+- Added gcc to build requirements
+
+* Wed Feb 07 2018 Fedora Release Engineering - 1:1.3-5
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_28_Mass_Rebuild
+
+* Sat Aug 19 2017 Tomasz Kłoczko - 1:1.3-4
+- remove manually added pcre2 requires (this is autogenerated)
+- removed BuildRoot, %%defattr() and Group (new Fedora Packaging Guildline)
+- do not use straight gcc and add use %%{__global_ldflags}
+- use %%_licensedir is no longer needed
+- minor cleanups:
+-- reformat %%description to 80 col
+-- added full URLs in Source fields
+-- more macros
+-- a bit simpler %%prep
+
+* Wed Aug 02 2017 Fedora Release Engineering - 1:1.3-3
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Binutils_Mass_Rebuild
+
+* Wed Jul 26 2017 Fedora Release Engineering - 1:1.3-2
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Mass_Rebuild
+
+* Sun Apr 23 2017 Francisco Javier Tsao Santín - 1:1.3-1
+- Patch by Todd Lewis that adds option -x to exclude files with pcre lib
+- This patch solves RH Bugzilla ID's 955246 1322198
+
+* Thu Feb 16 2017 Francisco Javier Tsao Santín - 1:1.2-1
+- Fixed 32 bit build with gcc7 (RH Bugzilla ID 1422989)
+
+* Sun Feb 12 2017 Francisco Javier Tsao Santín - 1:1.1-4
+- Fixed source url and description in spec file
+
+* Fri Feb 10 2017 Fedora Release Engineering - 1:1.1-3
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_26_Mass_Rebuild
+
+* Sat Sep 03 2016 Kevin Fenzi - 1.1-2
+- Drop the kernel-utils obsolete that was added in 2005.
+
+* Sun Jul 10 2016 Francisco Javier Tsao Santín - 1:1.1-1
+- Patch by Travers Carter for making hardlinking atomic
+
+* Wed Feb 03 2016 Fedora Release Engineering - 1:1.0-23
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_24_Mass_Rebuild
+
+* Wed Jun 17 2015 Fedora Release Engineering - 1:1.0-22
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_23_Mass_Rebuild
+
+* Sat Aug 16 2014 Fedora Release Engineering - 1:1.0-21
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_21_22_Mass_Rebuild
+
+* Sat Jul 12 2014 Tom Callaway - 1:1.0-20
+- fix license handling
+
+* Sat Jun 07 2014 Fedora Release Engineering - 1:1.0-19
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_21_Mass_Rebuild
+
+* Sat Aug 03 2013 Fedora Release Engineering - 1:1.0-18
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_20_Mass_Rebuild
+
+* Wed Apr 10 2013 Jan Zeleny - 1:1.0-17
+- Mention -f option in the man page
+
+* Thu Feb 14 2013 Fedora Release Engineering - 1:1.0-16
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_19_Mass_Rebuild
+
+* Thu Jul 19 2012 Fedora Release Engineering - 1:1.0-15
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild
+
+* Sun Apr 15 2012 Jindrich Novy - 1:1.0-14
+- do not allow to hardlink files across filesystems by default (#786719)
+  (use -f option to override)
+
+* Fri Jan 13 2012 Fedora Release Engineering - 1:1.0-13
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_17_Mass_Rebuild
+
+* Fri Oct 21 2011 Jindrich Novy - 1:1.0-12
+- fix possible buffer overflows, integer overflows (CVE-2011-3630 CVE-2011-3631 CVE-2011-3632)
+- update man page
+
+* Wed Mar 2 2011 Jindrich Novy - 1:1.0-11
+- don't use mmap(2) to avoid failures on i386 with 1GB files and larger (#672917)
+- fix package URL (#676962)
+
+* Wed Feb 09 2011 Fedora Release Engineering - 1:1.0-10
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_15_Mass_Rebuild
+
+* Fri Jul 24 2009 Fedora Release Engineering - 1:1.0-9
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_12_Mass_Rebuild
+
+* Tue Feb 24 2009 Fedora Release Engineering - 1:1.0-8
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_11_Mass_Rebuild
+
+* Mon Feb 25 2008 Jindrich Novy 1:1.0-7
+- manual rebuild because of gcc-4.3 (#434188)
+
+* Tue Feb 19 2008 Fedora Release Engineering - 1:1.0-6
+- Autorebuild for GCC 4.3
+
+* Thu Aug 23 2007 Jindrich Novy - 1:1.0-5
+- update License
+- rebuild for BuildID
+
+* Mon Apr 23 2007 Jindrich Novy - 1:1.0-4
+- include sources in debuginfo package (#230833)
+
+* Mon Feb 5 2007 Jindrich Novy - 1:1.0-3
+- merge review related spec fixes (#225881)
+
+* Sun Oct 29 2006 Jindrich Novy - 1:1.0-2
+- update docs to describe highest verbosity -vv option (#210816)
+- use dist
+
+* Wed Jul 12 2006 Jindrich Novy - 1:1.0-1.23
+- remove ugly suffixes added by rebuild script
+
+* Wed Jul 12 2006 Jesse Keating - 1:1.0-1.21.2.1
+- rebuild
+
+* Fri Feb 10 2006 Jesse Keating - 1:1.0-1.20.2
+- bump again for double-long bug on ppc(64)
+
+* Tue Feb 07 2006 Jesse Keating - 1:1.0-1.19.1
+- rebuilt for new gcc4.1 snapshot and glibc changes
+
+* Fri Dec 09 2005 Jesse Keating
+- rebuilt
+
+* Mon Nov 14 2005 Jindrich Novy
+- more spec cleanup - thanks to Matthias Saou (#172968)
+- use UTF-8 encoding in the source
+
+* Mon Nov 7 2005 Jindrich Novy
+- add hardlink man page
+- add -h option
+- use _sbindir instead of /usr/sbin directly
+- don't warn because of uninitialized variable
+- spec cleanup
+
+* Fri Aug 26 2005 Dave Jones
+- Document hardlink command line options. (Ville Skytta) (#161738)
+
+* Wed Apr 27 2005 Jeremy Katz
+- don't try to hardlink 0 byte files (#154404)
+
+* Fri Apr 15 2005 Florian La Roche
+- remove empty scripts
+
+* Tue Mar 1 2005 Dave Jones
+- rebuild for gcc4
+
+* Tue Feb 8 2005 Dave Jones
+- rebuild with -D_FORTIFY_SOURCE=2
+
+* Tue Jan 11 2005 Dave Jones
+- Add missing Obsoletes: kernel-utils
+
+* Sat Dec 18 2004 Dave Jones
+- Initial packaging, based upon kernel-utils.