File theunreproduciblepackage-1.0.1.obscpio of Package theunreproduciblepackage

07070100000000000081A4000000000000000000000001662A61C400000430000000000000000000000000000000000000002700000000theunreproduciblepackage-1.0.1/COPYINGCopyright 2017 Bernhard M. Wiedemann and others

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
07070100000001000081A4000000000000000000000001662A61C40000008B000000000000000000000000000000000000002800000000theunreproduciblepackage-1.0.1/Makefileall:
	-find -mindepth 1 -type d | sort | xargs --verbose -IDIR make -k -C DIR

install:
	echo "dummy install: use out/ dir as appropriate"
07070100000002000081A4000000000000000000000001662A61C4000001AB000000000000000000000000000000000000002900000000theunreproduciblepackage-1.0.1/README.mdThe Unreproducible Package

is meant as a practical way to demonstrate the various ways
that software can break reproducible builds
using just low level primitives without requiring external existing programs
that implement these primitives themselves.

It is structured so that one subdirectory demonstrates one class of issues
in some variants observed in the wild.

See https://reproducible-builds.org/ for background info.
07070100000003000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002400000000theunreproduciblepackage-1.0.1/aslr07070100000004000081A4000000000000000000000001662A61C400000007000000000000000000000000000000000000002F00000000theunreproduciblepackage-1.0.1/aslr/.gitignore1
1b
2
07070100000005000081A4000000000000000000000001662A61C400000176000000000000000000000000000000000000002800000000theunreproduciblepackage-1.0.1/aslr/1.c#include <unistd.h>

typedef struct {
  char data1;
  short data2;
  // struct has a short (2-byte) value at offset 1, making it unaligned
  // causing compilers to do padding
  // but padding bytes are uninitialized and get random values from ASLR
} unalignedstruct;

int main()
{
  unalignedstruct x;
  x.data1=1;
  x.data2=0x203;
  write(1, &x, sizeof(x));
  return 0;
}
07070100000006000081A4000000000000000000000001662A61C40000009C000000000000000000000000000000000000002900000000theunreproduciblepackage-1.0.1/aslr/1b.c#include <stdio.h>

int main()
{
  int buffer[20];
  int sum;
  int i;
  for(i=0; i<20; i++) {
    sum+=buffer[i];
  }
  printf("%i\n", sum);
  return 0;
}
07070100000007000081A4000000000000000000000001662A61C40000004E000000000000000000000000000000000000002800000000theunreproduciblepackage-1.0.1/aslr/2.c#include <stdio.h>

int main()
{
  int x;
  printf("%p\n", &x);
  return 0;
}
07070100000008000081A4000000000000000000000001662A61C40000009F000000000000000000000000000000000000002D00000000theunreproduciblepackage-1.0.1/aslr/MakefileCFLAGS=-Wall
all: run

run: 1 1b 2
	for s in 1 ; do ./$$s ; done | od -tx1 > ../out/aslr
	for s in 1b 2 ; do ./$$s ; done >> ../out/aslr

clean:
	rm -f 1 1b 2
07070100000009000081A4000000000000000000000001662A61C40000071A000000000000000000000000000000000000002E00000000theunreproduciblepackage-1.0.1/aslr/README.mdSee also https://reproducible-builds.org/docs/value-initialization/

[ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization)
is controlled under Linux via `/proc/sys/kernel/randomize_va_space`.
It randomizes memory addresses and thus pointers,
so uninitialized memory, which often holds leftover pointers, also varies between runs.

ASLR can also be disabled per process using `setarch $(arch) -R make`.

But the proper fix for cases like `1.c` is to zero the memory with `memset` or `bzero` before using it.

## Seen in the wild:
* case 1 (uninitialized padding memory):
  * [LiE](https://github.com/davidsd/lie/pull/1/files)
  * [gcin](https://build.opensuse.org/request/show/520868)
  * [ipadic](https://build.opensuse.org/request/show/540040) http://rb.zq1.de/compare.factory-20170910/ipadic-compare.out

* case 1b (uninitialized memory):
  * [i4l-base](https://build.opensuse.org/request/show/539442)

* case 2 (pointers):
  * http://rb.zq1.de/compare.factory-20170910/python-rtslib-compare.out
  * http://rb.zq1.de/compare.factory-20170910/python-utidy-compare.out
  * http://rb.zq1.de/compare.factory-20170910/ragel-compare.out

* unknown:
  * http://rb.zq1.de/compare.factory-20171011/aegisub-compare.out
  * http://rb.zq1.de/compare.factory-20170910/gnustep-libobjc2-compare.out
  * http://rb.zq1.de/compare.factory-20170910/kdebindings-smokekde-compare.out
  * http://rb.zq1.de/compare.factory-20170910/kdebindings-smokeqt-compare.out
  * http://rb.zq1.de/compare.factory-20170910/ldc-compare.out
  * http://rb.zq1.de/compare.factory-20170910/libkolabxml-compare.out
  * http://rb.zq1.de/compare.factory-20170910/mkvtoolnix-compare.out
  * http://rb.zq1.de/compare.factory-20171011/nodejs6-compare.out
  * http://rb.zq1.de/compare.factory-20170910/perl-MooseX-Role-Cmd-compare.out
  * http://rb.zq1.de/compare.factory-20170910/quantum-espresso-compare.out
0707010000000A000041ED000000000000000000000004662A61C400000000000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/compile-time-check0707010000000B000081A4000000000000000000000001662A61C40000004E000000000000000000000000000000000000003B00000000theunreproduciblepackage-1.0.1/compile-time-check/Makefiledummy:
	echo "use 'make all'"

all:
	make -C benchmark
	make -C cpu-detection
0707010000000C000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000003C00000000theunreproduciblepackage-1.0.1/compile-time-check/benchmark0707010000000D000081A4000000000000000000000001662A61C4000001E5000000000000000000000000000000000000004000000000theunreproduciblepackage-1.0.1/compile-time-check/benchmark/1.c#include <stdio.h>

static __inline__ unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi) :: "rcx");
    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}

int main(int argc, char **argv)
{
  int i;
  unsigned long long start, end;
  if(argc>1)
    sscanf(argv[1], "%d", &i);
  else
    i=100000000;
  start = rdtsc();
  for(; i>0; --i) { }
  end = rdtsc();
  printf("took: %llu\n", end-start);
  return 0;
}
0707010000000E000081A4000000000000000000000001662A61C40000003C000000000000000000000000000000000000004500000000theunreproduciblepackage-1.0.1/compile-time-check/benchmark/Makefileall: run
run: 1
	./1 | tee ../../out/compile-time-benchmark
0707010000000F000081A4000000000000000000000001662A61C4000001FE000000000000000000000000000000000000004600000000theunreproduciblepackage-1.0.1/compile-time-check/benchmark/README.mdSome packages come with alternate implementations for some functions
and do benchmarking during compile time to decide which version to use.

This *might* be stable on one machine, but results will vary when building in VMs or across different types of machines.

## Examples

* [autogen](https://build.opensuse.org/request/show/585128) determines `AG_TIMEOUT` during build time.
* ksh and [graphviz](https://build.opensuse.org/request/show/498837) determine `_mmap_worthy` during build time using benchmarks.
07070100000010000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000004000000000theunreproduciblepackage-1.0.1/compile-time-check/cpu-detection07070100000011000081A4000000000000000000000001662A61C400000033000000000000000000000000000000000000004900000000theunreproduciblepackage-1.0.1/compile-time-check/cpu-detection/Makefileall:
	lscpu > ../../out/compile-time-cpu-detection
07070100000012000081A4000000000000000000000001662A61C4000001ED000000000000000000000000000000000000004A00000000theunreproduciblepackage-1.0.1/compile-time-check/cpu-detection/README.mdSome packages do CPU-detection at compile time.
This can happen using gcc's -march=native, -mcpu=native or -mtune=native compiler options.

The first two might even cause the software to break on older CPUs, because those CPUs lack the newer instructions. That is usually not desirable for a general-purpose software distribution.

## Examples

* See [r-b blog 168](https://reproducible-builds.org/blog/posts/168/) and [this openSUSE bug](https://bugzilla.opensuse.org/show_bug.cgi?id=1100677)
07070100000013000041ED000000000000000000000003662A61C400000000000000000000000000000000000000000000002B00000000theunreproduciblepackage-1.0.1/environment07070100000014000081A4000000000000000000000001662A61C400000077000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/environment/Makefileall: run
run:
	(hostname ; echo $$HOSTNAME ; uname -a ; whoami ; id ; echo $$USER $$UID) > ../out/environment-explicit
07070100000015000081A4000000000000000000000001662A61C400000282000000000000000000000000000000000000003500000000theunreproduciblepackage-1.0.1/environment/README.mdsome software explicitly collects information from the build environment (e.g. hostname, kernel version)
or implicitly uses some of it (e.g. locale or timezone)

It is possible to get reproducible build results by using a normalized build environment such as a virtual machine that always builds with the same kernel and hostname.

The other way is to patch software to not collect or depend on that environment, possibly only when `SOURCE_DATE_EPOCH` is set, so that developers' debug builds still get the information they need.

https://github.com/bmwiedemann/reproducible-faketools can help to discover or work around some of these issues as well.
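
The patching approach could look roughly like the following Python sketch. The helper name `build_host` and the placeholder string are made up for illustration, and treating a set `SOURCE_DATE_EPOCH` as the signal for a reproducible build is an assumption taken from the paragraph above, not a fixed rule.

```python
import os
import socket


def build_host():
    """Return the host name to embed in build output."""
    # Assumption: a set SOURCE_DATE_EPOCH signals that a reproducible
    # build is wanted, so a fixed placeholder replaces the real name.
    if "SOURCE_DATE_EPOCH" in os.environ:
        return "reproducible"
    # Developer builds keep the real host name for debugging.
    return socket.gethostname()


print("built on", build_host())
```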
070701000000160000A1FF0000000000000000000000016629ED3000000009000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/environment/locale../locale07070100000017000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/environment/timezone07070100000018000081A4000000000000000000000001662A61C4000000C8000000000000000000000000000000000000003D00000000theunreproduciblepackage-1.0.1/environment/timezone/Makefileall: run
run:
	( date -I ; perl -e 'print scalar localtime,"\n"' ; python3 -c 'import time; print(time.localtime())' ) > ../../out/environment-implicit-timezone # differs for TZ=UTC-14 and TZ=UTC+12

07070100000019000081A4000000000000000000000001662A61C4000000CD000000000000000000000000000000000000003E00000000theunreproduciblepackage-1.0.1/environment/timezone/README.mdTo reproduce the issue:

```
make TZ=UTC-14
make TZ=UTC+12
```

To avoid the issue, `export TZ=UTC`,
or always call `date -u`
and replace calls to `localtime` with `gmtime`,
so that the output always uses UTC.
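
For Python build scripts, the `gmtime` advice might be applied as in this small sketch. The function name `format_utc` is hypothetical; the point is only that `time.gmtime` ignores `TZ`, while `time.localtime` does not.

```python
import time


def format_utc(epoch_seconds):
    """Format a Unix timestamp in UTC, independent of the TZ setting."""
    # gmtime() always converts to UTC; localtime() would apply TZ.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(epoch_seconds))


# Same output for TZ=UTC-14 and TZ=UTC+12.
print(format_utc(0))
```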
0707010000001A000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002A00000000theunreproduciblepackage-1.0.1/filesystem0707010000001B000081A4000000000000000000000001662A61C400000025000000000000000000000000000000000000003300000000theunreproduciblepackage-1.0.1/filesystem/Makefileall:
	stat . > ../out/filesystem.txt
0707010000001C000081A4000000000000000000000001662A61C4000003E9000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/filesystem/README.mdThere are several ways, a filesystem can be used to introduce nondeterminism:

1. [readdir order](../readdir) is already covered in its own section

2. storing `mtime`, `ctime` or `atime` values of files touched during build will introduce [varying timestamps](../timestamp)

3. using `st_ino` or `st_dev` fields from [`stat(2)`](http://man7.org/linux/man-pages/man2/stat.2.html)

4. storing `st_size` for directories or using `st_blocks` or `st_blksize` for any file, because these values are filesystem-dependent. For example, creating a million files and removing them again leaves the directory large on ext4.
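
A tool that records file metadata can restrict itself to the fields that are not listed above. The following Python sketch shows one way to do that; the helper name `stable_metadata` and the choice of fields are assumptions for illustration, and the permission bits can still vary with the umask of the build environment.

```python
import os
import stat


def stable_metadata(path):
    """Return only stat fields that do not depend on the filesystem."""
    st = os.lstat(path)
    info = {
        "mode": stat.S_IMODE(st.st_mode),  # permission bits
        "type": stat.S_IFMT(st.st_mode),   # file type bits
    }
    # st_size is only recorded for regular files; for directories it
    # depends on the filesystem and its history.
    if stat.S_ISREG(st.st_mode):
        info["size"] = st.st_size
    # Deliberately omitted: st_ino, st_dev, st_blocks, st_blksize and
    # the atime/mtime/ctime timestamps.
    return info


print(stable_metadata("."))
```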


## Seen in the wild:

1.
    * see [readdir](../readdir)

2.
    * .pyc files in cpython
    * compiled free pascal (fpc) files
    * fwnn .dic files

3.
    * [fwnn](https://osdn.net/projects/freewnn/ticket/38482)
    * [geany](https://bugzilla.opensuse.org/show_bug.cgi?id=1049382)

4.
    * [rpm](https://github.com/rpm-software-management/rpm/commit/2cf7096ba534b065feb038306c792784458ac9c7)
0707010000001D000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002D00000000theunreproduciblepackage-1.0.1/floatingpoint0707010000001E000081A4000000000000000000000001662A61C400000012000000000000000000000000000000000000003600000000theunreproduciblepackage-1.0.1/floatingpoint/Makefileall:
	true # TODO
0707010000001F000081A4000000000000000000000001662A61C4000002C8000000000000000000000000000000000000003D00000000theunreproduciblepackage-1.0.1/floatingpoint/README.markdownA few cases have been observed in the wild, where build results
depended on the CPU type, because floating point operations
would yield different accuracy.

This might also produce deviations between i586 and x86\_64 builds.

In some of these cases the goal of reproducibility may conflict with performance or accuracy.

# Seen in the wild:

* [fluidsynth](https://github.com/FluidSynth/fluidsynth/pull/512) fused-multiply-add used in glibc-2.29 pow if available

* [calibre](https://github.com/kovidgoyal/calibre/blob/fab8c8f2d4d8c0c1b2046c2fbfd204189191c4c5/src/calibre/linux.py#L1171) `.png` files vary from SSE4.1 ?

* [piglit](https://rb.zq1.de/compare.factory-20181023/piglit-compare.out) - already fixed?
07070100000020000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002400000000theunreproduciblepackage-1.0.1/hash07070100000021000081A4000000000000000000000001662A61C400000040000000000000000000000000000000000000002D00000000theunreproduciblepackage-1.0.1/hash/Makefileall: run

run:
	for s in hash.* ; do ./$$s ; done > ../out/hash
07070100000022000081A4000000000000000000000001662A61C4000004CF000000000000000000000000000000000000002E00000000theunreproduciblepackage-1.0.1/hash/README.mdSee also https://reproducible-builds.org/docs/stable-outputs/

Many interpreted languages offer randomized hash tables (aka dict aka associative array) as a defense against DoS (see [oCERT-2011-003](http://www.ocert.org/advisories/ocert-2011-003.html) for more discussion).

If a build process depends on output from such a randomized hash, it can be nondeterministic.

One possible solution is to set the following variables in the build environment:

```bash
export QT_HASH_SEED=0
export PERL_HASH_SEED=42
export PYTHONHASHSEED=0
```

This tells Perl, Python and Qt's QHash to use a constant hash seed instead of a randomized one.

Another correct approach when a build process depends on output from a hash is to sort the keys of the hash and explicitly output the entries in key order.
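
As a minimal Python sketch of the sorted-keys approach (the dictionary contents mirror `hash.py`; the helper name `format_sorted` is made up for illustration):

```python
def format_sorted(mapping):
    """Return one 'key => value' line per entry, ordered by key."""
    # sorted() orders the keys by code point, so the result does not
    # depend on the hash seed or on insertion order.
    return [f"{key} => {mapping[key]}" for key in sorted(mapping)]


myhash = {"key" + str(i): "value" + str(i) for i in range(1, 10)}
for line in format_sorted(myhash):
    print(line)
```

The same pattern applies to sets of strings, whose iteration order in Python 3 still depends on `PYTHONHASHSEED`.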

Observed in the wild:

* C++ [llvm](https://reviews.llvm.org/D50559)
* C++ [gcc](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90778)
* go [rclone](https://github.com/rclone/rclone/pull/3289)
* perl [HSAIL-Tools](https://github.com/HSAFoundation/HSAIL-Tools/pull/51)
* perl [yast-x11](https://github.com/yast/yast-x11/pull/18)
* salt [bind-formula](https://github.com/saltstack-formulas/bind-formula/pull/110/commits/7f500766e0d9aec76522feb89e02bd1f3b0b7d42)
07070100000023000081ED000000000000000000000001662A61C4000000ED000000000000000000000000000000000000002C00000000theunreproduciblepackage-1.0.1/hash/hash.pl#!/usr/bin/perl
use strict;
my %hash;
for (1..10) {
    $hash{"key$_"} = "value$_";
}

foreach(keys(%hash)) {
    print "$_ => $hash{$_}\n";
}
print "---\n";
# using an iterator
while(my @e=each(%hash)) {
    print "$e[0] => $e[1]\n";
}
07070100000024000081ED000000000000000000000001662A61C40000011B000000000000000000000000000000000000002C00000000theunreproduciblepackage-1.0.1/hash/hash.py#!/usr/bin/python3
# works with python2 and python3

myhash={}
for i in range(1,10):
    myhash["key" + str(i)] = "value" + str(i)

print(myhash)

for key in myhash:
    print(key + " => " + myhash[key])

print("---")
for key,value in myhash.items():
    print(key + " => " + value)
07070100000025000041ED000000000000000000000005662A61C400000000000000000000000000000000000000000000002600000000theunreproduciblepackage-1.0.1/locale07070100000026000081A4000000000000000000000001662A61C40000006B000000000000000000000000000000000000002F00000000theunreproduciblepackage-1.0.1/locale/Makefiledummy:
	echo "use 'make all'"
all:
	for d in case sort wild ; do \
            make -C $$d ;\
        done
07070100000027000081A4000000000000000000000001662A61C40000021C000000000000000000000000000000000000003600000000theunreproduciblepackage-1.0.1/locale/README.markdownThis is a collection of some locale-oriented pitfalls. This is especially
dangerous with European languages, which use the Latin alphabet but have some rules
that differ from the `en_US` locale. The examples included here are mainly around
common Roman letters which made it to most European languages, without
diacritics.

* `case` &mdash; a lowercase / uppercase conversion
* `sort` &mdash; different order of the letters
* `wild` &mdash; some wildcards have a different meaning

The solution, in most cases, should be to enforce the `C.UTF-8` locale.
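
In a Python build script, enforcing the locale could look like the sketch below. The helper name `fixed_locale_env` is hypothetical, and whether `C.UTF-8` is available depends on the build system, so treat this as an illustration rather than a drop-in fix.

```python
import os


def fixed_locale_env(base=None):
    """Copy an environment and pin its locale to C.UTF-8."""
    env = dict(os.environ if base is None else base)
    # LC_ALL overrides LANG and every individual LC_* category.
    env["LC_ALL"] = "C.UTF-8"
    return env


names = ["Z-file", "T-file", "a-file"]
# sorted() compares code points and does not consult the locale.
print(sorted(names))
print(fixed_locale_env({"LANG": "et_EE.UTF-8"}))
```

The returned dictionary can be passed as `env=` to `subprocess.run()` when the script calls tools such as `sort` or `tr`.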
07070100000028000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002B00000000theunreproduciblepackage-1.0.1/locale/case07070100000029000081A4000000000000000000000001662A61C40000009A000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/locale/case/Makefileall: run

run:
	echo i | tr '[:lower:]' '[:upper:]' > ../../out/upper-i.txt
	echo I | tr '[:upper:]' '[:lower:]' > ../../out/lower-I.txt

.PHONY: all run
0707010000002A000081A4000000000000000000000001662A61C4000000F7000000000000000000000000000000000000003B00000000theunreproduciblepackage-1.0.1/locale/case/README.markdownDifferent languages have different alphabets and some of them have different
concepts of which letters form lowercase-uppercase pairs. One example is the
Turkish language, which has the letter pairs `Iı` and `İi`.

```
make
make LC_CTYPE=tr_TR.UTF-8
```
0707010000002B000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002B00000000theunreproduciblepackage-1.0.1/locale/sort0707010000002C000081A4000000000000000000000001662A61C400000004000000000000000000000000000000000000003600000000theunreproduciblepackage-1.0.1/locale/sort/.gitignorein/
0707010000002D000081A4000000000000000000000001662A61C4000000D5000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/locale/sort/Makefileall: run

in:
	mkdir in
	for c in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do \
		touch in/$${c}-file; \
	done

run: in
	ls -1 in | sort > ../../out/ls.txt

clean:
	$(RM) -rf in

.PHONY: all run clean
0707010000002E000081A4000000000000000000000001662A61C4000000EF000000000000000000000000000000000000003B00000000theunreproduciblepackage-1.0.1/locale/sort/README.markdownDifferent languages have different alphabets and some of them have letters in
a different order than your language. One example is the Estonian language, which has
the letter Z before the letters T, U, V, W, X, Y.

```
make
make LC_COLLATE=eesti
```
0707010000002F000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002B00000000theunreproduciblepackage-1.0.1/locale/wild07070100000030000081A4000000000000000000000001662A61C400000004000000000000000000000000000000000000003600000000theunreproduciblepackage-1.0.1/locale/wild/.gitignorein/
07070100000031000081A4000000000000000000000001662A61C4000000CB000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/locale/wild/Makefileall: run

in:
	mkdir in
	for c in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do \
		touch in/$${c}-file; \
	done

run: in
	cp in/[A-Z]* ../../out/

clean:
	$(RM) -rf in/

.PHONY: all run clean
07070100000032000081A4000000000000000000000001662A61C400000106000000000000000000000000000000000000003B00000000theunreproduciblepackage-1.0.1/locale/wild/README.markdownWildcards can have different meaning for locales. The example is Estonian
language which has the letter Z before letters T, U, V, W, X, Y. This affects
the common `[A-Z]` idiom in both globbing (bash) and regular expressions.

```
make
make LC_COLLATE=eesti
```
07070100000033000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002300000000theunreproduciblepackage-1.0.1/out07070100000034000081A4000000000000000000000001662A61C400000002000000000000000000000000000000000000002E00000000theunreproduciblepackage-1.0.1/out/.gitignore*
07070100000035000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002300000000theunreproduciblepackage-1.0.1/pgo07070100000036000081A4000000000000000000000001662A61C400000181000000000000000000000000000000000000002C00000000theunreproduciblepackage-1.0.1/pgo/Makefileall: runmany

run:
	rm -f *.gcda
	make -B build CFLAGS="-O3 -fprofile-generate"
	make profile
	make clean
	make build CFLAGS="-O3 -fprofile-use -Wall"
	cp -a pgo ../out/

profile:
	bash -c 'echo $$RANDOM' | ./pgo || :

build: pgo
clean:
	rm -f pgo

runmany:
	for i in $$(seq 100) ; do \
	    make run >/dev/null 2>&1 ;\
	    md5sum pgo ;\
	done | sort | uniq -c > ../out/pgo-stats.txt
07070100000037000081A4000000000000000000000001662A61C40000048B000000000000000000000000000000000000002D00000000theunreproduciblepackage-1.0.1/pgo/README.mdProfile Guided Optimization (PGO)
is a feature found in gcc and maybe other compilers.

With PGO, a build happens in several stages:
1. the software is built with extra code for profile generation
2. the software is run and profile information (e.g. counters of calls that are independent of system performance) recorded in .gcda files
3. the software is built again using the information from .gcda files for optimization

This will create unreproducible binaries, unless all inputs in step 2 are constant.

Possible solutions:
* use constant input in step 2 (see gzip and libsamplerate below)
* remove .gcda files that differ across builds after step 2 (see bash below)
* disable PGO completely, losing some optimization

Seen in the wild:
* gcc
* python `make profile-opt`
* [openSUSE/bash](https://build.opensuse.org/request/show/498339)
* [openSUSE/grep](https://build.opensuse.org/request/show/647618)
* [openSUSE/gzip](https://build.opensuse.org/request/show/499887)
* [openSUSE/libsamplerate](https://build.opensuse.org/request/show/562897)
* [GNU hello](https://www.reddit.com/r/reproduciblebuilds/comments/tqrf9q/the_binary_that_varies_from_full_moon/)
07070100000038000081A4000000000000000000000001662A61C4000000A1000000000000000000000000000000000000002900000000theunreproduciblepackage-1.0.1/pgo/pgo.c#include <unistd.h>

int main()
{
  char ret=0;
  char c;
  while(read(0, &c, 1) == 1) {
    if(c < '4') ret++; // count the chars below "4"
  }
  return ret;
}
07070100000039000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002300000000theunreproduciblepackage-1.0.1/pid0707010000003A000081A4000000000000000000000001662A61C400000032000000000000000000000000000000000000002C00000000theunreproduciblepackage-1.0.1/pid/Makefileall: run

run: 
	for s in pid.* ; do ./$$s ; done
0707010000003B000081A4000000000000000000000001662A61C4000001E4000000000000000000000000000000000000002A00000000theunreproduciblepackage-1.0.1/pid/READMEProcesses can ask the operating system for their own process ID (PID).
This is often used in places where parallel processing requires unique temporary filenames to not accidentally overwrite the output of another process (see 'race').

However, sometimes the PID ends up in output files, as happens when compiling .c files whose temporary file names contain it.
Calling `strip` on the output can drop these file names and remove irreproducibility in that case.
Or use deterministic ways to determine a unique filename.
0707010000003C000081A4000000000000000000000001662A61C40000001B000000000000000000000000000000000000002B00000000theunreproduciblepackage-1.0.1/pid/dummy.cint main()
{
  return 0;
}
0707010000003D000081ED000000000000000000000001662A61C4000000DB000000000000000000000000000000000000002A00000000theunreproduciblepackage-1.0.1/pid/pid.sh#!/bin/sh
tmp=pidtmpfilename$$.c
cp -a dummy.c $tmp
gcc $tmp -o ../out/pidfromgcc
rm -f $tmp

# https://reproducible-builds.org/docs/archives/
tar c pid.sh > ../out/pidfrom.tar # POSIX PaX headers have a PID by default
0707010000003E000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002400000000theunreproduciblepackage-1.0.1/race0707010000003F000081A4000000000000000000000001662A61C400000033000000000000000000000000000000000000002D00000000theunreproduciblepackage-1.0.1/race/Makefileall: run

run: 
	for s in race.* ; do ./$$s ; done
07070100000040000081A4000000000000000000000001662A61C400000344000000000000000000000000000000000000002E00000000theunreproduciblepackage-1.0.1/race/README.mdRaces or race conditions happen when two processes do something in parallel and sometimes one part can be faster than another part and screw its results.

Races are generic bugs, independent of the goal of reproducible builds, but they can be hard to notice and catch, so they often get found as a side effect when working on reproducible builds.

Seen in the wild in:
* [autogen](https://savannah.gnu.org/support/index.php?109234)
* [intltool](https://bugs.launchpad.net/intltool/+bug/1687644)
* [openSUSE/python-singlespec/setup.py](http://rb.zq1.de/compare.factory-20170428/python-bottle-compare.out) where we generate python2 and python3 packages in one build and if those were done within the same second, python's setup.py would think it is already done and skip the requested install - solved by using `setup.py --force install`
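
For output-ordering races like the one in `race.2.sh`, one option is to keep the work parallel but collect the results in input order. The following Python sketch illustrates the idea; `racepart` here is only a stand-in for real work, and `run_parallel` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def racepart(value):
    """Stand-in for one unit of parallel work."""
    time.sleep(0.01)
    return value


def run_parallel(inputs):
    """Run racepart on all inputs in parallel, keeping input order."""
    # Executor.map() yields results in the order of the inputs,
    # regardless of which worker finishes first.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(racepart, inputs))


print(run_parallel(range(1, 11)))
```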
07070100000041000081ED000000000000000000000001662A61C4000000CC000000000000000000000000000000000000002E00000000theunreproduciblepackage-1.0.1/race/race.2.sh#!/bin/sh

racepart()
{
    input=$1
    sleep 0.1
    echo $input
}

for i in $(seq 1 10) ; do
    # & backgrounds the process to do parallel processing
    racepart $i &
done > ../out/race2-result
wait
07070100000042000081ED000000000000000000000001662A61C400000126000000000000000000000000000000000000002C00000000theunreproduciblepackage-1.0.1/race/race.pl#!/usr/bin/perl -w
# this behaves more deterministically than the shell examples
# but with varying taskset $N, it also has some randomness
use strict;
open (OUT, ">", "../out/race.pl.out");
for(1..5) {
  my $pid = fork();
  if($pid==0){
    print OUT "$_\n";
    print "$_\n";
    exit 0
  }
}
wait();
07070100000043000081ED000000000000000000000001662A61C40000011B000000000000000000000000000000000000002C00000000theunreproduciblepackage-1.0.1/race/race.sh#!/bin/sh

racepart()
{
    input=$1
    echo $input > tmpfile
    #sleep 0.$((RANDOM*3)) # optional
    cat tmpfile 2>/dev/null
    rm -f tmpfile
}

for i in $(seq 1 10) ; do
    # & backgrounds the process to do parallel processing
    racepart $i &
done > ../out/race-result
wait
07070100000044000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002600000000theunreproduciblepackage-1.0.1/random07070100000045000081A4000000000000000000000001662A61C400000049000000000000000000000000000000000000002F00000000theunreproduciblepackage-1.0.1/random/Makefileall: run

run:
	for s in random.* ; do ./$$s ; done > ../out/random-uuid
07070100000046000081A4000000000000000000000001662A61C400000157000000000000000000000000000000000000002D00000000theunreproduciblepackage-1.0.1/random/READMESee also https://reproducible-builds.org/docs/randomness/

Some packages want to generate unique IDs
to distinguish different outputs from each other.

The preferred fix is to use a hash of all the input instead.

Observed in the wild:
https://bugs.launchpad.net/qutim/+bug/1724148
https://github.com/varnishcache/varnish-cache/pull/2436/files
07070100000047000081ED000000000000000000000001662A61C400000043000000000000000000000000000000000000003000000000theunreproduciblepackage-1.0.1/random/random.pl#!/usr/bin/perl
print "UUID=\"".int(rand(1000000000000000))."\"\n"
07070100000048000081ED000000000000000000000001662A61C400000022000000000000000000000000000000000000003000000000theunreproduciblepackage-1.0.1/random/random.sh#!/bin/sh
echo "UUID=\"$RANDOM\""
07070100000049000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000002700000000theunreproduciblepackage-1.0.1/readdir0707010000004A000081A4000000000000000000000001662A61C400000003000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/readdir/.gitignorein
0707010000004B000081A4000000000000000000000001662A61C400000096000000000000000000000000000000000000003000000000theunreproduciblepackage-1.0.1/readdir/Makefileall: prep run

prep:
	mkdir -p in
	for i in $$(seq 10) ; do touch in/$$i ; done

run:
	for s in readdir.* ; do ./$$s ; done > ../out/readdir.filelist
0707010000004C000081A4000000000000000000000001662A61C40000043B000000000000000000000000000000000000002E00000000theunreproduciblepackage-1.0.1/readdir/READMESee also https://reproducible-builds.org/docs/stable-inputs/

According to the POSIX specification, readdir can return the entries of a directory
in any order; the specifics depend on the underlying filesystem.

On Linux with ext4 it is pretty random.

Observed in the wild:
Python:
https://github.com/dahlia/libsass-python/pull/212/files
https://github.com/mypaint/libmypaint/pull/108/files
https://github.com/skyjake/Doomsday-Engine/pull/18/files
https://github.com/kovidgoyal/html5-parser/pull/5/files
https://github.com/dugsong/libdnet/pull/42/files
https://github.com/carlos-jenkins/nested/pull/1/files
https://www.riverbankcomputing.com/pipermail/pyqt/2019-June/041854.html (os.walk)

C:
https://sourceforge.net/p/blobwars/patches/8/

Make (reproducible with glob-sort patch at https://savannah.gnu.org/bugs/?52076):
https://github.com/dunst-project/dunst/pull/372/files

shell/find:
https://github.com/crawl/crawl/pull/609/files
http://dpdk.org/dev/patchwork/patch/29949/

LISP:
https://github.com/SawfishWM/librep/pull/12/files

Rust:
https://github.com/rust-lang/git2-rs/pull/619
0707010000004D000081ED000000000000000000000001662A61C40000013A000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/readdir/readdir.pl#!/usr/bin/perl -w
opendir(D, "in");
foreach(readdir(D)) {
    print "$_\n";
}

use File::Find;
find({wanted=>sub {print "$_ "}}, "in");

# Find with:
# grep -r -e readdir -e File::Find .

# Fix with:
# foreach(sort(readdir(D)))

# find({wanted=>sub {print "$_ "}, preprocess => sub {sort {$a cmp $b} @_}}, "in");
0707010000004E000081ED000000000000000000000001662A61C40000019B000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/readdir/readdir.py#!/usr/bin/python3
import glob
import os

for entry in os.listdir('in'):
    print(entry)

for entry in glob.glob('in/*'):
    print(entry)

for entry in os.walk('in'):
    # note: for nested dirs, both list of dirs and list of files need to be sorted to make the output reproducible
    print(entry)

# Find with:
# grep -rE -e 'os\.(listdir|walk)' -e 'glob\.glob' .

# Fix with:
# sorted(os.listdir('in'))
0707010000004F000081ED000000000000000000000001662A61C40000011B000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/readdir/readdir.rb#!/usr/bin/ruby
puts Dir.glob("in/*")
puts Dir["in/*"]
puts Dir.children("in/")
puts Dir.entries("in/")
# See also: .foreach .each .each_child .read

# Find with:
# grep -rE -e 'Dir\.(glob|children|entries|foreach|each|read)' -e 'Dir\[' .

# Fix with:
# puts Dir.entries("in/").sort
07070100000050000081ED000000000000000000000001662A61C40000003F000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/readdir/readdir.sh#!/bin/sh
find in/ -type f

# Fix with:
# find in/ -type f | LC_ALL=C sort
07070100000051000081A4000000000000000000000001662A61C400000672000000000000000000000000000000000000003D00000000theunreproduciblepackage-1.0.1/theunreproduciblepackage.spec#
# spec file for package theunreproduciblepackage
#
# Copyright (c) 2024 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#


Name:           theunreproduciblepackage
Version:        1.0.1
Release:        0
Summary:        Demonstrator for sources of non-determinism
License:        MIT
URL:            https://github.com/bmwiedemann/theunreproduciblepackage/
Source:         %name-%version.tar
BuildRequires:  gcc
BuildRequires:  hostname
BuildRequires:  make
BuildRequires:  ruby

%description
The Unreproducible Package
is meant as a practical way to demonstrate the various ways that software can break reproducible builds using just low level primitives without requiring external existing programs that implement these primitives themselves.

It is structured so that one subdirectory demonstrates one class of issues in some variants observed in the wild.

See https://reproducible-builds.org/ for background info.

%prep
%autosetup -p1

%build
%make_build

%install
%make_install

%post
%postun

%files
%license COPYING
%doc README.md
%doc out

%changelog

07070100000052000041ED000000000000000000000003662A61C400000000000000000000000000000000000000000000002900000000theunreproduciblepackage-1.0.1/timestamp07070100000053000081A4000000000000000000000001662A61C40000000B000000000000000000000000000000000000003400000000theunreproduciblepackage-1.0.1/timestamp/.gitignoregenversion
07070100000054000081A4000000000000000000000001662A61C40000008C000000000000000000000000000000000000003200000000theunreproduciblepackage-1.0.1/timestamp/MakefileCFLAGS=-Wall
all: run

run: genversion
	for s in genversion genversion.[!c]* ; do ./$$s ; done > ../out/version.h

clean:
	rm -f genversion
07070100000055000081A4000000000000000000000001662A61C40000076B000000000000000000000000000000000000003300000000theunreproduciblepackage-1.0.1/timestamp/README.mdSee also https://reproducible-builds.org/docs/timestamps/

There are multiple valid approaches to fixing timestamps
that end up in build results:

1) If the timestamp is not required, it might be possible to drop it completely, but be careful when discussing this with upstream maintainers, as some have strong opinions here.

2) If it is meant to show the date, age or version of the software, such as the date at the top of a man page, it is possible to use the modification time of the ChangeLog file instead: that remains constant for release tarballs and git snapshots, but can easily be updated for other cases (e.g. debug builds). There is no portable shell command syntax for this, but on some systems (at least FreeBSD and Linux), `date -r ChangeLog` does the trick.

For file format converters, using the modification time of the input file(s) can also give a meaningful result, unless those inputs are themselves generated during the package build.

3) Use the [`SOURCE_DATE_EPOCH`](https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal) environment variable. Templates for most languages already exist to ease adoption. For C and shell, patches can be a bit complex, though.
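
As an illustration of approach 3, here is a minimal Python sketch that honours `SOURCE_DATE_EPOCH` and falls back to the current time when the variable is unset. The function names `build_datetime` and `build_date_line` are made up for this example, and formatting in UTC is an assumption chosen so that the build machine's timezone does not leak into the output.

```python
import datetime
import os
import time


def build_datetime(environ=os.environ):
    """Return the build time as a timezone-aware UTC datetime.

    Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set
    and falls back to the current time otherwise.
    """
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.datetime.fromtimestamp(epoch, tz=datetime.timezone.utc)


def build_date_line(environ=os.environ):
    """Format the timestamp in the same shape as the genversion scripts."""
    stamp = build_datetime(environ).strftime("%Y-%m-%d %H:%M:%S")
    return 'BUILD_DATE="{}"'.format(stamp)
```

Calling `build_date_line({"SOURCE_DATE_EPOCH": "0"})` returns `BUILD_DATE="1970-01-01 00:00:00"` on every machine, while calling it without the variable still yields a current timestamp for developer builds.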

Seen in the wild:
* https://github.com/cea-hpc/robinhood/pull/83/files shell
* https://github.com/SawfishWM/sawfish/pull/29/files shell
* https://sourceforge.net/p/shorewall/mailman/message/35956407/ shell
* https://github.com/Gnucash/gnucash/pull/180/files shell mtime
* https://gerrit.gromacs.org/#/c/6896/ cmake
* https://github.com/votca/csg/pull/228/files cmake
* https://github.com/magefile/mage/pull/474/files golang
* https://github.com/marshmallow-code/marshmallow/pull/679/files python
* https://debbugs.gnu.org/cgi/bugreport.cgi?bug=27773 perl
* https://github.com/ruby/rdoc/pull/570/files ruby
* https://github.com/AlephAlpha/build-time/pull/5/files rust
* https://github.com/ellson/graphviz/pull/1253/files TCL
07070100000056000041ED000000000000000000000002662A61C400000000000000000000000000000000000000000000003300000000theunreproduciblepackage-1.0.1/timestamp/copyright07070100000057000081A4000000000000000000000001662A61C400000062000000000000000000000000000000000000003C00000000theunreproduciblepackage-1.0.1/timestamp/copyright/Makefileall:
	date "+Copyright %Y by the authors. All rights granted" > ../../out/timestamp-copyright.txt
07070100000058000081A4000000000000000000000001662A61C4000004AE000000000000000000000000000000000000003D00000000theunreproduciblepackage-1.0.1/timestamp/copyright/README.mdSome software and documentation uses the year of build
in copyright strings.

As long as this only touches comments in .c or .h files,
it will only alter the build-id in the resulting binary.

However, building today's software in 2037 and getting a
`Copyright © 2002-2037 THEAUTHORS`
is wrong, because nobody authored anything in 2037 in this version.

Projects are better off manually updating the year with git commits
or updating the date from a ChangeLog or input file mtime
so it remains constant for release tarballs.
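
A minimal Python sketch of the mtime-based variant follows. The function name `copyright_year` and the default path `ChangeLog` are assumptions for illustration; the year is taken in UTC so that the build machine's timezone cannot shift it around New Year.

```python
import datetime
import os


def copyright_year(path="ChangeLog"):
    """Return the year of the file's modification time, in UTC.

    The mtime of a file shipped in a release tarball stays the same across
    rebuilds, so the resulting year does not depend on the build date.
    """
    mtime = int(os.stat(path).st_mtime)
    return datetime.datetime.fromtimestamp(mtime, tz=datetime.timezone.utc).year
```

A build script could then emit something like `"Copyright 2002-{} THEAUTHORS".format(copyright_year())` in place of the current year.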

See also the [Stack Overflow discussion](https://stackoverflow.com/questions/2390230/do-copyright-dates-need-to-be-updated) titled "Do copyright dates need to be updated?"

Seen in the wild:
* https://github.com/dealii/dealii/pull/7258/files cmake
* https://gitlab.com/gnutls/gnutls/merge_requests/928/diffs autoconf/shell
* https://github.com/kubernetes/kubernetes/pull/59172/files go
* https://github.com/pytest-dev/pytest/pull/3710/files python
* https://github.com/apache/cassandra/commit/84fc68ce3f77e88a542dd2443e560cb291109198 java
* https://issues.apache.org/jira/browse/MCOMPILER-380 java
* https://www.redhat.com/archives/libguestfs/2018-August/msg00230.html
07070100000059000081A4000000000000000000000001662A61C4000000D9000000000000000000000000000000000000003600000000theunreproduciblepackage-1.0.1/timestamp/genversion.c// observed in packages fontforge, groff, judy, lirc, mawk, pcp, xmgrace
#include <stdio.h>
#include <time.h>

int main()
{
  const time_t now = time(NULL);
  printf("BUILD_DATE=\"%s\"\n", ctime(&now));
  return 0;
}
0707010000005A000081ED000000000000000000000001662A61C400000037000000000000000000000000000000000000003700000000theunreproduciblepackage-1.0.1/timestamp/genversion.pl#!/usr/bin/perl
print 'BUILD_DATE="'.localtime."\"\n";
0707010000005B000081ED000000000000000000000001662A61C40000006C000000000000000000000000000000000000003700000000theunreproduciblepackage-1.0.1/timestamp/genversion.py#!/usr/bin/python3
import datetime
print('BUILD_DATE="'+datetime.datetime.today().strftime("%F %T") + "\"")
0707010000005C000081ED000000000000000000000001662A61C400000027000000000000000000000000000000000000003700000000theunreproduciblepackage-1.0.1/timestamp/genversion.sh#!/bin/sh
date +"BUILD_DATE=\"%F %T\""
07070100000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000B00000000TRAILER!!!79 blocks