File README.building of Package spark-kit
This package provides the build dependencies needed to build Apache Spark
using Tetra (ruby2.2-rubygem-tetra). Tetra is necessary because the
Maven-based build process downloads dependencies at build time.
Tetra keeps track of these downloaded dependencies during a so-called
"dry-run" build on the maintainer's machine and generates a tarball
(spark-kit.tar.xz) containing all of them. This tarball then allows the
same build to run offline in OBS, where Internet access is deliberately
unavailable so that builds stay reproducible.

To generate spark-kit.tar.xz for a new version of Spark, proceed as follows
(you will need to have Tetra installed for this):
Note: these steps need to be performed on the machine where you keep your
spark{,-kit} working copy.
1) Download an updated Spark tarball. The rest of this file assumes the new
tarball's version is 1.6.3, which was the version in use when this file was
written.
2) Initialize a Tetra build directory for the new Spark tarball by running
the command below in the directory that contains the tarball. Colons in OBS
checkout directory names can cause problems when building packages, so you
are advised to copy the source tarball to /tmp or another colon-free
location and run the process from there.
    tetra init spark spark-1.6.3.tar.gz
3) Perform a dry-run build with Tetra. The build commands are run inside the
dry-run session, and "exit" ends it:
    cd spark/src/spark-1.6.3
    tetra dry-run
    ./build/mvn clean install -DskipTests -Phadoop-2.4
    mvn dependency:go-offline -Phadoop-2.4
    exit
4) Generate the new kit tarball:
    tetra generate-all
You will now find the updated kit tarball in
    spark/packages/spark-kit.tar.xz
Replace the existing tarball in the spark-kit package with this one and bump
the package's version.
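
The two bookkeeping parts of this procedure (choosing a colon-free build
location before "tetra init", and copying the finished kit tarball into the
kit package afterwards) could be sketched as the shell helpers below. This
is only an illustration: the function names path_has_colon, pick_build_dir
and install_kit, and the scratch directory name, are hypothetical and are
not part of Tetra or OBS.

```shell
# Return success if the given path contains a colon.
path_has_colon() {
    case "$1" in
        *:*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print a directory that is safe to build in: the given directory if it is
# colon-free, otherwise a fresh scratch directory under /tmp.
# (The scratch directory name is an assumption made for this sketch.)
pick_build_dir() {
    if path_has_colon "$1"; then
        mktemp -d /tmp/spark-kit-build.XXXXXX
    else
        printf '%s\n' "$1"
    fi
}

# Copy the generated kit tarball from the build directory ($1) into the
# kit package working copy ($2). Fails if the tarball does not exist.
install_kit() {
    kit="$1/spark/packages/spark-kit.tar.xz"
    if [ ! -f "$kit" ]; then
        echo "kit tarball not found: $kit" >&2
        return 1
    fi
    cp "$kit" "$2/spark-kit.tar.xz"
}

# Possible usage (not run here; the kit working copy path is a placeholder):
#   build_dir=$(pick_build_dir "$PWD")
#   [ "$build_dir" = "$PWD" ] || cp spark-1.6.3.tar.gz "$build_dir"/
#   cd "$build_dir" && tetra init spark spark-1.6.3.tar.gz
#   ... dry-run and generate-all as described above ...
#   install_kit "$build_dir" /path/to/spark-kit
```

The helpers deliberately leave the version bump to the maintainer, since how
the version is recorded depends on the package's spec file.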