Edinburgh Research Explorer Genefer: Programs for finding large probable generalized Fermat primes

Genefer is a suite of programs for performing Probable Primality (PRP) tests of Generalised Fermat numbers b 2n +1 (GFNs) using a Fermat test. Optimised implementations are available for modern CPUs using single instruction, multiple data (SIMD) instructions, as well as for GPUs using CUDA or OpenCL. Genefer has been extensively used by PrimeGrid – a volunteer computing project searching for large prime numbers of various kinds, including GFNs. Genefer’s architecture separates the high level logic such as checkpointing and user interface from the architecture-specific performance-critical parts of the implementation, which are suitable for re-use. Genefer is released under the MIT license. Source and binaries are available from www.assembla.com/ spaces/genefer.


Introduction
Genefer is a suite of programs for performing Probable Primality (PRP) tests of Generalised Fermat numbers b 2 n  +1 (GFNs).The code implements a Fermat primality test, computing a p-1 (mod p) for a candidate GFN p.It is a necessary, but not sufficient, condition for p prime that this test returns 1, so Genefer pseudo-primes must be subsequently tested for primality using a different program such as PFGW [1].Genefer was originally developed by the second author, who managed a manual distributed computing effort between 2000 and 2004, which discovered several 100,000 + digit primes [2].Since 2009, the PrimeGrid [3] volunteer computing project has used Genefer to extend the search to well over a million digits [4] and is currently searching for a world-record sized GFN prime.The current version 3.2.7 of Genefer includes contributions from a number of developers in the PrimeGrid community, and has been open-source since 2011.

Implementation and architecture
Genefer began as several monolithic C codes, each adapted for different CPU architectures (32-bit x86, x87 extended precision, and x86-64 with SSE2), with differing accuracy and performance characteristics.The accuracy manifests itself as a limit in the maximum value of b that can be tested for a given n before round-off errors are encountered (see Table 1).The limits are generated directly by the program and are checked at runtime.In version 2.3.0 the code was re-architected by the first author to provide a consistent user interface across all variants, and to define an API to separate the architecture-dependent portions of the code.The front-end contains common functionality such as command line argument parsing, the high-level logic which drives the Fermat test algorithm, checkpoint/ restart support, as well as testing and performance benchmark.The performance critical implementation of the large-integer multiplication kernels are encapsulated by a C++ abstract class, and each different optimised version is an implementation of this.
Subsequently, many new variants of the multiplication step have been added, optimised for modern processor Instruction Set Architectures including AVX and FMA3/4, as well as GPUs using either CUDA or OpenCL.Originally based on Discrete Weighted Transforms (DWT) [5], recent versions have been used the z-Transform [6] for improved performance and better accuracy.In addition, support for BOINC [7] has been added, which enables Genefer to be used by many thousands of users transparently via the BOINC client installed on their home computers.Since version 3.2.0,multiple transform implementations may be compiled into a single executable, which may either be auto-selected at runtime based on the detected hardware, or selected by the user.A summary of the performance and accuracy of each of the transforms in Genefer 3.2.7 is given in Table 1.Note that the CUDA implementation is a DWT based on NVIDIA CUDA Fast Fourier Transform library (cuFFT) while the OpenCL version is a bespoke z-Transform.
The Genefer source code available from SVN contains a README with instructions on building the various different versions.Makefiles are provided for Windows, Linux, and Mac platforms, along with Visual Studio 2012 and 2013 solution files for building the GPU versions on Windows.

Usage
Every version shares the same command line interface, and supports three main modes of operation: interactive, where the user navigates a textual menu system; quick test, which tests a single candidate GFN specified via the -q flag; and batch mode, where a file is read containing a list of candidates to be processed.
For example, using the quick test (-q) option executing a probable primality test on a single candidate is as simple as: If several candidates are to be tested, they can be listed in a file of b, N pairs, one per line (some lines have been removed from the code output, for clarity), which is passed to Genefer as a command line argument: Intel Xeon X5650, 3 Intel Xeon E5-2670, 4 Intel Xeon E5620, 5 Intel Xeon E5-1650v2.The interactive mode may be accessed by running the binary without any arguments.
A "Generalized Fermat Prime Search" discussion forum exists at PrimeGrid (http://www.primegrid.com/forum_forum.php?id=75) where both the developers and users of Genefer are active.It is an excellent resource for asking questions and is searchable to find solutions to previouslyencountered issues.

Quality control
Genefer contains a suite of inbuilt tests, which cover both prime and non-prime candidates for all relevant values of N -32 to 524288 for CPU builds and 8192 to 4194304 for GPU builds.These are system tests that carry out an entire calculation from start to finish, and verify against a known reference result.Typically, these tests are run by developers as regression tests during the development of new versions.
Development versions of Genefer are identified by a version string with the '-dev' suffix.Prior to release, a set of acceptance tests are carried out in collaboration with PrimeGrid, which includes manual tests against known reference results, and integration tests where the release candidate app is used in a production environment and validated against the previous released version.These tests are carried out on all 3 supported platforms (Windows, Linux, Mac), for all transform types, and include use of the checkpoint/restart functionality.Once the acceptance tests are completed, the '-dev' suffix is removed and the release version is tagged in SVN.The released binaries are then distributed to PrimeGrid users via BOINC.

Operating system
Runs on Windows (98 or later), Linux (2.6 or greater) and Mac OS X (10.5 or greater)

Programming language
Genefer is written in C++, with Intel vector intrinsics and long double types.The CPU version of the code can be compiled with gcc 4.9 or clang 3.6.The CUDA and OpenCL code are compiled with MS Visual Studio 2012 or later on Windows.

Additional system requirements
Genefer requires an x86 CPU.For the CUDA version of the code an NVIDIA GPU with compute capability 1.3 or above is required.For the OpenCL version, an OpenCL 1.0 device supporting the cl_khr_fp64 or cl_amd_fp64 extension (double precision floating point) is required.

Dependencies
Currently, the BOINC libraries are required to build genefer, although the code can be run in a stand-alone mode.For CUDA, the code may be compiled with CUDA 3.2 or greater.The distributed binaries are linked against version 5.5 or 6.0.For OpenCL, version 1.0 or greater is required.