Description of the miniproject¶
Introduction¶
The miniproject package is a tiny stupid package that demonstrates the build process for the mmgroup package. The mmgroup package is a python implementation of Conway’s construction [Con85] of the monster group \(\mathbb{M}\), which is the largest sporadic finite simple group.
That package contains C programs that have been generated automatically by a program code generator. Here automatically generated low-level functions written in C are used to compute rather large tables, which will be used by high-level C programs. This means that the build process has to take place in several stages. Also, there are C subroutines which are used by both low-level and high-level functions. So it makes sense to place these subroutines into a shared library.
When porting the mmgroup package to another operating system, several aspects specific to the operating system have to be considered. This miniproject package can be considered as a model for the mmgroup package focussing on the os-specific aspects.
Installation¶
The current version of the miniproject package is a source distribution
that has been tested on a 64-bit Windows platform only. It runs with
python 3.6 or higher. The distribution contains a number of extensions
written in C which have to be built before use. At present there is no
binary distribution for Windows.
Before you can use this package or build its extensions you should install the following python packages:
| Package | Purpose |
|---|---|
| | Development: integrating |
| | Recommendation: These packages are not really needed here, but they should be installed, since they are required for the |
| | Testing: basic package used for testing |
| | Development: basic package used for setup and building extensions |
| | Documentation: basic package used for documentation |
| | Documentation: bibliography in BibTeX style |
Packages used for the purpose of documentation are required only if you want to rebuild the documentation. If you want to rebuild the documentation you should also install the following programs:
| Program | Purpose | Location |
|---|---|---|
| miktex | Documentation | |
| Perl | Documentation | |
Building the extension¶
To build the required extension, go to the root directory of the distribution.
This is the directory containing the files setup.py and README.rst.
From there run the following two commands:
python setup.py build_ext
pytest src/miniproject/ -v
Installing a C compiler for cython in Windows¶
The bad news for Windows developers is that there is no pre-installed
C compiler on a standard Windows system. However, the Cython
package requires a C compiler. Here in principle, the user has the
choice between the following two compilers:
MSVC
MinGW-w64
The user has to install a C compiler so that it cooperates with
cython.
That installation process is out of the scope of this document.
For installing MinGW, one might start looking at
https://cython.readthedocs.io/en/latest/src/tutorial/appendix.html.
For installing MSVC, one might start looking at
https://wiki.python.org/moin/WindowsCompilers
The current setup.py supports MinGW and MSVC for 64-bit
Windows. According to the last URL the MinGW compiler works with
all Python versions up to 3.4.
The author has installed the MSVC compiler with the Microsoft
Build Tools for Visual Studio from:
https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=16 ,
following the instructions in
https://www.scivision.dev/python-windows-visual-c-14-required/ .
Before typing python setup.py bdist_wheel in a Windows command
line the author had to type:
"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvars64.bat"
Here the path may be different on the user’s Windows system.
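Whether a compiler is reachable from the current command line can be checked before starting a build. The following is a minimal sketch; it assumes that MSVC is invoked as cl and MinGW-w64 as gcc, and the helper name find_compilers is made up for this illustration.

```python
import shutil


def find_compilers(names=("cl", "gcc")):
    """Map each compiler executable name to its full path, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If cl is reported as not found, the vcvars64.bat script has probably not been run in the current command line window.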
Application interface¶
-
miniproject.wrapper.wrap_double(k)¶ Python wrapper of a C function that doubles an integer k.
-
miniproject.wrapper.wrap_triple(k)¶ Python wrapper of a C function that triples an integer k.
The function returns 3 * k.
For small values k the function reads the result from a table which has been generated by a python script. For large values k it uses a function from a shared library.
Both the table and the shared library have been created in a prior step of the build process.
The modified build process¶
The classes needed for modifying the build process given by the
setuptools/distutils package are coded in the build_ext_steps.py
script.
Module build_ext_steps provides a customized version of
the ‘build_ext’ command for setup.py.
It should be placed in the same directory as the module setup.py.
Distributing python packages¶
The standard toolkit for distributing python packages is the
setuptools package. Here the user types:
python setup.py build_ext
at the console for building the extensions to the python package, which
are typically written in a language like C or C++ for the sake of speed.
We may use e.g. the Cython package to write python wrappers for the
functions written in C or C++. The setuptools package supports the
integration of Cython extensions.
The setup.py script describes a list of extensions, where each
extension is an instance of class Extension which is provided
by setuptools. Then it calls the setup function which builds
all these extensions:
from setuptools import setup
from setuptools.extension import Extension

ext_modules = [
    Extension(
        ... # description of first extension
    ),
    Extension(
        ... # description of second extension
    ),
    ...
]

setup(
    ..., # Standard arguments for the setup function
    ext_modules = ext_modules, # List of extensions to be built
)
We assume that the reader is familiar with the standard python setup process. For background, we refer to
A new paradigm for building python packages¶
This build_ext_steps module supports a new paradigm for building a
python extension:
A python program make_stage1.py creates a C program stage1.c.
We create a python extension stage1.so (or stage1.pyd in Windows) that makes the functionality of stage1.c available in python.
A python program make_stage2.py creates a C program stage2.c. Here make_stage2.py may import stage1.so (or stage1.pyd).
We create a python extension that makes the functionality of stage2.c available in python.
etc.
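The first step of this paradigm, a python program that writes a C program, can be sketched as follows. The table contents (squares of small integers), the array name STAGE1_TABLE and the function names are assumptions made for illustration; the real make_stage1.py may generate something entirely different.

```python
import tempfile
from pathlib import Path


def stage1_source(n_entries=8):
    """Return the text of a C file holding a small precomputed table."""
    values = ", ".join(str(i * i) for i in range(n_entries))
    return (
        "/* Generated file, do not edit. */\n"
        f"const int STAGE1_TABLE[{n_entries}] = {{{values}}};\n"
    )


def make_stage1(directory):
    """Write stage1.c into the given directory and return its path."""
    path = Path(directory) / "stage1.c"
    path.write_text(stage1_source(), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(make_stage1(tmp).read_text(encoding="utf-8"))
```

The generated file would then be compiled into the extension of the same stage, and a later generator script could import that extension to compute its own tables.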
This paradigm is not supported by the setuptools package.
Using class BuildExtCmd for the new paradigm¶
For using the new building paradigm we have to replace the standard
class build_ext by the class build_ext_steps.BuildExtCmd.
from setuptools import setup
from build_ext_steps import Extension
from build_ext_steps import BuildExtCmd

ext_modules = [
    # description of extension as above
]

setup(
    ..., # Standard arguments for the setup function
    ext_modules = ext_modules, # List of extensions to be built
    cmdclass={
        'build_ext': BuildExtCmd, # replace class for build_ext
    },
)
This change has a few consequences:
It is guaranteed that the extensions are built in the given order.
Extensions are always built in place (option build_ext --inplace). (The current version does not support building the extension in a special build directory.)
The building of all extensions is now forced (option build_ext --f), regardless of any time stamps.
The keyword arguments extra_compile_args and extra_link_args for an instance of class Extension may each be a dictionary
‘compiler’ : <List of arguments>
instead of a list of arguments. Here ‘compiler’ is a string describing a compiler. For a list of compilers, run
python setup.py build_ext --help-compiler.
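How such a dictionary might be reduced to a plain argument list for the active compiler can be sketched as follows. The helper select_args, the example flags, and the choice to return an empty list for an unlisted compiler are assumptions for illustration and are not taken from build_ext_steps.py; the keys 'mingw32' and 'msvc' are compiler names as listed by --help-compiler.

```python
def select_args(args, compiler):
    """Return the argument list to use for the given compiler.

    `args` is either a plain list (used for every compiler) or a
    dictionary mapping compiler names to lists of arguments.
    """
    if isinstance(args, dict):
        # Assumption: a compiler without an entry gets no extra arguments.
        return list(args.get(compiler, []))
    return list(args or [])


extra_compile_args = {
    "mingw32": ["-O3", "-Wall"],  # GCC-style flags
    "msvc": ["/O2", "/W3"],       # MSVC-style flags
}

print(select_args(extra_compile_args, "msvc"))
print(select_args(["-O2"], "mingw32"))
```

Keeping the flags per compiler in one place avoids platform checks scattered over setup.py.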
Apart from these changes, an extension is created in the same way
as with setuptools.
For documentation of the Extension class in the setuptools
package, see
https://docs.python.org/3/distutils/apiref.html?highlight=extension#distutils.core.Extension
Inserting user-defined functions into the build process¶
Module build_ext_steps provides a class CustomBuildStep
for adding user-defined functions to the build process.
In the list ext_modules of extensions, instances of class
CustomBuildStep may be mixed with instances of class Extension.
Class CustomBuildStep models an arbitrary sequence of functions to
be executed.
The constructor for that class takes a string ‘name’ describing the action of these functions, followed by an arbitrary number of lists, where each list describes a function to be executed.
Here the first entry of each list is either a string or a callable python function. If the first entry is a string then a subprocess with that name is called. Otherwise the corresponding python function is executed. Subsequent entries in the list are arguments given to the subprocess or to the function.
Such a subprocess may be e.g. a step that generates C code to be used for building a subsequent python extension.
It is recommended to use the string sys.executable (provided
by the sys package) instead of the string 'python' for
starting a python subprocess.
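The dispatch rule described above (a string starts a subprocess, a callable is called directly) can be modelled in a few lines. This is only a sketch of the described behaviour, not the implementation of CustomBuildStep; the function run_build_step is hypothetical, and aborting on a failing subprocess is an assumption.

```python
import subprocess
import sys


def run_build_step(name, *calls):
    """Execute each call list in order.

    The first entry of a list is either a program name (run as a
    subprocess) or a Python callable; the remaining entries are its
    arguments.
    """
    print("Build step:", name)
    for first, *args in calls:
        if isinstance(first, str):
            # Assumption: a failing subprocess aborts the build step.
            subprocess.check_call([first, *args])
        else:
            first(*args)


generated = []

run_build_step(
    "Generate tables",
    [sys.executable, "-c", "print('subprocess ran')"],
    [generated.append, "triple_table.c"],
)
print(generated)
```

Using sys.executable here ensures that the subprocess runs under the same interpreter that executes setup.py, which matters when several python installations are present.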
Using an extension in a subsequent build step¶
Once a python extension has been built, it can also be used in a subsequent step of the build process, e.g. for calculating large arrays of constants for C programs.
This approach works well on a Windows system, but it might not work on other operating systems. Here it is a good idea to write a pure-python substitute for any C extension to be used in a subsequent build step. This may slow down the build process considerably. But it is better to have a slow build process than no build process at all.
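Such a substitute can be selected with an import fallback, as in the following sketch. It uses wrap_double from this package as the fast implementation; the local name double_value and the small table computed at the end are assumptions for illustration.

```python
try:
    # Fast path: the compiled extension, if it has already been built.
    from miniproject.wrapper import wrap_double as double_value
except ImportError:
    # Slow but portable substitute with the same behaviour.
    def double_value(k):
        return 2 * k


# A build step can now compute its constants with either implementation.
table = [double_value(k) for k in range(8)]
print(table)
```

The rest of the build step does not need to know which of the two implementations is in use.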
The build process for the miniproject package¶
The miniproject package
contains a function double_function written in C and stored in
a shared library. That function doubles an integer value. There is a
python wrapper (written in Cython) which makes that function available
in python. The pytest package is used to test that function.
The miniproject package also contains a function triple_function
written in C and stored in another shared library. Again, there is a
python wrapper (written in Cython) for that function so that it can also
be tested with pytest.
Function triple_function calls function double_function, which
is in a different shared library. For small values it uses a
precomputed table triple_table.c for tripling a number. That table
is precomputed by the python script codegen.py. Of course, the
C program triple_table.c must be integrated into the process that
builds the python wrapper for function triple_function.
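The interplay of the table and the two functions can be modelled in pure python as follows. The table size of 16, the range check and the formula used for large values are assumptions for illustration; the text only states that triple_function uses a precomputed table for small values and calls double_function.

```python
TABLE_SIZE = 16  # hypothetical; the real table length is not stated

# Stands in for the table that codegen.py writes to triple_table.c.
TRIPLE_TABLE = [3 * k for k in range(TABLE_SIZE)]


def double_function(k):
    """Model of the C function in the first shared library."""
    return 2 * k


def triple_function(k):
    """Model of the C function in the second shared library."""
    if 0 <= k < TABLE_SIZE:
        return TRIPLE_TABLE[k]      # small value: table lookup
    return double_function(k) + k   # otherwise: reuse double_function


print([triple_function(k) for k in (0, 5, 15, 16, 1000, -4)])
```

In the real package the table has to exist as a C file before the second shared library can be compiled, which is why the code generation must run as a separate earlier build step.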
So the functionality provided by this package is trivial, but the
building process for this package is extremely involved. That build
process is based on the standard setuptools/distutils package. As
usual, there is a standard script setup.py in the root directory of
the package which controls the build process.
In the current version the necessary modifications of the
setuptools/distutils package required for a multi-step build
process are contained in the build_ext_steps.py script in the
root directory.
Porting the miniproject and the mmgroup package¶
The current versions of the miniproject and mmgroup packages support the 64-bit Windows operating system with the mingw32 compiler only.
The author is aware of the fact that porting a package as complex as the mmgroup package to a different operating system, or even adjusting it to a different compiler, may be a highly frustrating job. Here the miniproject can be used as a model for the mmgroup package.
It is highly recommended to port the miniproject to any operating system before porting the mmgroup package.
After porting the miniproject the file build_ext_steps.py
(and all new files created for the porting process)
should be copied from the root directory of the miniproject
to the corresponding directory of the mmgroup package.
References
- Con85
J. H. Conway. A simple construction of the Fischer-Griess monster group. Inventiones Mathematicae, 1985.