Single-precision (fp32) build support #5033
Thanks for this.
What about single+complex? Isn't that a valid configuration?
This isn't necessary. That's all going to be taken care of when I release the next major version in the next 24 hours.
Can you get this fixed upstream? Clearly Claude already knows what to do.
bcce0fc to 682a980
…spatialindex float64 coercion
- mesh.py: clarify why PETSc.RealType is correct for plex_from_cell_list
- mg/kernels.py: fix to_reference_coords_kernel signature (PetscScalar *X -> %(RealType)s *X) and add RealType to the template substitution dict
- evaluate.h: comment explaining why int/double is intentional for evaluate()
- test_stokes_mini.py: replace fp32 solver workaround with @pytest.mark.skipsingle; fieldsplit path needs a dedicated fp32 test
- conftest.py: remove trailing blank line
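The mg/kernels.py item above concerns %-style template substitution for generated C code. A minimal sketch of that mechanism follows, assuming a toy one-line kernel and assumed C type spellings ("float"/"double"); `KERNEL_TEMPLATE` and `render_kernel` are hypothetical names, not taken from the PR.

```python
# Toy stand-in for a generated C kernel; the real template in mg/kernels.py is longer.
KERNEL_TEMPLATE = "void toy_kernel(%(RealType)s *X, %(IntType)s n);"


def render_kernel(template: str, single_mode: bool) -> str:
    """Fill the C type placeholders using Python %-style dict formatting."""
    substitutions = {
        # Assumed C spellings; the actual values used by Firedrake may differ.
        "RealType": "float" if single_mode else "double",
        "IntType": "int",
    }
    return template % substitutions


print(render_kernel(KERNEL_TEMPLATE, single_mode=True))
print(render_kernel(KERNEL_TEMPLATE, single_mode=False))

# A placeholder with no matching key fails loudly instead of emitting bad C.
try:
    KERNEL_TEMPLATE % {"IntType": "int"}
except KeyError as missing:
    print("missing substitution:", missing)
```

The `KeyError` case is presumably why the commit also adds RealType to the substitution dict: a placeholder in the template is only usable once the dict carries a matching key.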
…nd skip VertexOnlyMesh tests in fp32
0dc7472 to dd517a9
Added ScalarType.SINGLE_COMPLEX (--arch single-complex) to firedrake-configure for all four platform targets: single_mode (from PETSC_PRECISION) and complex_mode (from PETSC_SCALAR) are detected independently, and ScalarType already resolves to numpy.complex64 for a single+complex build.
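As a rough illustration of how two independently detected flags can select one of four scalar types, here is a minimal sketch. The helper name `scalar_dtype_name` and the string values "single"/"double" and "real"/"complex" are assumptions for illustration; the comment above only states that the flags come from PETSC_PRECISION and PETSC_SCALAR and that single+complex resolves to numpy.complex64.

```python
def scalar_dtype_name(precision: str, scalar: str) -> str:
    """Map the two PETSc build settings to a NumPy dtype name (hypothetical helper)."""
    single_mode = precision == "single"   # assumed value of PETSC_PRECISION
    complex_mode = scalar == "complex"    # assumed value of PETSC_SCALAR
    if complex_mode:
        return "complex64" if single_mode else "complex128"
    return "float32" if single_mode else "float64"


# Enumerate all four build configurations.
for precision in ("double", "single"):
    for scalar in ("real", "complex"):
        print(f"{precision:6s} {scalar:7s} -> {scalar_dtype_name(precision, scalar)}")
```

The returned names are valid arguments to `numpy.dtype`, so the sketch stays free of third-party imports while still matching the dtypes mentioned in the thread.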
Removed from the PR description — thanks.
Opened a PR here: https://gitlab.com/petsc/petsc/-/merge_requests/9272
connorjward left a comment
I've just looked at the CI/install related side of this (at a glance everything else seems pretty good).
We just have to be careful about adding new test builds to Firedrake. I will try and figure out a solution soon.
fail-fast: false
matrix:
- arch: [default, complex]
+ arch: [default, complex, single]
This will add an extra 50% to the load on our CI runners. I will try and redesign our CI system so that we can reasonably support extra configurations like single precision.
],
),
(OS.UBUNTU_2404_AARCH64, ScalarType.SINGLE, GPUPlatform.NO_GPU): Arch(
Because we don't test aarch on CI apart from the basic real+complex, we shouldn't be adding them here. They can be added as 'Community Archs' in the dict below.
The same applies to macOS.
This might be easier once #5085 is merged as that adds our first 'community' options.
fixes #3040
Description
Adds single-precision (fp32) build support. Firedrake can now run on a PETSc installation compiled with --with-precision=single. The approach mirrors complex mode: precision is detected at import time from PETSc's build variables and flows through from there.
Prerequisite
Requires https://gitlab.com/petsc/petsc/-/merge_requests/9272 (petsc4py: handle PETSC_DOUBLE in DMSwarm.getField). Without it, DMSwarm.getField() raises AssertionError on fp32 builds because PETSC_DOUBLE is not mapped to a numpy dtype in the single-precision case, where PETSC_REAL != PETSC_DOUBLE.
Changes
scripts/firedrake-configure: --arch single / ScalarType.SINGLE; passes --with-precision=single to PETSc configure; excludes fftw and suitesparse (no fp32 support in those libraries)
tsfc/parameters.py, tsfc/loopy.py, tsfc/kernel_interface/common.py, tsfc/ufl_utils.py: scalar_type / scalar_type_c derived from PETSc precision at import time; constant initializers cast to the kernel scalar dtype
Core (evaluate.h, locate.c, pointquery_utils.py, pointeval_utils.py, mg/kernels.py): replaces double/int with PetscReal/PetscInt in generated C code; 1e-6 in fp32 mode vs 1e-12 in fp64, to stay within single-precision range
firedrake/mesh.py, firedrake/utility_meshes.py: PETSc.RealType; physical coordinate arrays for rtree and DMSwarmPIC_coor remain float64 (required by the rtree C API and PETSc swarm internals)
firedrake/function.py: float64 regardless of ScalarType, for geometric robustness in cell location
firedrake/assemble.py, firedrake/functionspaceimpl.py, pyop2/codegen/builder.py: replaces dtype=int with dtype=IntType in numpy.prod and array allocation calls
firedrake/utils.py: single_mode boolean flag (mirrors complex_mode)
Tests
@pytest.mark.skipsingle marker for tests incompatible with fp32
tests/firedrake/conftest.py: registers the marker and wires it up
test_locate_cell, test_interpolate_cross_mesh[extrudedcube], test_parallel_high_order_location
.github/workflows/core.yml: adds single to the CI matrix alongside default and complex
Known limitations
test_parallel_high_order_location is skipped in fp32: high-order cell location in a warped mesh requires double-precision accuracy that fp32 cannot provide at tolerance=0.0001.
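For readers unfamiliar with how a marker such as skipsingle gets registered and wired up, the following is a minimal conftest.py sketch using standard pytest hooks. It is not the PR's actual code: `SINGLE_MODE` is a stand-in for the single_mode flag in firedrake/utils.py, and `needs_skip` and the skip reason text are hypothetical.

```python
# conftest.py sketch (hypothetical; not copied from the PR)

# Stand-in for the flag the PR describes as single_mode in firedrake/utils.py.
SINGLE_MODE = False


def needs_skip(item, single_mode):
    """Return True if `item` carries the skipsingle marker and the build is fp32."""
    return single_mode and item.get_closest_marker("skipsingle") is not None


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "skipsingle: skip this test on single-precision builds"
    )


def pytest_collection_modifyitems(config, items):
    # Imported lazily so the helper above can be exercised without pytest installed.
    import pytest

    for item in items:
        if needs_skip(item, SINGLE_MODE):
            item.add_marker(pytest.mark.skip(reason="not supported in fp32"))
```

A test such as test_parallel_high_order_location would then only need the @pytest.mark.skipsingle decorator; on a double-precision build the marker is inert and the test runs as usual.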