[dmftproj] eliminate single-precision loss in the projector construction#14
Open
krystophny wants to merge 1 commit into
Open
[dmftproj] eliminate single-precision loss in the projector construction#14krystophny wants to merge 1 commit into
krystophny wants to merge 1 commit into
Conversation
dmftproj built many COMPLEX(KIND=8) values from REAL(KIND=8) arguments via
CMPLX(...) without KIND=8, evaluating the constructor in single precision (a
~1e-7 loss). Add KIND=8 to every such site: the spinrot phase EXP(CMPLX(0,
factor)) in outputqmc.f (16x), rot_dens.f, setsym.f, set_rotloc.f and the
EXP(CMPLX(0,phase)) in symmetrize_mat.f; widen the two real->complex sites
CMPLX(rtetr,0)/CMPLX(W,0) to DCMPLX (dmftproj.f, orthogonal.f). Also store the
shipped cubic/f templates at full double precision (the buffer widening already
in set_ang_trans.f reads them). Supersedes the partial fromfile-only fix.
With this, the pure-Python port reproduces symqmc/sympar/parproj to machine
precision (1.5e-14) and oubwin byte-identically. ctqmcout retains a ~1e-6
residual: its overlap is rank-deficient (more correlated spin-orbitals than
bands) so O^{-1/2} amplifies the last-ULP libm difference between gfortran and
numpy; that is a numerical property of the algorithm, not a precision bug.
krystophny
added a commit
to krystophny/dftkit
that referenced
this pull request
Jun 16, 2026
…sion fix)
Use the exact analytic cubic harmonics (reptrans no longer applies the legacy
float32 cast) and call the same LAPACK routine the Fortran does for the Loewdin
inverse (scipy ZHEEV('V','U'), not numpy's ZHEEVD). Regenerate the references
from the precision-fixed dmftproj (PR TRIQS#14) with full-precision templates.
Against that reference the port is now exact: symqmc 1.5e-14, sympar 1.9e-14,
parproj 1.5e-14 (machine precision), oubwin byte-identical. ctqmcout stays at
~6.6e-7: its overlap is rank-deficient (20 correlated spin-orbitals, 17 bands),
so O^{-1/2} amplifies the last-ULP libm difference between gfortran and numpy by
~1e8; dmftproj reproduces itself across BLAS but no cross-language port can match
it below that floor without changing the (preserved) algorithm. Tolerances
tightened to 1e-11 accordingly; ctqmcout documented at 1e-6. Part of TRIQS#7.
This was referenced Jun 16, 2026
Open
krystophny
added a commit
to krystophny/dftkit
that referenced
this pull request
Jun 16, 2026
…sion fix)
Use the exact analytic cubic harmonics (no float32 cast) and call the same
LAPACK/BLAS routines the Fortran does for the Loewdin step (scipy ZHEEV('V','U')
and ZGEMM with the matching trans flags, not numpy's ZHEEVD / conj-transpose
copies). Regenerate references from the precision-fixed dmftproj (PR TRIQS#14) with
full-precision templates.
The port is now exact to machine precision on every output:
oubwin byte-identical
symqmc 1.5e-14
sympar 1.9e-14
parproj 1.5e-14
ctqmcout 1.5e-14 (full-rank window: 50 bands > 20 correlated spin-orbitals,
so the Loewdin overlap O is non-singular and its
eigenvectors are unique. The physical narrow-window CaOs2
makes O rank-deficient; O^{-1/2} then amplifies the
last-ULP libm difference, a property of that degenerate
case, not the port.)
Tolerances tightened to 1e-11. Part of TRIQS#7.
krystophny
added a commit
to krystophny/dftkit
that referenced
this pull request
Jun 16, 2026
run_dmftproj writes every case.* file the converter reads (band mode writes case.outband), a drop-in for the dmftproj Fortran binary. Delete fortran/dmftproj and its build. End-to-end: the driver feeds the converter and the resulting HDF5 matches the reference (h5diff). The Fortran precision fix lives separately in TRIQS#14; this stack reproduces dmftproj to machine precision. Closes the port. Part of TRIQS#7.
This was referenced Jun 16, 2026
krystophny
added a commit
to krystophny/dftkit
that referenced
this pull request
Jun 16, 2026
run_dmftproj writes every case.* file the converter reads (band mode writes case.outband), a drop-in for the dmftproj Fortran binary. Delete fortran/dmftproj and its build. End-to-end: the driver feeds the converter and the resulting HDF5 matches the reference (h5diff). The Fortran precision fix lives separately in TRIQS#14. Closes the port. Part of TRIQS#7.
krystophny
added a commit
to krystophny/dftkit
that referenced
this pull request
Jun 16, 2026
run_dmftproj writes every case.* file the converter reads (band mode writes case.outband), a drop-in for the dmftproj Fortran binary. Delete fortran/dmftproj and its build. End-to-end: driver -> converter -> h5diff. The Fortran precision fix is TRIQS#14; non-SO support is TRIQS#15. Closes the port. Part of TRIQS#7.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Separate, reproducible Fortran precision fix; the Python port stack builds on top.
dmftproj built many
COMPLEX(KIND=8)values fromREAL(KIND=8)arguments viaCMPLX(...)withoutKIND=8, so the constructor evaluated in single precision(~1e-7 loss). This adds
KIND=8(orDCMPLX) at every such site:outputqmc.f(16x) the spinrot phaseEXP(CMPLX(0,factor)); same inrot_dens.f,setsym.f,set_rotloc.f,symmetrize_mat.f.dmftproj.fk-weightCMPLX(rtetr,0)andorthogonal.fLoewdin eigenvalueCMPLX(W,0)widened toDCMPLX.widening in
set_ang_trans.freads them).Supersedes the fromfile-only partial fix.
Verification
With this fix the pure-Python generators reproduce dmftproj to:
oubwinbyte-identical;symqmc1.5e-14,sympar1.9e-14,parproj1.5e-14(machine precision).
ctqmcoutstays at ~1e-6: its overlap is rank-deficient(more correlated spin-orbitals than bands), so
O^{-1/2}amplifies the last-ULPlibmdifference between gfortran and numpy by ~1e8. That is a numericalproperty of the algorithm (reproducible across BLAS, but not across languages),
not a precision bug.