Release v1.3.2 #354
Merged
sunqm approved these changes on Mar 10, 2025
Walter-Feng added a commit to Walter-Feng/gpu4pyscf that referenced this pull request on Mar 11, 2025:
commit a8b5e41
Author: Qiming Sun
Date: Mon Mar 10 17:11:56 2025 -0700

    Multigrid GPU kernel for meta-GGA functionals (pyscf#353)

    * Add eval_tau and eval_mat_tau
    * bug fixes
    * MGGA for s-type GTO is likely correct
    * Bugfix for eval_tau
    * Fix eval_tau for l=8
    * using constexpr
    * lint
    * typo
    ---------
    Co-authored-by: Qiming Sun

commit 3b0c194
Author: Xiaojie Wu
Date: Mon Mar 10 15:03:33 2025 -0700

    V1.3.2 (update version) (pyscf#355)

    * release v1.3.2
    * Update CHANGELOG
    * Update version
    ---------
    Co-authored-by: Qiming Sun

commit fd4b818
Author: Xiaojie Wu
Date: Mon Mar 10 14:45:47 2025 -0700

    Release v1.3.2 (pyscf#354)

    * release v1.3.2
    * Update CHANGELOG
    ---------
    Co-authored-by: Qiming Sun

commit 3260bcd
Author: Qiming Sun
Date: Fri Mar 7 15:58:36 2025 -0800

    Multigrid integration for GGA functional (pyscf#342)

    * Optimize eval_rho
    * Optimize get_pp
    * Symmetry in density matrix and J matrix
    * Add eval_mat_gga
    * New eval_mat_gga
    * fill_dm_xyz function
    * Optimize basis filtering
    * use atomicAdd only for eval_rho
    * Lint
    * Bugfix for eval_mat_gga
    * Using constexpr
    * using constexpr
    ---------
    Co-authored-by: Qiming Sun

commit 3b16e37
Author: Qiming Sun
Date: Thu Mar 6 14:44:21 2025 -0800

    Ensure the dft driver work the same way as that on volcengine (pyscf#349)

    * Ensure the dft driver work the same way as that on volcengine
    * lint
    * bugfix
    ---------
    Co-authored-by: Qiming Sun

commit 6c8b354
Author: Xiaojie Wu
Date: Thu Mar 6 11:01:10 2025 -0800

    Remove the limit of nbas in DFT (pyscf#348)

    * eliminate the limit of nbas in DFT
    * avoid to reinstall cutensor

commit 81e858c
Author: Xiaojie Wu
Date: Wed Mar 5 19:15:55 2025 -0800

    Compile virtual CUDA architecture for libxc (pyscf#347)

    * 70-virtual + 70-real
    * v0.7
    * install cutensor

commit 37bdc01
Author: Qiming Sun
Date: Tue Mar 4 18:28:43 2025 -0800

    Reduce memory footprint in df-hessian for cupy.einsum (pyscf#345)

    * Reduce memory footprint in df-hessian for cupy.einsum
    * fix auxbasis_response
    * cleanup
    * Restore to the old blksize setting

commit 5960680
Author: Xiaojie Wu
Date: Tue Mar 4 09:29:42 2025 -0800

    Release gpu4pyscf-libxc-v0.6 (pyscf#339)

    * Enable building KXC
    * Bump version
    * disable unused build
    * add gint-rys back
    * build on self-host machine
    * make -j1

commit c9cf0f8
Author: Xiaojie Wu
Date: Tue Mar 4 09:17:33 2025 -0800

    Several maintenance issues (pyscf#344)

    * dump DFT info
    * remove SM60
    * consistent lindep
    * consistent lindep with pyscf
    * Update keyword argument lindep
    * kwarg passing error
    ---------
    Co-authored-by: Qiming Sun
    Co-authored-by: Qiming Sun

commit 634fa2a
Author: Qiming Sun
Date: Wed Feb 26 15:12:13 2025 -0800

    Multi-grid algorithm for Coulomb matrix (pyscf#333)

    * Add multigrid.cu
    * Add multigrid.py; debugging
    * Fix dimension and overflow bugs
    * Debugging multigrid.cu
    * Mostly correct. Errors larger than the required precision.
    * Debugging get_j_pass2
    * Some bug fixes
    * syncthreads before accessing shm in loop
    * Rewrite the MG task generation function
    * Improve multigrid accuracy. get_rho and get_j can reach the required precision
    * Add multigrid tests
    * Bugfix
    * Fix cart2xyz
    * Import error
    ---------
    Co-authored-by: Qiming Sun
    Co-authored-by: Xiaojie Wu

commit c82e049
Author: Xiaojie Wu
Date: Wed Feb 26 12:41:43 2025 -0800

    Support I orbital in DFT (pyscf#340)

    * support I orbital in DFT
    * add test up to lmax=8

commit 3f3fad7
Author: Qiming Sun
Date: Wed Feb 26 09:28:38 2025 -0800

    Block divergent optimization for the molecular int3c2e integral tensor (pyscf#337)

    * Block-divergent int3c2e
    * int3c2e correct
    * int3c2e block-divergent version correct
    * unroll int3c2e
    * add unrolled_int3c2e_bdiv
    * Add sort_orbitals and unsort_orbitals functions for int3c2e_bdiv
    * Compatibility between new int3c2e and existing implmentations
    * fixes
    * Fixes for int3c2e_bdiv version
    * Remove unused code
    * Removing debug code
    * Add missing file
    * Import circular dependency
    * Fix merging
    ---------
    Co-authored-by: Qiming Sun

commit 13f7083
Author: puzhichen
Date: Wed Feb 26 23:36:03 2025 +0800

    add tddft in nightly build (pyscf#338)

commit 65dcf10
Author: Xiaojie Wu
Date: Tue Feb 18 20:51:11 2025 -0800

    copy instead of casting

commit cbb33a4
Author: puzhichen
Date: Thu Feb 20 01:52:01 2025 +0800

    Feat/add tddft test (pyscf#315)

    * wait cpu-benchmark finished
    * add the small mol data
    * change some comments
    * finish the benchmark tests.
    * add the slow mark
    * add the dependence on lindep

commit 451d868
Author: puzhichen
Date: Tue Feb 18 03:12:06 2025 +0800

    the lindep parameter is used in the function real_eig (pyscf#332)

    * the lindep parameter is used in the function real_eig
    * add the lindep in unit tests

commit 3ec4ae6
Author: Qiming Sun
Date: Tue Feb 11 09:29:20 2025 -0800

    Warp divergent optimization (pyscf#324)

    * A different 4c2e algorithm
    * Modify rys_roots structure
    * New code generator works
    * Update rys_jk unrolling
    * Update rys_roots in various modules
    * Missing file
    * Update unrolled_rys
    * Bugfixes for unrolled_rys
    * Fix rys_contract_jk_ip1
    * Update unrolled_ejk_ip1
    * Fix bugs
    * Optimize unrolled_ejk_ip1
    * Optimize shm footprint in rys_contract_jk_ip1
    * Improve rys_contract_jk_ip1 and unrolled_ejk_ip1
    * Optimize rys_contract_jk_ip2
    * Fix unrolled_ejk_ip2
    * reduce memory footprint for ip2_type3
    * Update rys_contract_jk_ip1
    * Fixes
    * Update unrolled_rys function signature
    * Fix unrolled_rys_ip1
    * update unrolled_rys_ip1
    * Update create_tasks
    * Change rys_roots path
    * Update j engine
    * Update pbc/rys_roots_dat.cu
    * Fix overflow for 48KB shared memory
    * Fix bug in uhf gradients kernel
    * Fix unrolled ip1 and ip2 code
    * Apply rys_roots_rs function for rys_contract_j
    * Improve logger.init_timer
    * Remove unused files
    * Adjust DD_CACHE_MAX size dynamically
    ---------
    Co-authored-by: Qiming Sun

commit 45483ee
Author: henryw7
Date: Mon Feb 10 20:30:20 2025 -0800

    Record a trap in current dft 3c driver code (pyscf#331)

commit c156379
Author: Xiaojie Wu
Date: Thu Feb 6 06:59:34 2025 -0800

    sg1 for heavy atoms (pyscf#330)

commit ee1f2b7
Author: Xiaojie Wu
Date: Tue Feb 4 18:33:35 2025 -0800

    Update pypi_wheel.yml

commit b82e219
Author: Qiming Sun
Date: Tue Feb 4 18:30:06 2025 -0800

    Release v1.3.1 (pyscf#326)

    * Release v1.3.1

commit 57b51d9
Author: Qiming Sun
Date: Tue Feb 4 18:29:24 2025 -0800

    map and reduce functions for multi-gpu runtime (pyscf#307)

    * Dump num_devices info
    * Add multi_gpu module
    * Apply multi_gpu helper functions in scf.jk
    * Fix get_j for h-type functions
    ---------
    Co-authored-by: Qiming Sun

commit 1b96a24
Author: Xiaojie Wu
Date: Tue Feb 4 07:37:55 2025 -0800

    Add 3c driver (pyscf#318)

    * be compatible with pyscf 2.8
    * remove an example
    * check convergence
    * max_memory = 32000
    * more bound checks for grids
    * remove test_gil
    * remove log
    * unit test
    * add 3c driver
    * merge master and use jfit auxbasis
    * update 3c driver
    * bugfix in 3c driver
    * remove molecules in drivers
    * optimize 3c driver
    * added opt_3c_driver.py

commit f74cf73
Author: Qiming Sun
Date: Fri Jan 31 15:53:13 2025 -0800

    PBC gaussian density fitting GPU kernels (pyscf#297)

    * Add pbc int3c2e.cu
    * Update int3c2e.py
    * Update int3c2e
    * shared memory size issue and some bug fixes
    * Bugfixes for sr-int3c2e screening
    * Tune integral screening parameters
    * Missing imports
    * Fix k-point int3c2e
    * Add placeholder int3c2e_unrolled.cu
    * Missing file
    * fix
    * Add rsdf_builder
    * Some updates
    * Update rsdf, bugs in cderi_kk
    * updates
    * replace pbc.df cderi generator
    * Update rsdf_builder and df_jk; small precision errors persist
    * Fix a kmesh bug in rsdf
    * Fix lattice sum cutoff estimation
    * Fix PTR_BAS_COORD initialization bug
    * Fix various df accurancy issues
    * Bugfix in rsdf_builder and GDF functions
    * add examples for PBC GDF
    * Fix shared memory size estimation error in rys_ejk_ip1_kernel
    * Adjust BPC-GDF eigenvalue decomposition settings
    * Update rsdf_builder tests
    * Lint
    * pyscf-2.8 issue
    * small updates
    * Tune parameters in rsdf_builder
    * Debug nan in tests
    * Ensure data continuity in cp array
    * Some debugging msgs
    * More debugging msg
    * Bug in cutensor interface for complex tensor contractions
    * pbc-int3c2e error
    * Header and warnings
    ---------
    Co-authored-by: Qiming Sun

commit f421aec
Author: Xiaojie Wu
Date: Fri Jan 31 06:37:21 2025 -0800

    Bugfix: import _response_functions in polarizability module (pyscf#322)

    * import _response_function
    * error message

commit a519ef3
Author: henryw7
Date: Wed Jan 29 14:30:54 2025 -0800

    Save GPU memory for PCM analytical hessian (pyscf#320)

commit 62f16a6
Author: henryw7
Date: Mon Jan 27 14:35:08 2025 -0800

    Analytical PCM 2nd derivative (pyscf#302)

    * Analytical qv and nuc terms gives correct result
    * Second derivative of Fi and Ai working
    * PCM S diagonal second derivative working
    * PCM hessian d2D d2S working
    * PCM hessian for grad_solver done
    * PCM Hessian optimize electron hessian, remove nao*nao*ngrids memory requirement
    * Add tests for int1e 2nd derivative and pcm 2nd derivative
    * Fix tests
    * Forgot to change a debug setting off when generating finite difference result
    * Remove explicit inverse K
    * Improvement: remove abuse use of einsum, accelerate pcm and int1e 2nd derivative kernels

commit a1de844
Author: Xiaojie Wu
Date: Mon Jan 27 06:55:03 2025 -0800

    Bugfix: turn off nlc in df.hessian (pyscf#319)

    * use do_nlc()
    * Add error message

commit 1593f2a
Author: henryw7
Date: Thu Jan 23 23:48:21 2025 -0800

    Fix the bug that libcusolver.so.11 exist but libcusolver.so does not (pyscf#317)

commit fb5326d
Author: Xiaojie Wu
Date: Wed Jan 22 06:31:49 2025 -0800

    Bugfix: atomic number in esp.py (pyscf#314)

    * Bugfix: atomic number in esp.py
    * Another bug

commit 6ea5c6c
Author: puzhichen
Date: Mon Jan 20 23:13:36 2025 +0800

    Davidson iterations for tddft on GPU (pyscf#305)

    * some mmodifications
    * test
    * finish writting, start debugging
    * Finish debugging and unit tests.
    * remove some comments and unused codes
    * after review the codes
    * change the threshold in precond
    * add the import _response_functions
    * change codes according to review comments

commit 86ca248
Author: Xiaojie Wu
Date: Mon Jan 20 07:09:36 2025 -0800

    More bound checks in numint (pyscf#309)

    * be compatible with pyscf 2.8
    * remove an example
    * check convergence
    * max_memory = 32000
    * more bound checks for grids
    * remove test_gil
    * remove log
    * unit test

commit 0add455
Author: Qiming Sun
Date: Sun Jan 19 11:41:26 2025 -0800

    Bugfix: Hessian CPHF memory footprint for RSH functionals (pyscf#311)

commit 1233861
Author: Xiaojie Wu
Date: Thu Jan 16 15:35:23 2025 -0800

    Resolve compatibility issue with pyscf 2.8 (pyscf#306)

    * be compatible with pyscf 2.8
    * remove an example
    * check convergence
    * max_memory = 32000

commit 9b694fa
Author: henryw7
Date: Wed Jan 8 23:09:54 2025 -0800

    Analytical PCM righthand side of CPHF, analytic derivative of V_ia(q) (pyscf#298)

    * C-PCM righthand side of CPHF, analytic derivative of V_munu(q), trash implementation that works
    * dVia/dx works for IEFPCM
    * SSVPE is different from ICEPCM without contraction
    * Small mistake
    * Remove ngrids * nao * nao memory bottleneck of pcm V_munu matrix derivative
    * Fix linter error
    * Remove natm * 3 * nao * nao memory bottleneck, now the bottleneck is natm * 3 * nmo * nocc. Also bug fix here and there
    * Remove the numerical implementation of dVia/dx
    * Change finite difference implementation to test
    * Remove abuse use of cp.einsum and cp.zeros
    * Reorganize code for 2nd derivative term

commit 0427e6a
Author: Qiming Sun
Date: Tue Jan 7 16:48:49 2025 -0800

    Release v1.3.0 (pyscf#296)

    * Release v1.2.2
    * Update CHANGELOG
    * Update Changelog

commit 49f2f56
Author: Xiaojie Wu
Date: Tue Jan 7 11:49:30 2025 -0800

    Benchmark in nightly build (pyscf#295)

    * refactor hessian class
    * fixed bug in df.hessian.uhf
    * update license
    * format code
    * support h function in hessian.jk
    * unit test
    * optimize df hessian memory usage
    * more accurate memory estimate for hessian
    * _gen_jk -> _get_jk_ip
    * with_j and with_k for hessian
    * memory estimate
    * tested on 095 molecule
    * improve make_h1 in df.hessian
    * bugfix
    * use sorted_mol
    * update nightly build
    * assert hermi==1
    * typo in uhf.hessian
    * inject gen_response into soscf
    * update tests for nightly build
    * disable benchmark for ci
    * install pytest-benchmark
    * change the file names of benchmark tests
    * disable benchmark for ci
    * test dir
    * save changes
    * add copy_array
    * assert chunk_shape
    * improve hcore derivatives
    * cupy copy -> copy_array
    * optimize multi-GPU
    * bugfix for single gpu
    * update benchmark script
    * np.isclose
    * bugfix
    * auxbasis_response
    * add benchmark results
    * split nightly benchmark
    * optimize df.hessian memory
    * small fixes
    * bugfix in df.hessian
    * bugfix
    * add benchmark data
    * remove comments
    * resolve comments
    * group_size in hessian
    * resolve possible memory leak
    * bugfix
    * bugfix

commit e55a70e
Author: Xiaojie Wu
Date: Sun Jan 5 19:25:03 2025 -0800

    update examples

commit 177fb05
Author: Qiming Sun
Date: Fri Dec 27 22:05:17 2024 -0800

    Add pickle serialization (pyscf#294)

    * Add pickle serialization (fix pyscf#267)
    * syntax error
    * Fix DFHF serialization tests
    ---------
    Co-authored-by: Qiming Sun

commit ad52eba
Author: Xiaojie Wu
Date: Fri Dec 27 22:04:57 2024 -0800

    Refactor Hessian classes (pyscf#290)

    * refactor hessian class
    * fixed bug in df.hessian.uhf
    * update license
    * format code
    * support h function in hessian.jk
    * unit test
    * optimize df hessian memory usage
    * more accurate memory estimate for hessian
    * _gen_jk -> _get_jk_ip
    * with_j and with_k for hessian
    * memory estimate
    * tested on 095 molecule
    * improve make_h1 in df.hessian
    * bugfix
    * use sorted_mol
    * assert hermi==1
    * typo in uhf.hessian
    * inject gen_response into soscf
    * remove print

commit 9d28f26
Author: Qiming Sun
Date: Mon Dec 23 20:40:19 2024 -0800

    pbc.df.ft_ao on GPU (pyscf#291)

    * Add ft_ao cuda kernel
    * Add ft_ao.py
    * Add helper functions in gpu4pyscf.gto.mole
    * Update pbc.ft_ao
    * ft_ao runs, output incorrect
    * PBC ft_ao general kernel correct
    * ft_ao unrolled
    * Modified kpts_to_kmesh
    * Add tests
    * Handle non-symmetric case; add more tests.
    * Lint
    * Missing files
    * Apply the ft_ao GPU implementation in aft and aft_jk
    * Update VHFOpt in scf.jk module
    * Undefined variables
    * vhfopt.mol -> vhfopt.sorted_mol
    * Fix J-engine due to the change of _VHFOpt class
    * Remove print statements
    * Apache header
    * More Apache headers
    ---------
    Co-authored-by: Qiming Sun
No description provided.