Release v1.3.2 #354

Merged
merged 2 commits into from
Mar 10, 2025

@wxj6000 commented Mar 10, 2025

No description provided.

@wxj6000 marked this pull request as ready for review March 10, 2025 08:07
@wxj6000 merged commit fd4b818 into master Mar 10, 2025
6 checks passed
Walter-Feng added a commit to Walter-Feng/gpu4pyscf that referenced this pull request Mar 11, 2025
commit a8b5e41
Author: Qiming Sun <[email protected]>
Date:   Mon Mar 10 17:11:56 2025 -0700

    Multigrid GPU kernel for meta-GGA functionals (pyscf#353)

    * Add eval_tau and eval_mat_tau

    * bug fixes

    * MGGA for s-type GTO is likely correct

    * Bugfix for eval_tau

    * Fix eval_tau for l=8

    * using constexpr

    * lint

    * typo

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit 3b0c194
Author: Xiaojie Wu <[email protected]>
Date:   Mon Mar 10 15:03:33 2025 -0700

    V1.3.2 (update version) (pyscf#355)

    * release v1.3.2

    * Update CHANGELOG

    * Update version

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit fd4b818
Author: Xiaojie Wu <[email protected]>
Date:   Mon Mar 10 14:45:47 2025 -0700

    Release v1.3.2 (pyscf#354)

    * release v1.3.2

    * Update CHANGELOG

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit 3260bcd
Author: Qiming Sun <[email protected]>
Date:   Fri Mar 7 15:58:36 2025 -0800

    Multigrid integration for GGA functional (pyscf#342)

    * Optimize eval_rho

    * Optimize get_pp

    * Symmetry in density matrix and J matrix

    * Add eval_mat_gga

    * New eval_mat_gga

    * fill_dm_xyz function

    * Optimize basis filtering

    * use atomicAdd only for eval_rho

    * Lint

    * Bugfix for eval_mat_gga

    * Using constexpr

    * using constexpr

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit 3b16e37
Author: Qiming Sun <[email protected]>
Date:   Thu Mar 6 14:44:21 2025 -0800

    Ensure the dft driver works the same way as that on volcengine (pyscf#349)

    * Ensure the dft driver works the same way as that on volcengine

    * lint

    * bugfix

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit 6c8b354
Author: Xiaojie Wu <[email protected]>
Date:   Thu Mar 6 11:01:10 2025 -0800

    Remove the limit of nbas in DFT (pyscf#348)

    * eliminate the limit of nbas in DFT

    * avoid reinstalling cutensor

commit 81e858c
Author: Xiaojie Wu <[email protected]>
Date:   Wed Mar 5 19:15:55 2025 -0800

    Compile virtual CUDA architecture for libxc (pyscf#347)

    * 70-virtual + 70-real

    * v0.7

    * install cutensor

commit 37bdc01
Author: Qiming Sun <[email protected]>
Date:   Tue Mar 4 18:28:43 2025 -0800

    Reduce memory footprint in df-hessian for cupy.einsum (pyscf#345)

    * Reduce memory footprint in df-hessian for cupy.einsum

    * fix auxbasis_response

    * cleanup

    * Restore to the old blksize setting
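
A minimal sketch of the general idea behind a block-size setting for a contraction, assuming the large operands can be sliced along a shared leading index. This is not the df-hessian code from the commit: `blocked_contract`, the index pattern, and the default `blksize` are hypothetical, and `numpy` stands in for `cupy` so the example runs without a GPU.

```python
import numpy as np


def blocked_contract(a, b, blksize=64):
    """Compute out[i, k] = sum_{p, j} a[p, i, j] * b[p, j, k] in slices of p.

    Hypothetical helper: only `blksize` slices of `a` and `b` are passed to
    einsum at a time, which bounds the size of any temporaries it allocates.
    """
    naux = a.shape[0]
    out = np.zeros((a.shape[1], b.shape[2]), dtype=np.result_type(a, b))
    for p0 in range(0, naux, blksize):
        p1 = min(p0 + blksize, naux)
        # Accumulate the contribution of one block of the leading index.
        out += np.einsum('pij,pjk->ik', a[p0:p1], b[p0:p1])
    return out


rng = np.random.default_rng(0)
a = rng.standard_normal((10, 4, 5))
b = rng.standard_normal((10, 5, 3))
result = blocked_contract(a, b, blksize=3)
```

A smaller `blksize` lowers peak memory at the cost of more kernel launches, so the value is a tuning parameter rather than a fixed constant.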

commit 5960680
Author: Xiaojie Wu <[email protected]>
Date:   Tue Mar 4 09:29:42 2025 -0800

    Release gpu4pyscf-libxc-v0.6 (pyscf#339)

    * Enable building KXC

    * Bump version

    * disable unused build

    * add gint-rys back

    * build on self-host machine

    * make -j1

commit c9cf0f8
Author: Xiaojie Wu <[email protected]>
Date:   Tue Mar 4 09:17:33 2025 -0800

    Several maintenance issues (pyscf#344)

    * dump DFT info

    * remove SM60

    * consistent lindep

    * consistent lindep with pyscf

    * Update keyword argument lindep

    * kwarg passing error

    ---------

    Co-authored-by: Qiming Sun <[email protected]>
    Co-authored-by: Qiming Sun <[email protected]>

commit 634fa2a
Author: Qiming Sun <[email protected]>
Date:   Wed Feb 26 15:12:13 2025 -0800

    Multi-grid algorithm for Coulomb matrix (pyscf#333)

    * Add multigrid.cu

    * Add multigrid.py; debugging

    * Fix dimension and overflow bugs

    * Debugging multigrid.cu

    * Mostly correct. Errors larger than the required precision.

    * Debugging get_j_pass2

    * Some bug fixes

    * syncthreads before accessing shm in loop

    * Rewrite the MG task generation function

    * Improve multigrid accuracy. get_rho and get_j can reach the required precision

    * Add multigrid tests

    * Bugfix

    * Fix cart2xyz

    * Import error

    ---------

    Co-authored-by: Qiming Sun <[email protected]>
    Co-authored-by: Xiaojie Wu <[email protected]>

commit c82e049
Author: Xiaojie Wu <[email protected]>
Date:   Wed Feb 26 12:41:43 2025 -0800

    Support I orbital in DFT (pyscf#340)

    * support I orbital in DFT

    * add test up to lmax=8

commit 3f3fad7
Author: Qiming Sun <[email protected]>
Date:   Wed Feb 26 09:28:38 2025 -0800

    Block divergent optimization for the molecular int3c2e integral tensor (pyscf#337)

    * Block-divergent int3c2e

    * int3c2e correct

    * int3c2e block-divergent version correct

    * unroll int3c2e

    * add unrolled_int3c2e_bdiv

    * Add sort_orbitals and unsort_orbitals functions for int3c2e_bdiv

    * Compatibility between new int3c2e and existing implementations

    * fixes

    * Fixes for int3c2e_bdiv version

    * Remove unused code

    * Removing debug code

    * Add missing file

    * Import circular dependency

    * Fix merging

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit 13f7083
Author: puzhichen <[email protected]>
Date:   Wed Feb 26 23:36:03 2025 +0800

    add tddft in nightly build (pyscf#338)

    :

commit 65dcf10
Author: Xiaojie Wu <[email protected]>
Date:   Tue Feb 18 20:51:11 2025 -0800

    copy instead of casting

commit cbb33a4
Author: puzhichen <[email protected]>
Date:   Thu Feb 20 01:52:01 2025 +0800

    Feat/add tddft test (pyscf#315)

    * wait cpu-benchmark finished

    * add the small mol data

    * change some comments

    * finish the benchmark tests.

    * add the slow mark

    * add the dependence on lindep

commit 451d868
Author: puzhichen <[email protected]>
Date:   Tue Feb 18 03:12:06 2025 +0800

    the lindep parameter is used in the function real_eig (pyscf#332)

    * the lindep parameter is used in the function real_eig

    * add the lindep in unit tests
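
For context, a linear-dependency threshold is commonly applied by canonical orthogonalization: eigenvectors of the overlap matrix with eigenvalues below the threshold are discarded before the eigenproblem is solved. The sketch below illustrates that standard technique under the assumption of real symmetric inputs. It is not the `real_eig` function from the commit, and `eig_drop_lindep` and its signature are hypothetical.

```python
import numpy as np


def eig_drop_lindep(h, s, lindep=1e-9):
    """Solve h c = s c e after removing near-linearly-dependent directions of s.

    Hypothetical helper: h and s are real symmetric, s is positive semidefinite.
    """
    # Diagonalize the overlap matrix.
    w, v = np.linalg.eigh(s)
    # Keep only directions whose overlap eigenvalue exceeds the threshold.
    keep = w > lindep
    # Canonical orthogonalization: x.T @ s @ x is the identity.
    x = v[:, keep] / np.sqrt(w[keep])
    # Solve an ordinary symmetric eigenproblem in the reduced basis.
    e, c = np.linalg.eigh(x.T @ h @ x)
    return e, x @ c


# Two nearly identical basis functions: one overlap eigenvalue is about 1e-12.
s = np.array([[1.0, 1.0 - 1e-12], [1.0 - 1e-12, 1.0]])
h = np.array([[-1.0, -0.5], [-0.5, -1.0]])
e, c = eig_drop_lindep(h, s, lindep=1e-9)
```

With the threshold at `1e-9`, the near-singular direction is dropped and a single eigenpair is returned, so the number of eigenvalues can be smaller than the basis size.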

commit 3ec4ae6
Author: Qiming Sun <[email protected]>
Date:   Tue Feb 11 09:29:20 2025 -0800

    Warp divergent optimization (pyscf#324)

    * A different 4c2e algorithm

    * Modify rys_roots structure

    * New code generator works

    * Update rys_jk unrolling

    * Update rys_roots in various modules

    * Missing file

    * Update unrolled_rys

    * Bugfixes for unrolled_rys

    * Fix rys_contract_jk_ip1

    * Update unrolled_ejk_ip1

    * Fix bugs

    * Optimize unrolled_ejk_ip1

    * Optimize shm footprint in rys_contract_jk_ip1

    * Improve rys_contract_jk_ip1 and unrolled_ejk_ip1

    * Optimize rys_contract_jk_ip2

    * Fix unrolled_ejk_ip2

    * reduce memory footprint for ip2_type3

    * Update rys_contract_jk_ip1

    * Fixes

    * Update unrolled_rys function signature

    * Fix unrolled_rys_ip1

    * update unrolled_rys_ip1

    * Update create_tasks

    * Change rys_roots path

    * Update j engine

    * Update pbc/rys_roots_dat.cu

    * Fix overflow for 48KB shared memory

    * Fix bug in uhf gradients kernel

    * Fix unrolled ip1 and ip2 code

    * Apply rys_roots_rs function for rys_contract_j

    * Improve logger.init_timer

    * Remove unused files

    * Adjust DD_CACHE_MAX size dynamically

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit 45483ee
Author: henryw7 <[email protected]>
Date:   Mon Feb 10 20:30:20 2025 -0800

    Record a trap in current dft 3c driver code (pyscf#331)

commit c156379
Author: Xiaojie Wu <[email protected]>
Date:   Thu Feb 6 06:59:34 2025 -0800

    sg1 for heavy atoms (pyscf#330)

commit ee1f2b7
Author: Xiaojie Wu <[email protected]>
Date:   Tue Feb 4 18:33:35 2025 -0800

    Update pypi_wheel.yml

commit b82e219
Author: Qiming Sun <[email protected]>
Date:   Tue Feb 4 18:30:06 2025 -0800

    Release v1.3.1 (pyscf#326)

    * Release v1.3.1

commit 57b51d9
Author: Qiming Sun <[email protected]>
Date:   Tue Feb 4 18:29:24 2025 -0800

    map and reduce functions for multi-gpu runtime (pyscf#307)

    * Dump num_devices info

    * Add multi_gpu module

    * Apply multi_gpu helper functions in scf.jk

    * Fix get_j for h-type functions

    ---------

    Co-authored-by: Qiming Sun <[email protected]>
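
A rough sketch of a map-and-reduce pattern over several devices, assuming one worker thread per device and a round-robin split of tasks. It does not reproduce the `multi_gpu` module added in the commit: `map_over_devices` and `reduce_results` are hypothetical names, and the worker runs on the CPU so the example needs no GPU.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_over_devices(fn, tasks, num_devices):
    """Run fn(device_id, chunk) once per device, each in its own thread."""
    # Round-robin split so every device receives a similar number of tasks.
    chunks = [tasks[i::num_devices] for i in range(num_devices)]

    def worker(device_id):
        # Assumption: real GPU code would select the device here, for
        # example inside a `with cupy.cuda.Device(device_id):` block.
        return fn(device_id, chunks[device_id])

    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        return list(pool.map(worker, range(num_devices)))


def reduce_results(partials):
    """Combine per-device partial results by summation."""
    return reduce(operator.add, partials)


def sum_of_squares(device_id, chunk):
    # Stand-in for a per-device kernel launch.
    return sum(x * x for x in chunk)


partials = map_over_devices(sum_of_squares, list(range(10)), num_devices=3)
total = reduce_results(partials)
```

On real hardware the reduction step would also need to bring the partial arrays onto a single device before summing them.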

commit 1b96a24
Author: Xiaojie Wu <[email protected]>
Date:   Tue Feb 4 07:37:55 2025 -0800

    Add 3c driver (pyscf#318)

    * be compatible with pyscf 2.8

    * remove an example

    * check convergence

    * max_memory = 32000

    * more bound checks for grids

    * remove test_gil

    * remove log

    * unit test

    * add 3c driver

    * merge master and use jfit auxbasis

    * update 3c driver

    * bugfix in 3c driver

    * remove molecules in drivers

    * optimize 3c driver

    * added opt_3c_driver.py

commit f74cf73
Author: Qiming Sun <[email protected]>
Date:   Fri Jan 31 15:53:13 2025 -0800

    PBC gaussian density fitting GPU kernels (pyscf#297)

    * Add pbc int3c2e.cu

    * Update int3c2e.py

    * Update int3c2e

    * shared memory size issue and some bug fixes

    * Bugfixes for sr-int3c2e screening

    * Tune integral screening parameters

    * Missing imports

    * Fix k-point int3c2e

    * Add placeholder int3c2e_unrolled.cu

    * Missing file

    * fix

    * Add rsdf_builder

    * Some updates

    * Update rsdf, bugs in cderi_kk

    * updates

    * replace pbc.df cderi generator

    * Update rsdf_builder and df_jk; small precision errors persist

    * Fix a kmesh bug in rsdf

    * Fix lattice sum cutoff estimation

    * Fix PTR_BAS_COORD initialization bug

    * Fix various df accuracy issues

    * Bugfix in rsdf_builder and GDF functions

    * add examples for PBC GDF

    * Fix shared memory size estimation error in rys_ejk_ip1_kernel

    * Adjust PBC-GDF eigenvalue decomposition settings

    * Update rsdf_builder tests

    * Lint

    * pyscf-2.8 issue

    * small updates

    * Tune parameters in rsdf_builder

    * Debug nan in tests

    * Ensure data continuity in cp array

    * Some debugging msgs

    * More debugging msg

    * Bug in cutensor interface for complex tensor contractions

    * pbc-int3c2e error

    * Header and warnings

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit f421aec
Author: Xiaojie Wu <[email protected]>
Date:   Fri Jan 31 06:37:21 2025 -0800

    Bugfix: import _response_functions in polarizability module (pyscf#322)

    * import _response_function

    * error message

commit a519ef3
Author: henryw7 <[email protected]>
Date:   Wed Jan 29 14:30:54 2025 -0800

    Save GPU memory for PCM analytical hessian (pyscf#320)

commit 62f16a6
Author: henryw7 <[email protected]>
Date:   Mon Jan 27 14:35:08 2025 -0800

    Analytical PCM 2nd derivative (pyscf#302)

    * Analytical qv and nuc terms give correct result

    * Second derivative of Fi and Ai working

    * PCM S diagonal second derivative working

    * PCM hessian d2D d2S working

    * PCM hessian for grad_solver done

    * PCM Hessian optimize electron hessian, remove nao*nao*ngrids memory requirement

    * Add tests for int1e 2nd derivative and pcm 2nd derivative

    * Fix tests

    * Forgot to turn off a debug setting when generating finite difference result

    * Remove explicit inverse K

    * Improvement: remove overuse of einsum, accelerate pcm and int1e 2nd derivative kernels

commit a1de844
Author: Xiaojie Wu <[email protected]>
Date:   Mon Jan 27 06:55:03 2025 -0800

    Bugfix: turn off nlc in df.hessian (pyscf#319)

    * use do_nlc()

    * Add error message

commit 1593f2a
Author: henryw7 <[email protected]>
Date:   Thu Jan 23 23:48:21 2025 -0800

    Fix the bug that libcusolver.so.11 exists but libcusolver.so does not (pyscf#317)

commit fb5326d
Author: Xiaojie Wu <[email protected]>
Date:   Wed Jan 22 06:31:49 2025 -0800

    Bugfix: atomic number in esp.py (pyscf#314)

    * Bugfix: atomic number in esp.py

    * Another bug

commit 6ea5c6c
Author: puzhichen <[email protected]>
Date:   Mon Jan 20 23:13:36 2025 +0800

    Davidson iterations for tddft on GPU (pyscf#305)

    * some modifications

    * test

    * finish writing, start debugging

    * Finish debugging and unit tests.

    * remove some comments and unused codes

    * after reviewing the code

    * change the threshold in precond

    * add the import _response_functions

    * change codes according to review comments

commit 86ca248
Author: Xiaojie Wu <[email protected]>
Date:   Mon Jan 20 07:09:36 2025 -0800

    More bound checks in numint (pyscf#309)

    * be compatible with pyscf 2.8

    * remove an example

    * check convergence

    * max_memory = 32000

    * more bound checks for grids

    * remove test_gil

    * remove log

    * unit test

commit 0add455
Author: Qiming Sun <[email protected]>
Date:   Sun Jan 19 11:41:26 2025 -0800

    Bugfix: Hessian CPHF memory footprint for RSH functionals (pyscf#311)

commit 1233861
Author: Xiaojie Wu <[email protected]>
Date:   Thu Jan 16 15:35:23 2025 -0800

    Resolve compatibility issue with pyscf 2.8 (pyscf#306)

    * be compatible with pyscf 2.8

    * remove an example

    * check convergence

    * max_memory = 32000

commit 9b694fa
Author: henryw7 <[email protected]>
Date:   Wed Jan 8 23:09:54 2025 -0800

    Analytical PCM right-hand side of CPHF, analytic derivative of V_ia(q) (pyscf#298)

    * C-PCM right-hand side of CPHF, analytic derivative of V_munu(q), trash implementation that works

    * dVia/dx works for IEFPCM

    * SSVPE is different from IEFPCM without contraction

    * Small mistake

    * Remove ngrids * nao * nao memory bottleneck of pcm V_munu matrix derivative

    * Fix linter error

    * Remove natm * 3 * nao * nao memory bottleneck, now the bottleneck is natm * 3 * nmo * nocc. Also bug fix here and there

    * Remove the numerical implementation of dVia/dx

    * Change finite difference implementation to test

    * Remove overuse of cp.einsum and cp.zeros

    * Reorganize code for 2nd derivative term

commit 0427e6a
Author: Qiming Sun <[email protected]>
Date:   Tue Jan 7 16:48:49 2025 -0800

    Release v1.3.0 (pyscf#296)

    * Release v1.2.2

    * Update CHANGELOG

    * Update Changelog

commit 49f2f56
Author: Xiaojie Wu <[email protected]>
Date:   Tue Jan 7 11:49:30 2025 -0800

    Benchmark in nightly build (pyscf#295)

    * refactor hessian class

    * fixed bug in df.hessian.uhf

    * update license

    * format code

    * support h function in hessian.jk

    * unit test

    * optimize df hessian memory usage

    * more accurate memory estimate for hessian

    * _gen_jk -> _get_jk_ip

    * with_j and with_k for hessian

    * memory estimate

    * tested on 095 molecule

    * improve make_h1 in df.hessian

    * bugfix

    * use sorted_mol

    * update nightly build

    * assert hermi==1

    * typo in uhf.hessian

    * inject gen_response into soscf

    * update tests for nightly build

    * disable benchmark for ci

    * install pytest-benchmark

    * change the file names of benchmark tests

    * disable benchmark for ci

    * test dir

    * save changes

    * add copy_array

    * assert chunk_shape

    * improve hcore derivatives

    * cupy copy -> copy_array

    * optimize multi-GPU

    * bugfix for single gpu

    * update benchmark script

    * np.isclose

    * bugfix

    * auxbasis_response

    * add benchmark results

    * split nightly benchmark

    * optimize df.hessian memory

    * small fixes

    * bugfix in df.hessian

    * bugfix

    * add benchmark data

    * remove comments

    * resolve comments

    * group_size in hessian

    * resolve possible memory leak

    * bugfix

    * bugfix

commit e55a70e
Author: Xiaojie Wu <[email protected]>
Date:   Sun Jan 5 19:25:03 2025 -0800

    update examples

commit 177fb05
Author: Qiming Sun <[email protected]>
Date:   Fri Dec 27 22:05:17 2024 -0800

    Add pickle serialization (pyscf#294)

    * Add pickle serialization (fix pyscf#267)

    * syntax error

    * Fix DFHF serialization tests

    ---------

    Co-authored-by: Qiming Sun <[email protected]>

commit ad52eba
Author: Xiaojie Wu <[email protected]>
Date:   Fri Dec 27 22:04:57 2024 -0800

    Refactor Hessian classes (pyscf#290)

    * refactor hessian class

    * fixed bug in df.hessian.uhf

    * update license

    * format code

    * support h function in hessian.jk

    * unit test

    * optimize df hessian memory usage

    * more accurate memory estimate for hessian

    * _gen_jk -> _get_jk_ip

    * with_j and with_k for hessian

    * memory estimate

    * tested on 095 molecule

    * improve make_h1 in df.hessian

    * bugfix

    * use sorted_mol

    * assert hermi==1

    * typo in uhf.hessian

    * inject gen_response into soscf

    * remove print

commit 9d28f26
Author: Qiming Sun <[email protected]>
Date:   Mon Dec 23 20:40:19 2024 -0800

    pbc.df.ft_ao on GPU (pyscf#291)

    * Add ft_ao cuda kernel

    * Add ft_ao.py

    * Add helper functions in gpu4pyscf.gto.mole

    * Update pbc.ft_ao

    * ft_ao runs, output incorrect

    * PBC ft_ao general kernel correct

    * ft_ao unrolled

    * Modified kpts_to_kmesh

    * Add tests

    * Handle non-symmetric case; add more tests.

    * Lint

    * Missing files

    * Apply the ft_ao GPU implementation in aft and aft_jk

    * Update VHFOpt in scf.jk module

    * Undefined variables

    * vhfopt.mol -> vhfopt.sorted_mol

    * Fix J-engine due to the change of _VHFOpt class

    * Remove print statements

    * Apache header

    * More Apache headers

    ---------

    Co-authored-by: Qiming Sun <[email protected]>