StartDate: 2026-06-04 06:06:38+00:00
CpuId: 12x Intel Xeon W 2000 / D-2100 (Skylake / Cascade Lake) {Skylake}, 14nm
GpuId: 1x Tesla V100-SXM2-16GB
CommitSHA: 8dea62cad104caef3e3318490f57559519d1eb8a
CommitTime: 2026-06-03 23:47:56 +0200
CommitAuthor: Ole Schütt
CommitSubject: Reorganize dashboard and reduce Spack testing cadence (#5344)

#################### Building Image cp2k-perf-cuda-volta ####################
Dockerfile: /tools/docker/Dockerfile.test_performance_cuda_V100
Build-Path: /
Build-Args: GIT_COMMIT_SHA=8dea62cad104caef3e3318490f57559519d1eb8a SPACK_CACHE=gs://cp2k-spack-cache
Build-Cache: Yes
Populating docker build cache... done.
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon 420.4MB
Step 1/46 : FROM nvidia/cuda:12.9.1-devel-ubuntu24.04
12.9.1-devel-ubuntu24.04: Pulling from nvidia/cuda
32f112e3802c: Pulling fs layer
644e9b203583: Pulling fs layer
02559cd4bc8d: Pulling fs layer
2cd52cbb1ebe: Pulling fs layer
6e8af4fd0a07: Pulling fs layer
15a17189b2df: Pulling fs layer
02cb0e091e33: Pulling fs layer
9c3d619183d2: Pulling fs layer
7f7602a82106: Pulling fs layer
5a2aba542b08: Pulling fs layer
6cb9b761b877: Pulling fs layer
2cd52cbb1ebe: Waiting
9c3d619183d2: Waiting
7f7602a82106: Waiting
5a2aba542b08: Waiting
6e8af4fd0a07: Waiting
15a17189b2df: Waiting
02cb0e091e33: Waiting
6cb9b761b877: Waiting
644e9b203583: Verifying Checksum
644e9b203583: Download complete
2cd52cbb1ebe: Verifying Checksum
2cd52cbb1ebe: Download complete
32f112e3802c: Verifying Checksum
32f112e3802c: Download complete
6e8af4fd0a07: Download complete
02cb0e091e33: Verifying Checksum
02cb0e091e33: Download complete
9c3d619183d2: Verifying Checksum
9c3d619183d2: Download complete
7f7602a82106: Verifying Checksum
7f7602a82106: Download complete
02559cd4bc8d: Verifying Checksum
02559cd4bc8d: Download complete
6cb9b761b877: Verifying Checksum
6cb9b761b877: Download complete
32f112e3802c: Pull complete
644e9b203583: Pull complete
02559cd4bc8d: Pull complete
2cd52cbb1ebe: Pull complete
6e8af4fd0a07: Pull complete
15a17189b2df: Download complete
5a2aba542b08: Verifying Checksum
5a2aba542b08: Download complete
15a17189b2df: Pull complete
02cb0e091e33: Pull complete
9c3d619183d2: Pull complete
7f7602a82106: Pull complete
5a2aba542b08: Pull complete
6cb9b761b877: Pull complete
Digest: sha256:020bc241a628776338f4d4053fed4c38f6f7f3d7eb5919fecb8de313bb8ba47c
Status: Downloaded newer image for nvidia/cuda:12.9.1-devel-ubuntu24.04
 ---> eecafe98c3e1
Step 2/46 : ENV CUDA_PATH /usr/local/cuda
 ---> Using cache
 ---> 780681fb1fee
Step 3/46 : ENV LD_LIBRARY_PATH /usr/local/cuda/lib64
 ---> Using cache
 ---> ba98a15dc225
Step 4/46 : ENV CUDA_CACHE_DISABLE 1
 ---> Using cache
 ---> 3932740340f7
Step 5/46 : RUN apt-get update -qq && apt-get install -qq --no-install-recommends gfortran && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> a06eb14abc29
Step 6/46 : WORKDIR /opt/cp2k-toolchain
 ---> Using cache
 ---> 082681bac850
Step 7/46 : COPY ./tools/toolchain/install_requirements*.sh ./
 ---> Using cache
 ---> d8bfc1674c90
Step 8/46 : RUN ./install_requirements.sh ubuntu
 ---> Using cache
 ---> de928c312410
Step 9/46 : RUN mkdir scripts
 ---> Using cache
 ---> 4aed4b85b643
Step 10/46 : COPY ./tools/toolchain/scripts/VERSION ./tools/toolchain/scripts/parse_if.py ./tools/toolchain/scripts/tool_kit.sh ./tools/toolchain/scripts/common_vars.sh ./tools/toolchain/scripts/signal_trap.sh ./tools/toolchain/scripts/get_openblas_arch.sh ./scripts/
 ---> Using cache
 ---> ce9efe84db60
Step 11/46 : COPY ./tools/toolchain/install_cp2k_toolchain.sh .
 ---> Using cache
 ---> dfc1a5ca7e3f
Step 12/46 : RUN ./install_cp2k_toolchain.sh --with-mpich=install --mpi-mode=mpich --enable-cuda=yes --gpu-ver=V100 --dry-run
 ---> Using cache
 ---> 1bc3916e19c7
Step 13/46 : COPY ./tools/toolchain/scripts/stage0/ ./scripts/stage0/
 ---> Using cache
 ---> bbd97369be82
Step 14/46 : RUN ./scripts/stage0/install_stage0.sh && rm -rf ./build
 ---> Using cache
 ---> fbbd58fb6405
Step 15/46 : COPY ./tools/toolchain/scripts/stage1/ ./scripts/stage1/
 ---> Using cache
 ---> 9707298b4465
Step 16/46 : RUN ./scripts/stage1/install_stage1.sh && rm -rf ./build
 ---> Using cache
 ---> 10af8edef201
Step 17/46 : COPY ./tools/toolchain/scripts/stage2/ ./scripts/stage2/
 ---> Using cache
 ---> cde1e5c7df26
Step 18/46 : RUN ./scripts/stage2/install_stage2.sh && rm -rf ./build
 ---> Using cache
 ---> e634e183ddda
Step 19/46 : COPY ./tools/toolchain/scripts/stage3/ ./scripts/stage3/
 ---> Using cache
 ---> 90e1d29eaee5
Step 20/46 : RUN ./scripts/stage3/install_stage3.sh && rm -rf ./build
 ---> Using cache
 ---> 456e432c42cd
Step 21/46 : COPY ./tools/toolchain/scripts/stage4/ ./scripts/stage4/
 ---> Using cache
 ---> 25314ed00994
Step 22/46 : RUN ./scripts/stage4/install_stage4.sh && rm -rf ./build
 ---> Using cache
 ---> 2f32d5fcf1ca
Step 23/46 : COPY ./tools/toolchain/scripts/stage5/ ./scripts/stage5/
 ---> Using cache
 ---> f6eb71d2ea73
Step 24/46 : RUN ./scripts/stage5/install_stage5.sh && rm -rf ./build
 ---> Using cache
 ---> 89a999028ecd
Step 25/46 : COPY ./tools/toolchain/scripts/stage6/ ./scripts/stage6/
 ---> Using cache
 ---> 4fee466a0efd
Step 26/46 : RUN ./scripts/stage6/install_stage6.sh && rm -rf ./build
 ---> Using cache
 ---> 4a225437d875
Step 27/46 : COPY ./tools/toolchain/scripts/stage7/ ./scripts/stage7/
 ---> Using cache
 ---> b3bdd93e7b5e
Step 28/46 : RUN ./scripts/stage7/install_stage7.sh && rm -rf ./build
 ---> Using cache
 ---> fc993b0523c8
Step 29/46 : COPY ./tools/toolchain/scripts/stage8/ ./scripts/stage8/
 ---> Using cache
 ---> b243c28b2b5f
Step 30/46 : RUN ./scripts/stage8/install_stage8.sh && rm -rf ./build
 ---> Using cache
 ---> ac272cd10306
Step 31/46 : COPY ./tools/toolchain/scripts/stage9/ ./scripts/stage9/
 ---> Using cache
 ---> 3ae08df2098f
Step 32/46 : RUN ./scripts/stage9/install_stage9.sh && rm -rf ./build
 ---> Using cache
 ---> 8632987b9f69
Step 33/46 : WORKDIR /opt/cp2k
 ---> Using cache
 ---> b27caf79383d
Step 34/46 : COPY ./src ./src
 ---> Using cache
 ---> 7238308c7619
Step 35/46 : COPY ./data ./data
 ---> Using cache
 ---> 6b0386348422
Step 36/46 : COPY ./tools/build_utils ./tools/build_utils
 ---> Using cache
 ---> 305359f46882
Step 37/46 : COPY ./cmake ./cmake
 ---> Using cache
 ---> 2201832e7ffe
Step 38/46 : COPY ./CMakeLists.txt .
 ---> Using cache
 ---> 604590b61f0f
Step 39/46 : COPY ./tools/docker/scripts/build_cp2k.sh .
 ---> Using cache
 ---> 365edc145129
Step 40/46 : RUN ./build_cp2k.sh toolchain_cuda_V100 psmp
 ---> Using cache
 ---> 15f950ae6008
Step 41/46 : COPY ./benchmarks ./benchmarks
 ---> Using cache
 ---> 13a1024660a9
Step 42/46 : COPY ./tools/regtesting ./tools/regtesting
 ---> Using cache
 ---> 32ea7ac77177
Step 43/46 : COPY ./tools/docker/scripts/test_performance.sh ./tools/docker/scripts/plot_performance.py ./
 ---> Using cache
 ---> dd6848e4c4e7
Step 44/46 : RUN ./test_performance.sh "toolchain_cuda_V100" 2>&1 | tee report.log
 ---> Using cache
 ---> cc9f56570b41
Step 45/46 : CMD cat $(find ./report.log -mmin +10) | sed '/^Summary:/ s/$/ (cached)/'
 ---> Using cache
 ---> 127c58eabc10
Step 46/46 : ENTRYPOINT []
 ---> Using cache
 ---> 535bdf0e77e7
[Warning] One or more build-args [GIT_COMMIT_SHA SPACK_CACHE] were not consumed
Successfully built 535bdf0e77e7
Successfully tagged us-central1-docker.pkg.dev/cp2k-org-project/cp2kci/img_cp2k-perf-cuda-volta:master
Pushing new image... done.

#################### Running Image cp2k-perf-cuda-volta ####################

============== CP2K Binary Flags =============
cp2kflags: omp libint fftw3 libxc elpa parallel scalapack mpi_f08 cosma xsmm dbcsr_acc sirius offload_cuda spla_gemm_offloading libvdwxc hdf5

========== Checking Benchmark Inputs =========
Found 83 input files and 0 errors.

========== Running Performance Test ==========

Plot: name="total_timings_6cpu_1gpu", title="Total Timings with 6 CPU Cores and 1 GPU", ylabel="time [s]"

Running H2O-64.inp with 3 threads and 2 ranks... done.
From /workspace/artifacts/H2O-64_6cpu_1gpu.out:
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.030    0.033  109.294  109.295
 qs_mol_dyn_low                       1  2.0    0.004    0.004  108.860  108.864
 qs_forces                           11  3.9    0.002    0.002  108.809  108.810
 qs_energies                         11  4.9    0.001    0.001   97.749   97.751
 scf_env_do_scf                      11  5.9    0.001    0.001   82.325   82.326
 scf_env_do_scf_inner_loop          121  6.6    0.006    0.009   71.156   71.157
 velocity_verlet                     10  3.0    0.002    0.002   71.025   71.042
 rebuild_ks_matrix                  132  8.4    0.001    0.001   29.731   29.736
 qs_ks_build_kohn_sham_matrix       132  9.4    0.021    0.021   29.730   29.735
 dbcsr_multiply_generic            2551 12.5    0.161    0.162   28.871   28.964
 qs_ks_update_qs_env                132  7.6    0.001    0.001   27.951   27.955
 qs_scf_new_mos                     121  7.6    0.001    0.001   24.941   24.943
 qs_scf_loop_do_ot                  121  8.6    0.001    0.001   24.940   24.942
 qs_rho_update_rho_low              132  7.7    0.001    0.001   23.154   23.173
 calculate_rho_elec                 132  8.7    0.997    1.006   23.153   23.172
 ot_scf_mini                        121  9.6    0.003    0.003   22.708   22.710
 fft_wrap_pw1pw2                   1331 11.7    0.025    0.026   18.842   18.879
 fft_wrap_pw1pw2_140                539 12.2    0.003    0.003   16.213   16.271
 sum_up_and_integrate               132 10.4    0.003    0.003   15.241   15.322
 integrate_v_rspace                 132 11.4    0.398    0.400   15.136   15.217
 multiply_cannon                   2551 13.5    0.368    0.371   14.156   14.201
 make_m2s                          5102 13.5    0.049    0.049   12.839   12.913
 multiply_cannon_loop              2551 14.5    0.290    0.293   12.864   12.890
 make_images                       5102 14.5    1.391    1.444   12.640   12.713
 density_rs2pw                      132  9.7    0.008    0.008   12.253   12.430
 ot_mini                            121 10.6    0.001    0.001   12.400   12.404
 init_scf_loop                       11  6.9    0.000    0.000   11.078   11.079
 grid_collocate_task_list           132  9.7    9.866   10.002    9.866   10.002
 pw_gpu_r3dc1d_3d_ps                671 13.2    2.652    2.682    9.628    9.637
 pw_gpu_c1dr3d_3d_ps                660 14.2    2.539    2.572    9.181    9.210
 build_core_hamiltonian_matrix_      11  4.9    0.001    0.001    7.932    8.033
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000    7.821    7.821
 hybrid_alltoall_any               5285 16.5    5.446    5.581    7.625    7.679
 prepare_preconditioner              11  7.9    0.000    0.000    7.659    7.664
 make_preconditioner                 11  8.9    0.000    0.000    7.659    7.664
 make_images_data                  5102 15.5    0.061    0.062    7.526    7.570
 grid_integrate_task_list           132 12.4    7.464    7.545    7.464    7.545
 qs_ot_get_derivative               121 11.6    0.002    0.002    7.362    7.365
 potential_pw2rs                    132 12.4    0.041    0.041    7.274    7.276
 init_scf_run                        11  5.9    0.000    0.000    6.925    6.926
 scf_env_initial_rho_setup           11  6.9    0.000    0.001    6.925    6.925
 multiply_cannon_multrec           5102 15.5    2.423    2.481    6.768    6.831
 make_full_inverse_cholesky          11  9.9    0.000    0.000    6.395    6.658
 ot_diis_step                       121 11.6    0.006    0.007    5.010    5.010
 mp_alltoall_z22v                  1331 15.7    4.815    4.931    4.815    4.931
 qs_ot_get_p                        132 10.4    0.002    0.002    4.733    4.738
 mp_waitall_1                     71967 17.0    4.407    4.584    4.407    4.584
 apply_preconditioner_dbcsr         132 12.6    0.000    0.000    4.298    4.300
 apply_single                       132 13.6    0.001    0.001    4.298    4.299
 build_core_ppl_forces               11  5.9    4.070    4.148    4.070    4.148
 wfi_extrapolate                     11  7.9    0.001    0.001    4.009    4.009
 build_core_hamiltonian_matrix       11  6.9    0.001    0.001    3.954    3.993
 dbcsr_mm_accdrv_process          10670 16.2    1.155    1.338    3.902    3.910
 dbcsr_complete_redistribute        389 12.7    1.458    1.481    3.595    3.876
 calculate_dm_sparse                132  9.5    0.001    0.001    3.612    3.613
 qs_env_update_s_mstruct             11  6.9    0.000    0.000    3.496    3.504
 qs_ot_p2m_diag                      80 11.4    0.146    0.148    3.504    3.504
 multiply_cannon_sync_h2d          5102 15.5    3.384    3.464    3.384    3.464
 transfer_rs2pw                     539 10.7    0.009    0.009    2.928    3.120
 qs_ks_update_qs_env_forces          11  4.9    0.000    0.000    3.023    3.024
 yz_to_x                            671 14.2    0.510    0.515    2.948    3.009
 pw_poisson_solve                   132 10.4    0.003    0.003    2.985    2.993
 cp_dbcsr_syevd                      80 12.4    0.008    0.008    2.975    2.975
 x_to_yz                            660 15.2    0.546    0.550    2.923    2.968
 cp_dbcsr_sm_fm_multiply             37  9.5    0.001    0.001    2.942    2.947
 qs_ot_get_derivative_diag           75 12.3    0.003    0.003    2.813    2.817
 copy_dbcsr_to_fm                   183 11.7    0.004    0.004    2.764    2.786
 qs_create_task_list                 11  7.9    0.000    0.000    2.622    2.667
 generate_qs_task_list               11  8.9    1.180    1.182    2.622    2.667
 transfer_rs2pw_140                 143 11.6    1.712    1.736    2.430    2.636
 cp_fm_diag_elpa                     80 13.4    0.001    0.001    2.573    2.573
 cp_fm_diag_elpa_base                80 14.4    2.529    2.539    2.570    2.571
 pw_gpu_fg                          671 14.2    2.475    2.501    2.475    2.501
 calculate_first_density_matrix       1  7.0    0.000    0.000    2.454    2.455
 cp_dbcsr_sm_fm_multiply_core        37 10.5    0.000    0.000    2.373    2.374
 cp_fm_cholesky_invert               11 10.9    2.309    2.309    2.309    2.309
 dbcsr_special_finalize            7653 15.5    0.043    0.044    2.296    2.299
 jit_kernel_multiply                 10 15.8    2.067    2.261    2.067    2.261
 qs_vxc_create                      132 10.4    0.004    0.004    2.220    2.244
 xc_vxc_pw_create                   132 11.4    0.754    0.761    2.216    2.240
 -------------------------------------------------------------------------------

PlotPoint: plot="total_timings_6cpu_1gpu", name="H2O-64", label="H2O-64", y=109.294, yerr=0.0

Plot: name="H2O-64_timings_6cpu_1gpu", title="Timings of H2O-64 with 6 CPU Cores and 1 GPU", ylabel="time [s]"
PlotPoint: plot="H2O-64_timings_6cpu_1gpu", name="rest", label="rest", y=77.29599999999999, yerr=0.0
PlotPoint: plot="H2O-64_timings_6cpu_1gpu", name="grid_collocate_task_list", label="grid_collocate_task_list", y=9.866, yerr=0.0
PlotPoint: plot="H2O-64_timings_6cpu_1gpu", name="grid_integrate_task_list", label="grid_integrate_task_list", y=7.464, yerr=0.0
PlotPoint: plot="H2O-64_timings_6cpu_1gpu", name="hybrid_alltoall_any", label="hybrid_alltoall_any", y=5.446, yerr=0.0
PlotPoint: plot="H2O-64_timings_6cpu_1gpu", name="mp_alltoall_z22v", label="mp_alltoall_z22v", y=4.815, yerr=0.0
PlotPoint: plot="H2O-64_timings_6cpu_1gpu", name="mp_waitall_1", label="mp_waitall_1", y=4.407, yerr=0.0

Running H2O-64_nonortho.inp with 3 threads and 2 ranks... done.
From /workspace/artifacts/H2O-64_nonortho_6cpu_1gpu.out:
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.028    0.028   95.927   95.927
 qs_mol_dyn_low                       1  2.0    0.004    0.004   95.494   95.497
 qs_forces                           11  3.9    0.002    0.002   95.446   95.446
 qs_energies                         11  4.9    0.001    0.001   84.236   84.237
 scf_env_do_scf                      11  5.9    0.001    0.001   67.925   67.925
 velocity_verlet                     10  3.0    0.001    0.002   60.299   60.317
 scf_env_do_scf_inner_loop          100  6.5    0.005    0.007   56.742   56.742
 rebuild_ks_matrix                  111  8.3    0.001    0.001   25.958   25.963
 qs_ks_build_kohn_sham_matrix       111  9.3    0.017    0.018   25.957   25.962
 dbcsr_multiply_generic            2053 12.4    0.131    0.132   24.124   24.162
 qs_ks_update_qs_env                111  7.6    0.001    0.001   23.870   23.875
 qs_scf_new_mos                     100  7.5    0.001    0.001   19.369   19.388
 qs_scf_loop_do_ot                  100  8.5    0.001    0.001   19.368   19.387
 qs_rho_update_rho_low              111  7.7    0.001    0.001   18.288   18.291
 calculate_rho_elec                 111  8.7    0.832    0.833   18.288   18.290
 ot_scf_mini                        100  9.5    0.003    0.003   17.529   17.531
 fft_wrap_pw1pw2                   1121 11.6    0.021    0.021   15.930   15.933
 sum_up_and_integrate               111 10.3    0.002    0.002   13.896   13.922
 integrate_v_rspace                 111 11.3    0.338    0.340   13.807   13.833
 fft_wrap_pw1pw2_140                455 12.2    0.003    0.003   13.698   13.718
 multiply_cannon                   2053 13.4    0.309    0.313   12.239   12.246
 multiply_cannon_loop              2053 14.4    0.235    0.239   11.271   11.281
 init_scf_loop                       11  6.9    0.000    0.000   11.096   11.096
 make_m2s                          4106 13.4    0.039    0.040   10.367   10.369
 density_rs2pw                      111  9.7    0.007    0.007   10.267   10.338
 ot_mini                            100 10.5    0.001    0.001   10.211   10.216
 make_images                       4106 14.4    1.107    1.121   10.204   10.205
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000    8.794    8.794
 pw_gpu_r3dc1d_3d_ps                566 13.1    2.247    2.253    8.195    8.208
 build_core_hamiltonian_matrix_      11  4.9    0.001    0.001    7.903    8.066
 prepare_preconditioner              11  7.9    0.000    0.000    7.756    7.762
 make_preconditioner                 11  8.9    0.000    0.000    7.756    7.762
 pw_gpu_c1dr3d_3d_ps                555 14.2    2.119    2.137    7.708    7.723
 grid_integrate_task_list           111 12.3    7.323    7.346    7.323    7.346
 grid_collocate_task_list           111  9.7    7.162    7.222    7.162    7.222
 init_scf_run                        11  5.9    0.000    0.000    6.844    6.844
 scf_env_initial_rho_setup           11  6.9    0.000    0.001    6.843    6.843
 make_full_inverse_cholesky          11  9.9    0.000    0.000    6.493    6.755
 qs_ot_get_derivative               100 11.5    0.001    0.001    6.212    6.218
 hybrid_alltoall_any               4255 16.3    4.511    4.523    6.181    6.197
 multiply_cannon_multrec           4106 15.4    1.995    2.020    6.169    6.188
 potential_pw2rs                    111 12.3    0.035    0.035    6.146    6.146
 make_images_data                  4106 15.4    0.049    0.050    6.031    6.046
 qs_env_update_s_mstruct             11  6.9    0.000    0.000    4.401    4.548
 build_core_ppl_forces               11  5.9    4.041    4.167    4.041    4.167
 mp_alltoall_z22v                  1121 15.6    4.047    4.052    4.047    4.052
 build_core_hamiltonian_matrix       11  6.9    0.001    0.001    3.964    4.008
 ot_diis_step                       100 11.5    0.005    0.005    3.976    3.976
 dbcsr_complete_redistribute        321 12.2    1.450    1.468    3.708    3.970
 wfi_extrapolate                     11  7.9    0.001    0.001    3.895    3.895
 dbcsr_mm_accdrv_process           8792 16.1    1.210    1.522    3.803    3.814
 qs_create_task_list                 11  7.9    0.000    0.000    3.515    3.617
 generate_qs_task_list               11  8.9    1.473    1.496    3.515    3.617
 apply_preconditioner_dbcsr         111 12.6    0.000    0.000    3.488    3.491
 apply_single                       111 13.6    0.001    0.001    3.488    3.490
 mp_waitall_1                     57939 16.9    3.444    3.451    3.444    3.451
 calculate_dm_sparse                111  9.5    0.001    0.001    3.191    3.210
 qs_ks_update_qs_env_forces          11  4.9    0.000    0.000    3.141    3.141
 qs_ot_get_p                        111 10.4    0.001    0.001    3.060    3.064
 cp_dbcsr_sm_fm_multiply             37  9.5    0.001    0.001    3.020    3.021
 copy_dbcsr_to_fm                   149 11.2    0.004    0.004    2.892    2.904
 multiply_cannon_sync_h2d          4106 15.4    2.861    2.889    2.861    2.889
 transfer_rs2pw                     455 10.6    0.007    0.008    2.464    2.579
 calculate_first_density_matrix       1  7.0    0.000    0.000    2.515    2.516
 pw_poisson_solve                   111 10.3    0.003    0.003    2.501    2.507
 yz_to_x                            566 14.1    0.432    0.435    2.495    2.499
 cp_dbcsr_sm_fm_multiply_core        37 10.5    0.000    0.000    2.447    2.447
 x_to_yz                            555 15.2    0.455    0.456    2.439    2.440
 jit_kernel_multiply                 10 15.4    2.023    2.347    2.023    2.347
 cp_fm_cholesky_invert               11 10.9    2.291    2.291    2.291    2.291
 transfer_dbcsr_to_fm                11 10.9    0.001    0.001    2.272    2.284
 transfer_rs2pw_140                 122 11.5    1.462    1.465    2.049    2.169
 pw_gpu_fg                          566 14.1    2.120    2.140    2.120    2.140
 copy_fm_to_dbcsr                   172 11.2    0.001    0.001    1.867    2.133
 build_core_ppl                      11  7.9    2.065    2.101    2.065    2.101
 qs_ot_get_derivative_taylor         55 13.0    0.002    0.002    2.011    2.013
 qs_ot_p2m_diag                      46 11.0    0.080    0.081    1.990    1.990
 dbcsr_special_finalize            6159 15.4    0.035    0.036    1.943    1.950
 -------------------------------------------------------------------------------

PlotPoint: plot="total_timings_6cpu_1gpu", name="H2O-64_nonortho", label="H2O-64_nonortho", y=95.927, yerr=0.0

Plot: name="H2O-64_nonortho_timings_6cpu_1gpu", title="Timings of H2O-64_nonortho with 6 CPU Cores and 1 GPU", ylabel="time [s]"
PlotPoint: plot="H2O-64_nonortho_timings_6cpu_1gpu", name="rest", label="rest", y=68.843, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_6cpu_1gpu", name="grid_integrate_task_list", label="grid_integrate_task_list", y=7.323, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_6cpu_1gpu", name="grid_collocate_task_list", label="grid_collocate_task_list", y=7.162, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_6cpu_1gpu", name="hybrid_alltoall_any", label="hybrid_alltoall_any", y=4.511, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_6cpu_1gpu", name="mp_alltoall_z22v", label="mp_alltoall_z22v", y=4.047, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_6cpu_1gpu", name="build_core_ppl_forces", label="build_core_ppl_forces", y=4.041, yerr=0.0

Running w64PBE.inp with 3 threads and 2 ranks... failed.
  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
     1 OT DIIS     0.80E-01    3.9     0.00000471     -1102.7676350024 -4.38E-10
     2 OT DIIS     0.80E-01    1.7     0.00000539     -1102.7676348546  1.48E-07
     3 OT DIIS     0.80E-01    1.7     0.00000486     -1102.7676348805 -2.59E-08
     4 OT DIIS     0.80E-01    1.7     0.00001225     -1102.7676348941 -1.36E-08
     5 OT DIIS     0.80E-01    1.7     0.00005090     -1102.7676348939  1.73E-10
     6 OT DIIS     0.80E-01    1.7     0.00001213     -1102.7676348940 -8.32E-11
     7 OT DIIS     0.80E-01    1.7     0.00000145     -1102.7676348971 -3.12E-09
     8 OT DIIS     0.80E-01    1.7     0.00000291     -1102.7676349404 -4.34E-08
     9 OT DIIS     0.80E-01    1.7     0.00000179     -1102.7676349530 -1.26E-08
    10 OT DIIS     0.80E-01    1.7     0.00000488     -1102.7676349618 -8.75E-09

  Leaving inner SCF loop after reaching 10 steps.

  Electronic density on regular grids:       -512.0000000044       -0.0000000044
  Core density on regular grids:              511.9999999998       -0.0000000002
  Total charge density on r-space grids:       -0.0000000045
  Total charge density g-space grids:          -0.0000000045

  Overlap energy of the core charge distribution:               0.00000091569564
  Self energy of the core charge distribution:              -2838.67351367283345
  Core Hamiltonian energy:                                    824.05924223467923
  Hartree energy:                                            1182.15847701058419
  Exchange-correlation energy:                               -270.31184144987799

  Total energy:                                             -1102.76763496175226

  outer SCF iter = 10 RMS gradient = 0.49E-05 energy = -1102.7676349618

  ----------------------------------- OT ---------------------------------------

  Minimizer      : DIIS                : direct inversion
                                         in the iterative subspace
                                         using 7 DIIS vectors
                                         safer DIIS on
  Preconditioner : FULL_SINGLE_INVERSE : inversion of
                                         H + eS - 2*(Sc)(c^T*H*c+const)(Sc)^T
  Precond_solver : DEFAULT
  stepsize       :    0.08000000                  energy_gap     :    0.08000000
  eps_taylor     :   0.10000E-15                  max_taylor     :             4

  ----------------------------------- OT ---------------------------------------

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
     1 OT DIIS     0.80E-01    3.9     0.00000901     -1102.7676349672 -5.46E-09
     2 OT DIIS     0.80E-01    1.7     0.00000317     -1102.7676344772  4.90E-07
     3 OT DIIS     0.80E-01    1.7     0.00000228     -1102.7676347209 -2.44E-07
     4 OT DIIS     0.80E-01    1.7     0.00000805     -1102.7676348898 -1.69E-07
     5 OT SD       0.80E-01    1.7     0.00001646     -1102.7676348899 -7.07E-11
     6 OT DIIS     0.80E-01    1.7     0.00000655     -1102.7676331015  1.79E-06
     7 OT DIIS     0.80E-01    1.7     0.00000183     -1102.7676348931 -1.79E-06
     8 OT DIIS     0.80E-01    1.7     0.00001112     -1102.7676348959 -2.80E-09
     9 OT SD       0.80E-01    1.7     0.00001472     -1102.7676349235 -2.75E-08
    10 OT DIIS     0.80E-01    1.7     0.00000497     -1102.7676336146  1.31E-06

  Leaving inner SCF loop after reaching 10 steps.

  Electronic density on regular grids:       -512.0000000044       -0.0000000044
  Core density on regular grids:              511.9999999998       -0.0000000002
  Total charge density on r-space grids:       -0.0000000045
  Total charge density g-space grids:          -0.0000000045

  Overlap energy of the core charge distribution:               0.00000091569564
  Self energy of the core charge distribution:              -2838.67351367283345
  Core Hamiltonian energy:                                    824.05926407005745
  Hartree energy:                                            1182.15845546203695
  Exchange-correlation energy:                               -270.31184038957821

  Total energy:                                             -1102.76763361462167

  outer SCF iter = 11 RMS gradient = 0.50E-05 energy = -1102.7676336146
  outer SCF loop FAILED to converge after 11 iterations or 110 steps

 *******************************************************************************
 *   ___                                                                       *
 *  /   \                                                                      *
 * [ABORT]                                                                     *
 *  \___/    SCF run NOT converged. To continue the calculation regardless,    *
 *    |      please set the keyword IGNORE_CONVERGENCE_FAILURE.                *
 *  O/|                                                                        *
 * /| |                                                                        *
 * / \                                                          qs_scf.F:685   *
 *******************************************************************************

 ===== Routine Calling Stack =====

            5 scf_env_do_scf
            4 qs_energies
            3 qs_forces
            2 qs_mol_dyn_low
            1 CP2K
Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
STOP 1

Summary: Running w64PBE.inp failed. (cached)
Status: FAILED

Uploading artifacts... done
EndDate: 2026-06-04 06:11:54+00:00