Published June 25, 2025

Choosing Resources: CPUs / MEM / GPU

If you have followed the previous guides, you already know how to submit a SLURM job. Now it’s time to choose the appropriate resources for it. This point is briefly addressed in the FAD.

MARBEC-GPU is a shared resource with great freedom of use (no quota on computing hours or number of simultaneous jobs) and a simple scheduling system: first come, first served. Therefore, it is crucial to choose the allocated resources correctly.

IMPORTANT: The execution time of a job depends on several factors, including the size of your data and the resources allocated, but above all on how your code is written and which packages it uses. Most code cannot parallelize computations (use multiple CPU cores at the same time), and even less code supports GPU execution. So even if you request 10 CPU cores and a GPU, it is quite possible that your job will use only one CPU core and never touch the GPU. Furthermore, requesting twice as many CPUs does not halve the execution time; it is up to you to test and find a good compromise between allocated resources and execution time.
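To see why the speedup is sublinear, the sketch below times the same batch of tasks with 1 and 4 workers. It uses Python's standard library with `time.sleep` as a stand-in workload (the task, counts, and durations are illustrative, not from MARBEC-GPU); overhead and non-parallel sections keep the speedup below the worker count.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task(_):
    time.sleep(0.1)  # stand-in for one unit of real work

def run(n_workers, n_tasks=8):
    """Run n_tasks with n_workers workers and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(task, range(n_tasks)))
    return time.perf_counter() - start

t1 = run(1)  # ~0.8 s: tasks run one after another
t4 = run(4)  # faster, but never exactly t1 / 4
print(f"1 worker: {t1:.2f}s, 4 workers: {t4:.2f}s")
```

Running this kind of small benchmark on your own workload is the most reliable way to pick a core count.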

Good Practices

  1. Learn about the executed code

Before running your job on MARBEC-GPU, check if your code parallelizes calculations and/or uses the GPU. If you see parameters like “workers”, “n_jobs”, “n_cpus”, “device”, or “gpu” in the documentation, it’s a good sign that the code can parallelize calculations and/or use the GPU. You can also refer to the documentation of the packages used (for commonly used packages, LLMs will know how to answer you; for niche packages or languages, don’t hesitate to ask the author). If you don’t find these parameters, it is likely that the code does not use these resources.
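One quick way to spot such parameters, sketched here with Python's standard library, is to inspect the signature of the function or class you plan to call (`ProcessPoolExecutor` is just an example; substitute your own entry point):

```python
import inspect
from concurrent.futures import ProcessPoolExecutor

# List the parameters this API accepts and look for "workers"-style names.
params = list(inspect.signature(ProcessPoolExecutor).parameters)
print(params)  # includes 'max_workers', the equivalent of "workers"/"n_jobs"
```

If no such parameter shows up in the signature or the documentation, assume the code runs on a single core.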

  2. Choose the allocated resources

    • CPU: If your code parallelizes calculations, you can allocate multiple CPU cores: #SBATCH -c. For typical uses in Python or R for machine learning, data extraction, etc., choosing between 2 and 16 CPU cores is often sufficient. For more specific uses, you may need to adjust accordingly.
    • MEM: The amount of RAM requested mainly depends on the size of your data and the size of the models used. For typical uses, between 8 and 64 GB of RAM is often sufficient. For heavier data processing tasks, you can go up to 256 GB of RAM.
    • GPU: If your code uses the GPU, you can allocate 1 GPU.
  3. Check resource usage
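Putting these directives together, a submission script might look like the sketch below. The job name and script path are placeholders, and the exact GPU directive (`--gres=gpu:1` here) can vary with the cluster's SLURM configuration:

```shell
#!/bin/bash
#SBATCH --job-name=my-analysis  # placeholder name
#SBATCH -c 8                    # 8 CPU cores: only useful if the code parallelizes
#SBATCH --mem=32G               # 32 GB of RAM
#SBATCH --gres=gpu:1            # 1 GPU: only useful if the code supports it

# run_analysis.py is a placeholder for your own script
python run_analysis.py
```

Submit it with `sbatch`, then lower (or raise) each value once you have seen what the job actually consumes.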

After your job has completed, you can check how efficiently the allocated resources were actually used with the command reportseff --user <username> (for more information, run reportseff --help).

Note

Be aware that session jobs (named spawner-jupyterhub) also appear in the reportseff output; you can ignore them.

Then, adjust future allocations based on effective usage and computation time.
