
[question]: command line params needed to run localcolabfold on a local server #260

Closed
tamuanand opened this issue Sep 24, 2024 · 2 comments

@tamuanand

Hi @YoshitakaMo and localcolabfold team,

First of all, huge thanks for providing this tool.

I have a question: what set of parameters should I use with colabfold_batch and colabfold_search when running everything on a local server with GPUs, with no data sent to the MSA server? (See ColabFold on Google Colaboratory - #258 (comment).)

In this regard, I came across an issue in the ColabFold repository, and within it a comment from @mavericb linking to a detailed blog post: https://www.blopig.com/blog/2024/04/dockerized-colabfold-for-large-scale-batch-predictions/

I notice that @mavericb uses colabfold_search and colabfold_batch this way (with Docker):

```
colabfold_search \
  --mmseqs /usr/local/envs/colabfold/bin/mmseqs \
  input.fasta database msas \
  > /search.log 2>&1 \
&& colabfold_batch msas predictions \
  > batch.log 2>&1
```

Questions:

  1. Is the command line above from @mavericb the one to use (I would probably add --amber --use-gpu-relax), or should I use the command line shown here, with --use-env 1 --use-templates 1 --db-load-mode 2 for colabfold_search and --pdb-hit-file ... --local-pdb-path for colabfold_batch? What is the main difference between the two approaches?
  2. The FAQ here suggests that ColabFold does not support multiple GPUs. Is this also true for localcolabfold, where I want to run everything on my local server via a Slurm queue that has some single-GPU machines and some multi-GPU machines? Related to this, what happens if a job lands on a multi-GPU machine? (I am assuming the job will only run on 1 GPU - is my assumption correct?)

Thanks in advance.

@YoshitakaMo
Owner

Answer to Q1.

Please see the help messages (colabfold_batch --help or colabfold_search --help), the FAQ of this repo, and the ColabFold paper. --amber and --use-gpu-relax are optional flags of colabfold_batch; whether you use them depends on your purpose.
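For reference, a minimal sketch of adding those two flags (the msas and predictions directory names are placeholders):

```
# Predict from precomputed MSAs, then relax the models with AMBER,
# running the relaxation on the GPU rather than the CPU.
colabfold_batch --amber --use-gpu-relax msas predictions
```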

--use-env 1, --use-templates 1, and --db-load-mode 2 are optional flags of colabfold_search. --use-env 1 yields more diverse MSAs from the metagenomic database (colabfold_envdb_202108_db), and --use-templates 1 additionally writes out template information (as an output file foo.m8). --db-load-mode 2 is an MMseqs2 flag; see its documentation.
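Putting those together, a sketch of a template-enabled local pipeline could look like this (the database path, the .m8 file name, and the local PDB path are assumptions; check the actual file names colabfold_search writes into your output directory):

```
# Search against UniRef plus the metagenomic DB, and also report
# template hits (--use-templates 1 writes a .m8 hit file).
colabfold_search \
  --mmseqs /usr/local/envs/colabfold/bin/mmseqs \
  --use-env 1 --use-templates 1 --db-load-mode 2 \
  input.fasta /path/to/database msas

# Predict from the precomputed MSAs, using the template hit file and
# a local copy of the PDB instead of fetching templates remotely.
colabfold_batch \
  --pdb-hit-file msas/pdb70.m8 \
  --local-pdb-path /path/to/pdb/divided \
  msas predictions
```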

Answer to Q2.

> Is it true for localcolabfold too, where I want to run everything on my local server via a Slurm queue that has some single-GPU machines and some multi-GPU machines?

Yes.

> Related to this, what happens if a job lands on a multi-GPU machine?

Yes, your assumption is correct: the job will run on only one GPU, so the remaining GPUs on that node will sit idle - a waste of GPU resources.
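If you do run outside of Slurm on a multi-GPU node, one general workaround (not specific to localcolabfold) is to expose only a single GPU to the process via CUDA_VISIBLE_DEVICES:

```
# Make only GPU 0 visible; the other GPUs stay free for other jobs.
CUDA_VISIBLE_DEVICES=0 colabfold_batch msas predictions
```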

You might want to specify the number of GPUs and CPUs when submitting a job with Slurm. For example, if a compute node foo01 has a 16-core CPU and 4 GPUs, you can run 4 different jobs on foo01 simultaneously by requesting 4 cores and 1 GPU per job. Since colabfold_search and colabfold_batch do not need more than 4-8 CPU cores, this is the best option.
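As an illustration, a minimal Slurm batch script along those lines might look like the following (the partition name and directories are placeholders for your site's configuration):

```
#!/bin/bash
#SBATCH --job-name=colabfold
#SBATCH --partition=gpu        # placeholder partition name
#SBATCH --gres=gpu:1           # request exactly one GPU per job
#SBATCH --cpus-per-task=4      # 4 cores are plenty for colabfold_batch
#SBATCH --output=colabfold_%j.log

colabfold_batch msas predictions
```

Submitting four such jobs lets Slurm pack them onto foo01, one per GPU.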

@tamuanand
Author

Thanks @YoshitakaMo
