
Faster AlphaFold protein structure predictions using ColabFold 

UCSF ChimeraX

ChimeraX now uses an optimized version of AlphaFold called ColabFold to predict protein structures in tens of minutes instead of hours. This video shows how to predict structures with ColabFold in ChimeraX.

Science

Published: 17 Jul 2022

Comments: 38
@shruthikrishnaswamy1539 1 year ago
thank you so much, this was very helpful for my project. It took a whopping 22 minutes for me. But I believe faster versions will pop up soon.
@laurenegan2918 1 year ago
I never comment, but honestly thank you so much for this! So straightforward and I was about to give up completely on alpha fold!
@ucsfchimerax8387 1 year ago
Glad you found the video useful. ChimeraX uses ColabFold which is an optimized version of AlphaFold and you can also run ColabFold without ChimeraX using their web page colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb. Setting up AlphaFold on your own computer is a lot of trouble, requiring Linux, Docker, about 3 Tbytes of free disk and days to download databases, and hopefully a high-end Nvidia GPU, and even after all that work it is 10 times slower than ColabFold. So thanks should go to the developers of ColabFold.
@alexkukreja3509 1 year ago
Hi. First, thanks for all of the helpful videos that ChimeraX puts out to use this great software. I really enjoy using it, especially with the Colabfold features added in. Second, is there a way to access the sequence alignments that Colabfold finds for your protein sequences at the beginning of the structure prediction? It would be nice to see which organisms these alignments are coming from and be able to analyze in greater detail the sequence variation.
@wubishetmengistu5874 2 months ago
Thanks for sharing.
@user-TengfeiLiu 1 year ago
excellent
@user-ui3hb5pc2w 9 months ago
Hi, very nice explanations indeed. I know this video is not new, but I only saw it today. I have a question: I use version 1.6.1 (2023-05-09) of ChimeraX and I would like to know how to run ColabFold (now at 1.5.2). There is only the alphafold command in this version, and at the beginning of your video you said that you modified ChimeraX to do this. So my naive question is: how do I do that? Thanks a lot, Didier
@nickinner3118 1 year ago
Thanks so much for this helpful video and clear explanation! I just wonder why you predicted two proteins? Do you think this can predict whether a ligand binds to the protein, for example to see whether or not a ligand binds to the target protein? Many thanks
@ucsfchimerax8387 1 year ago
Most proteins function as part of multi-protein complexes, so it is useful to predict complexes of more than one protein. AlphaFold cannot make predictions with ligands, ions, solvent, or nucleic acids. It only handles the 20 standard (unmodified) amino acids.
@lokkaf3526 5 months ago
I want to predict my vaccine structure. Do I also put the adjuvant sequence or just the epitope protein from MHC 1 AND 2? Any response is appreciated.
@harisjan6047 9 months ago
My protein sequence is over 3000 amino acids long and its full PDB structure is not available. Which server or tool would you recommend I use? Please reply.
@jiaqiaozhou9676 1 year ago
Thanks for the tutorial. I wonder if in current version of ChimeraX, the Tools - Structure Prediction - Alphafold will directly run colabfold instead of the original Alphafold.
@ucsfchimerax8387 1 year ago
That is the message of this video. ChimeraX has used ColabFold since mid July. ColabFold is used by ChimeraX versions 1.3 and 1.4 and more recent versions.
@user-dx7nk6bl9s 1 year ago
What is use_amber among the functions of Colabfold? And in the pseudocode "to specify inter-protein chainbreaks for modeling complexes", does chainbreaks mean chain damage such as SS break?
@ucsfchimerax8387 1 year ago
The ChimeraX AlphaFold user interface does not have a use_amber flag, but I guess you are looking at ColabFold code. That flag probably means to energy-minimize the predicted structures with the Amber force field using OpenMM. The ChimeraX option to enable that is in the AlphaFold panel: press Options; it is called "Energy-minimize predicted structures" and is off by default.
@zhenwang323 1 year ago
Thank you for the tutorial. I wonder how I can display the protein by prediction confidence. I just displayed it by chain and I wanted to reset it to displaying by prediction confidence.
@ucsfchimerax8387 1 year ago
The AlphaFold pLDDT per-residue confidence values are in the bfactor column of the PDB file. You can color the ribbon by confidence values using ChimeraX command "color bfactor palette alphafold". Also the AlphaFold Error Plot panel (menu Tools / Structure Prediction) that shows the predicted aligned error has a button "Color pLDDT" that does the same thing.
@ucsfchimerax8387 1 year ago
More info about coloring by PAE and pLDDT is here: www.rbvi.ucsf.edu/chimerax/data/pae-apr2022/pae.html
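Since the replies above say the per-residue pLDDT values are stored in the B-factor column of the predicted PDB file, they can also be read outside ChimeraX. The following is a minimal sketch, assuming the standard fixed-column PDB ATOM record layout (B-factor in columns 61-66). The function names and the `atom_line` demo helper are hypothetical, and the confidence bands follow the convention used by the AlphaFold Database, which is not stated in this thread.

```python
def plddt_per_residue(pdb_text):
    """Return {(chain_id, residue_number): pLDDT} read from the B-factor column.

    Only C-alpha atoms are used, so each residue contributes one value.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":      # atom name, columns 13-16
            continue
        chain_id = line[21]                  # chain identifier, column 22
        res_num = int(line[22:26])           # residue number, columns 23-26
        scores[(chain_id, res_num)] = float(line[60:66])  # B-factor, columns 61-66
    return scores


def confidence_band(plddt):
    """Map a pLDDT value to the AlphaFold Database confidence bands (assumed convention)."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def atom_line(serial, name, res_name, chain, res_num, b_factor):
    """Demo-only helper: build a fixed-column ATOM record with dummy coordinates."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_num:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}")


if __name__ == "__main__":
    demo_pdb = "\n".join([
        atom_line(1, "N", "MET", "A", 1, 92.5),
        atom_line(2, "CA", "MET", "A", 1, 92.5),
        atom_line(3, "CA", "GLY", "A", 2, 45.0),
    ])
    for (chain, number), score in plddt_per_residue(demo_pdb).items():
        print(chain, number, score, confidence_band(score))
```

For a real prediction, read the downloaded PDB file into a string and pass it to `plddt_per_residue`.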
@user-hn1qm4ui4x 1 year ago
Thanks for this video. I wonder which version of ChimeraX is used in this video.
@ucsfchimerax8387 1 year ago
The first line of the ChimeraX Log panel in the video shows the version is an August 15, 2022 daily build. But all ChimeraX versions 1.4 and newer are using ColabFold. ChimeraX loads the Python script from GitHub (github.com/RBVI/ChimeraX/blob/develop/src/bundles/alphafold/src/alphafold21_predict_colab.ipynb) each time you run an AlphaFold prediction and that script is periodically updated and all ChimeraX versions 1.4 and newer use it.
@shyamabhatt8780 1 year ago
Hi, is there any way to pinpoint the specific residues at which the proteins are interacting?
@ucsfchimerax8387 1 year ago
If you predict a multiprotein complex the interface between the proteins often contains tens to hundreds of residues. Lots of ChimeraX tools help you look at that, such as the Contacts tool and Interfaces tool. There is documentation online.
@drgul248 1 year ago
There are many Colab notebooks. Which one is best for predicting protein complexes? And which Google Colab notebook is used by ChimeraX?
@ucsfchimerax8387 1 year ago
ChimeraX uses ColabFold which is an optimized version of AlphaFold that is about 10 times faster. It also uses enhanced sequence databases. This is described in the ChimeraX documentation, google search chimerax alphafold and you will find it including a reference to the ColabFold journal article.
@kabirbiswas7129 1 year ago
I was trying to predict a dimer of a relatively long protein (875 aa long) but I am getting the following error:
19:39:12 Could not predict af1750. Not Enough GPU memory? INTERNAL: CUBLAS_STATUS_EXECUTION_FAILED
19:39:12 Done Downloading structure predictions to directory Downloads/ChimeraX/AlphaFold
cp: cannot stat '*_relaxed_rank_1_model_*.pdb': No such file or directory
cp: cannot stat '*_unrelaxed_rank_1_model_*_scores.json': No such file or directory
Is there a way to resolve this error? I am using a Google Colab subscription. Thanks.
@ucsfchimerax8387 1 year ago
You are predicting for 1750 residues, which is too large. As the error message says, there is not enough GPU memory. Unfortunately old Google Colab GPUs (16 GB memory) often run out of memory above around 1000 amino acids. Running these larger structures requires installing your own AlphaFold on a Linux machine with a high-end GPU, which is not an easy task. Here are some benchmarks for large sequences running AlphaFold: www.rbvi.ucsf.edu/chimerax/data/alphafold-jan2022/afspeed.html
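The arithmetic behind this reply can be checked before submitting a job: the total length is the sum over all chains, so a dimer of an 875-residue protein counts as 1750 residues. Below is a minimal sketch of that check. The constant and function names are hypothetical, and the threshold of about 1000 residues is only the rough figure quoted in the reply for 16 GB Colab GPUs, not a hard limit.

```python
# Rough figure quoted above for 16 GB Google Colab GPUs; an approximation, not a hard limit.
APPROX_COLAB_RESIDUE_LIMIT = 1000


def total_residues(sequences):
    """Total number of residues across all chains of a complex."""
    return sum(len(seq) for seq in sequences)


def likely_fits_on_colab(sequences, limit=APPROX_COLAB_RESIDUE_LIMIT):
    """Return True if the summed chain lengths are at or below the rough limit."""
    return total_residues(sequences) <= limit


if __name__ == "__main__":
    monomer = "M" * 875              # placeholder for an 875-residue sequence
    dimer = [monomer, monomer]       # a homodimer counts both copies
    print(total_residues(dimer))     # 1750
    print(likely_fits_on_colab(dimer))
```

Passing this check does not guarantee success, since the reply only says that runs often fail above roughly 1000 amino acids.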
@ArpitaDas-ss8bz 1 year ago
Hi, can you please help me with predictions for modified sequences? For example, I have a protein that has an acetyl group. How do I input that in the sequence to predict a heterodimeric structure?
@ucsfchimerax8387 1 year ago
AlphaFold only predicts structures containing the 20 standard amino acids.
@oronoko 1 year ago
I'm running ChimeraX version 1.5 (2022-11-24), but it launches alphafold21_predict_colab.ipynb as in your other video ("Running AlphaFold to Predict Protein Complexes from ChimeraX") and not colabfold_predict.ipynb as shown here. Do you know how to change this? Thanks!
@ucsfchimerax8387 1 year ago
All versions of ChimeraX use the same version of ColabFold. The alphafold21_predict_colab.ipynb notebook that runs the calculation on Google Colab is identical to colabfold_predict.ipynb. Also ChimeraX fetches this script from GitHub for each prediction which allows me to periodically update the script to newer AlphaFold / ColabFold versions. In the next few months I will hopefully be updating to ColabFold that uses AlphaFold 2.3 which is more memory efficient allowing larger structures to be predicted.
@yogeshbudhathoki6630 1 year ago
What is the maximum sequence length (number of residues) it can handle?
@ucsfchimerax8387 1 year ago
Google Colab only offers old GPUs with at most 16 GB of memory, which can handle a total sequence length of about 1000 residues. More details here: www.rbvi.ucsf.edu/pipermail/chimerax-users/2022-October/004507.html
@nengyaogoh1795 1 year ago
Hi, is there any way to predict a homodimer protein?
@ucsfchimerax8387 1 year ago
Yes, paste two copies of the sequence into the AlphaFold Panel separated by a comma and press the Predict button.
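The comma-separated text described in this reply can be prepared with a few lines of code, which also helps catch stray whitespace or non-standard residues before pasting. This is only a sketch: the function names are hypothetical, it does not call ChimeraX, and the restriction to the 20 standard amino acids comes from the replies elsewhere in this thread.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes, 20 residues


def clean_sequence(raw):
    """Remove whitespace, uppercase, and reject non-standard residue codes."""
    seq = "".join(raw.split()).upper()
    unexpected = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"non-standard residue codes: {''.join(unexpected)}")
    return seq


def multimer_input(sequences):
    """Join chain sequences with commas, the form pasted into the AlphaFold panel."""
    return ",".join(clean_sequence(seq) for seq in sequences)


def homodimer_input(sequence):
    """Two copies of the same sequence separated by a comma."""
    return multimer_input([sequence, sequence])


if __name__ == "__main__":
    # Short made-up sequence, used only to show the output format.
    print(homodimer_input("mkt ayi\nakq"))
```

The same `multimer_input` helper covers heterodimers and larger complexes by passing a list of different sequences.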
@nengyaogoh1795 1 year ago
@ucsfchimerax8387 thank you!
@gabrielgoetten 1 year ago
Many thanks for this update, but I also encountered an error, which seems to have a rather complicated solution. However, before the crash I was able to obtain the unrelaxed model...
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
    393 remove_from_list(seq_list, 'prokaryote') # Obsolete "prokaryote" flag
    394
--> 395 run_prediction(seq_list, use_templates = use_templates, energy_minimize = not dont_minimize)
3 frames
/usr/local/lib/python3.7/dist-packages/alphafold/model/model.py in predict(self, feat, random_seed)
    192
    193 sub_feat["prev"] = result["prev"]
--> 194 result, _ = self.apply(self.params, key, sub_feat)
    195 confidences = get_confidence_metrics(result, multimer_mode=self.multimer_mode)
    196 if self.config.model.stop_at_score_ranker == "plddt":
ValueError: INTERNAL: Failed to launch CUDA kernel: fusion_848 with block dimensions: 96x1x1 and grid dimensions: 22240x1x1: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
@ucsfchimerax8387 1 year ago
I've seen various CUDA_ERROR_ILLEGAL_ADDRESS errors running AlphaFold. CUDA is the language used on the Nvidia graphics processor. I think this error is also usually associated with running sequences that are too long and results from running out of memory, but I don't have good evidence for that. The same error occurs on the PDB 6UM1 test case shown here: www.rbvi.ucsf.edu/chimerax/data/alphafold-jan2022/afspeed.html. If you had a more modern GPU with more memory than what Google Colab offers, it would probably work. Another possibility is that it is a bug. To test that, you could slightly vary your input, for example by deleting one residue at the end of the sequence, and see if the same error occurs.