AlphaFold

From Wikipedia, the free encyclopedia

Figure: Amino acid chains fold to form a protein (three individual polypeptide chains at different levels of folding, and a cluster of chains).

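AlphaFold (Chinese: 阿尔法折叠) is a protein structure prediction program developed by DeepMind, part of Google and its parent company Alphabet.[1] The program is designed as a deep learning system.[2]

The first two major versions of the AlphaFold software are AlphaFold 1 (2018) and AlphaFold 2 (2020). A team using AlphaFold 1 placed first in the overall rankings of the 13th Critical Assessment of protein Structure Prediction (CASP) in December 2018. The program was particularly successful at predicting the most accurate structure for targets rated as the most difficult by the competition organisers, where no existing template structures were available from proteins with a partially similar sequence.

Proteins coil and fold into three-dimensional structures, and a protein's function is determined by its structure. Knowing protein structures helps in developing drugs to treat disease.[3] According to DeepMind, AlphaFold can identify the shape of a protein within days, whereas this previously often took researchers years.[4] In November 2020, at the 14th CASP,[5] AlphaFold 2 achieved a median score of 92.4 out of 100.[6] Its accuracy was far higher than that of any other program.[7]

On 15 July 2021, the AlphaFold 2 paper was published in Nature as an advance access publication, alongside open source software and a searchable database of species proteomes.[8][9][10]

On 8 May 2024, AlphaFold 3 was released. It can predict the structures of complexes formed by proteins with DNA, RNA, various ligands, and ions.[7]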

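Protein folding problem

Proteins consist of chains of amino acids (the primary structure), which fold spontaneously, in a process called protein folding, to form the protein's three-dimensional tertiary structure. This structure is crucial to the protein's biological function. However, understanding how the amino acid sequence determines the tertiary structure is highly challenging, and this is called the "protein folding problem".[11] The problem involves the thermodynamics of the interatomic forces that stabilise the folded structure, the mechanism and pathway by which a protein reaches its final folded state so rapidly, and how the native structure of a protein can be predicted from its amino acid sequence.[12]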

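In the past, protein structures were determined experimentally by techniques such as X-ray crystallography, cryo-electron microscopy, and nuclear magnetic resonance, which are both expensive and time-consuming.[11]

Such efforts over the past 60 years have determined only about 170,000 protein structures, while more than 200 million proteins are known across all life forms.[13]

Being able to predict protein structures from the amino acid sequence alone would greatly help scientific research. However, Levinthal's paradox shows that although a protein can fold within milliseconds, randomly enumerating all possible structures to find the true native structure would take longer than the age of the known universe. This has made protein structure prediction a grand challenge in biology.[11]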
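The scale of Levinthal's paradox can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the chain length (100 residues), the number of conformations per residue (3), and the sampling rate (10^13 conformations per second) are textbook-style assumptions, not figures from the sources cited here, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate value, used only for comparison


def exhaustive_search_years(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Years needed to visit every conformation once (all parameters are assumptions)."""
    conformations = states_per_residue ** n_residues
    return conformations / samples_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(100)
ratio = years / AGE_OF_UNIVERSE_YEARS
print(f"{years:.2e} years, about {ratio:.1e} times the age of the universe")
```

Even with these generous assumptions the exhaustive search is hopeless, which is why real proteins cannot be folding by random search and why prediction methods must exploit structure in the problem.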

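Over the years, researchers have applied many computational methods to the problem of protein structure prediction, but except for small, simple proteins their accuracy has remained far from that of experimental techniques, which has limited their usefulness for research.

CASP was launched in 1994 to challenge the scientific community to produce its best protein structure predictions. By 2016, GDT scores of only about 40 out of 100 could be achieved for the most difficult proteins.[13]

In 2018, AlphaFold entered CASP using artificial intelligence deep learning techniques.[11]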

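Algorithm

DeepMind is known to have trained the program on more than 170,000 proteins from a public repository of protein sequences and structures. The program uses a form of attention network, a deep learning technique that focuses on having the AI algorithm identify parts of a larger problem and then piece them together to obtain the overall solution.[2] The overall training was conducted on the processing power of between 100 and 200 GPUs.[2] Training the system on this hardware took "a few weeks", after which the program needed "a matter of days" to converge for each structure.[14]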

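AlphaFold 1 (2018)

DeepMind's AlphaFold 1 (2018) was built on work developed by various teams in the 2010s. That work examined large databanks of DNA sequences from many different organisms (most of them without known 3D structures) to look for changes at different residues that appeared to be correlated, even though the residues were not adjacent in the main chain. Such correlations suggest that the residues may be physically close to each other even when they are far apart in the sequence, which allows a contact map to be estimated. Building on work prior to 2018, AlphaFold 1 extended this approach to estimate a probability distribution over the distance between residues, turning the contact map into a likely distance map. It also used more advanced learning methods than before to develop the inference. The team combined a statistical potential based on this probability distribution with the calculated local Gibbs free energy of the configuration, and then used gradient descent to find the solution that best fitted both.[clarification needed][15][16]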
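The relationship between a predicted distance distribution and a contact map can be sketched as follows. This is a minimal illustration, not DeepMind's code: the bin edges and probabilities are made-up values for a single residue pair, the 8 Å contact threshold is a commonly used convention assumed here, and both function names are hypothetical.

```python
def contact_probability(bin_edges, bin_probs, threshold=8.0):
    """Probability that the pair is in contact: mass of bins lying entirely below threshold."""
    if len(bin_edges) != len(bin_probs) + 1:
        raise ValueError("expected one more bin edge than bin probabilities")
    return sum(p for upper, p in zip(bin_edges[1:], bin_probs) if upper <= threshold)


def expected_distance(bin_edges, bin_probs):
    """Mean distance under the histogram, using bin centres."""
    centres = [(lo + hi) / 2 for lo, hi in zip(bin_edges, bin_edges[1:])]
    return sum(c * p for c, p in zip(centres, bin_probs))


# Hypothetical distance histogram (in angstroms) for one residue pair.
edges = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
probs = [0.05, 0.15, 0.40, 0.30, 0.10]

print(contact_probability(edges, probs))
print(expected_distance(edges, probs))
```

A contact map keeps only the first number for every residue pair, whereas a distance map keeps the whole histogram, which is the richer signal the text describes.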

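More technically, Torrisi et al. summarised the approach of AlphaFold 1 in 2019 as follows:[17]

Central to AlphaFold is a distance map predictor implemented as a very deep residual neural network with 220 residual blocks processing a representation of dimensionality 64×64×128, corresponding to input features calculated from two 64-amino-acid fragments. Each residual block has three layers, including a 3×3 dilated convolutional layer; the blocks cycle through dilation values of 1, 2, 4, and 8. In total the model has 21 million parameters. The network uses a combination of 1D and 2D inputs, including evolutionary profiles from different sources and co-evolution features. Alongside a distance map in the form of a very finely grained histogram of distances, AlphaFold predicts the Φ and Ψ angles of each residue, which are used to create the initial predicted 3D structure. The AlphaFold authors concluded that the depth of the model, its large crop size, the large training set of roughly 29,000 proteins, modern deep learning techniques, and the richness of information from the predicted histogram of distances helped AlphaFold achieve high contact map prediction precision.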
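To get a feel for what cycling through dilations 1, 2, 4, and 8 over 220 blocks implies, the sketch below computes the dilation schedule and the resulting one-dimensional receptive field. It assumes, for illustration only, that each block contributes exactly one stride-1 3×3 dilated convolution and that the other layers in a block do not widen the receptive field; the function names are hypothetical.

```python
from itertools import cycle, islice


def dilation_schedule(n_blocks, pattern=(1, 2, 4, 8)):
    """Dilation used by each block when blocks cycle through the pattern."""
    return list(islice(cycle(pattern), n_blocks))


def receptive_field(dilations, kernel_size=3):
    """Receptive field along one axis of stacked stride-1 dilated convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


schedule = dilation_schedule(220)
print(schedule[:8])               # [1, 2, 4, 8, 1, 2, 4, 8]
print(receptive_field(schedule))  # 1651
```

Under these assumptions the receptive field is far wider than the 64-residue crop, so every output position can in principle draw on the whole cropped input.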

AlphaFold 2 (2020)

Figure: AlphaFold 2 block design. (Source: [14])

According to the DeepMind team, the 2020 version of the program (AlphaFold 2) differs significantly from the original version that won CASP 13 in 2018.[18][19]

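The DeepMind team noted that its previous approach, which combined local physics with a guide potential derived from pattern recognition, tended to over-account for interactions between residues that were close together in the sequence compared with interactions between residues further apart along the chain. As a result, AlphaFold 1 tended to prefer models with more secondary structure (alpha helices and beta sheets) than was actually the case, a form of overfitting.[20]

The software design used in AlphaFold 1 contained a number of separately trained modules that were used to produce the guide potential, which was then combined with the physics-based energy potential. AlphaFold 2 replaced this with a system of sub-networks coupled together into a single differentiable end-to-end model, based entirely on pattern recognition and trained as a single integrated structure.[19][21] Local physics, in the form of energy refinement based on the AMBER model, is applied only as a final refinement step once the neural network's prediction has converged, and only slightly adjusts the predicted structure.[20]

A key part of the 2020 system is two modules, believed to be based on a Transformer design, which progressively refine a vector of information for each relationship between one amino acid residue of the protein and another (an edge, in graph-theory terms; shown as the green array), and for each relationship between an amino acid position and each of the different sequences in the input sequence alignment (shown as the red array).[21] Internally, these refinement transformations contain layers that bring relevant data together and filter out irrelevant data (the "attention mechanism") in a context-dependent way learned from the training data. The transformations are iterated, with the output of each step becoming the input of the next, and with the sharpened residue/residue information feeding into the update of the residue/sequence information, and vice versa.[21]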
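The idea of "bringing relevant data together and filtering out irrelevant data" can be illustrated with generic scaled dot-product attention. The sketch below is the standard textbook mechanism written in plain Python; it is not AlphaFold 2's actual module, whose details are not given in the sources above, and the toy vectors and function names are invented for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the weighted mix of the value vectors and the attention weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return mixed, weights


# Toy example: the query resembles keys 0 and 2, so value 1 is largely filtered out.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
output, weights = attend([4.0, 0.0], keys, values)
print(weights)
print(output)
```

Iterating such an update, with each step's output becoming the next step's input, corresponds to the repeated refinement described in the paragraph above.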

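The output of these iterations then feeds the structure prediction module,[21] which also uses Transformers[22] and is itself iterated. In an example presented by DeepMind, the structure prediction module achieved the correct topology on its first iteration, with a GDT_TS score of 78, but with a large number (90%) of stereochemical violations, that is, unphysical bond angles or lengths. With subsequent iterations the number of violations fell. By the third iteration the GDT_TS score was approaching 90, and by the eighth iteration the number of violations was approaching zero.[23]

The AlphaFold team stated in November 2020 that they believed AlphaFold could be developed further, with room for additional improvements in accuracy.[18]

The training data was originally restricted to single peptide chains. However, the October 2021 update, AlphaFold-Multimer, included protein complexes in its training data. DeepMind stated that this update succeeded about 70% of the time at accurately predicting protein-protein interactions.[24]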

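Competitions

CASP13

In December 2018, DeepMind's AlphaFold placed first in the overall rankings of the 13th Critical Assessment of Techniques for Protein Structure Prediction (CASP).[25][26]

The program was particularly successful at predicting the most accurate structure for targets rated as the most difficult by the competition organisers, where no existing template structures were available from proteins with a partially similar sequence. AlphaFold gave the best prediction for 25 of the 43 protein targets in this class,[26][27][28] achieving a median score of 58.9 on CASP's global distance test (GDT), ahead of the 52.5 and 52.4 scored by the second- and third-placed teams,[29] which were also using deep learning to estimate contact distances.[30][31] Overall, across all targets, the program achieved a GDT score of 68.5.[32]

In January 2020, implementations and illustrative code of AlphaFold 1 were released as open source on GitHub.[33][11] However, as stated in the "Read Me" file on that website: "This code can't be used to predict structure of an arbitrary protein sequence. It can be used to predict structure only on the CASP13 dataset (links below). The feature generation code is tightly coupled to our internal infrastructure as well as external tools, hence we are unable to open-source it." In practice, therefore, the deposited code is not suitable for general use but only for the CASP13 proteins. As of 5 March 2021, the company had not announced plans to make its code publicly available.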

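CASP14

In November 2020, DeepMind's new version, AlphaFold 2, placed first in the CASP14 competition. The program made the best prediction for 88 of the 97 targets. On the competition's global distance test (GDT) measure of accuracy, AlphaFold 2 achieved a median score of 92.4 (out of 100), meaning that more than half of its predictions scored better than 92.4% for having their atoms in more or less the right place, a level of accuracy considered comparable to experimental techniques such as X-ray crystallography.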
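As a rough illustration of how a GDT_TS-style score is formed, the sketch below averages the fraction of alpha-carbon atoms that lie within 1, 2, 4, and 8 Å of their reference positions. It is a simplification under stated assumptions: the two structures are taken to be already superimposed, whereas the real GDT procedure searches over many superpositions, and the coordinates and function name are invented for the example.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0 to 100) for pre-superimposed C-alpha coordinates."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")
    deviations = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in deviations) / len(deviations) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical 8-residue chain; each predicted atom is displaced along y.
reference = [(3.8 * i, 0.0, 0.0) for i in range(8)]
offsets = [0.5, 0.9, 1.5, 3.0, 5.0, 9.0, 0.2, 1.1]
predicted = [(x, y + dy, z) for (x, y, z), dy in zip(reference, offsets)]

print(gdt_ts(predicted, reference))  # 65.625
```

In this toy case 3, 5, 6, and 7 of the 8 atoms fall within the four cutoffs, giving (37.5 + 62.5 + 75 + 87.5) / 4 = 65.625.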

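AlphaFold 2's performance in this competition far exceeded that of AlphaFold 1 in 2018, when only two predictions reached the same level of accuracy. 88% of predictions had a GDT_TS score above 80, and on the group of targets classed as the most difficult AlphaFold 2 achieved a median score of 87.

AlphaFold 2 also performed well as measured by the root-mean-square deviation (RMSD) of the positions of the alpha-carbon atoms of the protein backbone: 88% of predictions had an RMSD below 4 Å, 76% below 3 Å, and 46% below 2 Å.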
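The RMSD quoted above is the square root of the mean squared distance between matching atoms. The sketch below shows the calculation for alpha-carbon coordinates, again assuming the two structures have already been optimally superimposed (the superposition step itself is omitted); the coordinates and function name are illustrative.

```python
import math


def ca_rmsd(predicted, reference):
    """RMSD in angstroms between two pre-superimposed sets of C-alpha coordinates."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")
    squared = [math.dist(p, r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical 4-residue example with per-atom deviations of 1, 2, 2 and 1 angstroms.
reference = [(3.8 * i, 0.0, 0.0) for i in range(4)]
predicted = [(x, y + dy, z) for (x, y, z), dy in zip(reference, [1.0, 2.0, 2.0, 1.0])]

print(round(ca_rmsd(predicted, reference), 3))
```

Because deviations are squared before averaging, a few badly placed atoms raise the RMSD more than they lower a GDT-style score, which is one reason CASP reports both.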

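The accuracy of AlphaFold 2's models was described as exceptionally good, particularly in the modelling of surface side chains. To verify AlphaFold 2 further, the competition organisers invited four leading experimental groups to test it on proteins whose structures they had been unable to determine. The three-dimensional models produced by AlphaFold 2 were accurate enough to be used to determine these structures by molecular replacement.

Nevertheless, AlphaFold 2 performed relatively poorly on three structures. Two of them were obtained by protein NMR (nuclear magnetic resonance), which determines structures directly in aqueous solution, whereas AlphaFold was mostly trained on X-ray crystallography data. The third was a multi-domain complex consisting of 52 identical copies of the same domain, a situation AlphaFold was not designed to handle. For all single-domain targets, apart from one very large protein and the two NMR structures, AlphaFold 2 achieved a GDT_TS score above 80.

Responses

AlphaFold 2 scoring more than 90 in CASP's global distance test (GDT) is considered a significant achievement in computational biology[13] and major progress towards a decades-old grand challenge of biology. Nobel Prize in Chemistry laureate and structural biologist Venki Ramakrishnan called the result "a stunning advance on the protein folding problem",[13] adding: "It has occurred decades before many people in the field would have predicted. It will be exciting to see the many ways in which it will fundamentally change biological research."[14]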

Propelled by press releases from CASP and DeepMind,[34][14] AlphaFold 2's success received wide media attention.[35] As well as news pieces in the specialist science press, such as Nature,[36] Science,[13] MIT Technology Review,[2] and New Scientist,[37][38] the story was widely covered by major national newspapers,[39][40][41][42] as well as general news-services and weekly publications, such as Fortune,[43][19] The Economist,[18] Bloomberg,[32] Der Spiegel,[44] and The Spectator.[45] In London The Times made the story its front-page photo lead, with two further pages of inside coverage and an editorial.[46][47] A frequent theme was that ability to predict protein structures accurately based on the constituent amino acid sequence is expected to have a wide variety of benefits in the life sciences space including accelerating advanced drug discovery and enabling better understanding of diseases.[36][48] Writing about the event, the MIT Technology Review noted that the AI had "solved a fifty-year old grand challenge of biology."[2] The same article went on to note that the AI algorithm could "predict the shape of proteins to within the width of an atom."[2]

As summed up by Der Spiegel, reservations about this coverage have focussed on two main areas: "There is still a lot to be done" and "We don't even know how they do it".[49]

Although a 30-minute presentation about AlphaFold 2 was given on the second day of the CASP conference (December 1) by project leader John Jumper,[50] it has been described as "exceedingly high-level, heavy on ideas and insinuations, but almost entirely devoid of detail".[7] Unlike other research groups presenting at CASP14, DeepMind's presentation was not recorded and is not publicly available. DeepMind is expected to publish a scientific paper giving an account of AlphaFold 2 in the proceedings volume[when?] of the CASP conference, but it is not known whether it will go beyond what was said in the presentation.

Speaking to El País, researcher Alfonso Valencia said "The most important thing that this advance leaves us is knowing that this problem has a solution, that it is possible to solve it... We only know the result. Google does not provide the software and this is the frustrating part of the achievement because it will not directly benefit science."[42] Nevertheless, as much as Google and DeepMind do release may help other teams develop similar AI systems, an "indirect" benefit.[42] In late 2019 DeepMind released much of the code of the first version of AlphaFold as open source; but only when work was well underway on the much more radical AlphaFold 2. Another option it could take might be to make AlphaFold 2 structure prediction available as an online black-box subscription service. Convergence for a single sequence has been estimated to require on the order of $10,000 worth of wholesale compute time.[51] But this would deny researchers access to the internal states of the system, the chance to learn more qualitatively what gives rise to AlphaFold 2's success, and the potential for new algorithms that could be lighter and more efficient yet still achieve such results. Fears of potential for a lack of transparency by DeepMind have been contrasted with five decades of heavy public investment into the open Protein Data Bank and then also into open DNA sequence repositories, without which the data to train AlphaFold 2 would not have existed.[52][53][54]

Of note, on June 18th, 2021 Demis Hassabis tweeted: "Brief update on some exciting progress on #AlphaFold! We’ve been heads down working flat out on our full methods paper (currently under review) with accompanying open source code and on providing broad free access to AlphaFold for the scientific community. More very soon!"[55]

However it is not yet clear to what extent structure predictions made by AlphaFold 2 will hold up for proteins bound into complexes with other proteins and other molecules.[56] This was not a part of the CASP competition which AlphaFold entered, and not an eventuality it was internally designed to expect. Where structures that AlphaFold 2 did predict were for proteins that had strong interactions either with other copies of themselves, or with other structures, these were the cases where AlphaFold 2's predictions tended to be least refined and least reliable. As a large fraction of the most important biological machines in a cell comprise such complexes, or relate to how protein structures become modified when in contact with other molecules, this is an area that will continue to be the focus of considerable experimental attention.[56]

With so little yet known about the internal patterns that AlphaFold 2 learns to make its predictions, it is not yet clear to what extent the program may be impaired in its ability to identify novel folds, if such folds are not well represented in the existing protein structures known in structure databases.[57][56] It is also not well known to what extent the protein structures in such databases, overwhelmingly those of proteins that it has been possible to crystallise for X-ray analysis, are representative of typical proteins that have not yet been crystallised. It is also unclear how representative the frozen protein structures in crystals are of the dynamic structures found in cells in vivo. AlphaFold 2's difficulties with structures obtained by protein NMR methods may not be a good sign.

On its potential as a tool for drug discovery, Stephen Curry notes that while the resolution of AlphaFold 2's structures may be very good, the accuracy with which binding sites are modelled needs to be even higher: typically molecular docking studies require the atomic positions to be accurate within a 0.3 Å margin, but the predicted protein structures only have at best an RMSD of 0.9 Å for all atoms. So AlphaFold 2's structures may only be a limited help in such contexts.[57][56] Moreover, according to Science columnist Derek Lowe, because the prediction of small-molecule binding even then is still not very good, computational prediction of drug targets is simply not in a position to take over as the "backbone" of corporate drug discovery, so "protein structure determination simply isn’t a rate-limiting step in drug discovery in general".[58] It has also been noted that even with a structure for a protein, understanding how it functions, what it does, and how that fits within wider biological processes can still be very challenging.[59] Nevertheless, better knowledge of protein structure could lead to better understanding of individual disease mechanisms and ultimately to better drug targets, or to better understanding of the differences between human and animal models, and so eventually to improvements.[60]

Also, because AlphaFold processes protein-only sequences by design, other associated biomolecules are not considered. On the impact of absent metals, co-factors and, most visibly, co- and post-translational modifications such as protein glycosylation from AlphaFold models, Elisa Fadda (Maynooth University, Ireland) and Jon Agirre (University of York, UK) highlighted the need for scientists to check databases such as UniProt-KB for likely missing components, as these can play an important role not just in folding but in protein function.[61] However, the authors highlighted that many AlphaFold models were accurate enough to allow for the introduction of post-predictional modifications.[61]

Finally, some have noted that even a perfect answer to the protein prediction problem would still leave questions about the protein folding problem: understanding in detail how the folding process actually occurs in nature, and how proteins can sometimes misfold.[62]

But even with such caveats, AlphaFold 2 was described as a huge technical step forward and intellectual achievement.[63][64]

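AlphaFold Protein Structure Database

The AlphaFold Protein Structure Database was launched on 22 July 2021 as a joint effort between AlphaFold and EMBL-EBI, the European Bioinformatics Institute of the European Molecular Biology Laboratory. It provides open access to more than 200 million protein structure predictions in order to accelerate scientific research. At launch, the database contained AlphaFold-predicted structure models for the nearly complete UniProt proteomes of humans and 20 model organisms, amounting to over 365,000 proteins. The database does not include proteins with fewer than 16 or more than 2,700 amino acid residues,[65] although for humans such proteins are available in a downloadable file.[66]

The stated goal is to cover most of the roughly 100 million proteins in UniRef90. As of 15 May 2022, 992,316 predictions were available.[67]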

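Applications

AlphaFold has been used to predict the structures of proteins of SARS-CoV-2, the causative agent of COVID-19. The structures of these proteins were still awaiting experimental determination in early 2020.[68] Before the results were released to the wider research community, they were examined by scientists at the Francis Crick Institute in the United Kingdom. The team also confirmed an accurate prediction against the experimentally determined SARS-CoV-2 spike protein, which had been shared in the Protein Data Bank, an international open-access database, before releasing the computationally determined structures of the under-studied protein molecules.[69]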

References

  1. AlphaFold. DeepMind. Retrieved 2020-11-30. Archived from the original on 2021-01-19.
  2. DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology. MIT Technology Review. Retrieved 2020-11-30. Archived from the original on 2021-08-28.
  3. DeepMind称AI能精确预测蛋白折叠 将加速药物设计 [DeepMind says AI can accurately predict protein folding and will accelerate drug design]. 第一财经 (Yicai) (in Chinese).
  4. DeepMind宣布能够预测蛋白质结构 [DeepMind announces it can predict protein structures]. 金融时报中文网 (FT Chinese) (in Chinese). Retrieved 2020-12-03. Archived from the original on 2020-12-22.
  5. Shead, Sam. DeepMind solves 50-year-old 'grand challenge' with protein folding A.I. CNBC. 2020-11-30. Retrieved 2020-11-30. Archived from the original on 2021-01-28.
  6. “阿尔法折叠”精准预测蛋白质三维结构 ["AlphaFold" accurately predicts the three-dimensional structure of proteins]. 科技日报 (Science and Technology Daily) (in Chinese). Retrieved 2020-12-03. Archived from the original on 2020-12-05.
  7. DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology. MIT Technology Review. Retrieved 2020-11-30. Archived from the original on 2021-08-28.
  8. Jumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ; Žídek, Augustin; Potapenko, Anna; Bridgland, Alex; Meyer, Clemens; Kohl, Simon A A; Ballard, Andrew J; Cowie, Andrew; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Adler, Jonas; Back, Trevor; Petersen, Stig; Reiman, David; Clancy, Ellen; Zielinski, Michal; Steinegger, Martin; Pacholska, Michalina; Berghammer, Tamas; Bodenstein, Sebastian; Silver, David; Vinyals, Oriol; Senior, Andrew W; Kavukcuoglu, Koray; Kohli, Pushmeet; Hassabis, Demis. Highly accurate protein structure prediction with AlphaFold. Nature. 2021-07-15, 596 (7873): 583–589. PMC 8371605. PMID 34265844. doi:10.1038/s41586-021-03819-2.
  9. GitHub - deepmind/alphafold: Open source code for AlphaFold. GitHub. Retrieved 2021-07-24. Archived from the original on 2021-07-23.
  10. AlphaFold Protein Structure Database. alphafold.ebi.ac.uk. Retrieved 2021-07-24. Archived from the original on 2021-07-24.
  11. AlphaFold: Using AI for scientific discovery. DeepMind. Retrieved 2020-11-30. Archived from the original on 2022-03-07.
  12. Ken A. Dill, S. Banu Ozkan, M. Scott Shell, and Thomas R. Weikl. The Protein Folding Problem. Annual Review of Biophysics. 2008, 37: 289–316. PMC 2443096. PMID 18573083. doi:10.1146/annurev.biophys.37.092707.153558.
  13. Robert F. Service, 'The game has changed.' AI triumphs at solving protein structures, Science, 30 November 2020.
  14. Citation error: no content was provided for the reference named DeepMindAlpha2.
  15. Mohammed AlQuraishi (May 2019), AlphaFold at CASP13, Bioinformatics 35(22), 4862–4865. doi:10.1093/bioinformatics/btz422. See also Mohammed AlQuraishi (9 December 2018), AlphaFold @ CASP13: "What just happened?" (blog post).
    Mohammed AlQuraishi (15 January 2020), A watershed moment for protein structure prediction, Nature 577, 627–628. doi:10.1038/d41586-019-03951-0
  16. AlphaFold: Machine learning for protein structure prediction, Foldit, 31 January 2020.
  17. Torrisi, Mirko et al. (22 January 2020), Deep learning methods in protein structure prediction, Computational and Structural Biotechnology Journal, vol. 18, 1301–1310. doi:10.1016/j.csbj.2019.12.011 (CC-BY-4.0)
  18. Citation error: no content was provided for the reference named economist20201130.
  19. Jeremy Kahn, Lessons from DeepMind's breakthrough in protein-folding A.I., Fortune, 1 December 2020.
  20. John Jumper et al., conference abstract (December 2020).
  21. See block diagram. See also John Jumper et al. (1 December 2020), AlphaFold 2 presentation, p. 10.
  22. The structure module is stated to use a "3D equivariant Transformer architecture" (John Jumper et al. (1 December 2020), AlphaFold 2 presentation, p. 12).
    One design for a Transformer network with SE(3) equivariance was proposed by Fabian Fuchs et al., SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks, NeurIPS 2020; see also its website. It is not yet known how similar this design is to the one used in AlphaFold.
    See also the blog post by AlQuraishi or the more detailed post by Fabian Fuchs.
  23. John Jumper et al. (1 December 2020), AlphaFold 2 presentation, pp. 12–20.
  24. Callaway, Ewen. What's next for AlphaFold and the AI protein-folding revolution. Nature. 2022-04-13, 604 (7905): 234–238. Retrieved 2022-04-15. doi:10.1038/d41586-022-00997-5. Archived from the original on 2022-07-26.
  25. Group performance based on combined z-scores, CASP 13, December 2018. (AlphaFold = Team 043: A7D)
  26. Sample, Ian. Google's DeepMind predicts 3D shapes of proteins. The Guardian. 2018-12-02. Retrieved 2020-11-30. Archived from the original on 2019-07-18.
  27. AlphaFold: Using AI for scientific discovery. DeepMind. Retrieved 2020-11-30. Archived from the original on 2024-10-10.
  28. Singh, Arunima. Deep learning 3D structures. Nature Methods. 2020, 17 (3): 249. ISSN 1548-7105. PMID 32132733. S2CID 212403708. doi:10.1038/s41592-020-0779-y.
  29. CASP 13 data tables for 043 A7D, 322 Zhang, and 089 MULTICOM.
  30. Wei Zheng et al., Deep-learning contact-map guided protein structure prediction in CASP13, Proteins: Structure, Function, and Bioinformatics, 87(12) 1149–1164. doi:10.1002/prot.25792; and slides.
  31. Hou, Jie; Wu, Tianqi; Cao, Renzhi; Cheng, Jianlin. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins: Structure, Function, and Bioinformatics (Wiley). 2019-04-25, 87 (12): 1165–1178. ISSN 0887-3585. PMC 6800999. PMID 30985027. bioRxiv 10.1101/552422. doi:10.1002/prot.25697.
  32. DeepMind Breakthrough Helps to Solve How Diseases Invade Cells. Bloomberg.com. 2020-11-30. Retrieved 2020-11-30. Archived from the original on 2022-04-05.
  33. deepmind/deepmind-research. GitHub. Retrieved 2020-11-30. Archived from the original on 2022-02-01.
  34. Artificial intelligence solution to a 50-year-old science challenge could ‘revolutionise’ medical research (press release), CASP organising committee, 30 November 2020.
  35. Brigitte Nerlich, Protein folding and science communication: Between hype and humility, University of Nottingham blog, 4 December 2020.
  36. Callaway, Ewen. 'It will change everything': DeepMind's AI makes gigantic leap in solving protein structures. Nature. 2020-11-30, 588 (7837): 203–204. Bibcode:2020Natur.588..203C. PMID 33257889. doi:10.1038/d41586-020-03348-4.
  37. Michael Le Page, DeepMind's AI biologist can decipher secrets of the machinery of life, New Scientist, 30 November 2020.
  38. The predictions of DeepMind’s latest AI could revolutionise medicine, New Scientist, 2 December 2020.
  39. Cade Metz, London A.I. Lab Claims Breakthrough That Could Accelerate Drug Discovery, New York Times, 30 November 2020.
  40. Ian Sample, DeepMind AI cracks 50-year-old problem of protein folding, The Guardian, 30 November 2020.
  41. Lizzie Roberts, 'Once in a generation advance' as Google AI researchers crack 50-year-old biological challenge, Daily Telegraph, 30 November 2020.
  42. Nuño Dominguez, La inteligencia artificial arrasa en uno de los problemas más importantes de la biología (Artificial intelligence takes out one of the most important problems in biology), El País, 2 December 2020.
  43. Jeremy Kahn, In a major scientific breakthrough, A.I. predicts the exact shape of proteins, Fortune, 30 November 2020.
  44. Julia Merlot, Forscher hoffen auf Durchbruch für die Medikamentenforschung (Researchers hope for a breakthrough for drug research), Der Spiegel, 2 December 2020.
  45. Bissan Al-Lazikani, The solving of a biological mystery, The Spectator, 1 December 2020.
  46. Tom Whipple, "Deepmind computer solves new puzzle: life", The Times, 1 December 2020. Front page image, via Twitter.
  47. Tom Whipple, Deepmind finds biology’s ‘holy grail’ with answer to protein problem, The Times (online), 30 November 2020.
    In all, science editor Tom Whipple wrote six articles on the subject for The Times on the day the news broke (thread).
  48. Tim Hubbard, The secret of life, part 2: the solution of the protein folding problem, medium.com, 30 November 2020.
  49. Christian Stöcker, Google greift nach dem Leben selbst (Google is reaching for life itself), Der Spiegel, 6 December 2020.
  50. John Jumper et al. (1 December 2020), AlphaFold 2. Presentation given at CASP 14.
  51. Carlos Outeiral, CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics, Oxford Protein Informatics Group (3 December).
  52. Aled Edwards, The AlphaFold2 success: It took a village, via medium.com, 5 December 2020.
  53. David Briggs, If Google’s Alphafold2 really has solved the protein folding problem, they need to show their working, The Skeptic, 4 December 2020.
  54. The Guardian view on DeepMind’s brain: the shape of things to come, The Guardian, 6 December 2020.
  55. Demis Hassabis, "Brief update on some exciting progress on #AlphaFold!" (tweet), via Twitter, 18 June 2021.
  56. Tom Ireland, How will AlphaFold change bioscience research?, The Biologist, 4 December 2020.
  57. Stephen Curry, No, DeepMind has not solved protein folding, Reciprocal Space (blog), 2 December 2020.
  58. Derek Lowe, In the Pipeline: What’s Crucial And What Isn’t, Science Translational Medicine, 25 September 2019.
  59. Philip Ball, Behind the Screens of AlphaFold, Chemistry World, 9 December 2020. See also tweets, 1 December.
  60. Derek Lowe, In the Pipeline: The Big Problems, Science Translational Medicine, 1 December 2020.
  61. Bagdonas, Haroldas; Fogarty, Carl A.; Fadda, Elisa; Agirre, Jon. The case for post-predictional modifications in the AlphaFold Protein Structure Database. Nature Structural & Molecular Biology. 2021-10-29, 28 (11): 869–870. Retrieved 2022-07-29. ISSN 1545-9985. PMID 34716446. S2CID 240228913. doi:10.1038/s41594-021-00680-9. Archived from the original on 2022-06-23.
  62. e.g. Greg Bowman, Protein folding and related problems remain unsolved despite AlphaFold's advance, Folding@home blog, 8 December 2020.
  63. Cristina Sáez, El último avance fundamental de la biología se basa en la investigación de un científico español, La Vanguardia, 2 December 2020. (Alfonso Valencia overall view)
  64. Zero Gravitas and Jacky Liang, DeepMind’s AlphaFold 2—An Impressive Advance With Hyperbolic Coverage, Skynet today (blog), Stanford, 9 December 2020.
  65. AlphaFold Protein Structure Database. alphafold.ebi.ac.uk. Retrieved 2021-07-29. Archived from the original on 2022-07-29.
  66. AlphaFold Protein Structure Database. alphafold.ebi.ac.uk. Retrieved 2021-07-27. Archived from the original on 2022-07-29.
  67. AlphaFold Protein Structure Database. www.alphafold.ebi.ac.uk. Retrieved 2022-07-29. Archived from the original on 2022-08-02.
  68. AI Can Help Scientists Find a Covid-19 Vaccine. Wired. Retrieved 2020-12-01. ISSN 1059-1028. Archived from the original on 2022-04-23.
  69. Computational predictions of protein structures associated with COVID-19. DeepMind. Retrieved 2020-12-01. Archived from the original on 2022-03-25.
