
Designing DNA With Tunable Regulatory Activity Using Discrete Diffusion | Anirban Sarkar & Peter Koo 


Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg
Engineering regulatory DNA sequences with precise activity levels in specific cell types holds immense potential for medicine and biotechnology. However, the vast combinatorial space of possible sequences and the complex regulatory grammars governing gene regulation have proven challenging for existing approaches. Supervised deep learning models that score sequences proposed by local search algorithms ignore the global structure of functional sequence space. While diffusion-based generative models have shown promise in learning these distributions, their application to regulatory DNA has been limited. Evaluating the quality of generated sequences also remains challenging due to the lack of a unified framework that characterizes key properties of regulatory DNA. Here we introduce DNA Discrete Diffusion (D3), a generative framework for conditionally sampling regulatory sequences with targeted functional activity levels. We develop a comprehensive suite of evaluation metrics that assess the functional similarity, sequence similarity, and regulatory composition of generated sequences. Through benchmarking on three high-quality functional genomics datasets spanning human promoters and fly enhancers, we demonstrate that D3 outperforms existing methods in capturing the diversity of cis-regulatory grammars and generating sequences that more accurately reflect the properties of genomic regulatory DNA. Furthermore, we show that D3-generated sequences can effectively augment supervised models and improve their predictive performance, even in data-limited scenarios.
Paper link: www.biorxiv.org/content/10.1101/2024.05.23.595630
Speakers: Anirban Sarkar & Peter Koo
Twitter Hannes: HannesStaerk
Twitter Dominique: dom_beaini
Chapters
00:00 - Intro + Background
14:00 - Continuous Time Framework
21:14 - Concrete Score
28:38 - Learning Concrete Scores
32:11 - Training Process
35:56 - Sampling Process
36:53 - Evaluation
55:49 - Conclusions
59:34 - Q+A

Science

Published: 8 Jul 2024
