In 2018, the Parker Institute for Cancer Immunotherapy (PICI) partnered with Cognitect to harmonize multi-omic biological data from a wide range of vendors, labs, and partners into a unified schema alongside patient clinical data. As a small informatics team with an ambitious mission, we needed a higher-order approach, so we built Pret to model ETL as compilation: it transforms an edn map, which specifies how arbitrary tables relate to our Datomic schema, into transaction data. This talk details our experience using CANDEL in anger for four years, during which it powered peer-reviewed science published in journals such as Nature and Cell and enabled us to adapt to changes in data and understanding without writing new code. Nothing in the implementation is aware of the platform’s original scientific purpose, and we believe it is broadly useful for ETL-shaped problems in many domains. PICI has made CANDEL open source and available for any organization to use, learn from, or contribute to.
Jul 17, 2024