The use of Active Machine Learning for Protospacer-Adjacent Motif recovery in Class 2 CRISPR-Cas systems
Authors
Kirillov, Bogdan
Vasileva, Aleksandra
Fedorov, Oleg
Panov, Maxim
Severinov, Konstantin
Other contributors
Morić, Ivana
Đorđević, Valentina
Conference contribution (Published version)
© 2023 Institute of Molecular Genetics and Genetic Engineering, University of Belgrade
Abstract
The recognition of target DNA sequences during the interference phase of prokaryotic
CRISPR-Cas immunity relies on Protospacer-Adjacent Motif (PAM) sequences, specific
for each Cas effector. PAM identification is a laborious and time-consuming process
that requires multiple stages, including in vitro and in vivo cleavage assays followed by
Next Generation Sequencing of targets that withstood cleavage. Determining the PAM is
an essential step in characterising any novel Cas9 ortholog and determines the
likelihood of its potential use. This study investigates the potential of machine learning
to predict PAM sequences for a given Cas9 ortholog based on the results of cleavage
experiments and employing an Active Learning process akin to Reinforcement Learning
with Human Feedback. Machine learning-facilitated PAM identification would streamline
and accelerate existing pipelines for describing novel Cas proteins. We demonstrate that
simple models with a small amount of data are sufficient for confident PAM predictions
when training is effectively orchestrated.
Keywords:
bioinformatics / CRISPR / machine learning
Source:
4th Belgrade Bioinformatics Conference, 2023, 4, 40-40
Publisher:
- Belgrade : Institute of molecular genetics and genetic engineering
Note:
- Book of abstract: 4th Belgrade Bioinformatics Conference, June 19-23, 2023
Collections
Institution/group
Institut za molekularnu genetiku i genetičko inženjerstvo
TY  - CONF
AU  - Kirillov, Bogdan
AU  - Vasileva, Aleksandra
AU  - Fedorov, Oleg
AU  - Panov, Maxim
AU  - Severinov, Konstantin
PY  - 2023
UR  - https://belbi.bg.ac.rs/
UR  - https://imagine.imgge.bg.ac.rs/handle/123456789/1978
PB  - Belgrade : Institute of molecular genetics and genetic engineering
C3  - 4th Belgrade Bioinformatics Conference
T1  - The use of Active Machine Learning for Protospacer-Adjacent Motif recovery in Class 2 CRISPR-Cas systems
EP  - 40
SP  - 40
VL  - 4
UR  - https://hdl.handle.net/21.15107/rcub_imagine_1978
ER  -
Kirillov, B., Vasileva, A., Fedorov, O., Panov, M., & Severinov, K. (2023). The use of Active Machine Learning for Protospacer-Adjacent Motif recovery in Class 2 CRISPR-Cas systems. In 4th Belgrade Bioinformatics Conference. Belgrade: Institute of molecular genetics and genetic engineering, 4, 40-40. https://hdl.handle.net/21.15107/rcub_imagine_1978
Kirillov B, Vasileva A, Fedorov O, Panov M, Severinov K. The use of Active Machine Learning for Protospacer-Adjacent Motif recovery in Class 2 CRISPR-Cas systems. In: 4th Belgrade Bioinformatics Conference. 2023;4:40-40. https://hdl.handle.net/21.15107/rcub_imagine_1978
Kirillov, Bogdan, Vasileva, Aleksandra, Fedorov, Oleg, Panov, Maxim, Severinov, Konstantin, "The use of Active Machine Learning for Protospacer-Adjacent Motif recovery in Class 2 CRISPR-Cas systems" in 4th Belgrade Bioinformatics Conference, 4 (2023): 40-40, https://hdl.handle.net/21.15107/rcub_imagine_1978