Data mining for long-non coding RNAs deregulated in colon cancer through analysis of Gene Expression Omnibus database
Conference contribution (Published version)
© 2023 Institute of Molecular Genetics and Genetic Engineering, University of Belgrade
Metadata
Abstract
Colorectal cancer (CRC) is one of the most commonly diagnosed cancers worldwide. The lack
of specific CRC symptoms is a challenge for clinicians, as the symptoms overlap with those of
non-cancerous diseases; as a result, 20-25% of newly diagnosed CRC patients already
have liver metastasis. Thus, discovering reliable early-disease biomarkers is of high
importance. Non-coding RNAs (ncRNAs) have been demonstrated to be involved in CRC
development and progression. Long non-coding RNAs (lncRNAs) can interact with RNA,
DNA and proteins, forming complexes that are involved in regulation of gene expression
via multiple mechanisms, affecting every stage of colon carcinogenesis and making them
top candidates for novel biomarker discovery.
The aim of our study was to mine the Gene Expression Omnibus (GEO)
database using the keywords “colon cancer” and “ncRNA”, and to identify differentially
expressed lncRNAs present in different GEO datasets.
The GEO database, which collects submitted high-throughput gene expression data, was queried
for all datasets that studied colon cancer and ncRNA. Over 60 datasets were manually
inspected to identify those in which colon cancer and normal tissue originating
from the same patient were analyzed. Each dataset was analyzed with GEO2R software to
discover differentially expressed lncRNAs. LncRNAs were considered significant if they
appeared in more than one GEO dataset. Partial lncRNA sequences available in the GEO2R
results were run through BLAST to identify the full-length lncRNAs.
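The recurrence criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual pipeline: the dataset names, the sequence identifiers, and the helper `recurrent_sequences` are all hypothetical, and each dataset is assumed to have already been reduced to the set of sequences that GEO2R reported as differentially expressed.

```python
from collections import Counter
from typing import Dict, Set


def recurrent_sequences(hits_by_dataset: Dict[str, Set[str]],
                        min_datasets: int = 2) -> Set[str]:
    """Return sequences reported in at least `min_datasets` datasets.

    `hits_by_dataset` maps a dataset name to the set of sequence
    identifiers found to be differentially expressed in that dataset.
    """
    counts: Counter = Counter()
    for hits in hits_by_dataset.values():
        # Use a set so a sequence is counted once per dataset.
        counts.update(set(hits))
    return {seq for seq, n in counts.items() if n >= min_datasets}


# Hypothetical placeholder data; these are not real GEO accessions or probes.
example_hits = {
    "dataset_A": {"seq_1", "seq_2", "seq_3"},
    "dataset_B": {"seq_2", "seq_4"},
    "dataset_C": {"seq_3", "seq_2"},
}

recurrent = recurrent_sequences(example_hits)
print(sorted(recurrent))  # ['seq_2', 'seq_3']
```

With the default `min_datasets=2`, this mirrors the "more than one GEO dataset" rule; the recurrent sequences would then be the candidates submitted to BLAST.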
Five GEO datasets matched our criteria. We discovered 12 sequences that appeared in
more than one dataset and we identified them through BLAST analysis. Six sequences
originated from lncRNAs (RYR3 divergent transcript, long intergenic non-protein coding
RNA 595, TOX divergent transcript, FLVCR2 antisense RNA 1, LHRI_LNC744.1 lncRNA
gene, and ELFN1 antisense RNA 1), while six sequences represented partial sequences
of various mRNAs. Four lncRNAs were downregulated in colon cancer, one was upregulated,
and one showed different expression patterns in different GEO datasets.
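One way to sort recurrent lncRNAs into the three groups reported above is to compare the sign of the log fold change across datasets. The sketch below is an assumption-laden illustration rather than the authors' method: it assumes positive values mean higher expression in tumor than in normal tissue, and the function `classify_direction`, the lncRNA labels, and the numeric values are hypothetical.

```python
from typing import Dict, List


def classify_direction(log_fcs: List[float]) -> str:
    """Classify one lncRNA from its log fold changes across datasets.

    Assumed convention: positive = higher in tumor than in normal tissue.
    """
    if not log_fcs:
        raise ValueError("at least one log fold change is required")
    if all(v > 0 for v in log_fcs):
        return "upregulated"
    if all(v < 0 for v in log_fcs):
        return "downregulated"
    # Mixed signs (or a zero value) across datasets.
    return "inconsistent"


# Hypothetical values for illustration only; not taken from the study.
example_log_fcs: Dict[str, List[float]] = {
    "lncRNA_X": [-1.2, -0.8],
    "lncRNA_Y": [0.9, 1.5],
    "lncRNA_Z": [1.1, -0.7],
}

for name, values in example_log_fcs.items():
    print(name, classify_direction(values))
```

An lncRNA labelled "inconsistent" here corresponds to the case of different expression patterns in different GEO datasets.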
In this study, we have identified six lncRNAs that have potential significance for colorectal
cancer etiology and will be the subject of further in silico and in vitro studies.
Keywords:
long non-coding RNA / colorectal cancer / data mining / GEO database
Source:
4th Belgrade Bioinformatics Conference, 2023, 4, 89-89
Publisher:
- Belgrade : Institute of molecular genetics and genetic engineering
Note:
- Book of abstracts: 4th Belgrade Bioinformatics Conference, June 19-23, 2023
Institution/group
Institut za molekularnu genetiku i genetičko inženjerstvo
Pruner, I., & Nikolić, A. (2023). Data mining for long-non coding RNAs deregulated in colon cancer through analysis of Gene Expression Omnibus database. In 4th Belgrade Bioinformatics Conference, 4, 89-89. Belgrade: Institute of Molecular Genetics and Genetic Engineering. https://hdl.handle.net/21.15107/rcub_imagine_2034