Analysis of Long COVID Phenotypes and their Impact on Mental Health and Daily Functioning: Insights from Twitter
Authors
Markovikj, Marko
Dobreva, Jovana
Lucas, Mary
Vodenska, Irena
Chitkushev, Lou
Trajanov, Dimitar
Contributors
Morić, Ivana
Đorđević, Valentina
Conference object (Published version)
© 2023 Institute of Molecular Genetics and Genetic Engineering, University of Belgrade
Abstract
In this study, we conducted an investigation into Long COVID from a user perspective, utilizing
Twitter social media data. Prior to analysis, the data underwent preprocessing to obtain raw text
per tweet. Our analysis commenced with basic statistical analysis and subsequently expanded to
identify characteristic periods for the phenotypes based on dynamic timelines. We also explored the
relationships between the phenotypes, as well as the interdependence between phenotypes and
geolocation.
In the context of this research, an analysis was conducted on a collection of tweets that encompassed
the timeframe from March 2020 to March 2022. The dataset consisted of approximately 1.9
million tweets. In order to concentrate on word phrases, extraneous elements such as mentions,
emoticons, links, and hashtags were eliminated. Subsequently, a process of lemmatization was
performed. For the purpose of reducing the number of distinct phenotypes under investigation
and facilitating the presentation of results, the collected data was categorized into five overarching
groups: Cardiovascular, Respiratory, Daily Living, Neurological and Mental Health, and Other.
The statistical data regarding the most commonly used words by individuals describing their
experiences during the Long COVID period are as follows: “Ampicillin” was tweeted 125,295 times,
“Death” was tweeted 121,156 times, “Suffer” was tweeted 125,113 times, and “Vaccine” was
tweeted 108,968 times. We observe distinct patterns in the emergence of certain phenotypes
during this period, particularly in relation to the quality of life. On August 1, 2020, the term “quality
of life” was mentioned in only 223 tweets, whereas one year later, during the same month, this
phenotype garnered 1,663 tweets.
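The preprocessing and counting steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the regular expressions, the function names (clean_tweet, term_counts, monthly_phrase_counts), and the three sample tweets are all assumptions made for the example. Lemmatization is left out because the abstract does not name the tool that was used.

```python
import re
from collections import Counter
from datetime import date

# Patterns for the elements the abstract says were removed (all assumptions).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Rough coverage only: common Unicode emoji blocks plus a few ASCII emoticons.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
    r"|(?<!\w)[:;=]-?[()DPp](?!\w)"
)


def clean_tweet(text):
    """Strip links, mentions, hashtags and emoticons; lowercase; squeeze spaces."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.lower().split())


def term_counts(tweets):
    """Count how often each word occurs across the cleaned tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(re.findall(r"[a-z]+", clean_tweet(text)))
    return counts


def monthly_phrase_counts(dated_tweets, phrase):
    """Count tweets per (year, month) whose cleaned text contains the phrase."""
    counts = Counter()
    for day, text in dated_tweets:
        if phrase in clean_tweet(text):
            counts[(day.year, day.month)] += 1
    return counts


# Hypothetical sample data, invented for illustration only.
sample = [
    (date(2020, 8, 1),
     "Long COVID wrecked my quality of life :( #LongCovid https://example.com"),
    (date(2021, 8, 3), "@friend fatigue again, quality of life is poor \U0001F61E"),
    (date(2021, 8, 9), "Still suffer from fatigue and brain fog"),
]

words = term_counts(text for _, text in sample)
by_month = monthly_phrase_counts(sample, "quality of life")
print(words.most_common(3))
print(dict(by_month))
```

Comparing the same phrase across months, as in the August 2020 versus August 2021 figures above, then amounts to reading two keys from the returned counter.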
Our findings reveal that the occurrence of Long COVID phenotypes is influenced by both temporal and
geographical factors. The analysis shows a clear and notable trend within the dataset. Specifically,
it is observed that neurological symptoms, along with symptoms that impede individuals’ daily
functioning, exhibit the highest prevalence, particularly during the latter half of the analyzed tweet
period. This period corresponds to a time when an increasing number of individuals have recovered
from COVID-19 and are reporting their experiences with Long COVID. Notably, fatigue, depression,
stress, and anxiety emerge as the most prevalent phenotypes.
This scientific investigation of the complex interactions between Long COVID phenotypes, mental
health, and the manifestation of diverse symptoms offers insights into the profound consequences
on individuals’ lives. These findings shed light on the significant burden posed by Long COVID and its
cascading effects on various aspects of individuals’ well-being and society at large.
Keywords:
long COVID / data mining / computer science / nlp
Source:
4th Belgrade Bioinformatics Conference, 2023, 4, 111-111
Publisher:
- Belgrade : Institute of molecular genetics and genetic engineering
Note:
- Book of abstract: 4th Belgrade Bioinformatics Conference, June 19-23, 2023
Institution/Community
Institut za molekularnu genetiku i genetičko inženjerstvo
URL:
- https://belbi.bg.ac.rs/
- https://imagine.imgge.bg.ac.rs/handle/123456789/2056
- https://hdl.handle.net/21.15107/rcub_imagine_2056
Markovikj, M., Dobreva, J., Lucas, M., Vodenska, I., Chitkushev, L., & Trajanov, D. (2023). Analysis of Long COVID Phenotypes and their Impact on Mental Health and Daily Functioning: Insights from Twitter. In 4th Belgrade Bioinformatics Conference. Belgrade : Institute of molecular genetics and genetic engineering, 4, 111-111. https://hdl.handle.net/21.15107/rcub_imagine_2056
Markovikj M, Dobreva J, Lucas M, Vodenska I, Chitkushev L, Trajanov D. Analysis of Long COVID Phenotypes and their Impact on Mental Health and Daily Functioning: Insights from Twitter. In 4th Belgrade Bioinformatics Conference. 2023;4:111-111. https://hdl.handle.net/21.15107/rcub_imagine_2056
Markovikj, Marko, Dobreva, Jovana, Lucas, Mary, Vodenska, Irena, Chitkushev, Lou, Trajanov, Dimitar, "Analysis of Long COVID Phenotypes and their Impact on Mental Health and Daily Functioning: Insights from Twitter" in 4th Belgrade Bioinformatics Conference, 4 (2023):111-111, https://hdl.handle.net/21.15107/rcub_imagine_2056