Telegram Analysis, conducted by the TCA team

Methodology

Our Telegram data analysis begins with the systematic scraping of designated Telegram channels. The raw data is then preprocessed to ensure its relevance and integrity.
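
As an illustration only, a preprocessing step of this kind could be sketched as follows. The specific rules (stripping URLs, collapsing whitespace, dropping very short messages, removing exact duplicates), the function names, and the min_words threshold are assumptions made for this example and are not a description of the exact pipeline used.

```python
import re
import unicodedata

# Hypothetical cleaning rules, chosen for illustration only.
URL_RE = re.compile(r"https?://\S+")
WS_RE = re.compile(r"\s+")


def clean_message(text):
    """Normalize one raw message: unify Unicode forms, drop URLs, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = URL_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


def preprocess(messages, min_words=3):
    """Clean messages, skip very short ones, and remove case-insensitive duplicates."""
    seen = set()
    kept = []
    for raw in messages:
        text = clean_message(raw)
        if len(text.split()) < min_words:
            continue  # too short to carry a topic
        key = text.casefold()
        if key in seen:
            continue  # reposted or forwarded copy
        seen.add(key)
        kept.append(text)
    return kept


raw = [
    "Breaking:   markets fall sharply today https://example.com/a",
    "breaking: markets fall sharply today",
    "ok",
    "New transport schedule announced\nfor the city",
]
cleaned = preprocess(raw)
print(cleaned)
```

In this toy input the second message is dropped as a duplicate of the first and "ok" is dropped as too short, so two messages remain.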

The next phase applies clustering to group similar texts, so that related content is categorized together. The central component of this process is DBSCAN (Density-Based Spatial Clustering of Applications with Noise). We chose this method for its ability to identify clusters of varying shapes and sizes in large datasets.

DBSCAN visualization
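
To make the clustering step concrete, here is a minimal, self-contained sketch of DBSCAN applied to short texts. It is not the production implementation: the bag-of-words vectors, the cosine distance, and the eps and min_samples values are assumptions picked for this toy example, and a real pipeline would more likely rely on an established library.

```python
import math
from collections import Counter

NOISE = -1


def vectorize(text):
    """Bag-of-words counts; a stand-in for whatever text representation is used."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def dbscan(items, distance, eps, min_samples):
    """Return one label per item: 0, 1, ... for clusters and -1 for noise."""
    n = len(items)
    # Each neighbourhood includes the point itself.
    neighbours = [
        [j for j in range(n) if distance(items[i], items[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels


texts = [
    "fuel prices rise again this week",
    "fuel prices rise sharply this week",
    "fuel prices rise again this month",
    "city marathon route announced today",
    "city marathon route changed today",
    "recipe for apple pie",
]
vectors = [vectorize(t) for t in texts]
labels = dbscan(vectors, cosine_distance, eps=0.4, min_samples=2)
print(labels)
```

With these settings the three fuel-price texts form one cluster, the two marathon texts form another, and the unrelated recipe text is labelled as noise. Unlike k-means, DBSCAN does not need the number of clusters in advance, which suits a stream of messages whose topics are not known beforehand.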

For topic modeling, we combine n-grams with more advanced natural language processing (NLP) techniques. This allows us to discern overarching themes and recurring topics within the large volume of content generated in Telegram channels. To illustrate each topic, we provide a ‘representative’ text: the one with the highest average cosine similarity to all other texts in its cluster.
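
The selection of a representative text can be sketched as below. The rule itself (highest average cosine similarity to the other texts in the cluster) is the one described above; the word unigram and bigram count vectors and the function names are assumptions made for this example, since the actual text representation is not specified here.

```python
import math
from collections import Counter


def ngram_counts(text, n_max=2):
    """Count word n-grams for n = 1..n_max (assumed representation)."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def representative_text(cluster_texts):
    """Return the text with the highest mean similarity to the other texts."""
    if not cluster_texts:
        raise ValueError("cluster is empty")
    if len(cluster_texts) == 1:
        return cluster_texts[0]
    vectors = [ngram_counts(t) for t in cluster_texts]

    def mean_similarity(i):
        others = [
            cosine_similarity(vectors[i], vectors[j])
            for j in range(len(vectors))
            if j != i
        ]
        return sum(others) / len(others)

    best = max(range(len(vectors)), key=mean_similarity)
    return cluster_texts[best]


cluster = [
    "fuel prices rise again this week",
    "fuel prices rise sharply this week",
    "fuel prices rise again this month",
]
representative = representative_text(cluster)
print(representative)
```

Here the first text is selected because it shares the most unigrams and bigrams with both of the others, which makes it the most central member of the cluster.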

Additionally, we apply sentiment analysis techniques to gauge the tone of the content.

Ethical Considerations: We prioritize the ethical implications of our work. The data is sourced from public Telegram channels and is used strictly for research purposes. No personal data is exploited. Furthermore, all research endeavors align with the principles of responsible and ethical data use.