1. Introduction
Social media platforms have reshaped the way the public engages with social issues and have created a massive repository of user data. Twitter and similar platforms are not only social venues but also important sources of information for understanding public opinion and predicting trends. Extracting insights from massive unstructured data, however, remains a technical challenge. Traditional dictionary-based sentiment analysis struggles to capture emotional fluctuations in complex contexts, whereas deep learning can move past this bottleneck by learning hidden patterns from large volumes of data. This study integrates multimodal AI technology into a real-time analysis system: a convolutional neural network handles emotion classification, topic models mine hot events, and a graph neural network tracks propagation paths. Compared with traditional models such as support vector machines, deep learning approaches have been shown to offer significant advantages in prediction accuracy. The technical framework can help companies predict how public opinion on a product will evolve and can provide decision-making support for government departments monitoring public events [1]. The research established an end-to-end pipeline covering data collection and cleaning, feature engineering, and model optimization. In the cases of election public opinion monitoring and new product launch feedback in particular, the system achieved prediction accuracy more than 30% higher than that of traditional methods. This methodology opens a new technical avenue for corporate competitive intelligence analysis and public policy development.
2. Literature review
2.1. Sentiment analysis in social media
Sentiment analysis, a core task in natural language processing, plays a key role in tracking emotional trends on social media. Existing approaches fall into two main groups: traditional methods based on sentiment lexicons and deep learning methods. Lexicon-based methods judge sentiment using predefined lists of positive and negative words, but they struggle with complex semantics such as irony and puns in online discourse. Deep learning models such as convolutional neural networks, whose semantic understanding is learned from massive data, show significant advantages in processing user-generated content. As shown in Figure 1, this technology not only supports emotion classification but also forms part of a complete natural language processing system that includes entity recognition, topic clustering, and other functions [2]. Experimental data show that, on social media text containing ambiguous expressions, deep learning models are about 15 percentage points more accurate than the traditional support vector machine method, especially in capturing users' latent emotional fluctuations. This advance offers new possibilities for accurately capturing public opinion on social media [3].
Figure 1: Overview of AI technologies in public sentiment and trend prediction on social media (source: analyticsvidhya.com)
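To make the limitation of the lexicon-based route concrete, the following minimal sketch scores a text by counting hits in a positive and a negative word list. The word lists, the example sentences, and the `lexicon_score` function are hypothetical illustrations, not the lexicon or data used in this study.

```python
import re

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"great", "love", "good", "excellent"}
NEGATIVE = {"bad", "hate", "terrible", "awful"}


def lexicon_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return pos - neg


plain = "I love this phone, the camera is excellent"
ironic = "Oh great, the battery died again. I just love waiting for a charger"

print(lexicon_score(plain))   # positive score, consistent with the intended sentiment
print(lexicon_score(ironic))  # also positive, although the intended sentiment is negative
```

Both sentences receive the same positive score because the scorer sees only isolated words, which is the failure mode on irony described above.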
2.2. Topic modeling for trend identification
Topic modeling provides an effective tool for uncovering latent topics in the mass of text on social media. The classic Latent Dirichlet Allocation (LDA) model identifies topic clusters by analyzing the distribution of words across documents and has become a dominant technique in online public opinion analysis. However, on informal text from platforms such as Weibo and Twitter, the traditional model tends to produce highly dispersed topics. BERTopic, a newer topic modeling technique built on the BERT architecture, captures contextual semantics through pre-trained language models and combines them with dynamic clustering algorithms to generate more interpretable topic labels [4]. When analyzing the evolution of internet buzzwords and emergencies, this improved approach can more accurately identify complex issues such as "double reduction policy conflicts" and "new energy vehicle subsidies", providing reliable technical support for governments and enterprises seeking to track public attention.
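As an illustration of how LDA infers topics from word distributions, the sketch below implements a small collapsed Gibbs sampler in plain Python. It is a teaching sketch under stated assumptions, not the implementation used in this study: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the four toy documents are all hypothetical.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (topic_word_counts, doc_topic_counts).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            topic_word[k][w] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                # Weight of topic t is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                topic_word[k][w] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return topic_word, doc_topic


# Hypothetical toy corpus mixing an election theme and a product theme.
docs = [
    "vote election candidate policy vote".split(),
    "candidate election policy debate".split(),
    "phone battery camera launch phone".split(),
    "launch camera battery price".split(),
]
topic_word, doc_topic = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_word):
    print(k, [w for w, c in counts.most_common(3) if c > 0])
```

On short, informal posts the per-document counts are very sparse, which is one reason the topics of a count-based model like this can become dispersed.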
2.3. Graph neural networks for social network analysis
Graph neural networks (GNNs) provide an innovative way to analyze the complex web of relationships in social media. The technique captures information diffusion paths by modeling how information passes between nodes (users) along edges (interactions). In a social graph that contains multidimensional interaction data such as likes, retweets, and comments, a GNN can effectively identify latent communication patterns. For example, by predicting the diffusion trajectory of a topic, the model can quantitatively evaluate the influence of different user nodes and thereby predict how public opinion will evolve [5]. Practical applications show that this technology has significant advantages in identifying key opinion leaders and tracking the transmission paths of emergencies, especially when analyzing how opinions cluster through mutual influence between users. This relational, network-based analysis offers a new technical perspective on the dynamic evolution of social media.
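The core operation behind this kind of model can be sketched as one round of neighbourhood aggregation. The example below is a simplified illustration with no learned weights: each node's vector is replaced by the mean of its own vector and its neighbours' vectors. The function `message_passing_round`, the four-user graph, and the one-dimensional "topic signal" feature are assumptions made for the example.

```python
def message_passing_round(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats
    edges: iterable of (u, v) interaction pairs
    Each node's new vector is the mean of its own and its neighbours' vectors.
    """
    neighbours = {node: {node} for node in features}  # include a self-loop
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    updated = {}
    for node, group in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return updated


# Hypothetical toy graph: user "a" has posted about a topic (signal 1.0),
# the other users have not yet engaged with it.
features = {"a": [1.0], "b": [0.0], "c": [0.0], "d": [0.0]}
edges = [("a", "b"), ("b", "c"), ("c", "d")]  # e.g. retweet or comment links

h1 = message_passing_round(features, edges)
h2 = message_passing_round(h1, edges)
print(h1)
print(h2)
```

After one round the signal has reached only the direct neighbour of "a"; after two rounds it reaches "c" as well. Stacking rounds in this way is how a GNN propagates information along multi-hop diffusion paths; a trained model would additionally apply learned weight matrices and nonlinearities at each round.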
3. Experimental methodology
3.1. Data collection from social media platforms
To ensure the accuracy of the sentiment analysis and trend forecasting models, typical social media platforms such as Twitter were selected as data sources. With diverse user groups and topics spanning politics, people's livelihood, culture, and entertainment, these platforms are high-quality sample pools for observing public discussion. The collected data comprise posts, comments, and hashtags related to specific events (such as the US midterm elections), products (such as iPhone launches), and social issues. The collection window covers both hot-event periods and regular periods for comparison, so that short-term fluctuations in public opinion as well as long-term trends can be captured [6]. As shown in Figure 2, data acquisition is the starting point of the entire technical process and is followed by data cleaning, feature extraction, and model training. This end-to-end design ensures that the analysis results are systematic and reliable.
3.2. Preprocessing and feature extraction
Raw social media data often contain a great deal of noise, such as internet slang and spelling mistakes. The research team implemented a multi-step cleaning process: first, the text was tokenized and sentences were split into independent semantic units; second, function words without practical meaning, such as "of" and "are", were filtered out; finally, through normalization, inflected word forms (for example, different tenses) were reduced to their base form. In feature engineering, the TF-IDF algorithm is used to identify text keywords, and word vector technology is used to transform the text into a numerical representation in a multidimensional semantic space [7]. As shown in Figure 2, the normalized feature data serve as the basic input for sentiment analysis and topic modeling. This structured processing greatly improves the models' adaptability to the characteristics of online language.
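The cleaning and TF-IDF steps described above can be sketched as follows. This is a minimal illustration in plain Python, assuming a deliberately tiny stop-word list and the common weighting tf × log(N / df); the normalization of inflected forms is omitted for brevity, and the names `STOP_WORDS`, `preprocess`, and `tf_idf` as well as the example posts are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"of", "are", "the", "is", "a", "and", "to"}


def preprocess(text):
    """Lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


posts = [
    "The battery of the new phone is great",
    "The new phone camera is great",
    "Voters are waiting for the election results",
]
weights = tf_idf(posts)
print(preprocess(posts[0]))
print(sorted(weights[0].items(), key=lambda kv: -kv[1]))
```

In the first post, "battery" receives the highest weight because it appears in no other post, while "new", "phone", and "great" are shared with the second post and are therefore down-weighted.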
3.3. AI models implementation and training
To achieve accurate emotion discrimination, the research team built a convolutional neural network (CNN) classification model. Because it captures the local semantic features of text, this model is well suited to short-text classification on social media. Training follows a three-stage optimization strategy: pre-training on a massive general corpus, fine-tuning on domain-specific data, and finally parameter calibration to improve the accuracy of positive and negative emotion discrimination [8]. For topic modeling, the LDA model identifies the distribution of macro-level topics, while BERTopic generates finer-grained subtopic labels based on semantic similarity. The graph neural network focuses on the propagation paths formed by user interaction: it encodes user attribute data and like and retweet behavior as node features and dynamically simulates the topic diffusion trajectory. As shown in Figure 2, the three models operate together as a complete analysis system, forming a closed technical loop from data input to result output [9].
Figure 2: Experimental methodology flowchart for sentiment analysis and trend prediction
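The way a convolutional layer captures local semantic features can be illustrated with a single filter applied to a sequence of token vectors, followed by ReLU and max-over-time pooling. The sketch below is a forward pass only, with hand-picked rather than trained weights; the two-dimensional embeddings in `EMB`, the filter `KERNEL`, and the bias are hypothetical values chosen so that the filter responds to the bigram "not good".

```python
def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter over consecutive token vectors, apply ReLU, max-pool over time.

    embeddings: list of token vectors (each a list of floats)
    kernel: list of `width` weight vectors with the same dimension as the token vectors
    """
    width = len(kernel)
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias + sum(
            w * x
            for weights, vector in zip(kernel, window)
            for w, x in zip(weights, vector)
        )
        activations.append(max(0.0, total))  # ReLU
    # Strongest local match anywhere in the text; 0.0 if the text is too short.
    return max(activations, default=0.0)


# Hypothetical 2-d embeddings: dimension 0 carries polarity, dimension 1 carries negation.
EMB = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "not": [0.0, 1.0]}


def embed(text):
    """Map each token to its vector; unknown tokens map to the zero vector."""
    return [EMB.get(token, [0.0, 0.0]) for token in text.lower().split()]


# Hand-picked width-2 filter: negation in position 1, positive polarity in position 2.
KERNEL = [[0.0, 1.0], [1.0, 0.0]]
BIAS = -1.5

print(conv1d_max_pool(embed("this is not good"), KERNEL, BIAS))
print(conv1d_max_pool(embed("this is very good"), KERNEL, BIAS))
```

The filter activates on "not good" but stays at zero for "very good", which a bag-of-words feature for "good" alone could not distinguish. A full classifier would learn many such filters of several widths and pass the pooled features to a dense output layer.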
4. Experimental results
4.1. Sentiment analysis accuracy
The CNN-based emotion classification model performs well in the empirical test. As shown in Table 1, the model reaches an accuracy of 92.5% and outperforms traditional algorithms such as support vector machines (85.1%). Its ability to extract semantic features allows the CNN to handle the abbreviations and homophones that are common in online language. Even when faced with newly emerging internet buzzwords such as “yyd” or context-dependent wordplay, the model maintains high discrimination accuracy by drawing on the surrounding context [10]. This advantage comes from the convolutional layer’s ability to capture local semantic features, which makes the model more adaptable to the fragmented texts found on social media.
Table 1: Sentiment analysis accuracy results
Model | Accuracy (%) |
CNN | 92.5 |
SVM | 85.1 |
Logistic Regression | 80.2 |
Naive Bayes | 78.3 |
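For readers comparing the rows of Table 1, the short sketch below computes two derived quantities from the reported accuracies: the gap to the CNN in percentage points, and the share of each baseline's errors that the CNN removes. Only the four accuracy values come from Table 1; the helper name `compare_to_reference` and the error-reduction measure are additions for illustration.

```python
# Accuracy values (%) as reported in Table 1.
accuracy = {
    "CNN": 92.5,
    "SVM": 85.1,
    "Logistic Regression": 80.2,
    "Naive Bayes": 78.3,
}


def compare_to_reference(scores, reference):
    """Return {model: (gap in percentage points, relative error reduction in %)}."""
    ref = scores[reference]
    result = {}
    for model, value in scores.items():
        if model == reference:
            continue
        gap = ref - value                          # percentage points
        error_cut = gap / (100.0 - value) * 100.0  # share of the baseline's errors removed
        result[model] = (gap, error_cut)
    return result


comparison = compare_to_reference(accuracy, "CNN")
for model, (gap, error_cut) in comparison.items():
    print(f"{model}: +{gap:.1f} points, {error_cut:.1f}% of baseline errors removed")
```

Expressed this way, the 7.4-point gap between the CNN and the SVM corresponds to removing roughly half of the SVM's classification errors.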
4.2. Topic modeling insights
The topic modeling results clearly reveal the distribution of core issues in the social media data. In the political election scenario, the model successfully separated key subtopics such as candidates' policy proposals and voter behavior characteristics. For the product launch cases, the cluster analysis focused on dimensions such as user ratings and brand awareness. As shown in Table 2, the coherence score for the "Product Launch" category reaches 0.91, the highest of the four categories, indicating the strongest semantic consistency in this domain. In particular, the application of BERTopic makes the extracted topic labels better suited to real-world business scenarios [11]. For example, in the case of promoting new energy vehicles, subtopics such as "range anxiety" and "charging facilities" can be accurately identified, giving businesses an actionable basis for formulating marketing strategies.
Table 2: Topic modeling coherence scores
Topic | Coherence Score |
Politics | 0.85 |
Product Launch | 0.91 |
Entertainment | 0.78 |
Social Issues | 0.82 |
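The text does not state which coherence measure underlies Table 2. As one common option, the sketch below computes a coherence score as the average normalized pointwise mutual information (NPMI) over pairs of a topic's top words, using document-level co-occurrence. The choice of NPMI, the function `npmi_coherence`, and the toy documents are assumptions for illustration, so the resulting values are not comparable with those in Table 2.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents, eps=1e-12):
    """Average NPMI over all pairs of a topic's top words.

    Probabilities are estimated from document-level co-occurrence.
    Returns 0.0 if fewer than two words are given.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        return sum(all(w in s for w in words) for s in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
            continue
        pmi = math.log(p12 / (p1 * p2))
        scores.append(pmi / (-math.log(p12) + eps))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical tokenized documents.
documents = [
    ["battery", "charging", "range"],
    ["battery", "range", "price"],
    ["election", "vote", "policy"],
    ["vote", "policy", "debate"],
]
coherent = npmi_coherence(["battery", "range"], documents)
mixed = npmi_coherence(["battery", "vote"], documents)
print(coherent, mixed)
```

Words that always appear together score close to 1, and words that never co-occur score -1, so a topic whose top words come from one theme scores higher than a topic that mixes themes.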
4.3. Graph neural network for trend prediction
The graph neural network-based prediction system is effective in tracking the spread of social media topics. By analyzing propagation paths in the user interaction network, the model can not only identify key opinion leaders but also simulate how topics spread between different communities. In the empirical analysis of three major public opinion events, the system successfully predicted the trajectory of the contentious topics and issued public opinion warnings 24 to 48 hours in advance, providing a decision-making window for emergency interventions [12].
5. Conclusion
This study validates the value of integrating multimodal AI technology into social media analysis. The CNN-based emotion classification model outperformed traditional methods with an accuracy of 92.5%, demonstrating the advantage of deep learning in handling the features of online language. The topic modeling technology successfully captured shifts in user attention across the iteration cycle of new consumer electronics products and provided data support for enterprise product optimization. The graph neural network's prediction tests on three emergency events show that its trend prediction accuracy is 28% higher than that of the traditional propagation model. The technical framework shows application potential in commercial competitive intelligence, public policy evaluation, and other scenarios; in particular, when identifying potential public opinion risks, the system can send early warning signals 24 to 48 hours in advance. Follow-up research will examine cross-platform data fusion mechanisms, improve real-time monitoring of global events, and provide a technical basis for building an intelligent decision-support system.