1. Introduction
In recent years, stock price crash risk has become an increasingly critical area of study due to frequent episodes of extreme market volatility driven by investor sentiment. A notable example occurred in February 2025, when a sudden plunge in A-share markets, particularly in speculative "concept stocks" like Cambricon and Hangzhou Iron & Steel, exposed the risks of sentiment-driven volatility. Such episodes show the substantial risks posed by speculative investor behavior: they rapidly erode billions in market capitalization and reveal deep-seated vulnerabilities in market rationality and stability. They also destabilize financial markets and jeopardize economic sustainability, making the study of stock price crash risk a critical priority.
A key factor amplifying market instability is the rapid dissemination of investor sentiment through social media platforms. According to the China Internet Network Information Center, China’s internet penetration rate exceeded 76% in 2023, with 1.08 billion netizens actively engaging on platforms like Weibo and Xueqiu, where financial discussions and sentiment spread instantaneously [1]. Social media usage figures further reflect this shift: platforms such as Douyin (TikTok) and WeChat host over 800 million and 1.3 billion monthly active users, respectively, creating an ecosystem dominated by retail investors whose trading decisions are often driven by sentiment rather than fundamentals. In this hyperconnected environment, herding behavior and emotional decision-making become widespread, amplifying market volatility [2-3]. Behavioral finance theories posit that sentiment-driven trading exacerbates mispricing, as investors overweight recent news and underreact to long-term fundamentals. The 2024 sell-off of Reddit, shared by Tencent, which triggered a 7.29% intraday price collapse, exemplifies how sentiment shocks propagate through digital channels, destabilizing even established firms.
Existing literature has extensively explored the sentiment-crash risk nexus, yet critical gaps persist. Prior studies focus on macroeconomic factors (e.g., liquidity, firm size) or external governance (e.g., institutional ownership) as moderators [4-10]. However, the mediating role of internal control mechanisms, a fundamental component of corporate governance, in influencing stock price crash risk has received little attention. Internal control systems, designed to mitigate managerial opportunism and ensure financial transparency, may act as a buffer against sentiment-induced overvaluation. For example, firms with robust internal controls are less likely to engage in earnings manipulation, a precursor to abrupt corrections. Additionally, while heterogeneity analyses often emphasize firm size or ownership structure (e.g., state-owned enterprises), the impact of internal control deficiencies on sentiment sensitivity remains unexplored. In 2021, the China Securities Regulatory Commission (CSRC) disclosed 13 typical illegal cases caused by internal control defects [11]. The underlying problems in these cases are strongly linked to financial misstatements, irregularities in accruals, and higher audit risks, all of which make a stock price crash much more likely.
This study aims to address these gaps by examining the role internal controls play in mediating the relationship between investor sentiment and stock price crash risk. First, it systematically investigates whether investor sentiment exacerbates crash risk. Second, it introduces internal control quality as a novel mediator, testing whether sentiment exacerbates crash risk by eroding governance safeguards. Third, it pioneers heterogeneity analysis based on internal control deficiencies, revealing how governance failures amplify sentiment’s destabilizing effects. These contributions collectively deepen our understanding of the interplay between investor sentiment, corporate governance, and market stability.
2. Literature review
2.1. Investor sentiment
Investor sentiment refers to the collective mood or attitude of investors toward the market or specific stocks. This mood can significantly influence their trading behavior and, consequently, stock prices. Investor sentiment is often driven by psychological factors such as optimism, pessimism, or risk aversion rather than fundamental financial metrics [12]. Research on investor sentiment has evolved significantly, particularly with the advent of social media platforms, which provide real-time data reflecting investor attitudes and emotions.
Early studies, such as Barberis, proposed models to predict stock price reactions based on the strength and weight of news announcements [12]. Later, Baker and Wurgler demonstrated that investor sentiment affects stock prices through speculative demand and arbitrage limits, with sentiment-driven stocks (e.g., young and unprofitable firms) exhibiting higher volatility due to subjective valuations [13]. More recently, social media platforms like Twitter and Weibo have been utilized to measure investor sentiment. For instance, Bollen found that Twitter sentiment could predict 86.7% of the variations in the Dow Jones Index [14], while studies on Sina Weibo revealed a positive correlation between investor sentiment and stock market returns, with effects lasting over 40 trading days [15-16].
The measurement of investor sentiment has also advanced considerably. Recent work includes lexicon-based approaches using SenticNet [17] and hybrid models that use LSTM networks for sentiment analysis and stock price prediction [18]. These advancements highlight the growing importance of investor sentiment in understanding market dynamics and its potential to influence stock price movements.
2.2. Stock price crash risk
Stock price crash risk is defined as the probability of experiencing sudden and severe declines in stock prices, characterized by negative skewness and abrupt downward adjustments. Investor behavior, specifically speculative enthusiasm and over-optimism, often inflates stock prices beyond intrinsic values, creating unsustainable bubbles that, once burst, trigger severe market corrections [19].
Quantitative measures of stock price crash risk have become increasingly sophisticated. Chen et al. introduced negative skewness in stock returns (NCSKEW) and down-to-up volatility (DUVOL) [20]. Hutton et al. refined this by using weekly returns over one year and addressing asynchronous trading [21]. Subsequent studies introduced threshold adjustments to enhance precision; for example, Jin and Myers proposed a threshold of -3.2 instead of the original -3.09 [22]. These measures have been widely adopted internationally, with local adjustments, to capture the asymmetry in return distributions, with higher values indicating a greater likelihood of crashes [23-24].
2.3. Investor sentiment's impact on stock price crash risk
The relationship between investor sentiment and stock price crash risk has been a focal point of behavioral finance research. Sentiment-driven investors may ignore fundamental risks, creating a disconnect between stock prices and intrinsic values, which could potentially increase the likelihood of future crashes. However, this relationship is not yet fully understood, and the mechanisms (such as herd behavior and the mediating role of institutional investors) through which investor sentiment influences crash risk require further exploration [2, 8].
2.4. Internal control’s mediation role
Internal control is a process implemented by an entity’s board of directors, management, and other personnel to provide reasonable assurance. It functions as an effective management tool and as a means of balancing power, reducing information asymmetry [24], and mitigating managerial opportunism. A robust internal control system is essential for firms to manage risks, ensure reliable financial reporting, and comply with laws and regulations.
Previous studies have discussed internal control’s relationship with investor sentiment [25]. Amin found that higher investor sentiment boosts the internal control of shopping behavior [26]. Also, empirical research proved that social media’s categories and attention are correlated with corporate internal control [27]. Concerning the social media’s role in spreading information to investors and shifting the investors’ sentiments, we hypothesize that higher investor sentiment may enhance firms' internal control.
Regarding stock price crash risk, prior work finds that internal control and its five components (i.e., control environment, risk assessment, control activities, information and communication, and monitoring) alleviate future stock price crash risk [25]. Empirical research also shows that stock price crash risk has significant positive associations with accrual management, financial statement restatements, and auditor-attested internal control weaknesses [28]. Hence, we assume that a higher level of internal control mitigates the risk of a stock price crash.
Based on the literature review, we propose the following hypotheses:
H1: Higher investor sentiment heightens the stock price crash risk.
H2: The higher the level of internal control within a firm, the greater the impact of investor sentiment on stock price crash risk.
3. Research design
3.1. Data source and sampling
The baseline sample consists of A-share listed companies in China from 2007 to 2022. The sample begins in 2007 because the Accounting Standards for Enterprises were revised in that year and the estimation requires financial metrics. Following previous literature, this paper excludes financial institutions and ST and PT companies, because these enterprises have different portfolios and unique exposure to climate transition risk compared with other enterprises.
The financial data of the sample companies were obtained from the CSMAR database, and the stock data were sourced from the RESSET financial database. We excluded companies with missing data and winsorized continuous variables at the 1st and 99th percentiles, ultimately resulting in a sample of 29,203 firm-year observations.
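As an illustration of the tail treatment applied to continuous variables, the following minimal sketch clips a variable at its 1st and 99th percentiles. The percentile rule (linear interpolation between order statistics), the function names, and the sample data are assumptions made for the example; the paper does not state which percentile convention or software was used.

```python
import math

def percentile(sorted_values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 1])."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def winsorize(values, lower=0.01, upper=0.99):
    """Clip each value to the [lower, upper] percentile range of the sample."""
    ordered = sorted(values)
    lo_cut = percentile(ordered, lower)
    hi_cut = percentile(ordered, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]

# Hypothetical data: 0, 1, ..., 100, so the cut-offs fall at about 1 and 99.
sample = [float(v) for v in range(101)]
clipped = winsorize(sample)
```

Clipping keeps every observation in the sample, unlike trimming, which drops the extreme observations.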
3.2. Measure of variables
3.2.1. Independent variable: investor sentiment
The investor sentiment (sentiment) variable is constructed using principal component analysis at the individual stock level, drawing on the approach by Lei et al. and Tang and Cui [29-30]:
\( {sentiment_{i,t}}=0.2703\cdot tobinQ_{i,t}^{⊥}+0.3674\cdot MOM_{i,t}^{⊥}-0.2877\cdot BM_{i,t}^{⊥}+0.4524\cdot turnover_{i,t}^{⊥} \) (1)
Where \( tobinQ_{i,t}^{⊥} \), \( MOM_{i,t}^{⊥} \), \( BM_{i,t}^{⊥} \), and \( turnover_{i,t}^{⊥} \) are the normalized Tobin's Q, stock return momentum, book-to-market ratio, and stock turnover rate, respectively. The coefficients of the variables are consistent with the expected signs.
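The weighted sum in Eq. (1) can be sketched as follows. The loadings are copied from Eq. (1); the dictionary keys, the function name, and the example firm-year are hypothetical, and the inputs are assumed to have been normalized beforehand.

```python
# Loadings copied from Eq. (1); the inputs are assumed to be already normalized.
WEIGHTS = {"tobinQ": 0.2703, "MOM": 0.3674, "BM": -0.2877, "turnover": 0.4524}

def sentiment_index(components):
    """Weighted sum of the four normalized components in Eq. (1)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

# Hypothetical firm-year: high valuation, momentum and turnover, low book-to-market.
example = {"tobinQ": 1.0, "MOM": 0.5, "BM": -1.0, "turnover": 2.0}
score = sentiment_index(example)
```

Because the book-to-market loading is negative, a lower normalized BM raises the index, in line with the expected signs noted above.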
3.2.2. Dependent variable: stock price crash risk
Following prior research, we construct two proxies for stock price crash risk. The first step is to estimate firm-specific weekly returns [23, 31-32].
\( {R_{i,t}}={α_{i}}+{β_{1}}{R_{m,t-2}}+{β_{2}}{R_{m,t-1}}+{β_{3}}{R_{m,t}}+{β_{4}}{R_{m,t+1}}+{β_{5}}{R_{m,t+2}}+{ϵ_{i,t}} \) (2)
\( {W_{i,t}}=ln{(1+{ϵ_{i,t}})} \) (3)
Where \( R_{i,t} \) and \( R_{m,t} \) denote the return of stock i in week t and the value-weighted return of the A-share market in week t, respectively, and \( W_{i,t} \) is the firm-specific return of company i in week t. Using \( W_{i,t} \), we derive NCSKEW, a proxy for stock price crash risk, as follows:
\( {NCSKEW_{i,t}}=-[\frac{{n(n-1)^{\frac{3}{2}}}\sum W_{i,t}^{3}}{(n-1)(n-2){(\sum W_{i,t}^{2})^{\frac{3}{2}}}}] \) (4)
To clarify, a stock with a higher NCSKEW value tends to be more susceptible to crashes and vice versa. To keep our results robust, we also calculate down-to-up volatility (DUVOL) as another measurement for stock crash risk [31].
\( {DUVOL_{i,t}}=log{[\frac{({n_{u}}-1)\sum _{DOWN}W_{i,t}^{2}}{({n_{d}}-1)\sum _{UP}W_{i,t}^{2}}]} \) (5)
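Equations (4) and (5) can be sketched in code as follows. The sketch takes a firm's weekly firm-specific returns for one year as given and applies the two formulas literally. Three points are assumptions not stated in the text: n is the number of weekly observations in the year, "down" weeks are those with a return below the annual mean (with n_d and n_u the counts of down and up weeks), and the logarithm is the natural logarithm. Function names and data are illustrative.

```python
import math

def ncskew(w):
    """Negative coefficient of skewness of firm-specific weekly returns, Eq. (4)."""
    n = len(w)
    if n < 3:
        raise ValueError("need at least three weekly returns")
    sum_sq = sum(x ** 2 for x in w)
    sum_cu = sum(x ** 3 for x in w)
    return -(n * (n - 1) ** 1.5 * sum_cu) / ((n - 1) * (n - 2) * sum_sq ** 1.5)

def duvol(w):
    """Down-to-up volatility, Eq. (5); down weeks lie below the sample mean (assumption)."""
    mean_w = sum(w) / len(w)
    down = [x for x in w if x < mean_w]
    up = [x for x in w if x >= mean_w]
    n_d, n_u = len(down), len(up)
    if n_d < 2 or n_u < 2:
        raise ValueError("need at least two down weeks and two up weeks")
    ratio = ((n_u - 1) * sum(x ** 2 for x in down)) / ((n_d - 1) * sum(x ** 2 for x in up))
    return math.log(ratio)

# Hypothetical returns: three mild gains and two sharper losses (left-skewed).
weekly = [0.02, 0.02, 0.02, -0.03, -0.03]
crash_skew = ncskew(weekly)
crash_duvol = duvol(weekly)
```

For this left-skewed example both measures are positive, matching the interpretation that higher values indicate greater crash risk.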
3.2.3. Other variables
In line with existing research [4, 8, 23], we selected the following control variables: ROE (Return on Equity), growth (growth rate), stockreturn (stock return), lev (leverage), BM (book-to-market ratio), tobinQ (Tobin's Q), dturn (stock turnover), size (firm size), and Big4 (indicator for Big Four auditors). These variables (as shown in Table 1) are selected based on their established relationship with firm performance and risk.
To test H2 and explore the underlying mechanism, we examine a potential mediator (internalcontrol), which proxies for internal control quality using the DIB China Listed Companies Internal Control Index [34-35].
Table 1. Variable measurement
Type | Variable | Definition | Measurement |
Control variables | ROE | Return on equity | Net Income / Average Shareholders' Equity × 100% |
 | growth | Revenue growth rate | (Current Period Revenue - Previous Period Revenue) / Previous Period Revenue × 100% |
 | stockreturn | Stock return rate | (Ending Stock Price - Beginning Stock Price + Dividends) / Beginning Stock Price × 100% |
 | lev | Leverage ratio | Total Liabilities / Total Assets × 100% |
 | BM | Book-to-market ratio | Book Value of Equity / Market Value of Equity |
 | tobinQ | Tobin's Q | (Market Value of Equity + Total Liabilities) / Total Assets |
 | Dturn | The change rate of turnover ratio; the difference between the current year's turnover ratio and the previous year's turnover ratio, divided by the current year's turnover ratio. | (Current year's turnover ratio - Previous year's turnover ratio) / Current year's turnover ratio |
 | size | Firm size, typically based on total assets | Natural logarithm of Total Assets (ln(Total Assets)) |
 | Big4 | Audit quality indicator, equal to 1 if the firm is audited by a Big Four audit firm, otherwise 0. | Dummy variable (1 = Big Four auditor, 0 = Non-Big Four auditor) |
Mediation variable | internalcontrol | DIB China Listed Companies Internal Control Index, integrating the current status of internal control systems. | DIB China Listed Companies Internal Control Indexing |
3.3. Empirical model
The empirical model can be defined as:
\( {NCSKEW_{i,t}}=α+{β_{1}}{sentiment_{i,t}}+{β_{2}}{ROE_{i,t}}+{β_{3}}{growth_{i,t}}+{β_{4}}{stockreturn_{i,t}}+{β_{5}}{lev_{i,t}}+{β_{6}}{BM_{i,t}}+{β_{7}}{tobinQ_{i,t}}+{β_{8}}{Dturn_{i,t}}+{β_{9}}{size_{i,t}}+{β_{10}}{Big4_{i,t}}+Industry+Year+{ϵ_{i,t}} \) (6)
Where all variables are defined as in Table 1.
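To illustrate how fixed effects enter a specification of this kind, the sketch below implements a one-way within estimator: each variable is demeaned by group before the slope is computed, which removes any time-invariant group effect. This is a deliberate simplification and not the paper's estimation code. It uses a single regressor and one set of group effects (labelled firm here), whereas the full model includes ten regressors plus year effects. All names and data are hypothetical.

```python
from collections import defaultdict

def within_slope(rows):
    """Slope of y on x after removing group means (one-way within estimator).

    rows: list of (group_id, x, y) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for group, x, y in rows:
        s = sums[group]
        s[0] += x
        s[1] += y
        s[2] += 1
    num = den = 0.0
    for group, x, y in rows:
        sx, sy, n = sums[group]
        x_dm, y_dm = x - sx / n, y - sy / n  # deviations from the group mean
        num += x_dm * y_dm
        den += x_dm ** 2
    return num / den

# Hypothetical panel: y = 2 * x plus a firm-specific intercept (10 for A, -5 for B).
panel = [("A", 1.0, 12.0), ("A", 2.0, 14.0), ("A", 3.0, 16.0),
         ("B", 4.0, 3.0), ("B", 6.0, 7.0)]
slope = within_slope(panel)
```

The estimator recovers the common slope even though the two firms have very different intercepts, which is the purpose of including fixed effects.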
4. Results
4.1. Descriptive analysis
Table 2 presents the descriptive statistics of the main variables. The dependent variable, stock price crash risk (NCSKEW), has a mean of -0.330 and a median of -0.288, indicating that most firms in the sample exhibit relatively low crash risk. The minimum and maximum values of NCSKEW (-2.404 and 1.712, respectively) suggest significant variation in crash risk across firms, with a standard deviation of 0.719 reflecting moderate volatility. The key explanatory variable, investor sentiment (sentiment), has a mean of 0.062 and a median of -0.062, both close to zero, indicating that sentiment is generally balanced between optimism and pessimism. However, the wide range (-0.935 to 2.060) and a standard deviation of 0.592 highlight substantial fluctuations in sentiment over time.
Table 2. Statistics of main variables for each model
Variable | mean | sd | min | p50 | max | count |
sentiment | 0.062092 | 0.5917303 | -0.934765 | -0.0617971 | 2.060167 | 29203 |
NCSKEW | -0.3302244 | 0.7190124 | -2.404438 | -0.2881167 | 1.711655 | 29203 |
ROE | 0.0510716 | 0.1519295 | -0.9442208 | 0.0663336 | 0.3359866 | 29203 |
growth | 0.158368 | 0.3791702 | -0.5887637 | 0.1018178 | 2.406837 | 29203 |
stockreturn | 0.1311219 | 0.5407138 | -0.660803 | 0.003997 | 2.75 | 29203 |
lev | 0.4376431 | 0.2012745 | 0.066164 | 0.4316978 | 0.9011077 | 29203 |
BM | 0.6226063 | 0.254034 | 0.123555 | 0.616717 | 1.174154 | 29203 |
tobinQ | 2.042932 | 1.298765 | 0.851677 | 1.621489 | 8.093589 | 29203 |
dturn | 5.634927 | 4.191487 | 0.5913387 | 4.457237 | 20.51956 | 29203 |
size | 22.33051 | 1.292237 | 19.68826 | 22.14792 | 26.15236 | 29203 |
Big4 | 0.0654385 | 0.2473022 | 0 | 0 | 1 | 29203 |
N | 29203 |
4.2. Baseline Regression
Table 3 indicates that investor sentiment (sentiment) has a statistically significant positive effect on stock price crash risk (NCSKEW). Specifically, the coefficient of sentiment is 0.084 without fixed effects (column (1)) and 0.165 with firm and year fixed effects (column (2)), suggesting that higher investor sentiment is associated with a higher likelihood of stock price crashes regardless of whether we control for firm and year fixed effects. When clustering by industry, the results remain consistent with our expectations (column (3)).
Table 3. Baseline regression results
NCSKEW | (1) | (2) | (3) |
sentiment | 0.084*** | 0.165*** | 0.165*** |
 | (0.011) | (0.015) | (0.014) |
ROE | 0.021 | -0.057 | -0.057 |
 | (0.030) | (0.035) | (0.050) |
growth | 0.055*** | 0.049*** | 0.049*** |
 | (0.011) | (0.012) | (0.013) |
stockreturn | -0.179*** | -0.191*** | -0.191*** |
 | (0.010) | (0.014) | (0.021) |
lev | -0.029 | -0.075 | -0.075* |
 | (0.025) | (0.047) | (0.040) |
BM | 0.066* | 0.319*** | 0.319*** |
 | (0.035) | (0.051) | (0.047) |
tobinQ | 0.021*** | 0.021** | 0.021** |
 | (0.006) | (0.008) | (0.009) |
dturn | -0.021*** | -0.024*** | -0.024*** |
 | (0.001) | (0.002) | (0.002) |
size | -0.025*** | -0.030** | -0.030** |
 | (0.005) | (0.013) | (0.014) |
Big4 | -0.024 | -0.043 | -0.043 |
 | (0.018) | (0.040) | (0.047) |
_cons | 0.280** | 0.275 | 0.275 |
 | (0.111) | (0.276) | (0.303) |
Firm fixed effect | No | Yes | Yes |
Year fixed effect | No | Yes | Yes |
N | 29203 | 28812 | 28812 |
* p < 0.10, ** p < 0.05, *** p < 0.01
4.3. Robustness test
To ensure the robustness of our findings, we employ an alternative measure of stock price crash risk, DUVOL (down-to-up volatility). The results using DUVOL as the dependent variable are consistent with those using NCSKEW, confirming that investor sentiment significantly increases crash risk. Table 4 shows that the coefficient of sentiment remains positive and statistically significant, with a magnitude of 0.053. This consistency across different measures of crash risk strengthens the validity of our findings.
Furthermore, we cluster by firm and control for industry and year fixed effects, rather than firm and year fixed effects, and again find that investor sentiment heightens the risk of a stock price crash (coefficient = 0.069, p < 0.01).
Table 4. Robustness check regression results
Variable | (1) DUVOL | (2) NCSKEW |
sentiment | 0.053*** | 0.069*** |
 | (0.006) | (0.009) |
_cons | -0.223*** | -0.334*** |
 | (0.001) | (0.005) |
Firm fixed effect | Yes | No |
Year fixed effect | Yes | Yes |
Industry fixed effect | No | Yes |
N | 28812 | 29203 |
* p < 0.10, ** p < 0.05, *** p < 0.01
4.4. Heterogeneity analysis
This paper builds on previous research [35] by adding interaction terms between sentiment and three important moderators: firm size (size), state-owned enterprise (SOE) status, and internal control deficiencies (IsDeficiency). The results reveal significant heterogeneity:
4.4.1. Internal control deficiencies as moderator
The interaction term sentiment × IsDeficiency is statistically significant at the 5% level (coefficient = 0.039, p < 0.05). This indicates that the positive impact of investor sentiment on crash risk is amplified for firms with internal control deficiencies. Specifically, a one-unit increase in sentiment raises NCSKEW by 0.081 units for firms without deficiencies, but this effect grows to 0.120 units (0.081 + 0.039) for firms with deficiencies (Table 5, column (1)).
4.4.2. State-Owned Enterprise (SOE) status
The interaction term sentiment × SOE is significant at the 1% level (coefficient = 0.056, p < 0.01). For non-SOEs, sentiment has a baseline positive effect on NCSKEW (coefficient = 0.072), but this effect nearly doubles for SOEs (0.072 + 0.056 = 0.128) (Table 5. (2)). This suggests that SOEs, despite their perceived government backing, are more vulnerable to sentiment-driven crashes, potentially due to weaker market discipline, higher opacity, and investor overconfidence in their financial stability.
4.4.3. Firm size
In contrast, the interaction term sentiment × size is statistically insignificant (coefficient = -0.002, p > 0.10) (Table 5. (3)). The lack of significance implies that firm size does not systematically moderate the sentiment-crash risk relationship in our sample.
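The subgroup effects quoted in this section follow from the usual reading of an interaction model: the marginal effect of sentiment equals the sentiment coefficient plus the interaction coefficient times the moderator. A minimal sketch of that arithmetic, using the coefficients reported in Table 5 (the function and variable names are illustrative):

```python
def sentiment_effect(beta_sentiment, beta_interaction, moderator):
    """Marginal effect of sentiment on NCSKEW when the moderator takes a given value."""
    return beta_sentiment + beta_interaction * moderator

# Coefficients as reported in Table 5, columns (1) and (2).
effect_no_deficiency = sentiment_effect(0.081, 0.039, 0)
effect_deficiency = sentiment_effect(0.081, 0.039, 1)
effect_non_soe = sentiment_effect(0.072, 0.056, 0)
effect_soe = sentiment_effect(0.072, 0.056, 1)
```

For a binary moderator, the interaction coefficient is simply the difference between the two subgroup effects.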
Table 5. Heterogeneity regression results
Variable | (1) | (2) | (3) |
sentiment | 0.081*** | 0.072*** | 0.136 |
 | (0.011) | (0.012) | (0.140) |
IsDeficiency | 0.042*** | | |
 | (0.014) | | |
sentiment_x_IsDeficiency | 0.039** | | |
 | (0.018) | | |
SOE | | -0.043 | |
 | | (0.031) | |
sentiment_x_SOE | | 0.056*** | |
 | | (0.016) | |
size | | | 0.021* |
 | | | (0.011) |
sentiment_x_size | | | -0.002 |
 | | | (0.006) |
_cons | -0.349*** | -0.320*** | -0.801*** |
 | (0.006) | (0.013) | (0.245) |
Year fixed effect | Yes | Yes | Yes |
Firm fixed effect | Yes | Yes | Yes |
N | 28812 | 28208 | 28812 |
* p < 0.10, ** p < 0.05, *** p < 0.01
4.5. Mediation mechanism
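Table 6 shows that investor sentiment is positively associated with internal control quality (coefficient = 4.295) and that higher internal control quality is associated with lower crash risk (coefficient = -0.0001423); the two estimates are significant at the 5% and 1% levels, respectively. This indicates that internal control quality acts as a channel between investor sentiment and crash risk: higher sentiment coincides with stronger internal control, which in turn mitigates crash risk. These results provide evidence on the mediating role of internal control examined under H2.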
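One common way to gauge the size of such a channel is the product-of-coefficients approach, which multiplies the effect of sentiment on the mediator by the effect of the mediator on the outcome. The sketch below applies that convention to the two coefficients quoted above. It is an illustrative calculation under that assumption and is not a procedure described in the paper. The names are hypothetical.

```python
def indirect_effect(a_path, b_path):
    """Product-of-coefficients indirect effect: (sentiment -> mediator) * (mediator -> outcome)."""
    return a_path * b_path

A_PATH = 4.295        # sentiment -> internalcontrol, Table 6 column (1)
B_PATH = -0.0001423   # internalcontrol -> NCSKEW, as quoted in the text

indirect = indirect_effect(A_PATH, B_PATH)
```

Under this convention the implied indirect effect is roughly -0.0006 per unit of sentiment, which is small relative to the sentiment coefficients in the baseline regressions.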
Table 6. Mediation regression results
Variable | (1) internalcontrol | (2) NCSKEW |
sentiment | 4.295** | |
 | (1.684) | |
internalcontrol | | -0.000*** |
 | | (0.000) |
_cons | 640.835*** | -0.240*** |
 | (0.666) | (0.025) |
Year fixed effect | Yes | Yes |
Firm fixed effect | Yes | Yes |
N | 28812 | 28812 |
* p < 0.10, ** p < 0.05, *** p < 0.01
5. Conclusion
This study examines the relationship between investor sentiment, internal control quality, and stock price crash risk using a sample of Chinese A-share listed companies from 2007 to 2022. The study finds that investor sentiment significantly exacerbates crash risk, particularly in firms with internal control deficiencies and state-owned enterprises (SOEs). Mediation tests reveal that higher investor sentiment enhances internal control quality, which in turn mitigates crash risk, highlighting the critical role of governance mechanisms in coping with market instability.
Several limitations of this study warrant attention. First, the PCA-based sentiment measure captures outcomes rather than causes of sentiment. Future research could use advanced NLP techniques like Large Language Models (LLMs) or BERT to analyze social media content for more direct sentiment measurement [36-37]. Second, while fixed effects mitigate some endogeneity concerns, issues like reverse causality may persist. Addressing this could involve using instrumental variables (e.g., regulatory shocks) or natural experiments. Third, the mediation effect of internal control quality, though statistically significant, has limited economic significance (the coefficient rounds to -0.000). Future studies could explore alternative mediators (e.g., auditor quality) or complementary mechanisms like ESG performance.
These findings have important implications. For researchers, this study adds to behavioral finance by showing how sentiment, internal governance, and market stability are connected, filling a research gap on the mediation mechanism. For policymakers and managers, the results suggest strengthening internal controls and enhancing transparency, particularly in SOEs and firms with governance deficiencies, to mitigate stock price crash risk and build a more stable capital market environment. Further research could explore external mechanisms linking sentiment to market stability and the role of regulatory interventions on social media, using recent Large Language Models (LLMs) or machine learning techniques.