
The authors have declared that no competing interests exist.

Conceived and designed the experiments: XQS HWS XQC. Performed the experiments: XQS HWS. Analyzed the data: XQS. Contributed reagents/materials/analysis tools: YZ. Wrote the paper: XQS HWS.

Stock price prediction is an important and challenging problem in stock market analysis. Existing prediction methods either exploit the autocorrelation of stock price and its correlation with the supply and demand of the stock, or explore predictive indicators exogenous to the stock market. In this paper, using transaction records of stocks that include trader identifiers, we introduce an index to characterize market confidence, i.e., the ratio of the number of traders who are active in two successive trading days to the number of active traders in a certain trading day. Strong Granger causality is found between the index of market confidence and stock price. We further predict stock price by incorporating the index of market confidence into a neural network based on the time series of stock price. Experimental results on 50 stocks in two Chinese Stock Exchanges demonstrate that the accuracy of stock price prediction is significantly improved by the inclusion of the market confidence index. This study sheds light on using

With the increasing availability of huge databases for financial systems, financial study has become a hot research topic. Scientists have attempted to understand the statistical mechanics of financial systems, e.g., by analyzing the long-term trend and fluctuation of stock indices [

Existing methods for stock price prediction generally fall into three main paradigms. The first kind of method predicts stock price by exploiting the autocorrelation of price dynamics [

Stock price is actually the outcome of a game among investors. For a particular stock, each investor has his/her own estimate of its fundamental value. Investors submit ask or bid prices to the stock exchange, and these prices reflect their estimates of the value of the stock. The stock exchange then executes ask orders and bid orders according to predefined rules, and the stock price results from the execution of these orders. In this process, investors’ estimates of the value of the stock are the determinant of stock price. The estimated value of a stock fully reflects investors’ judgment based on all the relevant information they can obtain about the stock. The decision to buy or sell a certain stock reflects a trader’s expectation of the price trend. The decision of an individual investor may be based on incomplete information, but the collective behavior of all investors can compensate for the lack of information and ultimately determines the price of a stock. Therefore, stock price dynamics reflect investors’ estimates of the trend of the value of the stock. Investors’ confidence in their estimates offers a natural predictive indicator for stock price.

In this paper, we study the problem of stock price prediction from the perspective of market confidence. Unlike existing methods, which attempt to capture market confidence from investors’ behavior in contexts exogenous to the stock market, we propose to extract market confidence directly from stock transaction records. Our study is conducted on transaction data of 50 stocks in two Chinese Stock Exchanges. The data contain the identifier of the traders in each transaction, allowing us to identify how active each trader is. To characterize market confidence, we measure the fraction of traders who also participated in trades on the immediately preceding trading day. Using the Granger causality test, we find that stock price is strongly correlated with an index of market confidence, i.e., the ratio of the number of traders who are active in two successive trading days to the number of active traders in a certain trading day. By combining the market confidence index with the time series of stock price, we propose a stock price prediction model based on a feed forward neural network. Results demonstrate that the accuracy of stock price prediction is significantly improved by the inclusion of the market confidence index. Furthermore, we investigate four types of trading patterns for traders who participate in trades in two successive trading days: buy-buy, buy-sell, sell-buy, and sell-sell. We find a negative correlation between the buy-buy trading pattern and price return, and sell-sell is the pattern most relevant to stock price. For manipulated stocks, buy-sell is the trading pattern most relevant to stock price.
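As a rough illustration of the index described above, the following Python sketch computes, for each trading day, the share of that day's active traders who were also active on the previous trading day. The record layout `(day, buyer_id, seller_id)` and the function name `market_confidence` are hypothetical and are not taken from the data set used in this paper. The choice of the current day's active traders as the denominator is an assumption based on the wording above.

```python
from collections import defaultdict


def market_confidence(transactions):
    """Return {day: index} for every trading day after the first.

    `transactions` is an iterable of (day, buyer_id, seller_id) tuples,
    a hypothetical layout. Days must sort chronologically (e.g. ISO dates).
    """
    traders_by_day = defaultdict(set)
    for day, buyer_id, seller_id in transactions:
        # A trader counts as active on a day if they appear on either side.
        traders_by_day[day].update((buyer_id, seller_id))

    # Days that appear in the data are treated as successive trading days.
    days = sorted(traders_by_day)
    index = {}
    for previous_day, day in zip(days, days[1:]):
        today = traders_by_day[day]
        returning = today & traders_by_day[previous_day]
        # Assumption: the denominator is the number of traders active on `day`.
        index[day] = len(returning) / len(today)
    return index


# Toy records, invented purely for illustration.
records = [
    ("2004-01-05", "a", "b"),
    ("2004-01-05", "c", "d"),
    ("2004-01-06", "a", "e"),
    ("2004-01-06", "f", "d"),
]
confidence = market_confidence(records)
print(confidence)
```

In the toy records, traders `a` and `d` are active on both days and four traders are active on the second day, so the index for the second day is 2/4.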

Price is determined by supply and demand. In the stock market, supply and demand are reflected by the trading activity of investors, i.e., selling and buying. Stock price is correlated with the number of sellers and the number of buyers [

For clarity, we separately show the results of manipulated stocks and non-manipulated stocks. Stocks are ranked according to the statistical significance.

The unreliability of supply and demand in predicting stock price motivates us to look for alternative indicators in investors’ trading behavior. From this perspective, our data possess a unique advantage: they contain the trader identifier in each executed transaction. With trader identifiers, we can investigate the trading behavior of the same trader across trading days. Based on cross-day trading behavior, we propose an index to characterize the market confidence of investors. Specifically, for a given trading day

We now validate whether the proposed market confidence index is effective at predicting the change of stock price. Given a stock, for each trading day _{t} (See

We use two metrics, i.e., accuracy and MAPE (see

We also apply the method to the manipulated stock data set. MAPE is below 5% for eight manipulated stocks, while the prediction accuracy is below 66%, which is lower than for non-manipulated stocks. One possible reason is that the price of these stocks is manipulated by colluding traders and therefore becomes less predictable from the supply-demand relationship. Manipulation detection is a research topic highly relevant to stock market analysis, but it is beyond the scope of this paper. Here we show the difference between manipulated and non-manipulated stocks by analyzing different types of trading relationships.
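For readers who want to reproduce the two metrics quoted above, the sketch below gives one common way to compute them in Python. The function names are hypothetical. Two choices are assumptions rather than details from this paper: days whose real value is zero are skipped in MAPE, and a zero change counts as "not a rise" in the accuracy.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: pairs whose actual value is zero are skipped, because the
    percentage error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def direction_accuracy(actual_change, predicted_change):
    """Fraction of days on which the predicted rise/fall matches the real one.

    Assumption: a change of exactly zero is treated as "not a rise".
    """
    if len(actual_change) != len(predicted_change) or not actual_change:
        raise ValueError("series must be non-empty and of equal length")
    hits = sum((a > 0) == (p > 0) for a, p in zip(actual_change, predicted_change))
    return hits / len(actual_change)


# Toy values, invented purely for illustration.
real = [0.02, -0.01, 0.04, -0.05]
forecast = [0.01, -0.02, -0.02, -0.05]
print(mape(real, forecast), direction_accuracy(real, forecast))
```

The two metrics can disagree: a forecast may get the sign of the change right on most days and still have a large percentage error, which is why both are reported.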

In the previous section, we saw that the proposed method for stock price prediction performs differently on manipulated and non-manipulated stocks. To clarify what matters in the proposed prediction method, we classify active traders (i.e., the traders who participate in trades in two successive trading days) into four categories according to their trading patterns in the two days: B-B, B-S, S-B, and S-S. The first letter denotes buy or sell on the first day and the second letter denotes buy or sell on the second day. We denote the number of traders in each category as _{B−B}, _{B−S}, _{S−B}, and _{S−S}. We then analyze the correlation between these four trading patterns and stock price.
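The classification above can be sketched in a few lines of Python. The input layout is hypothetical: each day is a mapping from trader identifier to `"B"` or `"S"`. It assumes each trader has a single side per day. How a trader who both buys and sells on the same day is handled is not specified here, so any such rule (for example, net direction) would be a further assumption.

```python
def pattern_counts(first_day, second_day):
    """Count traders in each two-day pattern: B-B, B-S, S-B, S-S.

    Each argument maps trader id -> "B" (buy) or "S" (sell) for one trading
    day. This layout is hypothetical, with one side per trader per day.
    """
    counts = {"B-B": 0, "B-S": 0, "S-B": 0, "S-S": 0}
    # Only traders present on both days are active traders in this sense.
    for trader in first_day.keys() & second_day.keys():
        counts[f"{first_day[trader]}-{second_day[trader]}"] += 1
    return counts


# Toy data, invented purely for illustration.
day_1 = {"a": "B", "b": "S", "c": "B", "d": "S"}
day_2 = {"a": "B", "b": "B", "c": "S", "e": "S"}
counts = pattern_counts(day_1, day_2)
print(counts)
```

Traders `d` and `e` appear on only one of the two days and are therefore left out of all four categories.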

The price is presented in the upper panel and the four kinds of trading patterns in two successive days are shown in the bottom panel.

Active traders provide critical indicators for understanding trading behavior. As an illustration, we now show that the distribution of active traders over the four categories can differentiate manipulated stocks from non-manipulated stocks. _{B−B} and the change of stock price for most stocks. Compared with non-manipulated stocks, manipulated stocks behave differently. We find that the correlation between _{B−B} and the change of stock price is not significant. This has two implications. First, price determines traders’ behavior for non-manipulated stocks. If some people buy on the first day and still buy on the second day, the stock price falls. For manipulated stocks, this phenomenon diminishes. Second, compared to other trading patterns,

Stocks are ranked in terms of correlation coefficient.

We investigated the dynamic behavior of traders in stock markets. Our study is based on transaction data, i.e., the executed transactions generated by the electronic trading system of a stock exchange. This kind of data offers an effective way to capture the trading relationships among investors and a potential way to learn their trading behavior. Based on transaction data, we consider the supply-demand relationship for each stock and study whether trading behavior can predict the change of stock price.

Using transaction records of stocks that include trader identifiers, we introduce an index to characterize investors’ confidence in the estimated value of a stock. Strong Granger causality is found between stock price and the index of market confidence, i.e., the ratio of the number of traders who are active in two successive trading days to the number of active traders in a certain trading day. We further deployed a feed forward neural network to predict stock price, with historical stock price, trading activity, and market confidence as inputs. Results showed that the inclusion of market confidence significantly improves the prediction accuracy of stock price. We then divided the behavior of active traders into four trading patterns: buy-buy, buy-sell, sell-buy, and sell-sell. We find a negative correlation between the buy-buy pattern and the change of stock price, and sell-sell is the pattern most relevant to the change of stock price. For manipulated stocks, buy-sell is most relevant to the change of stock price. This suggests that manipulators affect stock price through frequent short-term trading of shares.

The data used in this paper are transaction data of stocks listed on Shanghai Stock Exchange and Shenzhen Stock Exchange in 2004. This data is also used in our previous studies [

Following the method used in our previous works [

In this paper, we use the Granger causality test to judge whether trading activity and market confidence are useful for forecasting the change of stock price. The change of stock price at day _{t} is the opening price at day

The Granger causality test provides some insight into which trading activities have potential for predicting stock price. However, the test is based on a linear regression model and thus cannot uncover factors that are non-linearly predictive of stock price. To address this problem, we develop a three-layered feed forward neural network, a non-linear model that can more fully exploit the predictive power of its inputs.
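To make the linear nature of the Granger test concrete, the following sketch computes the usual F statistic by comparing two ordinary least squares regressions: one that predicts a series from its own lags only, and one that also includes lags of the candidate predictor. It is a generic textbook formulation written with NumPy, not the exact procedure or lag choice used in this paper. The function names and the synthetic series are hypothetical.

```python
import numpy as np


def lagged(series, lags):
    """Matrix whose columns are series[t-1], ..., series[t-lags]."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])


def granger_f_statistic(y, x, lags=1):
    """F statistic for the null hypothesis 'x does not Granger-cause y'."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[lags:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lagged(y, lags)])        # own lags only
    unrestricted = np.hstack([restricted, lagged(x, lags)])  # plus lags of x

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        return float(residual @ residual)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic series, invented for illustration: y follows x with a one-day lag.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=299)

f_forward = granger_f_statistic(y, x)   # does x help predict y?
f_backward = granger_f_statistic(x, y)  # does y help predict x?
print(f_forward, f_backward)
```

A p-value would follow from the F distribution with `lags` and `dof` degrees of freedom. Because both regressions are linear, a predictor that influences the target only through a non-linear relationship can yield a small statistic, which is the limitation noted above.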

To train the neural network, we divide all the data into two equal-sized parts: the training set and the test set. For the stocks in the training set, the future stock price is used to train the neural network. For the stocks in the test set, only the past time series of stock price, trading activity, and market confidence are known. To assess the role of trading activity and market confidence, we compare the performance of neural networks with two different sets of inputs: (1) the time series of stock price and the time series of trading activity; (2) the time series of stock price, the time series of trading activity, and the time series of market confidence. The output of the neural networks is the change of stock price.
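The comparison between the two input sets can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the network used in this paper: the hidden-layer size, tanh activation, learning rate, number of epochs, and the function name `train_mlp` are all hypothetical. The data are synthetic and are constructed so that the target depends on the confidence series, so the printed numbers demonstrate only the mechanics of the comparison and say nothing about real stocks.

```python
import numpy as np


def train_mlp(X, y, hidden=8, epochs=2000, lr=0.05, seed=0):
    """Fit an input-hidden-output network (tanh hidden units, linear output)
    by full-batch gradient descent on half the mean squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)   # hidden activations
        err = H @ W2 + b2 - y      # prediction error per sample
        # Back-propagate through the tanh layer before updating W2.
        grad_H = np.outer(err, W2) * (1.0 - H**2)
        W2 -= lr * H.T @ err / len(y)
        b2 -= lr * err.mean()
        W1 -= lr * X.T @ grad_H / len(y)
        b1 -= lr * grad_H.mean(axis=0)
    return lambda X_new: np.tanh(X_new @ W1 + b1) @ W2 + b2


# Synthetic data, invented for illustration only.
rng = np.random.default_rng(1)
n = 400
past_change = rng.normal(size=n)
activity = rng.normal(size=n)
confidence = rng.normal(size=n)
next_change = (0.3 * past_change + 0.8 * np.tanh(confidence)
               + 0.1 * rng.normal(size=n))

half = n // 2  # two equal-sized parts: training set and test set
input_sets = {
    "price + activity": np.column_stack([past_change, activity]),
    "price + activity + confidence": np.column_stack(
        [past_change, activity, confidence]),
}
test_mse = {}
for name, X in input_sets.items():
    predict = train_mlp(X[:half], next_change[:half])
    test_mse[name] = float(np.mean((predict(X[half:]) - next_change[half:]) ** 2))
print(test_mse)
```

Holding the architecture, seed, and split fixed across the two runs means that any difference in test error can be attributed to the extra input series rather than to training noise.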

The effectiveness of the prediction method is measured in terms of the Mean Absolute Percentage Error (MAPE) and the accuracy in predicting the rise or fall of stock price. MAPE evaluates the accuracy of the predicted time series relative to the real time series. Denoting with _{t} the real change of stock price and _{t} the predicted change of stock price, MAPE is defined as: