MULTI-DATA MINING FOR UNDERSTANDING LEADERSHIP BEHAVIOR

We propose an approach for understanding leadership behavior in dot-jp, a non-profit organization, by analyzing heterogeneous multi-data composed of questionnaires and mailing list archives. Attitudes toward leaders were obtained from the questionnaires, and human networks were extracted from the mailing list archives. By integrating the results, we discovered that leaders must receive messages from other people as well as send messages to construct reliable relationships.


INTRODUCTION
We studied dot-jp (http://www.dot-jp.or.jp/), a non-profit organization (NPO) in Japan that organizes seminars and internship programs, which give university students the opportunity to participate in political activities with Diet members. Through the internship program, students learn how Diet members engage in political activities.
The headquarters of dot-jp is in Osaka, and seven branch offices are distributed all over Japan (Branches A, B, C, D, E, F, and G; see footnote 1). Each branch office has nine to twenty-one staff members, and one of the staff members is appointed to be the area manager responsible for managing all staff members.
Most staff members in dot-jp are university students, and having face-to-face meetings with each other is difficult because of the distances involved. For these reasons, staff members primarily use e-mail to exchange information, plan and arrange events, and discuss matters. Communication via e-mail creates a virtual office and complements real office communication. An overview of the e-mail archives in the seven branch offices of the 14th period (from October 2004 to March 2005) that we analyzed is shown in Table 1. Thousands of e-mails were exchanged during the period.
Footnote 1: We used fake names to maintain secrecy.
Data Science Journal, Volume 6, Supplement, 6 March 2007
Table 1. Number of staff members and e-mails exchanged during the 14th period (October 2004-March 2005)
Much research on NPOs has shown that reliable relationships among staff members are crucial to making the most of human and knowledge capital (Krackhardt & Hanson, 1993; Drucker, 1993; Wenger, 1999).
Leadership behaviors also play an important role in determining the atmosphere or culture of an organization (Perkins & Wilson, 1999) and the ability to create knowledge from the experiences of staff members (Krogh, Ichijo, & Nonaka, 2002). In an ongoing project about NPOs (OSSIP04, 2004), the aspects of capital, scale of operation, human resources, and partnerships with governments and organizations are being studied. However, the relationships among staff members are not being investigated, although they are quite relevant for sharing missions and motivating voluntarism.
We propose an approach for understanding leadership behavior in organizations by using questionnaires and human influence networks. We first analyzed the results of a questionnaire completed by 97 staff members working in dot-jp to understand their degrees of satisfaction with leaders. Then, we extracted human influence networks from the e-mail archives used at dot-jp to understand the relationships between leaders and other staff members. Finally, we integrated the results of the questionnaire with the human influence networks and identified reliable leadership behaviors.

QUESTIONNAIRES
Understanding the degree of satisfaction of staff members in an organization can be a key to discovering specific problems related to faults in human resource management (Cohen & Prusak, 2001). We sent questionnaires to all 104 staff members working for dot-jp to understand their degree of satisfaction, specifically with their managers and their branch offices. Then, by comparing the questionnaire results received from 97 staff members (response rate: 93%) with the achievement rate of activity, explained later, we determined factors that lead to successful activities at dot-jp.

Questionnaires
To investigate the degree of satisfaction of staff members with their branch offices and with their area, seminar, and internship managers, we sent questionnaires to 104 staff members working in the seven branch offices in March 2005, the last month of the 14th period. The questions in the questionnaire were as follows:
- Q1. Please rate your degree of satisfaction with your branch office.
- Q2. Please rate your degree of satisfaction with the area manager in your branch office.
- Q3. Please list up to three substantive leaders in your branch office.

Achievement Rate
The activity for each branch office was numerically evaluated by comparing the total number of students coming to seminars and Diet members accepting student interns with the desired number for both groups set at an early stage of each period. The achievement rate was calculated using the following equation:
Achievement rate (%) = (x / y) * 100
where
x = actual number of students and Diet members
y = desired number of students and Diet members
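The calculation above can be sketched in Python as follows. This is a minimal illustration only: the function name `achievement_rate` and all numbers are assumptions made for the example and are not values reported in the study.

```python
def achievement_rate(actual: int, desired: int) -> float:
    """Return the achievement rate in percent, i.e. x / y * 100."""
    if desired <= 0:
        raise ValueError("desired must be a positive number")
    return actual / desired * 100


# Hypothetical numbers for illustration only (not taken from the study).
# x: students who came to seminars + Diet members who accepted interns.
actual_total = 45 + 15
# y: the targets for both groups set at an early stage of the period.
desired_total = 50 + 25

rate = achievement_rate(actual_total, desired_total)
print(f"Achievement rate: {rate:.1f}%")
```

Because students and Diet members are summed before dividing, a shortfall in one group can be offset by a surplus in the other, which follows directly from the definition of x and y.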

Assumptions of Leadership Behavior
The results of the questionnaire (Q1, Q2, Q3, and Q4) and the achievement rate for each branch office are shown in Table 2. To understand the relationships between the degrees of satisfaction and the achievement rates, we classified the branch offices into three groups based on their achievement rates.
- Group-1: The high-achievement-rate group, including Branches A, B, and C. Staff members in Branches A and B had high degrees of satisfaction with their branch offices and area managers. Staff members in Branch C had a high degree of satisfaction with their branch office but a low degree of satisfaction with their area manager.
- Group-2: The middle-achievement-rate group, including Branches D and E. The degree of satisfaction with the area manager in Branch D was quite high compared to that in Branch E.
- Group-3: The low-achievement-rate group, including Branches F and G. The degree of satisfaction with the area manager in Branch G was much higher than that in Branch F.
Table 2. Achievement rate and averaged degree of satisfaction for each branch office and area manager.
We do not regard these groups as a classification of the management ability of area managers, because the achievement rates depend on uncontrollable external factors. Within a single group, however, the effects related to the achievement rate are excluded, so the degrees of satisfaction reflect the management ability of the area managers. We therefore made an assumption regarding the management ability of area managers by comparing branch offices in the same group that had different degrees of satisfaction with their area managers.

Assumption:
The high degrees of satisfaction with area managers come from a high level of management ability.
We interviewed 10 staff members individually and face-to-face and confirmed that the assumption was empirically sound (i.e., the area managers of Branches C, E, and F were not trusted regarding their management ability, whereas the area managers of the other branches had the esteem of their staff members). In the following sections, we further investigate leadership behavior by comparing this assumption with the human influence networks obtained from the e-mail archives.

IDM
The Influence Diffusion Model (IDM) was originally an algorithm for measuring the influence values of messages, senders, and terms in threaded messages (Matsumura, 2003; Matsumura, 2005a; Matsumura, 2005b). In the IDM, the influence between a pair of senders is defined as the number of terms that propagate from one sender to the other via messages. Details are given in the above references.

By mapping the propagating terms between senders, a human influence network can be obtained, where the number of outgoing terms corresponds to the influence value ($I^{out}_p$) and the number of incoming terms corresponds to the influenced value ($I^{in}_p$) of a sender p. From the resulting graph, the relationships of influence among the staff members can be visually understood.
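To make the idea of outgoing and incoming terms concrete, the following Python sketch builds a small influence network from threaded messages. It is a deliberate simplification and not the IDM as defined in the cited references: it counts only the terms shared between a message and its direct reply, and the message format, function names, and toy thread are all assumptions made for illustration.

```python
from collections import defaultdict

# Each message: (message_id, parent_id or None, sender, set of terms).
# Toy thread invented for this sketch; it is not data from dot-jp.
messages = [
    ("m1", None, "s1", {"seminar", "schedule", "venue"}),
    ("m2", "m1", "s2", {"seminar", "venue", "budget"}),
    ("m3", "m2", "s1", {"budget", "venue"}),
    ("m4", "m1", "s3", {"schedule"}),
]


def influence_links(messages):
    """Count terms passed from each message's sender to the sender of a direct reply."""
    by_id = {mid: (sender, terms) for mid, _, sender, terms in messages}
    links = defaultdict(int)
    for _, parent_id, sender, terms in messages:
        if parent_id is None:
            continue  # a thread root receives no terms from a parent
        parent_sender, parent_terms = by_id[parent_id]
        if parent_sender == sender:
            continue  # ignore replies to one's own message
        shared = len(terms & parent_terms)
        if shared:
            links[(parent_sender, sender)] += shared
    return dict(links)


def influence_values(links):
    """Sum outgoing terms (influence) and incoming terms (influenced) per sender."""
    i_out, i_in = defaultdict(int), defaultdict(int)
    for (src, dst), weight in links.items():
        i_out[src] += weight
        i_in[dst] += weight
    return dict(i_out), dict(i_in)


links = influence_links(messages)
i_out, i_in = influence_values(links)
print(links)
print(i_out, i_in)
```

In this toy thread, sender s1 passes two terms to s2 and one term to s3, while s2 passes two terms back to s1, so s1 both influences and is influenced. A faithful implementation should follow the definitions in the cited references rather than this direct-reply approximation.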

Verification of Assumption
We applied the IDM to the e-mail archives of each branch office and obtained $I^{out}_p$ and $I^{in}_p$ as the influence and influenced values of each staff member. The top six staff members by $I^{out}_p$ for each branch office are shown with $I^{in}_p$ in Table 3, where the same names across branch offices represent distinct staff members. Staff names marked with "*" (asterisk) represent area managers. From Table 3, the following tendencies can be seen:
- Three out of seven area managers have the highest $I^{out}_p$ value in their branch offices.
- Those who have high $I^{out}_p$ values do not always have high $I^{in}_p$ values, and vice versa.
Table 3. For each branch office, the top six staff members by $I^{out}_p$ are shown with $I^{in}_p$. Note that the same staff names in different branches represent distinct staff members, and staff names marked with "*" (asterisk) represent area managers.
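A table of this shape can be produced from per-person influence and influenced values with a few lines of Python. The sketch below is illustrative only: the function name `top_by_influence`, the staff labels, and every number are hypothetical and are not the values reported in Table 3.

```python
def top_by_influence(i_out, i_in, managers=(), n=6):
    """Rank staff by influence value (highest first), paired with the influenced value."""
    names = set(i_out) | set(i_in)
    rows = []
    for name in names:
        # Mark area managers with an asterisk, as in the table caption.
        label = name + "*" if name in managers else name
        rows.append((label, i_out.get(name, 0), i_in.get(name, 0)))
    # Sort by influence value, descending; break ties by label for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows[:n]


# Hypothetical values for one branch (not taken from the study).
i_out = {"s1": 120, "s2": 95, "s3": 40, "s4": 33, "s5": 12, "s6": 8, "s7": 3}
i_in = {"s1": 30, "s2": 110, "s3": 25, "s4": 60, "s5": 5, "s6": 14, "s7": 9}

table = top_by_influence(i_out, i_in, managers={"s2"})
for label, out_value, in_value in table:
    print(f"{label:<4} {out_value:>5} {in_value:>5}")
```

With these made-up values, the manager s2 ranks second by influence value but has the highest influenced value, which mirrors the observation that the two values do not always go together.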

Interpretation of Human Influence Network
The graphs of human influence networks help us understand the behavior of area managers from a structural point of view. The graphs of the IDM obtained from the e-mail archives of Branches A, C, F, and G are shown in Figure 1 (upper left, upper right, lower left, and lower right, respectively). Each graph is composed of the top six most influential staff members and the links between them whose influence values are greater than 10. In the figure, area managers are depicted as gray nodes, and the relationships among staff members are shown as directed links labeled with influence values. These graphs illustrate the behavior of area managers toward their staff members. We summarized the leadership behaviors based on the human influence networks and the analysis of the questionnaires.
- Interactive behavior: In Branches A, B, and D (for example, Branch A in the upper left of Figure 1), the behavior of the area managers is "interactive": they both give influence to and receive influence from other staff members. The questionnaires showed that these area managers are highly trusted.
- Partially interactive behavior: In Branches E and G (for example, Branch G in the lower right of Figure 1), the behavior of the area managers is "partially interactive"; that is, the area manager receives influence from five staff members while influencing three staff members. Judging from the results of the questionnaires, the area manager in Branch G succeeded in managing the branch because of high influence and influenced values. By contrast, the area manager in Branch E failed because of influence and influenced values that were low compared with those of other managers.
- Preferential behavior: In Branch C (upper right of Figure 1), the behavior of the area manager is "preferential", a stricter form of partially interactive behavior; that is, the area manager influences four staff members and is influenced by only two staff members. The lack of interaction, specifically the manager's low values of influence, resulted in a failure to manage the branch.
- Passive behavior: In Branch F (lower left of Figure 1), the behavior of the area manager was "passive"; that is, the area manager was influenced by four staff members and influenced only one staff member. The apparent lack of communication caused staff members to distrust the area manager in this branch.
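The counts used in the descriptions above (how many staff members a manager influences and is influenced by) can be read off a filtered network. The Python sketch below is a minimal illustration under stated assumptions: the function name `manager_link_counts`, the node labels, and all link weights are hypothetical, and only the "top six" restriction and the "more than 10" threshold come from the text. It does not assign behavior labels, since the text gives no numeric rule for them.

```python
def manager_link_counts(links, top_staff, manager, min_influence=10):
    """Keep links among the top staff with influence above the threshold,
    then count whom the manager influences and who influences the manager."""
    top = set(top_staff)
    kept = {
        (src, dst): weight
        for (src, dst), weight in links.items()
        if src in top and dst in top and weight > min_influence
    }
    influences = {dst for (src, dst) in kept if src == manager}
    influenced_by = {src for (src, dst) in kept if dst == manager}
    return kept, len(influences), len(influenced_by)


# Hypothetical links: (influencer, influenced) -> number of propagated terms.
links = {
    ("m", "s1"): 25,
    ("m", "s2"): 4,    # dropped: influence is not more than 10
    ("s1", "m"): 18,
    ("s2", "m"): 31,
    ("s3", "m"): 12,
    ("s4", "m"): 11,
    ("s1", "s2"): 40,
    ("s9", "m"): 50,   # dropped: s9 is not among the top six
}
top_six = ["m", "s1", "s2", "s3", "s4", "s5"]

kept, n_influences, n_influenced_by = manager_link_counts(links, top_six, "m")
print(f"influences {n_influences}, influenced by {n_influenced_by}")
```

In this made-up network the manager influences one staff member and is influenced by four, the same pattern the text describes as passive behavior.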

CONCLUSION
The activities of NPOs are based on the voluntarism of staff members. Unfortunately, staff members often lose this motivation and eventually leave organizations. Part of the reason is staff members' dissatisfaction with their leaders' lack of management ability. We proposed an approach that integrates questionnaires with IDM analysis and used it to determine ideal leadership behavior.
For future work, we will apply the proposed approach to other popular communication tools, such as weblogs or BBSs, to accumulate further knowledge on leadership behavior. In addition, we are planning to establish guidelines regarding leadership behaviors relating to communication to put the results of our research to use.

Figure 1. Human Influence Networks (Upper left: Branch A, Upper right: Branch C, Lower left: Branch F, Lower right: Branch G)