Knowledge Building in the Fields of STI (Strategic Technical Information) and Geopolitics for Competitive Intelligence: A Strategic Application for the Chinese Market in the Field of Agricultural Biotechnologies in the People's Republic of China (PRC)

In this paper the authors underline the increasing importance of Chinese scientific information. They compare the results of a request launched on Western and Chinese databases concerning Chinese scientific paper publication. Although Chinese scientists are having more and more visibility in international scientific journals, their publications in English are far fewer than those in Chinese. The difference is so drastic that it emphasizes the need for westerners (researchers and observers) to access and master the use of Chinese sources of information. The aim of this paper is to define the specificity of Chinese scientific information and also to present some primary results concerning the automatic analysis of Chinese writing. Our method uses a specific core language close to the point of view of the expert and his or her knowledge that will permit accurate information retrieval from a huge quantity of documents.


INTRODUCTION
The inexorable development of China on the international scenes of economics, politics, and science illustrates the necessity for decision makers to consider Chinese information to be a key issue. Therefore, it is now necessary for decision makers to develop various strategies to understand the development of all types of Chinese activities: not only as a source and sales target for potential markets but also as a scientific and technological actor in world development. The complexity of the Chinese language, however, necessitates the development of a better understanding of Chinese databases as well as the development of special tools to retrieve Chinese information. The aggregation of "genuine" Chinese information with the Chinese information retrieved from Western databases will provide a better overview of Chinese activities.
In this paper, we use the tools that we have developed in our data processing laboratory and apply them to an analysis of the Chinese environment. For this purpose, we naturally consider the different English language information sources available on the World Wide Web, but we also postulate that one cannot have all the relevant information without taking into account information in the Chinese language. Our work is focused on this specific aspect. This research has been done within the framework of an industry-university partnership (the Limagrain Group and the University of Paris Est Marne la Vallée). The results will be applied to implement strategic monitoring of the Chinese seed market. After the presentation of the framework of the study, we introduce the Chinese language information sources and propose an approach to their treatment by the MEVA (Mémoire Evénementielle et Virtuelle d'Actualisation: Updated Event and Virtual Memory) method.

Presentation of the company
The Limagrain Group is an agricultural cooperative comprising 600 stakeholders, which was founded in 1942 with headquarters in Chappes, not far from Clermont-Ferrand in the Auvergne region in the centre of France.
Without detailing the history of the group, we point out some key elements that allowed us to form a comprehensive view of this seed company, which today has become a European leader in marketing corn and wheat and number four in the world in seed production, behind Monsanto, Pioneer, and Syngenta.
Already present in Europe and North America, in 2000, the group started developing a relationship with Asia, establishing itself in Japan, China, Thailand, India, and Australia. Within the framework of its international development project, the group arrived in China in 1997 and set up a joint venture with a Chinese partner, first "to feel" the market. The group, which works in China through many subsidiary companies (Vilmorin, Hazera, Clause-Tézier, etc.), has had difficulty understanding the Chinese market.
The first obstacle was the integration of transcultural dimensions within the group. Previously, company and subsidiary locations outside of France were in Western environments where cultural and linguistic differences were minor. In China, of the 90 collaborators, five speak English fluently, and fewer than five others can express themselves in English for strictly professional communications only. The rest of the staff speaks only Chinese, and even they have to adapt themselves to local dialects according to the areas in which they work. The French national employees based in China (all in Beijing) speak French and English. Thus the decision makers realized the necessity of having a French colleague who could speak and read Chinese and decode local behaviours through knowledge of the local culture. Expectations of the Limagrain Group as far as China is concerned are multiple: penetration of the market, preservation of its world competitiveness, a source of new plant species, and research partnerships in biotechnology.

The complex Chinese equation
Chinese society is in constant change and is evolving at a very fast rate. The Chinese government has adapted itself to the international market system (an important step of the process was China's entry into the WTO in 2001), and the central government has liberalized the economy and supported the creation of private companies.
One can observe the emergence of numerous companies in all branches of industry. It is this phenomenon that created the Chinese market such as it is today: extremely segmented. Following the example of other market sectors, the seed sector is fragmented, with about 3000 Chinese seed companies of all sizes: from a very small company selling only one kind of seed at a local level to national, large scale firms covering all vegetable species and having international ambitions. Thus, it is a very complex proposition to observe the competitiveness of these companies. In the same way, monitoring the scientific activity of these companies becomes difficult as there are 150 listed laboratories working in research fields closely related to those of Limagrain (biotechnology, genomics, plant selection, plant health, etc.).
As China has opened itself to the world with increasing speed since its accession to the WTO in 2001, its legal system is in constant evolution at all levels: import-export regulations, hygiene and safety, agriculture, business, intellectual property, etc. We can, moreover, point out the difficult Chinese equation: a slowly increasing population of 1.3 billion inhabitants and only 13% of the land usable for agriculture (this figure is 54% in France), with this percentage decreasing year by year due to urbanization, soil and water pollution, and the expansion of deserts. Moreover, large losses of cereals and livestock occur each year because of drought and sandstorms in the north and floods and typhoons in the south.
Finally, we must point out recent changes noted during the year 2007:
- The first observation is that most of the important international competitors have set up joint business ventures or research partnerships with Chinese institutions or companies. Actually the international leaders of the seed market are taking positions in China, encouraging the phenomenon of market concentration.
- The second observation is the explosion of the number of high level Chinese scientific articles in international scientific journals.
We are thus facing a complex equation where the various actors in the system are in a position to change the market dynamics significantly.

The need for information
Considering the Chinese agricultural market as a complex system in which it is necessary to identify and reveal the social and economic actors, the need for the Limagrain Group to undertake "academic" research in information sciences through a PhD program became clear. The final goal is to create a scientific, economic, and geopolitical observatory of the market for agricultural biotechnologies in China. There is a strong need to understand the Chinese market through indicators different from those used for the Western market because this market obeys a different dynamic and because available information is not of a comparable nature (see the following section). For example, the usual financial indicators in competitive intelligence are not available in the Chinese context as they are in the West.
The main questions concerning the seed companies belonging to the Beijing ISF (International Seed Federation) branch regarding the Chinese market are:
- Will the Chinese government authorize the use of GMOs (genetically modified organisms) and, if so, when?
- Will Chinese researchers make a breakthrough in the techniques of hybridization?
- Who will be the actors in these movements?
- Who will be the Western partners?
- What can slow down or accelerate such movements?
- What are the forces according to which the concentration of the market takes place?
To answer these questions, one must keep in mind that we are analyzing a very dynamic area that evolves irrevocably. To anticipate this evolution, it is necessary to identify the partners, the networks of influence and lobbying (soft power), the categories of actors, etc. from a behavioural angle. Our goal is not to provide "raw" information to the decision makers but rather to help them to distinguish the different systems interacting together and discover the dynamic that animates them.
The observatory thus created will respond to the different steps of the intelligence cycle. The observatory, which could be called the "Agricultural Biotechnologies Observatory", will first operate on the Chinese market and will be built with tools allowing the analysis of qualitative contexts and various possibilities according to which these contexts can be linked.
Although information about China on the World Wide Web is expanding rapidly, we postulate that an exhaustive information search cannot be based only on English language sources. Our work is thus centred mainly on information available on the Chinese Web. It is necessary to know how to gather this information and to analyse the data in order to make relevant recommendations to the decision makers. We expect to be able to find in the Chinese literature indicators that are undetectable in the English literature.

INFORMATION SOURCES IN CHINA
To find scientific articles, we used for the most part two of the most relevant data warehouses: Wanfang Data and CqVip. To present the structure and the contents of these bases, we use Wanfang Data as an example. (We note that these databases are competing and that the majority of the articles may be found in both databases.) Table 1 provides a summary of the major classes of documents found in the Wanfang database. Table 2 summarizes the major areas of interest covered. Table 3 lists other topics covered by these databases.

PRESENTATION OF THE MEVA SYSTEM
A European decision maker arriving in China will find himself immediately projected into a completely incomprehensible environment where he can control almost nothing (Flaherty, 2003). He is not able to take a taxi, to look at the news on television, to ask his way in the street, or to order a dish in a restaurant. The French Consul in Shanghai in 2003 said that newly arrived foreigners were like deaf, blind, and dumb people. We add also that foreigners are completely dependent on interpreters who translate incompletely. It is thus advisable to help the decision maker understand an obscure environment.
Our approach to searching for information differs from the methods frequently used by information scientists. Someone searching for information on the Web generally launches a quite general query on a search engine. Then, if the person does not want to sort among the thousands of results that appear relevant, the searcher must narrow the query in a heuristic way by selecting key words or with the help of Boolean operators.
The MEVA method (2009) also consists in launching a rather general request, then operating a memory indexation built on the basis of knowledge links. The MEVA system is operated in OPENLISP. The point of view and concerns of the experts are expressed not in terms associated with Boolean operators but in actions and results expected (Paoli-Scarbonchi, 2005). Only a few tens of thematically referenced URLs are brought back. They reflect the exact thoughts of the expert.
The first method finds thousands of URLs amongst which only a few will be relevant, whereas the MEVA method proposes a few tens of URLs that are all perfectly relevant to the request of the expert. Thus, the request [maïs + Chine] on Google proposes more than 7 million URLs, the request [corn + China] proposes more than 16 million, and the request [玉米+中国] on the Baidu search engine proposes more than 18 million web sites. (See documents in Appendix A.) However, among all these answers, fewer than one hundred are really relevant to the expert.
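The orders of magnitude quoted above can be turned into a rough upper bound on the precision of each general query. The following is a minimal sketch, not part of the MEVA system: it assumes that the figure of "fewer than one hundred" relevant answers applies to each query separately, and the function name `precision_upper_bound` is introduced here for illustration only.

```python
# Result counts quoted in the text; each is a lower bound ("more than ...").
result_counts = {
    "maïs + Chine (Google)": 7_000_000,
    "corn + China (Google)": 16_000_000,
    "玉米+中国 (Baidu)": 18_000_000,
}

# Assumption: "fewer than one hundred" relevant answers holds for each query.
RELEVANT_UPPER_BOUND = 100


def precision_upper_bound(retrieved: int, relevant: int = RELEVANT_UPPER_BOUND) -> float:
    """Upper bound on precision, i.e. relevant answers / retrieved answers."""
    return relevant / retrieved


for query, count in result_counts.items():
    # The true precision is lower still: fewer relevant, more retrieved.
    print(f"{query}: precision below {precision_upper_bound(count):.2e}")
```

Under these assumptions, at most about one answer in 70,000 is relevant even for the smallest of the three result sets, which is the gap that the selection step is meant to close.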

MEVA and the theory of the social systems
We previously defined the framework of our study as a complex environment in which several actors and factors are in constant interaction. Thus, we can consider the following areas of interest for the experts:
- Limagrain
- Competitive environment
- Chinese constraints of sustainable environment and growing population
- Agricultural policy and food sufficiency strategy of China
- The world scientific research context in the field of agricultural sciences
According to the ideas of Niklas Luhmann (Barbesino, 1998), any system is permanently distinguishing itself from its environment. The environment is considered by the system to be in chaos. Therefore, the system is constantly seeking to reduce the complexity of the chaos by interpreting (understanding the meaning of) the communications (exchanges between the system and the exterior). In other words, the system reacts and observes itself in order to update its available topics. However, the Limagrain Group's questions about the Chinese market are all about determining which social actors will affect the evolution of the system. The observed system here is the Chinese social system in which the problems of the Limagrain Group fit. This social system is indicated hereafter by the initials CSS for "Communicating Social System" (Ricoeur, 2000), the characteristic of a social system being its communication. It is necessary not only to distinguish the various components of the CSS but also to bring out their relations and to determine which component is in the most favourable position to institute changes inside the system.
In the MEVA method (Paoli-Scarbonchi, 2006), the watcher must have an interpretative approach to the context and, for that, must put himself inside the situation and know the system of the interlocutor, the recipient of the final information. The social group to which the expert belongs has to be deciphered to capture the modalities of its environment. The observation is a selection by the expert of what brings up to date and potentiates a part of reality (Vygotski, 1997). The watch profile, built on the basis of the expert's implicit knowledge, is in fact one of the forms of the CSS. By contextualizing the knowledge, it delimits the borders of the request and potentiates the other forms. The predictive capacities will arise from the actualization of the expert's knowledge organized within the watch profile.

The watch profile
The traditional approach for searching information is based on a semantic indexation by launching queries with a Boolean combination of keywords, title words, codes, author names, institution names, dates, etc. (Dialog, 2008). This method of retrieval presupposes that the user knows the words, synonyms, etc. used by the author of the information to be retrieved. In many cases, key information is missed either because the words used by the writer are different from those known by the end user or because the information is different in nature from that expected, although still relevant to the subject. This is why it is preferable to perform a very large query at the beginning and then search for the right information in this corpus using the MEVA method. The MEVA method is based on a cognitive indexation (or social semiotic) with meaning as the basic unit. Thus, there is no need to build an ontological dictionary indexing all the keywords in the expert's domain. Here one creates connections of knowledge that are dynamic because they are driven from within by semiotic relations. According to Maturana (1988), "What exists is what is communicated". During the day to day watch, the watcher has to refine the knowledge and the CSS context, thus permitting an increase in the precision of the connections among the elements of knowledge. The qualitative indexation of the knowledge elements is contextualized. Raw information is contextualized in order to produce high value information. Once its meaning appears in a context, it then becomes useful, actionable knowledge. It is necessary to continuously update the watch profile with new knowledge in parallel with changes in the CSS or the watch orientation (or new requests) or with the evolution of the problems.

MEVA core language and the Chinese language
In the MEVA method, the indexation of the watch profile in the core language allows construction of the system connected with the role of internal observer of the Limagrain CSS. The core language of each watch profile is specific to each interacting system. There is no need for ontology because one considers only what is important for the relevant CSS. The watch profile is built on semiotic elements and not semantic ones. As a CSS is specific to a category of people, there is a tacit consensus within the communication of the knowledge in the domain as to connections versus meaning units, any language being a "system of systems" (Guillaume, 1984).
To create a terminology, it is necessary to discover the links within the meaning.
Thus the interpretative step is a key one for indexation. The MEVA core language is close to the internal representation of reality in human psyches (Internal Semiotic Forms or ISF). With the MEVA method, based on memory indexation, a multilingual information sources watch is possible because indexation is done directly on the meaning, upstream of the language. Working on the meaning in a specific language, that is in External Semiotic Forms (ESF), is done in the second step.
The Chinese language (Xu, 1996), as it belongs to the primary linguistic area determined by Guillaume (1984), is interesting at this level because it operates in a way closer to psychic representations (ISF) than Indo-European languages do. In Chinese, the meaning can be ascertained from the character (Xu, 2000). The operative time for understanding is shorter than in Indo-European languages. The linguistic code of the Chinese language is similar to the conceptual translation of a relation-with language.
The written Chinese language, such as we know it today, was at the beginning a language of the literate. Without reconsidering the history and the evolution of the Chinese language, one can say that a Chinese character is a combination builder. The "Chinese character" is a meaningful syllable, word, or part of a word, unlike alphabets where the graphic unit, the letter, is a non-significant phoneme. The Chinese character in itself is not meaningful. That is where the difficulty of building French-Chinese bilingual dictionaries begins.
For example, the character 交 (jiao) is given in the Oxford Chinese English dictionary as: Verb:
Advance Publication, Data Science Journal, 13 January 2010
We specify that the categories "verb, noun, adverb, adjective" are given to the Western reader to help understand the structure of the Chinese language but do not have any value for a Chinese speaker (Tavassoli, 2002).
The Chinese character acts like a symbolic attractor in the theory of chaos or as a kind of form that will be able to become meaningful in a local order. It must interact with other characters to become meaningful, and it takes its semiotic load only according to the other characters that are linked to it in a certain context. The Chinese character is a knowledge scheme with a stronger semiotic load than a word. A word has a definition. A Chinese character is a construction of psychic operations that are connections. In this sense, it is closer to the ISF than words are. Some examples are necessary to clarify this: In Table 5, the left-hand column gives the meaning of a character and the right-hand column gives the meaning of the words formed by the association of two characters from the left-hand column. To take a slightly more complex example and to come back to our topic, the Chinese translation of "GMO" (Genetically Modified Organism) is as follows:
转基因: GMO
转: to turn, change, transfer
基: bases, foundation, base, principle
基因: gene
因: causes, reason
Any Chinese person, knowing these three characters individually, immediately understands the meaning of the combination, without any previous knowledge of biotechnology. A Chinese child can immediately understand that it means something about passing a gene or passing the smallest element found in one organism to another. A French child will not understand what a GMO or even a genetically modified organism is unless he has already spent a few hours in biology lessons. It is necessary to explain what a genetic modification means to make a semantic network of terminology. All connections must be unfolded in French, whereas in the Chinese language, it is done directly.
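The decomposition of 转基因 given above can be written down as a small lookup structure. This is a minimal sketch under stated assumptions: the glosses are copied from the example in the text, while the dictionaries and the functions `decompose` and `known_compounds` are hypothetical names introduced here; they are not part of the MEVA core language and do not replace a real dictionary.

```python
# Per-character glosses, copied from the example in the text.
CHARACTER_GLOSSES = {
    "转": ["to turn", "change", "transfer"],
    "基": ["bases", "foundation", "base", "principle"],
    "因": ["causes", "reason"],
}

# Glosses of the combinations, also copied from the text.
COMPOUND_GLOSSES = {
    "基因": "gene",
    "转基因": "GMO",
}


def decompose(term):
    """Return (character, glosses) pairs for each character of the term."""
    return [(ch, CHARACTER_GLOSSES.get(ch, [])) for ch in term]


def known_compounds(term):
    """Return the glossed combinations contained in the term, longest first."""
    found = [compound for compound in COMPOUND_GLOSSES if compound in term]
    return sorted(found, key=len, reverse=True)


for ch, glosses in decompose("转基因"):
    print(ch, "/".join(glosses))

for compound in known_compounds("转基因"):
    print(compound, "=", COMPOUND_GLOSSES[compound])
```

The sketch only makes explicit that the meaning of the whole is reachable from the characters and their combinations; the claim that a reader does this without prior training is the authors' and is not something the code can verify.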
If the language characters in Chinese are closer from the psychic point of view, they are nevertheless still distant from the brain's ISF of neuronal constructions. The MEVA core language, by the use of memory descriptors in the meaning indexation, gets closer to the ISF within the psychic system. The core language gives a greater freedom for connecting the cognitive schemes, the modal values, the willingness to do, the willingness to act, etc. The modalities depend on the domain of knowledge. Finally, one could represent the scale of the languages in terms of proximity in the following way: The meaning to be communicated needs a code, and this meaning is much closer to the psychic system both in Chinese and in the MEVA core language because it is independent of any linguistic system. A character language is very close to a semiotic system.

CONCLUSION
This method proposes new approaches for the creation of high value information in science, technology, and geopolitics and in the practical application of competitive intelligence. The method has already been used with English, French, and Spanish, but this is the first time that it has been applied to the Chinese language.
The increasing amount of Chinese information, as well as the importance of China on the international scene, calls for the development of a strategy in Chinese information analysis.
For Western firms and institutions, reading Chinese is difficult, or expensive if translations must be made. This is why it is necessary to concentrate available facilities on the most strategic information. If the classical method of information retrieval is used (keywords, title words, author names, etc. with Boolean combinations), the information retrieved is too specific, and key information may not be retrieved because published data is expressed in another form, either using different words or expressing a global point of view or new ideas related to the subject but in a form unknown to the end user. This is why the MEVA method starts from a very large corpus and selects the correct information by using an approach other than the classical retrieval method (information scientist process).
The methodological approach presented in this paper explains the quest for the best related information that will correspond to the expert's needs by expressing these needs not by keywords but by a more general concept. Accordingly, the information retrieved in the expression of this general concept can be modified to fit with the dynamic changes expressed in the new information sources (Deleuze, 2005; Brown, 2008).

Figure 1. A major Chinese agricultural review journal - The Journal of Agricultural Biotechnology

Figure 2. Presentation of an article in the Wanfang database

Table 1. Classes of documents in Wanfang Data

Table 2. Areas of interest in Wanfang Data

Table 4. The structure of the section "Agricultural Sciences" in Wanfang Data