« Proposal to Add a Protocol to Textual Data Analysis: For an Experimental Procedure in Lexicometry » (p. 25-63)
Abstract: This paper presents a methodological reflection comparing various automated data processing techniques, approaches, and methods. It promotes the use of various types of software when working on textual data, explains how software results can be used as data in different settings, and promotes an experimental approach in the textometric or lexicometric field.
Key-words: Lexicometry, Data Analysis, Method, Visualization.
MARTINE PAINDORGE, JACQUES KERNEIS AND VALÉRIE FONTANIEU
« Computer-Assisted Textual Data Analysis: The Combination of Three Methodologies, Benefits and Limitations » (p. 65-92)
Abstract: In this article, we examine which methods should be used to analyse the content of the teaching programmes and resources published by the Ministry of Education in France. The corpus has a particular form, including texts that are sparsely drafted, tables and many graphics. The article presents an exploratory study and first indicates why and how we combine two software-based methods of analysis (Alceste and Tropes) with a so-called “manual analysis”. The results obtained point to similar functions and complementarity. Finally, we specify the conditions that must be met and the limits of this work.
Key-words: Textual Analysis Comparison, Lexicometric Analysis, Linguistic Analysis, Analysis of Contents, Alceste, Tropes.
BERNARD PATEYRON, MAURICE WEBER AND PIERRE GERMAIN
« Attempt at a Lexical Analysis and stemma codicum of Eighty-Three Kadosh Knight Rituals from the Research Workshop Sources Collection » (p. 93-144)
Abstract: Eighty-three rituals for the Kadosh Knight grade of the Scottish rite, dated approximately from 1750 to the present day, are digitally processed by methods of text mining or lexical analysis. To facilitate the understanding of our work, these methods are briefly described and software implementations are compared. For these texts, dates of first appearance are often uncertain, and so we attempt to establish chronological criteria and elements of kinship. A phylogenetic dendrogram appears as a necessary resource to determine the probable parentage of these rituals. Such a tree is built on the concept of distance and thus makes it possible to compare the numerical proximity (similarity) or distance (dissimilarity) of these texts. For the purpose of digital processing, a metric based on Muller’s method or the chi-square (χ²) statistic is applied a priori to the graphical forms. It appears in retrospect that the same metric, when applied to syntactic functions, leads to a nearly identical phylogenetic tree.
Key-words: Text Mining, Lexical Distance Dating, Syntactic Functions, Phylogenetic Tree, Rituals of Masonry.
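As an illustration of the distance-based comparison described in the abstract above, the following minimal sketch computes a chi-square distance between the word-frequency profiles of several texts. It is not the authors' implementation: the function name chi2_distance_matrix, the whitespace tokenization and the use of the standard chi-square distance between row profiles (rather than Muller's method) are all assumptions made for the example. Texts are assumed to be non-empty.

```python
from collections import Counter
from math import sqrt


def chi2_distance_matrix(texts):
    """Chi-square distance between the word-frequency profiles of texts.

    Hypothetical helper: tokenizes on whitespace and treats each
    lower-cased token as a graphical form.
    """
    counts = [Counter(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*counts))
    grand_total = sum(sum(c.values()) for c in counts)
    # Column mass: share of the whole corpus taken by each form.
    col_mass = {w: sum(c[w] for c in counts) / grand_total for w in vocab}
    # Row profile: relative frequency of each form inside one text.
    profiles = []
    for c in counts:
        total = sum(c.values())
        profiles.append({w: c[w] / total for w in vocab})
    n = len(texts)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Squared profile differences, weighted by inverse column mass.
            d2 = sum(
                (profiles[i][w] - profiles[j][w]) ** 2 / col_mass[w]
                for w in vocab
            )
            dist[i][j] = dist[j][i] = sqrt(d2)
    return dist


if __name__ == "__main__":
    toy = ["a b c", "a b c", "x y z"]
    for row in chi2_distance_matrix(toy):
        print(row)
```

A symmetric matrix of this kind is the usual input to a hierarchical clustering routine, which in turn produces a dendrogram of the sort the abstract mentions.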
ELISA OMODEI, YUFAN GUO, JEAN-PHILIPPE COINTET AND THIERRY POIBEAU
« Social Diversity and Semantics: Socio-Semantic Representation of a Scientific Corpus, the Case of the ACL Anthology Corpus » (p. 145-179)
Abstract: We propose a new method to extract multiword expressions from scientific papers. Our approach consists of two major steps. First, a list of candidates is extracted based on a score using frequency and specificity information. This list is then filtered based on the status of each term in the abstracts of the scientific papers under investigation. These abstracts are annotated using a text zoning analyser. The terms are then classified into different categories according to the text zoning analysis: we distinguish between terms appearing in the method section of the abstract and terms appearing in other zones. This method is applied to the ACL Anthology collection, containing the papers published by the ACL between 1980 and 2008. We show that the technique we use allows us to model interesting facts concerning the evolution of the domain and of the methods used in computational linguistics.
Key-words: Corpus, Term Extraction, Discourse, Text Zoning, ACL Anthology.
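The first step described in the abstract above, ranking candidates by a score that combines frequency and specificity, can be sketched as follows. The abstract does not give the authors' formula, so everything here is an assumption made for illustration: the function name score_candidates, the restriction to two-word candidates, the reference corpus, the add-one smoothing and the score itself (domain frequency multiplied by the log ratio of relative frequencies).

```python
from collections import Counter
from math import log


def bigrams(tokens):
    """Return adjacent token pairs as candidate two-word expressions."""
    return list(zip(tokens, tokens[1:]))


def score_candidates(domain_docs, reference_docs, min_freq=2):
    """Rank bigram candidates by frequency times specificity.

    Hypothetical scoring: specificity is the log ratio between the
    candidate's relative frequency in the domain corpus and its
    (add-one smoothed) relative frequency in a reference corpus.
    """
    dom = Counter(bg for d in domain_docs for bg in bigrams(d.lower().split()))
    ref = Counter(bg for d in reference_docs for bg in bigrams(d.lower().split()))
    dom_total = sum(dom.values()) or 1
    ref_total = sum(ref.values()) or 1
    scores = {}
    for bg, freq in dom.items():
        if freq < min_freq:
            continue  # drop rare candidates
        specificity = log((freq / dom_total) / ((ref[bg] + 1) / (ref_total + 1)))
        scores[" ".join(bg)] = freq * specificity
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    domain = [
        "machine translation system",
        "statistical machine translation",
        "machine translation evaluation",
    ]
    reference = ["the cat sat on the mat", "the dog sat on the mat"]
    print(score_candidates(domain, reference))
```

The second step of the method, filtering candidates by the zone of the abstract in which they occur, depends on a text zoning analyser and is not covered by this sketch.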
« Computer-Assisted Textual Data Analysis. How Complex Thought and a Relational Approach can Feed some Methodological Considerations » (p. 181-215)
Abstract: This article explores two methodological concerns related to data-mining or text analytics: 1) the danger of “decontextualizing” ideas as a consequence of quantifying textual data; and 2) the importance of determining the historical context and the origins of a document prior to analysis. A total of 11,020 articles that appeared in Canadian and French periodicals in 2005 were data-mined with a program called SPAD. We conclude that the program responds well to the first concern. SPAD allows researchers to produce factor analyses of textual data while preserving access to the original text. This in turn ensures that researchers can understand the “meaning” of the words that are analyzed. Our case study fares less well with regard to the second methodological consideration, however, as we show how difficult it is to pre-establish the historical context and the origins of each document with such a large sample. To understand how we were still able to pursue our meta-analysis in spite of these difficulties, we turned to theoretical principles proposed by relational research and complex systems theory.
Key-words: Complex Systems Theory, Relational Research, Globalization, Textual Data Mining, SPAD, Lexicometry, Media.
« Inverted Speaking? Marine Le Pen and Her Identity-Resource Language » (p. 217-252)
Abstract: This study reports on recent computerized analyses of political discourses, particularly of populist discourses. The focus is on the discursive construction of an empathic identity by Marine Le Pen, leader of the French National Front: this chosen identity allows the political leader to respond to gender stereotypes while retaining specific fundamentals that define the National Front’s politics. Working with software such as Termostat or SketchEngine, discursive trends are detected through key words such as solidarity and suffering; these trends can then be refined by qualitative studies. On the one hand, this study confirms the political performativity of emotions when they are consistent with gender stereotypes. On the other hand, it attests to the presence of rhetorical features of an anti-system party (key concepts prevailing over the years) which are adapted to new political circumstances (the point of view and focus being reworked). Indeed, we suggest that Marine Le Pen has shifted the FN’s rhetoric from a speech focused on resentment, contempt and nostalgia (Jean-Marie Le Pen’s style) to a discourse that plays first on more positive emotions, such as empathy, which is also more in line with a female ethos.
Key-words: National Front, Empathy, Femininity, Marine Le Pen, Termostat, SketchEngine.
MAUD HIDALGO, ISABELLE RAGOT-COURT AND CHLOÉ EYSSARTIER
« Between-Lane Traffic: Convenient for Two-Wheeled Vehicles, but what do Car Drivers Think of It? A Comparative Analysis of Car Drivers’ Discourse on this Typical Behavior of Two-Wheeled Motor Vehicle Users » (p. 253-284)
Abstract: This study aims to analyze the views of car drivers on lane-splitting by motorized two-wheelers (PTW). Although car drivers have never been asked about this typical PTW behavior, it is a practice that involves them operationally, even though they do not initiate it. To this end, sixty semi-structured interviews were conducted with car drivers selected according to three criteria (city mobility, length of time holding a driving license, and whether or not they ride a PTW), yielding a substantial lexical corpus. This corpus was analyzed with the ALCESTE software. The results of this detailed analysis emphasize, among other things, the importance of individuals’ acquaintance with PTWs and of the mobility context and its social norms for practices and attitudes regarding lane-splitting.
Key-words: Lane Splitting, Car Drivers, Computerized Discourse Analysis, ALCESTE, Traffic Context, Acquaintance with PTW.
« Methodological Reflection on the Use of Modalisa and Iramuteq Software for the Study of a Corpus of Newspapers about Anorexia Nervosa » (p. 285-323)
Abstract: Anorexia nervosa is a complex multi-factorial disease currently considered as a public health problem by the medical profession. However, media discourses on this subject are relatively recent. This article aims to understand the characteristics of the media coverage of this pathology and the representations produced by the media of this disorder during adolescence while showing how the use of software for automated analysis of textual data can be helpful. For this purpose, we conducted a quantitative and content analysis of a corpus of 131 articles, published between 1995 and 2009, in several French daily newspapers, with the Modalisa software. Then, we used the Iramuteq software to identify lexical worlds structuring the discourses, based on a second more limited corpus.
Key-words: Anorexia Nervosa, Media Coverage, Daily Press, Automated Analysis of Textual Data, Modalisa, Iramuteq.
MARIA ZIMINA AND SERGE FLEURY
« Perspectives on the Frame/Thread Architecture for Multilingual Alignments » (p. 325-353)
Abstract: Multilingual text alignment is challenging due to the complexity of text and discourse organisation. Multilingual textual space can be explored using a textometric data model (Thread/Frame). A Thread is a textual flow represented as a system of items with position identifiers. A Frame is used to locate different textual objects (containers and contents) and their contexts. Following these principles, all text parts and annotations (including alignments) are stored and exchanged through different computerised procedures. Incremental textual resources trace all processing steps (from the initial segmentation to subsequent explorations and quantitative analyses). The software implementation of this model in Le Trameur allows exploring richly annotated multilingual text corpora (treebanks).
Key-words: Alignments, Annotation, Bi-Text, Frame, Multilingual Corpora, Dependency Relations, Textometric Analysis, Thread, Treebanks.
« The Postulate of a Rational Actor in Social Sciences and Humanities: A Persistent Half-Truth » (p. 355-375)
Abstract: The postulate of an individual who acts rationally, autonomously, consciously, intentionally and in his own best interest has been denounced repeatedly, in particular by relational approaches. Critics have underlined the importance of the subconscious and of emotions in the human psyche, as well as the impossibility of understanding human action without taking social structures into consideration and the illegitimacy of a subjectivity deliberating in a monadic way. In and of themselves, these criticisms should have eliminated the rationalizing axiom. Yet this axiom has lost none of its force; it continues to dominate models in human studies. The question thus arises as to how, or rather why, it persists. Its persistence cannot be without justification. We have identified seven ways in which specialists in human studies manage to legitimate this axiom, which constitutes, at best, a half-truth. We enumerate and describe each of these justifications and explain why none of them really answers the criticisms raised by relational or relational-type approaches.
Key-words: Rational Actor, Relational Approach, Emotion, Subconscious, Freedom, Social Structures.
« Complexity on the Fringes of Rationality: Proposal for a Definition of the Basic Structure Underlying the Complexity of the Action-Attitude Coupling through a Critique of the Excluded Middle Principle » (p. 377-424)
Abstract: This article aims to define a basic structure of the complexity in the relationship between cognitive attitude and action by questioning the principle of the excluded middle. Taking as a starting point the human will, we show that, given its weaknesses (Aristotelian akrasia and objectives referred to as essentially secondary effects by Jon Elster), the person implements self-restraint strategies (such as commitment), which are irrational but effective. Their effectiveness arises from the human capacity to self-deceive. The structure of this “mauvaise foi”, as Jean-Paul Sartre conceptualized it, is based on the result of the contradictory co-presence of incompatible beliefs and, even more, their mutual strengthening, in spite of the fact that forcing oneself to believe is an example of an essentially secondary effect. We describe the mechanism of this contradiction, thus questioning the law of the excluded middle. The result can be thought of as the “elementary brick” of human complexity.
Key-words: Will, Irrationality, Self-Deception, Essentially Secondary Effect, Principle of the Excluded Middle.