The aim of this project is a variationist study of wh-questions in French and Persian; the study will soon also include Brazilian Portuguese. Syntactic phenomena have so far been a neglected area of investigation in this branch of sociolinguistics. We take advantage of the availability of two different mechanisms of displacement (wh-scrambling in Persian and wh-movement in French and Portuguese) in order to highlight general syntactic and information-structural properties of wh-interrogatives. These languages show a rich pattern of word order variants: in particular, they allow wh-in-situ as well as wh-initial constructions. In the context of the current debate on the relation between theory and data, evidence from different sources will be collected: gradient grammaticality judgments, spontaneous speech, and social data. We will bring these sources together in a multilingual database. Based on the spontaneous speech data, a treebank of spoken language will be constructed. The analysis will not be restricted to the standard set of socio-demographic variables, but will take into account a variety of modern socio-economic indices as well as the socio-cultural profile (lifestyle). It builds on a cross-culturally oriented social structure analysis.
The empirical strategy consists of a 3x2 design that dissociates language and socio-cultural context, with a view to a theory of social variation at a higher level of generalization. We compare speakers of European French, Persian, and Brazilian Portuguese from metropolitan areas of their countries of origin with expatriate speakers of the respective communities in New York City. The figure below gives an overview of the 3x2 design, divided into four fieldwork campaigns, and the different types of evidence collected from each subject. The fieldwork in Tehran (N = 102) and Paris (N = 98) was carried out in 2005. The next steps are fieldwork in São Paulo (N = 100) and New York City (N = 50 for each language).
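As a minimal sketch, the cells of the 3x2 design and their sample sizes can be enumerated as follows. The sample sizes are those stated in the text (the São Paulo and New York City figures are planned, not collected); the labels and variable names are illustrative assumptions.

```python
from itertools import product

# Factor levels of the 3x2 design (labels are illustrative).
LANGUAGES = ["French", "Persian", "Brazilian Portuguese"]
CONTEXTS = ["home metropolis", "New York City"]

# Sample sizes as stated in the text; N = 50 per language in New York City.
SAMPLE_SIZES = {
    ("French", "home metropolis"): 98,                 # Paris
    ("Persian", "home metropolis"): 102,               # Tehran
    ("Brazilian Portuguese", "home metropolis"): 100,  # São Paulo (planned)
    ("French", "New York City"): 50,                   # planned
    ("Persian", "New York City"): 50,                  # planned
    ("Brazilian Portuguese", "New York City"): 50,     # planned
}

# Crossing the two factors yields the six cells of the design.
cells = list(product(LANGUAGES, CONTEXTS))
total = sum(SAMPLE_SIZES[cell] for cell in cells)

for language, context in cells:
    print(f"{language:22s} {context:16s} N = {SAMPLE_SIZES[(language, context)]}")
print("total N =", total)
```

The six cells are covered by four fieldwork sites because the three expatriate communities are all sampled in New York City.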

A major methodological challenge for any empirical study of interrogative sentences in spoken language is that the classic sociolinguistic interview technique elicits hardly any interrogatives, since the interviewee answers the interviewer's questions. In his study on French, Coveney (1996: 116) obtained on average as few as 4.25 yes/no-questions and one wh-question per speaker (after excluding rhetorical or echo-questions), although the interviews had a mean duration of 36 minutes. He obtained this sparse outcome despite his attempts to create interview situations that would motivate the interviewee to ask questions back. The outcome in Behnstedt (1973: 217, 222) is hardly better. These numbers clearly indicate that the free interview technique is inappropriate for eliciting interrogatives in sufficient quantity. We therefore developed a game task in which a fictitious scenario puts subjects in the position of asking the interviewer questions, while leaving them free to choose the topics. Subjects were instructed to investigate a fictitious murder case and were not aware of our interest in their use of interrogatives. According to the scenario, the subject played a police investigator and the experimenter the doorman of the victim’s building, who had found the body. For the sake of sociolinguistic authenticity, subjects were instructed to remain themselves and not to imitate, for example, some commonly known literary figure.
The recordings of 10 subjects from the Paris fieldwork and 14 subjects from the Tehran fieldwork were transcribed and time-stamped as a pilot project. In addition, the French transcriptions were annotated with a syntactic tag system (an intermediate, but nevertheless autonomous, step in the construction of the treebank).
These data, and the stylebooks of transcription and annotation, are available here.
In the fieldwork in Tehran and Paris we worked with a specific instrument, a gradient grammaticality judgment test with written stimuli (GGJW), based on the principle of graphic rating. Its development, the evaluation of its test-theoretical properties, and a successful application have been described in Adli (2004a: 81-111) and Adli (2005c). Subjects read stimuli on a test sheet and express their judgments by drawing a line. The figure to the right shows an example sheet of the GGJW for Persian sentences. Subjects express their judgments on a bipolar scale with the endpoints “-” (obviously ungrammatical) and “+” (obviously grammatical). The scale had a length of 122 millimeters (i.e. 4.80 inches). The length of the line represents the perceived degree of grammaticality. The test was presented in an A4 ring binder containing two A5 sheets: the upper one holds the reference sentence, the lower one two experimental sentences. In order to provide not only the endpoints “-” and “+” but also a scale anchor, subjects rated one reference sentence, (29), at the end of the training phase, which remained visible throughout the test. Only the lower sheet with the experimental sentences was turned once its two sentences had been rated. The reference sentence is a marked, but not ungrammatical, construction, which in most cases yields an intermediate scale anchor.
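The graphic rating principle can be illustrated with a minimal sketch that converts a measured line length into a numeric score. Only the 122 mm scale length comes from the text; the linear mapping onto the interval from 0 (“-”) to 1 (“+”), the anchor-relative score, and all function names are illustrative assumptions and not the scoring procedure documented in Adli (2004a).

```python
SCALE_LENGTH_MM = 122.0  # scale length stated in the text


def line_to_score(line_mm: float, scale_mm: float = SCALE_LENGTH_MM) -> float:
    """Map a measured line length (mm) to a score between 0 and 1.

    Assumption: 0 corresponds to the "-" endpoint, 1 to the "+" endpoint,
    and the mapping in between is linear.
    """
    if not 0.0 <= line_mm <= scale_mm:
        raise ValueError(f"line length must lie between 0 and {scale_mm} mm")
    return line_mm / scale_mm


def relative_to_anchor(line_mm: float, anchor_mm: float,
                       scale_mm: float = SCALE_LENGTH_MM) -> float:
    """Hypothetical anchor-relative score: positive values mean the sentence
    was rated as more grammatical than the reference sentence."""
    return line_to_score(line_mm, scale_mm) - line_to_score(anchor_mm, scale_mm)


# A line of 91.5 mm, with the reference sentence rated at 61 mm (mid-scale).
print(line_to_score(91.5))             # 0.75
print(relative_to_anchor(91.5, 61.0))  # 0.25
```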
However, this method is inappropriate when phonological factors need to be controlled. In order to study the interaction between syntactic locality and intonation in French wh-in-situ questions, we developed a variant of the first instrument, namely the gradient grammaticality judgment test with auditory stimuli, or GGJA (Adli 2006c). Subjects listened over headphones to pre-recorded test sentences, each of which had been recorded (by a native speaker experienced in acting) with three different intonational contours. In the Paris fieldwork, the GGJA technique was applied in addition to the GGJW technique. The use of the GGJA can be seen in this PDF document, which contains embedded, interactive audio examples.
We used extensive social questionnaires, consisting of 300 items (Paris) and 240 items (Tehran).
Regarding the objective side of social structure, we worked with a combination of proven socio-economic indicators of contemporary societies. One example is the index of housing density (cf. INSEE 2005: section E.02), which reflects the available housing space taking into consideration extensive information on the specific family structure within the household. Other examples are monthly income of the household, high school orientation, academic orientation, level of education, and socio-professional category (in many cases also concerning the parents and the partner of the interviewee).
Regarding the subjective side of social structure, we further developed the lifestyle approach (Bourdieu 1979). While lifestyle had been based in Adli (2004a) on 54 items from the two scales "leisure activities" and "media", we applied statistical factor- and cluster-analytical data reduction to approximately four times as many items. Apart from new media subscales (e.g. the Internet), the Paris and Tehran questionnaires included new dimensions of sociocultural orientation: "clothing and appearance" and "socio-political and ethical values".
The following figure briefly illustrates the operationalization of the lifestyle variable, using the less complex case in Adli (2004a) as an example; one should bear in mind that the operationalizations of the Paris and Tehran questionnaires are more complex. Essentially, subjects are assigned to one lifestyle type based on their answers to the 66 individual questions on activities and media (12 questions are excluded for reasons of construct validity). In order to create a limited number of well-defined types, statistical methods of data reduction have to be applied. In a first step, factor analyses reduce the number of features that describe each person, passing from 66 items to 9 factors. In a second step, the sample is divided into four groups using a cluster analysis: subjects with a similar profile on the nine factors are grouped together. In the end, each person is characterized by a single variable that can take one of four possible values, namely membership in a particular lifestyle group.
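The second step, grouping subjects with similar factor profiles, can be sketched as follows. The text does not name the clustering algorithm, so plain k-means is used here purely as a stand-in. The factor-analysis step is omitted and factor scores are assumed to be given. The toy data use two invented factors instead of nine, and all names are hypothetical.

```python
import math


def kmeans(profiles, k, iterations=100):
    """Plain k-means: return one cluster label per profile.

    Assumption: the first k profiles serve as initial centroids, which keeps
    the sketch deterministic; real analyses would use a more careful start.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each subject joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:  # no change: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented factor scores for eight subjects on two hypothetical factors.
profiles = [
    (2.0, 2.0), (-2.0, 2.0), (-2.0, -2.0), (2.0, -2.0),
    (2.2, 1.8), (-1.9, 2.1), (-2.1, -1.8), (1.8, -2.2),
]

# Each subject ends up with a single categorical value: the group label.
labels = kmeans(profiles, k=4)
print(labels)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

The returned label plays the role of the single lifestyle variable described above: one of four possible values per person.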

The clusters are defined based on each person's individual profile on the set of factors. The figure below shows the four profiles, again for the less complex operationalization in Adli (2004a).

More information on the concept and the operationalization of lifestyle is available in this poster.