Alessandro DURANTI, University of California, Los Angeles

Comment on Enfield, N. J. and Jack Sidnell. 2017. The concept of action. Cambridge: Cambridge University Press.

The concept of action (hereafter “CoA”) by Nick Enfield and Jack Sidnell is a welcome discussion of two key theoretical and methodological issues in the study of language-mediated interaction, namely, (a) whether or not speakers need access to an inventory of types of acts in order to produce or respond to a meaningful act (in the form of an utterance or gesture or combination of the two), and (b) the impact of language-specific grammatical or lexical forms (e.g., words, pronouns, particles, tense-aspect markers) on how speakers understand or carry out a particular type of action (e.g., agreeing, requesting, claiming previous knowledge). The latter issue falls under the general phenomenon known as “linguistic relativity,” a phrase and concept associated with the writings of Benjamin Lee Whorf (1956) and prefigured in Franz Boas’ (1889) groundbreaking discussion of speakers’ difficulty in hearing meaningful phonetic distinctions in a language different from their native one(s). By exploring these two issues, Enfield and Sidnell sketch the outlines of an ontology of action that distinguishes between what is available to participants in the midst of interaction and what analysts subsequently describe, categorize, and try to explain. Some parts of the book reproduce arguments previously made by the two authors either jointly or separately. The integration of these previous contributions into a single volume should help readers grasp the bigger picture and evaluate the logic of argumentation of the specific proposals that Enfield and Sidnell offer. Whether one agrees or not with the details of their approach, this is a good book to think with.

At the beginning of chapter 2, Enfield and Sidnell mention Malinowski’s interest in the study of language as an “instrument of social action (and interaction)” (32); the addition of “interaction” to Malinowski’s original formulation turns out to be important, as we will see. This is presented as a way for the authors to distinguish themselves from those in “the sub-discipline named linguistic anthropology” who focus on “the relation between language and thought (and the consequences of linguistic diversity for thinking)” and from those who have shifted “from a focus on language in use . . . to a focus on people’s ideas about language . . . under the guise of what is known as language ideologies” (33). In the midst of this shift, they claim that “an anthropological account of language as action got lost” and the “promise of mid-century ordinary language philosophy . . . put forth by Austin and Wittgenstein – and indeed before them by Malinowski” (33) has not been fulfilled.

It is hard to exclude “the relation between language and thought” in a book like CoA that deals with inference, guessing, reflecting, realizing, doubting, and even “thought” (42). The analysis of these and other experiences in terms of how they are being made explicit through particular behaviors, acts, or turns does not per se exclude them from the domain of “thought” pursued by philosophers, cognitive scientists, and other scholars who are comfortable with using mental constructs. Nor is “thought” or “thinking” easily excluded by invoking Peirce’s notion of interpretant (23–8, 35, 65, etc.) or Silverstein’s use of “indexical relations” (128–30). In fact, the nature of indexes (e.g., address forms, pointing gestures, regional accents) is such that they “may be transposed from the current context into other ones, recalled, imagined or merely projected” (Hanks 2001: 121, emphasis added). Rather than excluding “thought” or “thinking” altogether from the study of language as action, it makes sense to follow an approach that examines how members of a particular (speech) community talk (or not) about experiences such as thinking, guessing, reminding (which Enfield and Sidnell also do, for example, on page 37) and whether they treat such inner experiences as some kind of obscure or dangerous entity that is better left unmentioned or unknown (e.g., Rosen 1995; Robbins and Rumsey 2008).

As for the indeed popular study of language ideologies, in its concern for “people’s ideas about language,” I see the effort to document and explain the social impact of those ideas on educational policy, debates about national languages, individual and collective sense of identity, and, among other issues, the reproduction of racist stereotyping (e.g., Bucholtz 2001; Hill 2001; Alim, Rickford and Ball 2016).

For these reasons I do not think that over the last few decades “an anthropological account of language as action got lost.” Not only do Enfield and Sidnell themselves mention a number of linguistic anthropologists who have been pursuing this very issue, but there are specializations within linguistic anthropology that are manifestly dedicated to the study of language as action, including language socialization (e.g., Duranti, Ochs and Schieffelin 2012) and the study of language contact and missionization (e.g., Makihara and Schieffelin 2007; Hanks 2010).

If I am right in saying that linguistic anthropologists are still pursuing the study of “language as action,” why do Enfield and Sidnell feel alone in their endeavor? A possible answer is hinted at by the insertion of “(and interaction)” after “action” in the above-mentioned evocation of Malinowski. “Interaction” is shorthand for an understanding of “language” as “the directly observable collaborative practices of using words, grammar, and associated semiotic resources, in human interaction” (ix). This is an approach that draws from the field of “conversation analysis,” which Enfield and Sidnell know well and to which they have contributed over the last decade (e.g., Enfield and Levinson 2006; Sidnell 2009, 2010). Conversation analysis is an inductive approach that treats interaction as “the primordial site of language” (Schegloff 2005: 455) and turn-taking as a prime form of social organization. The focus on conversation started with Harvey Sacks’ analysis of recorded phone calls to a suicide center (see Schegloff 1992: xv–xvii), and then expanded its focus of inquiry to a range of interactional contexts, including therapy sessions, informal phone conversations between friends or family members, doctor–patient interactions, and other institutional encounters (Heritage and Clayman 2010). Contributions by conversation analysts are seen as groundbreaking by a growing number of scholars in a wide range of fields. There has also been criticism of the exclusive focus on “interaction” or “talk-in-interaction” (Schegloff 1988), as opposed to “society,” “language,” “mind,” “power,” “inequality,” and other concepts traditionally studied by social or cognitive scientists, as well as of the rejection by most conversation analysts of findings based on traditional methods like participant-observation, elicitation, interviews, or sampling techniques.
It is in this historical context, I believe, that Enfield and Sidnell’s comments about linguistic anthropologists’ current foci of study must be evaluated. CoA is a contribution to an ontology of action (see in particular chapter 4), inspired by the methods developed by conversation analysts, who prefer to keep ethnographic information to a minimum.

One main focus of CoA is how people interacting with one another understand how to react appropriately to what was just said or done by someone else, including the extent to which such an understanding entails the recognition of a particular type of (speech) act. For example, when someone says that’s a nice shirt (10), does the addressee need to understand that both an assessment and a compliment have been produced in order to know that he could respond by saying thanks or by some other response that acknowledges the compliment but rejects the positive assessment (e.g., I wasn’t sure this would be appropriate)? Or when someone asks do you know what time it is?, does the recipient need to classify the question as a “request for information” (about the time) in order to be able to produce an adequate response (e.g., It’s ten after five)? This issue has been discussed by Stephen Levinson (2013) under the name of “action ascription,” which is meant to replace “action recognition” used by Emanuel Schegloff (2007: xiv), one of the founders of conversation analysis. Recognition has indeed been a major theme in conversation analysis, starting from the discussion of the opening sequence in phone calls (see Sacks 1992: Lecture 1 [1964–65]; Schegloff 1968) and the use of first names (e.g., John, Penny) as preferred forms for personal reference to achieve recognition (Sacks and Schegloff 1979).

The problem with the term “recognition” for Levinson (and also for Enfield and Sidnell) is that it makes it sound as if there is something that needs to be correctly guessed, namely, the type of action produced by the previous speaker. This view, for Levinson, is problematic because “the process of attributing an action to a turn is fallible, negotiated, and even potentially ineffable” (Levinson 2013: 104), and it must happen very fast: the utterance by the next speaker is on average produced within 200 milliseconds from the end of the turn of the previous speaker (Levinson 2013: 103).

In the first four chapters of CoA, Enfield and Sidnell expand on this line of reasoning by exposing the problems of what they call the “binning approach” (46), which they attribute to speech act theorists. This approach, they claim, assumes that there is a pre-existing inventory of types of (speech) acts out of which speakers and hearers would pick one to produce or interpret a given utterance. In CoA the target of criticism, then, is not Schegloff’s (or other conversation analysts’) use of the notion of “recognition,” as it is for Levinson (2013), but Austin’s (1962) theory of speech act types (and Searle’s reformulation of Austin’s original proposal). Starting in chapter 3, which introduces the notion of agency, partly building on Paul Kockelman’s (2007) work, Enfield and Sidnell present their own alternative proposal, packed with insightful observations, which are occasionally afflicted by some terminological or logical confusion. This confusion might be the price for pushing for a theoretical rethinking of the complex notion of action in a relatively short book (201 pages of text) and doing so while mixing theoretical approaches that are a bit like oil and water (e.g., conversation analysis and Peirce’s semiotics). Another challenge is the abandonment of terminology that has the intuitive appeal of Austin’s and Searle’s use of categories offered by ordinary language. The solution for Enfield and Sidnell is to distinguish between the metalanguage used when categorizing a particular act and the practice of acting “on the fly and infer[ring] what a speaker is doing from a broad range of evidence” (47). They point out that philosophers, linguists, and other analysts of ordinary talk are not the only ones to refer to speech types like requesting, apologizing, informing, etc. Native speakers do it too when they provide a description of their action, as in I was complimenting you (45) or I requested that he get off the table! (47).
But these uses of speech types are not what routinely happens. In spontaneous interactions people “do not need to recognize action types or categories in order to respond appropriately (or inappropriately for that matter)” (124). Rather, participants are “considering the details of particular turns-at-talk for their relevance in deciding what to do next and how to do it” (124).

The focus on the “details” is more than a methodological choice for Enfield and Sidnell. It is a theoretical stance that assumes that “actions can be dealt with at the token level and need not be seen as tokens of action types at all” (111). And here comes the punch line: “A radical version of our claim would be that there are no actions, only the parts of actions” (111).

This “radical” claim needs some unpacking. A concern for “parts” usually implies a mereology, that is, a theory of the relation of part to whole and of parts among themselves but always within a whole (e.g., Husserl 2001: 161–80). This means that, logically speaking, “token” only makes sense in opposition to “type,” and therefore any rejection of the token-type relation is at least puzzling if not downright illogical. But what if Enfield and Sidnell mean something different from a theory of the parts without the whole? For example, the focus on the details/parts could simply be a heuristic. This foregrounding of the details of interaction fits with a conversation analytical approach as well as with a grammatical analysis of utterances. Either approach can be contrasted with speech act theorists’ tendency to look at speech acts as “wholes” without much attention to what happens inside of them, that is, without much sophistication in the use of particular tense/aspects, or the sentential particles that are abundant in so many languages but not so much in English (this reading is supported by the method of cross-linguistic analysis presented in chapter 5). Another way to interpret the focus on parts/details is as an attempt to adopt a “distributed cognition” approach, which Enfield and Sidnell mention (87) and from which they derive the notion of “distributed agency” (see chapter 3). According to this approach, participants do not need to know or decide the type of act of a particular utterance because its interpretation is distributed across semiotic and material resources such that no one participant or “agent” is in full control. This idea is implied in Husserl’s notion of intersubjectivity and its various interpretations and modifications (Merleau-Ponty 1945; Jackson 1998; Duranti 2010, 2015: 208–32) as well as in Vygotsky’s discussion of mediation within a socio-historical approach (Cole 1985).

Focusing on parts might also be supported by the simultaneous adoption of the idea of a “larger project” that guides and gives meaning to specific moves (Levinson 2013). This is what Enfield and Sidnell seem to be aiming at with their discussion of a “ladder of action” and the distinction between “practices” and “actions” (105–6). The use of these two interdependent categories, however, entails the notion of typification, which is what they had been trying to avoid. This is not accidental. It has to do with the fact that typification cannot be avoided by participants or by analysts. But by accepting types, one does not need to accept Austin’s or Searle’s ways of typifying. Nor does it mean accepting the idea that the relevant type for a particular sign or sequence of acts is decided ahead of time and once and for all. When we look at the ways in which cooperative action is carried out and collaboration achieved, we see that “what is being done” or the “meaning” of what is being done not only changes over time (Goodwin 1979) but must also be designed so that it can change and adapt to the contextually given and contextually realized circumstances. This means that in any “ontology” of action, one must conceive of categories, types, or meanings as sufficiently open entities. This openness, which is traditionally associated with artistic production (Eco 1962), allows for a certain level of ambiguity and room for invention, negotiation, and reformulation. The conventional ways of using goals, intentions, and plans for describing human action need to be rethought in order to allow for micro- and macro-adjustments and, above all, to allow for the improvisation that is a constituting quality of our ways of being in the world, as originally recognized by Bourdieu (1977: 79) in his adoption of the concept of habitus and as key in socialization (Duranti and Black 2012).
An openness of and in typification means that ready-made labels for types might not be what we need in order to understand what is really going on in a given interaction. Let me elaborate on this point by examining one of the examples provided in CoA.

Enfield and Sidnell suggest that one can act in ways that are “adequate for the situation” (113, emphasis in the original) without having to consider the type of action that is being enacted. One example they give to illustrate this point involves an ambiguity between two possible speech act types. At the end of a meal a person holds a hand out across the table toward someone else who has a plate in front of him. This, they say, could be considered an “offer” or a “request,” but there is no need to decide because “whether it was a request or an offer is not an issue for the participants” (113). Perhaps. But how do we know that the choice is not an issue for the participants? Wouldn’t it be better to hypothesize that the type of action in this case is neither “offer” nor “request,” but something of a different nature? Their example could illustrate the relevance of an ethics of cooperation as facilitation, which makes a recipient, who also happens to be a guest, pick up the moveable object in front of him and give it to the person holding a hand in suspended motion as a way of helping complete what appears as an unfinished action sequence. Pursuing the latter interpretation pushes things in the direction of the distributed agency Enfield and Sidnell seem to be striving for, and it also adds an ethical dimension of action in the presence of another person (Levinas 1969) that is implied but not theorized in the discussion of accountability (in the Preface and on pages 53–61).

As is made clear in the Postface, in their preference for details and parts, Enfield and Sidnell see their method of investigating language as continuing in a tradition that goes back to Wittgenstein’s anti-essentialism and his belief that there are “countless kinds” of words, sentences, acts, or games (Wittgenstein 1958: 11). This position contrasts with the one proposed by Austin, who believed that we should treat “uses of language” just like entomologists treat species of beetle, that is, by counting them all, one at a time (Austin 1970: 234).

The diverse ways of doing “the same” action with words is the leading theme of chapter 5, “Collateral effects.” The action in question is the linguistic performance of an “epistemically authoritative second-position assessment” (137). If speaker A’s utterance he is rude is the first position assessment, an immediately following yes, he is very rude by speaker B would be the second-position assessment. The “epistemically authoritative” aspect of the exchange is the expression by speaker B of her “primary rights or greater authority to make such an evaluation” (138). Enfield and Sidnell’s detailed discussion of how this action is performed in three languages (Caribbean English Creole, Finnish, and Lao) shows that the morpho-syntactic resources that each language has for speaker B to convey agreement and claim greater authority produce different effects on subsequent talk. For example, in Creole and Lao, the forms available to perform the action (an “if-prefaced” question and a perfective particle, respectively) seem to shut down further talk on the topic, whereas the Finnish variable word order seems to allow for further elaboration in subsequent talk. This is an elegant discussion of linguistic relativity that can constitute a model for further research.

Chapter 6 counterbalances the relativity argument with a discussion of “natural,” and therefore potentially universal, ways of performing a particular action by means of language. Enfield and Sidnell focus on polar questions, i.e., yes-or-no questions, which can be answered by two grammatical constructions: (a) interjections (yes, no, as in did you get your paper? Yes), or (b) a language-specific “echo system” (e.g., in English, Is he home? He’s home). An interesting point made in this chapter is that the choice between these two formats, which are said to be available in all languages, is not arbitrary. Instead, the interjection “yes is less agentive than it is in response to a [polar] question” (188). This observation is further used to argue in favor of the importance of iconicity in language and the use of this type of linguistic analysis to assess different degrees of agency. The connection between language-mediated-interaction and language-encoded-agency made explicit in this chapter constitutes an achievement that will ensure that this book will hold an important place within current debates about human action in the social and cognitive sciences.


