Semantic Conflict and Response Conflict in the Stroop Task

The most common ways researchers explain the Stroop effect are either through semantic or through response conflict. According to the literature, there are several methods capable of disentangling these conflicts: to use words outside of the response set, to use associatively related colors and words, or to use a “2:1” paradigm (requiring the same response for two types of stimuli). However, we believe that these methods cannot entirely differentiate semantic and response conflicts. We propose the following alternative method: when naming the color of a printed word (e.g., red, yellow, etc.) in the Stroop test, participants were asked to use different color names for some colors. For example, the red-colored stimuli had to be named by the word “yellow”. This approach allowed us to create semantically congruent stimuli, but with the conflict at the response level (the word red appears in red, but the participants have to say “yellow” because of the rule). Some stimuli remain congruent at the response level, but with the conflict at the semantic level (the word yellow appears in red, and the participants have to say “yellow” because of the rule). The results showed that semantically congruent stimuli do not produce the Stroop effect even if the meaning of the word corresponds to an incorrect response. In turn, congruence at the response level reduces the interference effect, but interference remains significant. Thus, the response conflict affects the magnitude of the Stroop effect only when there is a semantic conflict. Our data do not correspond to models that assume direct activation of responses corresponding to word meaning.

Correspondence: Alexey Starodubtsev, fleksbr@yandex.ru, St. Petersburg University, 7/9 Universitetskaya nab., 199034 St. Petersburg, Russia; Mikhail Allakhverdov, m.allakhverdov@smolny.org.

of the meaning of the word in a given trial with the correct response to previous test trials .
The conflict between the font color of a word and its meaning may occur at different stages of information processing. In the early processing stages, the physical features of the stimulus are analyzed; then the stimulus is semantically processed; and the response to the stimulus is prepared in the latest stages. Accordingly, three types of conflict that affect the speed and accuracy of task performance are identified in the literature: task conflict, semantic conflict, and response conflict, respectively. Task conflict reflects the influence of the process of reading the word. Because reading is not relevant to the task, it conflicts with the task of naming the ink color. However, task conflict affects response speed less than semantic conflict or response conflict does. It is believed that the influence of task conflict can be found only under specific conditions. Kalanthroff and co-authors conclude that in conditions where there is only a small number of control stimuli (in most experiments, less than half), the "task conflict does not arise (or is resolved very quickly)" (Kalanthroff, Davelaar, Henik, Goldfarb, & Usher, 2017, p. 1).
Semantic conflict reflects the impact of conflicting representations of a word's meaning and color. At this stage of information processing, the recognition of the color of the word and its meaning has already been completed, but the corresponding responses have yet to be formed. The conflict at this stage of processing is labeled differently depending on the researchers' understanding of interference mechanisms. For example, if interference is described in connection with word processing, the conflict is called "lexical" (the earlier stage, in this case, is called "pre-lexical" and the later stage is called "post-lexical"). If the researchers assume that a decision is made to "translate" characteristics of the stimuli into the corresponding response, the stage is called "conflict at the decision level", or simply "decision" in quotation marks. Similar concepts lie behind the terms "perceptual conflict" (Bekci & Karakas, 1985; Doehrman, Landau, & O'Connel, 1978) and "stimuli-stimuli conflict". The authors may have different interpretations of information processing at this stage and sometimes do not distinguish between semantic and task conflicts (e.g., Steinhauser & Hubner, 2009). However, semantic conflict and response conflict are distinguished more often.
Response conflict has traditionally been considered a significant source of interference. The main assumptions of the response conflict hypothesis are that a person cannot simultaneously name both the ink color and the meaning of a word, and that the meaning of a word has priority in processing over its ink color (because the meaning of the word is processed more quickly or because reading is an automatic process, unlike the processing of a word's font color). There are also modern models that explain interference by the time-consuming suppression of an irrelevant answer.
Nevertheless, there are also hypotheses that treat semantic conflict as the main cause of interference.
The majority of scientists accept the presence of both semantic conflict and response conflict. In many ways, such a difference in positions is due to different methods of separating the semantic conflict from the response conflict. Let us look more closely at the methods that allow us to disentangle the influence of the response conflict and the semantic conflict in experiments.

Methods that Differentiate Semantic and Response Conflicts
Methods that induce only semantic conflict without any response conflict use different approaches to prevent participants from making an erroneous response that matches the meaning of the word. Researchers have widely accepted the list of procedures laid out by Parris and colleagues, which describes the essential ways of disentangling semantic and response conflicts: to use words outside of the response set, to use semantically related words and colors, and to use a response button that matches both the color and the meaning of the incongruent stimulus.
The first method is to use words that are not part of the response set. For example, if the colors of the stimuli can only be red or blue, then the response set for the color-naming task consists of the elements red and blue. However, words outside of the response set (e.g., the word green appearing in blue for the case described above) still produce an interference effect, although the effect is significantly reduced compared to trials that use words within the response set (e.g., the word red in blue).
The second way to differentiate the influence of semantic conflict and response conflict in an experiment is to use words associatively connected with correct responses. For example, a participant is slower to name the ink color of the word water printed in red than that of the word water printed in blue. Since the relation between the meaning of the word water and the ink color of its font does not correspond directly to the responses, the interference in the case of the word water printed in red suggests only a semantic conflict.
Finally, the third way to induce a semantic conflict without a response conflict is to use tasks in which two different types of stimuli require the same response (2:1 paradigm). The instruction for participants is to give one response (for example, to press a button) if the stimulus is either red or blue. In this case, the word red in blue color will be semantically conflicting. Still, both the color and meaning correspond to the same response, and therefore this stimulus will be congruent at the response level (e.g., Schmidt, Hartsuiker, & De Houwer, 2018; Shichel & Tzelgov, 2017; Steinhauser & Hubner, 2009).

Criticism of Traditional Approaches Utilized to Differentiate Semantic and Response Conflicts
The methods of disentangling different types of conflicts are based on assumptions that may seem flawed. For example, are the meanings of words outside of the response set not processed at the response level? It seems plausible that a response conflict in words outside of the response set is attenuated, but not absent. Nonetheless, some authors use response conflict to explain the results of experiments in which they used only words outside of the response set. Another line of criticism is that the response set draws participants' attention to certain physical features of the stimuli. If the participants' responses can only be "red" or "blue", then they will pay attention to the words which are similar to these responses. The response set can also influence the semantic processing of the stimulus. It is known, for example, that expecting a certain stimulus affects the effectiveness of its subsequent recognition (effect of the perceptual set). By analogy, a response set may also affect the semantic processing of word meanings.
The effect of an associative connection between the color of a word's appearance and its meaning may also carry a conflict at the response level. For example, presenting the word grass will speed up the subsequent response "green." Similarly, the word water may pave the way for the response "blue". Therefore, presenting the word water in red may cause a conflict of the responses "red" and "blue," i.e., a reduced response conflict. Schmidt, Cheesman, and Besner (2013) showed that the effect of an "associative conflict" (the word grass in red) only occurs when the word "green" belongs to the response set. In the authors' opinion, it is the response "green" that competes with the word "red" if the word grass is presented in red, but there is no semantic conflict between the word grass and the response word "red." Indeed, Schmidt and colleagues showed that in tasks other than the Stroop test (the task of reading words, lexical decision task), the word "red" speeds up the subsequent processing of the word grass.
The Russian Journal of Cognitive Science, Vol. 6, Issue 4, December 2019
For this reason, we cannot exclude the influence of response conflict when a word's meaning and its color are semantically related. In the work of Riley and coauthors with the picture-word paradigm, the data suggest that the "semantic conflict" may vary if participants familiarize themselves with the permitted responses before the experiment. This further complicates the possible ways to disentangle semantic and response conflicts.
Using the "2:1" paradigm seems to be the most reliable way to trigger a semantic conflict without a conflict of responses. However, in the experiments known to us, the 2:1 paradigm was used only in the motor versions of the Stroop task (manual or oculomotor). In such a procedure, participants are asked not to name the ink color aloud but to make a particular movement with their hand or eye, which has been assigned to a specific color before the experiment. In one study, participants responded by moving their eyes: if the stimulus was blue or green, they had to look at one of the squares displayed on a screen, while if it was red or yellow, they had to look at another square. In a study by van Veen and Carter (2005), participants had to press buttons. For example, participants had to press one button if the stimulus was green or blue, and another button if the stimulus was red or yellow. In this task, the word green appearing in blue was semantically conflicting but congruent at the response level (the meaning of the word and its color corresponded to the same response). The congruence at the response level was sufficient to significantly reduce or eliminate the Stroop effect.
Nevertheless, the motor Stroop task is considerably different from the task of overt naming of ink color. In color naming, a processing priority for word meaning has been revealed. Word meanings affect the speed of color naming, while ink color does not influence reading speed (Stroop effect asymmetry). This is not the case for the motor Stroop task. When using keyboard presses or mouse clicks to respond to word meaning, participants react slower when the word meaning is incongruent with the ink color (reverse Stroop effect). It is well known that participants compare the color of a filled rectangle with the font color of a word more quickly than with the word's meaning. This fact often contributes to the critique of the response conflict hypothesis. After all, if the word gets priority in processing, then participants should match the meaning of a word with the color of a rectangle faster than they match the color of a word with the color of a rectangle. Such criticism may seem weak because, in this case, the task considerably differs from the classical Stroop task. In our opinion, the use of non-classical methods should be justified either logically or by the results of experiments. However, even the motor (manual) Stroop task does not comply with this requirement. For example, in the oral version of the Stroop test, there is a negative priming effect: participants are slower to name the ink color of a word if the color matches the meaning of the previous word. This effect can be interpreted via the response conflict hypothesis: the suppression of a response that corresponds to the meaning of a word extends to the processing of the next stimulus, in which this response is already the correct response to the task. However, no negative priming effect was found in the manual Stroop test. Moreover, the difference between the manual and oral versions of the Stroop test is indicated in the study of Sharma and McKenna (1988). In their experiment, different stimuli featured one or several components of interference: lexical, semantic relatedness, semantic relevance, and response set membership. A colored set of "X" characters (XXXXXX) does not contain any of these components; neutral words (table, nail) include only the lexical factor; words related to color (sea, grass) contain the semantic relatedness component; color words outside of the response set (orange, white) contain elements of semantic relevance; and classical Stroop stimuli include all described components. Each of the components increased interference in the oral Stroop test. Still, only response set membership influenced the interference level in the manual version of the Stroop test. This suggests that the oral and manual versions of the Stroop test cannot be used as interchangeable methods, at least not when studying the roles of semantic and response conflicts.

Study Rationale
Thus, most often, semantic and response conflicts are separated by varying the response set, by using associative connections, or by using the "2:1" paradigm. The results obtained by varying the response set or by using associative connections can also be explained by conflict at the response level alone. The "2:1" paradigm has so far been used only in the motor versions of the Stroop task. For these reasons, our first goal was to reproduce the "2:1" paradigm in the oral Stroop test.
Another motivation for our research is that the description of different types of conflicts in modern works on Stroop interference could be simplified. We believe that the co-presence of many types of conflicts (and many types of "control" over these conflicts) is logically redundant. For example, if there is a semantic conflict that should be resolved to accomplish the task correctly, there should be no response conflict. If the meaning of the word is suppressed at the response level, then there is no need to suppress it at the semantic level as well.

Study Design
In our study, we planned to evoke a "pure" conflict at either the response level or the semantic level in the oral Stroop task. In the oral Stroop test, two colors were to be renamed with other color labels (the idea for this method was suggested in a theoretical paper by Arbekova and Gusev, 2017). For example, participants should say "yellow" in response to red stimuli. Let us consider the experimental conditions that can be implemented with such an approach. In one condition, the word red appearing in red is presented when the participants should call the red stimuli by the word "yellow." On the one hand, the meaning of the word does not activate the correct response for the task. On the other hand, there are more sources of information related to the representation of "red" on the semantic level. In some cases, redundant information speeds up the responses (Utochkin & Bolshakova, 2010).
Another condition that is important for us is matching the meaning of a word with the correct response in the case of a semantic conflict. For example, the participant has to say "red" in response to the yellow stimulus, and the stimulus is the word red in yellow ink. The color is not the same as the meaning (semantic conflict), but the meaning of the word is the same as the participant's response. We expect that this design will allow us to distinguish between a conflict at the response level and a semantic conflict.
The proposed experiment implements not only the "semantic conflict without response conflict" condition but also the "response conflict without semantic conflict" condition. Thus, the main aim of the research is to find out whether the factors of "the response conflict" and "the semantic conflict" are independent of each other. Examples of the stimuli that are implemented with our method are given in Table 1.
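The two "pure" conditions described above can be illustrated with a minimal Python sketch. It assumes that a naming rule can be represented as a mapping from each renamed ink color to the label spoken instead; the function name `classify` and this dictionary layout are hypothetical and are not taken from the study materials.

```python
def classify(word, ink, rule):
    """Return (semantic level, response level) for one Stroop stimulus.

    `rule` is a hypothetical representation of the naming rule: it maps
    each renamed ink color to the label the participant must say.
    Colors absent from `rule` keep their usual names.
    """
    required_response = rule.get(ink, ink)
    semantic = "congruent" if word == ink else "conflict"
    response = "congruent" if word == required_response else "conflict"
    return semantic, response


# Example rule from the text: red stimuli must be called "yellow".
rule = {"red": "yellow"}
print(classify("red", "red", rule))     # semantic congruence, response conflict
print(classify("yellow", "red", rule))  # semantic conflict, response congruence
print(classify("blue", "red", rule))    # conflict at both levels
```

Under this representation, the regular-naming colors reduce to the classical Stroop case, in which the two levels always coincide.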

Method
Participants. Twenty-four individuals (7 men and 17 women) aged 18 to 40 years (M = 24.4; SD = 5.8) participated in the study. Participation in the experiment was part of the "Experiment Week" event. Respondents received no compensation for participating in the study.
Equipment. Stimuli were presented on the LCD monitor of a desktop computer equipped with a noise-suppressing microphone; the distance from the participant's eyes to the display screen was 50-60 cm. LCD monitor characteristics: diagonal 24" (61 cm); display width: 53 cm; display height: 29.5 cm; resolution: 1920 × 1080 px (16:9); refresh rate: 60 Hz (with maximum of 144 Hz). The presentation of stimuli and recording of responses were performed with the help of PsychoPy2 software. The time interval between the appearance of a stimulus and the beginning of the response, as well as the correctness of the vocalization, were determined manually in the Praat program.
Stimuli. Neutral stimuli: "XXXXXX" characters displayed in blue, green, red, or yellow. Congruent stimuli: The words red, blue, yellow, and green displayed in the color that matches their meaning. Incongruent stimuli: words written in colors that do not match their meaning; that is, the words red, blue, yellow and green printed in one of the other three colors. All color-meaning combinations for the condition "incongruent stimuli" appeared an equal number of times during the experiment.
There were 12 rules for color naming. For example, one of the rules was to call red stimuli "blue" and to say "green" in response to the yellow stimulus. Each participant was assigned one of the 12 rules.
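The text does not spell out how the 12 rules were constructed. One reading that yields exactly 12 rules and agrees with the examples given is sketched below: two of the four colors are renamed, and they receive the names of the two remaining colors. The function `naming_rules` and this construction are assumptions made for illustration.

```python
from itertools import combinations, permutations

COLORS = ("red", "blue", "yellow", "green")


def naming_rules(colors):
    """Enumerate naming rules under the assumed construction.

    Each rule renames two colors with the names of the other two,
    so the count is C(4, 2) * 2! = 12 for four colors.
    """
    rules = []
    for renamed in combinations(colors, 2):
        others = [c for c in colors if c not in renamed]
        for labels in permutations(others):
            rules.append(dict(zip(renamed, labels)))
    return rules


rules = naming_rules(COLORS)
print(len(rules))
print({"red": "blue", "yellow": "green"} in rules)
```

Note that under this construction every spoken label is shared by two ink colors, which is what makes the design an oral analogue of the "2:1" paradigm.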
Procedure. Before the experiment, we assigned each participant a rule of color naming using the Latin square method. They had to name two of the used colors in the usual way and use specific different names for the two other colors (e.g., "blue" instead of "green", and "yellow" instead of "red"). During the practice stage, participants received the instruction to answer as accurately as possible without worrying about the time. In the experimental stage, the task was to respond as quickly as possible. We also informed participants that any extraneous sounds ("aaa," "mmm," etc.) and word stretching and drawling ("bluuue") would count as errors.
In the first two stages of the procedure, participants practiced the unfamiliar naming. In the first stage, they named the colors of 60 neutral stimuli (4 color variants). In the second stage, they named 36 stimuli that could be neutral, incongruent, or congruent. The stimuli appeared one by one on a grey background, written in Arial upper case (on average 8 cm wide and 1 cm high).
In the first two stages, the participants had to press the space bar to start the next trial. The experimenter sat next to them and, if necessary, reminded them of the rule or called their attention to their errors. In the third stage, the participants had to perform the test on their own. The sequence of presentation within a trial was as follows: an empty screen (1000 ms), a fixation cross (300 ms), a blank screen (400 ms), and the Stroop stimulus (1700 ms). After this time passed, the next stimulus appeared no matter how the participants answered or whether they responded at all. There were 144 stimuli in total, a third of which were congruent, a third were incongruent, and the remaining stimuli were neutral (factor: type of stimuli); half of the trials presented a stimulus that was to be named in the regular way, and the other half required the rule-modified naming (factor: type of naming). The stimuli presentation order was random, but there were no repetitions of color, meaning, or congruence more than three times in a row; repetition in terms of naming type was restricted to no more than five times in a row.
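The constrained randomization described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the per-cell counts (12 neutral and 12 congruent trials per color, 4 trials per incongruent color-word pair) are derived here from the stated totals, the limits are read as maximum run lengths, and the greedy ordering with restarts is a swapped-in technique that does not sample uniformly from all valid orders. All names are hypothetical.

```python
import random

COLORS = ("red", "blue", "yellow", "green")
# Longest allowed run of identical values (one reading of the constraints).
MAX_RUN = {"ink": 3, "word": 3, "type": 3, "naming": 5}


def build_trials(rule):
    """Build 144 trials; `rule` maps each renamed ink color to its label."""
    trials = []

    def add(word, ink, kind, copies):
        naming = "modified" if ink in rule else "regular"
        for _ in range(copies):
            trials.append({"word": word, "ink": ink, "type": kind,
                           "naming": naming, "response": rule.get(ink, ink)})

    for ink in COLORS:
        add("XXXXXX", ink, "neutral", 12)          # 4 colors x 12 = 48
        add(ink, ink, "congruent", 12)             # 4 colors x 12 = 48
        for word in COLORS:
            if word != ink:
                add(word, ink, "incongruent", 4)   # 12 pairs x 4 = 48
    return trials


def breaks_limit(order, candidate):
    """True if appending `candidate` would exceed a run-length limit."""
    for key, limit in MAX_RUN.items():
        tail = order[-limit:]
        if len(tail) == limit and all(t[key] == candidate[key] for t in tail):
            return True
    return False


def constrained_order(trials, rng, max_restarts=1000):
    """Greedy random ordering that restarts when it reaches a dead end."""
    for _ in range(max_restarts):
        pool = list(trials)
        rng.shuffle(pool)
        order = []
        while pool:
            pick = next((i for i, t in enumerate(pool)
                         if not breaks_limit(order, t)), None)
            if pick is None:
                break  # every remaining trial would break a limit
            order.append(pool.pop(pick))
        if not pool:
            return order
    raise RuntimeError("no valid order found; raise max_restarts")


order = constrained_order(build_trials({"red": "yellow", "green": "blue"}),
                          random.Random(0))
print(len(order))
```

With these assumed counts, half of the trials fall under modified naming, and eight incongruent trials have a word that equals the required response, which matches the number of "response-congruent" stimuli reported per participant.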

Results
We counted partially or fully wrong answers and extraneous vocalizations as errors. We removed them from further analysis: 3.5% for congruent stimuli, 8% for incongruent stimuli, and 3.1% for neutral stimuli in the case of regular color names; 1.9%, 6.4%, and 2.8%, respectively, in the case of color names modified by the rule. Only response times for correct responses were further analyzed. Figure 1 shows the average time for correct responses as a function of the "naming type" and "stimulus type" factors.
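The error exclusion and averaging step can be sketched in a few lines of Python. The record layout (`naming`, `type`, `correct`, `rt_ms`) and the sample values are hypothetical; they are not data from the study, and the study's own aggregation (for example, averaging within participants first) may differ.

```python
from collections import defaultdict


def summarize(trials):
    """Error rate and mean correct RT per (naming type, stimulus type) cell."""
    cells = defaultdict(lambda: {"n": 0, "errors": 0, "rt_sum": 0.0})
    for t in trials:
        cell = cells[(t["naming"], t["type"])]
        cell["n"] += 1
        if t["correct"]:
            cell["rt_sum"] += t["rt_ms"]   # only correct trials enter the mean
        else:
            cell["errors"] += 1
    summary = {}
    for key, cell in cells.items():
        n_correct = cell["n"] - cell["errors"]
        summary[key] = {
            "error_rate": cell["errors"] / cell["n"],
            "mean_rt_ms": cell["rt_sum"] / n_correct if n_correct else None,
        }
    return summary


# Hypothetical records, for illustration only.
demo = [
    {"naming": "regular", "type": "incongruent", "correct": True, "rt_ms": 900},
    {"naming": "regular", "type": "incongruent", "correct": False, "rt_ms": 700},
    {"naming": "regular", "type": "incongruent", "correct": True, "rt_ms": 950},
    {"naming": "regular", "type": "congruent", "correct": True, "rt_ms": 800},
]
print(summarize(demo))
```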
Post hoc analysis of the "stimulus type" factor with Bonferroni correction revealed slower responses to incongruent stimuli (M = 927 ms; SD = 27 ms) than to neutral (M = 816 ms; SD = 23 ms) and congruent stimuli (M = 803 ms; SD = 22 ms). Both differences were statistically significant (incongruent vs. neutral: MD = 110 ms; SE = 17; p < .001; incongruent vs. congruent: MD = 123 ms; SE = 13; p < .001). Participants were significantly slower in responding if it was necessary to use another label for the color of the stimuli (MD = 79 ms; SE = 12). All averages are shown in Figure 1.
In the reported analysis, we did not account for the semantically conflicting stimuli that were congruent at the response level (e.g., the word red in yellow when the instruction is to name yellow stimuli with the word "red"). Each participant faced eight such "response-congruent" stimuli. In our experiment, we used equal proportions of stimuli of each type, color, and meaning because otherwise we would face artifacts of associative learning or expectancy effects. Therefore, there were not many "response-congruent" stimuli. However, we reused part of our data to analyze response times specifically for the "response-congruent" stimuli. Response times to "response-congruent" stimuli were compared with response times to congruent stimuli (another color naming condition) and other incongruent stimuli (another color naming condition, but not response-congruent). Response time to response-congruent stimuli was on average shorter than response time to other incongruent stimuli (M = 908 ms, SD = 44 ms vs. M = 958 ms, SD = 32 ms), but longer than response time to congruent stimuli (M = 908 ms, SD = 44 ms vs. M = 848 ms, SD = 24 ms). Both differences were statistically significant: t(23) = -2.3, p = .032 and t(23) = 2.1, p = .046. In both cases, the effect size was medium: Cohen's d = .43 and Cohen's d = .46, respectively. This result is interesting and requires independent verification since we used the same data set for two different types of statistical analysis.
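For readers who wish to run this kind of within-participant comparison on their own data, a minimal sketch follows. It assumes a paired t-test on per-participant mean response times and the standardized mean difference of the paired scores (often written d_z); the text does not state which variant of Cohen's d was used, so that choice is an assumption. The input values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_and_dz(a, b):
    """Paired t statistic, effect size d_z, and degrees of freedom.

    `a` and `b` hold one mean response time per participant in each of
    the two conditions, in the same participant order.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)   # mean difference / SD of differences
    t = d_z * math.sqrt(n)             # equivalent to mean / (SD / sqrt(n))
    return t, d_z, n - 1


# Hypothetical per-participant means in ms, for illustration only.
cond_a = [900, 950, 1000, 850]
cond_b = [880, 900, 990, 870]
t, d_z, df = paired_t_and_dz(cond_a, cond_b)
print(round(t, 3), round(d_z, 3), df)
```

With 24 participants the test has 23 degrees of freedom, and under this definition t and d_z are tied by t = d_z × √24.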

Discussion
The main result of our work is that matching the color and meaning of the word causes a decrease in response time, even if the response does not match the meaning of the word. Thus, a response conflict does not occur without a semantic conflict. It can be thought of as if the color and meaning of a word are combined into one representation, and then this representation is "translated" into another one according to the rules of response production. However, combining the representation of color and meaning is possible only after processing both the font color and meaning of the word. But if the color of a word is already identified, then the meaning of a word is no longer necessary for effective task performance. Moreover, in this case, the word meaning does not match the correct response, and its "translation" into the correct response takes time.
For this reason, congruence at the semantic level in itself should not speed up the response time.
To explain the obtained results, let us consider the mechanisms that allow a correct response to the Stroop task despite conflicts. Those mechanisms are usually related to cognitive control (see also Schmidt, 2019, for a critical review). The general belief is that control weakens the tendency to respond to the meaning of the word and increases the tendency to respond to the font color. However, the question remains of how cognitive control "knows" which of the response tendencies it should support. In the model proposed by Verguts and Notebaert (2008, 2009), conflict detection occurs before the response is ready. Moreover, conflict detection does not contain an indication of which of the processes is "erroneous," but signals the conflict "in general." In other words, conflict detection occurs at the semantic level.
To summarize, there is a monitoring mechanism that is triggered when it detects a semantic conflict. There is also a response control mechanism that works with prepared responses. If a conflict is detected on the semantic level, the prepared responses are controlled, which requires more time. After activation of the response control mechanism, task performance slows down, and the more reasons there are to reject a particular response, the sooner the final response is given. We hypothesize that this is why, in our study, when the color of a word matches its meaning, the detection mechanism was not triggered. Responses were given faster despite the incongruence at the response level. In turn, a conflict is detected when the color and meaning do not match, even if the word matches the correct response. Indeed, the conflict detection mechanism only takes the meanings of words and does not determine whether a word's meaning will match the right response. After a semantic conflict is detected, the response control is triggered. If, in reality, there is no conflict of response (e.g., "response-congruent" stimulus), the time of this control is reduced.
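The two-stage account summarized above can be expressed as a toy model. This is only an illustrative sketch: the baseline and the two time costs are arbitrary placeholder values rather than estimates from the data, neutral stimuli and the general cost of modified naming are left out, and the function name is hypothetical.

```python
def predicted_rt(word, ink, rule, base_ms=800.0, control_ms=50.0, reject_ms=50.0):
    """Toy response time under the monitoring-plus-control interpretation.

    `rule` maps renamed ink colors to the required spoken label.
    All millisecond values are arbitrary placeholders.
    """
    required_response = rule.get(ink, ink)
    if word == ink:
        # No semantic conflict: the monitor stays silent, even if the word
        # does not match the required response.
        return base_ms
    rt = base_ms + control_ms          # semantic conflict triggers response control
    if word != required_response:
        rt += reject_ms                # the word's response must also be rejected
    return rt


rule = {"red": "yellow"}
print(predicted_rt("red", "red", rule))     # semantically congruent
print(predicted_rt("yellow", "red", rule))  # response-congruent
print(predicted_rt("blue", "red", rule))    # conflict at both levels
```

The sketch reproduces only the ordering of the three conditions discussed in the text (semantically congruent fastest, response-congruent intermediate, fully incongruent slowest), not the magnitudes.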
It should be noted that such an interpretation does not directly imply any conclusions about how exactly the responses are controlled. For now, we can only suggest that more time is spent on answering when a conflict is detected. According to Verguts and Notebaert (2008, 2009), the activation of all representations in the mind increases, including the representations of the color and the meaning of the word. Nevertheless, alternative explanations can be offered: after a conflict is detected, the requirements for accuracy of the answer increase, impulsive responses are suppressed, etc.
Our results allow us to offer a solution for one of the contradictions in the study of interference. On the one hand, word meaning processing is referred to as a ballistic process. Once such mental processes are started, there is no possibility to interrupt or attenuate them (including by cognitive control mechanisms). On the other hand, the amount of interference may decrease when the context changes (e.g., when the number of incongruent stimuli increases compared to neutral stimuli), which is cited as evidence of non-ballistic word processing. The results of our research suggest that word processing on a semantic level is "ballistic," but semantic processing in its turn can trigger response control mechanisms.

Conclusion
Our research shows that semantic conflict and response conflict are not independent of each other in the Stroop test. We believe that the semantic conflict triggers control processes, but this control itself works with the responses, suppressing irrelevant answers if necessary.