The Sotaq optimality based computer program and secondary stress in two varieties of Portuguese *

Maria Bernadete Abaurre, Charlotte Galves, Arnaldo Mandel, Filomena Sandalo **

Abstract

Typical postlexical interface phenomena, like secondary (rhythmic) stress, can be successfully modeled by OT analyses, which predict optimally stressed outputs from a set of possible inputs and a hierarchically ranked set of constraints. This paper presents an OT analysis for European and Brazilian Portuguese secondary stressing. Based on this analysis, a computer program, sotaq, has been developed, allowing for automatic testing, against large corpora, of proposed constraint hierarchies for both varieties of Portuguese. Test results are presented, showing suitable hierarchies generating secondary stresses for both varieties of Portuguese.

Keywords: Optimality theory; secondary stress; European and Brazilian Portuguese; sotaq program; automated analysis; shortest path.

* This research was supported by FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil) under grants 98/ and 98/ (thematic project "Rhythmic Patterns, Parameter Setting, and Language Change"), and CNPq (Conselho Nacional de Pesquisa) under grants / (TIPAL project, "Probabilistic Tools for Pattern Identification Applied to Linguistics"), /92-5, and /88-7. We thank Pierre Collet, João Costa, Sónia Frota, Antonio Galves and Marina Vigário for valuable suggestions and discussion of earlier versions of this paper. The annotation of secondary stresses in Brazilian Portuguese was made by Flaviane Romani and Luciani Tenani. The annotation of the European Portuguese corpus was made by João Costa, Sónia Frota and Marina Vigário. All remaining errors are of course our own. The names of the authors are arranged in alphabetical order.
A first attempt to approach rhythmic differences between EP and BP in terms of constraint ranking was proposed in Abaurre & Galves (1998) and in Sandalo, Abaurre & Galves (1999).
Linguistics Department, Universidade Estadual de Campinas & CNPq.
Computer Science Department, Universidade de São Paulo.
** Linguistics Department, Universidade Estadual de Campinas. Corresponding author. Present address: IEL/UNICAMP, CP 6045, Campinas-SP, Brazil.

1. Introduction
This paper presents an analysis of secondary stress in Brazilian and European Portuguese (BP and EP) based on Optimality Theory (OT). The model used associates to each linguistic string structural descriptions that are collections of decompositions into chunks of consecutive syllables, one of which is singled out as stressed. So, the input for the OT generator (Gen) consists of sentences and the output is a collection of feet. The secondary stresses are inferred by the OT evaluator (Eval) from those appearing in the segmentations that are evaluated as optimal according to a set of ranked constraints. In current OT literature, constraint rankings have been tested manually on small data sets, with a small set of outputs. We have developed a computer program, sotaq, that tests proposed stress systems (formulated in terms of constraint hierarchies) for both varieties of Portuguese against actually observed stresses in large corpora, thus allowing for automatic testing of Optimality Theory predictions for secondary stress. Such an analysis is presented here, along with the corresponding constraint hierarchies for BP and EP.

2. Primary stress

Although BP and EP place primary stress in exactly the same position, secondary stress positioning is remarkably different, as can be seen below. The examples present some possible instances of secondary stress (rhythmic stress) placement in both European and Brazilian Portuguese according to native speakers of each of the varieties. The syllables bearing primary stress are in bold and those bearing secondary stress are underlined:

EP:
( 1 ) a. A autoridade do governador diminuiu ~
b. A autoridade do governador diminuiu
BP:
( 2 ) a. A autoridade do governador diminuiu ~ b. A autoridade do governador diminuiu
EP:
( 3 ) a. A modernização foi satisfatória ~ b. A modernização foi satisfatória
BP:
( 4 ) a. A modernização foi satisfatória ~ b. A modernização foi satisfatória
EP:
( 5 ) a. A catalogadora compreendeu o trabalho da pesquisadora ~ b. A catalogadora compreendeu o trabalho da pesquisadora
BP:
( 6 ) a. A catalogadora compreendeu o trabalho da pesquisadora ~ b. A catalogadora compreendeu o trabalho da pesquisadora

The facts of primary and secondary stress in Portuguese favor Van der Hulst's (1997) position, according to which primary and secondary stresses are not derived by the same algorithm. Van der Hulst notes that, in the majority of languages, the assignment of primary stress does not depend on prior exhaustive footing. Indeed, the assignment of primary and secondary stresses in Brazilian and European Portuguese is clearly independent.

In this paper we assume that primary stress in Portuguese is part of the language's lexical information. That is, it is not assigned by the computational system of the language. Our assumption is based on the fact that, although it is well-known that Portuguese main stress falls on one of the last three syllables, none of the current analyses of Portuguese is able to successfully predict which of the three last syllables will be stressed without an extraordinary use of lexical extrametricality, as shown below.
Since many Portuguese words bear primary stress on the last syllable if it is heavy, many researchers have postulated that primary stress is assigned by constructing noniterative moraic trochees from right to left. This is the analysis assumed, for instance, by Bisol (1992), Mateus (1975, 1983), Massini-Cagliari (1995), among many others. However, something must be said about the great number of nouns ending in light syllables that bear stress on the last syllable (e.g. sofá) and about the great number of words with antepenultimate stress (e.g. pérola). There are also many words with penultimate stress even when the last syllable is heavy (e.g. cadáver). According to this analysis, most of the exceptions are dealt with via lexical extrametricality.

Given the high number of words that remain unaccounted for by an analysis that postulates moraic trochees for Portuguese, Lee (1994) revisits Camara (1953) and postulates that /e/, /a/ and /o/ in final position of nouns are thematic vowels and are outside the stress domain. According to Lee, the Portuguese stress domain is the root, not the stem, and primary stressing relies on a non-iterative iambic pattern. According to this analysis, words like mesa bear stress on the penultimate syllable because their last vowel is a thematic vowel, that is, a suffix, and it is, therefore, outside the stress domain. And words like sofá bear stress on the last syllable because they do not have a thematic vowel. Although this analysis has the advantage of decreasing the number of exceptions, it is circular because we only know that a vowel is thematic (i.e. a suffix) once we know whether it is stressed. In addition to its circularity, this analysis still has many exceptions, since the words with an antepenultimate stress pattern and the words ending in a heavy syllable with a penultimate stress pattern remain unaccounted for.
In conclusion, both types of analysis require an extraordinary amount of lexical extrametricality to solve the great number of exceptions, which suggests that it is more
economical to postulate that primary stress is phonemic. This kind of conclusion is already widely assumed for Spanish, whose main stress phenomena are quite similar to those of Portuguese. According to Hayes (1995:96), "main stress in Spanish is phonemic, though it can be predicted to a fair extent by complex lexical rules, whose character continues to be debated".

3. Secondary stress

Our analysis of secondary stress is based on a corpus of 20 sentences which were read three times by three native speakers of Portuguese from Lisbon, Portugal, and by two native speakers of Portuguese from São Paulo, Brazil. The data have been transcribed on the basis of auditory perception, but spectrograms were used as support for the phonetic transcription 1. Our analysis holds for a normal rate of speech in sentences that convey new information, as in headline news. Slow, deliberate speech can lead to stress patterns that will be disregarded here. For instance, it is well known that a different stress pattern may result from what intuitively feels like special emphasis on a particular element.

1 The spectrograms were made by means of the Multi Speech program for acoustic analysis.

The data suggested two basic distinctions:
In BP, secondary stress follows a binary pattern, while no similar restriction holds for EP;
EP allows functional words to be stressed, while BP does not.

We elaborate on this as follows. Brazilian Portuguese secondary stress follows a rarely violated binary (two-syllable) pattern. The exceptions to the binary system are mostly cases of the so-called initial dactyl (Prince 1983). That is, there is an initial ternary alternation (the initial dactyl) when the stress domain has an odd number of syllables. The initial dactyl is not obligatory, however. For instance, a word like satisfatória can be stressed as satisfatória, an example of the initial dactyl, or as satisfatória. It is well known that Spanish presents exactly the same phenomenon (Harris 1983, 1989, Roca 1986). Harris (1989), within Metrical Theory, has suggested an analysis for Spanish which states that the two variants represent alternative outcomes to the resolution of a stress clash. On Harris's analysis, secondary stress in Spanish is applied by building trochees from right to left on the syllables preceding the syllable bearing main stress. If we allow degenerate feet at an intermediate stage of the derivation, the sort of clash shown in 7 will result. Initial dactyls can then be derived by applying a rule of rightward destressing and reparsing, whose effects are shown in 8, where one syllable in the middle of the word (ti) is left unparsed. The other option is to resolve the clash with leftward destressing, as shown in 9.

( 7 ) ( x ) ( x ) (x )(x ) (x )(x )
σ σ σ σ σ σ σ σ σ
constantinopolitanismo
( 8 ) ( x ) ( x ) ( x ) (x )(x )(x ) ( x ) (x )(x )(x )
σ σ σ σ σ σ σ σ σ σ σ σ σ σ σ σ σ σ
constantinopolitanismo constantinopolitanismo

( 9 ) ( x ) (x )(x )(x ) (x )
σ σ σ σ σ σ σ σ σ
constantinopolitanismo

Hayes (1995) points out that "the crucial point of Harris's analysis is that it relies on a temporary degenerate foot, set up in the middle of the derivation (7), that either is expanded into a proper foot by destressing and reparsing, or is itself deleted." In neither case does the degenerate foot surface, and Hayes maintains that this shows that the crucial point of Spanish phonology is the presence of a constraint that bans degenerate feet. One could argue that the same analysis could be employed for BP. Our data, however, show that this analysis faces empirical problems, as discussed below.

An acoustic analysis of the BP facts shows that many words containing an odd number of syllables have undergone vowel deletion, which resulted in a perfect binary system. In other words, the syllable that Harris supposes to be left unparsed is actually not realized. Thus, the word satisfatória was actually realized as satsfatória, where the vowel /i/ has been deleted, resulting in a perfect binary structure ((satsfa)σ (tória)σ). One could argue that the strategy employed in Brazilian Portuguese to avoid degenerate feet is vowel deletion instead of simple reparsing. Thus, an analysis along the lines of Harris's proposal could be offered, provided that a rule of /i/ deletion is added. The phenomenon of vowel deletion in Brazilian Portuguese, however, shows that the facts are more complex than a metrical analysis can predict. The words containing an odd number of syllables are
the target for vowel deletion, which suggests that we are indeed looking at a language that prefers to avoid degenerate feet, as claimed by Hayes. The realizations in 10 and 11, however, are problematic for Metrical Theory because, if secondary stress results from an alternation of stressed and non-stressed syllables from right to left on the syllables preceding the syllable bearing main stress, there would be no reason for vowel deletion: there are four syllables preceding the syllable with main stress in investigador and in modernização, and therefore a perfectly binary alternation would result. The prosodically-induced vowel deletion of 10 and 11 only makes sense if we assume that there is a constraint that forces binary feet (i.e. (in vest)σ (ga dor)σ and (mo dern)σ (za ção)σ), and there is no need to introduce directionality (right to left counting) in order to obtain binarity via perfect alternation between strong and weak syllables, as predicted by a Metrical Theory analysis.

( 10 ) O in ves ti ga dor já lhe de vol veu o di nhei ro.
[win vest ga dor já lhe de vow vew: di nhei ro]
( 11 ) A modernização foi satisfatória
[a mo dern za ção foi sats fa tó ria]

We will propose in section 3 that the facts of Portuguese result from a conflict of forces instead of from a computation of alternating strong and weak syllables, as has been widely assumed for Spanish and also for Brazilian Portuguese within Metrical Theory (see for instance Collischonn 1993 for Brazilian Portuguese). In this system we will derive the facts of the initial dactyl without postulating degenerate feet that never surface. Such degenerate feet represent cases of absolute neutralization and it is widely accepted that
absolute neutralization must be avoided given the problems that it may bring for language acquisition. Since our OT analysis makes it possible to generate cases of initial dactyl where there are no cases of vowel deletion, it may be the case that our analysis can also be extended to Spanish, avoiding absolute neutralization for that language as well.

A process of vowel deletion that forces a binary system has been noticed before for primary stress (Bisol 2000, among others). For instance, it is well known that words like pérola 'pearl' are often realized as perla. This paper represents the first time that a similar phenomenon has been noticed for secondary stressing. Abaurre (1979) discusses several cases of vowel deletion in BP, but the phenomenon is not associated with foot binarity. Below are the acoustic configurations of the word modernização, where the first spectrogram attests the mentioned vowel deletion and the second spectrogram shows the same word with no vowel deletion.
European Portuguese differs from Brazilian Portuguese in that it is not a binary system. In European Portuguese the beginning of a sentence tends to be prominent, as noticed already by Frota (1998) and Vigário (1998). This fact can be noticed in the example below:

( 12 ) O investigador já me ofereceu dinheiro ~ O investigador já me ofereceu dinheiro.

But we find in our corpus other prominences at the beginning of smaller domains (cf. A catalogadora compreendeu o trabalho da pesquisadora). We leave for further research the exact characterization of this domain. Here, we refer to this domain as the phonological
phrase. 2 The important fact is that EP shows unbounded secondary footing. D'Andrade & Laks (1991) have claimed that secondary stresses are assigned via binary feet construction in EP, and Carvalho (1988/1989) claims that secondary stress is assigned via ternary feet. The transcription of our data by three native speakers of EP does not indicate either binary or ternary alternations.

2 According to Vigário (personal communication), the relevant domain for EP is the phonological word, rather than the phonological phrase. This will be taken into account in future developments of this work.

Another point where EP and BP differ concerning secondary stressing is that functional words can bear secondary stress in EP (A catalogadora ~ A catalogadora). That is, EP accepts the placement of a secondary stress either on the functional word that starts a phonological phrase or on the first syllable of the first lexical word of a phonological phrase. In BP, functional words never bear stress in a non-emphatic pronunciation.

Finally, EP and BP differ in that only EP has the option of not assigning any secondary prominences in a word (cf. O investigador já lhe devolveu o dinheiro).

The variation in secondary stress placement in both EP and BP is problematic for a Metrical Theory analysis because, in a derivational analysis, we would have to postulate that one form is the default and derive the other form via re-arrangement rules. Since EP accepts a range of variation that includes even the possibility of not assigning any secondary stresses, the re-arrangement rules for EP could be so complex as to make a derivational analysis unwieldy.

To sum up, an analysis in OT terms has the advantages of: (i) generating all the facts of both Brazilian and European Portuguese without postulating any cases of absolute neutralization; (ii) not forcing the usage of the notion of directionality, thus implying a
simplification of the phonological theory; and (iii) being able to generate variant forms in parallel.

3. An Optimality Analysis

We now describe our OT model in detail. The inputs will be sentences in a language (in our case, BP or EP). The structures assigned by Gen to each input are decompositions into segments. Those objects can be of two types:
1) A regular segment - a sequence of consecutive syllables, with one of them marked. This marked syllable is called the core of the segment. Note that this is a formal construct: the tagging has no a priori relation to any stresses, primary or secondary.
2) A pseudo segment - a single syllable, with no stress.
Furthermore, in a segmentation yielded by Gen, each syllable is contained in exactly one segment.

Recall that, in constructing an OT tableau, one draws one line for each possible output, that is, for each segmentation, in our case. As we will see later, tableaux are totally impractical for this model, since even moderately sized inputs have an extremely large number of possible outputs. Even a computer would not be able to list all those outputs, so a true mathematical optimization approach has to be taken to find the true optimal solutions without exhaustively searching all possibilities.
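To give a feeling for how quickly the candidate set grows, the following is a minimal sketch that counts segmentations under one reading of the definitions above. It assumes that Gen is otherwise unrestricted (no word or phrase information is used), that a one-syllable regular segment is distinct from a pseudo segment, and that any syllable of a regular segment may be its core. The function name `count_segmentations` is hypothetical and is not part of sotaq.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_segmentations(n: int) -> int:
    """Count segmentations of n syllables (assumed, unrestricted Gen)."""
    if n == 0:
        return 1  # the empty sequence has exactly one (empty) segmentation
    # Case 1: the first syllable is a pseudo segment (no core).
    total = count_segmentations(n - 1)
    # Case 2: the first k syllables form a regular segment; any one of
    # its k syllables can be tagged as the core, hence the factor k.
    for k in range(1, n + 1):
        total += k * count_segmentations(n - k)
    return total


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 19):
        print(n, count_segmentations(n))
```

Under these assumptions the count grows exponentially with sentence length, which is consistent with the remark that listing every tableau line is impractical.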
Each segmentation yields a stressing of the given sentence simply by stressing the core of the regular segments. Under this correspondence, regular segments can be interpreted as metric feet of the resulting stressed sentence.

This model entails a specific locality restriction on the type of constraints we are willing to consider: each constraint ought to be checkable by considering each segment individually, or by checking each pair of adjacent segments. It turns out that most constraints already used in other OT work can be expressed this way, so we are not handicapping ourselves too much.

One important aspect of our model is that we have not restricted ourselves to a strict ranking of the constraints, but have completely accepted the possibility of a stratified dominance hierarchy. The reason for that is the large amount of free variation observed in our data, and the virtual impossibility of accounting for it with strict hierarchies. We note that the way we use stratified hierarchies differs from the proposal of Reynolds (1994), and is not exactly contemplated in section 9.4 of Kager (1999); the counting is reminiscent of the Tesar/Smolensky Learning Algorithm. Our data give the impression that this approach accounts for free variation better than other methods, although further research is needed to clarify whether this is the case.

We describe each constraint that follows in two forms: an intensional form (in italics), giving an idea, and a formal form, telling when a violation mark must be assigned. The constraints found to be relevant to this analysis so far are:

DEP ST: Deletion of lexical stresses is not allowed. Violated by a segment containing a lexically stressed syllable not tagged as the core.
RIGHTMOST: Lexical stresses occur at the last foot of each lexical word. Violated by a
segment not containing the last syllable of a word, provided the segment's core has a lexical stress.
INTLEX: A lexical word must be a prosodic word. Violated by a segment that contains syllables of more than one word.
ALIGN (FT, L, PHP L): Every foot has its left boundary at the left edge of a phonological phrase. Violated by a regular segment whose left boundary is not the left edge of a phonological phrase. 3
FOOTBIN/BINGRAD: Feet must be binary. FOOTBIN is violated by a regular segment that does not have exactly two syllables. BINGRAD is a gradient form of the same restriction: long feet count one violation for each syllable exceeding the initial two.
PARSE: All syllables must be parsed into feet. Violated by each pseudo-segment which is not a functional word.
TROCHEE: All feet must be left-headed. Violated by a segment whose core is not its initial syllable.
NOCLASH: No stressed syllables can be adjacent. Violated by a pair of successive segments, the core of the first in the last syllable, the core of the second in its initial syllable.
CLASHINT: No stressed syllables within a lexical word can be adjacent. Like NOCLASH, but with the two cores in the same word.
CLASHEXT: No stressed syllables in successive words can be adjacent. Like NOCLASH, but with the second segment starting at the beginning of a lexical word.
NOLAPSE: No adjacent unstressed syllables inside a word-medial foot. Violated by a
segment that does not occur at the beginning or end of a lexical word and contains two adjacent non-core syllables.

One should get an intuitive picture of the dynamic of conflict among those constraints, in order to appreciate their significance. Two of them, RIGHTMOST and CLASHINT, are undominated in both EP and BP. This is evident in the analyzed corpus, and we conjecture that this is a valid generalization for these languages. RIGHTMOST reflects properties of lexical words. 4 CLASHINT comes from the rhythmic need for alternation of prominent and non-prominent syllables in these languages for the style of speech we are considering.

Next comes DEP ST, which for the experiments we have done has surfaced as undominated. There are, however, circumstances of primary stress retraction in BP, as discussed by Sandalo & Truckenbrodt (2001). In the OT system, this fact can be captured by other constraints that are not listed here and that dominate DEP ST. In future developments of this work, these constraints will be explored.

3 In the current version of sotaq, phonological phrases are delimited manually, but see Sandalo & Truckenbrodt (2001) for a proposal that could be implemented in sotaq in order to generate phonological phrases in BP automatically. See also footnote 2 for EP.
4 Indeed, this constraint is so obviously true that we were only made aware of the need to make it explicit when sotaq started producing unacceptable segmentations that clearly violated it. It is well known that all known natural languages have primary stress on an edge window, and RIGHTMOST just places the window for Portuguese.
5 INTLEX is given as LXWD = PRWD in Kager (1999).

In opposition to FOOTBIN/BINGRAD, NOLAPSE, and INTLEX (that tend to require small segments), there are PARSE and ALIGN (FT, L, PHP L). 5 The latter constraint forces the inclusion of a functional word in the stress domain and also forces unbounded feet, since it requires all feet to be aligned with the phonological phrase. The different
hierarchization of these constraints will be crucial to derive the main differences between BP and EP. In BP foot binarity is crucial and in EP feet are unbounded.

Another type of opposition is that between TROCHEE and CLASHEXT. While TROCHEE forces a specific position for the stressed syllable in a foot, CLASHEXT disallows some positions due to the interaction of adjacent feet.

Note that, following the assumptions of OT, we use the same constraints with different rankings to derive BP and EP stress patterns. There is, however, a noticeable partial exception, namely FOOTBIN/BINGRAD. All the constraints, except for one, have categorical violations. FOOTBIN/BINGRAD is a manifestation of the same constraint with different ways to compute violations. While violations of FOOTBIN are computed as categorical, violations of BINGRAD are gradient. Recall that a long foot will compute as a single violation of FOOTBIN, whereas for BINGRAD the number of violations increases with the length of the foot. The strong preference for binary feet in BP has been attested in many works (Bisol 1992, Collischonn 1993, Lee 1994, Massini-Cagliari 1995). Moreover, our handling of the data showed that FOOTBIN is too weak a constraint for generating the correct facts of BP, while BINGRAD is too strong for EP, even if ranked very low. 6

Recall that BP shows a phenomenon of vowel deletion induced by rhythm. It is well known that EP also undergoes vowel deletion (Mateus 1975, 1983). We could not employ constraints to handle the BP/EP facts because the phenomenon of vowel deletion has not been fully understood, for both Portuguese varieties, until the present point of this research. In other words, we had to avoid handling vowel deletion automatically at this point of our
project because we do not have a complete description of the facts yet. It is, however, crucial to implement the BP rhythmic vowel deletion in order to generate the correct facts relative to secondary stress in this variety of Portuguese. Therefore, we supplied sotaq manually with what we know about vowel deletion (via spectrogram analysis of our corpus). We marked with a + the BP vowels that can be deleted. The program does not count as a violation of BINGRAD any foot containing three syllables one of which contains a vowel marked by +.

Other processes of resyllabification, like those resulting from the application of vocalic sandhi rules, or internal diphthongization, are also important, as our tests and data clearly show. At this moment we have not dealt with those phenomena yet, but they will be duly considered in the future.

6 The implications of this fact for language acquisition remain to be investigated. It may be the case that the specific way (gradient or categorical) of computing violations of a constraint is not innate, but acquired via exposure to the input.
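The difference between categorical and gradient violation counting can be made concrete with a small sketch. It is an illustration only, not sotaq's code: the `Segment` class and the function names are hypothetical, and the treatment of one-syllable feet under BINGRAD (one mark) is an assumption, since the text only specifies the count for long feet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A regular segment (foot); a hypothetical representation."""
    length: int                  # number of syllables in the segment
    core: int                    # 0-based position of the core syllable
    has_deletable: bool = False  # True if some vowel in it is marked with +


def footbin(seg: Segment) -> int:
    """Categorical: one mark for any foot that is not exactly binary."""
    return 0 if seg.length == 2 else 1


def bingrad(seg: Segment) -> int:
    """Gradient: one mark per syllable exceeding the initial two."""
    if seg.length == 3 and seg.has_deletable:
        return 0  # exemption for three-syllable feet with a + vowel
    if seg.length < 2:
        return 1  # assumption: the text does not cover one-syllable feet
    return seg.length - 2


def trochee(seg: Segment) -> int:
    """One mark if the core is not the initial syllable."""
    return 0 if seg.core == 0 else 1


if __name__ == "__main__":
    long_foot = Segment(length=5, core=0)
    print(footbin(long_foot), bingrad(long_foot), trochee(long_foot))
```

A five-syllable foot thus costs a single FOOTBIN mark but three BINGRAD marks, which is the sense in which BINGRAD penalizes unbounded feet more heavily.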
4. Sotaq

In most of the current OT literature, a given analysis is tested manually via manipulation of a very restricted amount of data, usually consisting of words or very short phrases. We have developed a computer program, named sotaq, for automatic testing of various constraint hierarchies on a large amount of data, thus providing more substantive evidence for our analysis. An earlier version of such a program was conceived and implemented by Pierre Collet and Antonio Galves; the current version is a new implementation, building on their initial ideas. Here we present an abridged description of sotaq, explaining some of its underlying algorithms. A more detailed explanation will be presented elsewhere.

Roughly speaking, sotaq is fed a constraint hierarchy and it processes sentences, assigning secondary stresses according to the corresponding OT model. The constraints explained earlier are all implemented. As can be seen from their definition, their computation requires information about syllabification, lexical stresses, lexical words and phonological phrases. While much of this information could be computed automatically from the sentences, at this point it is given in the input. This complicates the input slightly, but makes sotaq a leaner program, concentrated on its main task.

The input for sotaq is a collection of sentences, each one being a collection of tagged syllables. Each of these is a phonological syllable in an actual Portuguese utterance, preceded by a numerical tag that encodes some properties of that syllable. Some such properties are: whether it starts a word, whether it has a primary stress, whether a vowel can be erased in speech. Here is an example:
6 O 10 in 0 ves 16 ti 0 ga 1 dor 7 já 2 lhe 10 de 0 vol 1 veu 6 o 2 di 1 nhei 0 ro.

The tag for a syllable is a sum of values as follows:
O: 2 (starts a word) + 4 (starts a phonological phrase)
in: 2 (starts a word) + 8 (secondary stress 7 )
já: 1 (primary stress) + 2 (starts a word) + 4 (starts a phonological phrase)
ti: 16 (vowel may be erased).

The constraints may refer to the tags; actually, sotaq processes only the tags in its search for the optimal stresses; the textual syllables are used only to produce human-readable output 8.

7 These secondary stresses were also obtained through auditory perception, and the transcriptions can be used to check sotaq's result. This way, one may test the OT model's predictiveness.
8 Currently there are no user-friendly facilities for preparing input for sotaq. A better user interface is already being designed, and will be implemented soon.

When the program is called, the name of the file containing sentences tagged as above is specified, together with a constraint hierarchy. Here are two examples of sotaq's output; we use the same input sentence, with two different rankings.

Example 1: The ranking used was:
DEP ST : RIGHTMOST : CLASHINT >> INTLEX >> BINGRAD : PARSE : NOLAPSE >> CLASHEXT >> TROCHEE >> ALIGN
This is the one that best fits BP, so far. The output was:
I: A in te li gên cia da ca ta lo ga do ra foi de ter mi nan te
[ ~~ ^^^ [ ~~ ~~ ^^ [^^^ ~~~ ^^^ +
O: a in TEli GÊNcia da CAta LOga DOra FOI deter minante 
O: a INteli GÊNcia da CAta LOga DOra FOI deter minante 
Total cost: syllables. 2 optimal segmentations possible segmentations (tableau lines)

Example 2: The ranking used was:
DEP ST : RIGHTMOST : CLASHINT >> TROCHEE >> ALIGN : INTLEX >> PARSE >> CLASHEXT >> FOOTBIN : NOLAPSE
This is the one that best fits EP, so far. The output was:
I: A in te li gên cia da ca ta lo ga do ra foi de ter mi nan te
[ ~~ ^^^ [ ~~ ^^ [^^^ ^^^
O: a INteli GÊNcia da CAtaloga DOra FOI determi NANte 
O: A inteli GÊNcia da CAtaloga DOra FOI determi NANte 
O: a INteli GÊNcia DA cataloga DOra FOI determi NANte 
O: A inteli GÊNcia DA cataloga DOra FOI determi NANte 
Total cost: syllables. 4 optimal segmentations possible segmentations (tableau lines)

In each case, the line labeled I and the following one describe the input. The first line is the textual sentence, the second describes tags. A [ marks the beginning of a phonological phrase, a  marks the beginning of a lexical word; a syllable is underlined with carets (^^^^) if it bears a primary stress, and it is underlined with tildes (~~~) if secondary stress was auditorily perceived. Further, a + under a vowel means that it may be deleted in speech.
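Since each tag is a sum of distinct powers of two, it can be decoded as a set of bit flags. The sketch below illustrates this for the five values listed above (1, 2, 4, 8, 16); the class `Tag`, its member names, and the function `parse_tagged_sentence` are hypothetical names chosen for illustration, and the real sotaq input format may carry further properties not covered here.

```python
from enum import IntFlag


class Tag(IntFlag):
    """Flag values as listed in the text; member names are illustrative."""
    PRIMARY_STRESS = 1
    WORD_START = 2
    PHRASE_START = 4
    SECONDARY_STRESS = 8
    DELETABLE_VOWEL = 16


def parse_tagged_sentence(text: str) -> list[tuple[str, Tag]]:
    """Split 'tag syllable tag syllable ...' into (syllable, flags) pairs."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating tag/syllable tokens")
    pairs = []
    for tag_text, syllable in zip(tokens[0::2], tokens[1::2]):
        # Drop the sentence-final period; it is not part of the syllable.
        pairs.append((syllable.rstrip("."), Tag(int(tag_text))))
    return pairs


if __name__ == "__main__":
    sentence = ("6 O 10 in 0 ves 16 ti 0 ga 1 dor 7 já 2 lhe "
                "10 de 0 vol 1 veu 6 o 2 di 1 nhei 0 ro.")
    for syllable, tag in parse_tagged_sentence(sentence):
        print(syllable, int(tag), Tag.PRIMARY_STRESS in tag)
```

Testing a property then reduces to a membership check such as `Tag.WORD_START in tag`, which mirrors the statement that the constraints refer only to the tags and not to the textual syllables.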