COMPONENT‐COMPATIBILITY IN HISTORICAL BIOGEOGRAPHY

The problems of reconstructing historical relationships for areas of endemism from distributional data for groups of taxa and the cladistic relationships among the members of those groups can be solved by applying the two principles of parsimony and mutual inclusion or exclusion (compatibility) of components. Components can be extracted from a data matrix by means of transcription into partial monothetic sets. The data matrix thus derived represents the distribution over areas for the monophyletic groups in one or more cladograms. It is derived from two different matrices by boolean multiplication. The first matrix gives the binary representation of distributions of taxa over areas of endemism; the second describes the cladogram for the same taxa, in terms of character states converted into binary form by additive binary coding. The derived data matrix can be used in historical biogeography to represent the given phyletic data (Assumption 0 here newly defined), and can be amended to reflect Assumptions 1 or 2 to accommodate the problems of wide-spread taxa and missing areas. Area-cladograms are determined from the derived matrix by searching for the largest sets of mutually compatible components. Area-cladograms are evaluated in terms of support (vicariance) and contradiction (ad hoc interpretations such as dispersal and extinction). Area-cladograms that best fit the data matrix regarding the balance between support and contradiction are selected as the best possible reconstructions of relationships among the areas of endemism. The procedure is illustrated by the example of the poeciliid fish genera Heterandria and Xiphophorus, and several other standard examples.


Introduction
Historical biogeography assumes that there is a correspondence between species-relationships and area-relationships. Comparisons between the cladistic relations within various groups of organisms occurring in a certain region might elucidate general patterns which can be used to develop hypotheses on the historical relationships of biota (Rosen 1978, Nelson and Platnick 1981, Humphries 1982, Humphries and Parenti 1986). For this purpose, species cladograms are transformed into area-cladograms by replacing each species represented at the terminals with its distributional area (described in terms of the smallest relevant biogeographical entity, the area of endemism). Congruencies between area-cladograms are then depicted in general area-cladograms, which represent the general patterns, i.e. the cladistic relationships between areas of endemism.
At present two types of analysis with different purposes and different lines of reasoning can be recognized in historical biogeographic research. The first type aims at reconstructing the historical development of biota ('cladistic biogeography'). For this purpose, the delimitation of areas of endemism is a starting-point (Humphries and Parenti 1986: 1). The main problems concern the manipulation of wide-spread species and the absence of groups from one or more areas under consideration. The methods of Rosen (1976) and Platnick (1978, 1981) belong here. The second type concerns the search for data for constructing evolutionary scenarios of particular groups under study (e.g. 'vicariance biogeography'). Here, emphasis is put on vicariance patterns as a framework for postulating speciation models. An essential element is the search for geographic separation between sister taxa (Wiley and Mayden 1985: 598-600). In this case the main problem is coping with overlapping ranges. Wiley's 'ancestral species map' method belongs to this category, as does his parsimony method outlined recently (Wiley, in press).
To date, the general treatments of cladistic biogeography (Nelson and Platnick 1981,

DATA MATRIX
The data matrix shows the distribution pattern of the various intrinsic character states amongst the terminal taxa. These character states may be coded as unordered (neutral), uniquely ordered (additive binary), or exhaustively ordered (all additive combinations) with respect to the sequence of their transformations, and the data matrix may consist of one or other type or a mixture.

MONOTHETIC SETS
Character states, either separately or in combination, uniquely represented by a group of terminal taxa define this group as a cladogenetic unit and a building block for cladograms (cladon: i.e. a set of terminal taxa, without rank, name, or phyletic structure). Clada are extracted from the data matrix by means of transcription into monothetic sets (Beckner, 1959; Sharrock and Felsenstein, 1975; Farris, 1978). When a cladon is defined in terms of unique character states only, it is called a partial monothetic set. When it is defined in terms of a unique combination of character states, none of which need be unique, but some may be so, then it is called a strict monothetic set (Zandee, 1984). Character state distributions defining the same cladon represent a character type (sensu Nelson and Platnick 1981). The rationale behind the use of monothetic sets of taxa as building blocks for cladograms is the idea that a series of homologous character states (i.e. a character) must be kept as an integrated whole as long as this is tenable in the context of parsimony, rather than broken up into different characters (a non-homologous series of states). In other words, convergence is rejected as a first level explanation for similarities observed among taxa. This implies that we do not try to fit a minimal number of singular steps to a tree, but build a tree from sets of integrated series of steps.
A search is made for the largest clique present in the list of clada. The recognition of cliques is based on the concept of group-compatibility (not character compatibility). Cladons are compatible with other cladons when they include or exclude each other. These inclusion and exclusion relations can be represented by a graph (network). In the case when all clada in a set are mutually compatible, the set is a clique. In the graph representing the relations, the sets in the clique are all mutually connected. Clique as used here in its original unambiguous meaning is a concept from graph theory, and stands for a maximal complete subgraph (Garey and Johnson 1979). This concept, and other elements from graph theory, are used in the implementation of the method in computer algorithms.
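The group-compatibility test and the clique search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, groups are represented as frozensets of labels, and a plain Bron-Kerbosch enumeration is assumed for finding maximal complete subgraphs.

```python
from itertools import combinations

def compatible(a, b):
    """Two groups are compatible when one includes the other or they exclude each other."""
    return a <= b or b <= a or not (a & b)

def maximal_cliques(groups):
    """Enumerate maximal sets of mutually compatible groups (Bron-Kerbosch, no pivoting)."""
    groups = list(groups)
    # Compatibility graph: one vertex per group, an edge for every compatible pair.
    adj = {g: {h for h in groups if h != g and compatible(g, h)} for g in groups}
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))  # r cannot be extended, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(groups), set())
    return found

def largest_cliques(groups):
    """Keep only the maximal cliques of greatest size."""
    cliques = maximal_cliques(groups)
    top = max(len(c) for c in cliques)
    return [c for c in cliques if len(c) == top]

# Illustration (assumed input): every group of two or three labels out of A-D.
example = [frozenset(c) for k in (2, 3) for c in combinations("ABCD", k)]
cliques = largest_cliques(example)
```

Single labels and the full set are left out of the illustration because they are compatible with every other group and so do not affect which cliques are found.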

CLADOGRAMS
The largest cliques are transcribed into cladograms. This is followed by interpreting and evaluating their implications in terms of the character states in the data matrix and choosing one or a few of them. For this purpose two criteria are used in association. The first (quantitative) criterion is the maximization of the value of support minus contradiction (or, equivalently, minimization of the value of contradiction minus support). Support means the number of fitting character states, i.e. those in a distribution pattern which can be explained by assuming a single origin (synapomorphies). Contradiction (homoplasy) means the number of independent multiple origins and/or reversals of the remaining character states. The second criterion, used supplementarily to the first, pertains to the total number of synapomorphies for each cladogram as determined by local outgroup comparison in the context of all its 3-cladon statements (Zandee, 1984). The latter analysis might elucidate synapomorphies among those states with multiple origins which were homoplasious in the first instance. As a consequence these states are considered to belong to different transformation series.
Eventually, one or several cladogram(s) are selected on the basis of these criteria as the most likely representation(s) of a phylogeny.
General Area-Cladogram Construction
Humphries (1982: 445) stated that cladistic biogeography is the pursuit of a method which encompasses a code comparable to cladistics. The analogy involves exchanging species with areas of endemism and homologies with sister groups. Following this analogy, the method of historical biogeography proposed here can be summarized in a way similar to the cladistic analysis described above. However, it must be noted that biogeographic analyses can be carried out under three assumptions, i.e. Assumption 0, Assumption 1, and Assumption 2 (see also Wiley, in press). The latter two were proposed by Nelson and Platnick (1981), but they did not give full credit to the possibilities of Assumption 0.
Assumption 0 begins with the idea that the reconstructed phylogeny is the best possible estimate of the true phylogeny. Assumption 0 implies that if a species appears to be wrongly delimited and actually represents two or more species, the resulting species are assumed to be sister groups and should together represent a monophyletic group. Consequently, the areas inhabited by each wide-spread species represent undisputable components. This means that in Fig. 1d the identity of species 1 might be misconstrued but not its phylogenetic relations, and that from the distribution of species 1 the component A+B can be derived (Fig. 1c).
Under Assumption 1 (Nelson and Platnick 1981: 421), it is assumed that whatever is true of a wide-spread taxon in one part of its range must also be true of the taxon in other parts of its range (compared to other areas inhabited by the other species of the monophyletic group).
In Fig. 1a the wide-spread species 1 (occurring in areas A and B) is more closely related to species 2 (occurring in area C) than to species 3 (occurring in area D). Assumption 1 implies that area A is more closely related to area C than to D and that area B is more closely related to area C than to area D. This means that if species 1 in the future is regarded as actually representing more than one species, the newly delimited species should be sister species, or branch off from the cladogram sequentially (Platnick 1981: 224). The same holds for the areas they inhabit (Fig. 1c-e). Thus, the wrongly delimited species might actually comprise a mono- or paraphyletic group, and the component defining A+B is in doubt, but the component defining A+B+C is considered indisputable (Fig. 1b).
Assumption 2 (Nelson and Platnick 1981: 432) indicates that whatever is true of a wide-spread taxon in one part of its range might not be true of the taxon in other parts of its range.
The implications of this assumption are illustrated in Fig. 2. In Fig. 2a, Assumption 2 implies that area A is more closely related to area C than to D (Fig. 2b) and/or that area B is more closely related to area C than to area D (Fig. 2c), but this is not necessarily true for both areas A and B. If the relations of A compared to C and D are true, then B can take one of all possible positions (white circles in Figs. 2b, c) on the area-cladogram (Fig. 2b). If the relations of B compared to C and D are true, A can take one of all possible positions to branch off in the area-cladogram (Fig. 2c). In other words, both components A+B and A+B+C may be in doubt. This means that both the identity and the relationships of species 1 might be misconstrued. Furthermore, this wrongly delimited species might actually comprise a mono-, para-, or polyphyletic group with all the consequences for the resulting area-cladograms.
Fig. 2(a). Cladogram showing the relations between the species 1, 2, and 3, and their distribution areas A-D. (b,c). The two area-cladograms required for the application of Assumption 2 derived from the cladogram. o indicates the positions which the missing areas (B and A, respectively) can take.

DATA MATRIX
The derivation of a data matrix depends on which of the three assumptions is made. This paper is not primarily concerned with a full evaluation of these assumptions, but it provides the procedures to derive the data matrices appropriate for analyses under each of the three assumptions.

Data matrix under Assumption 0
The data matrix comprises the distribution patterns in terms of areas of endemism of the terminal taxa and all corresponding monophyletic groups present in the cladograms (clada x areas). This data matrix is derived from two sets of raw data in a stepwise manner. First, for each group of taxa we have a binary matrix with distributional data for taxa, comprising only terminal taxa and areas (taxa x areas). Second, we have a cladogram for each study group of taxa, converted into a binary matrix by additive binary coding (taxa x nodes; Farris et al. 1970). The combination (boolean inner product) of each distributional matrix with each cladogram matrix gives a biogeographical data matrix (areas x clada from one cladogram). Each column in the final data matrix represents the distribution over areas of a terminal taxon or monophyletic group for each group of analyzed taxa. Identical distributions belong to the same distributional type (character type: Nelson and Platnick, 1981). From this data matrix area-cladograms can be derived under Assumption 0. When the same set of areas obtains for several groups of taxa, the resulting separate data matrices can be joined (column wise) to form a larger data matrix (areas x clada from several cladograms) from which general area-cladograms can be derived.
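The boolean inner product described above can be illustrated with a small Python sketch. The function name and the example matrices are assumptions made for illustration only; the example encodes a three-species case in which species 1 occurs in areas A and B, species 2 in C, species 3 in D, and species 1 and 2 are sister species.

```python
def boolean_product(taxa_by_areas, taxa_by_clada):
    """Return an areas x clada 0/1 matrix.

    Area a is scored 1 for cladon c when at least one terminal taxon
    that belongs to cladon c occurs in area a.
    """
    n_taxa = len(taxa_by_areas)
    n_areas = len(taxa_by_areas[0])
    n_clada = len(taxa_by_clada[0])
    return [
        [
            int(any(taxa_by_areas[t][a] and taxa_by_clada[t][c] for t in range(n_taxa)))
            for c in range(n_clada)
        ]
        for a in range(n_areas)
    ]

# Rows: species 1, 2, 3. Columns: areas A, B, C, D (illustrative data).
distribution = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
# Rows: species 1, 2, 3. Columns: clada {1}, {2}, {3}, {1+2}
# (additive binary coding of the assumed cladogram ((1,2),3)).
cladogram = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
data_matrix = boolean_product(distribution, cladogram)
```

In this illustration the column for species 1 scores areas A and B, and the column for the group 1+2 scores A, B, and C, which corresponds to the components A+B and A+B+C discussed for Assumption 0.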
A data matrix to derive a general area-cladogram for several groups of taxa can also be obtained via a kind of feed-back loop. For each separate group of taxa area-cladograms are extracted and evaluated. During this evaluation one area-cladogram is selected for each group of taxa. The distributional types corresponding to all of the nodes of this area-cladogram are selected from the data matrix based on Assumption 0 in order to obtain a reduced data matrix. Reduced matrices for each group of taxa can then be joined together to form a compound reduced matrix suitable for analysis to derive a general area-cladogram.
The derivation of a data matrix under Assumption 0 is equivalent to the procedure described by Brooks (1981: 232-234), which he used to extract a data matrix from parasite distributions over hosts and parasite cladograms in order to infer host relationships from the parasite data. In fact we suggest that the method described here is just as suitable to derive general host cladograms from parasite distributions over hosts and parasite cladograms by simply substituting hosts for areas.

Data matrix under Assumption 1
The derivation of a data matrix fit for an analysis under Assumption 1 is more complex. We start from the same two sets of raw data, i.e. the distributions for terminal taxa and their cladogram(s). A combination between the two is made as described for Assumption 0, but with one proviso: the distributions that correspond with all possible subsets of areas of each wide-spread taxon (occupying 2 or more areas) are combined with the areas of its sister group. These combinations together with the subset distributions are scored in the data matrix. In the actual analysis, this proviso will cause two possible results which together represent the principle of Assumption 1, i.e. the areas occupied by a wide-spread taxon might either branch off sequentially or occur as sister areas in the area-cladograms.
The derivation of general area-cladograms for several groups of taxa can then proceed on the basis of joined data matrices in the same way as described for Assumption 0. When the data matrix for a general analysis is obtained via a feed-back loop, the provisions for wide-spread taxa have already been made in the analyses for the separate groups of taxa, thus only selected distributional types are joined as columns.
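One possible reading of the proviso for wide-spread taxa under Assumption 1 is sketched below in Python. The function names are hypothetical, and the choice to include every non-empty subset (including the full range of the taxon) is an assumption of this sketch rather than something the text specifies.

```python
from itertools import combinations

def nonempty_subsets(areas):
    """All non-empty subsets of a set of areas, as frozensets."""
    items = sorted(areas)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]

def assumption1_columns(taxon_areas, sister_areas):
    """Columns scored for one wide-spread taxon (assumed reading of the proviso).

    Each subset of the taxon's areas is scored on its own and again
    in combination with the areas of the sister group.
    """
    sister = frozenset(sister_areas)
    columns = set()
    for subset in nonempty_subsets(taxon_areas):
        columns.add(subset)            # the subset distribution itself
        columns.add(subset | sister)   # the subset combined with the sister areas
    return columns

# Illustration: a wide-spread species in A and B whose sister species occurs in C.
extra = assumption1_columns({"A", "B"}, {"C"})
```

For this illustration the resulting columns are A, B, A+B, A+C, B+C, and A+B+C, so that both sequential branching (A+C or B+C) and sister areas (A+B) remain available to the clique search.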

Data matrix under Assumption 2
The derivation of a data matrix suitable for an analysis under Assumption 2 is even more complex. Starting from the same two sets of raw data a combination between the two is made in the same way as described above for Assumption 0 but now followed by two extra steps.
In the first step, we combine the distributions corresponding with all possible subsets of areas of each wide-spread taxon with the distributions of all other monophyletic groups in the cladogram(s). These combinations as well as those from the subset distributions themselves are added to the data matrix.
In the second step, we take all the separate missing areas and all of their possible combinations and we combine each of these with all of the columns of a duplicate of the data matrix derived under Assumption 0. We take a separate duplicate for each of these operations. To compile the complete data matrix we consequently join all of the duplicates, now amended for missing areas, with the original as amended for wide-spread taxa in the first step. In the actual analysis, these steps allow for many possible results as implied by Assumption 2 (see above and Figs. 2, 4), i.e. all but one of the areas occupied by a wide-spread taxon might branch off from any other branch in the area-cladogram. From Assumption 2, as defined by Nelson and Platnick (1981) and exemplified by their thought experiments (ibid. p. 462 et seq.), it does not follow that distributions of taxa over areas should be weighted differentially including a possible zero weight. This means that all distributions of taxa over areas implied by this assumption must be included in the data matrix which must be considered as a whole. The matrix should not be broken down into several parts prior to analysis because of incongruencies present within it, a principle which also applies to Assumption 1.
Although in this implementation of Assumption 2 the data matrix is expanding rapidly, as compared with a matrix analysed under Assumption 0, it is conservative nevertheless. For a more elaborate implementation, in the second extra step we can amend for missing areas on the basis of the data matrix as derived in the first extra step, i.e. after amending for wide-spread terminal taxa. Neither Nelson and Platnick (1981) nor Humphries and Parenti (1986) indicated which implementation should be preferred, because in their examples they deal either with missing areas or with wide-spread taxa but not with both simultaneously.
The data matrix for a general analysis can be obtained by joining the data matrices for the separate groups of taxa. It can also be obtained via a feed-back loop, i.e. after the evaluation and selection of an area-cladogram for each separate group of taxa (see appendix). In that case the provisions for wide-spread taxa and missing areas were already made in the analysis of each separate group of taxa, thus only selected distributional types are joined together.
It should be noted that the coding used in the data matrix is necessarily unordered (or neutral) with respect to areas. Any kind of additive coding is superfluous because the clada among themselves already constitute internested sets of terminal taxa. If areas are to be connected in a sequence, it is because the terminal taxa or groups of taxa occurring in these areas are connected in a hierarchical sequence provided by the cladograms.

COMPONENTS
In this step the data matrix is transcribed into a list of (partial) monothetic sets of areas. This transcription is equivalent to the derivation of components (Nelson and Platnick, 1981). A component (analogous to a cladon) is characterized by a particular distribution. Components serve as building blocks for area-cladograms as well as for general area-cladograms (as do clada for cladograms).
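The transcription of data-matrix columns into sets of areas can be sketched as follows. This Python fragment is illustrative only: the function name is hypothetical, and it simply groups identical column distributions into one component, in line with the notion of a distributional type.

```python
def components(area_names, matrix):
    """Map each distinct set of areas to the list of column indices that show it.

    `matrix` is an areas x clada 0/1 matrix with one row per entry of `area_names`.
    """
    found = {}
    for col in range(len(matrix[0])):
        areas = frozenset(
            name for name, row in zip(area_names, matrix) if row[col]
        )
        found.setdefault(areas, []).append(col)
    return found

# Illustrative matrix: rows are areas A-D; the last column repeats the first.
names = ["A", "B", "C", "D"]
matrix = [
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
]
comps = components(names, matrix)
```

Keeping the column indices with each component makes it possible to trace a component back to the taxa or monophyletic groups that support it.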

CLIQUES
Inclusion and exclusion relations (compatibilities) are recognized among the components and maximal cliques are sought in the network expressing these relations. Cliques comprise components which mutually include or exclude each other.

AREA-CLADOGRAMS AND GENERAL AREA-CLADOGRAMS
The maximal cliques are transcribed into area-cladograms or general area-cladograms. If only one group of taxa is used to compile the data matrix, one or more area-cladograms will be obtained. If more than one group of taxa is used in a compound matrix (based on several cladograms pertaining to the same areas) general area-cladograms will be obtained. As more unrelated groups of taxa are involved, the resulting general area-cladograms will be more 'general'. These diagrams can then be interpreted and evaluated in terms of vicariance, primitive absence, dispersal, and extinction with regard to every taxon involved in the analysis. Synapomorphy, in a strict sense, as used in cladistic analysis, does not really apply to character states of areas or groups of areas. General area-cladograms may share monophyletic groups of taxa, but not intrinsic characters, at least not directly since areas do not have members with genes by which their descendants inherit and consequently share monophyletic groups of taxa. It follows that 'synapomorphy' or homology in biogeography is a monophyletic group of taxa uniquely sharing a group of areas as a consequence of shared history. Indicators of non-shared history, such as dispersal and extinction, are ad hoc statements and thus are analogues of homoplasy. Evaluation by means of the parsimony criterion leads to a choice for one or a few area-cladograms.
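The transcription of a clique into an area-cladogram, mentioned at the start of this section, amounts to nesting the mutually compatible components inside one another. A minimal Python sketch follows; the function name and the parenthesised output notation are assumptions of this sketch, and the input is assumed to be a set of mutually compatible components.

```python
def clique_to_tree(areas, clique):
    """Write a clique of mutually compatible components as a nested, parenthesised tree."""
    root = frozenset(areas)
    # Single areas and the full set carry no grouping information and are dropped.
    groups = {frozenset(c) for c in clique if 1 < len(c) < len(root)}

    def render(group):
        inside = [g for g in groups if g < group]
        # Direct children: components inside this group not nested in another one.
        children = [g for g in inside if not any(g < h for h in inside)]
        covered = set().union(*children)
        parts = [render(child) for child in sorted(children, key=sorted)]
        parts += sorted(group - covered)  # areas attached directly to this node
        return "(" + ",".join(parts) + ")"

    return render(root) + ";"

# Illustrations with four areas and two different cliques.
pectinate = clique_to_tree("ABCD", [frozenset("AB"), frozenset("ABC")])
balanced = clique_to_tree("ABCD", [frozenset("AB"), frozenset("CD")])
```

A clique that leaves some areas ungrouped yields a node with more than two branches, i.e. a partially resolved area-cladogram.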
Columns of the data matrix indicating monophyletic groups responding to vicariance events fit the area-cladogram and as such represent support. Columns indicating dispersal or extinction do not fit the area-cladogram and represent contradiction. As in cladistic analysis, character states on internal nodes of an area-cladogram are estimated by optimization (Farris, 1970). After optimization the number of state changes for each column of the data matrix can be computed for each area-cladogram. All columns with zero or one state change represent support. Character states which are present from the root onwards are considered to have zero changes. Choosing an appropriate outgroup might tell us whether the state really originated at the root. Until this can be examined such states are taken as support without assuming a single origin at the root. Columns indicating multiple parallel origins and reversals represent contradiction. Contradiction minus support is chosen as a measure to express the degree of best fit for area-cladograms with respect to the data matrix, as it serves as a parsimony criterion for the evaluation of area-cladograms. This criterion is chosen in preference to the total number of state changes (steps) because it enables us to make a distinction between area-cladograms of equal length.
We question whether all of the columns in the data matrix should be used as a means of evaluating area-cladograms by the parsimony criterion. We can certainly use them all in Assumption 0 when all columns, from the respective cladograms, are based on real empirical data. For Assumptions 1 or 2 it is doubtful that the extra columns refer to real data; rather, they refer to assumptions expressing doubt with respect to some of the data used to reconstruct phylogenies. It is our opinion that only real observations, rather than assumptions, can be used to evaluate hypotheses. This includes hypotheses regarding biogeographical events in terms of parsimony.
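The tally of state changes per column described in this section can be sketched in Python as follows. Several points are assumptions of this sketch and not taken from the text: a Fitch-style count is swapped in for the optimization cited above (for two-state columns it yields the minimum number of changes), the area-cladogram is assumed to be fully dichotomous and written as nested pairs, and support and contradiction are counted per column, whereas the authors' own tally may count individual events.

```python
def min_changes(tree, present):
    """Minimum number of state changes for one presence/absence column.

    `tree` is an area name (str) or a pair (left, right) of subtrees;
    `present` is the set of areas scored 1 in the column.
    """
    def visit(node):
        if isinstance(node, str):
            return {int(node in present)}, 0
        left_states, left_cost = visit(node[0])
        right_states, right_cost = visit(node[1])
        common = left_states & right_states
        if common:
            return common, left_cost + right_cost
        # No shared state: one change is needed on a branch below this node.
        return left_states | right_states, left_cost + right_cost + 1

    return visit(tree)[1]

def fit(tree, columns):
    """Return (support, contradiction, support minus contradiction), counted per column."""
    changes = [min_changes(tree, column) for column in columns]
    support = sum(1 for c in changes if c <= 1)        # zero or one state change
    contradiction = sum(1 for c in changes if c > 1)   # multiple origins or reversals
    return support, contradiction, support - contradiction

# Illustration: area-cladogram (((A,B),C),D) and three columns.
tree = ((("A", "B"), "C"), "D")
columns = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}]
result = fit(tree, columns)
```

In the illustration the columns A+B and A+B+C each need a single change and count as support, while A+C needs two changes and counts as contradiction.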
This point of view is not without practical consequences. It implies that if an analysis using Assumption 0 on at least two groups of taxa results in at least one fully resolved general area-cladogram, this general area-cladogram will also prove to be among the best ones resulting from an analysis with Assumptions 1 or 2. We show this in an example dealing with the poeciliid fish (p. 317).
As a consequence, analysis under Assumption 1 or 2 may be unnecessary if an analysis under Assumption 0 already results in fully resolved general area-cladograms, for it can only produce more equally parsimonious explanations. It is only when analysis with Assumption 0 produces general area-cladogram(s) that are not completely dichotomous that further analyses under Assumptions 1 or 2 make any sense.
There is yet another problem when analysing all columns at once using Assumption 0. Given a contradiction regarding a certain column in the data matrix with respect to a particular area-cladogram, the same contradiction will occur in all other columns that include the areas indicated by the affected column. It follows that contradictions shown by a data matrix with regard to a particular area-cladogram may not be independent. Nevertheless, they are all counted in the inventory of state changes. This may result in exaggerating the degree of bad fit, especially in area-cladograms in which small (i.e. less inclusive) monophyletic groups fail to respond to vicariance events. The problem is caused by the nested mutual interdependence of columns referring to monophyletic taxa, i.e. the internal nodes of a cladogram. Columns referring to terminal taxa are not affected because these distributional types always fit an area-cladogram. As yet, no solution to this problem is implemented in the algorithms used so far.

Results
In the analyses given here, two elements are emphasized. First, we explore the relationships of components with respect to their compatibilities in order to compile area-cladograms or general area-cladograms. Second, we interpret area-cladograms or general area-cladograms in terms of parsimony, i.e. the balance of implied support and contradiction. Monophyletic taxa responding to vicariance events imply support for the area-cladogram; ad hoc hypotheses necessary to explain distribution in terms of dispersal or extinction imply contradiction.
As a corollary, we note that a chosen general area-cladogram can be used to interpret the area-cladograms in terms of incongruencies shown between them. Congruency of area-cladograms with the general area-cladogram can be interpreted as due to a common cause, i.e. analogous to synapomorphy. Incongruency is analogous to homoplasy.
However, it must be stressed that vicariance is not the only mechanism for explaining biogeographic patterns. As a consequence, the analogy implies that 'positive' incongruencies (occurrences incongruent with the general area-cladogram) might be caused by dispersal and the 'negative' incongruencies (the absence of taxa in particular areas, i.e. 'reversals') might be evidence of extinctions or primitive absence. We use this evaluation for general area-cladograms in cases from the literature, where original data matrices are unavailable.

CRITICAL EXAMPLES UNDER ASSUMPTION 1
The problems we encountered using Assumptions 1 and 2 are illustrated by comparing some hypothetical examples discussed by Humphries (1982) and Humphries and Parenti (1986). Taking Assumption 1 (Nelson and Platnick, 1981) for granted, this example can be analysed using the method outlined in the previous paragraph. Area-cladogram 1 (Fig. 3a) yields the components given in Table 1. One component (A+B) is undisputable. The other three are disputable in the sense that they represent components from different (alternate) area-cladograms. Fig. 3a represents the consensus situation harmonizing the contradictions among these components. Consequently, from these remaining three components only one can be supported by this area-cladogram in the final general area-cladogram. In a similar way, area-cladogram 2 (Fig. 3b) generates the components given in Table 2.
Taken together, Table 3 lists the components, including the sequence number of their supporting area-cladograms (u = undisputable). From this list, the cliques present are given in Table 4.
Evaluation of these four possible cliques (general area-cladograms) with regard to support minus contradiction leads to the choice for clique 2 (general area-cladogram Fig. 3c) and differs from the conclusions given by Humphries (1982; our Fig. 3d) as well as Humphries and Parenti (1986; our Fig. 3e).
Fig. 3(d). General area-cladogram presented by Humphries (1982) for the same data. (e). General area-cladogram presented by Humphries and Parenti (1986) for the same data.
The three area-cladograms (Figs. 4a-c) yield the components presented in Table 5 (in this case also, one component is undisputable, whereas of the other three only one can be supported by the respective area-cladograms). Taken together, the list of the components is given in Table 6. From this list, eight maximal cliques are available (Table 7). General area-cladograms 1 and 2 (Figs. 4d and 4e) are alike with regard to support minus contradiction. General area-cladogram 3 is second best (Fig. 4f). These three possibilities are implied in the conclusion of Humphries and Parenti (1986, Fig. 2.29g). However, they do not recognize the difference in likelihood. Moreover, Humphries (1982: Fig. Hvii) concludes that all 15 possible general area-cladograms (of these four areas) are equally likely. Using the present analysis we fail to confirm this conclusion.
Table 3. Components derived from Fig. 3a, b under Assumption 1.

Table columns: Component; Support; Contradiction; Summary (entries as extracted: 2x +, 3x +, 1x -).

The next example (Fig. 5), determined under Assumption 2, is also taken from Humphries (1982: Fig. H.i-iii). For each of the given area-cladograms (Fig. 5a-c), the components found are presented in Table 8 (none of them being undisputable). Taken all together, the components are presented in Table 9. Within this list all 15 cliques possible for four areas are found, only one of which is presented by Humphries (1982) and Humphries and Parenti (1986). Moreover, these authors do not mention alternative solutions. The distribution of support and contradiction for the general area-cladogram is given in Table 10. The general area-cladogram in Fig. 5d is the most likely hypothesis as it lacks (unambiguous) contradiction. There are seven possibilities showing 4 as the value of support minus contradiction.
Fig. 4(a-c). Three consensus area-cladograms of three different monophyletic groups; (d,e). the two best general area-cladograms derived from these area-cladograms under Assumption 1. (f). the second best general area-cladogram under Assumption 1. (g). general area-cladogram presented by Humphries (1982) for the same data and assumption. (h). general area-cladogram presented by Humphries and Parenti (1986) for the same data and assumption.
What we have shown here is that by building from a list of possible components present in, or on the basis of additional assumptions derived from, each area-cladogram, and judging their compatibilities and building general area-cladograms, we can arrive at one (or a few) likely general area-cladograms on the basis of a parsimony criterion. A comparable, equally explicit procedure cannot be derived from the explanations given by Nelson and Platnick (1981), Humphries (1982), and Humphries and Parenti (1986), especially in those examples in which alternate, conflicting possibilities are combined into one (or a few) general area-cladogram(s).
Table 7. Cliques from the components derived from area-cladograms given in Fig. 4a-c.

Component    Supporting area-cladograms
A+B          1, 2, 3
A+C          1, 2, 3
A+D          1, 2, 3
B+C          1, 2
B+D          1, 2, 3
C+D          1, 3
A+B+C        1, 2, 3
A+B+D        1, 3
A+C+D        2
B+C+D        2, 3

Assumption 0
The theoretical and empirical implications of Assumption 0, in terms of the complexity of the data matrix and of the number of area-cladograms to be evaluated, are simpler than those of the other two assumptions. The analysis has been carried out without area 11 (for coding see Rosen, 1976) because it is assumed to be of hybrid origin (Wiley 1981).
Fig. 5(a-c). Three area-cladograms of three different monophyletic groups; (d). the best general area-cladogram derived from these area-cladograms under Assumption 2.
Table 10. Cliques based on the components derived from the area-cladograms in Fig. 5a-c.
Rosen (1976, 1978). The distribution over areas is coded as a binary matrix in Table 14 (for species names see Table 11). The phylogeny of Heterandria as given by Rosen (1975: Fig. 6a), shows 8 terminal taxa and 7 (internal) clada, and is also coded in binary form (Table 12). From these two binary matrices a data matrix is derived, giving distribution over areas for each cladon in the cladogram (Table 16a). The data matrix represents the boolean product of the matrices in Tables 12 and 14. It is compiled by combining distributions over areas (Table 14) for all terminal taxa indicated in each column of Table 12.
From this data matrix, 19 partial monothetic sets (components) can be extracted (Table 17). A search for the largest set of mutually compatible components reveals one maximal clique and one completely resolved area-cladogram (Fig. 7d). The wide-spread species 8 represents a true component under Assumption 0 and, therefore, areas 2 and 3 are 'sister areas'. The relations among the other areas are similar to those postulated by previous authors (e.g. Wiley, 1981).
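The search for the largest sets of mutually compatible components can be illustrated as follows. This is a sketch under stated assumptions: components are modelled as sets of areas, two components are taken to be compatible when one includes the other or when they share no area (mutual inclusion or exclusion), and the cliques are enumerated with a plain Bron-Kerbosch recursion, which is a technique chosen for the example and not necessarily the search procedure used by the authors. All names and the five-area toy data are hypothetical.

```python
def compatible(x, y):
    """Two components are compatible if they are nested or disjoint."""
    return x <= y or y <= x or not (x & y)


def maximal_cliques(components):
    """Enumerate all maximal sets of pairwise compatible components."""
    comps = list(components)
    # Compatibility graph: each component maps to its compatible partners.
    nbrs = {c: {d for d in comps if d != c and compatible(c, d)} for c in comps}
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already explored vertices.
        if not p and not x:
            found.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & nbrs[v], x & nbrs[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), frozenset(comps), frozenset())
    return found


# Hypothetical components over areas A-E; AB and BC conflict (they overlap
# in B without one including the other).
AB = frozenset("AB")
BC = frozenset("BC")
ABC = frozenset("ABC")
DE = frozenset("DE")

cliques = maximal_cliques([AB, BC, ABC, DE])
largest_size = max(len(c) for c in cliques)
largest = [c for c in cliques if len(c) == largest_size]
```

In this toy case the conflict between A+B and B+C yields two maximal cliques of equal size, each corresponding to a different candidate area-cladogram; choosing between such candidates is where the support and contradiction evaluation comes in.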
The phylogeny of Xiphophorus as given by Rosen (1976: Fig. 6b) shows 9 terminal taxa and 8 clada, and is represented in binary form in Table 13. The distribution over areas for the species of Xiphophorus is given in Table 15 (for species names see Table 11). The combination of these two matrices gives another matrix indicating the distribution over areas for each cladon in the cladogram (Table 16b). From this data matrix, 18 partial monothetic sets of areas can be derived (Table 18).

Table 12. Binary matrix for Heterandria cladogram (Fig. 6a).

Table 13. Binary matrix for Xiphophorus cladogram (Fig. 6b).

Table 14. Distributions of Heterandria species (after Rosen, 1976; but excluding area 11).

As with Heterandria, the analysis results in one, partially resolved area-cladogram (Fig. 7b). It differs in several respects from that of Heterandria, namely in the positions of areas 3, 6, and 9. Furthermore, area 7 appears at the root because Xiphophorus is absent from this area. This area-cladogram is basically similar to those presented by previous authors (e.g. Wiley, 1981).
When the two data matrices are combined (Tables 16a and b) and analyzed together, 26 components can be defined (Table 19). From these components, three cliques of maximum size were found (the general area-cladograms in Fig. 8).
A joint analysis of two or more groups of taxa can also start from a combination of two reduced data matrices. In this particular case, where separate analyses yield only one area-cladogram for each genus, there is no need for a selection of distributional types. As a consequence, the respective data matrices need not be reduced but can be joined directly. We meet a different situation when analysing under Assumption 1 or 2.

Assumption 1
Platnick (1981) considered the reduced area-cladogram obtained by Rosen equivalent to analysis under Assumption 1. However, removal of incongruent subtrees from area-cladograms is not part of this Assumption but a form of consensus tree analysis. It is a misconception to suggest that incongruencies can lead only to incomplete or partially resolved general area-cladograms and that they cannot be resolved into fully informative (dichotomous) hypotheses. To be able to do this, the component compatibility method is required and parsimony must be applied, as shown in the next example.
This list contains 54 maximal cliques. When evaluated on the basis of the complete data matrix, i.e. including the assumptional columns, four of them have the best value of support minus contradiction (= 10; Fig. 11a-d), but only two show a distributional pattern of areas 4+5 (Fig. 11a, b). The first is identical with Fig. 8b, i.e. one of the general area-cladograms derived under Assumption 0, but the other differs from the first in that it shows areas 9 and 10 branching sequentially.
When evaluated with the Assumption 0 data matrix, there are two general area-cladograms, one better than the other. The first is identical with Figs. 8c and 9d, the second with Figs. 8d and 9d. They have the same support but differ by one homoplasy. The best solution comes from an analysis using Assumption 0 (Fig. 8c), which is among the best ones from an analysis under Assumption 1, whether the evaluation includes or excludes the assumptional columns in the data matrix.
When we use the distributional types corresponding with Figs. 8a, b to build a reduced data matrix we can subsequently derive 26 components. Exploring their mutual compatibility reveals 3 general area-cladograms. The best two are identical with Figs. 9a, d. Fig. 9a is slightly better because it has one less homoplasy. Humphries (1982), Platnick (1981), and Humphries and Parenti (1986) give only one, incompletely resolved general area-cladogram under this assumption. In contrast, our analysis yields several completely dichotomous general area-cladograms.

Assumption 2
In Assumption 2 all data concerning wide-spread taxa and missing areas are doubted from the beginning. Nelson and Platnick (1981) and others have a preference for Assumption 2 which seems to be due to the fact that their method uses consensus trees. Therefore, they must doubt specific aspects of the data in advance, i.e. they regard some aspects of the data to be ad hoc. In our procedure the reverse is true; conflicts can be resolved or they might lead to ad hoc hypotheses. Apart from this criticism, Assumption 2 as implemented here is hard to apply in actual practice because even in simple cases, it leads to an unmanageable number of possibilities, as illustrated by the following results.
Under Assumption 2, the data matrices for Heterandria and Xiphophorus (Table 16) are extended with columns added for wide-spread taxa and missing areas. These data matrices are far too large to be shown here in their entirety, but we will briefly describe how they are compiled. For a wide-spread taxon we first take the corresponding column from the data matrix derived under Assumption 0 (e.g. the species column 6 in Table 16b for the wide-spread Xiphophorus species in area 6). Then we extract all its possible subsets of areas (Table 20b, columns 1-6) and combine these with the distributions of all other clada (Table 16b) to obtain the columns for Assumption 2. This procedure is repeated for all wide-spread taxa in both genera. For missing areas (e.g. area 7 in Table 15 for Xiphophorus) we extract all possible subsets (here only the area itself) and combine these with the distributions of all other clada ( [...] having a range of 2 or more areas, leads to an unmanageable number of possibilities.

Independent analysis of Heterandria results in 123 completely resolved area-cladograms. When these are evaluated on the basis of the complete data matrix, two best ones emerge regarding the balance between support (52) and contradiction (65). These are identical with those in Figs 9b, c but, as noted earlier under Assumption 1, they cannot be accepted because areas 4 and 5 branch off sequentially rather than as sister areas. There are two second best area-cladograms with 54 supporting states and 68 contradictory events, but these also do not support areas 4 and 5 as sister areas. There are 11 third best area-cladograms. Their support values range from 49 to 54 and their homoplasies from 64 to 72. Among them is the one area-cladogram that has the least total number of state changes (108) of all 123 area-cladograms. It is also the only one of the 11 that supports 4 and 5 as sister areas. It has the same topology as that in Fig. 9.1.
When evaluated against the Assumption 0 data matrix the best area-cladogram has 15 supporting states and no homoplasies which corresponds with Fig. 9a.
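The subset-extraction step described above for wide-spread taxa can be sketched as below. Only the enumeration of subsets is shown; how the subsets are then combined with the distributions of the other clada is not reconstructed here. The function name and the area labels are hypothetical, and restricting the output to non-empty proper subsets is an assumption (the full range is already present as the Assumption 0 column).

```python
from itertools import combinations


def proper_subsets(areas):
    """All non-empty proper subsets of a wide-spread taxon's range."""
    ordered = sorted(areas)
    return [
        frozenset(combo)
        for size in range(1, len(ordered))
        for combo in combinations(ordered, size)
    ]


# Hypothetical wide-spread taxon occupying three areas labelled x, y, z.
three_area_subsets = proper_subsets({"x", "y", "z"})

# The count grows as 2**n - 2 with the number n of areas in the range.
counts = {n: len(proper_subsets(range(n))) for n in range(2, 8)}
```

For a range of three areas this gives six subsets, and the count roughly doubles with every additional area, which is consistent with the remark that wider ranges quickly lead to an unmanageable number of possibilities.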
A separate analysis of Xiphophorus results in 2613 completely resolved area-cladograms. Evaluation against the complete data matrix (178 columns) gives 6 equally good area-cladograms (231 homoplasies, 90 support, balance 141) and 15 second best (223-231 homoplasies, 81-90 support, balance 142). Among the latter are the three shortest area-cladograms (302 steps). All three have a topology identical to that of the cladogram in Fig. 10. Evaluation against the data matrix used for Assumption 0 revealed 3 of the best cladograms, corresponding with those in Fig. 10. By ignoring those without areas 4 and 5 as sister areas, we end up with the cladogram in Fig. 10, which is identical to that in Fig. 8b.
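The ranking by balance used in this evaluation amounts to simple arithmetic, sketched here. The sign convention (homoplasies minus support, lower being better) is read off from the figures quoted above (231 - 90 = 141 for the best, 142 for the second best); the candidate labels and the particular pairs of values for the second-best trees are hypothetical, chosen only to fall inside the reported ranges.

```python
def balance(homoplasies, support):
    """Balance as quoted in the text: homoplasies minus support."""
    return homoplasies - support


# Illustrative candidates as (homoplasies, support) pairs; labels are made up.
candidates = {
    "tree_1": (231, 90),  # figures quoted for the best area-cladograms
    "tree_2": (223, 81),  # hypothetical pair within the second-best ranges
    "tree_3": (231, 89),  # hypothetical pair within the second-best ranges
}

scores = {name: balance(h, s) for name, (h, s) in candidates.items()}
best_score = min(scores.values())
best = sorted(name for name, score in scores.items() if score == best_score)
```

Note that a tree with fewer homoplasies (here `tree_2`) can still rank lower if it also has less support, which is why the shortest area-cladograms are not automatically the best ones under this criterion.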
It is obvious that an analysis of Heterandria and Xiphophorus taken together and based on reduced data matrices will give the same result as obtained under Assumption 1, because the selected distributional patterns come from the same area-cladograms (Figs 8b and 9a). An analysis of Heterandria and Xiphophorus with a complete combined data matrix (257 columns) generates 8431 completely dichotomous general area-cladograms. Evaluation against the complete data matrix gives two 'best' general area-cladograms (homoplasy minus support = 241, 478 steps). Both show sequential branching for areas 4 and 5. For the next four values (242, 245, 246, 250) there are also four general area-cladograms, all showing areas 4 and 5 as sister areas. One of them (value 250) is the shortest general area-cladogram (451 steps; Fig. 13).
As to the number of area-cladograms, our results are most surprising, as they differ remarkably from those obtained by Platnick (1981), Humphries (1982), and Humphries and Parenti (1986). They mention only 3 possible general area-cladograms, but it remains unclear how they obtained their results. Compared with the provisions of our data matrix, their small number of general area-cladograms might be due to the fact that they apparently let only areas 3 and 9 take different positions under Assumption 2, because they are part of the range of the wide-spread Heterandria species in area 8 and the Xiphophorus species in area 7. However, when this assumption is strictly applied, the different positions for areas 10 and 2, 9 and 2, or 10 and 3 should also be taken into account. It seems strange to us that these authors do not mention the general area-cladogram in our Fig. 8a as a valid alternative hypothesis, because under Assumption 2 the component 2+3, based on the wide-spread species in area 8, might be true.

Humphries and Parenti (1986) distinguish three different methods for historical biogeographic analysis, i.e. those of Rosen (1976), Wiley (1981), and Nelson and Platnick (1978). Rosen's (1976, 1978) method is the construction of reduced consensus area-cladograms, in which the incongruent parts of the compared area-cladograms are deleted. These reduced area-cladograms imply that observed congruencies are due to factors that affect all study groups (Wiley, 1981, p. 293). However, reduced consensus area-cladograms lose information. The first aim of historical biogeography should be the determination of the relationships between delineated areas. By omitting areas, their relationships with the other areas cannot be established. Criticism analogous to points raised against character compatibility methods (especially by Farris, 1983, p. 31) applies here.
Wiley's (1981) 'ancestral species maps' method (see Humphries and Parenti 1986) aims at reconstructing a sequence of speciation events. For this purpose, reduced area-cladograms are used, implying that incongruencies are caused by unique factors. In the sole worked example, only minor incongruencies are met. It is unclear what might happen in an example including more groups with a great degree of incongruence. At the extreme, one wonders what the conclusions might be if an analysis for several groups revealed total incongruence. Would a general area-cladogram derived from a compound data matrix by component compatibility and selected on the basis of parsimony not constitute a more general explanation than only the unique factors? Furthermore, speciation events should be interpreted a posteriori, based on a well resolved general area-cladogram.

Discussion and Conclusions
Recently, Wiley (in press) presented a general outline of a formal parsimony method in which he uses Brooks' (1981) method for converting taxon trees to binary codes for areas. In this method he explores coding strategies, different from ours, for groups with missing taxa and/or with wide-spread species, using the PAUP program (Swofford 1985) as an analytical tool. His method and its coding strategies are aimed at obtaining better resolution than current methods allow in answering the two main questions posed in historical biogeography: (1) what are the relationships between areas, and (2) what are the co-evolutionary patterns between taxa with common or partly common distributions. As far as can be judged from the results obtained so far, Wiley's new method operates in much the same spirit as ours and, as a consequence, removes the objections that can be raised against methods aimed solely at estimating a minimum number of unique events.

Nelson and Platnick's (1981) component analysis aims at constructing general consensus area-cladograms when several conflicting biogeographic patterns are encountered in different groups of taxa. It has been amply demonstrated that consensus general area-cladograms are coupled with a loss of information, besides being notoriously unparsimonious explanations of the considered events. We have demonstrated that contradictions arising from incompatible area-components can be resolved perfectly in a general area-cladogram by applying the parsimony criterion. We also show that a full implementation of Nelson and Platnick's Assumption 2 can hardly be strictly applied in practice, as it leads to many general area-cladograms. Therefore, because the method described here resolves conflicts present in a data matrix, it is sufficient to undertake historical biogeographic analysis under Assumption 0.
In conclusion, our biogeographic method renders full account of all information present in the initial data and gives the most resolved parsimonious general area-cladograms possible.