The Journal of Allergy and Clinical Immunology
Volume 120, Issue 6 , Pages 1433-1440, December 2007

Multivariate statistical analysis of large-scale IgE antibody measurements reveals allergen extract relationships in sensitized individuals

  • Daniel Soeria-Atmadja, MSc
    • Division of Toxicology, National Food Administration, Uppsala, Sweden
    • Department of Medical Sciences, Uppsala University Hospital, Uppsala, Sweden
  • Annica Önell, PhD
    • Research and Development, Phadia AB, Uppsala, Sweden
  • Anita Kober, PhD
    • Research and Development, Phadia AB, Uppsala, Sweden
  • Per Matsson, PhD
    • Research and Development, Phadia AB, Uppsala, Sweden
  • Mats G. Gustafsson, PhD
    • Department of Engineering Sciences, Uppsala University, Uppsala, Sweden
    • Department of Medical Sciences, Uppsala University Hospital, Uppsala, Sweden
    • Corresponding author: Mats G. Gustafsson, Department of Engineering Science, Uppsala University, PO Box 534, SE-751 21, Uppsala, Sweden.
  • Ulf Hammerling, PhD
    • Division of Toxicology, National Food Administration, Uppsala, Sweden
    • Reprint requests: Ulf Hammerling, Division of Toxicology, National Food Administration, PO Box 622, SE-751 26 Uppsala, Sweden.

Received 2 April 2007; received in revised form 28 June 2007; accepted 16 July 2007; published online 10 September 2007.

Background

Many allergenic sources are reportedly cross-reactive because of protein structural similarities. Although several aggregations are well characterized, no holistic mapping of IgE reactivity has hitherto been reported.

Objective

The aim of this study was to disclose relevant associations within a large set of allergen preparations, as revealed by specific IgE antibody levels in blood sera of multireactive human donors.

Methods

A dataset of recorded IgE antibody serum concentrations of 1011 nonidentifiable multireactive individuals (devoid of clinical records) to 89 allergen extracts was compiled for in silico analysis. Various algorithms were used to identify specific multivariate dependencies between the IgE antibody levels.

Results

Exhaustive cluster analysis demonstrates that IgE antibody responses to the 89 extracts can be aggregated into 12 stable formations. These clusters hold well-known relationships as well as unexpected and previously unknown patterns, the latter two categories being exemplified by the coclustering of wasp and certain seafood and a clear differentiation among pollen allergens.

Conclusion

Identified relationships within several well-known groups of cross-reactive allergen extracts confirm the applicability of dedicated multivariate data analysis within the allergology field. Moreover, some of the unexpected IgE reactivity associations in sensitized human subjects might help in identifying new relationships with potential importance to allergy.

Clinical implications

Although clinical implications from this study should be validated in subsequent investigations with documentation on symptoms included, we believe this seminal approach is a key step toward the development of new analysis tools for interpretation of allergy data generated by using high-throughput recording systems.

Key words: IgE reactivity, allergens, multivariate data analysis, clustering algorithms

Abbreviations used: CCD, cross-reacting carbohydrate determinant; CMDS, classical multidimensional scaling

Early diagnosis of allergy and adequate means to monitor clinical intervention are of importance to sustain quality of life for the patient. The presence of allergen-specific IgE antibodies and exposure to the relevant allergen are prerequisites for an IgE-mediated allergic reaction, although the clinical symptoms also depend on other factors in each patient.

IgE antibodies are selected to recognize and bind sensitizing proteins with high affinity through interaction with epitopes but can also target proteins containing motifs with sufficient similarity to the initially sensitizing allergen. This phenomenon is generally referred to as cross-reactivity and typically involves allergenic proteins from phylogenetically related species.1, 2, 3 Moreover, a relatively high degree of identity at the amino acid sequence level is typically seen between IgE cross-reactive proteins.4 An understanding of many cross-reactive patterns has developed in recent years; for example, it is common that patients allergic to major pollen allergens also react to a variety of fruits, vegetables, or both because of the occurrence of structurally similar components. Some common examples thereof are collectively referred to as the pollen-fruit syndrome.5 Certain cross-reacting structures, typified by cross-reacting carbohydrate determinants (CCDs) in particular but also involving other motifs, can, however, give rise to specific IgE antibodies without consistent relation to clinical symptoms.6, 7, 8 Moreover, IgE antibody assays, although useful tools to discriminate IgE-mediated allergy from other diseases with similar symptoms, are generally not able to distinguish a genuine multisensitization to several allergens from a serologic cross-reaction that also gives rise to IgE antibodies to a protein present in various allergen sources. Hence clinicians might not always be able to appropriately interpret multiple positive IgE test results.

Over the last 8 to 10 years, use of bioinformatics methods to collect, store, and analyze molecular information, clinical information, or both of importance to allergy has increased markedly.9, 10 Major applications include compilation of allergenic proteins (typically appearing as amino acid sequences) into publicly accessible repositories, computational assessment of amino acid sequences that can cause allergic reactions, identification of potential epitopes in allergen components, and categorization of interspecies allergenic proteins according to family or function.11, 12, 13, 14, 15, 16, 17, 18 As evident from this brief summary, bioinformatics algorithms used in allergology have preferentially been applied to molecular, rather than serologic, data.

In this article we set out to apply bioinformatic methods to outline major IgE antibody response patterns on a relatively comprehensive level. For this purpose, a very large dataset of recorded serum IgE antibody measurements was compiled and analyzed. To our knowledge, this is the first study ever described in which multivariate data analysis and pattern-recognition algorithms have been applied to an IgE reactivity dataset of this size.

Methods 

Datasets 

Measurements of IgE reactivity to allergens 

A depot of human blood serum holding about 49,000 samples from nonidentifiable sensitized individuals from the United States and Northern Europe has been established by purchase from several commercial suppliers. Its main purpose is to support the research, development, and quality assessment of ImmunoCAP (Phadia AB, Uppsala, Sweden). By using this technology, these donor blood sera have, over the last several years, been subjected to many thousands of measurements of specific IgE responses to a wide range of allergen preparations. The resulting data are stored in an internal database.

A subset of the database was compiled to include the largest possible number of individual blood sera consistently tested with the largest panel of allergen preparations. The final dataset held 1011 unique samples (blood serum of human donors), each with recorded IgE antibody concentrations against 89 separate extracts (see Table E1 in the Online Repository at www.jacionline.org). Because the assignment of negative test results has changed over the years (listed as <0.35, as 0.0, or as the actual measured value between 0 and 0.35), all values below the 0.35 kUA/L cutoff were set to 0.35 kUA/L.
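
The harmonization of negative results described above can be sketched in a few lines of Python. This is only an illustration: the function name `harmonize_negatives` and the way a "<0.35" entry is stored (as a string) are assumptions, not details taken from the article or its database.

```python
CUTOFF = 0.35  # kUA/L, the cutoff stated in the text

def harmonize_negatives(raw_values, cutoff=CUTOFF):
    """Set every negative result to the cutoff; keep positive results unchanged.

    A negative result may appear as the string '<0.35' (assumed encoding),
    as 0.0, or as a measured value below the cutoff.
    """
    cleaned = []
    for value in raw_values:
        if isinstance(value, str) and value.strip().startswith("<"):
            cleaned.append(cutoff)
        else:
            cleaned.append(max(float(value), cutoff))
    return cleaned

# Hypothetical readouts for one extract (values invented for illustration).
example = harmonize_negatives(["<0.35", 0.0, 0.12, 0.35, 4.7])
```

One consequence worth noting is that many donors then share the identical value 0.35 for a given extract, which produces ties in any later rank transformation.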

Representation of allergens in terms of amino acid sequences 

Amino acid sequences used in this work were mined from publicly available allergen databases and compiled into a local catalogue, which is described in an earlier report.14 Proteins that were homologues to allergens but without any documentation on allergenicity/cross-reactivity were not considered in the analysis. Although such proteins would possibly represent the allergen preparations more accurately, the main purpose of sequence-based clustering was solely to learn whether serologic patterns could be related to known allergenic proteins. Moreover, sequences shorter than 50 amino acids were dismissed because they are likely to represent fragments, which are of poor relevance to the investigations conducted in this article. For 32 of the allergen sources, no amino acid sequences of allergenic proteins were available. Thus 57 extracts were represented by at least 1 allergenic protein. Moreover, with the aid of records in the Allergome database (http://www.allergome.org; accessed between August 30, 2006, and September 10, 2006),9 amino acid sequences of inhalant allergens were excluded from the food extracts. Each extract in Table E1 (available in the Online Repository at www.jacionline.org) is followed by 1 or several Uniprot accession numbers, which correspond to amino acid sequences of reported allergenic proteins.

Distance calculation between allergens 

Creating a distance matrix from IgE antibody profile values 

The serologic distance d_ser(X,Y) between 2 extracts X and Y was defined based on the degree of correlation between the corresponding IgE recordings (see the Online Repository at www.jacionline.org for details). Initial analysis showed that raw serum IgE values were not directly useful for revealing meaningful patterns; they were therefore transformed either to natural logarithms (ie, ln[IgE]) or to their respective rank among all 1011 individuals before further processing. In the second approach, the 1011 IgE antibody concentrations to a particular allergen extract were sorted in ascending order and each was replaced by its rank. The advantage of either transformation is that extremely high values have less effect on calculations of, for example, correlation coefficients between 2 allergen extracts.
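
As a rough illustration of the two transformations and a correlation-based distance, consider the following Python sketch. The exact definition of the serologic distance is given only in the Online Repository, so the form used here (1 minus the Pearson correlation coefficient) is an assumption, as are the use of mean ranks for tied values, all function names, and the example readouts.

```python
import math

def log_transform(values):
    """Natural logarithm of each IgE concentration (all values must be > 0)."""
    return [math.log(v) for v in values]

def rank_transform(values):
    """Rank in ascending order (1 = lowest); tied values share their mean rank
    (an assumption, since the text does not say how ties were handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def serologic_distance(x, y, transform=log_transform):
    """Assumed form of the distance: 1 - Pearson r of the transformed readouts."""
    return 1.0 - pearson(transform(x), transform(y))

# Invented readouts (kUA/L) of five donors against two extracts.
birch = [0.35, 0.35, 2.0, 15.0, 60.0]
apple = [0.35, 0.50, 1.5, 9.0, 100.0]
d_log = serologic_distance(birch, apple)
d_rank = serologic_distance(birch, apple, transform=rank_transform)
```

Under this assumed definition, two extracts with perfectly correlated readouts have distance 0 and two with perfectly anticorrelated readouts have distance 2.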

Creating a distance matrix based on amino acid sequences of the allergen components 

The amino acid sequence–based distance d_seq(X,Y) between 2 extracts X and Y, containing n and m known amino acid sequences of allergenic proteins, respectively (X = [x1,…,xn] and Y = [y1,…,ym]), was defined based on the largest sequence similarity found among all possible pairs (xi,yj) obtained from conventional (Smith-Waterman) local sequence alignment. For details, see the Online Repository at www.jacionline.org.
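
The "largest similarity over all pairs" idea can be sketched as follows. This is a simplified stand-in, not the procedure used in the study: it assumes a flat match/mismatch score and a linear gap penalty instead of a substitution matrix, a normalization by the smaller self-alignment score, and a distance of 1 minus the best similarity. The toy sequences are invented and far shorter than the 50-residue minimum applied in the article.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def normalized_similarity(a, b):
    """Score scaled to [0, 1] by the smaller self-alignment score (assumption)."""
    denom = min(smith_waterman_score(a, a), smith_waterman_score(b, b))
    return smith_waterman_score(a, b) / denom if denom else 0.0

def sequence_distance(extract_x, extract_y):
    """1 - the largest similarity over all pairs (x_i, y_j) of the two extracts."""
    best = max(normalized_similarity(x, y) for x in extract_x for y in extract_y)
    return 1.0 - best

# Invented toy "extracts", each a list of protein sequences.
extract_a = ["MKTLLV", "GGGG"]
extract_b = ["MKTLLV"]
extract_c = ["CCCC"]
d_ab = sequence_distance(extract_a, extract_b)
d_bc = sequence_distance(extract_b, extract_c)
```

Because only the best pair counts, a single shared protein family is enough to bring two extracts close together, which is consistent with the large heterogeneous cluster discussed later in the article.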

Clustering 

Hierarchic clustering 

Conventional hierarchic clustering, as used in this study, grouped the allergen extracts into a hierarchic family tree (dendrogram). For more details, see the Online Repository at www.jacionline.org.
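
For readers unfamiliar with the procedure, a naive agglomerative clustering with average linkage (the linkage named in the legend of Fig 3) might look like the sketch below. It is written for clarity rather than speed; the function name, the tiny distance matrix, and its values are invented for illustration and are not taken from the study.

```python
def average_linkage(dist, labels):
    """Repeatedly merge the two clusters with the smallest mean pairwise distance.

    dist is a symmetric matrix (list of lists); the return value is the merge
    history as (members_a, members_b, height) tuples, which defines a dendrogram.
    """
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                height = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((sorted(labels[i] for i in clusters[a]),
                       sorted(labels[i] for i in clusters[b]),
                       height))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Invented distances between four extracts.
names = ["cod", "tuna", "birch", "apple"]
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
merges = average_linkage(toy_dist, names)
```

In this toy case the two tight pairs merge first and are joined only at a much greater height, which is the kind of structure a dendrogram makes visible.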

k-Means clustering 

The k-means clustering algorithm was used to group allergen extracts in a predefined number of k separate groups (clusters) by trying to minimize the total sum of distances between the individual observations and their corresponding cluster centers. For more details, see the Online Repository at www.jacionline.org.
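
A minimal k-means sketch (Lloyd's algorithm) is given below. It assumes squared Euclidean distance and a deterministic farthest-point initialization; the article does not state which distance or initialization was used, so both are illustrative choices, as are the function names and the toy data. In the study's setting each "point" would be one allergen extract described by its transformed readouts across donors.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Start from the first point, then repeatedly add the point farthest
    from all centers chosen so far (deterministic; an assumed choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(points, k, max_iter=100):
    """Alternate between assigning points to the nearest center and moving
    each center to the mean of its members until assignments stop changing."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Invented two-dimensional profiles forming two obvious groups.
profiles = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centers = kmeans(profiles, k=2)
```

Unlike hierarchic clustering, the number of groups k must be fixed in advance, which is why a separate criterion is needed to choose it.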

Validation of clustering methods with the silhouette technique 

The silhouette validation technique19 is a graphic aid for validation of data clusters through measurements of the proximity of each object in a cluster to objects in neighboring clusters. For details, see the Online Repository at www.jacionline.org.
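
The silhouette value of an object compares its mean distance to members of its own cluster (a) with its mean distance to the nearest other cluster (b), as s = (b - a) / max(a, b). The sketch below computes these values from a distance matrix; the function name and the toy distances are invented, and assigning 0 to objects in singleton clusters follows the usual convention rather than anything stated in the article.

```python
def silhouette_values(dist, labels):
    """Silhouette value of every object, given a symmetric distance matrix and
    cluster labels. Requires at least two clusters."""
    n = len(labels)
    values = []
    for i in range(n):
        own = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not own:
            values.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c)
            / sum(1 for j in range(n) if labels[j] == c)
            for c in set(labels) if c != labels[i]
        )
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values

# Invented distances between four extracts: cod, tuna, birch, apple.
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
good = silhouette_values(toy_dist, [0, 0, 1, 1])  # fish vs plant grouping
bad = silhouette_values(toy_dist, [0, 1, 0, 1])   # deliberately mixed grouping
mean_good = sum(good) / len(good)
mean_bad = sum(bad) / len(bad)
```

Averaging the values over all objects gives a single score per clustering, so candidate cluster numbers can be compared by running the clustering for each k and keeping the k with the highest mean silhouette.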

Classical multidimensional scaling 

Classical multidimensional scaling (CMDS) is a multivariate technique that aims to find a low-dimensional representation of the data that preserves the interpoint distances of the original space. For more information on CMDS, see Hastie et al.20
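
A compact sketch of CMDS, assuming NumPy is available, is shown below. It double-centers the squared distance matrix and keeps the leading eigenvectors, which is the textbook formulation; the function name and the toy points are illustrative only and do not reproduce the authors' MATLAB code.

```python
import numpy as np

def classical_mds(dist, n_components=3):
    """Embed objects in n_components dimensions from a distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering   # double-centered matrix
    eigvals, eigvecs = np.linalg.eigh(gram)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components] # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy check: four corners of a 3 x 4 rectangle are recovered up to rotation
# and reflection, so all pairwise distances are preserved in 2 dimensions.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dist, n_components=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

When the input distances are not Euclidean, or when fewer dimensions are kept than the data require, the recovered distances only approximate the originals, which is why low-dimensional maps should be read with some caution.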

Implementation 

All computations were performed in the MATLAB programming environment (The MathWorks, Inc, Natick, Mass). Apart from the core program, the MathWorks Statistics and Bioinformatics toolboxes were used for visualization and sequence alignments, respectively.

Results 

Cluster analysis of large-scale IgE antibody concentration measurements 

As an introductory approach to disclosing structure (ie, cross-reactivity/multisensitization relationships) in the dataset on IgE antibody reactivity, hierarchic clustering was applied. Regardless of the linkage methods used (see the Methods section and the Online Repository at www.jacionline.org for details), a majority of the extracts appeared in one huge cluster, whereas several of the remaining formations each held only a single cluster member (data not shown).

Eighty-nine allergens can be compiled into 12 clusters 

The k-means algorithm, with incremental (5-25) cluster numbers (k) was applied to both of the transformed matrices (logarithm or ranking). By means of average silhouette width of all cluster objects as a discriminator between different values of k, both logarithm and rank representations of IgE antibody measurements showed preference for a k value of 12 (Fig 1, A). Moreover, although logarithm values in k-means clustering analysis are favored compared with ranks (Fig 1, A), their corresponding cluster silhouettes are quite similar (Fig 1, B and C). Generally, allergens appearing in the clusters defined by logarithm values are in compliance with known relationships (Table I).

  • Fig 1. 

    Silhouette analysis of k-means clustering. A, The influence of the predefined cluster number (k) on the mean silhouette value of all cluster members, using logarithm values (solid line) or column rank values (dashed line). B and C, Cluster silhouettes at the accordingly derived optimal setting (k = 12) for logarithm values (B) and column rank values (C). Each horizontal bar represents 1 of 89 extracts.

Table I. Allergen extract composition of the 12 clusters based on IgE antibody readouts
1. Fungi
Aureobasidium pullulans (325)
Botrytis cinerea (317)
Helminthosporium halodes (345)
Polymyxa betae (343)
Fusarium moniliforme (297)
Aspergillus fumigatus (332)
Stemphylium botryosum (447)
Epacris purpurascens (316)
Penicillium notatum (290)
Yeast (523)
Alternaria alternata (307)
Candida albicans (329)
Cladosporium herbarum (363)
Rhizopus nigricans (378)
Mucor racemosus (314)

2. Weed pollen
Marguerite (506)
Mugwort (552)
Dandelion (570)
Ragweed (588)
Cocklebur (541)

3. Mites
Dermatophagoides pteronyssinus (620)
Dermatophagoides farinae (596)

4. Tree and weed pollen
Willow (677)
Cottonwood (490)
White ash (691)
Box elder (515)
Saltwort (471)
Scale (499)
Sycamore maple (696)
Walnut (pollen) (427)
Goosefoot (603)
Plantain (497)
Elm (533)
Olive (pollen) (384)
Pulmonaria officinalis (607)
Parietaria judaica (564)
Nettle (582)
Sheep sorrel (612)
Japanese cedar (603)
Mountain juniper (498)
Carrot (137)

5. Tree pollen and apple
Beech (504)
Birch (549)
Oak (482)
Apple [11] (124)

6. Epithelium/dander
Dog (757)
Horse (532)
Cat (683)
Guinea pig (314)

7. Animal food
Beef (537)
Chicken (409)
Milk (578)
Egg white (172)
Pork (170)

8. Fish
Cod (456)
Tuna (523)

9. Seafood and insect
Shrimp (429)
Blue mussel (477)
Cockroach (409)
Wasp (353)

10. Grass pollen
Cocksfoot (672)
Timothy grass (775)
Johnson grass (743)
Common reed (769)
Bermuda grass (714)

11. Plant food general
Rice (347)
Oat (333)
Garlic (499)
Maize (556)
Wheat (354)
Barley (483)
Rye (587)
Orange (472)
White bean [12] (131)
Strawberry (551)
Buckwheat [12]
Onion (484)
Coconut [12] (437)
Sesame [12] (151)
Bee [9] (298)
Tomato (350)
Potato (318)
White pine [4] (539)

12. Nuts and legumes
Hazel nut (154)
Pea (333)
Brazil nut (520)
Almond (333)
Peanut (479)
Soy bean (165)

Numbers of positive tests (IgE level >0.35 kUA/L) for each allergen extract are displayed within parentheses.

Bracketed numbers mark allergen extracts that move from one cluster to another when the 1011 individuals are reduced to 100 or 50 groups; the number in brackets is the new cluster assignment.

A separate test was performed to reveal whether cluster definition was appreciably influenced by potentially redundant or skewed representation of serum donors. First, the entire dataset of 1011 donor sera was assigned to either 100 or 50 groups, as accomplished by k-means clustering. Subsequently, “metadonor” reactivity profiles were created by averaging over donor blood sera IgE antibody readouts in each such cluster. Finally, k-means clustering (using k = 12) was conducted on the accordingly obtained 100 × 89 and 50 × 89 data matrices, respectively. As evident from Table I, the majority of clusters maintained an identical composition after substantial reduction of the original donor set. A few extracts, though, seemed somewhat promiscuous, as revealed by their appearance in alternating clusters (each extract swapping between 2 groups). These allergens (appearing as tagged in Table I) were typically those with low silhouette values (Fig 1, B).
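
The "metadonor" step can be sketched as a simple group-wise average. In the sketch below the donor grouping is passed in as a list of labels (in the study it came from k-means with 100 or 50 groups); whether the averaging was done on raw or transformed readouts is not stated in the text, and the function name and numbers are invented for illustration.

```python
def metadonor_profiles(donor_matrix, group_labels):
    """Average the readouts of all donors that share a group label.

    donor_matrix holds one row per donor and one column per extract; the
    result has one row per group, ordered by group label.
    """
    groups = {}
    for row, label in zip(donor_matrix, group_labels):
        groups.setdefault(label, []).append(row)
    return [
        [sum(col) / len(rows) for col in zip(*rows)]
        for _, rows in sorted(groups.items())
    ]

# Three invented donors, two extracts; the first two donors form one group.
donors = [[1.0, 3.0], [3.0, 5.0], [10.0, 0.0]]
meta = metadonor_profiles(donors, [0, 0, 1])
```

Applied to the full dataset, this reduces the 1011 × 89 matrix to 100 × 89 or 50 × 89, so that groups of very similar donors each contribute a single profile to the subsequent clustering of extracts.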

Allergen maps: Visualization of cluster patterns based on IgE antibody reactivity data 

Apart from cluster assignment, as outlined above, CMDS might reveal embedded patterns in the multidimensional IgE reactivity data and, additionally, support overview of relationships among preparations not obvious in the 89 × 89 correlation matrix. Therefore a map of interrelationships between allergen preparations was created (Fig 2). The 3-dimensional map shows all-against-all distances between the extracts, as projected down onto 3 dimensions. Distance calculations were based on the Pearson correlation coefficient between logarithm values of IgE antibody concentrations (see the Methods section). Color-encoded preparations, according to cluster association, enable visualization of the 12 clusters identified with the k-means analysis. For example, the 2 “mites” extracts are clearly tightly colocated, as well as those of “fish,” whereas “epithelium/dander” extracts and “animal foods” are a little less well assembled. In line with results derived from k-means analysis (Table I), pollen partially overlaps with “plant food general.”

  • Fig 2. 

    Allergen map of all 89 allergens. A 3-dimensional map (2 separate views) showing interrelationships between extracts projected in 3 dimensions by means of CMDS on logarithm values of IgE reactivity data is shown. Each preparation is color encoded according to 12 clusters generated by k-means clustering.

Comparison of cluster analysis with either IgE reactivity or amino acid sequence data 

Considering the nature of IgE antibody cross-reactivity, there is reason to assume connections between human IgE antibody responses and similarity between amino acid sequences of the allergenic proteins.3, 21 Cluster analysis was therefore restricted to the 57 extracts holding at least 1 allergenic protein with a documented amino acid sequence. Aggregation of extracts based on protein components (amino acid sequences) was delineated by means of hierarchic clustering (Fig 3) because this sort of data is not accessible to analysis by the k-means algorithm (see the Online Repository at www.jacionline.org). IgE reactivity clusters based on 57 extracts emerged as almost entirely equivalent to those derived from the complete (89 extracts) IgE antibody dataset (not shown).

  • Fig 3. 

    Sequence similarity dendrogram for 57 allergens. Relationships within 57 extracts are based on amino acid sequence similarity, as derived by means of hierarchic clustering (average linkage). Cluster indices are indicated below each branch. For each such formation (except for single clusters), protein families dictating extract assembly formations are listed below dashed lines.

Extract/source relationships, as revealed by means of cluster analysis on amino acid sequence similarity, showed, on an overall basis, an appreciable agreement with those derived from IgE antibody data, but several dissimilarities are also apparent (compare Fig 3 and Table I). The “fungi” groups appear identical, whereas those involving pollens and plant-derived foods differ quite markedly. Moreover, formations corresponding to the IgE reactivity clusters “epithelium/dander” and “animal foods” are matched with a serum albumin cluster, as well as a group consisting of egg and milk components, whereas “mites” and “seafood and insect” have been merged into a tropomyosin cluster (Table I and Fig 3). Notably, for the smaller clusters appearing on end branches of the dendrogram, the observed proximities presumably relate to sequence similarity between allergens in 1 or 2 protein families. Extracts in the larger cluster are, however, likely to be connected as a consequence of similarity between various allergens across several different protein families (Fig 3).

Discussion 

Multivariate strategy on a unique IgE reactivity dataset suggests and visualizes 12 main aggregations of extracts 

Among 1011 individual blood donors, only 25 subjects are monosensitive, whereas 833 have high IgE levels against at least 10 separate allergen preparations. Although the compiled dataset is not fully comprehensive with regard to allergen preparations, its size and composition, in combination with multivariate data analysis, enable disclosure of interesting interextract relationships. Even though we lack confirming symptom data on patients, the functional relevance of our analysis results is based on the fact that sensitization to protein allergens, resulting in the expression of IgE antibodies, is fundamental to IgE-mediated allergy. Most sera were collected from multisensitive donors, which implies that the compiled dataset does not represent a random selection. Moreover, regional sampling could bias the validity of the findings for other geographic areas.

A variety of clustering and visualization algorithms were applied to the dataset to unravel global relationships. The principal outcome is a structure that materializes as 12 separate IgE antibody–reactivity clusters. Either fewer or more categories showed lower scores in k-means/silhouette tests (Fig 1, A). Inspection of each silhouette representation shows that clusters 2 (“weed pollen”), 3 (“mites”), 8 (“fish”), and 10 (“grass pollen”) are most robust, closely followed by groups 5 (“tree pollen/apple”) and 6 (“epithelium/dander”; Fig 1, B, and Table I). Additionally, extended data analysis involving reduction of the number of individuals did not appreciably change the overall pattern (Table I).

Although the silhouette graphs and data derived from them provide insight into the relevance of serology-derived clusters (Fig 1), CMDS was used to create a 3-dimensional allergen map (Fig 2) that displays relationships between the allergen preparations. Because compression of an 89-dimensional space into a 3-dimensional allergen map might involve substantial information loss, displayed relationships must not be overinterpreted in their details. Although distances between the items in the map are difficult to translate directly into an allergologically quantifiable unit, these maps are a powerful way to present relations between clusters. Clearly, the “seafood/insect” cluster and that of “epithelium/dander” are situated in opposite segments of the volume representation. Likewise and as anticipated, “animal foods” are very remote from the “plant food general” assembly.

Clustering based on amino acid sequence similarity suggests IgE cross-reactivity as being implicated in the majority of patterns 

A large body of reports demonstrates that IgE-mediated allergy generally aggregates in clusters of allergens.22 A variety of different mechanisms can, however, cause multiple IgE reactivities in a sensitized individual.6, 7, 21, 23

In an attempt to further resolve the results derived by k-means clustering of IgE antibody reactivity data, each allergen extract/source was searched for the presence of 1 or more molecularly defined allergens. These allergenic proteins were further analyzed to discover similarities between their corresponding amino acid sequences. An exhaustive search revealed that 57 of 89 allergen extracts could be connected with particular allergenic proteins. A relatively high degree of correspondence is seen between clustering based on IgE antibody reactivity to allergen extracts and clustering based on sequence alignment of proteins (Fig 3). Hence although the aggregations defined in Table I depict multimechanism-based relationships in IgE-mediated reactivity rather than any specific sort of interaction, these results suggest an appreciable involvement of IgE cross-reactivity in most clusters derived from large-scale IgE antibody reactivity analysis.

Identification of known relationships supports the validity of the multivariate approach 

The reactivity aggregation conforms, for the most part, to several widely known cross-reactivity patterns. The aggregation of fungi into a single cluster (Table I and Fig 2), presumably involving IgE cross-reactivity of enolases and serine proteases, as indicated by sequence similarity results (Fig 3), is in agreement with other reported studies.2, 24 Another intuitive collection is the “fish” (no. 8) cluster, which holds highly interrelated allergens (Figs 1, B, and 2). A tight connection between tuna and cod is presumably dictated by parvalbumins.25, 26 Furthermore, cluster 5 (“tree pollen and apple”) shows well-known connections between tree pollens belonging to the Fagales order and apple.27 Moreover, the aggregation of the 4 mammalian animals into cluster 6 (“epithelium/dander”) is also in line with reports describing cross-reactions caused by shared components within the lipocalin28, 29 and serum albumin protein30 families.

Intriguing and unexpected patterns identified by means of multivariate inspection 

A large and rather heterogeneous cluster, designated “plant food general” (no. 11), contains 2 unanticipated items: bee and white pine. These results are in conflict with the observed hyaluronidase formation in the dendrogram, where bee and wasp aggregate well (Fig 3). Studies have identified hyaluronidase as a conserved allergenic component in insect venom, although the clinical relevance of corecognition is disputed because IgE antibody responses also seem directed to CCDs.31, 32 Moreover, it has been shown that sera containing such CCD-specific IgE antibodies might also bind allergen glycoproteins in pollens,33 which gives some support for the appearance of bee in the “plant food” cluster.

As regards Dermatophagoides species allergens, cysteine proteases and (especially) the group 2 mite proteins are known to be highly cross-reactive.34, 35 Tropomyosins, major allergens of crustaceans and mollusks, have, however, also been reported as important cross-reactive mite allergens,36 and a recent report suggests tropomyosin as a clinically relevant component in IgE cross-reactivity between mite, cockroach, shrimp, and crab.37 Interestingly, however, cluster 9 (“seafood and insects”) holds crustaceans, mollusks, and cockroach but not mites. Hence segregation of extracts in clusters 3 and 9 supports the view that tropomyosins are less important components in mite sensitization relative to, for example, group 1 and 2 allergens.38, 39 This perception is further reinforced by the fact that “mites” is not very close to “seafood and insects” in the allergen map (Fig 2). Clinical information on the individuals' symptoms is, however, not available.

Although the 3 neighboring clusters 2 (“weed pollen”), 4 (“tree and weed pollen”), and 10 (“grass pollen”) seem to overlap in the allergen map (Fig 2), they are aggregated into 3 separate categories by the clustering procedure. The differentiation of the 5 different grass allergens (all belonging to the Poaceae family) from other pollen formations is highly interesting because it indicates that the methodology used here might distinguish allergens of monocotyledons from those of dicotyledons. It should also be stressed that members of cluster 2 (“weed pollen”) all belong to the Asteraceae family, whereas no allergen of this family is present in cluster 4 (“tree and weed pollen”). The distinct differentiation of pollen and plant food extracts is not obtained in sequence-based clustering, as illustrated by the large cluster on the ultimate left branch of the sequence similarity dendrogram (Fig 3). This atypical formation might be related to a phenomenon caused by extracts with multiple (characterized) allergenic proteins. Such extracts show close (sequence similarity based) relationships with several other extracts, although each relationship can be due to different allergens. Consequently, all related extracts appear in the same cluster, even though some of them do not share similar characterized allergens.

Multivariate data analysis for component-based formats 

Although still at an early developmental stage and subject to evaluation of diagnostic accuracy, purified and biotechnology-produced allergen components are promising candidate reagents for conferring significantly enhanced information to allergy diagnostics.40, 41, 42 Until pure allergen proteins of preserved allergenicity become available in sufficient numbers to accurately portray a wide range of IgE-mediated allergy disease profiles, however, such formats can be most adequately used as complementary to assay systems based on allergen extracts. For example, certain allergen extracts are known to hold labile allergens that might undergo (partial) degradation during processing. This is a potential source of spurious findings, which possibly could apply also to some unexpected results in this study. We believe the concept of mapping the IgE antibody reactivity relationship by means of advanced bioinformatic methodology, as outlined in this article, is even more indispensable for data derived from component-based assay formats because of the presumably higher complexity in information derived from such experiments.

Conclusion 

The present study shows that methods for multivariate data analysis are necessary for extraction of relationship patterns in large datasets containing human IgE antibody reactivity readouts to multiple allergen extracts. The identification of 12 stable clusters of extracts in the largest serologic dataset ever described in this context reinforces many well-known cross-reactivity patterns. In addition, some hitherto unrecognized relationships between allergen sources, such as the coclustering of wasp and certain seafood and a differentiation of various pollen extracts, are indicated. A combined application of extracts and pure proteins, in conjunction with clinical data on allergy symptoms, should be ideal to further unravel the complex multimechanistic associations in IgE-mediated allergy. To benefit maximally from this development, we suggest the use of multivariate data analysis similar to that presented here in experimental allergology.

We thank Ulf Bengtsson at the Asthma and Allergy Research Group, Department of Respiratory Medicine and Allergy, Sahlgrenska Academy at Gothenburg University, for scientific input to the manuscript. Anonymous reviewers are acknowledged for valuable comments on the manuscript.

Appendix. Supplementary data 

Online Repository.

References 

  1. Ferreira F, Hawranek T, Gruber P, Wopfner N, Mari A. Allergic cross-reactivity: from gene to the clinic. Allergy. 2004;59:243–267
  2. Bowyer P, Fraczek M, Denning DW. Comparative genomics of fungal allergens and epitopes shows widespread distribution of closely related allergen and epitope orthologues. BMC Genomics. 2006;7:251
  3. Breiteneder H, Mills C. Structural bioinformatic approaches to understand cross-reactivity. Mol Nutr Food Res. 2006;50:628–632
  4. Aalberse RC. Structural biology of allergens. J Allergy Clin Immunol. 2000;106:228–238
  5. Vieths S, Scheurer S, Ballmer-Weber B. Current understanding of cross-reactivity of food allergens and pollen. Ann N Y Acad Sci. 2002;964:47–68
  6. Mari A, Iacovacci P, Afferni C, Barletta B, Tinghino R, Di Felice G, et al. Specific IgE to cross-reactive carbohydrate determinants strongly affect the in vitro diagnosis of allergic diseases. J Allergy Clin Immunol. 1999;103:1005–1011
  7. Foetisch K, Westphal S, Lauer I, Retzek M, Altmann F, Kolarich D, et al. Biological activity of IgE specific for cross-reactive carbohydrate determinants. J Allergy Clin Immunol. 2003;111:889–896
  8. Asero R, Ballmer-Weber BK, Beyer K, Conti A, Dubakiene R, Fernandez-Rivas M, et al. IgE-mediated food allergy diagnosis: current status and new perspectives. Mol Nutr Food Res. 2007;51:135–147
  9. Mari A. Importance of databases in experimental and clinical allergology. Int Arch Allergy Immunol. 2005;138:88–96
  10. Brusic V. Information management for the study of allergies. Inflamm Allergy Drug Targets. 2006;5:35–42
  11. Gendel SM, Jenkins JA. Allergen sequence databases. Mol Nutr Food Res. 2006;50:633–637
  12. Taylor SL. Review of the development of methodology for evaluating the human allergenic potential of novel proteins. Mol Nutr Food Res. 2006;50:604–609
  13. Bjorklund AK, Soeria-Atmadja D, Zorzet A, Hammerling U, Gustafsson MG. Supervised identification of allergen-representative peptides for in silico detection of potentially allergenic proteins. Bioinformatics. 2005;21:39–50
  14. Soeria-Atmadja D, Lundell T, Gustafsson MG, Hammerling U. Computational detection of allergenic proteins attains a new level of accuracy with in silico variable-length peptide extraction and machine learning. Nucleic Acids Res. 2006;34:3779–3793
  15. Cui J, Han LY, Li H, Ung CY, Tang ZQ, Zheng CJ, et al. Computer prediction of allergen proteins from sequence-derived protein structural and physicochemical properties. Mol Immunol. 2007;44:514–520
  16. Stadler MB, Stadler BM. Allergenicity prediction by protein sequence. FASEB J. 2003;17:1141–1143
  17. Jenkins JA, Griffiths-Jones S, Shewry PR, Breiteneder H, Mills EN. Structural relatedness of plant food allergens with specific reference to cross-reactive allergens: an in silico analysis. J Allergy Clin Immunol. 2005;115:163–170
  18. Radauer C, Breiteneder H. Pollen allergens are restricted to few protein families and show distinct patterns of species distribution. J Allergy Clin Immunol. 2006;117:141–147
  19. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53–65
  20. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference and prediction. New York: Springer-Verlag; 2001
  21. Aalberse RC, Akkerdaas J, van Ree R. Cross-reactivity of IgE antibodies to allergens. Allergy. 2001;56:478–490
  22. Sicherer SH. Clinical implications of cross-reactive food allergens. J Allergy Clin Immunol. 2001;108:881–890
  23. Burastero SE, Paolucci C, Breda D, Longhi R, Silvestri M, Hammer J, et al. T-cell receptor-mediated cross-allergenicity. Int Arch Allergy Immunol. 2004;135:296–305
  24. Simon-Nobbe B, Probst G, Kajava AV, Oberkofler H, Susani M, Crameri R, et al. IgE-binding epitopes of enolases, a class of highly conserved fungal allergens. J Allergy Clin Immunol. 2000;106:887–895
  25. Poulsen LK, Hansen TK, Norgaard A, Vestergaard H, Stahl Skov P, Bindslev-Jensen C. Allergens from fish and egg. Allergy. 2001;56:39–42
  26. Van Do T, Elsayed S, Florvaag E, Hordvik I, Endresen C. Allergy to fish parvalbumins: studies on the cross-reactivity of allergens from 9 commonly consumed fish. J Allergy Clin Immunol. 2005;116:1314–1320
  27. Mari A, Wallner M, Ferreira F. Fagales pollen sensitization in a birch-free area: a respiratory cohort survey using Fagales pollen extracts and birch recombinant allergens (rBet v 1, rBet v 2, rBet v 4). Clin Exp Allergy. 2003;33:1419–1428
  28. Virtanen T. Lipocalin allergens. Allergy. 2001;56:48–51
  29. Fahlbusch B, Rudeschko O, Schlott B, Henzgen M, Schlenvoigt G, Schubert H, et al. Further characterization of IgE-binding antigens from guinea pig hair as new members of the lipocalin family. Allergy. 2003;58:629–634
  30. Goubran Botros H, Gregoire C, Rabillon J, David B, Dandeu JP. Cross-antigenicity of horse serum albumin with dog and cat albumins: study of three short peptides with significant inhibitory activity towards specific human IgE and IgG antibodies. Immunology. 1996;88:340–347
  31. Wilson IB, Harthill JE, Mullin NP, Ashford DA, Altmann F. Core alpha1,3-fucose is a key part of the epitope recognized by antibodies reacting against plant N-linked oligosaccharides and is present in a wide variety of plant extracts. Glycobiology. 1998;8:651–661
  32. Hemmer W, Focke M, Kolarich D, Dalik I, Gotz M, Jarisch R. Identification by immunoblot of venom glycoproteins displaying immunoglobulin E-binding N-glycans as cross-reactive allergens in honeybee and yellow jacket venom. Clin Exp Allergy. 2004;34:460–469
  33. Hemmer W, Focke M, Kolarich D, Wilson IB, Altmann F, Wohrl S, et al. Antibody binding to venom carbohydrates is a frequent cause for double positivity to honeybee and yellow jacket venom in patients with stinging-insect allergy. J Allergy Clin Immunol. 2001;108:1045–1052
  34. Smith AM, Benjamin DC, Derewenda U, Smith WA, Thomas WR, Chapman MD. Sequence polymorphisms and antibody binding to the group 2 dust mite allergens. Int Arch Allergy Immunol. 2001;124:61–63
  35. Thomas WR, Smith WA, Hales BJ, Mills KL, O'Brien RM. Characterization and immunobiology of house dust mite allergens. Int Arch Allergy Immunol. 2002;129:1–18
  36. Aki T, Kodama T, Fujikawa A, Miura K, Shigeta S, Wada T, et al. Immunochemical characterization of recombinant and native tropomyosins as a new allergen from the house dust mite, Dermatophagoides farinae. J Allergy Clin Immunol. 1995;96:74–83
  37. Purohit A, Shao J, Degreef JM, van Leeuwen A, van Ree R, Pauli G, et al. Role of tropomyosin as a cross-reacting allergen in sensitization to cockroach in patients from Martinique (French Caribbean island) with a respiratory allergy to mite and a food allergy to crab and shrimp. Allerg Immunol (Paris). 2007;39:85–88
  38. Pittner G, Vrtala S, Thomas WR, Weghofer M, Kundi M, Horak F, et al. Component-resolved diagnosis of house-dust mite allergy with purified natural and recombinant mite allergens. Clin Exp Allergy. 2004;34:597–603
  39. Hales BJ, Martin AC, Pearce LJ, Laing IA, Hayden CM, Goldblatt J, et al. IgE and IgG anti-house dust mite specificities in allergic disease. J Allergy Clin Immunol. 2006;118:361–367
  40. Lidholm J, Ballmer-Weber BK, Mari A, Vieths S. Component-resolved diagnostics in food allergy. Curr Opin Allergy Clin Immunol. 2006;6:234–240
  41. Mothes N, Valenta R, Spitzauer S. Allergy testing: the role of recombinant allergens. Clin Chem Lab Med. 2006;44:125–132
  42. Wohrl S, Vigl K, Zehetmayer S, Hiller R, Jarisch R, Prinz M, et al. The performance of a component-based allergen-microarray in clinical practice. Allergy. 2006;61:633–639

 Supported by the Swedish Governmental Agency for Innovation Systems (VINNOVA).

 Disclosure of potential conflict of interest: A. Önell and A. Kober are employed by Phadia AB. P. Matsson owns stock in, has patent licensing arrangements with, and is employed by Phadia AB. The rest of the authors have declared that they have no conflict of interest.

PII: S0091-6749(07)01393-0

doi:10.1016/j.jaci.2007.07.021
